Author: Alex Hidalgo
Today I am joining Nobl9 as their principal site reliability engineer. My goal in life is to make people happy, and I think this company is following a noble pursuit to make this happen. Let me tell you the story about how my journey has led me here.
About a decade ago I ended up at Google, thrust into the world of site reliability engineering almost by accident. I had just started a job as a technical operations engineer at Admeld, but it was acquired by Google only a few months later. Suddenly I found myself surrounded by colorful, playful offices, free meals, and some of the most brilliant people on the planet.
At first, it wasn’t entirely clear to me what site reliability engineering actually was. Sure, I read some definitions and took some onboarding classes, but exactly how this role differed from the other operational work I had been doing wasn’t immediately apparent.
And then I was introduced to the bedrock of SRE: service level objectives.
Service level objectives are about many things, and if you’re reading this I’ll assume you don’t need a primer (but if you do, my book Implementing Service Level Objectives was just published last week). The most immediate thing that stood out to me was: nothing is ever perfect and no one needs anything to be perfect anyway. After I had parsed through the math and error budget policies and all of the other things that make SLOs at Google work, I realized I had known most of these concepts all along and that most people already do, too.
I’ve had an interesting set of careers, more of them outside of tech than within. And at every single one of those jobs I realized I had always implemented my own version of service level objectives, even if that’s not what I called them at the time.
When I was a bartender, my goal was to greet new customers within one minute of them approaching the bar. I knew I couldn’t always hit this goal because sometimes things got really busy, but I knew that if I hit it often enough I’d have a good shift with happy customers and walk out with a pocket filled with tips.
When I worked the floor of a furniture store, we had a rule that you greeted customers within 30 seconds, but you never approached them until they had been inside for about five minutes. You don’t want to scare away a potential sale by being too direct. Again, it was clear these goals wouldn’t always be met, but as long as they were met often enough I knew we’d have a good sales day.
When I was a DJ I knew that it was my job to play the hits. I wasn’t a trendsetter — I was paid to play Top 40 hits to mainstream crowds. So, that’s what I did. But sometimes I had to wonder, “What if this new song is a thing people would like?” so I’d let myself have a certain number of experimental picks every evening. As long as I played the hits everyone wanted to hear often enough I knew the crowd would be happy that night and come back to dance with me again.
All of these concepts I’ve described: they’re just service level objectives. They’re achievable targets for what kind of service you’re aiming to provide customers, and ones that will ensure they’re happy, too. Turns out that humans know things fail and they’re okay with that, as long as things don’t fail too often. This helps you strike a balance between meeting user needs and not over-extending yourself or your resources.
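To make the bartending example concrete, here is a minimal sketch in Python. The one-minute threshold comes from the story above; the 80% target, the function name `slo_report`, and the sample shift data are all assumptions made up for illustration, not numbers from any real shift or product.

```python
def slo_report(greeting_seconds, threshold_seconds=60.0, target=0.80):
    """Summarize how often greetings met the threshold, against a target ratio.

    threshold_seconds: a greeting counts as "good" if it took this long or less.
    target: hypothetical fraction of greetings that should be good.
    """
    if not greeting_seconds:
        raise ValueError("need at least one observation")

    total = len(greeting_seconds)
    good = sum(1 for s in greeting_seconds if s <= threshold_seconds)
    misses = total - good

    # Error budget, counted in events: how many slow greetings the target
    # tolerates. Rounding first guards against floating-point noise.
    allowed_misses = int(round(total * (1 - target), 9))

    return {
        "attainment": good / total,
        "met": good / total >= target,
        "budget_remaining": allowed_misses - misses,
    }


# Hypothetical shift: seconds until each new customer was greeted.
shift = [20, 45, 30, 90, 15, 50, 55, 10, 40, 35]
report = slo_report(shift)
print(report)
```

In this made-up shift, one greeting out of ten ran long. The target is still met, and there is room for one more slow greeting before the budget is spent.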
After Google I joined Squarespace and was told, “You know SLOs! We want them, too! Can you help?” So I thought to myself, “Sure!” and I got to work. But it turns out there is so much you need to actually do when you’re starting from scratch. What I thought would take me a few months ended up taking me just about my entire two-year tenure. That’s why I wrote the book: in a way it’s the story of my time at Squarespace, how I helped bring these concepts from obscurity to company-wide goals, and how the adoption of these processes made people’s lives better and products more reliable.
When Nobl9 and I found each other, I examined the mission and the product. All I could think was: “Wow. This could save so many people so much time, make so many businesses happier, and so many services more reliable!”
I’ve always said that I want to leave this industry a better place than I found it. I hope I’ve done that in some small ways already. But if I can help build the Nobl9 product into what I know it could be, I’ll know I’ve truly accomplished that goal. SLOs are about people. They’re about having reasonable expectations and not spending too many resources trying to be perfect. This in turn can improve your feature velocity and your bottom line. This all sounds simple enough, but for complex computer services this is much easier said than done. At Nobl9 I hope to help make this as easy as possible for everyone. I’m incredibly excited to continue my journey here.