An Intellyx BrainBlog for Nobl9, by Jason English
“SLOs are a cultural shift. It is about understanding our systems from our users’ perspective, ensuring that what we build provides the quality of service our users deserve. It is about measuring and tracking what matters.”
Thomas Césaré-Herriau in the Brex Tech Blog
Customer expectations of better service from our applications have never been higher.
In traditional technology vendor agreements, customer expectations were outlined as SLAs (service level agreements), which represented a boundary or contractual definition of a system’s failure to meet the customer’s minimum requirements, rather than a continuum of improvement toward customer satisfaction.
In our previous BrainBlog, we discussed how application delivery and business leaders can align their teams’ incentives around SLOs (service level objectives) to better define and collaborate on goals to continuously improve site reliability and performance, and ultimately, better serve customers.
While our journey toward successful SLOs involves lots of intra-organizational measurement and collaboration, what if the intended consumers of our applications also become wise to the concept of SLOs?
Find out what customers want
“If I had asked people what they wanted, they would have said faster horses.”
- Henry Ford NEVER said this!
Many solution-led founders have paraphrased this fake ‘faster horses’ parable as an example of why customer input can’t be trusted as a source of guidance for the business, when we innately know customer feedback is the most valuable input we can get.
Unfortunately, the opportunities for gathering meaningful feedback on our applications from customers are few and far between.
Customers don’t contact support to make suggestions; they do so to report when something is going wrong. Customer issues are very valuable input, and provide great opportunities to demonstrate responsiveness, but they can easily lead organizations into playing ‘whack-a-mole’ on trouble tickets rather than focusing on continuous improvement.
Inserting a survey or feedback form won’t tell you much either. Just under 50% of customers are willing to take an online survey “if it doesn’t take too long” and two-thirds have abandoned surveys — says yet another customer survey on survey fatigue.
Too-frequent requests for customer feedback are a nuisance and offering incentives to complete surveys leads to poor quality information anyway.
The gold standard of customer loyalty is the NPS (or net promoter score), which asks customers how likely they are to recommend a product to a friend or colleague. You can’t get much simpler than a one-question survey.
As the NPS metric nears its 20th anniversary, it is still useful at a strategic level for companies, but customers are getting tired of even this one question, and it is fading into irrelevance for the application and systems teams that need more specific indicators of what to do next.
What constitutes a customer-facing SLO?
In enterprise application scenarios, a mix of distributed systems is becoming the norm for getting work done. Off-the-shelf and homegrown software running in the datacenter is being consumed and modernized underneath specialized functionality from SaaS providers and on-demand cloud-based infrastructure.
Businesses rely on a web of partners and vendors and negotiate these relationships by setting contractual minimum SLAs for availability, performance and security. If a vendor fails to meet the SLA, they face a penalty, but that still sets a very low floor for customer experience.
Conversely, as end consumers of applications (a group we’re all members of), we automatically know when our expectations aren’t being met. We don’t need to refer to the T&C section of our user agreements to realize when we are dissatisfied.
Motivated by a fear of failure, the application owner in either situation might react by setting aspirational SLOs for itself and its partners, such as ‘99.9995% uptime’ or ‘<20ns response time,’ but such targets may prove unrealistic to achieve and unnecessarily costly for everyone involved.
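To put a number on how aspirational such an uptime target is, here is a minimal sketch that converts an availability percentage into the downtime it permits. The function name and the 30-day window are assumptions made for illustration; they do not come from any particular SLO tool.

```python
def allowed_downtime_seconds(target_percent: float, window_days: int = 30) -> float:
    """Seconds of downtime an availability target permits over a window.

    Hypothetical helper: the 30-day default window is an assumption.
    """
    window_seconds = window_days * 24 * 60 * 60
    return window_seconds * (1 - target_percent / 100)


if __name__ == "__main__":
    # Compare a few common targets with the aspirational one from the text.
    for target in (99.0, 99.9, 99.99, 99.9995):
        seconds = allowed_downtime_seconds(target)
        print(f"{target}% uptime allows {seconds:.1f} s of downtime per 30 days")
```

Under these assumptions, 99.9% leaves roughly 43 minutes of downtime per 30 days, while 99.9995% leaves about 13 seconds, which is less time than most teams need to notice an incident, let alone fix it.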
A realistic SLO means different things to different teams in all collaborating parties, because they are looking at different SLIs (service level indicators) to form their opinions.
- A front-end team responsible for a web UI would look at SLIs for page load times, API request/response times, timeouts, and user actions such as cart abandonment or order cancellations.
- The broader DevOps or application delivery team might look at the rate of delivery of function points, reported customer bugs, and the labor cost of retiring features in a backlog.
- The operations or SRE team may look at golden indicators like availability, CPU and memory utilization, the additional cost of reserving more cloud capacity, or the identification and remediation time for dealing with production incidents.
In the end, all of these SLI measures are useful if they contribute to better conversations about their effect on customer behavior, and help teams better decide which SLOs are worth setting.
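As a rough illustration of how an SLI feeds an SLO, the sketch below treats an SLI as the ratio of good events to total events and checks it against a target. The `Slo` class, the function names, the two-second page-load threshold, and the event counts are all hypothetical; a real team would pull these figures from its own monitoring systems.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    """Hypothetical SLO definition: a name plus a target ratio of good events."""
    name: str
    target: float  # e.g. 0.99 means 99% of events must be good


def sli(good_events: int, total_events: int) -> float:
    """Ratio of good events to total events; no traffic counts as fully good."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(slo: Slo, good_events: int, total_events: int) -> float:
    """Fraction of the error budget left: 1.0 untouched, 0.0 spent, negative overspent."""
    allowed_bad = (1 - slo.target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Assumed example: 9,950 of 10,000 page loads finished within two seconds.
    page_loads = Slo(name="page load under 2 s", target=0.99)
    good, total = 9_950, 10_000
    print(f"SLI: {sli(good, total):.4f}")
    print(f"SLO met: {sli(good, total) >= page_loads.target}")
    print(f"Error budget remaining: {error_budget_remaining(page_loads, good, total):.0%}")
```

The same calculation works for any of the indicators listed above, as long as each team can agree on what counts as a good event from the customer’s point of view.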
Accelerating the pace of change
The ultimate test of an organization’s DevOps practices comes when Dev and Ops teams work together to deliver new product features into production, based on the demands of SLO-savvy customers.
Historically in waterfall-style application shops, customer demand was gathered as requirements, and followed by vendor claims to customers. When customers insisted, these delivery claims could be contractually formalized with roadmap SLA declarations — such as ‘you shall be able to import all of your legacy version’s inventory data through our new data converter by Q4 2022.’
By contrast, SLOs support an iterative, DevOps delivery process that embraces constant change. Continuous delivery of code to production is merged with continuous observability of the impact of each change in production, and the resulting SLIs can fulfill existing SLOs while helping to identify new SLOs for improvement.
One cloud services company faced down just such a challenge, as its customers contractually demanded SLAs for fast scaling and high performance with very low error margins that could prove too risky to honor.
By switching their DevOps teams to the Nobl9 SLO Platform, they were able to ‘shift reliability left,’ span multiple monitoring systems for success indicators, and prioritize feature and infrastructure enhancements based on SLOs that would not only meet, but reliably exceed their customers’ original SLA requirements.
The Intellyx Take
SLOs shouldn’t be set in stone, any more than your business strategy or product.
Businesses are on a continuous journey from promising customer delight, to promoting customer enlightenment about the impact of well-managed SLOs.
Customers may not yet know how to ask for SLOs versus demanding SLAs, but soon enough they will discover that the best-run businesses always seem to make the right next move in supporting their needs.