Author: Kit Merker
At Nobl9, we’ve been working hard to build our service level objective (SLO) platform, which takes your existing monitoring data and connects it to your user and business goals. That’s why we’re pleased to announce that Nobl9 now supports Datadog as a monitoring metrics source.
SLOs, which are a small set of service KPIs and goals, help you identify what matters to your users and your business. They can help you optimize cloud overhead, balance features against technical debt, and increase the team’s overall velocity and reliability.
But for SLOs to deliver these benefits, they must be informed by accurate, timely, and relevant data, a great source of which can be your existing metrics and system log data. Datadog is clearly a leader in that space, so it was important to us to integrate with their APIs so that users of the Nobl9 platform can easily pull metrics data for development and management of their SLOs.
Our friend Ilan Rabinovitch, VP Product & Community at Datadog, says, “SLOs are a powerful tool to ensure organizations have clear goals and expectations for their service reliability, and it ties those goals directly to end-user expectations for availability and performance.”
The service integration is now available in beta and will become generally available (GA) along with the Nobl9 platform in the future.
SLOs in a GitOps workflow
Why move SLOs outside your monitoring system itself, you might ask?
First, you may have multiple monitoring tools and need to coordinate or analyze SLOs across them.
Second, if you want to add SLOs to your CI/CD pipeline or “GitOps” workflow, you need a software configuration asset or artifact to add to source control. The Nobl9 SLO YAML gives you a way to create and manage these artifacts in a simple file format.
Third, your non-technical business stakeholders may want to see SLO data in their own reporting. Bringing the SLO definition outside of monitoring allows you to include this subset of key service information in other tools and processes.
Here’s an example:
apiVersion: n9/v1alpha
kind: slo
metadata:
  name: sample-slo
  namespace: default
spec:
  budgetingMethod: Occurrences
  description: An SLO based on metrics from Datadog
  indicator:
    indicatorType: Availability
    metricSource: nobl9-datadog
  sloSet: sample-config
  thresholds:
  - budgetTarget: 0.9995
    countMetrics:
      good:
        datadog:
          query: sum:ingest.ok{*}.as_count()
      total:
        datadog:
          query: sum:ingest.total{*}.as_count()
    displayName: Available
    value: 0
You can use any Datadog query to gather service KPIs (or service level indicators, SLIs) and apply them in Nobl9 to define new SLOs. Note that SLOs can combine metrics from multiple sources, for example, Datadog and Prometheus.
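To make the budgetTarget in the sample concrete, here is a minimal Python sketch of how a count-based error budget could be worked out from the good and total counts. It assumes the Occurrences method compares the ratio of good events to total events against the target; the function name, the dataclass, and the event counts are hypothetical illustrations and are not part of the Nobl9 platform or its API.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    sli: float               # observed good / total ratio
    allowed_bad: float       # bad events the target permits for this total
    actual_bad: int          # bad events actually observed
    budget_remaining: float  # fraction of the error budget still unspent


def occurrences_budget(good: int, total: int, target: float) -> BudgetStatus:
    """Evaluate a ratio SLO by counting events (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    if not 0 <= good <= total:
        raise ValueError("good must be between 0 and total")

    actual_bad = total - good
    allowed_bad = (1 - target) * total
    # Negative values mean the budget has been overspent.
    remaining = 1 - actual_bad / allowed_bad
    return BudgetStatus(good / total, allowed_bad, actual_bad, remaining)


# Made-up counts standing in for the results of the two Datadog queries.
status = occurrences_budget(good=1_999_400, total=2_000_000, target=0.9995)
print(f"SLI: {status.sli:.4%}")
print(f"Bad events: {status.actual_bad} of about {status.allowed_bad:.0f} allowed")
print(f"Error budget remaining: {status.budget_remaining:.0%}")
```

With these made-up numbers, a 0.9995 target allows roughly 1,000 bad events out of 2,000,000, so 600 failures consume about 60% of the budget.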
To apply an SLO from a YAML file in Nobl9, run the following command:
sloctl apply -f sample-slo-datadog.yaml
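In a CI/CD or GitOps pipeline, you might want a quick sanity check on the definition before the apply step runs. The following Python sketch shows one way to do that on an already-parsed SLO document. The checks are illustrative assumptions based only on the sample above, not Nobl9's actual schema validation, and validate_slo is a hypothetical helper. Loading the YAML file into a dictionary is left to whichever YAML parser your pipeline already uses.

```python
from __future__ import annotations

from typing import Any


def validate_slo(doc: dict[str, Any]) -> list[str]:
    """Return a list of problems found in an SLO definition (empty if none).

    The rules are assumptions drawn from the sample SLO, not an official schema.
    """
    problems: list[str] = []

    if doc.get("kind") != "slo":
        problems.append("kind must be 'slo'")

    if not (doc.get("metadata") or {}).get("name"):
        problems.append("metadata.name is required")

    thresholds = (doc.get("spec") or {}).get("thresholds") or []
    if not thresholds:
        problems.append("spec.thresholds must contain at least one entry")

    for i, threshold in enumerate(thresholds):
        target = threshold.get("budgetTarget")
        if not isinstance(target, (int, float)) or not 0 < target < 1:
            problems.append(f"thresholds[{i}].budgetTarget must be between 0 and 1")
        counts = threshold.get("countMetrics") or {}
        for side in ("good", "total"):
            if side not in counts:
                problems.append(f"thresholds[{i}].countMetrics.{side} is missing")

    return problems


# The sample SLO from this post, written out as a Python dictionary.
sample_slo = {
    "apiVersion": "n9/v1alpha",
    "kind": "slo",
    "metadata": {"name": "sample-slo", "namespace": "default"},
    "spec": {
        "budgetingMethod": "Occurrences",
        "thresholds": [
            {
                "budgetTarget": 0.9995,
                "countMetrics": {
                    "good": {"datadog": {"query": "sum:ingest.ok{*}.as_count()"}},
                    "total": {"datadog": {"query": "sum:ingest.total{*}.as_count()"}},
                },
                "displayName": "Available",
                "value": 0,
            }
        ],
    },
}

# A deliberately broken definition: no name, and a target given as a percentage.
broken_slo = {
    "kind": "slo",
    "metadata": {},
    "spec": {"thresholds": [{"budgetTarget": 99.95}]},
}

print(validate_slo(sample_slo))
for problem in validate_slo(broken_slo):
    print("problem:", problem)
```

A pipeline step could fail the build whenever the returned list is non-empty, so that a mistyped target never reaches sloctl apply.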
If you’d like a peek at our beta release (codename Helium), drop us a line at hello@nobl9.com or sign up at nobl9.com.