Nobl9 and Datadog: Better Data Makes Better SLOs

Posted by Brian Singer on August 18, 2020

At Nobl9, we’ve been working hard to build a service level objective (SLO) platform that takes your existing monitoring data and connects it to your user and business goals. That’s why we’re pleased to announce that Nobl9 now supports Datadog as a source of monitoring metrics.

SLOs, a small set of service KPIs and goals, help you identify what matters to your users and your business. They can help you optimize cloud overhead, balance feature work against technical debt, and increase the team’s overall velocity and reliability.

But for SLOs to deliver these benefits, they must be informed by accurate, timely, and relevant data, and your existing metrics and system logs can be a great source of it. Datadog is a leader in that space, so it was important to us to integrate with its APIs so that Nobl9 users can easily pull metrics data to develop and manage their SLOs.

Our friend Ilan Rabinovitch, VP Product & Community at Datadog, says, “SLOs are a powerful tool to ensure organizations have clear goals and expectations for their service reliability, and it ties those goals directly to end-user expectations for availability and performance.”

The service integration is now available in beta and will become GA along with the Nobl9 platform in the future.

SLOs in a GitOps workflow

Why move SLOs outside your monitoring system, you might ask?

First, you may have multiple monitoring tools and need to coordinate or analyze SLOs across them. 

Second, if you want to add SLOs to your CI/CD pipeline or “GitOps” workflow, you need a software configuration asset or artifact to add to source control. The Nobl9 SLO YAML gives you a way to create and manage these in a simple file format.

Third, your non-technical business stakeholders may want to see SLO data in their own reporting. Bringing the SLO definition outside of monitoring allows you to include this subset of key service information in other tools and processes.

Here’s an example:

apiVersion: n9/v1alpha
kind: slo
metadata:
  name: sample-slo
  namespace: default
spec:
  budgetingMethod: Occurrences
  description: An SLO based on metrics from Datadog
  indicator:
    indicatorType: Availability
    metricSource: nobl9-datadog
  sloSet: sample-config
  thresholds:
    - budgetTarget: 0.9995
      countMetrics:
        good:
          datadog:
            query: sum:ingest.ok{*}.as_count()
        total:
          datadog:
            query: sum:ingest.total{*}.as_count()
      displayName: Available
      value: 0

You can use any Datadog query to gather service KPIs (or service level indicators, SLIs) and apply them in Nobl9 to define new SLOs. Note that SLOs can combine metrics from multiple sources, for example, Datadog and Prometheus.
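
To make the 0.9995 budget target in the sample more concrete, here is a minimal Python sketch of the arithmetic behind a good/total ratio SLO. It assumes the usual occurrences-style calculation (good events divided by total events, compared against the target); this post does not spell out how Nobl9 computes budgets internally, so treat it as an illustration only. The function name and the sample counts are hypothetical.

```python
def error_budget_report(good: int, total: int, target: float) -> dict:
    """Summarize a ratio-based SLO from good/total event counts.

    Assumption: the budget is counted in events, i.e. (1 - target) * total
    bad events are allowed over the window being evaluated.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 < target < 1:
        raise ValueError("target must be between 0 and 1")
    if not 0 <= good <= total:
        raise ValueError("good must be between 0 and total")

    bad = total - good
    sli = good / total                   # observed success ratio
    allowed_bad = (1 - target) * total   # size of the error budget in events
    return {
        "sli": sli,
        "bad": bad,
        "allowed_bad": allowed_bad,
        "budget_remaining": allowed_bad - bad,
        "met": sli >= target,
    }


# Hypothetical counts, standing in for the results of the two Datadog queries.
report = error_budget_report(good=999_700, total=1_000_000, target=0.9995)
for key, value in report.items():
    print(f"{key}: {value}")
```

With these made-up numbers, a 0.9995 target over one million requests allows roughly 500 failed requests, and 300 failures leave about 200 in the budget.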

To apply an SLO from a YAML file in Nobl9, run the following command:

sloctl apply -f sample-slo-datadog.yaml
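
In a GitOps setup, a pipeline step might run that same command for every SLO file kept in source control. The Python sketch below shows one way to do that, under some assumptions: the directory name `slos` and both function names are hypothetical, and the only sloctl invocation used is the `sloctl apply -f <file>` form shown above.

```python
import subprocess
from pathlib import Path


def build_apply_commands(slo_dir):
    """Return one `sloctl apply -f <file>` command per YAML file, sorted by path."""
    files = sorted(
        p for p in Path(slo_dir).iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )
    return [["sloctl", "apply", "-f", str(p)] for p in files]


def apply_all(slo_dir):
    """Apply every SLO file in slo_dir, stopping at the first failure."""
    for cmd in build_apply_commands(slo_dir):
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError if sloctl exits non-zero,
        # which fails the pipeline step.
        subprocess.run(cmd, check=True)


# Example (requires sloctl on PATH and a hypothetical "slos" directory):
# apply_all("slos")
```

Keeping command construction separate from execution lets the pipeline log or dry-run the commands before anything is applied.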

If you’d like a peek at our beta release (codename Helium), drop us a line at hello@nobl9.com or sign up at nobl9.com.

