DORA Metrics Without the Spreadsheet: Setting Up DevLake in Your Platform
DORA metrics tell you whether your engineering org is actually healthy or just busy. DevLake is the open-source tool that pulls the data from your existing stack and turns it into answers — without asking engineers to fill in forms.
Every engineering leader says they care about DORA metrics. Fewer actually measure them. And the ones who do often measure them wrong — counting deploys from a spreadsheet, estimating lead time by feel, and declaring the team “elite” based on vibes.
DevLake fixes this. It is an open-source engineering analytics platform that pulls data from your real tools — GitHub, GitLab, Jira, Jenkins, ArgoCD, PagerDuty — correlates it, and produces accurate DORA dashboards without anyone having to manually track anything.
This post covers what DORA metrics actually mean, why most teams measure them badly, and how to get a working DevLake setup in an afternoon.
What DORA Metrics Are (And Aren’t)
The DORA research program identified four metrics that together describe software delivery performance, which in turn predicts organisational performance:
| Metric | What it measures |
|---|---|
| Deployment Frequency | How often you ship to production |
| Lead Time for Changes | Time from commit to production |
| Change Failure Rate | Percentage of deployments that cause an incident |
| Mean Time to Recovery | How long to restore service after a failure |
The benchmarks matter less than the direction, and the published thresholds shift from one annual report to the next. As a rough guide, an elite team deploys multiple times per day, keeps lead time under an hour and change failure rate under 5%, and recovers in under an hour. A low performer ships monthly, takes weeks from commit to prod, breaks things roughly 46% of the time, and takes days to recover.
What DORA metrics are not: a tool for ranking engineers. Lead time measures your deployment pipeline, not individual velocity. Change failure rate measures incident response and testing maturity, not developer carelessness. Measuring these at an individual level is both wrong and counterproductive.
Why Most Teams Measure Them Badly
The typical approach: ask someone to pull deployment data from CI, pull incident data from PagerDuty, and join them in a spreadsheet. This fails for several reasons:
- Deployment data lives in three different systems and none of them agree on what counts as a “deployment”
- Lead time requires linking a production deployment back to the commits it contains, which requires knowing what SHA was deployed and when
- Change failure rate requires correlating deployments with incidents in the same time window, which requires both systems to have accurate timestamps
- Nobody keeps the spreadsheet updated
DevLake automates all of this by treating your tools as data sources, polling their APIs on a schedule, and building a unified data model that already understands the relationships.
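To make the lead-time bullet concrete, here is a minimal sketch of the linking step. It assumes you already have the main-branch commits in history order and a record of which SHA each deployment shipped. The `Commit` and `Deployment` types and the `lead_times` helper are hypothetical names for illustration, not DevLake's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Commit:
    sha: str
    authored_at: datetime


@dataclass(frozen=True)
class Deployment:
    sha: str  # the commit at the tip of the deployed build
    finished_at: datetime


def lead_times(commits: list[Commit], deployments: list[Deployment]) -> dict[str, timedelta]:
    """Map each commit SHA to the delay before it first reached production.

    `commits` must be in history order (oldest first). A deployment of SHA X
    is assumed to ship X plus every earlier commit that was not yet deployed.
    """
    position = {c.sha: i for i, c in enumerate(commits)}
    result: dict[str, timedelta] = {}
    next_undeployed = 0
    for dep in sorted(deployments, key=lambda d: d.finished_at):
        tip = position.get(dep.sha)
        if tip is None:
            continue  # unknown SHA: this deployment cannot be linked to commits
        for commit in commits[next_undeployed : tip + 1]:
            result[commit.sha] = dep.finished_at - commit.authored_at
        next_undeployed = max(next_undeployed, tip + 1)
    return result


commits = [
    Commit("a1", datetime(2024, 5, 1, 9)),
    Commit("b2", datetime(2024, 5, 1, 10)),
    Commit("c3", datetime(2024, 5, 1, 11)),
]
deployments = [
    Deployment("b2", datetime(2024, 5, 1, 12)),
    Deployment("c3", datetime(2024, 5, 1, 15)),
]
for sha, delay in lead_times(commits, deployments).items():
    print(sha, delay)
# a1 3:00:00
# b2 2:00:00
# c3 4:00:00
```

Note the `continue` branch: if the deployed SHA is missing from your records, the link cannot be made at all, which is exactly where the spreadsheet approach tends to fall over.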
How DevLake Works
DevLake has three layers:
- Plugins — connectors for each data source (GitHub, Jira, Jenkins, etc.) that pull data via API and store it in a normalised schema
- Domain layer — a common data model that maps source-specific concepts (GitHub PR, Jira ticket, ArgoCD deployment) to generic ones (pull request, issue, deployment, incident)
- Grafana dashboards — pre-built DORA dashboards that query the domain layer
The result: you configure your data sources once, and the DORA dashboards populate automatically.
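The idea behind the domain layer can be sketched in a few lines. This is an illustration of the normalisation pattern only: the `DomainDeployment` type, both converter functions, and the payload shapes are hypothetical and do not mirror DevLake's actual schema or any tool's real API response.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DomainDeployment:
    """Tool-agnostic deployment record (hypothetical, for illustration)."""
    source: str
    ref: str
    finished_at: datetime
    succeeded: bool


def from_github_release(release: dict) -> DomainDeployment:
    # Assumed payload shape: {"tag_name": str, "published_at": ISO-8601 str}
    return DomainDeployment(
        source="github",
        ref=release["tag_name"],
        finished_at=datetime.fromisoformat(release["published_at"]),
        succeeded=True,  # a published release is treated as a successful deploy
    )


def from_argocd_sync(sync: dict) -> DomainDeployment:
    # Assumed payload shape: {"revision": str, "finished_at": ISO-8601 str, "phase": str}
    return DomainDeployment(
        source="argocd",
        ref=sync["revision"],
        finished_at=datetime.fromisoformat(sync["finished_at"]),
        succeeded=sync["phase"] == "Succeeded",
    )


records = [
    from_github_release({"tag_name": "v1.2.3", "published_at": "2024-05-01T12:00:00+00:00"}),
    from_argocd_sync({"revision": "abc123", "finished_at": "2024-05-01T13:00:00+00:00", "phase": "Failed"}),
]
# Anything downstream only needs to know about DomainDeployment.
print([r.source for r in records if r.succeeded])  # ['github']
```

Once every source is reduced to the same record type, a dashboard query no longer needs to care which tool produced a deployment.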
Setting Up DevLake
Prerequisites
- Docker and Docker Compose
- Access to your data sources (GitHub token, Jira credentials, etc.)
- About 30 minutes for a first evaluation run
Step 1 — Run DevLake with Docker Compose
curl -s https://raw.githubusercontent.com/apache/incubator-devlake/main/docker-compose.yml \
  -o docker-compose.yml
curl -s https://raw.githubusercontent.com/apache/incubator-devlake/main/env.example \
  -o .env
docker compose up -d
DevLake exposes three services:
- localhost:4000 for the DevLake config UI
- localhost:3002 for Grafana (dashboards)
- The MySQL database (internal)
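Containers can take a little while to come up after `docker compose up -d`. If you want to script the wait rather than refresh a browser tab, a small helper along these lines may do. `wait_for_port` is a hypothetical name, and a successful TCP connection only shows that something is listening, not that the application behind it has finished starting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection succeeds, False if the deadline passes first."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Calling `wait_for_port("localhost", 4000)` and `wait_for_port("localhost", 3002)` covers the two ports listed above; adjust them if you changed the Compose file.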
Step 2 — Connect Your GitHub Data Source
In the DevLake UI at localhost:4000, go to Connections → GitHub and create a connection with a personal access token that has repo scope.
Then create a project and add a scope — this is where you specify which repos to include. For a monorepo setup, add the single repo. For a multi-repo org, add each repo that counts as a deployable unit.
The GitHub plugin pulls:
- Pull requests (for lead time calculation)
- Commits (for linking PRs to deployments)
- Releases (which DevLake can treat as deployment events)
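Because releases can double as deployment events, it is worth checking which tags a pattern would actually select before you rely on it. The sketch below assumes a semver-style pattern, `v\d+\.\d+\.\d+`, and a made-up tag list; `matching_tags` is a hypothetical helper. Whether DevLake requires the whole tag to match or accepts a match anywhere in it is something to confirm in its documentation, so the helper shows both behaviours.

```python
import re

# Assumed semver-style tag pattern; substitute whatever your releases use.
TAG_PATTERN = r"v\d+\.\d+\.\d+"


def matching_tags(tags: list[str], pattern: str = TAG_PATTERN, whole_tag: bool = True) -> list[str]:
    """Return the tags a pattern selects.

    With whole_tag=True the entire tag must match. With whole_tag=False a
    match anywhere in the tag is enough, so pre-release tags slip through.
    """
    regex = re.compile(pattern)
    test = regex.fullmatch if whole_tag else regex.search
    return [tag for tag in tags if test(tag)]


tags = ["v1.4.0", "v1.4.1-rc1", "nightly-2024-05-01", "v2.0.0"]
print(matching_tags(tags))                   # ['v1.4.0', 'v2.0.0']
print(matching_tags(tags, whole_tag=False))  # ['v1.4.0', 'v1.4.1-rc1', 'v2.0.0']
```

If release candidates get tagged in the same repository, anchoring the pattern (for example `^v\d+\.\d+\.\d+$`) keeps them out of the deployment count under either matching mode.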
Step 3 — Define Deployments
This is the step most setups get wrong. DevLake needs to know what counts as a deployment. There are three options depending on your stack:
Option A: GitHub releases
If you cut a GitHub release on every production deploy, DevLake can use that. In the scope config, set the deployment pattern to match your release tag format:
v\d+\.\d+\.\d+
Option B: CI/CD pipeline runs
If you use GitHub Actions with a named deployment workflow, configure the plugin to treat runs of that workflow as deployments:
deploy-production
Option C: ArgoCD (recommended for Kubernetes shops)
Connect the ArgoCD plugin. It understands sync operations natively and maps them to deployments in DevLake’s domain layer. This gives you the most accurate data because ArgoCD knows exactly when a revision was running in production.
# DevLake ArgoCD connection config
endpoint: https://argocd.your-cluster.internal
token: <argocd-api-token>
Step 4 — Connect Incident Data
Change failure rate and MTTR require incident data. Connect PagerDuty, OpsGenie, or your incident management tool.
For PagerDuty:
Connection type: PagerDuty
API token: <your-pagerduty-token>
DevLake will correlate incidents with deployments by timestamp: if an incident opens within a configurable window after a deployment, it counts as a change failure.
The default window is 24 hours. For high-frequency teams, tighten this to 1–4 hours. For weekly-deploy teams, 72 hours is more realistic.
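To see how much the window size moves the number, here is a minimal sketch of timestamp-based correlation. It is an illustration of the idea described above, not DevLake's implementation: `change_failure_rate` is a hypothetical helper, and attributing each incident to the most recent prior deployment is an assumption made here to avoid double counting.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def change_failure_rate(
    deployments: list[datetime],
    incidents: list[datetime],
    window: timedelta = timedelta(hours=24),
) -> float:
    """Share of deployments followed by at least one incident inside `window`.

    Each incident is attributed to the most recent deployment before it,
    so a single incident never marks more than one deployment as failed.
    """
    if not deployments:
        return 0.0
    ordered = sorted(deployments)
    failed: set[int] = set()  # indices into `ordered`
    for opened_at in incidents:
        idx = bisect_right(ordered, opened_at) - 1
        if idx >= 0 and opened_at - ordered[idx] <= window:
            failed.add(idx)
    return len(failed) / len(ordered)


deploys = [
    datetime(2024, 5, 1, 9),
    datetime(2024, 5, 1, 15),
    datetime(2024, 5, 2, 10),
    datetime(2024, 5, 3, 10),
]
incidents = [
    datetime(2024, 5, 1, 15, 30),  # 30 minutes after the second deploy
    datetime(2024, 5, 2, 20),      # 10 hours after the third deploy
    datetime(2024, 5, 5, 12),      # 50 hours after the fourth deploy
]
print(change_failure_rate(deploys, incidents))                             # 0.5
print(change_failure_rate(deploys, incidents, window=timedelta(hours=4)))  # 0.25
```

Same deployments, same incidents, and the rate halves when the window shrinks from 24 hours to 4. That is why the window should be chosen deliberately and then left alone, so trends stay comparable.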
Step 5 — Configure the DORA Dashboard
Open Grafana at localhost:3002 (default credentials: admin/admin). The DORA dashboard is pre-installed under Dashboards → DevLake → DORA.
Set the time range to at least 90 days for meaningful baselines. The four panels map directly to the four metrics:
- Deployment Frequency: deployments per day/week/month
- Lead Time for Changes: median and 95th percentile, from first commit to deployment
- Change Failure Rate: deployments that caused an incident / total deployments
- MTTR: median incident duration for deployment-correlated incidents
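It helps to know what the median and 95th-percentile panels are telling you before reading them. The sketch below uses made-up lead times and a nearest-rank percentile, which is one common definition; the dashboard's queries may interpolate differently. `percentile` and `deployments_per_week` are hypothetical helpers.

```python
import math
from collections import Counter
from datetime import date
from statistics import median


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def deployments_per_week(days: list[date]) -> dict[tuple[int, int], int]:
    """Count deployments per ISO (year, week number)."""
    counts: Counter = Counter()
    for day in days:
        iso = day.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


lead_times_hours = [2, 3, 3, 4, 5, 6, 8, 9, 12, 70]
print(median(lead_times_hours))          # 5.5
print(percentile(lead_times_hours, 95))  # 70

deploy_days = [date(2024, 5, 6), date(2024, 5, 7), date(2024, 5, 13)]
print(deployments_per_week(deploy_days))  # {(2024, 19): 2, (2024, 20): 1}
```

The median says a typical change ships in under six hours; the 95th percentile says at least one took nearly three days. Both are true, and the gap between them is usually the more interesting number.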
Interpreting the Numbers
A few things to calibrate before drawing conclusions:
Lead time will be longer than you expect. DevLake measures from the first commit in a PR, not from when the PR was created. If engineers commit early and iterate, that time counts. This is correct — it reflects actual cycle time.
Change failure rate is not just about bugs. A deployment that triggers an alert because of a config change you intended counts as a failure if it opens an incident. This is also correct — the system experienced degradation.
MTTR includes detection time. If your monitoring takes 20 minutes to page someone after a bad deploy, that 20 minutes is in your MTTR. Improving alerting latency improves MTTR without changing anything about how you respond.
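Splitting recovery time into detection and response makes the last point easy to check against your own incidents. This is a sketch with invented timestamps; the `Incident` fields are hypothetical, and whether your incident tool records the moment the failure began (rather than the moment someone was paged) depends on how incidents get created.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass(frozen=True)
class Incident:
    failure_started_at: datetime  # e.g. when the bad deploy finished
    paged_at: datetime            # when monitoring alerted a human
    resolved_at: datetime         # when service was restored

    @property
    def detection(self) -> timedelta:
        return self.paged_at - self.failure_started_at

    @property
    def response(self) -> timedelta:
        return self.resolved_at - self.paged_at

    @property
    def time_to_recovery(self) -> timedelta:
        return self.resolved_at - self.failure_started_at


def mean_minutes(durations: list[timedelta]) -> float:
    """Average a list of durations and express the result in minutes."""
    return mean(d.total_seconds() for d in durations) / 60


incidents = [
    Incident(datetime(2024, 5, 1, 10, 0), datetime(2024, 5, 1, 10, 20), datetime(2024, 5, 1, 11, 0)),
    Incident(datetime(2024, 5, 2, 14, 0), datetime(2024, 5, 2, 14, 5), datetime(2024, 5, 2, 14, 35)),
]
print(mean_minutes([i.time_to_recovery for i in incidents]))  # 47.5
print(mean_minutes([i.detection for i in incidents]))         # 12.5
print(mean_minutes([i.response for i in incidents]))          # 35.0
```

In this example roughly a quarter of the recovery time passes before anyone is paged, so faster alerting would cut MTTR with no change to the response itself.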
Running DevLake in Production
The Docker Compose setup is fine for evaluation. For a persistent installation:
- Use an external MySQL or PostgreSQL instance for the database
- Set up a Kubernetes deployment using the official Helm chart
- Configure data collection to run on a schedule (DevLake supports cron expressions per pipeline)
- Put Grafana behind your SSO if engineers will use it directly
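An external database is typically handed over as a connection URL, and generated passwords often contain characters that break URL parsing. A small sketch, assuming the consumer parses the value as a standard URL of the form `scheme://user:password@host:port/name`; `database_url` is a hypothetical helper and the credentials are placeholders.

```python
from urllib.parse import quote


def database_url(user: str, password: str, host: str, port: int, name: str, scheme: str = "mysql") -> str:
    """Build scheme://user:password@host:port/name with the credentials percent-encoded."""
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{name}"


print(database_url("devlake", "p@ss/word", "mysql.internal", 3306, "devlake"))
# mysql://devlake:p%40ss%2Fword@mysql.internal:3306/devlake
```

In practice the password would come from a secret store rather than a literal in a values file.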
# Example Helm values (key names are illustrative; check them against the chart's values.yaml)
devlake:
  database:
    externalUrl: 'mysql://devlake:password@mysql.internal:3306/devlake'
grafana:
  enabled: true
  ingress:
    enabled: true
    host: devlake.internal.your-company.com
The Bottom Line
DORA metrics are only useful if they’re accurate, and they’re only accurate if they’re automated. Manual measurement introduces bias, gaps, and the temptation to game the numbers.
DevLake takes about half a day to set up properly. After that, the data is continuous, consistent, and requires no human effort to maintain. The hard part isn’t the tooling — it’s deciding what your deployment event actually is and making sure every production change goes through a path that DevLake can see.
Get that right, and you have an honest baseline. From a baseline, you can improve. From vibes, you can’t.
