FLX-ENG-RFC-004: DORA Metrics Framework · 5 Metrics × Baseline × Threshold × Cadence
| Field | Value |
|---|---|
| RFC ID | FLX-ENG-RFC-004 |
| Status | Active — Weeks 2–3 |
| Author | Arun Singh, Senior Distinguished Engineer / Architect (Consulting) |
| Reviewers | Raja Choudhary (sign-off), Rahul (Eng Lead) |
| Scope | Defines all 5 DORA metrics: collection method, baseline, 3-month target, reporting cadence |
| Parent Epic | GitHub Issue #4 — [EPIC] DORA Metrics Framework |
| Priority | P0-CRITICAL |
| Related Issues | #30 (Lead Time), #31 (Deploy Freq), #32 (Recovery Time), #33 (Fail Rate), #34 (Rework Rate) |
TL;DR
This RFC is the single source of truth for all DORA measurements in this engagement. It defines what each metric means, how it is collected (passive CodePulse collection plus manual supplement), what the baseline looks like, the elite-performer target, and the reporting cadence. Each metric maps to one GitHub issue for tracking.
1. Why DORA
DORA (DevOps Research & Assessment) is the industry standard for measuring software delivery performance. More than 10 years of research across 33,000+ respondents indicates that teams performing well on all 4 DORA metrics deliver better business outcomes: they are 2× more likely to exceed profitability goals and have a 50% lower change fail rate.
For Flexli, the 5-metric suite (4 standard DORA + 1 custom Rework Rate) provides:
- An objective before/after comparison for the engagement's value
- A shared language between engineering (Rahul, Tushar) and business (Raja)
- An input to the defect catalog (high change fail rate → find root cause)
2. Metric Definitions & Collection
Metric 1 · Change Lead Time (GitHub Issue #30)
Priority: P1 | Effort: 0.5 hr active
| Property | Value |
|---|---|
| Definition | Time from first commit on a feature branch to that commit running in production |
| Measurement point | PR merge timestamp → production deployment timestamp (a proxy for the definition above: time from first commit to PR merge is not captured) |
| Data source | CodePulse passive collection (GitHub PR events + deployment events) |
| Elite threshold | < 1 hour |
| High performer | 1 hour – 1 week |
| Medium performer | 1 week – 1 month |
| Low performer | > 1 month |
Collection steps:
1. CodePulse automatically captures PR merge and deployment events
2. Lead time = deployment_timestamp - pr_merge_timestamp per PR
3. Week 3 action (US-3.1 #35): read the P50, P90, P99 values from the CodePulse dashboard
4. Document in baseline report as: Current P50 lead time: X hours
5. Target (set in US-3.2 #36): Raja to approve 3-month target (suggest: current P50 → current P50 × 0.5)
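If the CodePulse dashboard values ever need to be cross-checked by hand, the lead-time formula and the P50/P90 readings can be sketched as below. This is a minimal sketch, not CodePulse's own method: the input layout (one line per PR with merge and deployment times in Unix seconds), the function names, and the sample timestamps are all assumptions made for illustration. Lead times are kept in whole minutes to avoid rounding; divide by 60 for the hours used in the baseline report.

```shell
# Hypothetical input: one line per PR, "<pr_number> <pr_merge_epoch> <deployment_epoch>".
# CodePulse's real export format is not assumed here.

# Lead time per PR in whole minutes: deployment_timestamp - pr_merge_timestamp.
lead_times_minutes() {
    awk '{ print int(($3 - $2) / 60) }'
}

# Nearest-rank percentile: the value at position ceil(p/100 * n) in sorted order.
percentile() {
    p="$1"
    sort -n | awk -v p="$p" '
        { v[NR] = $1 }
        END {
            if (NR == 0) { print "n/a"; exit }
            rank = int((p * NR + 99) / 100)   # integer form of ceil(p * NR / 100)
            if (rank < 1) rank = 1
            print v[rank]
        }'
}

# Made-up sample: five PRs merged at t=0 and deployed 1h, 2h, 3h, 4h and 24h later.
sample='101 0 3600
102 0 7200
103 0 10800
104 0 14400
105 0 86400'

printf '%s\n' "$sample" | lead_times_minutes | percentile 50   # prints 180 (minutes)
printf '%s\n' "$sample" | lead_times_minutes | percentile 90   # prints 1440 (minutes)
```

With a sample this small the P90 is simply the slowest PR, which is one reason the baseline should always be reported together with its sample size.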
Baseline capture (US-5.1 step-by-step):
1. Open CodePulse → Metrics → Change Lead Time
2. Set date range: last 30 days (or all available data if < 30 days)
3. Note: P50 (median), P90, P99 values in hours
4. Note: number of PRs in sample
5. Screenshot dashboard → save to docs/dora-baseline/lead-time-baseline.png
6. Commit numbers to docs/dora-baseline/BASELINE-NUMBERS.md
Metric 2 · Deployment Frequency (GitHub Issue #31)
Priority: P1 | Effort: 0.5 hr active
| Property | Value |
|---|---|
| Definition | How often the team deploys to production |
| Measurement point | Merge to main branch (or tag push if semver tagging is in use) |
| Data source | CodePulse passive collection |
| Elite threshold | Multiple per day |
| High performer | Once per day – once per week |
| Medium performer | Once per week – once per month |
| Low performer | < once per month |
Collection steps:
1. CodePulse automatically counts deployments per time period
2. Week 3 action: read "deployments per week" average from CodePulse
3. Document: Current deployment frequency: X deploys per week (averaged over N weeks)
4. Target: Raja approves; suggest an increase of 50% over 3 months
Baseline capture (US-5.2 step-by-step):
1. Open CodePulse → Metrics → Deployment Frequency
2. Set date range: last 30 days
3. Note: average deployments per week
4. Note: deployment dates (to identify any multi-week gaps = deployment risk)
5. Screenshot → docs/dora-baseline/deploy-freq-baseline.png
6. Commit to BASELINE-NUMBERS.md
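The two numbers captured above (average deployments per week and any multi-week gaps) can also be derived from a plain list of deployment timestamps. The sketch below assumes one Unix-seconds timestamp per line; the function names and the sample data are hypothetical and do not reflect any CodePulse export.

```shell
# Assumption: one deployment timestamp per line, in Unix seconds.

# deploys_per_week <window_days>: average deployments per week over the window.
deploys_per_week() {
    awk -v days="$1" 'NF { n++ } END { printf "%.2f\n", n * 7 / days }'
}

# Longest gap, in whole days, between consecutive deployments.
longest_gap_days() {
    sort -n | awk '
        NR > 1 && ($1 - prev) > gap { gap = $1 - prev }
        { prev = $1 }
        END { print int(gap / 86400) }'
}

# Made-up sample: 8 deployments on days 0, 1, 2, 3, 20, 21, 25 and 27 of a 28-day window.
deploys='0
86400
172800
259200
1728000
1814400
2160000
2332800'

printf '%s\n' "$deploys" | deploys_per_week 28   # prints 2.00
printf '%s\n' "$deploys" | longest_gap_days      # prints 17
```

In this sample the weekly average looks healthy, but the 17-day gap is exactly the kind of multi-week pause that step 4 asks to be noted.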
Metric 3 · Failed Deployment Recovery Time (GitHub Issue #32)
Priority: P1 | Effort: 1 hr active (some manual research needed)
| Property | Value |
|---|---|
| Definition | Mean time to restore service after a production incident or failed deployment |
| Measurement point | Incident open timestamp → incident closed (service restored) timestamp |
| Data source | GitHub issues with incident label + CodePulse hotfix detection |
| Elite threshold | < 1 hour |
| High performer | < 1 day |
| Medium performer | 1 day – 1 week |
| Low performer | > 1 week |
Important: This metric requires linking production incidents to code fixes. If no incident tracking system exists yet, this metric must be collected partially manually.
Collection steps:
1. Ask Rahul: "What production incidents occurred in the last 3 months? How were they resolved?"
2. For each incident: record incident_start, incident_end, hotfix_PR_number
3. CodePulse will detect PRs from hotfix/* branches; confirm these match the manual list
4. If no formal incident log exists: this metric will be self-reported for baseline → note as "self-reported, not yet instrumented"
5. Note the raw times and compute mean: MTTR = sum(incident_durations) / count(incidents)
6. If zero incidents in period: record as "0 incidents in baseline window", not as MTTR = 0
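The MTTR formula and the zero-incident rule from the steps above can be sketched as follows. The input layout (incident_start, incident_end and hotfix_PR_number per line, times in Unix seconds), the function name, and the sample incidents are assumptions for illustration only.

```shell
# Hypothetical input: one incident per line,
# "<incident_start_epoch> <incident_end_epoch> <hotfix_PR_number>".

# MTTR = sum(incident_durations) / count(incidents), reported in hours.
# With no incidents, report that fact instead of a misleading MTTR of 0.
mttr_hours() {
    awk 'NF >= 2 { total += $2 - $1; n++ }
         END {
             if (n == 0) { print "0 incidents in baseline window"; exit }
             printf "%d incidents; MTTR = %.1f hours\n", n, total / n / 3600
         }'
}

# Made-up sample: three incidents lasting 1 h, 2 h and 6 h (PR numbers are invented).
incidents='0 3600 41
10000 17200 57
50000 71600 63'

printf '%s\n' "$incidents" | mttr_hours   # prints: 3 incidents; MTTR = 3.0 hours
```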
Baseline capture (US-5.3 step-by-step):
1. Open CodePulse → Metrics → Mean Time to Restore
2. Note: automated hotfix detection results
3. Cross-reference with manual incident list from Rahul
4. Compute mean time if CodePulse value is incomplete
5. Document: "N incidents in baseline window; MTTR = X hours (Y% from CodePulse, Z% manual)"
6. Commit to BASELINE-NUMBERS.md
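Once the MTTR is documented, its DORA band follows directly from the threshold table for this metric. A minimal sketch of that lookup, with a hypothetical function name and the boundaries taken from the table above (24 hours per day, 168 hours per week):

```shell
# recovery_band <mttr_hours>: map a recovery time in hours to the band
# defined in the Metric 3 threshold table.
recovery_band() {
    awk -v h="$1" 'BEGIN {
        if (h < 1)         print "Elite"    # < 1 hour
        else if (h < 24)   print "High"     # < 1 day
        else if (h <= 168) print "Medium"   # 1 day to 1 week
        else               print "Low"      # > 1 week
    }'
}

recovery_band 3.0   # prints High
```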
Metric 4 · Change Fail Rate (GitHub Issue #33)
Priority: P1 | Effort: 0.5 hr active
| Property | Value |
|---|---|
| Definition | % of deployments that result in a rollback or hotfix within 24 hours |
| Measurement point | Any hotfix/* PR merged within 24 hours of a production deployment |
| Data source | CodePulse passive collection (branch name pattern detection) |
| Elite threshold | < 5% |
| High performer | 5% – 10% |
| Medium performer | 11% – 25% |
| Low performer | > 25% |
Collection steps:
1. CodePulse automatically identifies hotfix/* branches and links them to preceding deployments
2. Requires hotfix/ branch naming convention; confirm with Rahul during Week 1 sync
3. If hotfix/ convention not used: manually identify emergency fixes from git log
4. Change Fail Rate = count(deployments_followed_by_hotfix_within_24h) / count(total_deployments)
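When the rate has to be calculated manually, the formula in step 4 can be sketched as below. The tagged input format ("deploy" or "hotfix" followed by a Unix-seconds timestamp), the function name, and the sample events are assumptions for illustration; how CodePulse links hotfixes to deployments internally is not known here.

```shell
# Hypothetical input: one event per line, either "deploy <epoch>" or "hotfix <epoch>"
# (hotfix = merge time of a hotfix/* PR).

# A deployment counts as failed if any hotfix was merged within 24 h (86400 s) after it.
change_fail_rate() {
    awk '
        $1 == "deploy" { d[++nd] = $2 + 0 }
        $1 == "hotfix" { h[++nh] = $2 + 0 }
        END {
            if (nd == 0) { print "no deployments in window"; exit }
            for (i = 1; i <= nd; i++)
                for (j = 1; j <= nh; j++)
                    if (h[j] > d[i] && h[j] - d[i] <= 86400) { failed++; break }
            printf "%d of %d deployments failed (%.1f%%)\n", failed, nd, 100 * failed / nd
        }'
}

# Made-up sample: 4 deployments; one hotfix lands 1 h after the first deployment,
# another lands 25 h after the third (outside the 24 h window).
events='deploy 0
deploy 100000
deploy 200000
deploy 300000
hotfix 3600
hotfix 290000'

printf '%s\n' "$events" | change_fail_rate   # prints: 1 of 4 deployments failed (25.0%)
```

Printing the sample size next to the percentage keeps the result aligned with the requirement that every baseline value carries its N.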
Baseline capture (US-5.4 step-by-step):
1. Open CodePulse → Metrics → Change Fail Rate
2. Note: % value and sample size (N deployments in window)
3. If hotfix branch naming is non-standard: `git log --oneline main | grep -i "fix\|revert\|rollback"`
4. Document rate and whether it was CodePulse-computed or manually calculated
5. Commit to BASELINE-NUMBERS.md
Metric 5 · Deployment Rework Rate (GitHub Issue #34)
Priority: P2 | Effort: 1 hr active
| Property | Value |
|---|---|
| Definition | % of deployments where a file changed in deployment N is also changed in deployment N+1 (within 7 days) |
| Measurement point | File-level overlap between consecutive deployments |
| Data source | Hotfix branch detection (CodePulse) + manual git analysis |
| Elite threshold | < 10% |
| Target | < 10% at 3-month mark |
Custom metric: This is a Flexli-specific 5th metric beyond the standard 4 DORA metrics. It measures re-work — how often a just-deployed change needs to be immediately fixed.
Collection steps:
```shell
# Manual computation via git log
git log --oneline --merges main | head -20   # list the last 20 merges to main

# For three consecutive deployments (merge commits), oldest first.
# Pass the SHAs as arguments: <sha_before_N> <deploy_N_sha> <deploy_N1_sha>
prev="$1"; n="$2"; n1="$3"

# git diff A B lists what changed going from A to B, i.e. the files changed by deployment B
git diff --name-only "$prev" "$n" > /tmp/deploy-N-files.txt
git diff --name-only "$n" "$n1"   > /tmp/deploy-N1-files.txt

# Find overlap (comm requires sorted input; <(...) needs bash)
comm -12 <(sort /tmp/deploy-N-files.txt) <(sort /tmp/deploy-N1-files.txt)

# Count overlapping files / total files in deploy N = rework rate for that deployment
```
Baseline capture:
1. Identify last 10 deployment pairs from git log
2. Compute rework rate for each pair using the script above
3. Average across all pairs
4. Document: "Rework rate = X% (averaged over N deployment pairs)"
5. Commit to BASELINE-NUMBERS.md + commit script to scripts/dora/rework-rate.sh
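The per-pair rate and the averaging step could be wrapped in two small helpers, for example as a starting point for scripts/dora/rework-rate.sh. This is a sketch under assumptions: the function names are hypothetical, the file lists are the `--name-only` outputs from the collection steps, and the rate follows the file-level computation (overlapping files / total files in deploy N), truncated to a whole percent. The 7-day window from the definition is not checked here.

```shell
# rework_pct <files_N> <files_N1>: percentage of files changed in deploy N
# that were changed again in deploy N+1 (whole percent, truncated).
rework_pct() {
    tmp=$(mktemp -d)
    sort -u "$1" > "$tmp/a"
    sort -u "$2" > "$tmp/b"
    total=$(awk 'END { print NR }' "$tmp/a")
    overlap=$(comm -12 "$tmp/a" "$tmp/b" | awk 'END { print NR }')
    rm -rf "$tmp"
    if [ "$total" -eq 0 ]; then echo 0; else echo $(( 100 * overlap / total )); fi
}

# average: mean of the per-pair percentages read from stdin, one per line.
average() {
    awk '{ s += $1 } END { if (NR) printf "%.1f\n", s / NR }'
}

# Made-up file lists for one deployment pair: 2 of the 4 files in deploy N reappear.
work=$(mktemp -d)
printf '%s\n' src/a.py src/b.py src/c.py src/d.py > "$work/deploy-N-files.txt"
printf '%s\n' src/b.py src/d.py src/e.py          > "$work/deploy-N1-files.txt"

rework_pct "$work/deploy-N-files.txt" "$work/deploy-N1-files.txt"   # prints 50
printf '50\n0\n' | average                                          # prints 25.0
```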
3. Baseline Report Template
The output of all 5 metrics feeds into docs/dora-baseline/BASELINE-NUMBERS.md:
```markdown
# DMS DORA Baseline Numbers

Date: YYYY-MM-DD
Collection window: YYYY-MM-DD → YYYY-MM-DD (N days)
Tool: CodePulse SaaS + manual supplement

| Metric | Baseline Value | Sample Size | DORA Band | 3-Month Target |
|--------|---------------|-------------|-----------|----------------|
| Change Lead Time (P50) | X hours | N PRs | Medium/High/Elite | Y hours |
| Deployment Frequency | X/week | N deployments | | Y/week |
| Recovery Time (MTTR) | X hours | N incidents | | Y hours |
| Change Fail Rate | X% | N deployments | | Y% |
| Rework Rate | X% | N deploy pairs | | Y% |

## Notes
- Recovery Time: N incidents in baseline window (self-reported / CodePulse)
- Rework Rate: computed manually using scripts/dora/rework-rate.sh
```
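If the numbers are gathered by script, each table row of the template can be emitted in the same column order. A minimal sketch with a hypothetical helper name; the values in the example call are invented and only illustrate the format.

```shell
# baseline_row <metric> <baseline_value> <sample_size> <dora_band> <target>
# Prints one Markdown table row in the column order of the baseline template.
baseline_row() {
    printf '| %s | %s | %s | %s | %s |\n' "$1" "$2" "$3" "$4" "$5"
}

# Invented example values, not real baseline data.
baseline_row "Change Fail Rate" "25.0%" "4 deployments" "Medium" "12.5%"
```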
4. Dependencies
| Dependency | Required for |
|---|---|
| CodePulse active (US-1.6 #14) | All 5 metrics |
| Week 2 collection window (US-2.1 #17) | Data availability in Week 3 |
| Hotfix branch naming convention confirmed | Metrics 3, 4, and 5 (all rely on hotfix detection) |
| Raja sign-off on 3-month targets (US-3.2 #36) | Final report |
5. Success Criteria
- All 5 DORA metrics have baseline values documented in BASELINE-NUMBERS.md
- Each metric value includes sample size (not just a number)
- DORA performance band identified for each metric (Elite/High/Medium/Low)
- 3-month targets agreed with Raja and committed
- Collection methodology documented (CodePulse vs. manual) for each metric