FLX-ENG-RFC-003 — DORA Baseline Collection + Objective CI
| Field | Value |
|---|---|
| RFC ID | FLX-ENG-RFC-003 |
| Status | Active — Week 2 (2026-07-06 → 2026-07-12) |
| Author | Arun Singh, Senior Distinguished Engineer / Architect (Consulting) |
| Reviewers | Raja Choudhary, Rahul (Eng Lead) |
| Scope | 7-day passive DORA data collection via CodePulse; CI gate validation pair session |
| Parent Epic | GitHub Issue #2 — [EPIC] Week 2 · DORA Baseline Collection + Objective-Review CI |
| Milestone | MS#2 — due 2026-07-12 |
| Priority | P0-CRITICAL |
| Related Issues | #17, #18, #19 (and DORA metric stories #30–#34 from EPIC #4) |
TL;DR
Week 2 runs passively: CodePulse collects real PR and deployment data for ≥7 days. In parallel, a 1-hour pair session with the DMS developer validates that the 10 CI gates (Epic #6) are correctly tuned for the DMS codebase. The week ends with raw DORA data ready for Week 3 analysis.
1. Problem
DORA metrics are only valid if measured against real production deployments and real PR merge events. A 7-day minimum collection window is required to capture at least 2–3 deployment cycles and sufficient PR throughput to compute lead time and change fail rate. Week 2 is the collection window — analysis happens in Week 3.
The CI gates defined in Epic #6 must be validated against DMS-specific characteristics (test project name, threshold calibration, file exclusions) before they are applied as merge gates in Week 3. A pair session catches misconfiguration early.
2. Step-by-Step Tasks
Task 1 · US-2.1 — CodePulse Passive DORA Data Collection (GitHub Issue #17)
Priority: P1 | Effort: 0 hr active (passive monitoring) | Owner: Arun Singh (monitoring)
Setup verification (Day 1 of Week 2):

1. Confirm CodePulse is showing PR events from the previous week (Week 1 first-pass scans)
2. Verify the following events are being captured:
   - PR opened events (author, branch, target)
   - PR merged events (merge time, commit SHA)
   - Deployment events (from main merge or tag push)
3. Check "Data completeness" indicator in CodePulse; it should show ≥ 90% of PRs captured
4. If <90%: review GitHub webhook → re-authorize CodePulse app on org
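The ≥ 90% check in step 3 can be cross-checked by hand by comparing the PR numbers GitHub reports against the ones visible in CodePulse. The following is a minimal sketch under that assumption; `pr_completeness` and the PR numbers are hypothetical, and this is not how CodePulse computes its own indicator.

```python
def pr_completeness(github_prs, codepulse_prs):
    """Return (ratio, missing): share of GitHub PRs also seen in CodePulse."""
    expected = set(github_prs)
    if not expected:
        # Nothing to capture, so nothing is missing.
        return 1.0, []
    captured = expected & set(codepulse_prs)
    missing = sorted(expected - captured)
    return len(captured) / len(expected), missing


# Hypothetical PR numbers, for illustration only.
github_prs = list(range(101, 121))
codepulse_prs = [n for n in github_prs if n not in (107, 113, 118)]

ratio, missing = pr_completeness(github_prs, codepulse_prs)
needs_reauthorization = ratio < 0.90  # mirrors step 4 above
print(f"{ratio:.0%} captured, missing PRs: {missing}")
```

Listing the missing PR numbers, rather than only the ratio, makes it easier to tell a webhook gap (a contiguous run of missing PRs) from sporadic drops.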
Daily check (takes 5 minutes per day):
Day 1: Confirm collection active, zero PR gaps
Day 2: Verify deployment event captured for any pushes to main
Day 3: Mid-week spot check — any PRs merged today appear in dashboard?
Day 5: Confirm collection window is ≥7 days from CodePulse activation
Day 7: Export raw data snapshot (CSV) — commit to docs/dora-baseline/week2-raw-export.csv
DORA metrics being passively collected (per FLX-ENG-RFC-004):

| Metric | What CodePulse measures |
|--------|------------------------|
| Change Lead Time | PR merge time → deployment time |
| Deployment Frequency | Count of production deployments per day/week |
| Recovery Time | Incident open → resolution time (requires PagerDuty/GitHub issue link) |
| Change Fail Rate | % of deployments that result in a hotfix within 24 hours |
| Rework Rate | % of PRs that re-touch the same file within 7 days |
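To make two of these definitions concrete, here is a small worked sketch of Change Lead Time and Change Fail Rate. It assumes simplified in-memory records; the function names, the use of a median for lead time, and the sample timestamps are all illustrative assumptions and do not reflect CodePulse's export schema or its internal calculation.

```python
from datetime import datetime, timedelta
from statistics import median


def change_lead_time_hours(changes):
    """Median hours from PR merge to deployment.

    changes: list of (merged_at, deployed_at) datetime pairs (must be non-empty).
    """
    return median(
        (deployed - merged).total_seconds() / 3600 for merged, deployed in changes
    )


def change_fail_rate(deploy_times, hotfix_times, window=timedelta(hours=24)):
    """Share of deployments followed by at least one hotfix within `window`."""
    if not deploy_times:
        return 0.0
    failed = sum(
        1
        for d in deploy_times
        if any(timedelta(0) <= h - d <= window for h in hotfix_times)
    )
    return failed / len(deploy_times)


# Hypothetical sample data, for illustration only.
changes = [
    (datetime(2026, 7, 6, 10), datetime(2026, 7, 6, 14)),  # 4 h
    (datetime(2026, 7, 7, 9), datetime(2026, 7, 7, 11)),   # 2 h
    (datetime(2026, 7, 8, 9), datetime(2026, 7, 8, 18)),   # 9 h
]
deploys = [deployed for _, deployed in changes] + [datetime(2026, 7, 9, 12)]
hotfixes = [datetime(2026, 7, 7, 9)]  # 19 h after the first deployment

lead_time = change_lead_time_hours(changes)
fail_rate = change_fail_rate(deploys, hotfixes)
```

With this sample, the median lead time is 4 hours and one of four deployments counts as failed, which shows why a week with only a handful of deployments yields coarse fail-rate values.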
Exit signal: 7+ days of continuous collection; CodePulse dashboard shows ≥ 5 deployment events or ≥ 10 PR events; raw export committed
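The exit signal combines three conditions, and the event condition is an "or". A minimal sketch of that check, assuming the counts are read off the CodePulse dashboard by hand; `task1_exit_signal` is a hypothetical helper, days are counted inclusively, and continuity of collection is taken on trust rather than verified.

```python
from datetime import date


def task1_exit_signal(start, end, deployment_events, pr_events, export_committed):
    """True when the Task 1 exit signal, as worded above, is met."""
    days_collected = (end - start).days + 1  # inclusive day count (assumption)
    enough_events = deployment_events >= 5 or pr_events >= 10
    return days_collected >= 7 and enough_events and export_committed


# Week 2 window from the header table; event counts are hypothetical.
done = task1_exit_signal(date(2026, 7, 6), date(2026, 7, 12), 2, 12, True)
not_done = task1_exit_signal(date(2026, 7, 6), date(2026, 7, 12), 2, 4, True)
```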
Task 2 · US-2.2 — Pair Session with DMS Developer (GitHub Issue #18)
Priority: P1 | Effort: 1 hr | Owner: Arun Singh + Rahul (or Tushar)
Session goal: Validate the 10 CI gate configurations (Epic #6 issues #20–#29) against the real DMS codebase before treating them as blocking merge gates.
Pre-session preparation (Arun, 30 min before session):

1. Draft preliminary gate configurations based on DMS read-through (Week 1 Task 3)
2. Run each gate locally against DMS and note raw output:
   - dotnet build → current error count
   - dotnet test → current pass/fail count + coverage %
   - semgrep → current finding count by severity
   - trivy → current CVE count by severity
   - gitleaks → any findings?
3. Identify any gates where the initial threshold will cause an immediate fail (red-light gates)
Session agenda (60 min):
00:00 – Context: what CI gates are and why they matter (5 min)
05:00 – Walk through current build + test output together (10 min)
15:00 – Review preliminary thresholds:
- Coverage threshold: current actual% → set threshold at (actual - 5%) for Week 2
- Complexity threshold: what's the current max complexity in DMS?
- SAST threshold: any findings that are false positives to exclude?
- CVE threshold: any packages with known false-positive CVEs?
30:00 – Agree exclusion lists:
- Files/paths excluded from coverage (generated migrations, Meesho-specific adapters)
- Semgrep rules to suppress (document each suppression with reason)
45:00 – Confirm test project name: DistributionServerUnitTest/ (for CI workflow)
50:00 – Open questions and next steps (10 min)
60:00 – End
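The coverage rule in the 15:00 agenda item is simple arithmetic, but "5%" can be read as percentage points or as a relative reduction. The sketch below assumes percentage points; `week2_coverage_threshold` and the sample coverage value are hypothetical.

```python
def week2_coverage_threshold(actual_percent, margin_points=5.0):
    """Coverage gate threshold: measured coverage minus a margin in percentage points."""
    return max(0.0, round(actual_percent - margin_points, 1))


# Hypothetical measured coverage of 62.4% gives a gate of about 57.4%.
threshold = week2_coverage_threshold(62.4)
```

Setting the gate slightly below the measured value keeps the first enforced run from failing on unchanged code, while still catching a meaningful drop.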
Action items from session (Arun commits within 24 hrs):

- Updated threshold values for all 10 gates → update CI gate issues #20–#29
- Agreed suppression list → commit to .semgrepignore and trivy.yaml
- Calibrated coverage threshold → update US-6.2 (#21)

Exit signal: Session notes committed; all 10 gate thresholds documented with rationale
Task 3 · US-2.3 — [P3-Optional] Knowledge-Graph Parser (GitHub Issue #19)
Priority: P3 | Effort: 3–4 hrs | Owner: Arun Singh (if time permits)
Note: This is optional. Only start if Tasks 1 and 2 are complete and Week 2 has buffer time.
Steps (if pursued):

1. Install Neo4j Desktop (free local version) or use AuraDB free tier
2. Write Python parser (scripts/knowledge-graph/parse-dms.py):
   - Parse all .cs files with roslyn-python or tree-sitter-c-sharp
   - Extract: class names, method names, call relationships, interface implementations
   - Generate Cypher CREATE statements
3. Load into Neo4j: neo4j-admin import or CALL apoc.import.json()
4. Run queries:
   - "Show me all callers of IClientService"
   - "What classes have no test coverage?"
   - "What is the depth of call chain from InfeedController to DB?"
5. Export graph as PNG → commit to docs/architecture/dms-knowledge-graph.png

Exit signal: Neo4j query results show valid DMS call graph; PNG exported
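The "Generate Cypher CREATE statements" step could look roughly like the sketch below. It assumes the parsing step has already produced class names and caller/callee pairs; the `Class` label, the `CALLS` relationship type, and the sample call edge are illustrative assumptions, not a schema defined by this RFC.

```python
def cypher_statements(classes, calls):
    """Build Cypher statements for class nodes and CALLS relationships.

    classes: iterable of class names
    calls:   iterable of (caller, callee) class-name pairs
    """

    def esc(name):
        # Escape backslashes and single quotes for a Cypher string literal.
        return name.replace("\\", "\\\\").replace("'", "\\'")

    stmts = [f"CREATE (:Class {{name: '{esc(c)}'}});" for c in sorted(set(classes))]
    for caller, callee in calls:
        stmts.append(
            f"MATCH (a:Class {{name: '{esc(caller)}'}}), "
            f"(b:Class {{name: '{esc(callee)}'}}) "
            f"CREATE (a)-[:CALLS]->(b);"
        )
    return stmts


# Class names taken from the example queries above; the call edge is hypothetical.
stmts = cypher_statements(
    ["InfeedController", "IClientService"],
    [("InfeedController", "IClientService")],
)
print("\n".join(stmts))
```

Emitting plain statements keeps the parser independent of the load mechanism; the output could, for example, be pasted into Neo4j Browser or piped to cypher-shell.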
3. DORA Metrics — Collection Framework
This section cross-references FLX-ENG-RFC-004 which defines the full 5-metric framework.
What Week 2 collects
| # | Metric | Source | Collection Method |
|---|---|---|---|
| 1 | Change Lead Time | GitHub PRs | CodePulse passive |
| 2 | Deployment Frequency | GitHub main merges | CodePulse passive |
| 3 | Recovery Time | Incidents + hotfix PRs | CodePulse + manual |
| 4 | Change Fail Rate | Hotfix PRs within 24h of deploy | CodePulse passive |
| 5 | Rework Rate | File re-touch patterns | CodePulse passive |
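"File re-touch patterns" leaves some room for interpretation, so the following sketch shows one plausible reading of Rework Rate: a PR counts as rework if it touches a file that an earlier PR touched within the previous 7 days. The function, record shape, and file names are hypothetical, and CodePulse's actual definition may differ.

```python
from datetime import datetime, timedelta


def rework_rate(prs, window=timedelta(days=7)):
    """Share of PRs that re-touch a file changed by an earlier PR within `window`.

    prs: list of (merged_at, files) tuples, where files is a set of paths.
    """
    ordered = sorted(prs, key=lambda pr: pr[0])
    last_touched = {}  # path -> merge time of the most recent PR touching it
    rework = 0
    for merged_at, files in ordered:
        if any(
            f in last_touched and merged_at - last_touched[f] <= window
            for f in files
        ):
            rework += 1
        for f in files:
            last_touched[f] = merged_at
    return rework / len(ordered) if ordered else 0.0


# Hypothetical PRs, for illustration only.
prs = [
    (datetime(2026, 7, 6), {"A.cs"}),
    (datetime(2026, 7, 8), {"A.cs", "B.cs"}),  # re-touches A.cs after 2 days
    (datetime(2026, 7, 9), {"C.cs"}),
    (datetime(2026, 7, 20), {"B.cs"}),         # B.cs last touched 12 days earlier
]
rate = rework_rate(prs)
```

Here only the second PR counts as rework, giving a rate of 0.25.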
What Week 3 does with the data
- Read raw numbers from CodePulse dashboard (US-3.1 #35)
- Calibrate targets with Raja — set 3-month improvement deltas (US-3.2 #36)
- Produce the 4 DORA baseline numbers in the final report
4. Dependencies
| Dependency | Required for |
|---|---|
| CodePulse active (US-1.6 #14) | Task 1 — collection cannot start without active SaaS |
| .NET 8 port complete (RFC-PoC-1.1 #53) | Task 2 — pair session runs gates on .NET 8 |
| Week 1 orientation complete (#11, #13) | Task 2 — threshold calibration requires knowing DMS characteristics |
5. Success Criteria
- ≥7 continuous days of CodePulse DORA data collection
- Raw data export committed to docs/dora-baseline/
- Pair session held; notes committed; all 10 gate thresholds documented
- At least 4 of 5 DORA metrics have non-zero data in CodePulse (Recovery Time may be zero if no incidents)
- CodePulse "Data Quality" indicator shows ≥ 90%