engineering metrics for engineers david abram sponsors 2 / 36 the kids are playing musical chairs. there are three chairs and four kids left. two consultants are observing the game. consultant 1: "we should add a chair so every kid gets one." consultant 2: "too expensive. we should just reduce the headcount." 3 / 36 hi everyone, i'm david abram • software development consultant • legacy system migrations
• crocoder, inc. (crocoder.dev) • software architecture
• executing strategy initiatives 4 / 36 measuring developer productivity? • easy -> prs merged per dev per week • everyone breaks their prs down into a few lines of code each • devs have over 9000 prs to review • tickets closed? • lines of code? • commits per day? 5 / 36 why leadership loves metrics • visibility
• predictability
• roi
companies want clarity - but software engineering is messy. 6 / 36 why engineers fear metrics • feels like surveillance • turns improvement into judgment • individual scorecards metrics meant to guide teams end up punishing individuals. 7 / 36 the core question ▍ are we effective at turning effort into impact?
the problem isn't that we measure; it's that we often measure the wrong things. 8 / 36 what are engineering metrics? quantitative measures of software development processes, team performance, and system health
• build time, deploy frequency, bug rates • pr review time, lead time, cycle time • developer satisfaction, cognitive load numbers that tell you how work flows through your system 9 / 36 dora: devops research and assessment by google • change lead time
• deploy frequency
• change failure rate
• mean time to recovery
don't use it to evaluate teams against each other - use it to gauge the whole org's devops maturity 10 / 36 dora performance levels metric │ elite │ low ───────────────┼──────────────┼──────────────────── lead time │ < 1 hour │ > 1 month
deploy freq │ multiple/day │ once per 1-6 months
change failure │ 0-15% │ 46-60%
mttr │ < 1 hour │ > 1 week
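the four dora metrics above can be computed from plain deploy records. a minimal sketch in python - the record fields (merged_at, deployed_at, failed, restored_at) are hypothetical, not taken from any specific ci tool, so map them to whatever your pipeline actually logs:

```python
# sketch: the four dora metrics from a list of deploy records.
# field names here are illustrative assumptions, not a real tool's schema.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Deploy:
    merged_at: datetime                      # change merged to main
    deployed_at: datetime                    # change reached production
    failed: bool = False                     # caused an incident?
    restored_at: Optional[datetime] = None   # service recovered (if failed)

def dora_metrics(deploys: list[Deploy], days: int) -> dict:
    """Summarise a non-empty list of deploys over a window of `days` days."""
    lead_times = [(d.deployed_at - d.merged_at).total_seconds() / 3600
                  for d in deploys]
    failures = [d for d in deploys if d.failed]
    recoveries = [(d.restored_at - d.deployed_at).total_seconds() / 3600
                  for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deploys) / days,                    # deploy frequency
        "lead_time_hours": sum(lead_times) / len(lead_times),      # change lead time
        "change_failure_rate": len(failures) / len(deploys),       # change failure rate
        "mttr_hours": (sum(recoveries) / len(recoveries)           # mean time to recovery
                       if recoveries else 0.0),
    }
```

averages hide distribution, so for real reporting you'd likely want percentiles rather than the mean - the sketch just shows where the numbers come from.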
11 / 36 space framework by microsoft • satisfaction
• performance
• activity
• communication
• efficiency
productivity includes people and collaboration. 12 / 36 devex framework (acquired by atlassian) • tool friction • feedback loops • cognitive load focus on daily friction, how work feels. 13 / 36 the fuller picture speed + wellbeing + experience = engineering health
14 / 36 three buckets in every org
▓▓▓ operations │ ▓▓▓ tactics │ ▓▓▓ strategy
developer's world │ team's world │ leadership world
devs, tech leads, pos... │ vps, managers... │ cxo, directors...
each bucket tells a different story - and translating between them properly is genuinely hard. 15 / 36 operations: the developers • build times, flaky tests, pr pickup / review time • focus: how easy is it to get something done?
when experience breaks, productivity disappears. 16 / 36 tactics: teams • throughput, pr size, predictability • focus: are we executing with steady rhythm and quality?
this is about tempo - flow, not speed. 17 / 36 strategy: the leadership • roi, satisfaction, customer impact • focus: are we investing in the right things?
alignment matters - fast in the wrong direction is still wrong. 18 / 36 the balance developer experience — engineers stay productive
team tempo — delivery stays consistent
organizational alignment — work creates impact
all three buckets matter. 19 / 36 red flags: when metrics go wrong ▍ Caution
▍
▍ watch out for these warning signs
• weaponized against individuals • vanity metrics (commits/day, loc) • metric theater • context-free comparisons we've all seen them. 20 / 36 spot the red flag which would you trust? • commits per day • tickets closed • pr review time • developer satisfaction why the others fail: • commits -> easily gamed, no quality signal • tickets -> depends on ticket sizing • pr review time -> actionable friction point ✓ • developer satisfaction -> leading indicator ✓ 21 / 36 metric theater ▍ Warning
▍
▍ when we optimize for the metric, we stop improving the system.
real examples: • track prs merged → devs split every change into 10 tiny prs • track tickets closed → devs cherry-pick easy bugs, avoid hard work • track code coverage → tests that assert true === true
goodhart's law: "when a measure becomes a target, it ceases to be a good measure." 22 / 36 green flags: when metrics work ▍ Tip
▍
▍ signs your metrics are helping, not hurting
• reveal bottlenecks
• justify investment
• pair numbers with context
• stay in the right bucket
good metrics spark better conversations, not fear. 23 / 36 the communication challenge different audiences, different languages • developers -> friction • managers -> flow • executives -> outcomes if you mismatch, you lose them. 24 / 36 match the message audience │ focus │ question ───────────┼────────────┼───────────────────── developers │ experience │ can i get work done?
managers │ tempo │ are we flowing well?
executives │ alignment │ right direction?
25 / 36 translation practice ▒▒▒▒ before ▒▒▒▒ after
"ci is too slow" "we lose 15 hours/week = 0.4 fte of productivity"
translation turns pain into impact. 26 / 36 your turn problem: "pr reviews take too long!"
how would you rephrase this for your manager? common mistakes: • "reviews are slow" - vague • "john takes 5 days" - personal attack • "we need more reviewers" - jumps to solution better: "prs wait x days on average, creating y% idle time. fixing this recovers a full sprint's capacity per quarter."
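the arithmetic behind the "better" phrasing is simple enough to sketch. a back-of-envelope python version - every input (average wait, prs per week, how blocked the author really is) is an illustrative assumption, not a benchmark:

```python
# sketch: turn pr review wait time into recovered capacity per quarter.
# all inputs are illustrative assumptions - plug in your own numbers.
def recovered_capacity(avg_wait_days: float, prs_per_week: int,
                       idle_fraction: float) -> float:
    """Working days per quarter recoverable by removing review wait.

    idle_fraction: share of the wait during which the author is truly
    blocked (people context-switch, so this is well below 1.0).
    """
    weeks_per_quarter = 13
    return avg_wait_days * prs_per_week * idle_fraction * weeks_per_quarter

# e.g. 2-day average wait, 5 prs/week, authors blocked 20% of the wait:
# 2 * 5 * 0.2 * 13 = 26 working days per quarter - more than a sprint.
```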
27 / 36 closing story instead of accepting the bad metric, you translate it: "the real issue is lead time growth from waiting on reviews and builds." from defending yourself to explaining the system. 28 / 36 but, metrics ain't for us! "we're too small to track metrics" => you're not too small to have problems. start with one: what hurts most today? "our work is too unique to measure" => every team deploys code, reviews prs, and fixes bugs. start there. 29 / 36 getting started monday 1. operations: build success rate & time, pr review time
2. tactics: deploy frequency, lead time
3. strategy: ktlo vs new features
4. all around: developer satisfaction survey
tools: even a simple spreadsheet works
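to make "even a simple spreadsheet works" concrete, a minimal python sketch that reads a csv export of ci builds - the column names (`status`, `duration_min`) are assumptions, match them to whatever your ci system actually exports:

```python
# sketch: starter operations metrics from a csv, one row per ci build.
# columns `status` and `duration_min` are assumed names - adapt to your export.
import csv
import io

def build_stats(csv_text: str) -> dict:
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    ok = [r for r in rows if r["status"] == "success"]
    return {
        "success_rate": len(ok) / len(rows),
        "avg_duration_min": sum(float(r["duration_min"]) for r in rows) / len(rows),
    }
```

the same DictReader pattern extends to pr review times or deploy logs - the point is that a csv and twenty lines of code are enough to start.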
start small, measure what hurts most. 30 / 36 preventing metric gaming protection strategies: • never tie metrics to individual performance reviews • measure multiple dimensions (speed + quality) • review metrics quarterly - prune vanity metrics make gaming harder than doing good work. 31 / 36 final takeaway ▍ metrics should be a mirror, not a weapon.
▍ if a metric doesn't make developers' lives better, it's not a good metric.
32 / 36 the slow build you're waiting 40 minutes for ci. tests are flaky. you've brewed a second coffee. which metric proves this is expensive? a) deploy frequency b) build success rate c) mean time to recovery answer: b - builds failing 1 in 3 runs = 15 hours/week lost
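a sketch of how "failing 1 in 3 runs" becomes "15 hours/week lost" - builds per day and minutes lost per failure are illustrative assumptions, not measurements:

```python
# sketch: the arithmetic behind "1 in 3 failing builds = 15 hours/week lost".
# builds_per_day and minutes_per_failure are illustrative assumptions.
def weekly_hours_lost(builds_per_day: int, failure_rate: float,
                      minutes_per_failure: float, workdays: int = 5) -> float:
    """Hours per week spent rerunning and babysitting failed builds."""
    failures_per_day = builds_per_day * failure_rate
    return failures_per_day * minutes_per_failure * workdays / 60

# e.g. 12 builds/day failing 1 in 3, ~45 minutes lost per failure:
# 12 * (1/3) * 45 * 5 / 60 = 15 hours/week
```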
33 / 36 the bottleneck sprint team hit all story points, but features aren't shipping. which metric shows why? a) deploy frequency b) lead time for changes c) change failure rate answer: b - high activity doesn't mean good flow
34 / 36 the money problem dashboards green, velocity up, teams shipping... this isn't moving our "revenue". what's missing? a) alignment between engineering and company goals b) developer satisfaction metrics c) team velocity metrics answer: a - this isn't a metric... but fast in the wrong direction is still wrong
35 / 36 resources & references research & frameworks: • dora state of devops reports - dora.dev • space framework - microsoft research (2021) • devex framework - getdx.com further reading: • "accelerate" by nicole forsgren, jez humble, gene kim • "the space of developer productivity" (acm queue) 36 / 36