engineering metrics for engineers david abram sponsors 2 / 36 the kids are playing musical chairs. there are three chairs and four kids left. two consultants are observing the game. consultant 1: "we should add a chair so every kid gets one." consultant 2: "too expensive. we should just reduce the headcount." 3 / 36 hi everyone, i'm david abram • software development consultant • legacy system migrations
• crocoder, inc. (crocoder.dev) • software architecture
• executing strategy initiatives 4 / 36 measuring developer productivity? • easy -> prs merged per dev per week • everyone breaks their prs down into a few lines of code each • devs have over 9000 prs to review • tickets closed? • lines of code? • commits per day? 5 / 36 why leadership loves metrics • visibility
• predictability
• roi
companies want clarity - but software engineering is messy. 6 / 36 why engineers fear metrics • feels like surveillance • turns improvement into judgment • individual scorecards metrics meant to guide teams end up punishing individuals. 7 / 36 the core question ▍ are we effective at turning effort into impact?
the problem isn't that we measure; it's that we often measure the wrong things. 8 / 36 what are engineering metrics? quantitative measures of software development processes, team performance, and system health
• build time, deploy frequency, bug rates • pr review time, lead time, cycle time • developer satisfaction, cognitive load numbers that tell you how work flows through your system 9 / 36 dora: devops research and assessment by google • change lead time
• deploy frequency
• change failure rate
• mean time to recovery
don't use it to evaluate teams against each other - use it to gauge the whole org's devops maturity 10 / 36 dora performance levels metric │ elite │ low ───────────────┼──────────────┼──────────────────── lead time │ < 1 hour │ > 1 month
deploy freq │ multiple/day │ once per 1-6 months
change failure │ 0-15% │ 46-60%
mttr │ < 1 hour │ > 1 week
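the four dora metrics above can be computed from plain deploy records. a minimal sketch in python - the record fields (merged_at, deployed_at, failed, restored_at) are hypothetical, not taken from any specific ci tool, so map them to whatever your pipeline actually logs:

```python
# sketch: the four dora metrics from a list of deploy records.
# field names here are illustrative assumptions, not a real tool's schema.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Deploy:
    merged_at: datetime                      # change merged to main
    deployed_at: datetime                    # change reached production
    failed: bool = False                     # caused an incident?
    restored_at: Optional[datetime] = None   # service recovered (if failed)

def dora_metrics(deploys: list[Deploy], days: int) -> dict:
    """Summarise a non-empty list of deploys over a window of `days` days."""
    lead_times = [(d.deployed_at - d.merged_at).total_seconds() / 3600
                  for d in deploys]
    failures = [d for d in deploys if d.failed]
    recoveries = [(d.restored_at - d.deployed_at).total_seconds() / 3600
                  for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deploys) / days,                    # deploy frequency
        "lead_time_hours": sum(lead_times) / len(lead_times),      # change lead time
        "change_failure_rate": len(failures) / len(deploys),       # change failure rate
        "mttr_hours": (sum(recoveries) / len(recoveries)           # mean time to recovery
                       if recoveries else 0.0),
    }
```

averages hide distribution, so for real reporting you'd likely want percentiles rather than the mean - the sketch just shows where the numbers come from.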
11 / 36 space framework by microsoft • satisfaction
• performance
• activity
• communication
• efficiency
productivity includes people and collaboration. 12 / 36 devex framework (acquired by atlassian) • tool friction • feedback loops • cognitive load focus on daily friction, how work feels. 13 / 36 the fuller picture speed + wellbeing + experience = engineering health
14 / 36 three buckets in every org
▓▓▓ operations │ ▓▓▓ tactics │ ▓▓▓ strategy
developer's world │ team's world │ leadership world
devs, tech leads, pos... │ vps, managers... │ cxo, directors...
each bucket tells a different story - and translating between them properly is genuinely hard. 15 / 36 operations: the developers • build times, flaky tests, pr pickup / review time • focus: how easy is it to get something done?
when experience breaks, productivity disappears. 16 / 36 tactics: teams • throughput, pr size, predictability • focus: are we executing with steady rhythm and quality?
this is about tempo - flow, not speed. 17 / 36 strategy: the leadership • roi, satisfaction, customer impact • focus: are we investing in the right things?
alignment matters - fast in the wrong direction is still wrong. 18 / 36 the balance developer experience — engineers stay productive
team tempo — delivery stays consistent
organizational alignment — work creates impact
all three buckets matter. 19 / 36 red flags: when metrics go wrong ▍ Caution
▍
▍ watch out for these warning signs
• weaponized against individuals • vanity metrics (commits/day, loc) • metric theater • context-free comparisons we've all seen them. 20 / 36 spot the red flag which would you trust? • commits per day • tickets closed • pr review time • developer satisfaction why the others fail: • commits -> easily gamed, no quality signal • tickets -> depends on ticket sizing • pr review time -> actionable friction point ✓ • developer satisfaction -> leading indicator ✓ 21 / 36 metric theater ▍ Warning
▍
▍ when we optimize for the metric, we stop improving the system.
real examples: • track prs merged → devs split every change into 10 tiny prs • track tickets closed → devs cherry-pick easy bugs, avoid hard work • track code coverage → tests that assert true === true
goodhart's law: "when a measure becomes a target, it ceases to be a good measure." 22 / 36 green flags: when metrics work ▍ Tip
▍
▍ signs your metrics are helping, not hurting
• reveal bottlenecks
• justify investment
• pair numbers with context
• stay in the right bucket
good metrics spark better conversations, not fear. 23 / 36 the communication challenge different audiences, different languages • developers -> friction • managers -> flow • executives -> outcomes if you mismatch, you lose them. 24 / 36 match the message audience │ focus │ question ───────────┼────────────┼───────────────────── developers │ experience │ can i get work done?
managers │ tempo │ are we flowing well?
executives │ alignment │ right direction?
25 / 36 translation practice ▒▒▒▒ before ▒▒▒▒ after
"ci is too slow" "we lose 15 hours/week = 0.4 fte of productivity"
translation turns pain into impact. 26 / 36 your turn problem: "pr reviews take too long!"
how would you rephrase this for your manager? common mistakes: • "reviews are slow" - vague • "john takes 5 days" - personal attack • "we need more reviewers" - jumps to solution better: "prs wait x days on average, creating y% idle time. fixing this recovers a full sprint's capacity per quarter."
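the arithmetic behind the "better" phrasing is simple enough to sketch. a back-of-envelope python version - every input (average wait, prs per week, how blocked the author really is) is an illustrative assumption, not a benchmark:

```python
# sketch: turn pr review wait time into recovered capacity per quarter.
# all inputs are illustrative assumptions - plug in your own numbers.
def recovered_capacity(avg_wait_days: float, prs_per_week: int,
                       idle_fraction: float) -> float:
    """Working days per quarter recoverable by removing review wait.

    idle_fraction: share of the wait during which the author is truly
    blocked (people context-switch, so this is well below 1.0).
    """
    weeks_per_quarter = 13
    return avg_wait_days * prs_per_week * idle_fraction * weeks_per_quarter

# e.g. 2-day average wait, 5 prs/week, authors blocked 20% of the wait:
# 2 * 5 * 0.2 * 13 = 26 working days per quarter - more than a sprint.
```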
27 / 36 closing story instead of accepting the bad metric, you translate it: "the real issue is lead time growth from waiting on reviews and builds." from defending yourself to explaining the system. 28 / 36 but, metrics ain't for us! "we're too small to track metrics" => you're not too small to have problems. start with one: what hurts most today? "our work is too unique to measure" => every team deploys code, reviews prs, and fixes bugs. start there. 29 / 36 getting started monday 1. operations: build success rate & time, pr review time
2. tactics: deploy frequency, lead time
3. strategy: ktlo vs new features
4. all around: developer satisfaction survey
tools: even a simple spreadsheet works
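to make "even a simple spreadsheet works" concrete, a minimal python sketch that reads a csv export of ci builds - the column names (`status`, `duration_min`) are assumptions, match them to whatever your ci system actually exports:

```python
# sketch: starter operations metrics from a csv, one row per ci build.
# columns `status` and `duration_min` are assumed names - adapt to your export.
import csv
import io

def build_stats(csv_text: str) -> dict:
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    ok = [r for r in rows if r["status"] == "success"]
    return {
        "success_rate": len(ok) / len(rows),
        "avg_duration_min": sum(float(r["duration_min"]) for r in rows) / len(rows),
    }
```

the same DictReader pattern extends to pr review times or deploy logs - the point is that a csv and twenty lines of code are enough to start.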
start small, measure what hurts most. 30 / 36 preventing metric gaming protection strategies: • never tie metrics to individual performance reviews • measure multiple dimensions (speed + quality) • review metrics quarterly - prune vanity metrics make gaming harder than doing good work. 31 / 36 final takeaway ▍ metrics should be a mirror, not a weapon.
▍ if a metric doesn't make developers' lives better, it's not a good metric.
32 / 36 the slow build you're waiting 40 minutes for ci. tests are flaky. you've brewed a second coffee. which metric proves this is expensive? a) deploy frequency b) build success rate c) mean time to recovery answer: b - builds failing 1 in 3 runs = 15 hours/week lost
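a sketch of how "failing 1 in 3 runs" becomes "15 hours/week lost" - builds per day and minutes lost per failure are illustrative assumptions, not measurements:

```python
# sketch: the arithmetic behind "1 in 3 failing builds = 15 hours/week lost".
# builds_per_day and minutes_per_failure are illustrative assumptions.
def weekly_hours_lost(builds_per_day: int, failure_rate: float,
                      minutes_per_failure: float, workdays: int = 5) -> float:
    """Hours per week spent rerunning and babysitting failed builds."""
    failures_per_day = builds_per_day * failure_rate
    return failures_per_day * minutes_per_failure * workdays / 60

# e.g. 12 builds/day failing 1 in 3, ~45 minutes lost per failure:
# 12 * (1/3) * 45 * 5 / 60 = 15 hours/week
```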
33 / 36 the bottleneck sprint team hit all story points, but features aren't shipping. which metric shows why? a) deploy frequency b) lead time for changes c) change failure rate answer: b - high activity doesn't mean good flow
34 / 36 the money problem dashboards green, velocity up, teams shipping... this isn't moving our "revenue". what's missing? a) alignment between engineering and company goals b) developer satisfaction metrics c) team velocity metrics answer: a - this isn't a metric... but fast in the wrong direction is still wrong
35 / 36 resources & references research & frameworks: • dora state of devops reports - dora.dev • space framework - microsoft research (2021) • devex framework - getdx.com further reading: • "accelerate" by nicole forsgren, jez humble, gene kim • "the space of developer productivity" (acm queue) 36 / 36