Metrics That Actually Move the Needle for CTOs

After interviewing more than 50 tech leaders on Proxify’s Builders podcast, one theme kept surfacing: metrics. From startups to giants like Microsoft, CTOs of all kinds wrestle with the same question:

Which metrics actually matter—and which ones just create noise?

Modern engineering organizations are flooded with dashboards: deployment frequency, bug counts, lead times, and more. But behind the charts, many CTOs are still asking:

Are we really getting better?

Here’s what seasoned CTOs suggest, plus what research from Stanford University and Google Cloud’s DORA report reveals about measuring real progress.


What Successful CTOs Suggest

Kirill Groshkov, CTO at Natural Cycles

Kirill is candid: “I almost don’t know a single software engineering metric that works!”

Unlike user behavior metrics—which are often deterministic, predictable, and analyzable—engineering work resists simple measurement. Developers are building something new, and context shifts constantly.

But if forced to choose, Kirill tracks:

Codebase size vs. product features delivered over time.

  • The upward force is product growth: business requirements driving more code and features.

  • The downward force is technical care: refactoring, pruning obsolete code, removing dependencies.

When plotted over years, these forces create a picture of whether growth is sustainable or heading toward a “complexity explosion.”

Why it matters: This isn’t about a perfect ratio—it’s about ensuring the codebase evolves in a way that stays maintainable.
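
To make this concrete, here is a minimal sketch in Python (not Kirill’s actual tooling) that samples total codebase size from git history once per year and pairs it with a feature count pulled from a product tracker. The yearly refs and feature numbers are hypothetical placeholders.

```python
import subprocess

def codebase_bytes(ref: str) -> int:
    """Sum the size of every blob tracked at a git ref (a cheap size proxy)."""
    out = subprocess.run(
        ["git", "ls-tree", "-r", "--long", ref],
        capture_output=True, text=True, check=True,
    ).stdout
    total = 0
    for line in out.splitlines():
        size = line.split(None, 4)[3]  # fields: mode, type, sha, size, path
        if size.isdigit():             # submodule entries report "-"
            total += int(size)
    return total

# One ref per year, e.g. resolved with: git rev-list -1 --before=2024-01-01 main
yearly_refs = {2022: "a1b2c3d", 2023: "e4f5a6b", 2024: "c7d8e9f"}  # hypothetical
features_shipped = {2022: 18, 2023: 27, 2024: 31}                  # hypothetical

for year in sorted(yearly_refs):
    print(year, codebase_bytes(yearly_refs[year]), features_shipped[year])
```

Plotting the two series side by side over several years shows whether the downward force of technical care is keeping pace with product growth.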


Jonas Martinsson, CTO at Stravito

Jonas flips the script. For him, the key isn’t velocity—it’s developer curiosity.

He measures curiosity on two axes:

  1. Breadth – how many engineers explore new tools and resources.

  2. Depth – how deeply they engage and share their learnings.

Curious engineers anticipate challenges, innovate solutions, and level up teammates. Over time, curiosity compounds, delivering far greater value than simply “going faster.”
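
As one hypothetical way to put numbers on those two axes, the sketch below scores breadth and depth from a simple learning log, say entries collected from a #learning channel. The log format and team size are illustrative assumptions, not Stravito’s actual system.

```python
# Each entry: (engineer, topic explored, shared back with the team?)
log = [
    ("ana", "duckdb", True),
    ("ana", "htmx", False),
    ("bo", "duckdb", True),
    ("cy", "rust", False),
]
team_size = 6  # hypothetical

# Breadth: how many engineers explore at all. Depth: how much gets shared back.
breadth = len({who for who, _, _ in log}) / team_size
depth = sum(shared for _, _, shared in log) / len(log)

print(f"breadth = {breadth:.0%}, depth = {depth:.0%}")  # breadth = 50%, depth = 50%
```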

Pro tip: Celebrate curiosity. Highlight learning initiatives in all-hands and offsites so intellectual growth is as valued as shipping features.


Oliver Spragg, Former CTO at PlanA and Augmento

Oliver’s takeaway: metrics work best when tied to a specific problem.

For example, during a migration that spiked bug counts, his team tracked the Automated Defect Detection Rate (ADDR):

ADDR = Bugs caught by automated tests / Total bugs reported

This normalized metric measured progress toward the real goal—fewer customer-facing issues—rather than just test coverage or raw bug counts.
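
A minimal sketch of the calculation, assuming a simple bug-record shape rather than Oliver’s actual schema:

```python
from dataclasses import dataclass

@dataclass
class Bug:
    id: str
    caught_by_automation: bool  # did an automated test flag it before a human?

def addr(bugs: list[Bug]) -> float:
    """Automated Defect Detection Rate: bugs caught by automated tests
    divided by total bugs reported."""
    return sum(b.caught_by_automation for b in bugs) / len(bugs) if bugs else 0.0

bugs = [Bug("BUG-101", True), Bug("BUG-102", False), Bug("BUG-103", True)]
print(f"ADDR = {addr(bugs):.0%}")  # ADDR = 67%
```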

Lesson: Metrics are tools for transparency and focus, not blunt instruments for ongoing performance evaluation.


Stanford Research: Code Review as a Signal

Engineering leaders have long known that “not all code is equal.” But until recently, measuring the effort behind code—its complexity, integration pain, or hidden overhead—was elusive.

A 2024 Stanford study (Predicting Expert Evaluations in Software Code Reviews) changes that. Researchers used machine learning on thousands of pull requests to model expert evaluations across three dimensions:

  1. Code complexity – how hard it is to read, understand, or modify.

  2. Coding time – how long it likely took to write.

  3. Implementation time – how long it took to integrate into the system.

These dimensions shift the focus from raw output to cognitive effort.
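
The study’s learned model isn’t reproduced here, but a hypothetical sketch of the kind of pull-request metadata such proxies could be built from might look like this:

```python
from dataclasses import dataclass

@dataclass
class PullRequest:
    files_changed: int
    lines_added: int
    lines_deleted: int
    review_rounds: int
    hours_open: float

def effort_proxies(pr: PullRequest) -> dict[str, float]:
    """Crude, hand-rolled proxies for the three dimensions above,
    NOT the Stanford study's trained model."""
    churn = pr.lines_added + pr.lines_deleted
    return {
        "complexity": churn * pr.files_changed,  # churn spread across many files
        "coding_time": churn,                    # more churn, more writing time
        "implementation_time": pr.review_rounds * pr.hours_open,  # friction to land
    }

print(effort_proxies(PullRequest(12, 480, 150, 4, 72.0)))
```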

Why CTOs Should Care

  • High complexity may point to the need for architectural guardrails, pair programming, or simpler internal libraries.

  • Long implementation time often signals fragmented systems, poor onboarding, or broken modular boundaries.

Instead of punishing “slowness,” these insights highlight where to invest in reducing friction.

Strategic takeaway:

  • Treat pull request metadata as a strategic lens, not a scoreboard.

  • Benchmark effort, not just throughput.

  • Use these signals in retros, 1:1s, and postmortems to spark better conversations: “Why was this so hard?” and “How can we make it easier?”

Bottom line: smarter, context-aware metrics help CTOs see invisible cognitive load—and lead more effectively.


DORA Findings: Scaling Insights to the Organization

Stanford’s research zooms in on individual effort. DORA’s research zooms out to show how engineering practices connect to team health, product outcomes, and business success.

The 2024 Accelerate State of DevOps Report identifies metrics that consistently correlate with stronger organizational outcomes, especially when AI adoption is in play.

The Four Core DORA Metrics

  1. Change lead time – How fast code goes from commit to production.

  2. Deployment frequency – How often changes are deployed.

  3. Change fail rate – What share of deployments cause failures.

  4. Time to recover – How quickly failures are resolved.
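
As an illustration, here is a hedged sketch that computes all four from deployment records; the record shape below is an assumption, not a DORA-mandated schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median

@dataclass
class Deployment:
    committed_at: datetime        # first commit in the change
    deployed_at: datetime
    failed: bool
    restored_at: datetime | None  # when service recovered, if it failed

def dora_metrics(deploys: list[Deployment], window_days: int) -> dict:
    failures = [d for d in deploys if d.failed]
    return {
        "change_lead_time": median(d.deployed_at - d.committed_at for d in deploys),
        "deploys_per_day": len(deploys) / window_days,
        "change_fail_rate": len(failures) / len(deploys),
        "time_to_recover": (
            median(d.restored_at - d.deployed_at for d in failures)
            if failures else timedelta(0)
        ),
    }

now = datetime(2025, 1, 31)
deploys = [
    Deployment(now - timedelta(days=5, hours=30), now - timedelta(days=5),
               failed=False, restored_at=None),
    Deployment(now - timedelta(days=2, hours=8), now - timedelta(days=2),
               failed=True, restored_at=now - timedelta(days=2) + timedelta(hours=3)),
]
print(dora_metrics(deploys, window_days=30))
```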

Together, these metrics provide a holistic view of delivery health. Interpreted wisely, they help leaders:

  • Spot bottlenecks.

  • Balance speed with stability.

  • Earn credibility with the board by tying engineering progress to business resilience.


Beyond the Core: Five Broader Themes

  1. Flow – Deep focus matters. AI adoption improves flow by reducing repetitive work, which correlates with higher quality, speed, and morale.

    • Measure flow (and cognitive load) through lightweight surveys; see the sketch after this list.

    • Cut process overhead and distractions.

  2. Organizational performance – A 25% increase in AI adoption is associated with a 2.3% boost in profitability, efficiency, and customer satisfaction.

    • Use this to justify major transformations.

  3. Product performance – Barely moved by AI adoption (+0.2%).

    • User-centricity and UX remain the real levers here.

  4. Productivity – AI adoption lifts productivity by 2.1%, largely by reducing context switching.

    • Value delivered per unit of effort—not busyness—is what matters.

  5. Team performance – Up by 1.4% with AI, via better collaboration and adaptability.

    • Invest in autonomy, leadership, and psychological safety.
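
For the flow theme above, here is a minimal sketch of survey-based measurement, assuming a simple 1–5 pulse survey whose questions are illustrative rather than DORA’s actual instrument:

```python
from statistics import mean

# Each response: (team, focus time, tooling friction, context switching), rated 1-5
responses = [
    ("payments", 4, 2, 3),
    ("payments", 5, 1, 2),
    ("platform", 2, 4, 5),
]

for team in sorted({r[0] for r in responses}):
    rows = [r for r in responses if r[0] == team]
    flow = mean(r[1] for r in rows)         # higher is better
    load = mean(mean(r[2:]) for r in rows)  # higher is worse
    print(f"{team}: flow {flow:.1f}/5, cognitive load {load:.1f}/5")
```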


Final Takeaways for CTOs

  • From Stanford: Focus on signals of effort (complexity, coding time, implementation time). They reveal hidden friction and guide smarter investments.

  • From DORA: Use delivery and organizational metrics to connect engineering performance to business outcomes.

Metrics should never be scoreboards. They’re conversation starters—tools to ask better questions, spot invisible work, and steer both technology and teams toward sustainable success.

For CTOs, that’s not just better measurement.
It’s better leadership.