RDEL #141: How can engineering leaders calculate the return on their AI investments?
AI acts as an amplifier of existing systems, and a sample first-year ROI of 39% only materializes after teams budget for a J-Curve productivity dip.
Welcome back to Research-Driven Engineering Leadership. Each week, we pose an interesting topic in engineering leadership and apply the latest research in the field to drive to an answer.
AI usage has crossed the line from experiment to default. As that usage scales, the conversation among engineering leaders has shifted from “should we adopt” to “what is all of this actually worth,” and ROI has become a regular topic in staff meetings, board prep, and budget reviews. This week we ask: what does it take to realize ROI from AI-assisted software development?
The context
While AI adoption in enterprise engineering is now near-universal, spanning organizations at different maturity stages, the financial impact is not. Surveys show a striking split: most executives report a return from at least one generative AI use case, while a sizable share of organizations see flat productivity, marginal gains, or even net drag from initiatives that look identical on paper. The question of “why” has stopped being academic: it determines whether AI budgets keep getting renewed.
Part of the difficulty is that engineering ROI is unusually indirect. Code generation is fast to measure, but the value chain runs through quality, throughput, instability, developer experience, and eventually revenue. Each step adds attribution complexity, and the most quantifiable benefit (cost efficiency) often obscures the more important systemic effects. Without a baseline measurement, leaders can’t tell whether a tool delivered a return or simply shifted effort to a different part of the lifecycle.
The research
The DORA team’s 2026 ROI of AI-assisted Software Development report builds on its 2025 study of engineering teams across the industry, combined with a financial framework derived from Google Cloud’s value realization practice. The report synthesizes DORA’s measured effects of AI adoption on engineering outcomes with a structured ROI calculator that models hard costs, the productivity dip during early adoption, and downstream business value.
Key findings include:
AI is an amplifier, not a transformer. AI magnifies the existing strengths of effective teams and the dysfunctions of struggling ones. As the report puts it, “Without this foundation, AI creates localized pockets of productivity that are often lost in downstream chaos.”
Adoption follows a J-Curve. Most organizations encounter a temporary productivity drop before exponential value, driven by the learning curve, the verification tax (reviewing AI output for hallucinations), and pipeline adaptation as downstream processes scale to higher code volume.
“Initiatives often fail not because the technology is flawed but because leadership misinterprets this learning phase as a failure and pulls funding during the inevitable dip.” - DORA report
Individual effectiveness rose the most, but software delivery instability rose too. DORA’s data identifies individual effectiveness as the strongest positive outcome of AI adoption, with instability the second-largest effect overall. Friction and burnout did not meaningfully decrease, suggesting that with AI, “friction moves” rather than disappears.
Productivity gains are highly context-dependent. Research cited in the report finds roughly 35–40% productivity gains on simple, greenfield tasks, but “10% or less” on complex, legacy brownfield code. The amplifier cuts both ways.
A realistic first-year ROI is achievable but not automatic. DORA’s sample calculator yields a 39% first-year ROI and an eight-month payback — but only when the value side ($11.6M) factors in reinvested headcount capacity offset by an “instability tax,” and the cost side ($8.4M) explicitly budgets a 15% productivity drop across the first three months.
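For readers who want to sanity-check the headline number, here is a minimal Python sketch of the basic ROI arithmetic. It assumes the simple definition of ROI as net value divided by total cost, which may not capture every adjustment in DORA's calculator, and the function name is ours, not the report's. Plugging in the rounded totals quoted above gives roughly 38%, close to the reported 39%; we assume the small gap comes from rounding in the published totals.

```python
def first_year_roi(total_value: float, total_cost: float) -> float:
    """Simple ROI: net return divided by total cost (an assumed definition)."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_value - total_cost) / total_cost


# Rounded totals quoted from the report's sample scenario.
sample_value = 11.6e6  # value side, in dollars
sample_cost = 8.4e6    # cost side, in dollars

sample_roi = first_year_roi(sample_value, sample_cost)
print(f"First-year ROI from rounded inputs: {sample_roi:.1%}")
```

The useful part is less the formula than the discipline it forces: both the value and the cost totals have to be written down explicitly, including the productivity dip on the cost side.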
The application
The most important takeaway is that AI investment is a systems decision, not a tooling decision. Buying licenses without investing in the surrounding platform, data, and governance simply accelerates the rate at which existing dysfunction shows up in production. Leaders who measure their baseline before adoption, plan for a J-Curve, and explicitly budget the “tuition cost” of learning are far more likely to realize the modeled returns than those treating AI as a procurement exercise.
Here are a few ways engineering leaders can put this research into practice:
Budget the J-Curve before signing the contract. Build a deliberate productivity-drop assumption (the report uses 15% over three months) into your first-year cost model. This sets honest expectations with finance and protects the initiative from being killed during the dip.
Measure baseline software delivery performance now. Throughput and instability metrics are the early warning system for whether AI is amplifying value or chaos. You cannot calculate ROI without a “before,” and you cannot afford to wait until after rollout to start measuring.
Apply different ROI assumptions to greenfield vs. brownfield work. Don’t extrapolate a 35% gain from a new-service prototype to your legacy monolith. Build at least two scenarios in your model. A 0.8x conservative multiplier on the value side for older codebases is a reasonable starting point.
Reinvest reclaimed capacity into innovation, not headcount cuts. The report is explicit: a headcount-reduction strategy hurts morale and erodes the very productivity AI is supposed to deliver. Frame freed-up time as capacity for higher-value work.
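To make the modeling advice above concrete, here is a rough Python sketch of a two-scenario first-year model. Only the 15%-over-three-months dip, the 35% and 10% gain figures, and the 0.8x multiplier come from the text above. The dollar inputs, the names, and the way the pieces are combined (gains counted only after the dip, the dip treated as a cost) are our own illustrative assumptions, not DORA's calculator.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    productivity_gain: float  # fractional gain, e.g. 0.35 for 35%
    value_multiplier: float   # conservative haircut on the value side


def j_curve_dip_cost(annual_eng_cost: float,
                     dip_fraction: float = 0.15,
                     dip_months: int = 3) -> float:
    """Cost of output lost during the early-adoption productivity dip."""
    return annual_eng_cost * dip_fraction * dip_months / 12


def scenario_roi(annual_eng_cost: float, tooling_cost: float,
                 scenario: Scenario, dip_months: int = 3) -> float:
    """Simple first-year ROI: (value - cost) / cost.

    Assumption: gains accrue only in the months after the dip ends.
    """
    productive_share = (12 - dip_months) / 12
    value = (annual_eng_cost * scenario.productivity_gain
             * scenario.value_multiplier * productive_share)
    cost = tooling_cost + j_curve_dip_cost(annual_eng_cost,
                                           dip_months=dip_months)
    return (value - cost) / cost


# Hypothetical inputs for illustration only.
ANNUAL_ENG_COST = 10_000_000  # loaded engineering cost per year
TOOLING_COST = 500_000        # licenses, enablement, platform work

greenfield = Scenario("greenfield", productivity_gain=0.35, value_multiplier=1.0)
brownfield = Scenario("brownfield", productivity_gain=0.10, value_multiplier=0.8)

for s in (greenfield, brownfield):
    roi = scenario_roi(ANNUAL_ENG_COST, TOOLING_COST, s)
    print(f"{s.name}: first-year ROI {roi:.0%}")
```

With these placeholder inputs the brownfield scenario comes out negative in year one while the greenfield scenario is strongly positive, which is exactly why a single blended assumption can mislead. Swap in your own baseline numbers before drawing any conclusions.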
Whatever your AI roadmap looks like this year, the research suggests the leaders who’ll see real returns are the ones doing the unglamorous foundational work — platform quality, data hygiene, baseline measurement — before the next model upgrade.
Happy Research Tuesday!
Lizzie



