RDEL #94: How do experienced engineers actually review code?
Reviewers build three internal models to decide what to read, question, or approve.
Welcome back to Research-Driven Engineering Leadership. Each week, we pose an interesting topic in engineering leadership, and apply the latest research in the field to drive to an answer.
Code reviews are foundational to modern software development—but how do experienced developers actually read, understand, and evaluate changes? What mental models do they use, and how can we better support those strategies across teams and tools? This week we ask: How do experienced engineers comprehend code during review, and what can leaders do to support more effective, scalable review practices?
The context
Code review is one of the most impactful rituals in engineering culture—used to ensure quality, prevent defects, transfer knowledge, and socialize standards. But it’s also one of the most cognitively demanding tasks engineers do. Unlike writing code, reviewing requires reading code written by others, understanding unfamiliar logic, validating edge cases, and assessing context—all without full access to the original author’s thinking.
As teams grow and engineering velocity increases, reviews are getting larger, more complex, and often asynchronous. The traditional framing of review as a gatekeeping function misses a key truth: good review requires deep comprehension. But what does comprehension look like in practice, and how can we support it systematically?
The research
To answer this, researchers conducted a qualitative study with 10 experienced developers known for doing regular, frequent, and high-quality reviews. They observed 25 real-world review sessions (spanning both open-source and in-house projects) and followed up with semi-structured interviews. They then applied multiple cognitive science theories to understand how reviewers built understanding and made decisions.
Key findings:
Comprehension is strategic and scoped.
Reviewers don’t try to understand everything. They scope their attention based on complexity, risk, and available time. One participant noted, “If it’s too complex, I just ask for a walkthrough.”
Review follows a three-stage workflow: context building, inspection, and decision-making.
Most reviews begin with a skim of the PR title and description (consulted in 84% of observed reviews), then shift to inspecting the code through reading, chunking, testing, or discussion, before making a final judgment.
Mental models drive comprehension.
Reviewers compare the proposed change to three internal models: the actual code, the expected change, and the ideal implementation. Discrepancies between these models trigger comments or change requests.
Information sources vary widely.
In addition to the PR itself, reviewers consulted issue trackers (44%), prior discussion threads (40%), and sometimes external tools like ChatGPT. More experienced reviewers drew on personal system knowledge and conventions.
Comprehension is incremental and interactive.
Reviewers often updated their understanding mid-review, collaborated with teammates, and shaped their final mental models through conversation—not just isolated inspection.
The application
This research challenges the idea that code review is simply “checking for bugs.” Instead, it’s a layered comprehension process guided by strategic scoping and mental model comparison. Engineering leaders should treat review like a cognitive skill—not just a checklist—and design tooling, processes, and training that support how people actually read and reason about code.
Here are a few ways to apply these insights:
Encourage scoping and chunking.
Support reviewers in narrowing the review scope (by commit, file, or feature), and avoid expecting full comprehension of every detail in large reviews.
Make context building easier.
Require clear, high-signal PR descriptions, link related tickets, and summarize intent and rationale up top. This sets reviewers up to build effective mental models faster.
Use automation to remove low-signal distractions.
Reviewers in the study scoped their attention strategically—often skipping minor style issues to focus on whether the change aligned with domain expectations. Use linters and CI checks to catch formatting, naming, and structural consistency issues before review. This frees reviewers to focus on architectural alignment, correctness, and business logic—the places where human judgment matters most. A rough sketch of what this kind of pre-review automation can look like follows below.
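To make the automation and context-building points concrete, here is a minimal, hypothetical sketch using Danger JS, a tool that runs a TypeScript "dangerfile" in CI. The 50-character and 500-line thresholds and the messages are illustrative assumptions, not figures from the study; the idea is simply to nudge authors toward high-signal descriptions and to flag oversized changes before a human reviewer spends attention on them.

```ts
// dangerfile.ts -- a hypothetical pre-review automation sketch run by Danger JS in CI.
import { danger, warn } from "danger";

const pr = danger.github.pr;

// Context building: nudge authors toward high-signal descriptions so the
// reviewer can form a mental model of intent before reading the diff.
if (!pr.body || pr.body.trim().length < 50) {
  warn(
    "This PR has little or no description. Please summarize intent and " +
      "rationale up top and link the related ticket."
  );
}

// Scoping: flag very large changes so reviewers can request a split or a
// walkthrough instead of attempting full comprehension in one pass.
const changedLines = pr.additions + pr.deletions; // illustrative 500-line threshold below
if (changedLines > 500) {
  warn(
    `This PR changes ${changedLines} lines. Consider splitting it by commit ` +
      "or feature, or offering the reviewer a walkthrough."
  );
}
```

Pair something like this with your existing linters and formatters in CI so that, by the time a reviewer opens the diff, the remaining questions are about intent, correctness, and design rather than style.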
—
Happy Research Tuesday!
Lizzie