RDEL #40: Do nudges improve code review completion?
This week, we look at whether nudging code reviewers improves completion times, and what level of intervention is most effective.
Welcome back to Research-Driven Engineering Leadership. Each week, we pose an interesting topic in engineering leadership, and apply the latest research in the field to drive to an answer.
Code reviews are an important practice that many engineering teams use to collaboratively vet software before it is merged. However, they can slow down the development process and delay features, particularly when teammates don't actively engage within a reasonable timeframe. This week, we review research on using nudges in the code review process and ask: do nudges actually improve code review completion?
The context
Once opened, a code review can cycle through a number of states as it moves toward completion. The process involves multiple developers, from the code author to the code reviewer(s). To keep a review moving, the author and reviewers need to be aware of it and available to review and revise the code before it can be merged into the main branch.
It is common for code reviews, and the feature branches behind them, to become long-lived, staying in an "open" state for more than one or two days. Long-lived feature branches can lead to a number of problems, including integration issues, communication challenges, merge conflicts, and delayed feature releases. For these reasons, it is important for teams to track how long code reviews stay open and to reduce the code review cycle time as needed.
The research
Researchers at Microsoft and Delft University of Technology designed an end-to-end service, aptly called Nudge, to determine whether nudges would improve code review completion times. To construct Nudge, the team first designed an ML-based model that predicts the lifetime of a given code review using the data points available to them, initially across 147 repositories and later across 8,000. The team then built an activity detection module to establish the current state of a code review, as well as an actor determination model to identify who needs to take action. The system nudges the right actor directly within the code review, as well as via email.
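For intuition, here is a minimal sketch of how such a pipeline could fit together. The data model, thresholds, and decision rules below are illustrative assumptions, not the authors' implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReviewState:
    """Hypothetical snapshot of a code review (not the paper's data model)."""
    id: int
    author: str
    pending_reviewers: list[str]
    hours_open: float
    hours_since_last_activity: float
    has_unaddressed_comments: bool

def nudge_decision(review: ReviewState, predicted_lifetime_hours: float) -> Optional[dict]:
    """Return who to nudge for an overdue, inactive review, else None."""
    # Lifetime prediction: only consider reviews that have outlived the model's estimate.
    if review.hours_open <= predicted_lifetime_hours:
        return None
    # Activity detection: skip reviews that are already moving.
    if review.hours_since_last_activity < 24:
        return None
    # Actor determination: the author owes a revision if comments are unaddressed,
    # otherwise the blocking reviewers owe feedback.
    actor = [review.author] if review.has_unaddressed_comments else review.pending_reviewers
    return {"review_id": review.id, "notify": actor}

# Example: a review open for 4 days with no recent activity and no unresolved comments
stale = ReviewState(42, "alice", ["bob", "carol"], 96.0, 50.0, False)
print(nudge_decision(stale, predicted_lifetime_hours=36.0))
# -> {'review_id': 42, 'notify': ['bob', 'carol']}
```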
To determine the highest-priority code reviews to nudge, the team performed a correlation analysis of over 22,000 code reviews across 10 repositories at Microsoft. The four features that contributed most to a code review's lifetime (sketched in code after the list below) were:
Day of the week. Code reviews created later in the week tend to sit idle over the weekend.
Average duration of code reviews created by the author. Newer developers, for example, are subject to more thorough reviews and testing.
Number of reviewers on a code review. The more people active on a review, the more comments and questions are raised, which can extend completion time.
Important configuration files being changed. Certain files (e.g., .csproj files) indicate major changes in a project and therefore take longer to test and review.
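As a rough illustration (the encodings and thresholds here are assumptions, not the paper's actual model inputs), these four signals might be turned into features for a lifetime predictor like so:

```python
# Assumed list of "important" file types; the paper cites .csproj as an example.
IMPORTANT_EXTENSIONS = (".csproj", ".sln")

def review_features(created_weekday: int,
                    author_avg_review_hours: float,
                    reviewer_count: int,
                    changed_files: list[str]) -> dict:
    """Encode the four signals above as inputs to a lifetime-prediction model."""
    return {
        # Monday = 0; reviews opened Thursday or later risk idling over the weekend.
        "created_near_weekend": int(created_weekday >= 3),
        # Historical average completion time for this author's reviews.
        "author_avg_review_hours": author_avg_review_hours,
        # More reviewers usually means more comments and questions to resolve.
        "reviewer_count": reviewer_count,
        # Changes to important configuration files take longer to test and review.
        "touches_config": int(any(f.endswith(IMPORTANT_EXTENSIONS) for f in changed_files)),
    }

print(review_features(4, 30.0, 3, ["app/Service.cs", "app/app.csproj"]))
```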
After implementing Nudge, researchers found that:
Code review resolution time decreased by 60% across 147 repositories (8,500 code reviews) when compared to overdue code reviews where Nudge did not send a notification.
Developers responded to the full nudge overwhelmingly positively, with 78% explicitly marking their code review nudge as "resolved".
Notably, the "light" nudge, which only showed code review lifetime estimates without actor identification and alerts, was seen as significantly less helpful. In user interviews, developers preferred that the notification tool ping blocking reviewers directly.
When scaled up to 8,000 repositories, the results were similar: 71.5% of code reviews were resolved positively, and reviews were closed in times consistent with the earlier results.
The application
As developers take on more responsibilities across development, testing, and deployment, it is easy to imagine how code reviews get lost among day-to-day tasks. These findings suggest that nudging blocking teammates on their code reviews has a positive, and significant, impact on completion times.
There are numerous ways that teams can use processes and tools to improve code review speed and keep code reviews from going stale. These include:
Code review notification tools. Tools (or automated scripts; see the sketch after this list) can alert engineers to their code review responsibilities and whether they are blocking others.
Equitable code review distribution. When a minority of reviewers own a majority of reviews, the team will experience bottlenecks in code reviews as well as a loss of knowledge-sharing opportunities. Teams should review the distribution of code reviews and ensure they are spread more evenly across teammates. This can be automated (e.g., via GitHub) so reviews are assigned automatically based on various algorithms.
Team agreement on code reviews. When teams have a shared understanding of how to focus their attention on code reviews, they can more effectively manage their various responsibilities. Create a team agreement that outlines the team's goals for code review completion times, as well as reasonable timelines for when developers should respond to a review.
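As one example of the notification-tool approach above, a team could run a small scheduled script against the GitHub REST API to nudge reviewers on stale pull requests. The repository name, staleness threshold, and message wording below are assumptions for illustration, not a prescription:

```python
# A minimal sketch of a "stale review" nudge using the GitHub REST API.
import os
from datetime import datetime, timezone

import requests

REPO = "your-org/your-repo"          # hypothetical repository
STALE_AFTER_HOURS = 48               # team-agreed threshold for an overdue review
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}

def nudge_stale_reviews() -> None:
    """Comment on open pull requests whose requested reviewers have gone quiet."""
    pulls = requests.get(
        f"https://api.github.com/repos/{REPO}/pulls",
        params={"state": "open"}, headers=HEADERS, timeout=10,
    ).json()
    for pr in pulls:
        opened = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
        hours_open = (datetime.now(timezone.utc) - opened).total_seconds() / 3600
        reviewers = [r["login"] for r in pr.get("requested_reviewers", [])]
        if hours_open > STALE_AFTER_HOURS and reviewers:
            mentions = " ".join(f"@{login}" for login in reviewers)
            body = (f"Friendly nudge: this review has been open for {hours_open:.0f} hours. "
                    f"{mentions}, could you take a look?")
            requests.post(
                f"https://api.github.com/repos/{REPO}/issues/{pr['number']}/comments",
                json={"body": body}, headers=HEADERS, timeout=10,
            )

if __name__ == "__main__":
    nudge_stale_reviews()
```

A script like this could run on a daily schedule (for example, via cron or a scheduled CI job), mirroring the spirit of Nudge's reviewer-directed notifications without any ML component.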
—
Wishing everyone a week of swift, high-quality code reviews. Have a great week and happy Research Monday!
Lizzie
From the Quotient team