RDEL #11: What is the relationship between self-reported productivity and system-measured productivity?
A deeper look at how productivity in software engineering can be explained by self-reported data and system data.
Welcome back to RDEL! Each week, we pose an interesting topic in engineering leadership, and apply the latest research in the field to drive to an answer.
✍️ This week, we consider how software engineers self-report their productivity compared to what systems show. How do the two prominent measures of productivity relate, and how do we bridge the potential gap between them?
The context:
Note: For a primer on productivity in software engineering, check out our previous RDELs on how to define and how to measure productivity.
A common way to measure productivity is by looking at system measurements from the tools that developers use, such as GitHub. In the last few years, significant work in both the research and commercial worlds has shown that, in order to measure productivity, teams need to include developer perception alongside system measurements.
Often, these two ways of measuring productivity have lived in isolation from one another, and even today many companies look at developer surveys separately from system metrics. As defined in the SPACE framework, combining the two creates a far more complete picture of productivity. So how do we bridge the gap between what developers say and what systems output?
The research:
A team of researchers at both Meta and Microsoft studied this relationship using 81 software engineers at Microsoft. Participants submitted daily productivity surveys that asked them to self-assess their productivity and day-to-day attributes. Researchers enriched their answers with telemetry data that included time spent on pure coding, debugging, code reviewing, testing, and other development tasks. Finally, they used a series of linear models to model the relationships between self-reported data and telemetry data. (Note: The researchers also included a baseline intercept for every developer to adapt to different rating behaviors among participants.)
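To make the per-developer baseline idea concrete, here is a minimal sketch of a linear model with one intercept per developer and a single telemetry predictor. It is not the researchers' actual model: the function name, the choice of coding hours as the only predictor, and the sample rows are all assumptions made for illustration. Subtracting each developer's own averages before fitting is equivalent to giving every developer their own intercept, so a habitually generous or harsh rater does not distort the slope.

```python
from collections import defaultdict


def within_developer_fit(rows):
    """Fit self_rating ~ coding_hours with one intercept per developer.

    rows: iterable of (developer_id, coding_hours, self_rating) tuples.
    Returns (slope, r_squared), where r_squared is the share of
    within-developer rating variance explained by coding hours.
    """
    by_dev = defaultdict(list)
    for dev, hours, rating in rows:
        by_dev[dev].append((hours, rating))

    # Center each developer's data on their own means. This removes
    # differences in baseline rating behavior between developers.
    xs, ys = [], []
    for obs in by_dev.values():
        mean_hours = sum(h for h, _ in obs) / len(obs)
        mean_rating = sum(r for _, r in obs) / len(obs)
        for hours, rating in obs:
            xs.append(hours - mean_hours)
            ys.append(rating - mean_rating)

    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("need variation in both hours and ratings")

    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Made-up survey rows: (developer, coding hours that day, self-rating 1-5).
sample = [
    ("dev_a", 2.0, 3), ("dev_a", 4.0, 4), ("dev_a", 5.0, 4),
    ("dev_b", 1.0, 1), ("dev_b", 3.0, 2), ("dev_b", 6.0, 2),
]
slope, r_squared = within_developer_fit(sample)
print(f"slope={slope:.2f} rating points per coding hour, R^2={r_squared:.2f}")
```

A real analysis would use several telemetry predictors at once and a statistics library rather than hand-rolled sums, but the centering step is the part that corresponds to the baseline intercept described above.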
Their results found that:
Telemetry data explained 9% of the variance observed in self-reported productivity, with coding time alone accounting for 7%. This provides evidence that having opportunities for long stretches of coding time improves a developer's perception of productivity.
When productivity was measured against participants' daily attribute data, traveling had by far the largest effect on self-assessed productivity.
The second most common attribute to affect productivity was being a “designated response individual”, i.e., being on an on-call rotation.
“I worked from home” turned out not to be a significant factor in self-reported productivity.
After the study, researchers sent visualizations of the productivity data to the engineers who participated. 72% of engineers stated they learned something new from the data, and 89% were interested in participating in future studies on productivity.
The application:
This paper gave early evidence of how productivity can be explained using both system metrics (i.e., telemetry data) and developer perception. The results offer a few actionable tips for managers who are thinking about software productivity.
Use both system and self-reported measures to assess productivity. This paper was published shortly before the SPACE framework emerged, and adds to the body of literature around how the two forms of measurement create a much richer picture of productivity. If you are thinking of improving productivity on the team, start with a developer survey and then layer in system metrics as evidence.
Share productivity data with the team. Engineers in this study overwhelmingly voted to view the data, and it makes sense: the most effective way to improve productivity is to empower engineering teams.
Managers often (fairly) worry about the impact that sharing metrics may have on the team. This is an issue in two cases: if the data is used for measuring performance instead of for empowering productivity, and if engineer perceptions are not represented. To avoid this, aggregate (and anonymize) data to the team level, and make sure perception metrics are included.
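As one way to picture team-level aggregation, the sketch below rolls individual survey and telemetry rows up into per-team averages and drops any team too small to hide an individual. The field names, the function name, and the minimum group size of 5 are hypothetical choices for illustration, not recommendations from the paper.

```python
from collections import defaultdict
from statistics import mean

MIN_GROUP_SIZE = 5  # hypothetical threshold; pick one that suits your org


def team_summary(responses, min_group_size=MIN_GROUP_SIZE):
    """Aggregate per-developer rows into anonymized team-level averages.

    responses: iterable of dicts with the (assumed) keys
    'team', 'developer', 'self_rating', and 'coding_hours'.
    Teams with fewer than min_group_size distinct developers are omitted
    so that no individual can be singled out from the summary.
    """
    by_team = defaultdict(list)
    for row in responses:
        by_team[row["team"]].append(row)

    summary = {}
    for team, rows in by_team.items():
        developers = {row["developer"] for row in rows}
        if len(developers) < min_group_size:
            continue  # too few people to report safely
        summary[team] = {
            "developers": len(developers),
            # Perception and system metrics are reported side by side.
            "avg_self_rating": mean(row["self_rating"] for row in rows),
            "avg_coding_hours": mean(row["coding_hours"] for row in rows),
        }
    return summary
```

The summary deliberately contains no developer identifiers, and it keeps the perception metric next to the system metric so neither is read in isolation.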
—
Thanks again for reading, and happy Research Monday!
Lizzie
From the Quotient team