Teaser: Over the last few decades, a variety of static code metrics have been published and promoted to measure the maintainability of software systems. This study evaluates 12 common static code metrics for their correlation with observed maintenance effort. Leveraging modern repository mining techniques, we examine the historical data of three large open-source software systems with a combined size of over 1M LOC and over 10k classes. We automatically identify maintenance activities and measure the effort needed to perform them in terms of revised lines of code. Then, we investigate whether the state of the system as captured by these metrics is an indicator of the required maintenance effort.
In contrast to earlier research, our results could not validate a general correlation between any of the examined metrics and maintainability. Instead, every evaluated metric showed both positive and negative correlations with maintenance effort, depending on the time interval considered. Strong correlations hold only for specific projects, and within these projects, only for limited time spans. Across the full project history, however, all metrics showed at most moderate correlations.
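To make the idea of a correlation that changes sign between time intervals concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the procedure from the paper: the text above does not name a correlation coefficient, so Spearman's rank correlation is assumed here, and the function names, window labels, and sample values are all hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation (assumed choice): Pearson on the ranks."""
    return pearson(ranks(xs), ranks(ys))


def correlation_per_window(samples):
    """Group (window, metric_value, revised_loc) tuples by window
    and correlate metric value with effort inside each window."""
    by_window = {}
    for window, metric_value, revised_loc in samples:
        metrics, efforts = by_window.setdefault(window, ([], []))
        metrics.append(metric_value)
        efforts.append(revised_loc)
    return {w: spearman(m, e) for w, (m, e) in by_window.items()}


# Invented toy data: one metric value and revised LOC per class and window.
toy_samples = [
    ("window_a", 1, 10), ("window_a", 2, 20),
    ("window_a", 3, 30), ("window_a", 4, 40),
    ("window_b", 1, 40), ("window_b", 2, 30),
    ("window_b", 3, 20), ("window_b", 4, 10),
]
result = correlation_per_window(toy_samples)
```

In this invented data the same metric correlates perfectly positively with effort in `window_a` and perfectly negatively in `window_b`, which is the kind of interval-dependent behavior described above, exaggerated for illustration.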
As no metric was found to be a good indicator for high maintenance efforts in all contexts, we advocate against using any of the evaluated metrics without project-specific validation. If metrics are to be used to monitor the maintainability of a system, either directly or through models based on these metrics, engineers have to validate their applicability not just for the project at hand, but also for the current time span.
New Paper at SANER: Revisiting Inter-Class Maintainability Indicators