"After my last post, which hypothesized a relationship between the total size of a program and the bugs in it, I was led to this paper, via this blog post, via this comment. This, by the way, is my favorite thing about blogging."


Vivek Haldar: Size is the best predictor of code quality

That “last post”, Smeed’s Law for Programming, was a good read. I meant to pass it along and forgot. Yet the follow-up is already out!

Let’s see if Tumblr allows a “Quote” post to include a block quote tag in the source description.

Vivek is referring to this paper:

K. El Emam, S. Benlarbi, N. Goel and S. N. Rai: “The Confounding Effect of Class Size on the Validity of Object-Oriented Metrics”. IEEE Transactions on Software Engineering, 27(7), July 2001.

Here’s the “old” model: Structural properties of code, as gauged by various metrics, affect its cognitive complexity, and drive external attributes such as number of bugs and maintainability.

Vivek cites that paper because it 

uncovers a strong association between most code metrics and the size of the code… [which] partially backs up my hypothesis that the number of bugs can primarily be predicted only by the total lines of code….

Next:

I haven’t found any studies [describing] this relationship. Do the number of bugs grow linearly with code size, or…?
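
For what it's worth, here is one way you might poke at that question if you had per-project numbers. This is only a rough sketch, not anything from Vivek's post or the paper: it assumes bugs follow roughly `bugs = a * size**b` and fits `b` with a least-squares line on the logs. An exponent near 1 would suggest linear growth, and above 1 would suggest faster-than-linear growth. The function name `fit_power_law` and all of the numbers are made up for illustration.

```python
import math


def fit_power_law(sizes, bugs):
    """Fit bugs = a * size**b by least squares on the logs; return (a, b)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(n) for n in bugs]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope of the best-fit line in log-log space is the exponent b.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # The intercept in log-log space is log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic data, invented purely to exercise the fit (not real measurements).
sizes = [1_000, 2_000, 4_000, 8_000, 16_000]        # lines of code
linear_bugs = [0.01 * s for s in sizes]              # exponent 1 by construction
superlinear_bugs = [0.001 * s ** 1.5 for s in sizes]  # exponent 1.5 by construction

linear_fit = fit_power_law(sizes, linear_bugs)
superlinear_fit = fit_power_law(sizes, superlinear_bugs)
```

With real data the points would be noisy, so the fitted exponent alone would not settle the question. You would also want a confidence interval and a look at the residuals.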

Stay tuned for more!

All bold font emphasis was mine. The same is true for the pun. Get it? Stay “tuned”, like performance “tuning”? Never mind… sorry…