Claims From Research on Programming

Some claims that have empirical studies behind them, from a talk by Greg Wilson:
  • (working?) lines of code written per unit time are roughly the same for high-level and low-level languages
  • a 25% increase in the size of a program’s description means a doubling of the program’s size (interactions? the implied scaling is sketched after this list)
  • code review is worth doing; almost all the benefit comes from the first review. (Cohen 2006)
  • most people stop finding anything useful once they’ve been reading code for more than an hour. (Cohen 2006)
  • (post-compile?) 60-90% of bugs can be found by reading code before it is first executed; this beats writing unit tests (although tests can at least be re-run to catch future regressions)
  • the rate at which review actually finds bugs tops out at a few hundred lines of code in that (at most one-hour) session
  • telling people that programming ability is mostly innate talent/genetics, rather than something that relies on practice, makes both men and women do worse (perhaps because they stop trying as hard when they hit an obstacle?)
  • physical distance: sitting nearby vs. working cross-country (or even 9+ time zones apart) doesn’t affect the (released) bug rate
  • distance in the org chart: collaboration between programmers far apart in the org tree correlates with a higher bug rate (I don’t think this implies that if you give two groups of people an identical task starting from scratch, a different org chart will change things much; rather, I guess that the tasks that rope together different-domain software teams’ expertise and/or systems are inherently harder)
  • as of 2001, no code metric predicts bugs better than raw program size does (more code -> more bugs)
  • anchoring (or perhaps a subservient desire not to disappoint by seeming incompetent) strongly affects implementation-time estimates, no matter how experienced the programmer or how little authority the person giving the anchor has (“I know nothing about software but I think this should take about … 3 weeks … 20 weeks”)
  • bugs fixed later in development cost more to fix (by what measure?) than those caught earlier (or is it just that the cheap ones are the easy-to-spot ones?)
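
The 25%-of-description-doubles-the-program claim implies a power law. Here’s a back-of-the-envelope sketch (my own illustration, assuming the relationship compounds multiplicatively, which the talk doesn’t spell out):

```python
import math

# Claim: every 25% increase in the program's description doubles the program.
# If that compounds, program_size ~ description_size ** k where 1.25**k == 2,
# so k = ln(2) / ln(1.25) ~= 3.1: program size grows roughly as the cube of
# the description size.
k = math.log(2) / math.log(1.25)
print(f"implied exponent: {k:.2f}")

def implied_program_size(description_ratio, base_loc=1_000):
    """Hypothetical program size after the description grows by description_ratio,
    starting from a base_loc-line program and assuming the power law above."""
    return base_loc * description_ratio ** k

for ratio in (1.25, 1.5, 2.0):
    print(f"description x{ratio}: ~{implied_program_size(ratio):,.0f} lines")
```

Under that reading, doubling the description means a roughly 8-9x larger program, which is one way to make sense of the “interactions?” note above.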
(disclaimer: most studies of programmer productivity are not blinded controlled experiments, report mere correlations, use undergraduate populations, and/or are based on fewer than 100 person-hours of data)