Sunday, September 21, 2014

Programming Research I Would Like to See

When discussing programming and programming language research with others, I often feel that the research topics that would benefit my daily work the most are missing from the conversation. The following is a quick overview of topics I think need more looking into.

Correctness

A large part of programming research focuses on correctness. The main goal of static typing, probably the biggest tool in the research toolkit, is to ensure correctness. Bug-free software is a noble goal, but there is a small catch:

The absolute correctness of the program is not the main goal of most programming.

Realistically, a (non-trivial) piece of software is never entirely bug-free. The more bugs you fix, the more effort you have to put into fixing the next one. At some point, the effects of the remaining bugs are just not severe enough to warrant the needed effort.

There is a saying for new products that goes “make sure you are building the right it before you build it right,” which encapsulates the priorities nicely. A slightly buggy product now is quite often infinitely better than a slightly less buggy product next week. For a new product, you need to see if there is actual demand for it so you can drop it early if there isn’t. The more time you spend getting a minimum viable product out the door, the more money you are wasting if it fails. And make no mistake, most new products and services fail.

So while it is nice if programs are correct and bug-free, the effort to get to that point has to be weighed against other priorities.

This does not mean that buggy programs are great, or that programmers shouldn’t worry about bugs that they add to the software. It is the insight that there are different classes of bugs that warrant different levels of effort to fix.

For the average web app, a bug that occasionally causes an internal server error and is fixed by a simple reload is pretty much irrelevant. And while this changes in fields like medicine or nuclear power plants, not all programmers work in those fields.

On the other hand, some bugs are indeed critical even for web applications. Leaked user details, remote exploits and similar security-related bugs are important to avoid even for catselfies.org.

When all bugs are treated the same, programmers end up either prioritizing fixes for irrelevant bugs or deprioritizing fixes for important ones, neither of which is a good choice.

I would love to see more research that looks at different classes of bugs, their relative severities in different contexts, and how to focus on avoiding specific classes of bugs while caring less about others.

Change Management

Not all bugs are programming mistakes. Sometimes a programmer implemented exactly what they wanted to implement, and it’s still wrong: either the programmer misunderstood the specification, or the specification changed. This is a very common situation, and one that agile programming actively embraces; products are released early and often to get quick feedback precisely because requirements tend not to survive first contact with reality.

“Embrace change” is one of the mantras of agile programming. But there is very little research on how to do this well.

Automated tests have been the main tool that has really helped in this regard. From xUnit via BDD-style tests to Cucumber/Gherkin, and from regression tests to two-cycle TDD, there are a lot of approaches and ideas on how to do testing well, but very little concrete evidence on which of them work best.
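As a concrete, entirely hypothetical illustration of how a test pins down a requirement while the implementation stays free to change, here is a minimal xUnit-style sketch in Python; the pricing rule and all the names in it are invented for this example, not taken from any particular tool or study:

```python
import unittest

# Hypothetical requirement: orders of 100 or more items get a 10% discount.
# The tests document this requirement; order_total can be rewritten at will
# as long as they keep passing.

def order_total(quantity, unit_price):
    total = quantity * unit_price
    if quantity >= 100:
        total *= 0.9  # bulk discount
    return total

class OrderTotalTest(unittest.TestCase):
    def test_small_order_has_no_discount(self):
        self.assertEqual(order_total(10, 2.0), 20.0)

    def test_bulk_order_gets_ten_percent_discount(self):
        self.assertAlmostEqual(order_total(100, 2.0), 180.0)

if __name__ == "__main__":
    unittest.main()
```

Whether such tests are best written before or after the code, per unit or per behavior, is exactly the kind of question that is mostly answered with anecdotes today.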

What sort of features in a language or in a library make it easy to change a program to adapt to new or changed requirements, but still adhere to the unchanged ones? What is the best way of writing tests, or which way of writing tests excels in which situation?

I think this is such a central part of programming that it really could do with being the focus of more research.

Empiricism

These questions quickly cease to be about formal methods and proofs and move into the territory of empirical science, even social science. Empiricism is key here.

A while back, I participated in a discussion about truthiness in programming languages. Should there be more than one value that is considered true or false? Should if statements only work for strictly boolean arguments? An interesting topic, and one on which people have very different experiences.

In this discussion, one participant complained at one point about people learning Scheme who think that 0 (the number) is a false value, and how it has to be repeated over and over again that 0 is a true value in Scheme. Those who favored a single false value saw this as a sign of bad influence from other languages and as the fault of the learners, not a problem of the language at hand.

A bit later in the same discussion, someone mentioned that they had checked for a value other than None in Python using a simple truth check, which failed because Python has multiple false values. The same people as above saw this as a problem of the language and took the user’s confusion as proof that having more than one false value is obviously a bad idea.
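To make the Python pitfall concrete, here is a minimal sketch; the function and scenario are invented for illustration, not taken from the discussion. In Python, 0, "" and [] are false just like None, so a plain truth check cannot tell “no value given” apart from a legitimate value of 0, whereas in Scheme only #f is false and 0 is a true value.

```python
# Hypothetical example of the bug described above: a plain truth check
# used where an explicit "is not None" check was intended.

def describe(count=None):
    # Buggy: meant to detect "no count given", but also triggers for
    # a perfectly valid count of 0, because 0 is false in Python.
    if not count:
        return "no count given"
    return "count is {}".format(count)

def describe_fixed(count=None):
    # Intended behavior: only None means "no count given".
    if count is None:
        return "no count given"
    return "count is {}".format(count)

print(describe(0))        # "no count given"  <- surprising
print(describe_fixed(0))  # "count is 0"      <- what was meant
```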

This is a beautiful example of the confirmation bias that is quite common in programming language discussions, although it is not always this obvious. We have preconceived opinions about what we consider true and good, and we interpret any evidence we find as supporting those opinions. This is normal human behavior, and it is part of what the scientific method is meant to help us avoid. But because programming language research is often treated as a purely theoretical field, anything beyond formal models is rarely treated with the same scientific rigor.

As so many aspects of programming are outside of the scope of formal models, we really need more empiricism in programming language research and discussions.

Summary

I would like to see more research that differentiates between different classes of bugs, as not all bugs are equal and bug-free software is not the only (or even main) focus of programming.

I would like to see more research on change management in programming: how to design and write programs so that they are more easily adapted to ever-changing requirements.

And I would like to see all of this done with empiricism instead of emotional arguments.

This research exists, and I greatly enjoyed reading a number of papers on these topics, but really …

More research is needed.