Ruben Verborgh

Programming is an Art

Elegant code adopts aesthetics to achieve high maintainability.

People who have programmed with me or have seen my open-source work on GitHub know that I put a lot of effort into my coding style. I indeed consider programming a creative act, which necessarily involves aesthetics. Yet some people consider aesthetics the enemy of the pragmatic: “don’t spend time writing beautiful code when you can write effective code”. I argue instead that my sense of beauty serves pragmatism much better, because it leads to more concise and maintainable code, and is thereby far more effective.

Computer scientists are taught to program properly, just like linguists are taught how to write literary texts. However, not all computer scientists are programmers, just like not all linguists are writers, although most of us program regularly. For some reason, many programmers seem to think their role is to write code that makes things work. It’s not. Your job is to solve a problem with code in the most elegant way, such that your solution is easy to adapt, modify and reuse. Just like “the grass is green and the sky blue” is not literature, although it might perfectly describe the environment, coding something that just works™ is not programming. It’s fiddling.

Recognizing the creative aspect makes programming more than meeting functionality.

Not a matter of style

Many IT people think it’s a matter of style. One person writes poems, another likes to write really boring prose, but at the end of the day, both chunks of code do the same. They don’t. The first law of programming: code is read more often than it is written. While both chunks might look identical functionally speaking, functionality is only one of the many aspects of code; robustness and maintainability are others, to name a few. Your code is not a black box, as it will be read and modified in the future, possibly by different people, maybe also by yourself. By coding without paying attention to style, you essentially make your code a black box, undecipherable for anybody.

Maybe an example can enlighten you. Consider the following piece of code, which is pretty close to something I actually encountered a few years ago. The problem it aims to solve is: given a list of labels, generate a fragment of HTML that contains the first two labels (alphabetically speaking). Somebody solved it this way.

function representFeaturedLabels (labels) {
  // Remember smallest and second-smallest indices
  var firstIndex = -1, secondIndex = -1;
  var html = "";
  for (var i = 0; i < labels.length; i++) {
    // Is this label smaller than the current second-smallest?
    if (secondIndex == -1 || labels[i] < labels[secondIndex]) {
      // Is this label also smaller than the current smallest?
      if (firstIndex == -1 || labels[i] < labels[firstIndex]) {
        // This label is the smallest, so (re-)represent
        html = "<div><span>" + labels[i] + "</span> <span>" +
               labels[secondIndex] + "</span></div>";
        firstIndex = i;
      }
      else {
        // Replace the current second-smallest label
        html = html.replace(labels[secondIndex], labels[i]);
        secondIndex = i;
      }
    }
  }
  return html;
}

That didn’t look too bad, did it? I mean, it’s commented and all, with sensible variable names, and it doesn’t look hacky or copy/pasted. Reread the code one more time, and then look at the code below, which has the same functionality and a bit more style, but not the bug.

function representFeaturedLabels (labels) {
  // Find the alphabetically first two labels
  var featured = labels.sort().slice(0, 2);
  // Return a <div> containing the first labels
  return "<div><span>" + featured.join("</span> <span>") +
         "</span></div>";
}

Now that was shocking, wasn’t it? Does the second code really do the same? Yes. Does the first code really contain a bug? Yes, several in fact.

First understand the problem, only then attempt to solve it

Obviously, the first programmer was in the dark. Focused on writing code that met the functionality, he didn’t take a step back and think. Can we break this problem down into subproblems? Can we reuse existing solutions for these subproblems? My observation is that this programmer didn’t understand the problem: he only understood his solution. If you look at the first solution, you probably feel like: wow, this is a difficult problem that had to be solved in an ingenious way. It’s not ingenious, it’s unnecessarily complicated, and this leads to unmaintainable code and more time spent hunting bugs, although admittedly, it does a great job in showing how awesomely difficult programming can be. Please don’t do this.

What distinguishes the first programmer from the second is that it took more time for the second to actually start coding. What? How inefficient! No. The second programmer actually made sure he understood the problem before trying to solve it. He correctly identified the two subproblems: finding the featured labels, then representing them. The first programmer’s code couples both tasks, so anybody reading this code has to think of both at the same time, which is not easy if you need to debug one of them. It makes it much more difficult to reason about the problem and the possible solution.

Consequently, in the first code, bugs are difficult to spot. For instance, inputs such as ['A','C','B'] work well, but ['B','C','A'] fails: when the smallest label 'A' arrives after 'B', the code never moves 'B' to second place, so the output contains 'A' and 'C' instead of 'A' and 'B'. Of course, if you inform the first programmer about this, he will gladly (and quickly) fix his code, by adding more complexity. The same will happen if you tell him about another bug: the code misbehaves if the list contains only one label, because it prints the text “undefined” where the second label should be. Check out these bugs yourself. Note that both fragments of code contain additional bugs, such as the case when the list is empty, or when the labels contain characters that have to be escaped in HTML. However, I think it’s clear in which of the two fragments those would be easier to fix.
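To give an idea of how little the second fragment has to change, here is a minimal sketch that addresses the remaining issues. It rests on a few assumptions that are not in the original code: `escapeHtml` is a hypothetical helper, an empty list is assumed to produce an empty string, and the list is copied before sorting so the caller’s array keeps its order.

```javascript
// Hypothetical helper (not part of the original code):
// escape the characters that have a special meaning in HTML
function escapeHtml (text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function representFeaturedLabels (labels) {
  // Find the alphabetically first two labels
  // (sort a copy, so the caller's array is left untouched)
  var featured = labels.slice().sort().slice(0, 2);
  // Assumption: without labels, there is nothing to represent
  if (featured.length === 0)
    return "";
  // Return a <div> containing the escaped featured labels
  return "<div><span>" + featured.map(escapeHtml).join("</span> <span>") +
         "</span></div>";
}
```

Finding the labels and representing them remain two separate steps, so each fix lands in exactly one place.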

Oh and, how about when the application suddenly requires 3 instead of 2 labels? I won’t count how many lines you’d have to change for that in the first code, but in the second code, it’s only one character. Now who has been more efficient?

Programmer 2 created a solution that is durable and extensible, and that is likely to solve more problems than it causes.
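Going one step further, the number of labels could become an argument. The following is only a sketch: the `count` parameter and its default of 2 are my assumptions for illustration, not something the original code had.

```javascript
function representFeaturedLabels (labels, count) {
  // Assumption: feature two labels unless the caller asks otherwise
  if (count === undefined)
    count = 2;
  // Find the alphabetically first labels (sorting a copy of the list)
  var featured = labels.slice().sort().slice(0, count);
  // Return a <div> containing the featured labels
  return "<div><span>" + featured.join("</span> <span>") +
         "</span></div>";
}
```

Calling `representFeaturedLabels(labels, 3)` then yields three labels, while existing calls with a single argument behave as before.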

The root of all evil

People writing bad code are often really good at defending it, sometimes with an attitude of pride that they don’t belong to a group of hippies who think code should be beautiful. “But… coding in style takes more time!” Oh, does it? Tell me what takes more time: sitting back to think and write beautiful code, or quickly writing something that just works™ and then having to fix the above bugs? After all, the number of bugs in code tends to grow with the number of lines, so expressing ideas clearly and elegantly is important for code quality. More lines mean more bugs.

Perhaps the coder of the first fragment would like to point out that the first code is faster, since it takes linear time, whereas sorting takes at least log-linear time. First, premature optimization is the root of all evil, as Knuth famously said. It makes you end up with code that might be fast, but contains bugs. Second, in many cases, elegant code will be faster than the original because it’s less complicated. And third, if performance really is an issue for that particular code fragment, which of the two would be easier to optimize?
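To make that last question concrete, here is a sketch of what a linear-time variant of the second fragment might look like. `findFirstTwo` is a hypothetical helper that I introduce for illustration, and it assumes that all labels are strings.

```javascript
// Hypothetical helper: find the two smallest labels in a single pass
function findFirstTwo (labels) {
  var first, second;
  labels.forEach(function (label) {
    if (first === undefined || label < first) {
      // The previous smallest label becomes the second-smallest
      second = first;
      first = label;
    }
    else if (second === undefined || label < second) {
      second = label;
    }
  });
  // Leave out the slots that were never filled (fewer than two labels)
  return [first, second].filter(function (label) {
    return label !== undefined;
  });
}

function representFeaturedLabels (labels) {
  // Find the alphabetically first two labels
  var featured = findFirstTwo(labels);
  // Return a <div> containing the first labels
  return "<div><span>" + featured.join("</span> <span>") +
         "</span></div>";
}
```

Only the step that finds the labels changed; the representation is untouched, and the selection can now be tested in isolation.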

Programmers are functional artists

Programming is an art and therefore, we are artists. However, we’re not the kind of artists that create purely for beauty. We are functional artists. We have a functional task as well as the duty to write beautiful code, because it is effective and thus lasts. Writing elegant code is our job. In a sense, this makes us similar to magazine layout designers. Their job is to make articles readable, yet in the most pleasing way. This helps people find their way to the contents of those articles, just like great code helps people find their way to the places they need to enhance.

Don’t pat yourself on the back because you can write complicated code. It’s likely you’re doing it wrong. Programming is understanding a problem well enough to be able to explain it as simply as possible to a machine. Defending sloppy code by claiming effectiveness is trying to hide what you don’t understand: the problem.
