Code will be read more than it is written

This might be obvious, but code will be read more than it is written.

Barring exceptional circumstances, you will certainly read the code you write at least once - when you first write it. Then a bug appears in testing, and you will have to read that code again to fix it. Then the product side requests a change, and you will have to read that code yet again to change it. Then you switch teams and now someone else is maintaining your code, so they have to read it whenever the need arises. Et cetera, et cetera.

Of course, not all parts of your code will be read over and over like this; only select ones will. Some are already reliable enough, with no requirement change ever touching them. Others are left collecting dust, unused, until they meet their inevitable end when the encompassing feature/product gets shut down. However, most of the time you cannot be sure which is going to be which, so I reckon it's better to just make sure all your code is easy to read.

Having code that is easy to read is nice because:

  • When someone wants to make a change, they may not have all the context that you had when you first wrote the code. Even you yourself may have already forgotten some of the context if enough time has passed. If the code is easy to read, it will reduce the time needed to understand it as well as minimize the chance of missing some crucial context.
  • It will minimize the chance of bugs appearing. It is harder to spot a bug when the code itself is hard to read. Conversely, if the code is easy to read then it's also easier to spot a bug.
  • Most of the time, code that is easy to read is also easy to change. This is because to write code that is easy to read you will naturally tend to use an appropriate amount of abstraction and gravitate towards a clearer and more concise logical flow for your code.
  • Some people (like me) get irrational pleasure when we see beautifully written code :)

And yet, there are engineers (even experienced ones!) who still don't care enough about the readability of their code, even when they have time for it and are not being rushed by management/customers. To be honest, I don't understand what's going on in their heads. Do they never read code written by other people/their past selves and think "Wow, this code is a mess. I wish they/my past self had written this better." or something? If they really have no problem understanding even the most jumbled code, then great for them for having extraordinary cognitive ability, I guess. But for the rest of us mere mortals, code that is easy to read is certainly better for our work than code that is hard to read.

That being the case, here are some tips from my experience to make your code easier to read:

 

1. Use descriptive and consistent names

This has been repeated rather often, so I trust that we have all at least graduated from using single letters as variable names (other than for loop indices). However, there are some pitfalls that we have to be careful of.

The first pitfall is when a name is not descriptive enough. This is when the name we use for a variable/function/class/etc. is not reasonably representative of the actual thing. For example, I once got admonished in code review when I used getConfig(string $configName) as a function name even though what the function actually did was get the config from the database if it existed, or insert a new config with the default value if it didn't. So the name of this function only covered half of its behavior. To fix this, we can either change the name to something like getOrInsert(string $configName), or change the function itself so that it really only does the 'get' portion while the 'insert' portion is moved to another function and called independently. I went with the latter.
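A minimal sketch of that split could look like the following. Everything here apart from the getConfig(string $configName) signature is an assumption made for illustration: the ConfigStore class, the insertConfig method, and the in-memory array that stands in for the real database.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the config table; the real
// function talked to a database.
final class ConfigStore
{
    /** @var array<string, string> */
    private array $rows = [];

    // Only reads: returns null when the config does not exist.
    public function getConfig(string $configName): ?string
    {
        return $this->rows[$configName] ?? null;
    }

    // Only writes: the caller decides when a default should be created.
    public function insertConfig(string $configName, string $value): void
    {
        $this->rows[$configName] = $value;
    }
}

$store = new ConfigStore();

$timezone = $store->getConfig('timezone');

if ($timezone === null) {
    // The insert is now an explicit, visible step at the call site.
    $store->insertConfig('timezone', 'UTC');
    $timezone = $store->getConfig('timezone');
}

echo $timezone, PHP_EOL; // prints "UTC"
```

With this shape, each name covers everything its function does, and a reader of the call site can see when a write to the store happens.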

The second pitfall is when the names we use are not consistent and therefore ambiguous. For example, using a variable called $transaction that some of the time refers to a database transaction, and at other times refers to an instance of an actual class named Transaction. This means that the reader of the code cannot be instantly sure what this $transaction variable refers to when they encounter it, or even worse, may misunderstand the type of the variable. The fix is to use a clearer name; in this case we can use $dbTransaction to refer to the database transaction, so it won't be mistaken for the one that refers to the Transaction class.
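As a rough sketch of how the two names can coexist without confusion, consider the snippet below. Only the names $transaction, $dbTransaction and Transaction come from the example above; the DbTransaction class, the saveTransaction function and all fields are hypothetical stand-ins.

```php
<?php
declare(strict_types=1);

// Hypothetical domain class: a transfer between two accounts.
final class Transaction
{
    public string $fromAccount;
    public string $toAccount;
    public int $amountInCents;

    public function __construct(string $fromAccount, string $toAccount, int $amountInCents)
    {
        $this->fromAccount = $fromAccount;
        $this->toAccount = $toAccount;
        $this->amountInCents = $amountInCents;
    }
}

// Hypothetical stand-in for a database transaction handle.
final class DbTransaction
{
    private bool $committed = false;

    public function commit(): void
    {
        $this->committed = true;
    }

    public function isCommitted(): bool
    {
        return $this->committed;
    }
}

// Both concepts appear in one signature, and the names keep them apart.
function saveTransaction(Transaction $transaction, DbTransaction $dbTransaction): string
{
    // The queries that persist $transaction would run here.
    $dbTransaction->commit();

    return sprintf(
        'Saved transfer of %d cents from %s to %s',
        $transaction->amountInCents,
        $transaction->fromAccount,
        $transaction->toAccount
    );
}

$dbTransaction = new DbTransaction();
$message = saveTransaction(new Transaction('A-1', 'B-2', 1500), $dbTransaction);

echo $message, PHP_EOL; // prints "Saved transfer of 1500 cents from A-1 to B-2"
```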

The third pitfall is when we are overzealous in trying to use a good name. For example, using a very long name like FizzBuzzOutputGenerationContextVisitorFactory in an attempt to be descriptive. This can end up confusing readers instead of helping them understand. When you find an overly long name like this, it's usually a sign that the abstraction you are using can be improved or even refactored altogether. Either that, or the writer of the code is simply not good at naming and we can just rename it. Do note that not every very long name is bad; sometimes it can be a perfectly fine name for the situation at hand. Just don't forget to be wary.

Lastly, naming is subjective. What one person considers a good name can be regarded by another person as a bad name, and vice versa. But this does not mean we should give up on trying to use good names in our code. Music is subjective too, but I'm pretty sure most people consider Pachelbel's Canon good music. In the same vein, we should strive to give names that most people will consider descriptive and consistent. We may fail, but at least we will have tried. And as our experience accumulates, we will get better over time. In the process, we will make life just a tiny bit easier for anyone unfortunate enough to be involved with our code.

 

2. Arrange your code to be visually convenient

Consider the following piece of code:









 

Functionally, there is nothing wrong with this code. But I think this code is harder to read than it should be, mainly because there is not a single blank line between the lines.

Sadly, not using whitespace to separate lines into visually convenient blocks is a relatively common occurrence. In fact, what motivated me to write this article is that I've seen a bit too much code without whitespace, and I feel this has to change. In my opinion, reading code that has no space between the lines is like reading a book that does not use paragraphs. Yes, technically no information is lost, but it sure is annoying to read.

Compare it with the code below:











 

Isn't the latter easier to read?

For code of this size with straightforward functionality the difference may not be much, but the longer the code and the more logic it has, the more difference arranging that code to be visually convenient will make.

You might think that if the code is long enough that reading it is hard, then we should just refactor it into several functions/classes/etc., and you are probably right. However, sometimes we simply don't have that luxury, so we should just do our best with what we have. In this case, that means using the Enter key tactfully while writing our code.
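As a small hypothetical illustration of this kind of grouping, the function below uses blank lines to separate its three steps: validating the input, summing the items, and applying the discount. The function name, its parameters and the sample data are all invented for this sketch.

```php
<?php
declare(strict_types=1);

/**
 * @param array<int, array{price: int, quantity: int}> $items Prices are in cents.
 */
function calculateOrderTotal(array $items, int $discountPercent): int
{
    // Step 1: validate the input.
    if ($discountPercent < 0 || $discountPercent > 100) {
        throw new InvalidArgumentException('Discount must be between 0 and 100.');
    }

    // Step 2: sum up the items.
    $subtotal = 0;
    foreach ($items as $item) {
        $subtotal += $item['price'] * $item['quantity'];
    }

    // Step 3: apply the discount, rounding it down to whole cents.
    $discount = intdiv($subtotal * $discountPercent, 100);

    return $subtotal - $discount;
}

$total = calculateOrderTotal(
    [
        ['price' => 1000, 'quantity' => 2],
        ['price' => 500, 'quantity' => 1],
    ],
    10
);

echo $total, PHP_EOL; // prints "2250"
```

Removing the three blank lines inside the function would not change its behavior, but the reader would then have to find the boundaries between the steps on their own.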

 

3. Use abstractions appropriately

Robert C. Martin, in his book Clean Code, wrote the following:

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. Smaller functions are mostly robust, easy to read and maintain.

Well, I respectfully disagree with him. 

In my humble opinion, functions (and other abstractions like classes) should be sized appropriately instead of being as small as possible. Going back to the book analogy, wanting every function to be as small as possible is like wanting every sentence in a book to be short. Yeah, each sentence in isolation might be simple to understand, but I'm pretty sure it's not the best way to convey the overall information.

Certainly, having abstractions that are too big can also be a problem, and usually it's a bigger problem than having abstractions that are too small. My point here is that abstractions should be sized appropriately. Most of the time the appropriate size is indeed small, but there are times when medium-sized abstractions are better, and once in a while there are big abstractions that work perfectly fine. Don't force every abstraction to be smaller than necessary.

Another thing about abstraction that we should be careful about is how much of it we use. This is because using too little or too much abstraction will lower the maintainability of the code.

What makes code maintainable?

As far as my understanding goes, code is maintainable if it is:

  • Easy to understand
  • Easy to change

That's it!

The principle is simple, but in reality, writing simple code that is easy to understand and change can often be harder than writing complicated code. So in practice, we often use too little or too much abstraction instead of striking the appropriate balance.

At one extreme, too little abstraction means that the code is just a giant blob without any functions/classes/interfaces/etc. to organize it. It is apparent that this code won't be easy to understand or change for anyone unlucky enough to maintain it.

At the opposite extreme, too much abstraction means that there are so many functions/classes/interfaces/etc. that you have to jump around 10 different files to make a minor change or investigate a supposedly trivial bug. Clearly, this is far from ideal.
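A deliberately exaggerated sketch of this second extreme is shown below, next to a direct version with the same behavior. All names and the flat fee of 500 cents are hypothetical, and in a real project each class would probably live in its own file, which is where the jumping around comes from.

```php
<?php
declare(strict_types=1);

// Over-abstracted: an interface, an implementation and a factory,
// all just to add a fixed shipping fee.
interface ShippingFeeStrategy
{
    public function apply(int $subtotal): int;
}

final class FlatShippingFeeStrategy implements ShippingFeeStrategy
{
    public function apply(int $subtotal): int
    {
        return $subtotal + 500;
    }
}

final class ShippingFeeStrategyFactory
{
    public function create(): ShippingFeeStrategy
    {
        return new FlatShippingFeeStrategy();
    }
}

// Direct: the same behavior as one small function.
function addShippingFee(int $subtotal): int
{
    return $subtotal + 500;
}

$viaLayers = (new ShippingFeeStrategyFactory())->create()->apply(2000);
$direct = addShippingFee(2000);

var_dump($viaLayers === $direct); // prints "bool(true)"
```

The layered version is not wrong in itself; it would start to pay off if several fee rules had to be swapped at runtime. With a single rule, the extra layers only add places to look.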

Of course, in the real world it's virtually impossible for us to write code that extreme (at least if it's written by one person; legacy code that has been worked on by many people is a whole different beast). More likely, we will write code that is just moderately off from the appropriate amount of abstraction: reasonable enough to pass code review, but in actuality rather difficult to understand and change, with no obvious way to improve it.

Regrettably, there is no clear-cut formula to decide how much abstraction is appropriate; it all depends on the context and the situation. So we can only grit our teeth and hone this crucial skill through experience, hopefully when nothing big is at stake.


Closing words

Earlier, in point #1 (Use descriptive and consistent names), I said that naming is subjective. This actually applies to point #2 (Arrange your code to be visually convenient) and point #3 (Use abstractions appropriately) as well. They are all subjective and thus are not hard rules that we should follow to the letter, only guidelines. So don't be afraid to break these guidelines when you think it will make your code better.

Also, because they are subjective, it is very possible that you will come into disagreement with your fellow engineers over them. Most of the time this is not a hill worth dying on. So if after a good discussion there is still no resolution, acquiescing might be the preferable choice.

That is all. I hope you will find this useful. Any comments or feedback (that are not too harsh 😅) are appreciated. Thanks for reading until the end!


