Note: This was originally posted on Medium.
Schools teach you how to write code—to take an idea and turn it into software that does what you want. Coding interviews reinforce the idea that your job will be to take an algorithm and implement it. This is great and fundamental. However, school doesn't prepare you for the realities of maintaining a large codebase. Your real job is dealing with constantly changing requirements and shared ownership of code.
For that, you need to write code that is easy for other people to understand and change. You (and others) will have to modify your code to meet new requirements, and you will need to do this all the time.
How can we build software that makes this easy? In this post, I'll share some advice based on my experience with complexity, readability, and architecture applicable to all modern languages.
Is great code easy to understand? Not always!
Well-designed code should be easy to read, but the "why" is not always obvious. A classic example of this is computing the midpoint of a binary search. The most obvious code is this:
mid = (start + end) / 2
However, the code is often implemented as this:
mid = start + (end - start) / 2
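These two expressions are mathematically equivalent, but most languages store integers in a fixed number of bits. In the first, start + end can exceed the largest representable integer and overflow when both values are large. In the second, end - start is never larger than end, so it cannot overflow as long as 0 <= start <= end.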
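To make the difference concrete, here is a minimal sketch. Python integers never overflow, so it simulates 32-bit signed arithmetic with a hypothetical helper, wrap_int32. The helper and the function names are assumptions made for illustration, not part of any library.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Simulate two's-complement 32-bit wraparound (illustrative helper)."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def naive_mid(start: int, end: int) -> int:
    # The sum is wrapped before dividing, as a 32-bit machine would do.
    return wrap_int32(start + end) // 2


def safe_mid(start: int, end: int) -> int:
    # The difference always fits when 0 <= start <= end <= INT32_MAX.
    return start + wrap_int32(end - start) // 2


start, end = 2_000_000_000, 2_100_000_000  # both fit in 32 bits
print(naive_mid(start, end))  # negative: the sum wrapped around
print(safe_mid(start, end))   # lies between start and end
```

Both inputs are valid 32-bit values, yet only the second form returns a midpoint that lies between them.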
Modern code is full of workarounds like this. Workarounds might be needed for technical (what computers can't do), system (what your infrastructure can't do), or business or legal reasons. The goal of well-designed code is to break the problem into smaller, simpler units that are composed together to perform complex tasks. These units might not make sense until you understand the systems surrounding your code.
We can optimize for readability and architectural simplicity (covered next), but we are bound by the essential complexity of the problem we are trying to solve—unless we can modify the problem itself. There might be a technical or infrastructure change we could make that would allow us to solve it more simply. You could even negotiate a change to an organizational requirement. The latter is hard and non-technical but often very useful work.
An essential element of readability (possibly the most important) is how you name things. Variables, functions, classes, and packages all have names, and names are expected to be accurate and complete summaries of what they represent. When people read your code, they will first see the names you've chosen.
Here are a few suggestions for naming:
Spend time thinking about the names you choose—they create expectations for people reading your code. Don't worry about clever or cute names. Be accurate.
Think of names as forming a hierarchical outline of your program, acting as summaries at various levels of detail.
Choose names based on intent (load_training_data) and not implementation (read_training_csv). Implementations are more likely to change.
If an accurate and complete name is too long, this is a sign that the function is doing too much or the code should be reorganized. For example, if you have a function named generate_invoice_report_for_international_order, consider naming it invoice_report and placing it under a module whose name carries the rest of the context.
Reevaluate your names whenever you make a code change, and walk back up the call stack to ensure the names are still appropriate.
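The intent-versus-implementation suggestion can be sketched as follows. This is only an illustration: the function bodies, the CSV columns, and the choice of the standard csv module are assumptions, since the post names the functions but does not show them.

```python
import csv
import io
from typing import Iterable


def read_training_csv(source: Iterable[str]) -> list[dict[str, str]]:
    """Implementation detail: parse rows from CSV text."""
    return list(csv.DictReader(source))


def load_training_data(source: Iterable[str]) -> list[dict[str, str]]:
    """Intent-level name: callers ask for training data, not for a CSV.

    If the storage format changes later, only this body changes;
    the name and every call site stay accurate.
    """
    return read_training_csv(source)


# Hypothetical sample data, for illustration only.
sample = io.StringIO("feature,label\n1.5,cat\n2.0,dog\n")
rows = load_training_data(sample)
print(rows[0]["label"])
```

Callers depend on the intent-level name, so swapping the CSV reader for another source would not make any call site misleading.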
Another common practice is the use of comments to improve readability. Beware of comments! You might think that comments are great for explaining what the code is doing, but they are not. Instead, recognize that comments are a source of technical debt. The further away a comment is from the code it describes, the higher the chance it will get out of sync over time. Once this happens, the effort needed to determine whether the code or comment is correct is very high. Like other forms of debt, it is sometimes worth taking on. You can use comments to explain workarounds and provide context. Just don't overuse them.
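As a rough illustration of a comment that earns its keep, the sketch below uses a single comment to record why a workaround exists, placed directly next to the line it describes. The binary_search function is a hypothetical example written for this purpose.

```python
from typing import Sequence


def binary_search(items: Sequence[int], target: int) -> int:
    """Return the index of target in sorted items, or -1 if absent."""
    start, end = 0, len(items) - 1
    while start <= end:
        # Why, not what: written as start + (end - start) // 2 instead of
        # (start + end) // 2. The two are equivalent in Python, but this
        # form stays correct if the function is ported to a language with
        # fixed-width integers, where the sum could overflow.
        mid = start + (end - start) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            start = mid + 1
        else:
            end = mid - 1
    return -1


print(binary_search([1, 3, 5, 7, 9], 7))
```

The names carry the "what"; the comment is reserved for context a reader could not recover from the code alone.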
One last point about readability: If your organization is not already using a style guide and a linter, try to get them adopted. Style guides get everyone on the same page about how code should be laid out and what constructs are acceptable. This is a relatively easy way to improve readability once everyone has gotten used to it.
Developers use terms like "spaghetti code" or "coupling" [1] to describe code that is not simple. You might have also heard of "The Law of Demeter". Another way to think about coupling is through dependencies: what the code needs to know about to perform its function. Like naming, you should consider dependencies in the code you write.
There are two reasons you should try to minimize dependencies:
If one of your dependencies changes in a way you don't anticipate, your code can fail. This is the main reason to avoid coupling. Remember that your real job is to deal with constant change, so the more dependencies you have, the more time you'll spend updating your code to keep up.
If your code needs to know about something, you must also know about it. Our memory, especially our working memory, is limited. The more things we need to keep track of, the higher the chance we'll get something wrong. This is why abstractions are so important. They give us common patterns we can apply across many situations, decreasing the amount we need to keep track of.
Here are a few suggestions for reducing the impact of dependencies:
First, functions should be as small as possible while providing a useful unit of work. Pure functions are ideal since they have no dependencies outside their inputs and the functions they call, so use pure functions wherever possible.
Second, use interfaces and encapsulation to reduce your exposure to your dependencies. Do this the first time a breaking change in a dependency catches you unaware.
Third, separate the parts of your code performing your business logic from everything else. The workarounds and performance optimizations you make today will likely need changes more urgently than the core features of your software.
Fourth, keep in mind that "Simplicity is not about counting." You will likely create more things when you simplify: more functions, modules, etc. There's nothing wrong with that. The point of streamlining is to produce smaller units of work that can be modified independently and easily composed in the future. Be aware that less code is generally better, however, so you might face a trade-off between architecture and readability.
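One possible way to combine these suggestions is sketched below: a pure function holds the business logic, and a small interface hides the dependency that supplies exchange rates. Every name here (RateSource, FixedRates, LineItem, invoice_total, invoice_total_in) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class RateSource(Protocol):
    """The only thing our code needs to know about the dependency."""

    def rate(self, currency: str) -> float: ...


@dataclass(frozen=True)
class LineItem:
    description: str
    unit_price: float
    quantity: int


def invoice_total(items: list[LineItem], rate: float) -> float:
    """Pure business logic: depends only on its inputs."""
    return sum(item.unit_price * item.quantity for item in items) * rate


class FixedRates:
    """Stand-in wrapper; a real one would call an external service."""

    def __init__(self, rates: dict[str, float]) -> None:
        self._rates = rates

    def rate(self, currency: str) -> float:
        return self._rates[currency]


def invoice_total_in(
    items: list[LineItem], currency: str, rates: RateSource
) -> float:
    # The dependency is touched in exactly one place.
    return invoice_total(items, rates.rate(currency))


items = [LineItem("widget", 10.0, 3), LineItem("gadget", 5.0, 2)]
print(invoice_total_in(items, "EUR", FixedRates({"EUR": 0.5})))
```

This produces more pieces than a single function would, but each piece can change independently: a breaking change in the rate provider is absorbed by the wrapper, and invoice_total can be tested without any dependency at all.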
Here are some key takeaways to remember as you write code:
Requirements are constantly changing, so make sure you spend time thinking about readability and architecture.
Treat names like summaries and choose names based on intent rather than implementation. Don't rely on comments to explain your code.
Our working memory is limited, so take advantage of abstraction and encapsulation to reduce the surface of dependencies. The first time you encounter a breaking change in a dependency, write a wrapper around it.
Don't get discouraged if you find these concepts difficult to practice. I can tell you from experience that engineers of all levels struggle with them and no one gets it right the first time. Keep working at it and it will get easier!
[1] Michael Nygard has spoken about how programmers use the term "coupling" to mean an inflexible link between different parts of our code. In contrast, in mechanical systems, it usually refers to a flexible link, such as a coupling between train cars. Couplings keep train cars together but also reduce the impact of shocks. They are a great idea!