Have you ever had your head explode while trying to trace the execution of a use case through a complex codebase? Well, I certainly have. Sometimes the code I am trying to decipher is my own, making the process not only painful, but shameful as well.
In this article, I will explore the concept of complexity and how to tackle it using decision hiding (aka information hiding).
What is complexity?
“The state of having many parts and being difficult to understand or find an answer to.” – Cambridge Dictionary
I believe this to be a better definition than anything our industry can produce. Language has been the tool of scientists, philosophers and many others for centuries. Our industry has existed for a few decades.
Let’s try to contextualize the definition.
What is complexity in software?
A “Hello, World!” program is usually something we consider simple. It:
- does one thing;
- uses one standard library function;
- has one entry point (e.g. the terminal);
- is written in one language, and so on.
Notice the emphasis on one.
In contrast, we consider ERP systems complex. They are doing… Nobody really knows what they are doing, but it is a lot. Sometimes they integrate hundreds of other systems and technologies, driven by thousands of configurations. Some ERPs come with their own IDE, so they are not just configurable, but programmable as well.
Google Search might serve as a more interesting example because it does one thing – it finds relevant results based on a search term. So it is simple. However, you want to have personalized search results, account for SEO, distribute the system so it can serve millions of requests per second, etc. The way the search works, the “how”, is much more complex.
Borrowing from the Lean Architecture book, software has a form (what the system is) and a function (what the system does). Complexity may affect each of them differently.
All that being said, it would seem that production software can never be truly simple. Production software will always have many parts. Even for the “Hello, World!” example, if we imagine it production-ready, we need to figure out installation, updates, localization, multiple target operating systems, etc. It could be shipped as a Docker container, adding more parts.
Luckily, this is not completely true. We can have simplicity as a property of production software.
Is math complex?
Several years ago, while I was at university, I had difficulty understanding the math behind a particular computer science paper. I discussed it with a few colleagues, and everybody told me it was very complex. So I went to one of my math professors. He sat down, looked at the paper and told me it was very simple. He explained. I understood.
How did that happen?
Where my colleagues saw a big question mark, my professor saw structure.
The professor had to explain three distinct things, and then I was able to understand the original problem.
How does math battle complexity?
Lemmas are one example of how mathematical proofs deal with complexity.
Every lemma has a single result (i.e. a conclusion) that is a stepping stone for the larger problem. However, the lemma has no particular knowledge of the context it is used in. Similarly, the final proof depends only on the conclusion of the lemma.
This is essentially decision hiding – the lemma hides the decision of how to prove it: what theorems and lemmas to combine, what methodology to follow. Only the conclusion matters to others.
Coupling and dependencies in software
As software engineers, we often talk and think about coupling. We consider loose coupling a good thing, whereas tight coupling should be avoided at all costs.
But the devil is in the details.
For example, it is common in Java to reduce coupling by using interfaces and factories.
However, some people do so in a very mechanical way, following a broken principle: “If you don’t have a reference to a concrete class, you only depend on the abstraction, therefore you have loose coupling.” The issue with this line of thinking is that it is focused on code: “If I don’t depend directly on a piece of code, I am not coupled with it.” This could not be further from the truth.
Imagine a University scheduling system, which models students, classes, rooms and workstations.
Let’s assume we have the system up and running, we have nice clean interfaces and dependency injection.
There is a problem, though. Students cannot find their place quickly in the room, so we decide to show the faculty number on the screen – we implement it in the workstation setup.
We have coupling.
Inside our workstation setup module, we depend on the decision that every student will have a faculty number. We know there are constraints in the database that enforce this, so we depend on it.
If you do not agree that this is coupling, imagine the University wants to use the system for community workshops where anyone can sign up with an email. Domain experts say they are students, but they don’t have a faculty number. All of a sudden, there are dozens of places in our codebase that need to change (workstation setup included), because they were all developed knowing the faculty number is required.
Making the faculty number required was a decision we made at some point, and dependencies on that decision could easily infest our entire codebase.
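One way such a dependency could be contained is sketched below. The `Student` model, `display_label` and both banner functions are hypothetical names chosen for illustration; they are not taken from a real system. The first function bakes in the decision that every student has a faculty number, while the second only assumes that a student can be labelled, so a single place owns that decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Student:
    # Hypothetical model: enrolled students carry a faculty number,
    # community workshop attendees only have an email.
    name: str
    faculty_number: Optional[str] = None
    email: Optional[str] = None

    def display_label(self) -> str:
        # The only place that knows how a student is identified on screen.
        if self.faculty_number is not None:
            return self.faculty_number
        if self.email is not None:
            return self.email
        return self.name


def workstation_banner_coupled(student: Student) -> str:
    # Depends on the decision "every student has a faculty number";
    # fails with TypeError for a workshop attendee.
    return "Seat reserved for " + student.faculty_number


def workstation_banner(student: Student) -> str:
    # Depends only on "a student can be labelled".
    return "Seat reserved for " + student.display_label()
```

With this shape, admitting email-only attendees changes `display_label` and nothing in the workstation setup.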
There is a lot of data flowing through our systems and we sometimes forget that abstractions do not produce the data – concrete implementations do. The concrete leaves its mark on the data, allowing us to implicitly depend on code that was executed “miles away”, hidden under abstract factories and configuration-based dependency injection.
Production software as a whole has many parts and is therefore complex. But this complexity can be managed – simplified – by dealing with one problem at a time. This is what math does, and this is what civil engineering does. Furthermore, this is what we usually try to do in software using abstractions.
However, when you work on a problem, you are forced to deal with all its dependencies – implicit ones included.
Let’s look at another example.
According to Robert Martin in Clean Code:
- A function should do one thing and do it well
- Error handling is one thing
Take our university system, more specifically, the placement module:
- Students should be distributed across the room
- Should send an email to the class administrator if there is no room
It may be intuitive to some that we need two separate functions – one to deal with the placement itself and another to deal with the error:
    def place_all_students(university_class):
        try:
            placement = place_students(university_class.room, university_class.students)
            setup_room(placement)
            return True
        except NotEnoughRoom as ex:
            email = create_not_enough_room_email(university_class.signature, ex.extra_students)
            send_email(university_class.administrator, email)
            return False

We could move the orchestration logic to another function, but apart from that, we have very focused code.
However, we failed to apply proper decision hiding. The place_all_students function knows “how” students are placed – using some greedy algorithm that will leave a bunch of extra students at the end. If we revise this decision in the future, we will have to change a lot of code (e.g. place_all_students and create_not_enough_room_email).
Decision hiding, when applied properly, does two things for us:
- Protects us from change. When we revise a decision, we change only the part of the system that contains it. We can simulate such a change to test our design.
- Helps us focus on one thing at a time. When we have no knowledge of, or dependency on, another decision, we can ignore it and focus on the task at hand.
Getting back to our previous example, a better design would be to abstract the error, allowing us to change the placement algorithm without touching the rest of the system:
    def place_all_students(university_class):
        try:
            placement = place_students(university_class.room, university_class.students)
            setup_room(placement)
            return True
        except PlacementNotPossible as ex:
            email = create_placement_error_email(university_class.signature, ex.reason)
            send_email(university_class.administrator, email)
            return False
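What might sit behind `place_students` in such a design is sketched below. This is an illustration under assumptions, not the actual implementation: the shape of `PlacementNotPossible` (a single `reason` string), the room modelled as a plain list of seat names, and the greedy algorithm itself are all hypothetical. The algorithm still knows about leftover students internally, but it folds that detail into a readable reason before it crosses the function boundary.

```python
class PlacementNotPossible(Exception):
    """Raised when students cannot be placed; says why, not how."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def place_students(seats: list[str], students: list[str]) -> dict[str, str]:
    # Greedy placement (assumed for illustration): fill seats in order.
    placement = dict(zip(seats, students))
    leftover = students[len(seats):]
    if leftover:
        # The leftover list is a detail of the greedy algorithm, so it is
        # folded into the reason instead of being exposed as a field.
        raise PlacementNotPossible(
            f"{len(leftover)} student(s) could not be seated: "
            + ", ".join(leftover)
        )
    return placement
```

A different algorithm could raise the same exception with a different reason, and the caller and the email code would stay as they are.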
Don’t blame the requirements
You can argue that in the example above, it is the requirement that we should send an email with the extra students. Requirements are forcing us to have this dependency. You would be right.
However, you would also be wrong. “It is the requirements” is the ultimate excuse I have heard from engineers delivering complex spaghetti solutions. They say it is complex because this is how product/customers want it. I haven’t done a survey, but I can’t imagine any customer would explicitly want complex software.
Requirements are an integral part of the software. They shouldn’t be just thrown over the fence to engineering – they should be a product of discussion, engineering and careful considerations.
“Everybody, all together, from early on.” – The Lean Secret
A lot of decisions are taken as requirements and later just implemented as code. If we leak decisions by definition, we cannot later hide them in our code.
Tips for decision hiding
Even though decision hiding is not a recipe you can memorize and use, you can have a checklist to remind you of its principles.
Here is my checklist:
- Be mindful of the facts and decisions you depend on.
- Prefer depending on hard domain decisions over solution decisions. For example:
  - “All students will have a faculty number” is a domain decision.
  - “Greedy placement of students” is a solution decision.
  - “We will ship software as a Docker container” is a solution decision.
- Challenge requirements, especially ones that revolve around the solution rather than the problem.
- Be careful what you ask and share. If you know internal details about another team or component, you might accidentally depend on otherwise hidden facts and decisions.
- Think about the data you share between components – it is the highway for implicit dependencies.
- Test your decision hiding – pick a decision, change it and see how it affects your code. If the change is contained, your design is solid.
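The last item on the checklist can be turned into an executable check. The sketch below is only an illustration: `seat_class`, `greedy` and `spaced` are hypothetical names, and rooms and classes are reduced to plain counts. Swapping the strategy simulates revising the placement decision; if the design hides that decision, the caller works unchanged.

```python
from typing import Callable


class PlacementNotPossible(Exception):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def greedy(seats: int, students: int) -> int:
    # Decision A: use every seat, fail if students remain.
    if students > seats:
        raise PlacementNotPossible(f"{students - seats} students left over")
    return students


def spaced(seats: int, students: int) -> int:
    # Decision B: leave every other seat empty.
    usable = (seats + 1) // 2
    if students > usable:
        raise PlacementNotPossible(f"only {usable} spaced seats available")
    return students


def seat_class(place: Callable[[int, int], int], seats: int, students: int) -> str:
    # The caller depends on the outcome and the reason, not on the strategy.
    try:
        return f"placed {place(seats, students)}"
    except PlacementNotPossible as ex:
        return f"failed: {ex.reason}"


# Simulate the decision change: same caller, different strategy.
for strategy in (greedy, spaced):
    assert seat_class(strategy, 10, 4) == "placed 4"
    assert seat_class(strategy, 2, 9).startswith("failed: ")
```

If swapping the strategy forced an edit to `seat_class`, that would be a sign that the placement decision had leaked.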
Decision hiding is not just a way of structuring code or building software. It is a way of thinking that has influenced many software engineering “best practices” like encapsulation, indirection, SOLID, clean coding, etc.
If we want to have software that is both rich and simple, we must apply decision hiding to every aspect of its creation. Otherwise, we will always end up entangled in complexity.
- Lean Architecture: for Agile Software Development by James O. Coplien, Gertrud Bjørnvig
- Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin
- On the criteria to be used in decomposing systems into modules by D. L. Parnas
A few more thoughts on dependencies by me – Code Reuse – the Good, the Bad and the Ugly