Pick two

You have probably encountered the famous quality triangle. This triangle is made of the three elements Correct - Cheap - Fast, and you have to pick only two of its sides, the third one will stay forever out of your reach. Can't have your cake and eat it too, right?

Pick two

If you’ve ever been programming in a professional context, you have probably encountered the famous quality triangle. This triangle is made of the three elements Correct - Cheap - Fast, and you have to pick only two of its sides, the third one will stay forever out of your reach. This kind of tradeoff seems absolutely natural: one cannot expect to be absolutely perfect on all metrics.You cannot have your cake and eat it.

There are, however, some contexts under which such an apparent contradictory state makes sense. One such paradox is defended by the Bazel build system, which claims “Fast - Correct choose two”. The rationale behind that statement is very coherent, to understand it fully I’ll need to introduce the concept of dependencies. A dependency (dep in short) is a unit of software (a library, an application, a configuration file) which is required for another one to function correctly. For example if you want to use the python interpreter, you’ll need to install it first. Dependencies are fickle beasts because they start simple, and they become awfully complicated really fast. In our example, what if you need to run two python scripts, but one is in python2 and another one is in python3. They’ll share a “python” dependency, but not to the same version. And what if you have a hundredth of these versions floating around, with changing codebases? Do you start by installing every python version available under the sun, just in case you need it? This would be very wasteful. Instead Bazel says you need to be correct in describing every unit’s dependencies, in that if they are known perfectly, you can simply let the system decide on the minimum set of versions to install. Thus, by being correct the system can now afford to be lazy because doing the bare minimum (all the described dependencies) is in fact guaranteed to also be the maximum needed. In other words it’s an optimum.

Another false tradeoff is the Small - Correct one: as in an element of data that is compressed to its limits will necessarily be correct, it cannot bear no meaning. To understand that let’s start with a very simple game: the game of flag. This is a two player game in which there is always a loser and a winner. I carry a flag, and if it is raised, then you win, else I am the one winning. We can say that our system can represent 2 states (flag up or down) and it also encodes 2 states. We have a perfect match (bijection) between the two and whichever way I place the flag I won’t be able to encode a state that should not exist (such as both of us losing). If on the contrary I were using a suboptimal encoding, such as giving you a flag, and keeping one from myself, and we were saying that the person raising the flag wins, then we could encode the state for both of us winning, or both of us losing, but these would be contrary to the given rule: only one person wins. By having more freedom in the way we can represent the data, we open up the possibility of representing invalid data. Thus, by extending this reasoning to an arbitrary number of state, we can conclude that having an encoding that is perfectly optimal in size (it will encode exactly the number of states of our system), not one more, not one less, we also have an encoding that is always correct. Perfection brings perfection.

This leads me to a pet peeve of mine: the notion of safety by construction. To illustrate this, let’s imagine you have to sort a list of numbers. Being given 8, 2 ,4 You can of course return a list of values 2, 4, 8. The issue is that if I need the list to be absolutely sorted, I should either trust that your algorithm works fine, or I could check myself that they are sorted, which would be pretty inefficient since this check will always be superfluous unless there is a bug in your code. Instead, what I could do is force the encoding of the list to be 2, 2, 4, where the first value is the starting value, and each one after it is an increment. This gets decoded as 2, (2 + 2), (2 + 2 + 4) = 2, 4, 8 As long as all increments are >= 1 (leaving aside machine types limitations), the encoding is guaranteed to always return a sorted list.