
Black Boxes and Boundaries

Introduction

The premise of this post is that there is no such thing as a black box. David Parnas’s idea of information hiding, which he first described in 1972, has been taken to the extreme, and we need to understand that hard boundaries are not necessarily what we humans need – or, to put it another way, to use hard boundaries is to deny our humanity. We seem to have forgotten Postel’s law, which states that “an implementation should be conservative in its sending behaviour, and liberal in its receiving behaviour.”[1]

Background

Early versions of Microsoft operating systems were treated as black boxes, with little documentation of the effects of a call. It was not until a book appeared that provided an idea of what the calls actually did that writing programs for the new operating system took off (there was also Undocumented Windows by Andrew Schulman, which reflected the earlier work). This is fine if the interfaces are largely independent and there are no transitive dependencies between calls. If a call to A affects B, which in turn affects C, we have a linear causal chain. If we have another sequence where D affects E, and E in turn depends on C, we have a transitive dependency. To understand how the interface will behave over time we need an understanding of these relationships, and if they are not documented you need to probe the object in an attempt to determine the effect – the relationships are causal in nature, but there is no mechanism directly visible, and therefore we cannot treat the system as a black box. This is one of the issues with Smalltalk, where everything is an object and therefore has hard boundaries.
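
To make the problem concrete, here is a minimal Python sketch – the API and all its names are entirely hypothetical – of hidden coupling between calls that look independent from the outside:

    class OpaqueApi:
        def __init__(self):
            self._mode = "default"  # hidden state, absent from the documentation

        def call_a(self):
            # A silently flips the mode that B depends on
            self._mode = "extended"

        def call_b(self):
            # B's result depends on whether A has been called first
            return "extended result" if self._mode == "extended" else "basic result"

    api = OpaqueApi()
    print(api.call_b())  # basic result
    api.call_a()
    print(api.call_b())  # extended result - A affected B with no visible mechanism

The only way to discover the A-to-B dependency from the outside is to probe, exactly as described above.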

The other issue that arose is that the interface is the interface, and you have to abide by the implied contract in order to consume it. If you stray from the contract then you may again experience unforeseen effects. A related issue was that there was no formal language for describing these interfaces other than the printed documentation, so a typo was unlikely to be picked up. The interface constrained what was offered, and if the functionality was basic you might find yourself providing a level of abstraction to make it simpler and more intuitive.
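
A tiny, hypothetical Python sketch of that kind of wrapper – the low-level call and its flags are made up for illustration:

    def raw_open(path, mode_flags, share_flags):
        # stand-in for a low-level call whose contract lives only in printed docs
        return {"path": path, "mode": mode_flags, "share": share_flags}

    def open_for_reading(path):
        # the wrapper encodes the implied contract once, in one place
        return raw_open(path, mode_flags=0x01, share_flags=0x00)

    print(open_for_reading("config.ini"))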

The SOA and ESB Era

We did not help ourselves with SOA (Service Oriented Architecture) and ESB (Enterprise Service Bus)[2], as they inherited these issues. We added the ability to formally define the interface via WSDL, but this just enforced the hard boundaries. It meant that people could validate their code to ensure that they were calling the interface in a syntactically appropriate manner, but they were still expected to honour the contract (the interface). This is essentially a one-size-fits-all approach and assumes that we can envisage all the scenarios in which the interface is likely to be consumed.
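
As a minimal sketch (in Python, with made-up field names) of what such a hard boundary feels like in practice – any deviation from the published contract, however harmless, is rejected:

    REQUIRED_FIELDS = {"customer_id", "amount", "currency"}

    def validate_strict(message):
        missing = REQUIRED_FIELDS - message.keys()
        unknown = message.keys() - REQUIRED_FIELDS
        if missing or unknown:
            raise ValueError(f"contract violation: missing={missing}, unknown={unknown}")

    try:
        # a consumer that sends one extra field is rejected outright
        validate_strict({"customer_id": "c1", "amount": 10, "currency": "GBP", "note": "hi"})
    except ValueError as e:
        print(e)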

It also meant that we needed to change the interface for even a minor change in the usage scenario – and what about large changes? We developed elaborate schemes for versioning the interface so that it could change over time, but in some instances the interface still could not change fast enough.
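
A hypothetical sketch of where this leads – each minor addition for one consumer forces a whole new version, and the routing table only grows:

    def get_order_v1(order_id):
        return {"id": order_id}

    def get_order_v2(order_id):
        # one minor addition for one consumer forces a whole new version
        return {"id": order_id, "status": "shipped"}

    HANDLERS = {"v1": get_order_v1, "v2": get_order_v2}  # the table only grows

    def get_order(version, order_id):
        return HANDLERS[version](order_id)

    print(get_order("v2", "A-1"))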

We experienced a similar issue with ESBs, as the data was tied to a standard structure (later we realised that the trick was to use a canonical structure), which meant that we then needed to map the data to each consumer, and each of these mappings had to be written by hand. Later offerings provided the means to undertake mapping at the boundaries, but this relied on the canonical structure offering appropriate containers for the information.
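
Something like the following hypothetical Python sketch, where every consumer needs its own hand-written mapping from the canonical structure:

    canonical = {"party": {"given": "Ada", "family": "Lovelace"}, "ref": "A-1"}

    def to_billing(msg):
        # one hand-written mapping per consumer...
        return {"customer_name": msg["party"]["given"] + " " + msg["party"]["family"]}

    def to_shipping(msg):
        # ...each maintained separately as the canonical structure evolves
        return {"recipient": msg["party"]["family"], "order_ref": msg["ref"]}

    print(to_billing(canonical))
    print(to_shipping(canonical))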

We also explored the ideas of orchestration and choreography, which were intended to provide more flexibility: we could have a service that orchestrated others, or one that combined a number of services at a level of the system architecture to provide a new service. These just built more services that had to be learnt and made the system more fragile – if an underlying service failed, what was the effect, given that all these other services expected their contracts to be honoured?
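
A hypothetical sketch of that fragility – the service names are invented, and the failure is simulated:

    def check_stock(item):
        raise TimeoutError("stock service unavailable")  # simulate one dependency failing

    def take_payment(order):
        return "paid"

    def place_order(order):
        check_stock(order["item"])    # one broken contract here...
        return take_payment(order)    # ...and the whole composition fails

    try:
        place_order({"item": "widget"})
    except TimeoutError as e:
        print("orchestration failed:", e)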

Complexity Perspective

If we look at this from a complexity perspective, we need flexibility, as we can never foresee every situation. That is, we should expect change and need a paradigm that supports it. Alicia Juarrero uses the phrase ‘sloppy fit’ to stress this, which is similar to Postel’s law. This allows an element of adaptability, as the interface is not hard and can support some degree of ambiguity. We need the boundaries to be ‘loose’ so that we don’t have to redefine or extend the interface every time there is a minor change in the usage scenario. The interface needs to be flexible and acknowledge that this is a sociotechnical problem and not just a technical one. This ‘sloppiness’ promotes resilience through adaptability and flexibility. The alternative is to accept that the interface will fail (not meet needs) at times and provide a means of early detection and fast recovery.
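
A minimal Python sketch of a ‘sloppy’, tolerant boundary in the spirit of Postel’s law – the message fields are made up – where the reader takes what it understands, defaults what is missing, and ignores what it does not recognise:

    def read_order(message):
        return {
            "order_id": message.get("order_id") or message.get("id"),  # accept either name
            "quantity": int(message.get("quantity", 1)),               # default what is missing
            # unknown fields are simply ignored rather than rejected
        }

    print(read_order({"id": "A-1", "quantity": "2", "gift_wrap": True}))
    # {'order_id': 'A-1', 'quantity': 2}

Contrast this with the strict validator earlier, which would have rejected the same message outright.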

We do not seem to be very good at the temporal aspects of systems, so causality is something that we need laid out. This means that the boundaries need to be flexible or permeable so that we have an understanding of what happens – from a complexity perspective, a black box is the last thing we need. The open source community has been useful here, as it has allowed people to look ‘inside’ the box and work out what is going on. In some instances they may have a better way of achieving the same thing, and it also supports tailoring of the service to the current context, not some arbitrary scenario.

The third point is that we have assumed a fail-safe mentality in building these system architectures. From a complexity perspective, failures will happen, and the system architecture should be safe-fail and not fail-safe. The latter assumes that we can foresee all the failure scenarios, whereas the former assumes that things will fail in ways we cannot foresee and therefore the system should be built to be tolerant.
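
A hypothetical sketch of the safe-fail attitude in code – assume the call can fail in ways we did not predict, detect it quickly, and degrade gracefully:

    def fetch_recommendations(user_id):
        raise ConnectionError("unforeseen failure")  # simulate something we did not predict

    def recommendations_with_fallback(user_id):
        try:
            return fetch_recommendations(user_id)
        except Exception:
            return []  # degraded but safe: the page still renders without the panel

    print(recommendations_with_fallback("u1"))  # []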

Micro-Services

These are things that I believe micro-services have the opportunity to address, as they make the interactions explicit and allow the interface to reflect the consumer’s requirements. This allows for flexibility, as it is not necessarily a one-size-fits-all model. It also allows for evolution of the interface over time (although we need to manage the diversity, this is getting easier as the tooling improves). What we do need to do is take note of Postel’s law and try to ensure that the interface supports some degree of sloppiness.
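
A hypothetical sketch of an interface shaped per consumer rather than one-size-fits-all – the consumers and fields are invented:

    ORDER = {"id": "A-1", "item": "widget", "quantity": 2, "price": 9.99}

    CONSUMER_VIEWS = {
        "mobile": ["id", "item"],               # the mobile app only needs a summary
        "billing": ["id", "quantity", "price"], # billing needs the numbers
    }

    def order_for(consumer):
        # each consumer gets an interface shaped to its requirements
        return {field: ORDER[field] for field in CONSUMER_VIEWS[consumer]}

    print(order_for("mobile"))   # {'id': 'A-1', 'item': 'widget'}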

Because the dependencies are visible, it is easier to consider the implications of failure and to start making the system safe-to-fail and not just fail-safe. This leads to a more resilient architecture that should degrade gracefully and make containment of failure easier.

The issues that were inherent in SOA are not completely addressed by micro-services, but they are a large step in the right direction, acknowledging that we need more flexibility, transparency and design for safe-fail. The interfaces are simple (or potentially simple), and therefore cause and effect can be more easily determined and understood – in addition, we can make the interface a bit sloppy (this can be applied to most interface design) so that we have some slack.

Closing

This is not an argument that everything should be developed as a micro-service – Hotels.com, for example, have split their monolithic system architecture into 5 parts, and this provides sufficient isolation and flexibility for them. The point is: the next time someone tells you that we should just treat it as a black box, don’t take the statement at face value.

[1] https://en.wikipedia.org/wiki/Jon_Postel

[2] I’m glossing over remoting and attempts like DCE to provide structured approaches to remote objects.