
Black Boxes and Boundaries

Introduction

The premise of this post is that there is no such thing as a black box. David Parnas’s idea of information hiding, which he first described in 1972, has been taken to the extreme, and we need to understand that hard boundaries are not necessarily what we humans need; to put it another way, to insist on hard boundaries is to deny our humanity. We seem to have forgotten Postel’s law, which states that “an implementation should be conservative in its sending behaviour, and liberal in its receiving behaviour.”[1]

Background

Early versions of Microsoft operating systems were treated as black boxes, with little documentation of the effect of a call. It was not until a book appeared that gave an idea of what the calls actually did that writing programs for the new operating system took off (there was also Undocumented Windows by Andrew Schulman, which reflected the earlier work). This is fine if the interfaces are largely independent and there are no transitive dependencies between calls. If a call to A affects B, which in turn affects C, we have a linear causal chain. If we have another sequence where D affects E, and E in turn depends on C, we have a transitive dependency. To understand how the interface will behave over time we need to understand these dependencies, and if they are not documented we have to probe the object in an attempt to determine the effect – the dependencies are causal in nature but there is no mechanism directly visible, and therefore we cannot treat the system as a black box. This is one of the issues with Smalltalk, where everything is an object and therefore has hard boundaries.
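
To make this concrete, here is a minimal, hypothetical sketch – the class and method names are mine, not from any real API – of how a call to A can change the behaviour of C two steps away, with no visible mechanism connecting them.

    # Hypothetical sketch of hidden causal coupling behind an opaque interface.
    # A call to call_a changes hidden state that call_b reads, which in turn
    # changes what call_c returns - a transitive dependency invisible to the caller.

    class OpaqueApi:
        def __init__(self):
            self._mode = "default"     # hidden state, not mentioned in the documentation
            self._buffer_size = 512    # hidden state shared by several calls

        def call_a(self, mode):
            # Documented as "sets the mode"; silently also changes call_b's behaviour.
            self._mode = mode

        def call_b(self):
            # Documented as "prepares a buffer"; its effect depends on call_a's hidden state.
            self._buffer_size = 4096 if self._mode == "fast" else 512

        def call_c(self):
            # Depends on call_a transitively via call_b, though it never mentions call_a.
            return "processing with a buffer of %d bytes" % self._buffer_size

    api = OpaqueApi()
    api.call_a("fast")
    api.call_b()
    print(api.call_c())   # output changed by call_a, two steps away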

The other issue that arose is that the interface is the interface, and you have to abide by the implied contract in order to consume it. If you stray from the contract then you may again experience unforeseen effects. There was also the related issue that there was no formal language for describing these interfaces other than the printed documentation, so a typo was not likely to be picked up. The interface constrained what was offered, and if the functionality was basic you might find yourself providing a level of abstraction to make it simpler and more intuitive.

The SOA and ESB Era

We did not help ourselves with SOA (Service-Oriented Architecture) and the ESB (Enterprise Service Bus)[2], as they inherited these issues. We added the ability to formally define the interface via WSDL, but this just enforced the hard boundaries. It meant that people could validate their code to ensure that syntactically they were calling the interface in an appropriate manner, but they were still expected to honour the contract (the interface). This is essentially one size fits all, and it assumes that we can envisage all the scenarios that are likely to consume the interface.

It also meant that we needed to change the interface whenever there was a minor change in the usage scenario – and what about large changes? We developed elaborate schemes for versioning interfaces so they could change over time, but the rate of change they supported was often insufficient.

We experienced a similar issue with the ESB, as the data was tied to a standard structure (later we realised that the trick was to use a canonical structure), which meant that we then needed to map the data for each consumer, and each of these mappings had to be written. Later offerings provided the means to undertake mapping at the boundaries, but this relied on the canonical structure offering appropriate containers for the information.

We also explored the ideas of orchestration and choreography, which were intended to provide more flexibility, as we could have a service that orchestrated others, or one that combined a number of services at a level of the system architecture to provide a new service. These just built more services that had to be learnt and made the system more fragile – if an underlying service failed, what was the effect, given that all these other services expected their contracts to be honoured?

Complexity Perspective

If we look at this from a complexity perspective, we need flexibility, as we can never foresee every situation. That is, we should expect change and need a paradigm that supports this. Alicia Juarrero uses the phrase ‘sloppy fit’ to stress this, which is similar to Postel’s law. It allows an element of adaptability, as the interface is not hard and can support some degree of ambiguity. We need the boundaries to be ‘loose’ so that we don’t have to redefine or extend the interface every time there is a minor change in the usage scenario. The interface needs to be flexible and acknowledge that this is a sociotechnical problem and not just a technical one. This ‘sloppiness’ promotes resilience through adaptability and flexibility. The alternative is to accept that the interface will fail (not meet needs) at times, and to provide a means of early detection and fast recovery.
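
As an illustration of what a ‘sloppy’, Postel-style boundary can look like in practice, here is a minimal sketch of a tolerant reader – the message format, field names and defaults are assumptions made purely for the example.

    # A tolerant reader in the spirit of Postel's law: ignore unknown fields,
    # default missing optional ones, and only reject when the essentials are absent.
    import json

    DEFAULTS = {"quantity": 1, "priority": "normal"}

    def read_order(raw):
        data = json.loads(raw)
        return {
            "id": data["id"],   # the one field we genuinely require
            "quantity": data.get("quantity", DEFAULTS["quantity"]),
            "priority": data.get("priority", DEFAULTS["priority"]),
            # fields added by a newer producer are simply ignored, so the
            # interface tolerates minor changes in the usage scenario
        }

    # an older producer, a newer producer and a minimal one can all be consumed
    print(read_order('{"id": 42, "quantity": 3}'))
    print(read_order('{"id": 43, "priority": "high", "gift_wrap": true}'))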

We do not seem to be very good at the temporal aspects of systems, so causality is something that we need laid out. This means that the boundaries need to be flexible, or permeable, so that we can understand what happens – from a complexity perspective a black box is the last thing we need. The open source community has been useful here, as it has allowed people to look ‘inside’ the box and work out what is going on. In some instances they may find a better way of achieving the same thing, and it also supports tailoring of the service to the current context, not some arbitrary scenario.

The third point is that we have assumed a fail-safe mentality in building these system architectures. From a complexity perspective failures will happen, and the system architecture should be safe-fail and not fail-safe. The latter assumes that we can foresee all the failure scenarios, whereas the former assumes that things will fail in ways we cannot foresee, and therefore the system should be built to be tolerant of failure.

Micro-Services

These are things that I believe micro-services have the opportunity to address, as they make the interactions explicit and allow the interface to reflect the consumer’s requirements. This allows for flexibility, as it is not necessarily a one-size-fits-all model. It also allows for evolution of the interface over time (although we need to manage the resulting diversity, this is getting easier as the tooling improves). What we do need to do is take note of Postel’s law and try to ensure that the interface supports some degree of sloppiness.

Because the dependencies are visible, it is easier to consider the implications of failure and to start to make the system safe-to-fail and not just fail-safe. This leads to a more resilient architecture that should degrade gracefully, and it makes containment of failure easier.
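
Here is a minimal sketch of what containing a failure can look like when the dependency is explicit – the service endpoint and fallback value are hypothetical, and a real system would more likely use a proper circuit breaker than a bare try/except.

    # Degrade gracefully when a downstream service fails: the caller still gets
    # an answer, just a generic one, and the failure does not cascade upwards.
    import urllib.request

    FALLBACK_RECOMMENDATIONS = ["bestsellers"]   # a safe, degraded default

    def get_recommendations(user_id):
        url = "https://recommendations.internal/users/%s" % user_id   # hypothetical endpoint
        try:
            with urllib.request.urlopen(url, timeout=0.5) as response:
                return response.read().decode("utf-8").splitlines()
        except OSError:
            # covers connection errors and timeouts; because the dependency is
            # explicit, we know exactly what failing here costs us
            return FALLBACK_RECOMMENDATIONS

    print(get_recommendations("42"))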

The issues that were inherent in SOA are not completely addressed by micro-services, but they are a large step in the right direction, acknowledging that we need more flexibility, more transparency and a design for safe-fail. The interfaces are simple (or potentially simple), and therefore cause and effect can be easily determined and understood – in addition we can make the interface a bit sloppy (this can be applied to most interface design) so that we have some slack.

Closing

This is not an argument that everything should be developed as a micro-service – Hotels.com, for example, have split their monolithic system architecture into five parts, and this provides sufficient isolation and flexibility for them. The point is that the next time someone tells you that we should just treat something as a black box, don’t take the statement at face value.

[1] https://en.wikipedia.org/wiki/Jon_Postel

[2] I’m glossing over remoting and attempts like DCE to provide structured approaches to remote objects


The Risks of Adopting the Wrong Approach

In Achieving success in large, complex software projects, Sriram Chandrasekaran, Sauri Gudlavalleti and Sanjay Kaniyar of McKinsey (1) advocate moving from a functional delivery model that is silo-based to one based on cross-functional teams that are module-oriented. There are two problems with this model that I would like to raise.

The first is that the article describes large projects as complex in nature without understanding that in a complex system behaviour is emergent. There is a short introduction to complex systems in the recently published Cynefin paper on InfoQ (2). One of the key points is that no amount of analysis or planning will lead to an understanding of how a complex system will develop. Such systems are dispositional in nature and therefore have a tendency to evolve in certain directions, but this is not a given and cannot be assumed. In this type of system the only viable delivery strategy is one that is iterative/incremental in nature, which allows you to manage the development of the system in a desired direction. Trying to base the delivery on a set of point-in-time requirements is unrealistic and fundamentally flawed, which the agile community has known for years. Simply moving to a cross-functional model will not address this fundamental issue with traditional delivery models. As an aside, Bryan Appleyard (3) notes, in his most recent book, that simple solutions don’t work for complex problems.

The paper goes on to talk about grouping the work by use case to support these cross-functional teams so that they can operate in parallel. This assumes that work can be grouped by use case, but it does not elaborate on how this in itself will support parallel working. One of the key things that you need to ensure is that the use cases are disjoint, and one way you can do this is by relating them to capabilities (4), along the lines of domain-driven design (5). This allows for partitioning of the problem space and for parallel streams of work to be undertaken. It also allows you to manage the parallelism, which is mentioned as an issue with agile practices. It is not really an agile issue but one that is generic to large programmes.

References

  1. Achieving success in large, complex software projects, Sriram Chandrasekaran, Sauri Gudlavalleti and Sanjay Kaniyar, McKinsey, July 2014
  2. Cynefin 101 – An Introduction, July 2014
  3. The Brain is Wider Than the Sky: Why Simple Solutions Don’t Work in a Complex World, Bryan Appleyard, Sep 2012
  4. The Next Revolution in Productivity, Ric Merrifield, Jack Calhoun, and Dennis Stevens, Harvard Business Review, June 2008
  5. Domain-driven Design: Tackling Complexity in the Heart of Software, Eric Evans, Aug 2003