A Toolbox of Software Architecture Review Techniques - Pt 1

Part 1: The problem with many architecture reviews and the simplest thing you can do to make them better

I was at the Software Architect conference in London back in October and saw Rob Smallshire (@robsmallshire) give a talk on Conducting an Architecture Review. Rather than presenting techniques for reviewing an architecture, he instead focussed on the things that go on around a review that you need to get right in order to be successful - such as how to get stakeholders involved, how to get the right buy-in, how to set expectations with the project owner, and how to get the right people to participate and what they should contribute. It was a good talk, and you should check out the video in the link.

Chatting with people in the break it was clear that whilst lots of people do some kind of architecture / design review in their work, they aren’t very familiar with established techniques and methodologies for performing them. There was actually an industry study (Babar2009 [pdf]) a few years ago which showed that most organisations use home-grown, informal, ad-hoc techniques. And of the approximately 40% that claim to use a structured approach very few have heard of or use any established technique - despite how long they have existed and been successfully used.

So this is a short blog series about architecture review techniques - the ones that exist, the differences between them, and how to choose between them based on your context. In this first post I’ll start with the simplest technique that can help and then get into more detail later on.

Overview Stuff

Let’s start with a simple overview. As with many topics in software architecture, Grady Booch has provided a short and to-the-point description of how to perform an architecture review (Booch2010):

  • Identify the forces on the system.
  • Grok the system’s essential architecture.
  • Generate scenarios that exercise the relevant forces against the architecture.
  • Throw the essential architecture against those scenarios, then evaluate how they land relative to the relevant forces.
  • Wash, rinse, repeat.

A short digression: This reference (Booch2010) is to an IEEE Software article. Although IEEE Software is an excellent source of information, it is behind a paywall and therefore it may as well not exist for the vast majority of software architects that I know. Luckily they provide a free podcast where Grady Booch reads the article for you. For all the references mentioned in these blog posts I’ll try to also link to publicly accessible sources for further reading (podcast index, mp3).

That list of steps is everything you need in a nutshell. Pretty much all other sources of information are just an elaboration of one or more of those steps in varying degrees of formality.
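
To make those steps a little more concrete, here is a minimal sketch of how a review could be tracked as data. This is my own illustration, not something from Booch's article: the class names, fields, and example forces are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """A concrete situation that exercises one or more forces."""
    description: str
    forces: list[str]


@dataclass
class Finding:
    """How the architecture landed for one scenario and one force."""
    scenario: str
    force: str
    holds_up: bool
    note: str


@dataclass
class Review:
    forces: list[str]        # step 1: the forces on the system
    architecture: str        # step 2: a pointer to the essential architecture
    scenarios: list[Scenario] = field(default_factory=list)   # step 3
    findings: list[Finding] = field(default_factory=list)     # step 4

    def add_scenario(self, description, forces):
        # A scenario may only exercise forces that were identified in step 1.
        unknown = [f for f in forces if f not in self.forces]
        if unknown:
            raise ValueError(f"scenario refers to unidentified forces: {unknown}")
        self.scenarios.append(Scenario(description, forces))

    def record(self, scenario, force, holds_up, note):
        self.findings.append(Finding(scenario, force, holds_up, note))

    def uncovered_forces(self):
        """Forces no scenario exercises yet: input for the next iteration (step 5)."""
        covered = {f for s in self.scenarios for f in s.forces}
        return [f for f in self.forces if f not in covered]


# Hypothetical example data.
review = Review(forces=["availability", "modifiability"],
                architecture="overview-diagram-v1")
review.add_scenario("Primary database node fails during peak load", ["availability"])
review.record("Primary database node fails during peak load", "availability",
              False, "No failover path described")
print(review.uncovered_forces())  # ['modifiability']
```

The only real point of the sketch is the last method: after each pass you can see which forces have not yet been exercised by any scenario, which tells you where to aim the next iteration.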

Let’s begin with a very basic technique, then look at the structure of more detailed reviews and the aspects of your context that help you decide which parts of a detailed review structure are necessary for you.

Q: What’s the simplest thing I could do that would be useful?
A: Active Design Reviews

Active Design Reviews (Parnas1985 [pdf]) is one of the original discussions on the topic - and it’s still excellent. There are some older sources (e.g., Fagan1999 [pdf]) but this description from Parnas in the 1980s is a great place to start. The wording may read a little old fashioned in places - e.g., “...a description of the modules access programs…” - but the principles are just as applicable now as they were back then.

Parnas' article begins with a description of how ad-hoc "design reviews" are performed in practice and why this makes it difficult for the review to produce a useful result. I’m going to repeat them here because even though these design review anti-patterns were described back in the 80s I still see them practiced today on large-scale projects with budgets in the order of 50-100M euros.

Here’s the sequence of events Parnas describes in an ad-hoc review:

  1. A massive quantity of highly detailed design documentation is delivered to the reviewers three to four weeks before the review.
  2. The designated reviewers, many of them administrators who are not trained in software development, read as much of the documentation as is possible in the time allowed. Often this is very little.
  3. During the review, a tutorial presentation of the design is given by the design team; during this presentation, the reviewers ask any questions they feel may help them to understand the design better.
  4. After the design tutorial, a round-table discussion of the design is held. The result is a list of suggestions for the designers.

And here are the consequences. Again, this is pretty much verbatim because Parnas captured it so well 30+ years ago:
  1. The reviewers are swamped with information, much of which is not necessary to understand the design… decisions are hidden in a mass of implementation details.
  2. Most reviewers are not familiar with all of the goals of the design and the constraints placed on it.
  3. All reviewers may try to look at all of the documentation, with no part of the design receiving a concentrated examination.
  4. Reviewers who have a vague idea of their responsibilities or the design goals, or who feel intimidated by the review process, can avoid potential embarrassment by saying nothing.
  5. Detailed discussions of specific design issues become hard to pursue in a large all-encompassing design review meeting.
  6. People who are mainly interested in learning the status of the project, or who are interested in learning about the purpose of the system may turn the review into a tutorial.
  7. Reviewers are often asked to examine issues beyond their competence.
  8. There is no systematic review procedure and no prepared set of questions to be asked about the design.
  9. As a result of unstated assumptions, subtle design errors may be implicit in the design documentation and go unnoticed.

Sound familiar? I don’t understand how we were pretty clear about these problems - and how to overcome them - 30+ years ago and yet this still happens in many places - but that’s a topic for another rant...

Parnas goes into detail about how to address these problems but I won’t repeat that now because we’ll get to it when we look at more recent techniques. But there are a number of principles in his recommendations which provide the simplest thing you could possibly do.

Let’s start with the most important - and no prizes for guessing where he gets the name of his article from - the reviewer has to take an active stance. That involves two things:

  • As a reviewer you must prepare in advance checklists and scenarios that you want to test for the system qualities that you are interested in.
  • And for those scenarios you must ask active, open questions. I.e., questions that require a descriptive answer and not simply «yes/no» questions. You want the designer to explain how the solution will solve a particular problem or how the design would need to change for a proposed future need. That is, never ask questions such as "is it highly available?". Instead, ask "which state is maintained between calls?", "what are the consequences if that server goes down?", etc.
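
If you keep your prepared questions in a checklist, you can even do a crude automatic check for the yes/no kind before the review. The sketch below is my own illustration rather than anything from Parnas' article, and the list of opening words is an assumed, incomplete heuristic.

```python
# Words that typically open a yes/no question (an assumed, incomplete list).
CLOSED_OPENERS = {"is", "are", "was", "were", "can", "could", "does", "do",
                  "did", "will", "would", "has", "have", "should"}


def is_closed_question(question: str) -> bool:
    """Return True if the question looks like it can be answered with yes/no."""
    words = question.strip().lower().split()
    return bool(words) and words[0] in CLOSED_OPENERS


def closed_questions(checklist: dict[str, list[str]]) -> list[tuple[str, str]]:
    """List (quality, question) pairs that should be rephrased before the review."""
    return [(quality, q)
            for quality, questions in checklist.items()
            for q in questions
            if is_closed_question(q)]


# A checklist keyed by system quality, using the example questions from above.
checklist = {
    "availability": [
        "Is it highly available?",
        "Which state is maintained between calls?",
        "What are the consequences if that server goes down?",
    ],
}
for quality, question in closed_questions(checklist):
    print(f"[{quality}] rephrase: {question}")
```

A heuristic like this will miss plenty of cases, but it is enough to catch the most common offenders while you prepare.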

Some other useful principles are also part of his approach to running a review:

  • You will most likely need an iterative approach to the review. That is, you’ll need to run a number of sessions where you start with the overall architecture and then subsequent iterations to dig into more detail or specific areas of concern.
  • You shouldn’t approach the task as a single, all-encompassing, «let’s just review everything» workshop. Instead, you need to identify particular aspects of the design - the qualities of the system - that are most important. Also, you will most likely need multiple people on the review team and they should focus on particular qualities. When working with your typical enterprise applications there is usually a security expert who has a specific area of focus, but other system qualities such as modifiability, performance, usability, testing, operations, etc., can also be assigned to particular people on the team.
  • Finally, make sure there is a design representation that is actually reviewable. That doesn’t mean you need 300 pages with inch-perfect UML specifications and mathematical proofs. But it also doesn’t mean a random bunch of box and line pictures with no description of what those boxes and lines are supposed to represent. Identify the views that you need in order to depict the system qualities that are important. Then use a notation that other people understand. UML, Archimate, or whatever.
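
One simple way to keep yourself honest about assigning qualities to people is to write the plan down and check it for gaps. Again, this is just a sketch of my own; the reviewer names and the list of qualities are hypothetical.

```python
def unassigned_qualities(priorities, assignments):
    """Return the prioritised qualities that no reviewer has been asked to focus on."""
    covered = {q for qualities in assignments.values() for q in qualities}
    return [q for q in priorities if q not in covered]


# Hypothetical review plan: the qualities that matter most, and who looks at what.
priorities = ["security", "modifiability", "performance", "operations"]
assignments = {
    "reviewer_a": ["security"],
    "reviewer_b": ["modifiability", "performance"],
}
print(unassigned_qualities(priorities, assignments))  # ['operations']
```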

If there is only one thing that you get from this article series, then just focussing on these takeaways will help you perform better reviews. In fact, whenever you are in a review and you hear someone ask a question such as, «Is the response time good enough?», «is it secure?», «is it service oriented?», «can that be reused?», I recommend that you print out the Active Design Reviews article, roll it up, and give them a good whack with it - and then make them read it before they are allowed to take part in another review.

That’s probably enough for the first article - it’s already drifting into TL;DR territory.

Before we get into specific review techniques it’s worth looking at the structure of a comprehensive review (in a little more detail than Booch’s 4 steps) so that it’s easier to compare them. That will be the topic for the next post.

Thanks to colleagues @marioaparicio and @morten_moe for reviewing a draft of this article.

If all you have is an Elephant and 6 Blind Men

I was in a presentation the other day where (once again) a speaker invoked the Blind Men and the Elephant parable to explain some misunderstanding.

I'm sure you know the one:
  • There's a large IT project with a lot of different people involved.
  • There are differences of opinion about what is important.
  • According to the speaker, the reason we can't agree is because everybody is concentrating solely on their portion of the system rather than looking at the system as a whole.
I can't remember how many times I've heard this story used and I'm sure you've also heard it plenty of times. In fact, as architects in the IT world it seems to be one of those Metaphors We Live By.

Unfortunately it also seems to be the only analogy we use to explain this situation. As such it is a good example of another analogy that is used (and abused) in our discipline - If all you have is a hammer, everything looks like a nail.

Here's an alternative so we can have more than a single hammer to explain different perspectives on large projects:
Fly, Bat, Worm.

This was described in Neal Stephenson's book Anathem and it goes something like this. Instead of 6 blind men groping different parts of the elephant, imagine a Fly, a Bat, and a Worm perceiving the entire elephant and trying to communicate with each other about it. The (ideal) Fly perceives everything solely through sight, the (ideal) Bat perceives everything solely through hearing, and the ideal Worm solely through touch. How can these three discuss what they have perceived? How does a deaf Fly describe something to the blind Worm, or vice versa?
"[The Fly] says 'the worm seems to be relating some kind of account of its wormy doings, but since I don't squirm on the ground and can't imagine what it would be like to be blind, I haven't the faintest idea what it's trying to tell me!'"
While different people working on a large IT project don't perceive with different senses, they do conceive their mental models of the proposed solution differently. We're not so starkly different as to be communicating like a blind Worm to a deaf Fly, but the people working on a large project have different types of knowledge through which they both understand the problem domain and devise solutions using IT. Even people with nominally similar roles can have quite different detailed knowledge models of their area.

For example, Enterprise/Solution Architects are not completely divorced from the "wormy doings" of the Operations people who "squirm on the ground"[*], but they are culturally different. Even if you have experience across all the different facets of Business and IT, different levels of expertise result in different levels of detail in the conceptual models we use to understand our problems and solutions. We might use the same terms in discussions but that does not mean we are immune to cultural differences in understanding. In fact the way we use the same terms often hides the differences in understanding that we have. We may think we have reached consensus, but internally we can have quite different interpretations of those words - i.e., the contexts that govern their meaning - and therefore the consequences for how they interact to provide a solution. "Service" anyone?
[*] No offense to the excellent Ops people I know. s/Operations/Development | TechArchitect | BusinessArch/ etc as required.

Designing a system requires constructing an understanding - a conceptual model of the solution. I.e., the concepts, relationships and interactions that need to exist for the solution to work as intended. It is a constantly evolving process that combines models of the problem domain with possibilities presented by the technology in our solution domain. We need to iterate on this process to improve our knowledge of both during design. And on large projects we also need to collaborate and reach consensus with others while performing this iterative learning process.

But how we understand the problem domain and how we devise solutions within it is subjective, not objective. How we identify concepts in the domain is theory-laden. How we devise explanatory models is influenced by our culture, language, and expertise. The definition of culture when we consider its influence on developing models can be as broad as "western culture" or as narrow as "java developers", "declarative language developers", "scrum practitioners", "enterprise architects", "EAI integration experts", "DevOps", etc.

We don't perceive the same reality and simply create different models for it. We understand that reality through our existing sets of concepts. To bastardize another metaphor: we see our domains, and solutions, through culture-tinted glasses, whether we are explicitly aware of this bias or not.

I'm not going deep into the theory in this post, but the fields of metaphysics, epistemology, cognitive psychology, neurophilosophy, philosophy of science, philosophy of social science (and others) all deal with this issue in their respective fields. And while there are many debates in those fields about the extent to which subjectivity influences how we gain, use, and communicate knowledge, it's certainly broadly agreed that it does influence it. If you're interested, here's a popular recent article in the NY Times on the role of language and how it shapes how we conceive reality.

The influence of subjectivity in system development is not new in IT. Some researchers and practitioners have been discussing its effects since the 60s: people such as Peter Naur, Terry Winograd, Meir M Lehman, and Bruce Blum. The most interesting aspect is that it is now slowly filtering through to the mainstream IT world's books and conferences. E.g., Domain Driven Design alluding to the influences of modelling and Kevlin Henney's JAOO presentation on the role of Metaphor (pdf), just to name two.

But we should be considering it more explicitly. As Gottlob Frege wrote,
The objects in the world are already delineated to some extent by the classifications embodied in socially inherited language. In fact, learning a language essentially means learning to grasp objective thought concepts.
We should be spending more effort understanding how this influences large-scale system design and development.

Which brings me back around to the situation that usually results in the Elephant and Blind Men analogy being deployed:
  • There's a large IT project with a lot of different people involved.
  • There are differences of opinion about what is important.
  • The reason they can't agree is because it is impossible for everyone to see the same holistic picture. Not because we are concentrating solely on individual portions of the system rather than looking at the system as a whole.

We observe and use the same terms to talk about the overall design but we conceive it differently based on our cognitive apparatus. We're not all looking at different parts of the same elephant. On a large IT project we can all see the whole elephant, we just see it differently - like looking at an Andy Warhol picture of an elephant.

The only way to overcome these differences is through interaction and feedback. Not to say, "but you aren't looking at the big picture", but rather, "what do you mean by that?", and "can you give me a specific scenario?".

Image references:

Blind Men and the Elephant: WikiMedia Commons
Fly, Bat, Worm: Cafepress.co.uk
Calvin and Hobbes on Relativism
Warhol Elephants

SEMAT and the origins of Software Engineering

I hope commentators on the SEMAT initiative - on both sides - have read enough history so that the rest of us aren't doomed to listen to a repeat performance of the last 40 years.

It's been interesting for me to follow the initial steps of the Software Engineering Method and Theory (SEMAT) initiative. Reading the SEMAT blog and articles revived memories from when I spent a bunch of years doing research on software engineering foundations. So I'm very curious to see how it progresses. Unfortunately, the initial buzz reminds me a lot of the fledgling steps of the original "software engineering" movement and I hope we don't waste an opportunity to deal with an important issue by continuing to bitch about the meaning of the word engineering.

It's hard to argue with the SEMAT Call to Action - especially the identification of immature practices. We do lack a theoretical foundation that provides both an explanation for the practices that have proven to be useful and a justification for proposing approaches for further improvement (unless of course you believe we have already reached the pinnacle of software development effectiveness).

Many of the responses to the initial Vision and first workshop (google blog search semat, twitter search #semat) are made by knowledgeable and insightful commentators. However, a great deal of what I read is simply a restatement of arguments that have been made in our discipline - often more eloquently and constructively - for more than 40 years. Perhaps catching up with a bit of history can focus people on providing an improvement rather than rehashing old opinions.

P. Naur and B. Randell, (Eds.). Software Engineering: Report of a conference sponsored by the NATO Science Committee, Garmisch, Germany, 7-11 Oct. 1968, Brussels, Scientific Affairs Division, NATO (1969) 231pp.
B. Randell and J.N. Buxton, (Eds.). Software Engineering Techniques: Report of a conference sponsored by the NATO Science Committee, Rome, Italy, 27-31 Oct. 1969, Brussels, Scientific Affairs Division, NATO (1970) 164pp.
The transcripts of the original two software engineering conferences are available online. They contain many working papers, but most importantly, also transcribed debates between the participants which capture very interesting views on both sides. As a bonus, there are also the first analogies I can find between software design and the design theories of Christopher Alexander.

The NATO conferences on software engineering were held in 1968 and 1969 and were intended to provoke ideas for improving software development. Indeed the term "software engineering" was meant to provoke and not to dictate a frame of mind needed to produce dependable software.
The phrase ‘software engineering’ was deliberately chosen as being provocative, in implying the need for software manufacture to be based on the types of theoretical foundations and practical disciplines, that are traditional in the established branches of engineering (1968, Preface)
There were many debates about the nature of engineering and whether or not it can be applied in software development which elicited useful insights into the nature of software development. Indeed, some of those comments allude to understandings which are considered relatively recent, such as agile development.
“[Naur] In my terms design consists of:
Flowchart until you think you understand the problem.
Write code until you realise that you don’t.
Go back and re-do the flowchart.
Write some more code and iterate to what you feel is the correct solution.” (1968 Report)
Additionally, there were points made, such as the usefulness of mathematical formalisms for software development, which are still not satisfactorily addressed today.

The important take away is that the result of the first software engineering conference was debate. The term engineering was put forward to provoke discussion on a commonly agreed set of problems - a Call to Action.

The problem was when they reconvened in 1969 for the next conference. The thing you notice in the transcripts is that the topic of debate has changed. There is no longer a sense of how should we frame our thinking to resolve these issues. The notion that software development could be "engineered" had taken hold. The only question now was, how. This was also explicitly pointed out by the editors of the transcripts:
Unlike the first conference, at which it was fully accepted that the term software engineering expressed a need rather than a reality, in Rome there was already a slight tendency to talk as if the subject already existed. And it became clear during the conference that the organizers had a hidden agenda, namely that of persuading NATO to fund the setting up of an International Software Engineering Institute. However things did not go according to their plan. The discussion sessions which were meant to provide evidence of strong and extensive support for this proposal were instead marked by considerable scepticism, and led one of the participants, Tom Simpson of IBM, to write a splendid short satire on "Masterpiece Engineering" (Editor's report)
(btw: anyone who thinks they have a witty and insightful take on why software development isn't like engineering should read "Masterpiece Engineering")

This communication gap in the 2nd software engineering conference happened very quickly and was not overcome. This resulted in a whole group of people who could have made useful contributions being removed from the debate.
The sense of urgency in the face of common problems was not so apparent as at Garmisch [1968]. Instead, a lack of communication between different sections of the participants became, in the editors’ opinions at least, a dominant feature. (Editor's report)
It's pretty clear in the external commentary on the first SEMAT workshop that this is happening again. Whether or not they have chosen to remove themselves from the effort, there is now a significant group of people who could be helping to achieve the goals of the Call to Action who are simply not taking part.

There is always a risk that the subject of debate gets 'highjacked' by a particular side - such as those who believe that software development can be engineered. Using the word "highjacked" is very emotive and implies a deliberate misdirection of a movement, but in reality it is that people are at risk of pushing an approach that simply conforms to their preconceived world-view. Many of the comments on SEMAT and software engineering in general appear obvious to their authors and their like-minded colleagues. But they make little sense to the people on the other side. They are based on incommensurable viewpoints. That is, you simply can't find a common basis for comparing them.

We each have a worldview through which we understand our domain (in this case software development). Those worldviews can be incommensurable like the pro- and anti- software engineering camps. But there is no right or wrong answer here. The only thing that matters is whether or not they provide useful suggestions for improving our discipline.

I don't know if engineering is the most useful paradigm through which software development can be improved. However, you only have to have a basic understanding of philosophy of science to understand that it's not possible to tell until after theories have been developed, evaluated, and falsified.

The SEMAT Call to Action captures issues that exist in software development. The software-engineering group needs to devise theories which can be proven useful in resolving those issues. The its-not-engineering proponents should be doing the same. It will help everyone if the debate between them continues instead of becoming a divide that stifles possibilities.

I'll finish off with my favourite quote by Dijkstra and Randell in the '69 transcripts:
“Dijkstra: I would like to comment on the distinction that has been made between practical and theoretical people. I must stress that I feel this distinction to be obsolete, worn out, and fruitless. It is no good, if you want to do anything reasonable, to think you can work with such simple notions. Its inadequacy, amongst other things, is shown by the fact that I absolutely refuse to regard myself as either impractical or not theoretical.
What is actually happening, I am afraid, is that we all tell each other and ourselves that software engineering techniques should be improved considerably, because there is a crisis. But there are a few boundary conditions which apparently have to be satisfied. I will list them for you:
We may not change our thinking habits.
We may not change our programming tools.
We may not change our hardware.
We may not change our tasks.
We may not change the organisational set-up in which the work has to be done.
Now under these five immutable boundary conditions, we have to try to improve matters. This is utterly ridiculous. Thank you. (Applause).
Randell: ... ‘There’s none so blind as them that won’t see.’ ... If you have people who are completely stuck in their own ways, whether these are ways of running large projects without regard for possible new techniques, or whether these are ways of concentrating all research into areas of ever smaller relevance or importance, almost no technique that I know of is going to get these two types of people to communicate. ... You have to have good will. You have to have means for people to find out that what the others talk is occasionally sense. This conference may occasionally have done a little bit of that. I wish it had done a lot more. It has indicated what a terrible gulf we so stupidly have made for ourselves.

improving software architecture competence

How do you go about getting better as a software architect?
You'd think there would be plenty of information available.

Recently our local IASA chapter had a session on Improving Software Architecture Competence, so I threw together some thoughts to present.

I figured it would be pretty easy. There must be lots of documented experience of what individuals or organisations have done to improve themselves, right? Well, there doesn't seem to be. There is plenty of information about sw architecture course syllabuses (e.g., IASA's own education program) and no shortage of opinion about what software architects should do/read, but almost no experience reports saying, “this is what we tried and this is what we learned about trying to improve our software architecture competence”. (BTW, there may be some out there – I don't claim to have done any more exhaustive research than some simple googling).

However, I did dig up two interesting articles to summarise, and I think they are useful reading for someone interested in the topic. The first questions the ability for any sw architect to learn and improve using the kind of education techniques we currently have. The second – a workshop on improving software architecture competence – is the sole experience report I could find.
  •  “On System Design” was an essay delivered at OOPSLA a few years ago by Jim Waldo. In it, Waldo laments what he sees as a decline in the art and craft of system design. He points out a number of reasons for this, but the most relevant here is his observation that we are unable to adequately train people in system design using the prescriptive techniques we currently use. Indeed, when considering the qualities of good designers, how they think and work may be more important than what courses they have done.
"...good design is a capability that some people have, and others simply do not...
Whether this is an innate skill that people are born with, or one that is cultivated over time in ways that we don’t understand, is a question far too deep for me to address here. I neither know nor care...
But by the time someone is designing a computer system, whatever it takes to be a good designer is either there or it is not. When it is there, it can be developed and honed. It can also be degraded or warped. But when it is not there, there is no technique or process that can make up the deficit..."
"Everyone I talked to had a similar story of the master designer who had, either consciously or by example and correction taught him or her what they considered to be the important lessons in design
... Design, if my experience is any indication, is best learned by a long and varied process of trying, failing, and trying again under the guidance of someone who is an expert at the task". Waldo, On System Design
Both are useful reads if you are interested in the topic and there are plenty of notes in the presentation. If you've seen another resource on this topic I'd appreciate it if you provide a pointer in the comments.

On a related note, I'm looking forward to Fred Brooks' upcoming book, The Design of Design: Essays from a Computer Scientist.

new blog

It seems to be quite popular for Sun people to be shifting blog address and I'm doing the same.

Now that OpenESB has been made "non-strategic" and while we wait for the legal processes that take place during the acquisition of global companies, I've got some more time to spend working on software architecture in general. So I've started up another blog where I can keep track of my ideas.
I have no idea if anyone regularly follows this blog, but if you are interested in another viewpoint on software architecture ideas then feel free to come on by.

If I end up with a job at Oracle, and one that provides something worth writing about, then I'll resuscitate the old site.