A Toolbox of Software Architecture Review Techniques - Pt 3

Part 3: Evaluation Criteria - "Is it good enough FOR WHAT"

(This is the third post in the series on architecture review techniques. Check out Part 1 and Part 2 if you haven’t done so already)
Evaluation criteria - you should definitely have some
Seriously. That’s not just a throwaway line to kick off a blog post. You’ve probably also been in situations where someone has asked you, «Can you check out this architecture and tell me if its ok?». Sure, you can usually come up with some feedback based on experience, but mostly you want to reply with, «Can I tell you if it’s ok FOR WHAT? What's important for you? Is it security, performance, maintainability....?». You need to work on your evaluation criteria if you want to deliver a useful architecture review.

There are plenty of techniques for capturing evaluation criteria. Such as:
  • experience
  • scenarios
  • checklists
  • design principles
  • architecture smells
  • anti-patterns
  • metrics
We'll go through the most common ones and you can google around for the rest which are useful to have in your toolbox.

This post is a bit long but working with evaluation criteria can be the most important part of a review, so I think it's worth it. As an additional benefit, working with quality attributes, which are the basis for evaluation criteria, is a tool that will give you testable design criteria that you can use to drive an architecture design process and not just review an architecture.

But first a word on experience. You can come a long way by relying solely on experience to perform the review - especially if you actually have both the technical and domain experience that you think you do. But you will get more out of an architecture review if you take a the time to systematically work through your evaluation criteria first.

tl;dr summary:
  • determine which qualities (or quality attributes) are most important for the system to deliver - eg: performance, maintainability, usability, etc
  • use checklists, generic requirements and technical requirements for criteria that are relevant across solutions and projects
  • elicit verifiable scenarios that operationalise the generic requirements and to exemplify qualities that are specific to your solution
  • relying on experience can work if you have deep domain knowledge in addition to software architecture expertise.

System Qualities

«...The architecture is the primary carrier of system qualities, such as performance, modifiability, and security, none of which can be achieved without a unifying architectural vision. Architecture is an artifact for early analysis to make sure that the design approach will yield an acceptable system. Architecture holds the key to postdeployment system understanding, maintenance, and mining efforts…»
SEI definition
Evaluation criteria start with the system qualities. Typically these have been called non-functional requirements but more and more people are using the term “quality attributes” and “quality requirements” instead. Ideally, whoever wants the system reviewed would have a list of system qualities that they want the solution to achieve - but very few people have them at a level that can be used for a review. Often there is a document somewhere in the organisation called "Non functional requirements (NFR)" that lists a mix generic design goals, developer guidelines and specific test criteria and constraints that are applicable across all projects. For example, the following are taken from some real non-functional requirements documents:
  • “The solution must be flexible enough to evolve as the business requires”
  • “Response times for customer-facing webapps must be < 2secs for 95% of all requests”
  • “Always use the prescribed logging package and not System.out…”
Evaluation criteria should be verifiable and these generic non-functional requirements documents usually require some form of operationalising to get evaluation criteria that are useable. For instance, what does “flexible enough” mean in the context of the system you are evaluating? Other qualities such as “peak number of concurrent users” will also be specific to that system.

In addition there can be solution-specific quality attributes that are important for your system, and these need to be verified with their own quality requirements that are probably not mentioned in the generic list of NFRs. Consider a boring document archive system that you need to have for compliance reasons. It is most likely a long-lived, non-innovative solution where sustainability is a very important quality attribute - that is, how can the system be maintained over a 10+ year period. Sustainability is a quality attribute that is not usually mentioned in a common set of requirements but it could be quite important to some specific solutions.

Extracting the total set of requirements for the qualities that you want your system to have, involves both the identification of those requirements that are solution-specific and the operationalisation of those generic common requirements within the context of the system under review.

An important step in an architecture review is to elicit these criteria and one of the most useful results from an architecture review can simply be to get architects and product owners more proactive in eliciting, refining, maintaining, and testing these qualities in the design evolution of the solution.

Identifying and prioritising system qualities

Start by determining which qualities are relevant for your system. There are many sets of -illities you can use, but it can be useful to ones that are provided in the ISO 25010 software quality model standard and then customise them as needed for your particular project.

ISO25010 model

As an example of customisation, perhaps the Usability aspects aren’t relevant for your back-end project; or you need to extend the refinements for Security to include Authentication, Defense-in-depth, etc; or you need an additional quality which isn’t depicted in the model - for instance issues around DevOps or legal Compliance. Wikipedia has quite a comprehensive list of system qualities that you can use to customise the above list.

You might identify the important qualities and refinements simply from the business drivers and scope of the review or perhaps you will need to evolve the tree as part of scenario workshops, which I’ll describe later in this blog.

Most of these qualities are well understood, except perhaps those under “functional suitability”. You may assume that all these are covered under testing. However, I once worked on a project to automate the processing of social security applications. It was only in a scenario workshop with case-handlers that we learned the importance and consequence of ensuring that the system was completely correct with respect to the relevant legislation. For instance, if the case was sent to an appeals court then the govt agency needed to show exactly how the automatic processing occurred. This required two changes: a) extensions to the domain model to record far more information about rule execution so that processing could be recreated at a later date; and b) a change from user stories and acceptance criteria for functionality directly connected to legal compliance to a more detailed technique - RuleSpeak - that removed ambiguity and interpretation by developers.

The SEI uses a Quality Utility Tree, which can be cumbersome to use in practice, but you can represent them in multiple different ways. For instance my colleague Mario uses an impact map, and here’s some other examples from Arnon Rotem-Gal-Oz blog series on system qualities:
There are two purpose for this structured set of quality attributes:
  1. To get you thinking about the evaluation criteria you need use in an evaluation. Even if you are going to evaluate based on experience, it’s useful to work through all qualities that are important for the solution owner in order to minimise the chance that you overlook something important.
  2. Have a representation of system quality that you can discuss with product owners, domain experts, etc who will help identify, prioritise, and refine testable evaluation criteria. It's not often these stakeholders consider issues beyond functional user stories and a visual representation can help drive workshops to elicit evaluation criteria. It also helps get them to take more ownership of these quality issues - especially when they have to vote on priority and there are tradeoffs that need to be made.

Evaluation criteria based on system qualities

Once you have the important quality attributes for the solution then you can exemplify them with testable evaluation criteria. These usually takes the form of:
  • scenarios or
  • checklists
Before getting to those it’s worth mentioning that the most popular type of evaluation criteria is experience - i.e., no explicit evaluation criteria. If you have enough experience with both software architecture and the particular problem domain, then in effect, you have scenarios and checklists as tacit knowledge.


Scenarios are the most often recommended approach for capturing evaluation criteria. For each quality attribute you work with relevant stakeholders in the project to capture specific, quantified scenarios that you can use as acceptance test cases for the architecture. For instance, you can see example scenarios in a quality utility tree in following image.

These scenarios need to be verifiable. One technique is to use SMART scenarios - specific, measurable, actionable, realistic, and timebound. Another format is to specify Context, Stimulus, and Response from the system. These scenarios are like acceptance test cases for the architecture and the Context, Stimulus, Response format is very similar to the Given, When, Then format used in BDD/Specification By Example that you may be already familiar with.
  • Given (Context) normal conditions
  • When (Stimulus) a write operation on an entity occurs
  • Then the (Response) from the system should be less than 500 milliseconds.
It can be difficult to get stakeholders to specify evaluation criteria in this detailed format - especially if they have little experience with quality attributes. If that is the case then it's best to get them to start with simple, specific examples rather than trying for detailed, quantified, specifications of the requirements they want. In this sense you can think of scenarios as user stories. An example requirement that is the used to start a more detailed discussion with the domain experts rather than a perfect specification. You can use an iterative process to make them more detailed in subsequent workshops. As an example, flexibility/maintainability is always a tough quality to give testable criteria for. Crystal-ball gazing can be error-prone and throw up examples that may never occur, but stakeholders will likely have many examples of things that have been difficult to change in the past and will have domain knowledge of things that will likely change in the future - knowledge that IT people will usually not have themselves. Its important to make these criteria explicit and it may take a series of workshops with stakeholders to get the important, testable quality attribute scenarios out in the open.

A useful tool for generating, refining, and prioritising these scenarios is the Quality Attribute Workshop. The workshop gathers relevant stakeholders and works through the following steps:
  • identify the important qualities attributes and refinements (the architecture drivers)
  • brainstorming of scenarios with broad selection of stakeholders. At this stage those scenarios can be quite high level and as simple as a bit of prose on some post-it notes.
  • scenarios are then grouped together as appropriate and prioritised using voting from those present
  • the highest priority scenarios are then refined using the “Context, Stimulus, Response” format and quantified.
  • finally, those scenarios are classified for (Benefit, Difficulty to Realise) using a High, Middle, and Low rating.
The result is a tree or table such the one in the previous diagram. There is also a more lightweight workshop format you can use - the mini quality attribute workshop. (presentation slides and video)

These techniques for generating scenarios are especially helpful when running an architecture review in an domain in which you are not an expert - for example if you are called in as an external reviewer. Additionally, they can be useful when one of the objectives is to improve understanding of the system drivers to a larger group.

It may not always be useful, or necessary, to spend time generating scenarios. Creating scenarios with stakeholders is time consuming and it can be challenging to get them refined and quantified. An IEEE Software article from 2008 includes a debate between Tom Gilb and Alistair Cockburn on the relative merits of quantifying these scenarios and the return on investment for the amount of work you put in. Unfortunately, the debate is behind the IEEE paywall, but you can read Alistair’s thoughts from his website and Tom’s writings on the benefits of requirements quantification are well detailed. Additionally, you may not have time to run these workshops within the time constraints of the review, or perhaps the scope of the review doesn’t require detailed scenarios. But being able to elicit and detail evaluation criteria using scenarios is a useful technique to have.


Checklists are probably the most used technique in practice for capturing evaluation criteria. Whilst scenarios are good for capturing project and domain specific evaluation criteria, checklists are good for capturing the technical architecture issue that exist across multiple projects. For example, I have a checklist that deals solely with integration architecture that I use on most projects.

There’s not much to say about using checklists. Simply create and maintain checklists for the quality issues that exist across all projects and use them in your reviews. The answers to this question on stackexchange provide a great collection of checklists. For instance, this one from Code Complete:

Additionally, TOGAF recommends the whole architecture review process be designed around checklists.

TOGAF review process with checklists

The TOGAF architecture compliance review process is not as detailed as the ones I’ll get to in later posts, but the TOGAF guide provides a useful set of checklists for areas such as:
  • Hardware and Operating System Checklist
  • Software Services and Middleware Checklist
  • Applications Checklists
  • Information Management Checklists
  • Security Checklist
  • System Management Checklist
  • System Engineering/Overall Architecture Checklists
  • System Engineering/Methods & Tools Checklist

Other sources for checklists include:
I’m sure there are plenty more out there, but these should be a good starting point for developing your own.


In this post of the blog series we’ve looked at evaluation criteria which are one of the most important parts of the Review Inputs (remember Review Inputs from the SARA report that we looked at in the last post?)

Let’s wrap it up with a rewording of the original tl;dr summary:
  • determine which qualities (or quality attributes) are most important for the system to deliver - eg: performance, maintainability, and usability. These are the architecture drivers.
  • use checklists, generic requirements and technical requirements/constraints for criteria that are relevant across solutions and projects.
  • elicit verifiable scenarios that operationalise the generic requirements and to exemplify qualities that are specific to your solution.
  • use workshops with stakeholders (where appropriate) to both make use of their extensive domain knowledge and to get them to take ownership of system quality in addition to functional user stories.
  • relying on experience can work if you have deep domain knowledge in addition to software architecture expertise.

The next post in the series will move on to next major area in architecture reviews - Methods and Techniques.

Thanks to @marioaparicio for comments on a draft of this article.

A Toolbox of Software Architecture Review Techniques - Pt 2

Part 2: Building on our collective experience

The first article in this series looked at the simplest thing that could be useful when doing architecture reviews. This one looks at how you structure a more systematic review using the collected experience of our industry.

Standing on each other’s shoulders rather than each other’s toes

Previously I lamented the fact that we’ve been both using techniques for architecture reviews for 30+ years - so why are we still not making use of them effectively? Wouldn’t it be great if someone collected all that experience and made it easy to use in practice? Well, they have - it’s called the Software Architecture Review and Assessment (SARA) report. Here we’ll look closer at how to use it to design a structured architecture review.

Back in 1999 a working group got together to collect industrial experience and research techniques for performing architecture reviews. There were many participants from multiple organisations and they presented the SARA report at the ICSE in 2002. (Obbink2002, [pdf]) If you plan on running an architecture review, or currently do them regularly, you should grab a copy and make use of it.

A digression: Apart from ICSE, there are some other conferences you should check out. For instance the «Working International Conference on Software Architecture (WICSA)» - (and the ECSA, and SATURN conferences). There are more conferences that discuss useful techniques for software architects than just the QCon and GOTO series.

The SARA report provides:
concrete, practical, experience-based guidance on how to conduct architectural reviews. This includes guidelines on:
  • what steps to follow,
  • what questions to ask,
  • what information to collect and document,
  • what documentation templates to use, and
  • tips on how to manage the social, managerial, and technical issues that arise when reviewing an artifact as important and complicated as a software architecture.
There are additional articles that collect experience with architecture reviews and provide an overall structure. I’ll include those as well in this article, but the main focus will be how to structure an architecture review using the SARA report.

Designing a Structured Architecture Review

The report structures the knowledge about architecture reviews into the following parts:

  • Review Inputs
  • Review Outcomes
  • Review Workflow
  • Methods and Techniques
  • Pragmatics and People Issues

These parts are a good starting point for planning your review.

The report then finishes with case studies to exemplify each aspect. This article will primarily focus on the inputs and outputs and we’ll come back to the others in later posts.

It begins by defining concepts and how they hang together. Nothing fancy, but an important place to start (figure 4-1).

The most important concepts are what the report terms the ASRs and ASDs:
The purpose of an architecture review is to understand the impact of every architecturally significant decision (ASD) on every architecturally significant requirement (ASR).
This sounds perfectly reasonable and straightforward to do. The problem is - as the report notes - that it’s often very difficult because the architecturally significant requirements are often hard to identify, the architecturally significant decisions are often not documented, and the way the decisions interrelate is often not easily understood. A significant part of the review process is often teasing these out.


The inputs include:
  • Objectives
  • Scope
  • Architecture description and artefacts
  • Evaluation Criteria
Obviously you’re going to need artefacts that describe the architecture, but the first input you need to consider are the objectives for the review. Why you are doing the review and who you are doing it for will drive the rest of the inputs and outcomes that you need to identify.

There are (at least) two starting points for running a review:

  1. It’s a one-off review that has been requested by particular stakeholders in the project.
  2. It’s one of a series of regularly scheduled reviews as part of the project lifecycle.


From this starting point you then identify what you want to get out of the review. The report lists the following examples:

  • Certifying conformance to some standard:
    • For example: Does the architecture fulfill the constraints and requirements of the relevant standards?
  • Assessing the quality of the architecture (the most common objective):
    • Does the architecture fit to the problem or mission statement?
    • Can specific qualities (e.g., scalability, performance, etc.) be architecturally controlled?
    • Etc.
  • Identifying opportunities for improvement:
    • Which design decisions should be revised in order to improve the architecture?
  • Improving communication between stakeholders.

Other objectives can include reviewing the portfolio of a newly acquired company to determine functional overlap and to determine if it’s worthwhile investing further in an existing, long-lived system compared to purchasing/building something based on more recent design/technology.


Once you have the objectives, then you need to set the scope of the review. There are many aspects to consider and you need to agree on both what’s in and what’s out of scope for the review. This is especially important if it’s a one-off review and you are an external reviewer. Scope creep is just a damaging for reviews as it is for software development - especially in situations where stakeholders may see the review as a means for pushing their own agenda. Examples of scope include:

  • Is it the whole system or just some of the major components?
  • Is it a system of systems?
  • Are you including all the stakeholders or just a selection?
  • Will you consider all the evaluation criteria, or just focusing on some of them?

The answers to these questions will often be affected by the time available, stakeholder availability, concerns of whomever has commissioned the review and if it’s a one-off review or part of a regular series of reviews.

Evaluation criteria

With the objectives and scope you can then move on to evaluation criteria - the requirements and other inputs that you want to throw against the architecture to see how it stands up. We’ll dig into these in the next article but they include:

  • Quality requirements
    • Scenario-based and experience-based
  • Check lists
  • Application of architecture patterns
  • Relevant standards, regulations and legislation
  • Architecture smells

Evaluation criteria can also include more than these traditional requirements. For instance, a useful input is a high level problem statement. It might be the world's greatest architecture, but if its not actually solving the right problem then its not much good. Similarly, you might have a Business Motivation Model or Impact Map - which gives you the same content but with more hipster packaging - that traces the business strategy down to specific tactical goals for the system.

Architecture description

Finally we get to the input where most people start - the architecture description itself. Often some (or all) of this documentation won't exist. It may be part of the initial stage of a review to work with the relevant team members to create the architecture views needed. As a starting point you could use a common set of views. For instance, the Kruchten 4+1 view model is good starting point. More recently I’ve also seen people using Simon Brown’s C4 model.

4+1 ModelC4 Model
Logical ViewSystem Context diagram
Development ViewContainer diagram
Process ViewComponent diagrams
Physical ViewClass diagrams
+1 for scenarios that instantiate the views for particular use cases...

These are great places to start. However for doing effective reviews you should have a good understanding of which architecture views you need. These will be affected by the objectives for the review, the quality requirements that are architecturally significant, and the point in the project lifecycle that you’re running the review. You need to identify the architecture views that allow you to apply the evaluation criteria to meet your objectives. For instance, the C4 doesn’t have behaviour views, so if you need to evaluate anything to do with timing and concurrency then you will need to provide an additional view. Rozanski and Woods have captured a many in their book on Views and Viewpoints for Software Architecture - and many are on their book website.

You may have additional architecture artefacts than just the architecture description. There may be proof-of-concepts, feasibility studies, or if it’s part of an agile project you may have a walking skeleton solution. For reviews of existing systems you’ll (hopefully) have the code.


That’s quite a lot on inputs - though there is more in the SARA report that is worth exploring. Now let’s look at the collective experience with outcomes.

The main tangible output - usually some kind of report - will depend on the objectives and the type of review you are performing. The SARA report provides a suggested structure for the report which you can use as a starting point and tailor appropriately.

Note that a review that is part of regular planned review activities and performed by people within the organisation/project will be less formal than if you are an external reviewer who is performing a one-off review for a particular stakeholder. The later will have greater need for documentation and justification for the outcomes.

The primary outcome is a list of risks, opportunities and recommendations concerning how the design deals with the prioritised evaluation criteria. But before we get into those there are also intangible outcomes that should not be ignored. One of the most useful outcomes from a review is simply the improved understanding about the planned architecture - especially for those who are not central in the design team. Additionally, a review team may not identify significant risks but the simple process of having the relevant team members explain the design will help share knowledge - and the act of doing this can often help the architects find and plug holes in their own understanding.

The first outcome most people think of are the risks, or issues, with the design. A useful approach is to identify both the risk and any relevant tradeoff that the stakeholders need to consider. For example, a design for a mobile app could recommend a native app over a html5 solution. The architecture review may document the extra cost and resources for supporting multiple platforms for this decision. But often what the stakeholders need is to understand the tradeoff so that they can prioritise the correct qualities. For instance, for a mobile game the user experience may be prioritised over the cost of supporting multiple platforms. For a relatively simple data entry app, the tradeoff might be prioritised the other way.

Risks don’t have to be solely about design. The review can identify issues such as:

  • business plan
  • process issues
  • requirements analysis
  • applied standards and technologies
  • design and implementation
  • test and monitoring
  • human resource issues

Be sure to also document strengths of the architecture under review and also the aspects that could not be reviewed. For instance, because the review:

  • lacked documentation
  • lacked access to particular people
  • was too early in the project to consider them in detail.

Finally,you should include recommendations and an action plan. Identifying weaknesses is one thing but stakeholders want suggestions for how to deal with them. Those recommendations don’t need to be a list of things that must be done. They can be an opening for discussions with the architecture team.

Review Workflow

Organise your review into the following phases:

  • review inception
  • the review itself
  • post review

The first phase results in the agreement on the review scope, cost, duration, participants, etc. The second phase is the ..., iterative process of discovering, capturing and comparing ASRs and the architecture description. The last phase concentrates on summarizing and communicating the finding, as well as improving the review techniques and methods

Methods and Techniques

The report provides an inventory of methods and techniques and I’ll go into them in subsequent articles. These will discuss:

  • Should you use scenario-based or experience-based method?
  • How will you identify those scenarios?
  • Do you need a full ATAM (Architecture Tradeoff Analysis Method) or should you use a lightweight variant?
  • Do you need quality-specific techniques such as Rate Monotonic Analysis to analyse issues with real-time systems?
  • etc.

Pragmatics and People issues

The report finishes with a section on Pragmatics and People issues. These are extremely important but I think this blog post has covered enough.


It’s not uncommon in our industry to bump into a grey-haired software architect, mention some new solution or technique, and watch them roll their eyes and ramble on about how it’s just a rehash of something they’d used or read about decades before. And it’s true, many things that we treat as new in software architecture have been solved before - it’s just that those solutions are unnecessarily hard to find.

Techniques for software architecture reviews are one of those areas and this article has described how to structure a more systematic software architecture review based on many years of collective experience. Go and download the SARA report and build on the experience of others.

Thanks to Pär for reviewing a draft of this article.