Non-Functional Requirements (NFRs) are used extensively in enterprise software to define things like performance requirements and availability expectations, but it’s important to be specific about what we’re asking for.
I spend a lot of time helping my clients understand why the details matter for NFRs, so let’s walk through an example to demonstrate what I’m talking about.
Let’s assume we’re dealing with a system that is going to process 3-5 million transactions per day with peak volume during business hours at around 50 transactions per second. A sample NFR for such a system might be stated as:
The system must process transactions in less than 5 seconds.
To satisfy this requirement, we need to establish how those 5 seconds will be measured. Is it an average, a maximum, or a percentile? I find it’s usually best to specify both a maximum latency and at least one percentile value. We also need to define the sampling period for the percentile-based metrics.
We can now reword this NFR to include a few boundaries, such as:
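To see why the choice of statistic matters, here is a minimal Python sketch. The latency values are made up for illustration, and the nearest-rank percentile is just one of several common percentile definitions, so treat it as an assumption rather than a recommendation.

```python
import math

def nearest_rank_percentile(samples, pct):
    """Return the smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank method: 1-based rank, rounded up.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical latencies in seconds: 98 fast transactions and 2 slow ones.
latencies = [1.0] * 98 + [9.0, 12.0]

summary = {
    "average": sum(latencies) / len(latencies),
    "p99": nearest_rank_percentile(latencies, 99),
    "maximum": max(latencies),
}
print(summary)
```

For this sample the average is about 1.19 seconds, the 99th percentile is 9 seconds, and the maximum is 12 seconds. The same data passes "less than 5 seconds" as an average and fails it as a percentile or a maximum.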
99.99% of all transactions in any given 10-second window must complete within 5 seconds, and no transaction can take more than 10 seconds.
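A reworded NFR like this can be checked mechanically. The sketch below is one possible reading, not the only one: it assumes fixed, non-overlapping 10-second windows keyed by completion time, treats "within 5 seconds" as less than or equal to 5, and uses hypothetical sample data. A sliding-window interpretation would need a different grouping step.

```python
from collections import defaultdict

WINDOW_SECONDS = 10
LIMIT_SECONDS = 5.0        # percentile threshold from the NFR
HARD_LIMIT_SECONDS = 10.0  # absolute maximum from the NFR
REQUIRED_PER_10000 = 9999  # 99.99%, kept as an integer to avoid float rounding

def evaluate_windows(transactions):
    """transactions: iterable of (completion_time_seconds, latency_seconds).

    Returns a dict mapping each window's start time to True if the window
    meets both the percentile target and the hard maximum.
    """
    windows = defaultdict(list)
    for completed_at, latency in transactions:
        start = int(completed_at // WINDOW_SECONDS) * WINDOW_SECONDS
        windows[start].append(latency)

    results = {}
    for start, latencies in sorted(windows.items()):
        within = sum(1 for x in latencies if x <= LIMIT_SECONDS)
        meets_percentile = within * 10000 >= REQUIRED_PER_10000 * len(latencies)
        meets_maximum = max(latencies) <= HARD_LIMIT_SECONDS
        results[start] = meets_percentile and meets_maximum
    return results

# Hypothetical data: 500 transactions per window, roughly 50 per second.
first_window = [(i * 0.02, 1.0) for i in range(500)]
second_window = [(10 + i * 0.02, 1.0) for i in range(499)] + [(19.99, 6.0)]

results = evaluate_windows(first_window + second_window)
print(results)
```

Under these assumptions the first window passes and the second fails. At roughly 50 transactions per second a 10-second window holds about 500 transactions, so a single transaction over 5 seconds drops that window to 99.8%. This is the kind of interaction between percentile and sampling period that is worth checking before committing to the numbers.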
Better, but there’s still room for improvement.
We need to define the boundaries of “the system”. In fact, we may want to define more than one system boundary: one that reflects the user experience, and one that measures the components that are under our control.
The final piece of the puzzle is defining how error scenarios fit into our NFRs, particularly if the architecture has automated retries built into the design. To handle this, let’s split the requirement into a few more specific NFRs: one to capture the standard successful flow; one to cover a transaction that was successful after a few retries; and a separate NFR entirely to cover acceptable levels of failure.
Reworking our example above, we might end up with NFRs like this:
At least 99.9% of all successful transactions in any given 10-second window must complete within 5 seconds, from Submit to confirmation as seen by the user.
At least 99.99% of all successful transactions in any given 10-second window must complete within 5 seconds, from API request to API response.
No transaction may take more than 10 seconds from the user’s point of view.
At least 95% of all transactions should succeed without any automated retries.
At most 0.01% of all transactions may fail due to system errors.
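As a rough illustration of how these five statements translate into checks, here is a Python sketch. The `Transaction` record, its field names, and the sample data are all hypothetical, and for brevity the checks run over a single batch of records rather than over every 10-second window.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class Transaction:
    user_latency: float  # seconds, Submit to confirmation as seen by the user
    api_latency: float   # seconds, API request to API response
    succeeded: bool
    retries: int         # automated retries before the final outcome
    system_error: bool   # True if the failure was caused by a system error

def share(count, total):
    """Exact ratio; an empty population is treated as trivially compliant."""
    return Fraction(count, total) if total else Fraction(1)

def check_nfrs(records):
    if not records:
        raise ValueError("no records to evaluate")
    ok = [t for t in records if t.succeeded]
    total = len(records)
    first_try = sum(1 for t in records if t.succeeded and t.retries == 0)
    system_failures = sum(1 for t in records if not t.succeeded and t.system_error)
    return {
        "user_latency": share(sum(1 for t in ok if t.user_latency <= 5), len(ok)) >= Fraction(999, 1000),
        "api_latency": share(sum(1 for t in ok if t.api_latency <= 5), len(ok)) >= Fraction(9999, 10000),
        "user_maximum": all(t.user_latency <= 10 for t in records),
        "first_attempt_success": share(first_try, total) >= Fraction(95, 100),
        "system_failures": share(system_failures, total) <= Fraction(1, 10000),
    }

# Hypothetical batch of 10,000 transactions.
records = (
    [Transaction(2.0, 1.0, True, 0, False)] * 9600   # succeeded first time
    + [Transaction(4.0, 3.0, True, 1, False)] * 399  # succeeded after one retry
    + [Transaction(8.0, 7.0, False, 3, True)]        # failed with a system error
)

baseline = check_nfrs(records)
with_outlier = check_nfrs(records + [Transaction(11.0, 4.0, True, 0, False)])
print(baseline)
print(with_outlier)
```

In this made-up batch every check passes. Adding one successful transaction that took 11 seconds from the user’s point of view fails only the maximum, which shows why the maximum is worth stating separately from the percentiles.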
These NFRs are now detailed enough that we can capture the appropriate metrics, build reporting, and design tests to verify the system’s performance.