Performance problem mitigation per test level

(2021-08-15)

Some potential causes of performance bottlenecks can be verified quite early in the development cycle, while others are harder to pinpoint and require full, production-like environments to identify. The table below highlights this by giving examples of the different types of risks and the corresponding mitigation actions.

Some of these tests could, and should, be performed in the CI/CD pipeline, but some are harder to fit into the automated process flow.

While it's impossible to write a comprehensive list of potential causes of performance problems, view this list as examples of the types of risks to be mitigated. The list is compiled as an instrument to communicate that performance mitigation should be considered at a range of different testing levels, and that there is a myriad of activities to perform to ensure the capacity and performance of a system.

| Risk | Mitigation method | Test level (system scope) | Note |
| --- | --- | --- | --- |
| Unnecessarily resource-demanding SQL expressions | Tool-supported static analysis | Developer IDE | |
| Improvable code, e.g. sub-optimal data type usage or unnecessary code iterations | Tool-supported static analysis | Developer IDE | |
| Data structure problems | Tool-supported static analysis / unit testing | Developer IDE | |
| Utilized libraries degrading performance when updated | Unit testing | Developer IDE / CI | |
| Resource-consuming error management | Unit testing | Developer computer / System test | |
| Database locks | Unit testing | System test | Parallel execution of unit tests |
| Application architectural problems | End-point testing | System test | Might need huge data volumes to identify problems |
| Deep AD structure | End-point testing | System test | |
| Misconfigured middleware | End-point testing | System test | Container infrastructure, and IoC, make OS settings bug-fixable |
| Operating system limitations | End-point testing | System test | Container infrastructure, and IoC, make OS settings bug-fixable |
| Large data volumes in single tables + full table scan | End-point testing | System test / System integration test | |
| Obsolete infrastructure configuration reverting to fallback (e.g. DNS) | Performance testing on full-scale integrated systems | System integration test | |
| Performance-affecting license limitations for infrastructure components (IPS/Firewall/Switch/CPU core count) | Performance testing on full-scale integrated systems | System integration test | Some infrastructure components have license limitations for bandwidth or similar. These may differ between environment types. |
| Misconfigured load balancers | Performance testing on full-scale integrated systems | System integration test | |
| Huge data volumes transferred over the network | Performance testing on full-scale integrated systems | System integration test | |
| Heavy security solutions affecting performance | Performance testing on full-scale integrated systems | System integration test | Different environment types may use different security solution implementations |
| Hardware limitations | Performance testing on full-scale integrated systems | System integration test | |
| Resource limitations (settings) | Functional volume testing on full-scale integrated systems | System integration test | E.g. MQ queue length, logged-in user slots |
| Colliding batch jobs or other scheduled tasks | Long-term testing under load on full-scale integrated systems | System integration test | |
| Unexpected load peaks | Stress testing on full-scale integrated systems | System integration test | Campaigns, out-of-the-ordinary events |
| Increasing data volumes in the system in the long run | Static analysis at architectural level | System integration test | Analysis of decreasing performance over time |
| Production incident root cause analysis with fault slip-through analysis | Assembled task force for analysis | Production | |
| Slowly degrading performance over time | Monitoring / APM | Production | |
| Unknown use cases for the system | Slow roll-out of updates, to one user group at a time | Production | For any use case that hasn't been found during feature analysis |
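
As an illustration of the unit-testing entries above (for example, a utilized library that degrades performance when updated), a timing budget can be asserted in an ordinary unit test that runs in the developer IDE or in CI. The following is only a minimal Python sketch under stated assumptions: the helper names `median_runtime` and `within_budget`, the stand-in workload `parse_records`, and the 0.5 second budget are all hypothetical and would need to be replaced with a real call and a budget measured on the actual build agents.

```python
import statistics
import time
from typing import Callable


def median_runtime(func: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over `repeats` calls."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    # The median is less sensitive to a single slow run than the mean.
    return statistics.median(samples)


def within_budget(func: Callable[[], object],
                  budget_seconds: float,
                  repeats: int = 5) -> bool:
    """True if the median runtime of `func` does not exceed the budget."""
    return median_runtime(func, repeats) <= budget_seconds


def parse_records() -> list:
    """Hypothetical workload standing in for a call into an updated library."""
    return [int(text) for text in map(str, range(10_000))]


def test_parse_records_stays_within_budget() -> None:
    # Assumed budget: chosen generously here to avoid flaky failures on
    # slow or shared CI agents. Tune it from real measurements.
    assert within_budget(parse_records, budget_seconds=0.5)
```

A test like this only catches coarse regressions, since timings on shared build agents vary. That is one reason the heavier risks in the table are assigned to system and system integration test levels rather than to unit tests.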