That tense feeling you get when users find an important issue in production.
The panic when a manager asks the dreaded question.
Searching through your own notes and test results, wondering: how did I miss this?
Eventually, taking some deep breaths, you remind yourself that exhaustive testing is impossible.
However, you’re not entirely off the hook. As a quality champion for your team, you can help identify risk areas in the software development process and suggest improvements. To bolster your case for change, explain how each improvement could have prevented recent production issues, or at least detected them sooner.
Some recent examples of risk areas from my own experience include:
- The test environment was configured differently to the production environment (for example, it had no load balancer), which hid the issue.
- The telltale sign in the log files was missed during testing due to the amount of noise in the error logs and lack of monitoring tools.
- Test data was too far removed from production data, so the problem wasn’t apparent.
- Automated test coverage was lacking in some areas, due to limitations in the framework.
- Some important legacy code was lacking unit tests, which I hadn’t realised.
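For the first item in that list, a small automated comparison of environment settings can flag this kind of drift before testing starts. The following is a minimal sketch, assuming both environments' settings can be loaded as nested dictionaries. The function name `config_drift` and every setting name and value are hypothetical, chosen only for illustration.

```python
from __future__ import annotations

from typing import Any


def config_drift(test: dict[str, Any], prod: dict[str, Any], prefix: str = "") -> list[str]:
    """Return readable descriptions of the differences between two nested config dicts."""
    differences: list[str] = []
    # Walk the union of keys so settings missing on either side are reported.
    for key in sorted(set(test) | set(prod)):
        path = f"{prefix}{key}"
        if key not in test:
            differences.append(f"{path}: missing in test (production has {prod[key]!r})")
        elif key not in prod:
            differences.append(f"{path}: only in test (value {test[key]!r})")
        elif isinstance(test[key], dict) and isinstance(prod[key], dict):
            # Recurse into nested sections, extending the dotted path.
            differences.extend(config_drift(test[key], prod[key], prefix=f"{path}."))
        elif test[key] != prod[key]:
            differences.append(f"{path}: test={test[key]!r}, production={prod[key]!r}")
    return differences


# Hypothetical settings for illustration only.
test_env = {
    "app_instances": 1,
    "load_balancer": {"enabled": False},
    "log_level": "DEBUG",
}
prod_env = {
    "app_instances": 4,
    "load_balancer": {"enabled": True, "sticky_sessions": True},
    "log_level": "WARNING",
}

for line in config_drift(test_env, prod_env):
    print(line)
```

Not every difference is a problem (a test environment may reasonably log at a more verbose level), so a report like this is a prompt for a conversation about risk rather than a pass/fail gate.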
It’s important that you don’t come across as making excuses or passing blame on to others. Instead, each point can be used to drive improvements. For ideas on further improvements without waiting for production issues to occur, read Accelerate: The Science of Lean Software and DevOps by Forsgren, Humble and Kim. This book contains excellent ideas for improvements to your team’s processes, backed by data demonstrating links to increases in business performance.
Exploratory testing can help to find issues before they reach production, particularly in edge-case scenarios. It is a skill that needs to be learned and practised. There are cheat sheets, heuristics and TestSphere cards to help you find important bugs fast.
Finally, and perhaps most controversially, how much does it matter if some bugs make it through to production? Consider the potential risks of failure; the effectiveness of monitoring and alerting tools used in production; the time required to fix production issues; whether a phased rollout is possible… All of these factors should affect the overall development, testing and deployment approach.
Asking how the testers missed a bug is simply not good enough. Expect more from your managers and your agile team members. In a healthy work environment, the team may ask instead, “How can we all help to prevent this type of issue from occurring in the future?”