Production tests: a guidebook for better systems and more sleep
Your customers expect your site to be fully working whenever they need it. This means you need to aim for near-perfect uptime not just for the site itself, but for all features customers may use.
Modern software engineering uses quality control measures such as automated test suites and observability tools (tracing, metrics, and logs) to ensure availability. While I have always liked production tests, I gained a real appreciation for them at Atlassian, where they are used extensively and are called “”. I have seen firsthand how they can give early warning of problems, which can then be fixed before they become incidents. Making a production system testable may take some work and may require special switches, such as user fields or feature flags, so that the behavior can differ slightly to facilitate the test.
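To make the idea of a special switch concrete, here is a minimal sketch in Python of a user field that lets a production test exercise a real code path without side effects. Every name in it (`User`, `is_synthetic`, `OrderService`, `production_test`) is hypothetical and chosen only for illustration; it is not taken from Atlassian or any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    # Hypothetical user field marking accounts that exist only for production tests.
    is_synthetic: bool = False


@dataclass
class OrderService:
    """Toy stand-in for a production service (hypothetical)."""
    charges: list = field(default_factory=list)

    def place_order(self, user: User, amount_cents: int) -> dict:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        # The switch: synthetic users follow the normal path but are never charged.
        charged = not user.is_synthetic
        if charged:
            self.charges.append((user.name, amount_cents))
        return {"status": "confirmed", "charged": charged}


def production_test(service: OrderService) -> bool:
    """Place an order as a synthetic user and report whether the flow works."""
    probe = User("synthetic-probe", is_synthetic=True)
    try:
        result = service.place_order(probe, 100)
    except Exception:
        # Any failure counts as a failed check, which would raise an early warning.
        return False
    return result["status"] == "confirmed" and not result["charged"]
```

In a real deployment a check like this would presumably run on a schedule against the live system and alert when it returns `False`; the scheduling and alerting are left out of the sketch.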