Edit

Share via


Architecture strategies for testing

Applies to this Azure Well-Architected Framework Operational Excellence checklist recommendation:

OE:09 Enhance the quality of your workload by adopting testing practices that align with business objectives and uphold quality standards.

When you introduce a change to your workload, you need to make sure it works as intended and doesn't introduce new problems. Testing is how you assess those changes. It's an essential practice for maintaining quality and building confidence in your workload.

Effective testing delivers a reliable, high-quality workload. It prevents defects from reaching production, reduces costly rework and delays, and keeps your work aligned with business objectives throughout the development lifecycle.

This article provides strategies to help you deliver a high-quality workload through effective testing practices. It serves as foundational guidance for the specialized testing guidance in other pillars like performance, reliability, and security.

Formalize your test strategy and plan

Your test strategy and test plans are essential artifacts. They provide a clear roadmap for your testing efforts and ensure everyone's working toward the same quality goals.

Define your test strategy

Your test strategy is a high-level blueprint that guides your overall testing approach. It defines your testing objectives, scope, methodologies, tools, roles, and responsibilities. It helps you make informed decisions about testing priorities, resource allocation, and risk management. It also captures how you communicate with stakeholders and report results.

A well-defined test strategy gets buy-in from stakeholders and ensures all team members follow consistent quality standards. It provides direction and structure, aligns testing with business objectives, and protects long-term quality.

Your test strategy generally stays consistent across releases for a given workload. But always customize it to reflect the specific needs and business objectives of that workload.

Note

Don't apply the same standardized strategy to different workloads. They have unique considerations that need unique approaches.

Create your plan

Once you agree on the test strategy with your team, formalize it in a test plan with use cases aligned to business objectives.

Your test plan is a detailed document that guides testing execution for a specific release. It outlines the scope of testing, specific test cases, defect reports, timelines, resource assignments, and entry and exit criteria for testing activities.

A well-structured test plan makes testing efficient and aligned with the release's goals and timelines. It's your reference point for tracking progress and making informed decisions throughout testing.

Example: For an e‑commerce checkout system, the test strategy establishes a consistent approach across all releases. It defines that payment flows are always prioritized, specifies testing tools for the system such as Selenium for UI testing, JMeter for load testing, and OWASP ZAP for security testing, and clarifies team responsibilities for different test types. The test plan for the v2.5 release focuses on adding Apple Pay support. It defines exactly what to test for Apple Pay in this release, allocates resources (two engineers and three iOS devices), sets a four‑week schedule, and establishes clear entry criteria (code complete and environment configured) and exit criteria (all tests pass, zero critical bugs, and a two‑second SLA).

Note

Don't start testing before you clearly define your overall test strategy and plan. A solid test plan keeps your efforts focused and aligned with workload goals.

Test early, test often, test what matters

Start testing at the earliest stages of your software development lifecycle. When you find critical problems late, you face increased rework and slower releases. Too often, architects overlook testing requirements at the design stage, which negatively impacts the overall quality.

Developers should embrace a quality assurance mindset. Think about testing even during design. Use it to clarify requirements and spot potential limitations. Make testing an integral part of your development process. Take ownership of writing and maintaining tests alongside your code.

When you detect issues early, you can respond quickly. For example, you might prioritize essential design changes that impact user experience over routine bug fixes. Acting early reduces last-minute surprises and delays.

Testing isn't a one-time event. Keep testing after launch as part of a continuous testing mindset. Regularly review, update, and expand your test suite to cover new features and address bugs discovered in production to maintain long-term quality.

Note

Don't defer testing in large batches until late in the delivery cycle. Delaying tests leads to missed problems, increased rework, and slower releases.

Tradeoff. Early and continuous testing can increase operational costs and might initially slow development or create team friction. Strike a balance, identify which tests and changes are essential early, and which you can defer to later phases. This ensures efficiency without compromising quality.

Test in production with safeguards

Even with strong testing and validation practices, some problems only appear under real-world production traffic. To find issues you can't simulate, perform controlled testing in production with safeguards that limit user exposure and reduce risk.

Plan, isolate, and monitor production testing. This way of doing things, lets you gather real user feedback and performance data while minimizing disruptions to the broader user base.

Consider progressive exposure strategies, such as canary releases, only after your workload meets exit criteria and demonstrates strong quality. This approach releases updates to a small, targeted group of users first, helping you quickly uncover problems that might not appear in pre-release environments. With this approach, you accelerate your testing process and potentially reduce associated costs.

Risk: Be cautious when you test in production, as it directly affects real customers. Always implement safeguards and limit exposure to minimize potential negative impacts on your business.

Apply layered approach in test coverage

A layered test coverage strategy gives you rapid feedback, earlier defect detection, and faster releases. When you structure tests in layers, you can quickly isolate and debug defects, making it easier to pinpoint and address issues.

Use the test pyramid as a guide

The test pyramid model illustrates this layered approach for test automation. It distributes tests across different layers to maximize coverage while minimizing execution time and maintenance costs.

  • Base layer: Unit tests verify individual components in isolation. They run quickly and provide immediate feedback.
  • Middle layer: Integration tests verify interactions between components and services. They run more slowly than unit tests but provide broader coverage of system behavior.
  • Top layer: End-to-end tests validate user journeys through the entire system, simulating real-world scenarios. These tests run the slowest but give the highest confidence in overall quality.

You can more easily integrate base and middle layer tests into your pipelines because they have minimal dependencies. This integration enables swift feedback when tests fail and lets the build process stop immediately, preventing faulty code from progressing further.

As your test coverage grows, pipeline execution time can increase significantly. Maintain a fast feedback loop by using parallel and distributed test execution strategies, keeping pipelines efficient even as coverage increases.

Note

Don't include every possible test in the initial build pipeline that compiles and validates your code. This choice slows release cycles and risks important tests being bypassed. Focus on tests that directly protect critical workflows and provide meaningful confidence in system quality.

Tradeoff. There's a tradeoff between test coverage and pipeline efficiency. While larger test suites increase coverage, they also increase execution time and cost. They don't always deliver a meaningful return on investment.

Separate application and infrastructure testing

Create clear segmentation between testing your application code and your infrastructure code. Validate your infrastructure by observing application behavior through deploying the software and running tests against it.

Running application smoke tests can reveal infrastructure problems like network faults, DNS misconfigurations, or resource constraints before they impact production. For example, a smoke test that validates API health endpoints can quickly detect infrastructure provisioning problems or network policy problems.

With this approach, application specific tests proactively identify and resolve infrastructure problems, reducing the need for separate infrastructure validation. It gives you confidence that both your code and underlying infrastructure work together correctly.

Incorporate different types of testing

Use various testing methods throughout your workload. Completing unit tests doesn't mean you're done testing. Every aspect of your workload needs a distinct approach. Multiple test types enhance overall quality and build confidence that the system functions as intended.

Choose the right test type based on your workload maturity and risk profile. Start with functional validation through the test pyramid layers, then add non-functional testing like performance, security, and resilience. Align your test type selection with your workload's critical scenarios and risks.

The following table shows when to apply different test types throughout your testing cycle. Each addresses specific risks. While this table isn't an exhaustive list of all possible test types, it serves as an illustrative example.

Testing Type Primary Purpose When to Use Cost & Considerations
Manual Testing Validate scenarios requiring human judgment, exploratory learning, usability, and UX nuances. Early development, UI changes, ambiguous flows, or when automation isn’t feasible. High cost, low scalability. Use sparingly and focus on areas where human insight adds irreplaceable value.
Unit Testing Verify individual component or function logic in isolation. Continuously during development. Lowest cost and highest value. Fast, reliable, and critical for preventing regressions. Aim for broad coverage.
Integration Testing Validate interactions between components, APIs, contracts, and shared services. When components and services are ready to interact or when integrating new dependencies. Medium cost. Essential for catching misconfigurations and interaction defects early.
End-to-End (E2E) Testing Confirm full workflow correctness across the entire system from user action to backend services. When core user journeys have stabilized and can be automated. High cost and fragile. Use selectively for the most business‑critical flows.
UI Testing Detect visual, layout, and interaction regressions. After UI design stabilizes or when visual fidelity is a release requirement. High maintenance cost. Limit to critical UI paths and accessibility‑critical scenarios.
Load & Performance Testing Validate performance, latency, throughput, and scalability under expected workload. Begin as early as possible and repeat as architecture evolves. High cost but necessary for production readiness, especially for customer-facing workloads.
Stress Testing Determine system limits, breaking points, and recovery behaviors. Before production readiness or major architectural changes. High cost, provides resilience insights. Run selectively due to environment impact.
Security Testing Identify vulnerabilities, misconfigurations, and attack vectors. Apply throughout the development lifecycle. Medium–high cost but extremely high value. Critical for protecting data, meeting compliance, and reducing business risk.

Evaluate each feature or change based on business impact and risk. Prioritize test types based on this assessment. For customer-facing workloads, emphasize end-to-end and UI testing. For API-driven workloads, focus on integration and contract testing. For high-availability systems, invest in resilience and chaos testing.

Tradeoff. Prioritize security tests at the beginning of the release process. This approach helps prevent vulnerabilities and ensures safer deployments. However, this priority can slow the pace at which you deliver new features to production.

Treat your test assets as important as code assets

Test assets capture essential business rules, edge cases, historical defect patterns, and valuable organizational knowledge. When test quality degrades, teams waste time debugging unreliable tests instead of finding real defects. This situation creates frustration and developers lose trust in the test framework.

Treat test assets with the same rigor as code assets. Taking full responsibility for your test assets enhances both reliability and overall quality of your testing framework.

Structure and secure your tests

Structure test code with the same architectural principles as your application code. When possible, keep your tests alongside your code in the same repository to streamline maintenance and promote consistency.

If your automation suite resides in a separate repository, implement equivalent governance controls such as mandatory code reviews, pull request policies, and build validation pipelines to maintain quality standards.

Tests often interact with production data and systems, which can introduce risks from imported libraries or vulnerable test code. Implement secure coding practices in your test code to prevent vulnerabilities. Treat tests with the same security standards as production code.

Version your test data alongside your code. When you change data schemas or business rules, update test data to match the current workload state.

Do baseline validation of your own tests to make sure they work as expected. Any failures should point to actual application issues, not test defects. Verify that tests fail appropriately and pass consistently when your workload is healthy. Address unreliable tests promptly and ensure test assertions reinforce each test's intent.

Set up practices that ensure test independence and reliability, such as isolate test data, avoid shared state, and implement proper setup and teardown processes. Implement automated cleanup of test data. To benefit from parallel test execution, design your tests to be independent so they can run in any order without affecting outcomes. Independent tests should always set up and tear down their own data and dependencies so states aren't carried over to the next test run.

If your application requires sequencing in tests, use test frameworks that support ordered test execution.

Maintain and evolve your tests

Maintaining your test assets is crucial for preserving workload quality. These assets often contain valuable organizational knowledge. If you don't regularly maintain them, they quickly become obsolete, undermining their effectiveness and your ability to deliver high-quality releases.

As your workload evolves, your test assets must evolve in parallel to stay aligned with business objectives. Use the same architectural principles for your test design as you do for production code.

Your regression test suite should contain your most valuable and stable tests. Start small and grow your regression suite deliberately. Add tests when production incidents occur, when you fix critical bugs, and when you introduce high-risk changes. Automate your regression suite so it runs consistently without human intervention. Create fast smoke tests that run on every commit, and broader regression tests that run nightly or before release.

Test cases capture the intent, scope, and expected outcomes. Keeping both test automation scripts and test cases in sync is essential. If automation scripts diverge from their corresponding test cases, it's hard to track what's being validated, leading to gaps in coverage and accountability.

Proactively plan regular updates to your test cases as you introduce new features and enhancements to your workload. When you automate a test case, link the automation to the original test case so you can trace coverage.

Manage test technical debt

Schedule regular test maintenance sprints to address accumulating debt before it becomes overwhelming.

Flaky tests, duplicate coverage, obsolete tests, and poor test design all contribute to test debt. When you identify unreliable tests, prioritize fixing or removing them to maintain the integrity of your test suite. A smaller set of reliable tests is more valuable than a large set of flaky tests.

Strategic decisions about skipping or deferring tests are as important as what you test. Consider skipping tests for trivial or low-risk code such as simple getters and setters without business logic, extremely rare edge cases, third-party libraries, and legacy code scheduled for removal. Document these decisions so your team can revisit them as context changes.

Evaluate your test suite and separate reliable, consistent tests from those prone to external changes. Tests that frequently break due to factors outside your control such as UI tests impacted by frequent UI updates, may not be good candidates for automation. This separation helps you decide which tests to automate and which to keep manual.

Tradeoff. When you remove fragile tests, you reduce your automation coverage. Balance this reduction by focusing automation on stable interfaces and critical workflows while accepting manual testing for frequently changing UI elements.

Not all tests deserve ongoing maintenance. Retire tests for features you removed, tests that duplicate coverage, and tests that no longer provide value. Document why you're removing tests so the decision is clear to your team.

Extend observability to your test framework

Observability in testing accomplishes two key objectives: ensuring the test framework operates reliably and offering ongoing visibility into the quality and health of your workload. When you integrate observability practices into your test framework, you strengthen diagnostic capabilities, facilitate real-time monitoring of stability, and drive continuous improvement of testing processes.

Without this visibility, you face considerable difficulties diagnosing problems, limited ability to monitor systems in real time, and you don't gain clear, actionable insights into your automation coverage effectiveness.

Test suites deteriorate over time, tests become unreliable, lose relevance, or fail to keep up with workload changes. By incorporating observability, you can effectively monitor your test suite's health, rapidly pinpoint and address failures, and make informed decisions about maintenance priorities and improvements. Generating coverage reports makes it easier to identify gaps in automation. When test coverage aligns with observability data from production incidents, teams gain insight into which scenarios lack adequate validation.

Use industry-standard test coverage and reporting tools within your framework to:

  • Provide clear visibility into tested code paths
  • Identify consistently failing tests
  • Analyze long-term trends in test reliability
  • Trace the origins of failures for targeted improvements

Risk: If logs are inconsistent and overly detailed, they might create more noise than value, making debugging harder. Implement structured logging with clear, actionable messages and different verbosity levels to ensure logs provide meaningful insights without overwhelming your team.

Simulate realistic conditions

Evaluate your workflows from an end-user perspective to make sure they truly address customer needs and expectations. Define clear acceptance criteria for your workloads, and design tests that accurately reflect real user flows and experiences, not just isolated system behaviors.

Scale coverage strategically

Scale your test coverage based on risk and value. Prioritize coverage for high-value user journeys and critical paths that directly impact customer experience. As workload complexity grows, expand your test coverage by evaluating scenarios that provide the highest confidence in workload quality and reliability.

Risk: Don't overinvest in a single user flow beyond the point of diminishing returns. Once you've achieved sufficient coverage for critical paths, shift focus to other important areas. Strive for balanced coverage rather than perfection in one flow.

Align with business objectives and SLOs

Align your tests with both business objectives and Service Level Objectives (SLOs). Set measurable quality thresholds that reflect business commitments and user expectations. Agree on these thresholds, as they provide a reference point to detect deviations and troubleshoot errors. This approach protects user experience by ensuring key service quality thresholds aren't compromised. Regularly review and update your baseline metrics to make sure they continue to meet current customer needs and expectations.

Use representative test data

Test data should represent real-world scenarios as closely as possible. Synthetic data can simulate authentic user scenarios while avoiding the complexity of production data handling. For example, synthetic testing can replicate real-world scenarios by generating representative data sets to evaluate how your workload performs under planned scaling conditions.

If you need to use production data for testing, make sure you properly anonymize all information to protect sensitive information.

Use synthetic data as your default choice. Reserve production data for specific scenarios where synthetic data can't replicate the necessary complexity, such as testing data migration scripts.

Simulate your production environment

The production environment is your source of truth for understanding how your workload behaves in real-world conditions. Create an environment that closely mirrors real-world conditions so you can trust that the system performs as expected in production.

Tailor your approach to mirroring production environments to your workload's specific needs. For mission-critical workloads that require high availability, test in a dedicated environment that closely resembles production. For these workloads, carefully balance cost optimization with the need for robust validation. Use a dedicated, production-like environment for performance and load testing to make sure service behavior is accurately assessed under realistic conditions.

For other workloads, make your environment closely mirror the production infrastructure to reduce false positives, cases where tests succeed in lower environments but fail in production. Aim for consistency across all environments as your code advances through the pipeline. Simulate production conditions throughout various aspects of your workload, including infrastructure, data, and security, to ensure reliable test results.

When you mirror your environments to production, configuration drift can create false confidence in quality. Prevent this by implementing automated validation checks for configuration drift to make sure your environment stays aligned with production. Where appropriate, set up deployment gates to verify the correct version is deployed before testing begins.

Risk: When you mirror production environments, it can significantly increase operational costs. Evaluate whether ephemeral environments or persistent test environments offer the optimal balance between cost efficiency and quality for your workload.

Build purpose-driven test environments

Design environments with a clear focus on their intended purpose. Evaluate the distinct requirements of each phase in your testing lifecycle and make sure the environment aligns closely with the objectives of that phase.

Intentionally design every test environment to match the specific stage and goals of testing, whether for functional validation, integration testing, or other purposes. If a streamlined environment effectively serves your test needs, prioritize that approach to maximize efficiency.

Use mock services

Full replication of production systems for every test scenario is often impractical. Evaluate which components of your workload you can safely replicate for testing without compromising critical business workflows. When full replication isn't feasible, use mock services that accurately simulate production service behaviors to validate scenarios effectively without risking live operations.

Purpose-driven test environments provide an ideal foundation for deploying mock services on ephemeral environments. Ephemeral environments offer a cost-effective way to simulate production conditions for testing. You can validate interactions and behaviors without the overhead of maintaining full production-like environments for every test scenario. These on-demand environments are created for specific test purposes and destroyed after use, reducing infrastructure costs while maintaining test quality.

Building ephemeral environments requires your workload to reach a higher maturity level where automation with Infrastructure as Code (IaC) and deployment pipelines is well established.

Azure facilitation

Azure Test Plans is a browser-based test management solution that provides all the capabilities required for planned manual testing, user acceptance testing, exploratory testing, and gathering feedback from stakeholders. It includes Test Analytics to track your test quality over time and identify areas for improvement.

Azure Pipelines makes it possible for you to integrate testing into your CI/CD pipeline. You can also use GitHub Actions integrated with Azure.

Azure App Testing is a service that supports functional and performance tests. It allows you to run functional tests with Playwright Workspaces and performance tests using Azure Load Testing.

Azure Deployment Environments can help spin up app infrastructure with project-based templates that establish consistency and best practices while maximizing security.

Azure also provides platform‑native tools that support reliability, performance, and security testing.

Operational Excellence checklist

Refer to the complete set of recommendations.