About the Client
The client is one of India’s largest e-commerce platforms, with almost 40% market share and over 100 million registered users. It offers a wide range of retail and payment services, delivering seamless experiences to both sellers and customers on the portal.
The client’s annual calendar features a week-long mega sale, with offers and discounts on products across all categories. So that customers can get the most out of the sale days and every purchase is delivered without a hitch, the e-commerce portal must be available 24×7, with all systems working exactly as expected.
The goal for Qapitol QA was to perform scaled performance testing, to ensure that the client’s systems could handle the huge volume of queries during the sale days. The focus was on:
- Identifying the maximum load the system can handle at any given time
- Implementing solutions to help the client’s systems handle increasing queries on the fly
- Checking query response times and working to bring them down to a minimum
The Qapitol QA team crafted a performance testing solution that focused on understanding the client’s current systems, testing across a set of realistic scenarios to identify system vulnerabilities, and creating the necessary solutions to ensure robust performance.
Phase 1 – Planning & Preparation
The Qapitol QA team invested time in understanding the client’s current system capabilities and creating a roadmap for performance testing and improvement.
Key tasks completed during this phase were:
- Gathering requirements
- Having each of the teams (orders, payments, logistics, etc.) benchmark their system performance
- Identifying which systems are most likely to be affected and overwhelmed by an increase in traffic
- Identifying the kind of test data required and how to create or access it
The core testing solution combined the Locust load testing tool with Qapitol QA’s proprietary framework to test system performance.
- Locust was used to execute automated scripts that simulate a given number of hits to the client’s e-commerce platform per second
- The scripts also simulated a realistic breakdown of these hits across different systems, for example 5 lakh (500,000) users browsing, 2 lakh adding products to the cart and 1 lakh making payments
- After the simulated orders were placed, the Qapitol QA proprietary framework monitored downstream systems (warehouse, fulfillment, logistics, vendor management, etc.) to understand performance and existing resource utilization
- When server utilization on any system exceeds pre-set limits, the framework is designed to spin up new systems to handle the extra load and direct users across these systems via load balancers
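The traffic breakdown described above can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual script: the action names are made-up labels, and only the 5 : 2 : 1 ratio comes from the example in the text.

```python
import random
from collections import Counter

# Traffic mix from the example above, in lakh users (1 lakh = 100,000).
# The action names are illustrative labels, not the client's real endpoints.
TRAFFIC_MIX_LAKH = {"browse": 5, "add_to_cart": 2, "pay": 1}


def action_shares(mix):
    """Convert absolute user counts into fractions that sum to 1."""
    total = sum(mix.values())
    return {action: count / total for action, count in mix.items()}


def sample_actions(mix, n, seed=None):
    """Draw n simulated user actions in proportion to the mix."""
    rng = random.Random(seed)
    actions = list(mix)
    weights = [mix[a] for a in actions]
    return rng.choices(actions, weights=weights, k=n)


shares = action_shares(TRAFFIC_MIX_LAKH)
sampled = Counter(sample_actions(TRAFFIC_MIX_LAKH, 8000, seed=42))
```

In a Locust script the same ratio would typically be expressed as task weights, for instance `@task(5)`, `@task(2)` and `@task(1)` on three methods of a user class.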
Three business-critical scenarios were incorporated into the test cases to make them as close to the platform’s real usage as possible:
- Reflecting the user split between product categories accurately in the test scripts, since the client had different post-order workflows for different categories
- Simulating the processing of a single order containing products from multiple categories, to test the different workflows activated by a single order
- Simulating changes to the carts after order placement
Together, these scenarios gave the client better test coverage, allowing them to identify a wider range of possible challenges and prepare for them.
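The multi-category scenario can be pictured as one order fanning out into several post-order workflows. The sketch below assumes a hypothetical category-to-workflow mapping; the category names, workflow names and SKUs are invented for illustration and are not taken from the client's systems.

```python
from collections import defaultdict

# Hypothetical mapping from product category to post-order workflow.
WORKFLOW_BY_CATEGORY = {
    "electronics": "serialized_inventory",
    "grocery": "same_day_dispatch",
    "furniture": "scheduled_delivery",
}


def split_order_by_workflow(items):
    """Group the items of one order by the workflow their category triggers."""
    groups = defaultdict(list)
    for item in items:
        workflow = WORKFLOW_BY_CATEGORY[item["category"]]
        groups[workflow].append(item["sku"])
    return dict(groups)


# One simulated order with products from two categories.
order = [
    {"sku": "PHONE-1", "category": "electronics"},
    {"sku": "RICE-5KG", "category": "grocery"},
    {"sku": "CHARGER-2", "category": "electronics"},
]
workflows = split_order_by_workflow(order)
```

A test script built this way can assert that every workflow expected for a mixed cart was actually triggered downstream.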
Phase 2 – Execution
With the plan in place, the execution stage involved:
- Writing the required test scripts for Locust and running them through the Qapitol QA proprietary framework
- Conducting a wide range of tests:
  - Load tests: Starting at 200 queries per second and steadily increasing to 350-400 queries per second over the course of an hour, to identify system breakpoints
  - Spike tests: Monitoring the rise in CPU usage during a sudden spike of users on the platform, with 12 lakh users at any given time
  - Volume tests: Monitoring system response to a large number of users on the platform over a period of time, with 10 lakh users using the site for over 30 minutes
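The load-test ramp can be written as a simple function of elapsed time. The sketch below assumes a linear ramp and picks 400 queries per second as the end point. The text only gives a 350-400 range and says the increase was steady, so both choices are assumptions.

```python
def target_qps(elapsed_s, start_qps=200, end_qps=400, ramp_s=3600):
    """Linear ramp from start_qps to end_qps over ramp_s seconds, then hold."""
    if elapsed_s <= 0:
        return float(start_qps)
    if elapsed_s >= ramp_s:
        return float(end_qps)
    return start_qps + (end_qps - start_qps) * (elapsed_s / ramp_s)


# Target rate at 0, 15, 30, 45 and 60 minutes into the test.
schedule = [round(target_qps(m * 60)) for m in (0, 15, 30, 45, 60)]
```

Locust schedules simulated users rather than queries per second, so in practice a function like this would feed a custom `LoadTestShape` whose `tick()` method returns a user count and spawn rate, calibrated against the observed request rate.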
A key imperative at this point was to conduct all tests in the production environment, so that they verified realistic operations and site usage. All queries generated by the test scripts therefore carried the flag “perf-test = true” to distinguish them from actual user queries, and were directed to a test database.
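One way to picture the flagging is shown below. Only the flag name and its value come from the text. Carrying the flag as a request header, and the database names `test_db` and `live_db`, are assumptions made for this sketch.

```python
# Flag name taken from the text; sending it as a header is an assumption.
PERF_FLAG = "perf-test"


def tag_as_perf_test(headers):
    """Return a copy of the request headers with the perf-test flag set."""
    tagged = dict(headers)
    tagged[PERF_FLAG] = "true"
    return tagged


def choose_database(headers):
    """Route flagged requests to the test database, all others to the live one."""
    is_test = headers.get(PERF_FLAG, "").lower() == "true"
    return "test_db" if is_test else "live_db"  # placeholder database names


synthetic = tag_as_perf_test({"User-Agent": "locust"})
```

Keeping the routing decision in one place makes it easier to audit that no synthetic order can reach real customer data.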
Phase 3 – Results & Analytics
As the test scripts started to deliver results, the Qapitol QA team focused on:
- Analyzing how the systems behaved as the load increased in different scenarios
- Tracking the spawning of new servers on the fly by the proprietary framework once server utilization for any system crossed 90%
- Monitoring whether the response times for all systems were within acceptable criteria
- Analyzing and fixing code to improve response times where necessary
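The scale-out trigger and the response-time check can be illustrated as follows. The proprietary framework's logic is not described in the text, so this is only a sketch under two assumptions: load spreads evenly across servers behind the load balancer, and the response-time limit is supplied by the caller. The system names and timings in the example are invented.

```python
import math


def instances_needed(current_instances, utilization, threshold=0.90):
    """Smallest instance count that brings average utilization to the threshold or below.

    Assumes load is spread evenly across instances.
    """
    if utilization <= threshold:
        return current_instances
    total_load = current_instances * utilization
    # round() guards against floating-point noise just above a whole number.
    return math.ceil(round(total_load / threshold, 9))


def systems_over_limit(response_times_s, limit_s):
    """Names of systems whose measured response time exceeds the limit."""
    return sorted(name for name, t in response_times_s.items() if t > limit_s)


# Four servers at 95% utilization need a fifth to drop below 90%.
scaled = instances_needed(4, 0.95)
slow = systems_over_limit({"orders": 2.1, "payments": 5.6, "logistics": 3.0}, limit_s=5.0)
```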
With Qapitol QA’s testing solution in place, the client was able to ensure largely uninterrupted performance across all systems throughout the high-traffic sale days.
- The e-commerce platform could handle 1 lakh users at any given time
- Response times were maintained between 2 and 5 seconds for all systems
- A set of highly representative and realistic test scenarios was developed, which can be scaled to meet similar testing needs in the future