Scaling Quality with ACV’s Rapid Growth
When I started at ACV in May of 2019, the Product & Tech org was around 40 people. We had one quality engineer, very few automated tests, and released code fewer than 10 times per week. The company was beginning to scale quickly. We were setting records for cars sold week over week and were hiring at a pace I had never experienced before. This created a few problems from a quality perspective.
Even after hiring more quality engineers, our staffing ratio was still around 15 developers for every quality engineer. At this point, our change failure rate was around 50%, meaning half of our deployments had to be rolled back due to a problem. We knew quality was becoming a bottleneck, as we were heavily dependent on manual testing. It was time to pull the andon cord.
Quality Engineering decided it was time to implement a set of 3 core values for quality at ACV.
3 Core Quality Values
The first core value is that quality is everyone’s responsibility. With this in mind, we decided that we would not require a QA sign-off for changes. Following Agile principles, decisions such as “are we ready to move to production” should be made at the team level rather than left to a single individual. Quality Engineers were tasked with ensuring that their teams understood the current state of quality for a given feature or product, helping to build the team’s confidence in their decision.
The second value is that automation is a priority. We spent the next month building out a robust end-to-end test suite for our customer-facing applications. This allowed our developers to run tests without the manual intervention of a quality engineer. We invested in stabilizing our test environments and created a production-like staging environment. We put quality gates in place for code moving to production, including a smoke test stage that all changes had to pass before deploying.
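To make the idea of a smoke test gate concrete, here is a minimal sketch in Python. It is not ACV's actual pipeline code: `SmokeCheck`, `evaluate_gate`, and the check names are hypothetical, and the lambdas stand in for real requests against a staging environment. The only rule it illustrates is that a change may deploy only if every smoke check passes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SmokeCheck:
    """One smoke check: a name plus a callable that returns True on success."""
    name: str
    run: Callable[[], bool]


def evaluate_gate(checks: List[SmokeCheck]) -> Tuple[bool, List[str]]:
    """Run every check and return (may_deploy, names_of_failed_checks)."""
    failed = []
    for check in checks:
        try:
            ok = check.run()
        except Exception:
            # A check that crashes counts as a failure rather than being skipped.
            ok = False
        if not ok:
            failed.append(check.name)
    return (len(failed) == 0, failed)


def _unreachable() -> bool:
    # Simulates a check that cannot reach the service under test.
    raise RuntimeError("service unreachable")


# Hypothetical checks; real ones would call the deployed application.
passing = [
    SmokeCheck("home_page_loads", lambda: True),
    SmokeCheck("login_succeeds", lambda: True),
]
failing = passing + [SmokeCheck("api_health", _unreachable)]

if __name__ == "__main__":
    for label, checks in (("passing", passing), ("failing", failing)):
        may_deploy, failed = evaluate_gate(checks)
        print(label, "-> deploy" if may_deploy else f"-> blocked by {failed}")
```

Returning the list of failed checks, rather than stopping at the first failure, lets the pipeline report everything that is broken in one run.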
The third value is to ensure efficiency and productivity. We focused on removing bottlenecks from our deployment pipeline. This included creating tools to ease the burden of testing, which helped get everyone involved in our testing efforts. We created APIs to simplify test data creation. We incorporated the e2e and smoke tests into our automated deployment pipelines. We automated test reporting so developers could identify problems with their code as early as possible.
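As an illustration of what simplified test data creation can look like, here is a small sketch of a builder with overridable defaults. Everything in it is assumed for the example: `build_test_vehicle`, the field names, and the default values are hypothetical and do not describe ACV's real APIs or data model.

```python
import itertools
from typing import Any, Dict

# Hypothetical defaults; a real helper would mirror the application's schema.
DEFAULT_VEHICLE: Dict[str, Any] = {
    "year": 2015,
    "make": "TestMake",
    "model": "TestModel",
    "status": "draft",
}

_next_id = itertools.count(1)


def build_test_vehicle(**overrides: Any) -> Dict[str, Any]:
    """Return a valid test record, changing only the fields the test cares about."""
    unknown = set(overrides) - set(DEFAULT_VEHICLE)
    if unknown:
        # Fail loudly on typos instead of silently creating unexpected fields.
        raise KeyError(f"unknown field(s): {sorted(unknown)}")
    return {"id": next(_next_id), **DEFAULT_VEHICLE, **overrides}


if __name__ == "__main__":
    print(build_test_vehicle(status="listed"))
```

A test then states only what matters to it, for example `build_test_vehicle(status="listed")`, and inherits sensible values for everything else.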
How It Went
As e2e coverage increased, our change failure rate decreased. Developers were coached to start deploying smaller, quicker changes to our systems. Our release frequency continued to climb. Developers continued to contribute to building quality into our systems by ensuring unit and integration test coverage for all changes. Having our developers committed to building the foundation of our test pyramid plays a key role in the success of our quality today.
As of September 2021, we are releasing over 100 times a week with a change failure rate of <5%. Without automation, we would never have been able to 10x our release velocity while reducing our change failure rate by 90%. Within the last month, our team has run ~20,000 e2e tests, which is the equivalent of manually testing 24 hours a day for 127 days.
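For readers who want to check the arithmetic, the short script below recomputes these figures from the numbers quoted in this post. The release counts and the current failure rate are bounds ("fewer than 10", "over 100", "<5%"), so the results are approximate, and the minutes-per-test value is derived here rather than reported.

```python
# Figures quoted in the post; bounds are treated as exact for the calculation.
releases_before = 10   # fewer than 10 releases per week in 2019
releases_after = 100   # over 100 releases per week in September 2021
cfr_before = 0.50      # change failure rate around 50%
cfr_after = 0.05       # change failure rate under 5%

velocity_multiple = releases_after / releases_before
cfr_reduction = (cfr_before - cfr_after) / cfr_before

tests_run = 20_000     # roughly 20,000 e2e tests in one month
manual_days = 127      # stated manual equivalent, testing around the clock
manual_minutes = manual_days * 24 * 60
minutes_per_test = manual_minutes / tests_run

if __name__ == "__main__":
    print(f"release velocity: {velocity_multiple:.0f}x")
    print(f"change failure rate reduction: {cfr_reduction:.0%}")
    print(f"implied manual effort: {minutes_per_test:.1f} minutes per test")
```

The manual-testing comparison therefore implies a little over nine minutes of manual effort for each automated e2e test.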
(Above is our device rack, where we can execute up to 25 parallel mobile e2e tests on demand!)
As ACV continues to grow, new problems arise for the quality team to tackle. Scaling our systems to grow with new business lines, features, and exponential growth of our development team has us continually re-evaluating our systems and processes. While it can be stressful and uncertain at times, these are fun problems to have and what I look forward to solving each and every day.
- This is the first in a series of posts about Quality @ ACV. Up next is insight on how our quality engineers influence quality on their product teams!