The Rise of DevOps

DevOps continues to be one of the fastest-growing trends in the software industry. According to new statistics posted by Spacelift, the DevOps industry is expected to grow from an estimated $10.4 billion in 2023 to $25.5 billion in 2028. This is no surprise when 99% of organizations that implemented DevOps report positive results and 61% report that DevOps specifically improved their quality.

This article outlines what DevOps is and how it came to be. We will also look at the DevOps Research and Assessment (DORA) metrics and at the challenges mobile development teams face when adopting DevOps. Finally, we will look at the future of DevOps and what companies can do to prepare for it.

What is DevOps?

The challenge newcomers face when adopting DevOps, a blend of “development” and “operations”, is that DevOps has no formal definition. In 2015, a paper by Andrej Dyck, Ralf Penners, and Horst Lichter titled “Towards Definitions for Release Engineering and DevOps” stated, “To our knowledge, there is no uniform definition for the terms release engineering and DevOps. As a consequence, many people use their own definitions or rely on others, which results in confusion about those terms.” This sentiment persisted in 2017, when F.M.A. Erich, C. Amrit, and M. Daneva wrote in their own paper, “A qualitative study of DevOps usage in practice”, “We discovered that there exists little agreement about the characteristics of DevOps in the academic literature.”

The advice given to newcomers is to think of DevOps as a set of practices and tools, with a little bit of cultural philosophy thrown in. The goal of DevOps is to improve collaboration and integration between software development (the people who write the code) and IT operations (the people who deploy and maintain the code). While the exact definition of DevOps varies, there are core values:

Shared Ownership: DevOps encourages collaboration between developers, operations teams, and stakeholders. DevOps fosters a culture that embraces agility and accountability and focuses on continuous improvement.

Automation: DevOps relies heavily on automation to streamline repetitive tasks, such as code integration, testing, and deployment. Automating these tasks paves the way for Continuous Integration and Continuous Deployment. Continuous Integration is the practice of regularly integrating code changes into a shared repository with automated builds and tests. Continuous Deployment is the practice of automating the deployment process, allowing frequent deliveries of code to production.

Rapid Feedback: DevOps also encourages continuous monitoring of applications and infrastructure. Discovering problems quickly allows for faster resolution, improved performance, and enhanced user experiences.
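
To make the Automation value above a little more concrete, here is a minimal sketch of the idea behind a Continuous Integration and Continuous Deployment pipeline: stages run in a fixed order, and a failed stage blocks everything after it. The stage names and the run_pipeline function are hypothetical and chosen only for illustration. Real pipelines are normally defined in the configuration format of whichever CI tool a team uses.

```python
from typing import Callable, List, Tuple

# Each stage is a (name, step) pair; a step returns True on success.
# The stage names and steps below are hypothetical placeholders.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, step in stages:
        if not step():
            break  # a failed stage blocks every stage after it
        completed.append(name)
    return completed


# A failing test stage means the deploy stage never runs.
stages: List[Stage] = [
    ("build", lambda: True),
    ("test", lambda: False),
    ("deploy", lambda: True),
]
print(run_pipeline(stages))
```

The point of the sketch is the ordering guarantee: code reaches production only after every earlier automated check has passed.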

Flavors of DevOps

Further compounding the lack of a formal definition are the variants of DevOps. Here are the three biggest:

GitOps: GitOps emphasizes the use of Git repositories for storing declarative infrastructure and application code. This allows all changes to the production environment to be version-controlled, auditable, and collaborative. This approach works well for organizations that pursue infrastructure as code.

AIOps: AIOps is the use of artificial intelligence to optimize and automate IT operations. It uses machine learning models to analyze vast amounts of data from various systems and applications in real time and to suggest improvements to optimization, efficiency, and performance.

DevSecOps: DevSecOps is the integration of security practices into DevOps processes. DevSecOps aims to incorporate security measures earlier in the development process. Suggestions include adding a security review during the code review process and adding vulnerability testing to continuous deployment pipelines. The goal is to make security an integral part of the development process, not an afterthought.
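
As a rough illustration of the GitOps idea described above, the sketch below compares a desired state (what a Git repository declares) with an actual state (what is running) and lists the changes needed to bring them in line. The service names, version strings, and the plan_changes function are hypothetical. Real GitOps tooling works on full infrastructure manifests rather than simple name-to-version maps.

```python
from typing import Dict, List


def plan_changes(desired: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """List the actions needed to make the actual state match the desired state.

    Both arguments map a service name to a version string (an assumed,
    simplified stand-in for declarative infrastructure definitions).
    """
    actions: List[str] = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(f"create {name} at {desired[name]}")
        elif actual[name] != desired[name]:
            actions.append(f"update {name} from {actual[name]} to {desired[name]}")
    for name in sorted(actual):
        if name not in desired:
            actions.append(f"delete {name}")
    return actions


# Hypothetical example: Git declares two services, production runs two others.
desired_state = {"web": "v2", "worker": "v1"}
actual_state = {"web": "v1", "cron": "v1"}
for action in plan_changes(desired_state, actual_state):
    print(action)
```

Because the desired state lives in version control, every planned change can be traced back to a commit, which is where the auditability comes from.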

Where Did DevOps Come From?

While the term DevOps is new, the concept is not. The earliest proposal for combining operations and software development together comes from a paper written in 1993 by M. Chapman and N. Gatti called, “A Model of a Service Life Cycle.”

The idea did not start gaining widespread adoption, however, until around 2007 to 2008. By then, Lean Manufacturing was well established, Agile had started to replace the traditional Waterfall methodology, and a new set of Continuous Integration tools was beginning to emerge. The industry was in a better position to start implementing the ideas proposed in Chapman and Gatti’s paper. All that DevOps needed was an advocate.

Patrick Debois

At this time, Patrick Debois started working with the Belgian government on a database migration. Debois was specifically responsible for the certification and readiness testing. He was frustrated by the lack of cohesion between application methods and infrastructure methods.

A year later, in 2009, Debois saw a presentation given at the O’Reilly Velocity conference in San Jose called “10+ Deploys Per Day: Dev and Ops Cooperation at Flickr”. The talk, by two Flickr employees, John Allspaw, Senior VP of Technical Operations, and Paul Hammond, Director of Engineering, argued that application development and operations activities should be seamless, transparent, and fully integrated. This presentation outlined the first set of modern DevOps practices. Debois, inspired by the presentation, went on that year to organize the first DevOps conference in Ghent, Belgium.

As the conferences gained popularity, the demand for more information about DevOps increased. Looking to address this new demand, Alanna Brown published the first “The State of DevOps” report in 2012. The next year, Gene Kim, Kevin Behr, and George Spafford published a book called “The Phoenix Project”. This fictional book introduced the concepts of DevOps in a relatable and entertaining way. The book is widely credited with bringing DevOps into the collective consciousness of the IT industry.

The final piece of modern DevOps practices came in the 2016 State of DevOps report, which introduced the DORA metrics. DORA is an acronym that stands for DevOps Research and Assessment. The DORA metrics look to capture two things: throughput and stability. Throughput is based on two measurements, deployment frequency and lead time for changes. Stability is also based on two measurements, mean time to recover and change failure rate.

In 2023, the State of DevOps report revised the DORA metrics, renaming the stability metric “mean time to recover” to “failed deployment recovery time”.

The DORA Metrics

The DORA metrics are straightforward and easy to adopt. As an example of how to use the DORA metrics, here is how ACV stacks up.

Deployment Frequency is how frequently code changes go to production. For companies that have sufficient automation, deploying to production should be on demand.

For 2024, ACV averages 2,270 releases a month, or roughly 567 releases a week when counting four weeks to a month.

Change Lead Time is how long it takes a code change to go from committed to deployed. The goal would be to get Change Lead Time to be less than 1 day.

For 2024, ACV is striving towards this goal and averages around 3.67 days.

Change Failure Rate is how frequently a software deployment introduces a failure that requires immediate intervention. The best companies in the world aim for a Change Failure Rate that is less than 5%.

Here is how ACV stacks up:

Failed Deployment Recovery Time is how long it takes to recover from a failed deployment. The goal is to recover in less than an hour.

At ACV, we average around 27 hours for 2024.

Overall, the DORA metrics show that ACV is strong on deployment frequency and change failure rate but needs improvement on change lead time and failed deployment recovery time.
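
For teams that want to try the four metrics described above on their own data, here is a minimal sketch of how they could be computed from a list of deployment records. The Deployment record layout, the dora_summary function, and the sample data are all hypothetical and are not ACV's actual tooling or figures. The sketch uses simple averages to match the way the numbers are quoted in this article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record layout)."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    recovered_at: Optional[datetime] = None  # set only once a failure is fixed


def dora_summary(deployments: List[Deployment], period_days: int) -> Dict[str, Optional[float]]:
    """Compute the four DORA metrics as simple averages over a period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    # Change lead time: commit to production, in days.
    lead_days = [(d.deployed_at - d.committed_at) / timedelta(days=1) for d in deployments]

    # Failed deployment recovery time: failed deploy to recovery, in hours.
    failures = [d for d in deployments if d.failed]
    recovery_hours = [
        (d.recovered_at - d.deployed_at) / timedelta(hours=1)
        for d in failures
        if d.recovered_at is not None
    ]

    return {
        "deployments_per_week": len(deployments) * 7 / period_days,
        "lead_time_days": mean(lead_days),
        "change_failure_rate": len(failures) / len(deployments),
        "recovery_hours": mean(recovery_hours) if recovery_hours else None,
    }


# Made-up sample: four deployments over two weeks, one of which failed.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(days=1)),
    Deployment(start, start + timedelta(days=3), failed=True,
               recovered_at=start + timedelta(days=3, hours=2)),
    Deployment(start, start + timedelta(days=2)),
    Deployment(start, start + timedelta(days=2)),
]
summary = dora_summary(sample, period_days=14)
for metric, value in summary.items():
    print(metric, value)
```

Each value can then be compared against the targets mentioned above: on-demand deployment, a lead time under one day, a failure rate under 5%, and recovery in under an hour.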

DORA Metric Pitfalls

While we use the DORA metrics here at ACV, we do recognize that there have been a variety of criticisms about the metrics over the years. For companies adopting these metrics, and based on our experience, here are common pitfalls to avoid.

Speed Over Risk: The DORA metrics prioritize speed without taking risk into account. How fast a team is willing to go depends to an extent on its risk tolerance. A recent survey produced by Junade Ali found that 71% of software engineers were concerned about software quality while only 33% listed delivering work quickly as a top priority. The DORA metrics do not consider the changing tolerance in the balance of risk and reward. On these metrics, teams that are more risk tolerant will outperform teams that are more cautious.

Change Failure Rate is a Distraction: Change Failure Rate works against the other metrics. For teams that move quickly, or teams that maintain large complex systems, bugs and failures are inevitable. Measuring how often teams fail, and giving them a goal they need to hit, creates a negative incentive. Teams that feel pressure not to fail will become less risk tolerant, which will slow them down. Instead, teams should focus on improving observability and automation. It is OK to fail if teams catch and fix those failures quickly.

Team Comparisons Do Not Work: The DORA metrics try to enable comparison between different teams and even different organizations across the industry. The problem is that every team and organization works differently and has its own success criteria. For example, website developers will typically have a higher deployment frequency than mobile engineers (see the note about mobile development a little further down). Context matters, and it is important to remember when comparing teams that they have different operating environments.

Correlation vs. Causation: High performing organizations are not high performing because they focus on the DORA metrics. High performing organizations are high performing because they focus on delivering value to their users. Organizations should focus on outcomes (the value produced) instead of outputs (the number of code releases).

A Note About Mobile Development

The DORA metrics pose a unique set of challenges for mobile development teams. These teams will want to tweak traditional processes and procedures so that the metrics take their specific context into consideration.

Tightly Controlled Ecosystems: Apple and Google have strict controls around the development and deployment of mobile applications. These restrictions have a negative impact on the DORA metrics. Traditional CI / CD approaches and best practices will require modification to take OS specific requirements into account.

Compiled Binaries: Deploying mobile apps requires compiling and installing binaries onto mobile devices from scratch every time. This means that hotfixes are not an option for mobile CI / CD. It also means reduced flexibility, which increases the time between releases. Fewer releases in turn mean larger releases, which further increases the complexity.

App Reviews: Mobile apps that are business-to-consumer require an app review from the distribution network supplier, either Apple or Google. This requirement eliminates the ability to do fully automated deployments.

Extensive Testing: Mobile apps support a large variety of mobile devices. This means extensive testing is necessary for changes. Even with a massive automated test suite, which takes a long time to run, there still needs to be manual spot checking. This increases the lead time for changes.

The Future of DevOps

The best way to approach the future of DevOps is to look at the current trends and draw them out to a reasonable conclusion. From that point, we can make educated guesses as to what businesses will do in the world created by these trends. The trends we can look at are automation, AI, multi-cloud and hybrid cloud environments, and Platform Engineering.

Automation: Imagine a world where all tedious, repetitive tasks have been automated. This means that incident management, application deployments, and security and compliance tasks are no longer eating up critical development bandwidth. Businesses want to optimize the use of their expensive development talent and will start to focus it on more complex problems that are not easy to automate. This renewed focus on difficult, complex work will require new skills and talents, which will put increased pressure on the current IT skills shortage. In-house training supported by competitive learning and development budgets will be necessary for businesses to stay competitive.

As an aside, if your company is struggling to hire and retain talent, our article “Hiring and Retaining Talent Post Pandemic” might be useful to you.

AI: As AI continues to advance, we will start to see more AI-driven tools. These tools will enhance predictive analytics, providing teams with deeper insights into potential issues before they escalate. Machine Learning algorithms will improve anomaly detection and automate decision-making processes, making DevOps practices more proactive and intelligent. This will not only allow systems that self-recover, but also systems that avoid problems without any human intervention. This in turn will lead to greater levels of quality and stability.

Multi-Cloud and Hybrid Cloud environments: Cloud hosting solutions provide more efficient scaling and stability than on-premises solutions, and this has led to a surge in businesses moving their hosting to the cloud. DevOps supports this model well but there is risk in relying on one cloud provider. As businesses seek to reduce their risk and spending, they will start looking to diversify their hosting solutions. The adoption of multi-cloud and hybrid cloud strategies will present both opportunities and challenges for DevOps teams. Managing diverse cloud environments will require advanced tools and practices. DevOps will need to adapt to these complexities, ensuring seamless integration and operation across various cloud platforms.

Platform Engineering: Platform Engineering dedicates teams within an engineering organization to developing tools not for clients but for internal developers. The goal is to improve software delivery by creating reusable, self-service platforms. Gartner predicts that by 2026, 80% of large software engineering organizations will establish platform engineering teams. Some predict that Platform Engineering will replace DevOps, but we see the situation as one of augmentation rather than replacement.

Platform Engineering will become more popular in the coming years, but the value it generates will be up to the organizations that implement it. Our prediction is that organizations that use both Platform Engineering and DevOps will have teams that can produce code faster within the guardrails created by platform engineers.

Prepare Now

DevOps has come far since its conception and still has room to grow. Automation, AI, security, and advanced cloud strategies will drive the evolution of DevOps practices. To prepare, businesses will want to invest in their engineering organizations to introduce or properly staff a platform engineering team. In addition, they should put together a learning and development budget to prepare their existing talent for AI and machine learning. Organizations should also consider adopting the DORA metrics to gain more insight into how their teams are performing. For those that make the investment, there will be major benefits, including efficient use of talent, faster product development, and greater stability.