Embracing the Future: AI Driven Development

On March 12th, 2024, Cognition.AI introduced Devin, which it billed as the world’s first AI software engineer.

Introducing Devin

Cognition claimed that Devin used advances in long-term reasoning and planning to execute complex engineering tasks requiring thousands of decisions. The AI was measured against SWE-bench, a challenging benchmark that asks agents to resolve real-world GitHub issues found in open-source projects, and Cognition reported that it correctly resolved 13.86% of issues end-to-end. This surpassed the previous state of the art, which had a rate of just 1.96%.

The excitement around Devin lasted for about five weeks until, on April 19th, the Daily Dev published an article, “Is Devin a Scam? Unpacking the Truth Behind the Claims”. The Daily Dev targeted the claims around Devin’s ability to complete complex coding tasks and its ability to increase productivity. In its testing, the Daily Dev concluded that Devin struggled with complex tasks, suggested that Cognition staged the video of Devin completing a coding task, and noted that Devin required “significant human intervention and doesn’t handle changes well.”

This episode raised concerns among developers about the future of the industry. These concerns centered on the authenticity and reliability of AI-driven development, or AIDD. The episode also showed the level of hype and sensational claims surrounding AIDD.

This article will bring clarity around AIDD by cutting through the hype and taking an objective view of what AIDD is, what its current limitations are, and where it could go in the future. The goal is to use research and ACV’s own experience to arm businesses with the essentials to help them avoid the hype and make good decisions when adopting AIDD.

What is AIDD?

AIDD, or Artificial Intelligence Driven Development, refers to the use of artificial intelligence technologies to enhance the software development process. The promise of AIDD is that it can improve a software development team’s productivity, reduce errors, and create more robust and efficient software while reducing costs. It covers tasks such as code generation, code optimization, and predictive analytics.

A good example of AIDD is GitHub’s Copilot feature. Copilot is a developer tool that engineers can use to help them write software. Copilot was launched on top of the Codex AI engine. Codex in turn is a descendant of OpenAI’s GPT-3 large language model, fine-tuned on 159GB of code from public GitHub repositories.

Large Language Models (LLMs), like GPT-3, are neural networks trained on vast text datasets, and they are central to Natural Language Processing (NLP). NLP is an AI discipline that focuses on how computers understand and respond to human language. GPT-3 is recognized for producing human-like text, with uses ranging from text generation to answering questions.

If your head is starting to spin, that’s OK. Just remember that AIDD is about creating a faster, more reliable software development process. Here are the key features:

1. Automated Code Generation: AI tools can generate code snippets or entire modules based on high-level descriptions or existing code patterns. This speeds up the development process and helps maintain consistency while reducing human errors.

2. Bug Detection and Fixing: Machine learning models trained on vast amounts of code can identify patterns that indicate potential bugs or vulnerabilities before the code even runs. This proactive approach helps in maintaining code quality and reduces the time spent on debugging.

3. Automated Testing: AI-driven testing tools can create and execute test cases. These tools can then analyze the test results, identify patterns, and suggest areas for improvement.

4. Predictive Analytics for Project Management: By analyzing historical data, AI can forecast project risks, estimate timelines, and suggest resource allocation strategies. This enables development teams to make informed decisions, optimize workflows, and anticipate potential challenges before they become critical issues.

5. NLP for Requirements Gathering: AI-powered natural language processing (NLP) tools can translate user requirements and documentation into actionable development tasks. This facilitates clearer communication and ensures that the development team accurately captures and addresses user needs.

6. Code Optimization: By analyzing code execution patterns, AI tools can suggest improvements or refactor code to adhere to best practices. This results in more efficient and maintainable code, contributing to better application performance.
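To make the second feature in the list above more concrete, here is a minimal Python sketch of flagging likely bugs before the code ever runs. It is deliberately not an AI model: it uses two hand-written rules over Python’s standard ast module as a simplified stand-in for the learned pattern detection described above. The function name find_issues, the sample SOURCE string, and the two rules chosen are assumptions made purely for illustration.

```python
import ast

# Hypothetical sample code to scan; it contains two well-known bug patterns.
SOURCE = '''
def add_item(item, bucket=[]):
    bucket.append(item)
    return bucket

def load(path):
    try:
        return open(path).read()
    except:
        return None
'''


def find_issues(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, message) pairs without executing the code."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Collect positional and keyword-only default values.
            defaults = list(node.args.defaults)
            defaults += [d for d in node.args.kw_defaults if d is not None]
            for default in defaults:
                # A list, dict, or set literal default is shared across calls.
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    issues.append(
                        (default.lineno,
                         f"mutable default argument in '{node.name}'")
                    )
        elif isinstance(node, ast.ExceptHandler) and node.type is None:
            # A bare 'except:' swallows every error, including typos.
            issues.append((node.lineno, "bare 'except:' hides unexpected errors"))
    return sorted(issues)


if __name__ == "__main__":
    for line, message in find_issues(SOURCE):
        print(f"line {line}: {message}")
```

An AI-driven tool would replace these fixed rules with patterns learned from large amounts of code, but the workflow is the same: inspect the code statically, report a location and a reason, and leave the decision to the developer.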

AIDD’s Challenges and Limitations

While the potential applications of AIDD can seem limitless, the industry will need a grounded approach to adopting it.

Handling Human Emotions

First, there is a very real emotional response when it comes to AIDD, with 45% of developers experiencing AI Skill Threat, or the concern of being replaced by AIs like Devin. This fear, whether rational or irrational, will need to be managed with sensitivity.

If you have a team that is apprehensive about AI, Pluralsight is offering a free Generative AI Adoption ToolKit. It includes a Pre- and Post-Intervention AI Skill Threat Benchmarking Assessment, a Pre-Mortem guide to encourage thriving during the adoption of AI-Assisted coding, and a Gen-AI Assisted Coding Learn-a-thon guide.

Lack of Experience

AIDD pushes developers into unfamiliar territory, demanding proficiency in both coding and AI elements like natural language processing and deep learning. This gap in experience is why 74% of software developers are planning to upskill in AI-assisted coding. Businesses will want to consider whether they are doing enough to support the effort needed to close this talent gap. A competitive learning and development (L&D) budget is highly recommended.

Interested readers can also see how a competitive L&D budget can help them with Hiring and Retaining Talent Post Pandemic.

Data Quality

AI-centric platforms depend on highly accurate data to function well. This means that the integrity and precision of this data come under intense scrutiny. Developers will need to structure this data meticulously and ensure its compatibility with specific algorithms, and that requires significant upfront investment.

Understanding Context

The problem AIDD needs to solve is understanding what to build and how it should work. This is more important than whether AI can write code. The core issue of defining project requirements and the user’s needs underscores the importance of human developers. Human insight is often critical in transforming complex requirements into viable software. NLP tools are promising, but as can be seen from the Devin example, improvement is needed.

Eyes on the Horizon

As we look to the future of AI, it is important to use the current state as a reference point. Right now, we have “narrow” or “weak” AI. This means that AI is capable of intelligently accomplishing specific tasks. Chess-playing AIs like Deep Blue are one example. These AIs cannot imitate human intelligence in any situation, as “general” AIs would, or surpass human intelligence, as “super” AIs would. Industry may get there; we are just not there yet.

What happens when we can produce general and super AIs?

When the world can produce robust “general” AIs cheaply, humans will be relieved of tedious and dangerous tasks, leaving them free to focus on work that is more complex and challenging. General AIs will be true partners, capable of collaborating with humans or working independently with minimal intervention. This will mean that software developers can focus more on developing the AI itself, rather than the work the AI produces. In this phase, AIDD will continue to augment humans rather than replace them, and the talent freed up to work on AI will accelerate its evolution and lead to “super” AIs.

With “super” AIs, humans would move on to focus on tasks that require creativity and empathy. AIDD would take on its final form as it moves from augmentation to replacement. AI would no longer just drive the development process; it would be the development process. Not only could it write the code businesses ask for, but it could also write code to improve its own functioning, taking it beyond a human’s level of understanding.

Optimistic But Cautious

With our understanding of weak, general, and super AIs, it becomes easier to see why Devin seemed suspicious to so many. Cognition was essentially saying that it had leapfrogged from weak AI to super AI in a few years, a process that is estimated to take at least a decade. The moral of the story is to be optimistic and even excited about AI, but to do so with prudent caution and to avoid the hype where possible.

As businesses consider the impact of AI, it is important to embrace AI Driven Development where possible. As AIDD reaches wider adoption, AI-augmented engineering will be essential for development teams aiming to stay competitive. Businesses should prepare now by training their staff and helping them get comfortable with the tools currently available, like GitHub’s Copilot and ChatGPT. ACV has started to leverage both. Businesses that do this can expect a short-term “first mover” advantage in efficiency and quality and, in the long term, a more sustainable and predictable software development process.