Decades ago, we abandoned the practice of measuring developers by the number of lines of code they wrote. We realized it was too easy to game the system by writing bloated code that reduced value rather than increasing it. The best developers, who made code smaller, faster, and easier to maintain, were penalized because they appeared to generate negative productivity – but the metric was wrong, not the developers. Bill Atkinson, a developer at Apple, is reported to have removed 2,000 lines of code in a single week – while making the drawing calculations six times faster.
Today, we can generate thousands of lines of code with a single prompt to a large language model (LLM). An LLM can beat any human at delivering lines of code. However, is that really the goal?
The Training
Before we can get to the problem of excessive lines of code, we need to understand how LLMs came to generate code with unnecessary lines. The answer lies in the training data and how it was sourced from publicly accessible places, including open repositories on GitHub and coding websites. These sources lack any form of quality control, and therefore the code the LLMs learned from is of varying quality.
While there are absolutely some repositories that contain meticulous, beautiful code written by the best developers and released after quality peer review, that's not the norm. Many publicly available repositories are public because they were written by developers who were just learning. They made their repositories public because they didn't see much value in what they were producing.
Early in my SharePoint development career, I railed against what I saw as one of the biggest problems with the sample code littered across various sites. The sample code, which came from the official templates Microsoft offered, overrode the RenderControl() method, which literally just wrote HTML back to the client. It took years of petitioning before the templates were changed to use CreateChildControls(), which behaved properly inside the ASP.NET 2.0 stack and allowed for postback events. If an AI had been trained on SharePoint development code from before about 2010, it would have been consistent – and wrong.
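To make the difference concrete, here is a minimal C# sketch of both patterns; the class and member names are hypothetical, invented for illustration rather than taken from the old templates:

```csharp
using System;
using System.Web.UI;
using System.Web.UI.WebControls;

// Anti-pattern from the old sample code: overriding RenderControl()
// and writing raw HTML straight back to the client. No server-side
// child controls exist, so postback events can never be wired up.
public class BadGreetingControl : WebControl
{
    public override void RenderControl(HtmlTextWriter writer)
    {
        writer.Write("<input type=\"submit\" value=\"Save\" />");
    }
}

// Corrected pattern: building real child controls in
// CreateChildControls() lets ASP.NET manage the control lifecycle,
// view state, and postback events such as Click.
public class GoodGreetingControl : WebControl
{
    private Button _saveButton;

    protected override void CreateChildControls()
    {
        _saveButton = new Button();
        _saveButton.Text = "Save";
        _saveButton.Click += new EventHandler(SaveButton_Click);
        Controls.Add(_saveButton);
    }

    private void SaveButton_Click(object sender, EventArgs e)
    {
        // Runs on postback – something the raw-HTML version can never do.
    }
}
```

The second version hands the control lifecycle back to ASP.NET, which is exactly what makes server-side event wiring possible.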
In the quest to get as much training data as possible, little effort went into vetting the training data to ensure it was good. The result is LLMs outputting the kind of code written by a first-year developer – and that should be concerning to us.
The Security Problems
The last decade has seen an escalating conflict between malicious attackers seeking to find defects in software and the software developers who are hardening their work. Initial reports on AI-generated code imply that it's going to get worse. Some of the common vulnerabilities that we've known about for decades, including cross-site scripting, SQL injection, and log injection, are exactly the kinds of vulnerabilities that AI introduces into code – and it generates this code at rates that are multiples of what even junior developers produce. At a time when it's important that we be more cautious about security, AI can't deliver that caution.
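SQL injection, for instance, usually comes down to the difference between concatenating user input into a query string and parameterizing the query. Here is a minimal sketch in C# using ADO.NET; the table, method, and variable names are hypothetical:

```csharp
using System.Data.SqlClient;

public static class UserQueries
{
    // Vulnerable pattern that shows up in generated code: concatenating
    // user input into SQL lets an attacker inject commands,
    // e.g. userInput = "x' OR '1'='1".
    public static string BuildUnsafeQuery(string userInput)
    {
        return "SELECT * FROM Users WHERE Name = '" + userInput + "'";
    }

    // Safer pattern: a parameterized query keeps the input as data,
    // never as executable SQL.
    public static int CountUsersByName(string connectionString, string userInput)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand(
            "SELECT COUNT(*) FROM Users WHERE Name = @name", conn))
        {
            cmd.Parameters.AddWithValue("@name", userInput);
            conn.Open();
            return (int)cmd.ExecuteScalar();
        }
    }
}
```

The parameterized version treats the input strictly as data, so even a crafted string can't change the shape of the query.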
The Maintenance Problems
Today, we have AI generating bloated code that creates maintenance problems, and we're looking the other way. It can't structure code to minimize duplication. It doesn't care that there are two, three, four, or more implementations of basic operations that could be made into one generic function. The code it was trained on didn't contain the right abstractions, so it can't get there. (See Focus on Functions for some of the writing I was doing decades ago on how to write good functions – writing the AI doesn't appear to have crawled.)
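As a sketch of the consolidation AI tends to miss, here are two type-specific copies of the same operation and the single generic function that could replace them; the names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

// The kind of duplication generated code accumulates: near-identical
// functions that differ only by type.
public static class Duplicated
{
    public static int MaxInt(List<int> items)
    {
        int max = items[0];
        foreach (int item in items)
            if (item > max) max = item;
        return max;
    }

    public static decimal MaxDecimal(List<decimal> items)
    {
        decimal max = items[0];
        foreach (decimal item in items)
            if (item > max) max = item;
        return max;
    }
}

// One generic function replaces every copy: any type that can compare
// itself to another of its own kind works.
public static class Consolidated
{
    public static T Max<T>(List<T> items) where T : IComparable<T>
    {
        T max = items[0];
        foreach (T item in items)
            if (item.CompareTo(max) > 0) max = item;
        return max;
    }
}
```

With the generic version, a fix to the comparison logic lands in exactly one place instead of two, three, or four.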
Can we code with AI assistance? Yes. Can we “vibe code”? Absolutely. However, the questions we need to be asking ourselves are: 1) at what cost, and 2) what can we do to mitigate those costs?
The answer seems to be having experienced developers review and refactor code to ensure quality and maintainability standards are met. We first wrote about how to do effective code reviews two decades ago in Effective Code Reviews Without the Pain. If you need help developing a pattern for reviewing AI-generated (or human-generated) code, we can help.
