Reliably Generating AI Code

In 2025, writing boilerplate is cheap. Any junior dev with an LLM can spin up a Lambda, wire up logging, and handle CORS in minutes.

However, AI doesn’t understand your organization's rules or know your team’s error-handling conventions. It forgets your patterns unless you either repeat them or cache them into context, and both options consume a lot of tokens.

Left unguided, AI-generated code creates entropy and scales poorly. The result is a codebase where every function looks different, debugging requires archaeology, and "it works on my machine" becomes the standard. Hence, job descriptions have shifted from seeking engineers who just write code to seeking engineers who ensure code behaves as intended.

1. Contract > One-shot Coding

We stopped asking devs to write Lambda handlers. Instead, I built a [framework] that handles the plumbing so the developer can focus on the business logic. The first deploy works every time. What used to take anywhere from hours to weeks now takes a day on average.

Now, when a developer (or their AI assistant) scaffolds a new service, they don’t write plumbing. They implement a strict interface. The logging, retries, tracing, and error normalization are enforced by the framework, not by convention.
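As a minimal sketch of what such a contract can look like (not the actual framework, which the post does not show), the decorator below owns logging, retries, and error normalization, while the developer supplies only a function from event to result. The names `handler`, `BusinessError`, and `TransientError` are hypothetical, and tracing is left out for brevity.

```python
import json
import logging
import time
from typing import Any, Callable, Dict

logger = logging.getLogger("service")

Event = Dict[str, Any]
Response = Dict[str, Any]


class BusinessError(Exception):
    """Hypothetical error type: the caller did something wrong."""

    def __init__(self, message: str, status: int = 400) -> None:
        super().__init__(message)
        self.status = status


class TransientError(Exception):
    """Hypothetical error type: the failure is worth retrying."""


def _response(status: int, body: Dict[str, Any]) -> Response:
    # Every service returns the same response shape.
    return {"statusCode": status, "body": json.dumps(body)}


def handler(
    logic: Callable[[Event], Dict[str, Any]],
    retries: int = 2,
    base_delay: float = 0.0,
) -> Callable[..., Response]:
    """Wrap business logic with the plumbing every service shares."""

    def wrapped(event: Event, context: Any = None) -> Response:
        for attempt in range(retries + 1):
            logger.info("start %s attempt=%d", logic.__name__, attempt)
            try:
                return _response(200, logic(event))
            except TransientError:
                if attempt < retries:
                    # Exponential backoff between retries.
                    time.sleep(base_delay * (2 ** attempt))
            except BusinessError as exc:
                return _response(exc.status, {"error": str(exc)})
            except Exception:
                # Unknown failures are logged in full but never leaked.
                logger.exception("unhandled error in %s", logic.__name__)
                return _response(500, {"error": "internal error"})
        return _response(503, {"error": "temporarily unavailable"})

    return wrapped


# The only code a developer (or an AI assistant) writes:
@handler
def create_order(event: Event) -> Dict[str, Any]:
    if "sku" not in event:
        raise BusinessError("sku is required", status=422)
    return {"ordered": event["sku"]}
```

Because the wrapper decides the response shape and the retry policy, a generated `create_order` cannot drift from the conventions of the services around it.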

Why this matters in the AI era:

AI is great at filling in blanks. Without the proper infrastructure to work with, it is terrible at maintaining architectural integrity across 50 microservices. By hardening the [framework], I ensure that whether a human or an AI writes the business logic, the system behavior is identical.

A new hire ships on day three because the system is designed so they can contribute before they understand all of it.

What would otherwise have been just a 4x multiplier also preserves the system's integrity.

2. Zero-Cognitive-Load Adoption

We built an invisible architecture into the system. Product teams across AAA adopted it without us having to sell it to them or ask anyone to attend integration workshops.

In a world where AI can generate infinite features, the bottleneck isn’t creation, it’s integration. When your internal tools require meetings, docs, or config files, teams naturally resist adopting them. This invisible architecture adopts itself because it offers immediate value with zero cognitive load.

The same idea applies, in different forms, to other systems.

3. Eradicating Bug Classes and Data Corruption

We had a silent data-loss bug in MongoDB. An AI might fix the specific instance. A senior dev might patch the service.

I fixed the client.

I wrote [21 LOC] and injected it into our shared database layer. Now no one, human or AI, can write code that overwrites data. This entire class of bug is extinct. This is the difference between coding and engineering: engineering eliminates the category of error, while coding alone might just hide the symptom.
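The post does not show those lines or say how the bug worked, so the following is only an illustration of what fixing the client can look like, not the author's fix. It assumes the bug class was stale writes clobbering newer data, and it swaps in a common technique, optimistic concurrency with a version field. `GuardedCollection`, `StaleWriteError`, and the `version` field are hypothetical names; the wrapped object only needs an `update_one(filter, update)` method whose result has a `matched_count` attribute, as PyMongo collections do. The in-memory class exists only so the sketch runs without a database.

```python
from types import SimpleNamespace
from typing import Any, Dict


class StaleWriteError(Exception):
    """Raised when a write is based on an outdated copy of the document."""


class GuardedCollection:
    """Hypothetical shared-layer wrapper: the only write path services get."""

    def __init__(self, collection: Any) -> None:
        self._collection = collection

    def update_versioned(
        self, doc_id: Any, expected_version: int, changes: Dict[str, Any]
    ) -> None:
        if "_id" in changes or "version" in changes:
            raise ValueError("callers may not set _id or version directly")
        # The write only matches if nobody else has written since we read.
        result = self._collection.update_one(
            {"_id": doc_id, "version": expected_version},
            {"$set": changes, "$inc": {"version": 1}},
        )
        if result.matched_count == 0:
            # Fail loudly instead of silently overwriting newer data.
            raise StaleWriteError(f"document {doc_id!r} changed since it was read")


class InMemoryCollection:
    """Stand-in for a real collection, used only to exercise the guard."""

    def __init__(self) -> None:
        self.docs: Dict[Any, Dict[str, Any]] = {}

    def update_one(self, filter: Dict[str, Any], update: Dict[str, Any]) -> Any:
        doc = self.docs.get(filter["_id"])
        if doc is None or doc["version"] != filter["version"]:
            return SimpleNamespace(matched_count=0)
        doc.update(update["$set"])
        doc["version"] += update["$inc"]["version"]
        return SimpleNamespace(matched_count=1)


raw = InMemoryCollection()
raw.docs["order-1"] = {"_id": "order-1", "version": 1, "status": "new"}
orders = GuardedCollection(raw)

orders.update_versioned("order-1", expected_version=1, changes={"status": "paid"})
```

A second writer still holding version 1 now gets a `StaleWriteError` rather than quietly replacing the `paid` status, which is the sense in which a guard in the shared layer removes the category instead of one instance.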

4. Design for the 3AM Page

Our sync jobs used to page us at 3 AM. We replaced them with self-healing state machines. They retry, back off, and recover automatically. Now, there's hardly a page in a year.
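The retry, back-off, and recover behavior described above can be sketched as a small state machine. This is an assumption-laden illustration rather than the production design: `SyncJob`, its states, and the default limits are all hypothetical, and a real version would also persist its state between runs.

```python
import enum
import time
from typing import Callable, List


class State(enum.Enum):
    PENDING = "pending"
    RUNNING = "running"
    BACKING_OFF = "backing_off"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


class SyncJob:
    """Runs one step until it succeeds or exhausts its attempts."""

    def __init__(
        self,
        step: Callable[[], None],
        max_attempts: int = 5,
        base_delay: float = 1.0,
        max_delay: float = 60.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.step = step
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.sleep = sleep  # injectable so tests do not actually wait
        self.state = State.PENDING
        self.attempts = 0
        self.history: List[State] = [State.PENDING]

    def _move(self, state: State) -> None:
        self.state = state
        self.history.append(state)

    def run(self) -> State:
        while self.state not in (State.SUCCEEDED, State.FAILED):
            self._move(State.RUNNING)
            self.attempts += 1
            try:
                self.step()
            except Exception:
                if self.attempts >= self.max_attempts:
                    # Only at this point would a human need to be paged.
                    self._move(State.FAILED)
                else:
                    self._move(State.BACKING_OFF)
                    # Exponential backoff, capped at max_delay.
                    delay = self.base_delay * 2 ** (self.attempts - 1)
                    self.sleep(min(self.max_delay, delay))
            else:
                self._move(State.SUCCEEDED)
        return self.state
```

Making every state explicit means a failure at 3 AM lands in `BACKING_OFF` and resolves on a later attempt; the pager is reserved for `FAILED`.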

AI can write a cron job. It cannot easily design a system that anticipates failure modes, handles race conditions, and recovers without human intervention, because that requires deep system understanding. Even recognizing the problem requires empathy for the person on call.

5. Descoping is a Design Tool

Every quarterly planning cycle starts with multiple feature requests for various epics. Since AI is endlessly generative, it will happily write all the code. That doesn't mean it should. With no concept of ROI, AI doesn't ask whether a feature deserves to exist.

What you don't build matters as much as what you do. Every line of code can become a liability. Each one means tests to write, a feature to maintain, and, if anything breaks, a bug to fix and a potential page at 3 AM.

Engineering judgment is the filter against that entropy.

What can we remove? What can we simplify? What shouldn’t exist at all?

A descoping pass ensures the product is built for its end users, not for AI's generative capacity.

Conclusion

In an AI-first world, velocity is becoming cheaper, and code is treated like a commodity. Integrity is what compounds here, especially when consistency is the spine.

Instead of being the hero who leaves vacation to fix breakage, I like architecting systems that can detect breakage and self-heal, so that they work even when no one is watching.

P.S. Over the system’s lifetime during my role at AAA, this pattern has saved an estimated 4,000 developer hours, and it continues to compound by eliminating repeated fixes, debugging effort, and inconsistent behavior across services.