Skip to main content

Fifteen Lambdas, Zero Deviation, and 4k Developer Hours Saved

· 5 min read

Nineteen Lambdas, Zero Deviation

We had nineteen Lambda functions. They all did different things. They all behaved the same way.

That wasn't an accident. It was enforcement.

The problem started like this: five functions, then eight, then twelve. Each written by a different person on a different day. One logged errors with stack traces. Another logged nothing at all.
One retried failed API calls three times. Another gave up immediately.
One returned CORS headers on every response. Another forgot them on errors—which meant browsers silently swallowed failures and left users staring at a frozen screen.

The fixes were easy. The pattern was not.

We were building a system where

  • every new function meant re-learning how to do the basics.
  • on-call required memorizing nineteen different behaviours.
  • bugs fixed in one place would still exist in countless others.

So I built a framework.


The Framework

Every new Lambda extended it. Override generateResponse() with your business logic. Maybe initializeHandler() if you needed setup. That's it.

The parent class handled everything else.

  • event parsing
  • logger initialization
  • execution timing
  • retry logic with exponential backoff
  • error classification
  • response formatting with CORS headers
  • request tracing via X-Ace-RequestId that linked frontend errors to backend logs—automatically present on every response, success or failure.

A developer writing a new action group didn't think about any of this.

  • If they forgot to log something, it was logged.
  • If they threw an error, it was caught and classified.
  • If a transient failure occurred, it retried automatically.

All they had to do now was write the code for the feature's business logic, and the framework completed the rest.

The first Lambda written this way worked on the first deploy.
So did the second. So did the next five, and subsequently others.


Two Arrays That Did the Work of Ten Engineers

The classification lived in an error-handling library. Two arrays.

const no_retry_exceptions = [
'ValidationException',
'AccessDeniedException',
'ResourceNotFoundException'
];

const retry_exceptions = [
'ThrottlingException',
'ServiceQuotaExceededException',
'InternalServerException',
'ConflictException',
'DependencyFailedException'
];

That's it.

Validation errors don’t retry as that code wouldn't fix on its own.
Throttling errors retry once, with backoff, because hammering an overloaded service only makes it worse.

When a new error type appeared, we updated the arrays once.
Nineteen Lambdas updated instantly.

No hunting. No meetings. No “did we get them all?”


Twenty-One Lines That Made a Class of Bugs Extinct

In the inheritted MongoDB's implementation of the $set operator, a nested object doesn't merge, rather replaces. You think you’re updating a field. You’re actually deleting everything else.

For instance, if you had { user: { name: "John", phone: "555-1234" } } and you ran { $set: { user: { phone: "555-9999" } } }, you'd get { user: { phone: "555-9999" } }. The name vanished. Permanently.

This is documented. It's also a trap every MongoDB developer steps in eventually.

I wrote flattenObject. Twenty-one lines. It turned nested objects into dot notation—{ "user.phone": "555-9999" }.

Now $set updated exactly one field.

Then I put it in the shared MongoDbClient. Every Lambda that wrote to the database inheritted this protection automatically. No one had to remember the rule, and no one had to lose data again.


Cold Starts That Stopped Mattering

Creating AWS clients is expensive. TLS handshakes. Credential resolution. Connection pools. Three hundred milliseconds here, eight hundred there.

On a cold start, that time adds up.

So we initialized clients at module scope. If that failed, we lazily initialized inside the handler on the first request. That first request is slightly slower. It doesn't crash. Every request after that is fast.

We reduced p95 latency by ~800ms, by removing a class of latency entirely.


What Nineteen+ Lambdas Look Like Now

We have nineteen. The pattern held.

A new developer joined and added an action group. He wrote his business logic in data structures already being passed to the framework, and shipped in a few hours.

He didn’t have to ask about:

  • logging
  • retries
  • error handling
  • CORS
  • request IDs

His code worked on the first deploy because of the framework.

Most teams optimize for flexibility early. They pay for it later: in inconsistency, bugs, and cognitive load.

I did the opposite. I constrained everything that didn’t need variation. When behavior is standardized, correctness compounds.

Concluding

Two weeks to build the framework.
Thousands of hours saved since.
Tending to Zero data-loss events from third party clients.
Entire classes of bugs eliminated.
New engineers productive on day one.
Velocity 100x.

I build systems where the right behavior is the default, and everything else is secondary.

P.S. Across the system’s lifetime, this pattern has saved an estimated ~4,000 developer hours—and continues to compound by eliminating repeated fixes, debugging effort, and inconsistent behavior across services.