
Why We Chose the Hard Path of Building Our AI Stack


It was 2024. The pressure to buy was immense.

Every week, a new vendor would pitch us. The message was seductive in its simplicity: “You don’t want to build this since it’s hard and messy. Let us handle it.”

Honestly, building from the ground up can be expensive. It keeps you up at night.

Yet, we chose to build anyway.

Looking back from 2025, I know this wasn’t the obvious choice. In fact, for most companies, it would have been a mistake. But for us, in that specific window of time, it was the only way to survive. Here is why we did it, what it cost us, and what we learned about the true price of ownership.

The Landscape Was Empty

What people forget about early 2024 is how little actually existed.

LangChain was evolving weekly. Bedrock had just launched. Most “production-ready” demos were smoke and mirrors, often just polished abstractions of basic Q&A.

We had a specific problem: a sixteen-step emergency roadside workflow. We needed a system that could route between specialized agents without losing context, trigger frontend behaviors like GPS detection, and render maps—all without forcing us into a rigid API contract before we even knew what the product looked like.

The vendors couldn’t do this, because they were building for the average user. They were optimized for horizontal chatbots. We needed deep, vertical orchestration. The open-source ecosystem wasn’t ready either.

We stood at a crossroads. Wait for the market to mature (and lose our first-mover advantage), or build it ourselves (and risk burning our runway).

We chose the risk.

What We Built Instead, and What Broke Along the Way

At the time, everyone treated LLMs as chat interfaces. We had to treat them as orchestration primitives.

We built a multi-agent system where agents could share full conversational context. This meant a specialist agent could take over mid-conversation without the user repeating themselves. It sounds simple now, but in 2024, it was fragile. It broke often. We spent weeks debugging context windows that leaked memory and prompts that caused the model to hallucinate steps.

But when it worked, it unlocked a lot.

We built an orchestrator-supervisor agent with one main job: deciding where the conversation should go.

General query? → Knowledge base agent.
Emergency? → Specialized roadside agent with execution tools.
(and other use cases)

This allowed us to model an entire workflow that completed tasks, while also answering questions.
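To make the routing idea concrete, here is a minimal sketch in Python. It is not our production code: the `Conversation` structure, the agent functions, and the keyword-based `classify` step are all hypothetical stand-ins (in a real system, the supervisor's decision would come from a model call). What it shows is the shape of the pattern: one shared history, one supervisor that picks the next agent, and specialists that see everything said so far.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Conversation:
    """Full message history, shared by every agent (hypothetical structure)."""
    messages: List[dict] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})


# An agent is any callable that reads the shared conversation and returns a reply.
Agent = Callable[[Conversation], str]


def knowledge_base_agent(conv: Conversation) -> str:
    return f"[kb] answering: {conv.messages[-1]['content']}"


def roadside_agent(conv: Conversation) -> str:
    # Sees the whole history, so the user never has to repeat themselves.
    user_turns = sum(1 for m in conv.messages if m["role"] == "user")
    return f"[roadside] dispatching help (context: {user_turns} user turns)"


def classify(conv: Conversation) -> str:
    """Stand-in for the supervisor's model call: a keyword check, for illustration only."""
    text = conv.messages[-1]["content"].lower()
    emergency_words = ("breakdown", "flat tire", "accident", "stranded")
    return "emergency" if any(w in text for w in emergency_words) else "general"


AGENTS: Dict[str, Agent] = {
    "general": knowledge_base_agent,
    "emergency": roadside_agent,
}


def supervise(conv: Conversation, user_message: str) -> str:
    """Record the user turn, pick an agent, and record that agent's reply."""
    conv.add("user", user_message)
    reply = AGENTS[classify(conv)](conv)
    conv.add("assistant", reply)
    return reply
```

Calling `supervise` twice on the same `Conversation`, first with a general question and then with "I'm stranded with a flat tire", sends the first turn to the knowledge base agent and the second to the roadside agent, which receives the earlier turns as part of its input.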

The Protocol That Didn't Exist

The system needed to trigger browser behaviors: requesting GPS access, rendering maps, and displaying nearby service options.

A traditional approach would require tightly coupled API contracts between backend and frontend. We avoided that entirely, because it would have slowed both our team and our agents to a crawl.

Instead, we embedded structured markers within the model’s output stream. In this inelegant design, the frontend listened for these signals and reacted in real time, without any rigid schema, versioned contracts, or coordination overhead. The interface became responsive to the model, not dependent on it.

The solution was brittle: we were coupling our UI logic to the output of a probabilistic model. Any seasoned engineer would raise an eyebrow. But in 2024, elegance was a luxury. We traded engineering purity for product velocity. We accepted the technical debt because it let us ship a feature that felt magical to users, weeks before our competitors could even draft their API specs.

(Note: Today, with mature frameworks and standardized JSON modes, we would approach this differently. But back then, this hack was our lifeline.)
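For illustration, the marker approach described in this section might look something like the sketch below. The `[[ui:...]]` delimiter and the action names are invented for this example and are not our actual format, and the real listener lived in the browser; Python is used here only to show the parsing logic. The awkward part is that a marker can be split across streamed chunks, so the parser has to buffer.

```python
from typing import Iterable, Iterator, Tuple

OPEN, CLOSE = "[[ui:", "]]"  # hypothetical marker delimiters


def _partial_prefix_len(s: str) -> int:
    """Length of the longest suffix of s that is a proper prefix of OPEN."""
    for n in range(min(len(OPEN) - 1, len(s)), 0, -1):
        if s.endswith(OPEN[:n]):
            return n
    return 0


def parse_stream(chunks: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield ("text", s) for prose and ("action", name) for embedded markers.

    Markers may be split across chunk boundaries, so unparsed input is buffered.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            start = buffer.find(OPEN)
            if start == -1:
                # No marker start yet. Hold back a tail that could be
                # the beginning of OPEN and emit the rest as text.
                keep = _partial_prefix_len(buffer)
                if len(buffer) > keep:
                    yield ("text", buffer[:len(buffer) - keep])
                buffer = buffer[len(buffer) - keep:]
                break
            end = buffer.find(CLOSE, start + len(OPEN))
            if end == -1:
                # Marker opened but not closed: emit the text before it and wait.
                if start > 0:
                    yield ("text", buffer[:start])
                    buffer = buffer[start:]
                break
            if start > 0:
                yield ("text", buffer[:start])
            yield ("action", buffer[start + len(OPEN):end])
            buffer = buffer[end + len(CLOSE):]
    if buffer:
        # Stream ended mid-marker or with a held-back tail: surface it as text.
        yield ("text", buffer)
```

A consumer would render the `"text"` events as chat output and map `"action"` names to handlers such as a GPS permission prompt. The brittleness is visible even in this toy version: if the model misspells a marker or never closes it, the user sees raw marker text.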

The Fear of Lock-In

While we hacked the UI layer, we refused to hack the model layer.

We were terrified of getting stuck with a single provider. In 2024, vendors weren’t just selling tools; they were trying to own our stack. They pushed their models, their embeddings, their proprietary formats. We knew that if we leaned too hard into one provider, we’d lose our ability to pivot.

So, unlike the UI layer, we built a clean, thin abstraction layer for model access that served as our escape hatch.

We encapsulated model access and prompt handling behind a strict interface. This meant that when Model A became too expensive, or Model B suddenly got smarter at reasoning, we could switch traffic in hours.
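A rough sketch of what such an interface can look like is below. Every name here (`ModelClient`, `ModelRouter`, the fake providers) is hypothetical and chosen for illustration; real adapters would wrap each provider's SDK behind the same `complete` method. The point is that application code only ever talks to the router, so shifting traffic is a weight change rather than a migration.

```python
import random
from typing import Dict, Optional, Protocol


class ModelClient(Protocol):
    """The only surface the rest of the app is allowed to touch (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


class FakeProviderA:
    """Stand-in for a real provider adapter; makes no network calls."""

    def complete(self, prompt: str) -> str:
        return f"A:{prompt}"


class FakeProviderB:
    def complete(self, prompt: str) -> str:
        return f"B:{prompt}"


class ModelRouter:
    """Splits traffic across providers by weight; weights can change at runtime."""

    def __init__(
        self,
        clients: Dict[str, ModelClient],
        weights: Dict[str, float],
        rng: Optional[random.Random] = None,
    ) -> None:
        self._clients = clients
        self._rng = rng or random.Random()
        self.set_weights(weights)

    def set_weights(self, weights: Dict[str, float]) -> None:
        unknown = set(weights) - set(self._clients)
        if unknown:
            raise ValueError(f"unknown providers: {sorted(unknown)}")
        if sum(weights.values()) <= 0:
            raise ValueError("at least one weight must be positive")
        self._weights = dict(weights)

    def complete(self, prompt: str) -> str:
        # Weighted random choice of provider for this request.
        names = list(self._weights)
        chosen = self._rng.choices(names, weights=[self._weights[n] for n in names])[0]
        return self._clients[chosen].complete(prompt)
```

With this shape, moving all traffic from one provider to another is a single `set_weights({"a": 0.0, "b": 1.0})` call, and a partial split such as 90/10 allows a new model to be tried on a slice of requests first.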

While competitors were locked into six-month migration projects to switch providers, we were testing new models in production on a Tuesday afternoon. That agility wasn’t a feature; it was our survival mechanism.

The Trade-off

This wasn’t the obvious choice.

Building meant owning complexity: agent coordination and orchestration, prompt design, streaming behavior, optimization through evaluation and observability, and constant iteration in an unstable ecosystem. It required time, focus, and a willingness to operate without established patterns.

Along the way, we made mistakes, expanded the team and still burned out, and disrupted our work-life balance.

For many teams, buying would have been the right decision. It definitely saves a lot of time and sanity. If your use case is standard, please, just buy.

In our case, the requirements were too specific and the pace of change was too fast. The constraints made the decision clear.

The Outcome: Autonomy Over Savings

Because we built, we bought ourselves something money can’t easily purchase: autonomy.

Yes, we saved ~$500k in vendor license fees. But building isn't free either. It is paid for in engineering salaries and late nights.

The real financial win was the efficiency of spend. We optimized compute in ways vendors never would have allowed, because their margins depended on inefficiency.

The bigger win was avoiding the hidden tax of dependence:

  • No waiting for external roadmaps.
  • No negotiating contracts during peak traffic.
  • No hitting rate limits that killed our user experience.

We scaled on our own terms: our accounts, our limits, our decisions.

The Choice Between Building and Buying

Although buying optimizes for speed today, building, when done for the right reasons, optimizes for control tomorrow.

Paying for the complexity upfront gave us leverage and a foundation that is entirely ours.

If you’re sitting on the fence today, ask yourself:

  • Is this feature your core differentiator?
  • Is it something you cannot easily delegate?
  • Would a vendor become a bottleneck, for example on your compliance needs?

If the answer is "no" to any of these, buy.
But don’t be afraid to build if your survival depends on specificity and speed.