How AI Agents Build Our SaaS Products
Inside Nexus, the AI orchestration system that builds production SaaS at Zeron Studio — and why near-zero engineering overhead means savings go back to real people.
We Don't Have an Engineering Team. We Have an Orchestration System.
There's no engineering team at Zeron Studio. No standups, no sprint planning, no team of developers committing code late on a Friday night. It's a solo operation — just me, Adam Bullied — building production SaaS products used by real businesses.
And yet, we ship. Every feature in Taktik, every component on this website, every internal tool in our stack was built by AI agents and deployed to production. That's not a boast. It's the setup for a question worth answering honestly: how?
The answer is a system called Nexus — an internal AI orchestration layer that manages the full software development lifecycle. Not as a prototype. Not as a research project. As the actual way software gets built here.
This post is a practitioner's account of how it works, where it falls short, and what it means for the people who buy software from us.
What an AI Agent Actually Does (And What It Doesn't)
Let's start with what Nexus is not.
It's not autocomplete. It's not GitHub Copilot suggesting the next line of code. And it's definitely not a chatbot you paste a requirement into and hope for the best. The mental model of "AI writes code like a faster developer" misses the point — and it misses why agentic software development is actually different.
An AI agent, in the way we use the term, is a model that takes a goal, builds a plan, executes steps, evaluates results, and loops back when something doesn't work. It has context — previous learnings, coding standards, architecture decisions — and it uses that context to make decisions autonomously over the course of a task, not just a single prompt.
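That goal-plan-execute-evaluate loop can be sketched in a few lines. This is a toy illustration of the control flow, not Nexus's implementation; every function name here is a hypothetical stand-in, and the model calls are stubbed so the sketch runs.

```python
# Minimal sketch of the agent loop: take a goal, plan, execute,
# evaluate the result, and revise until evaluation passes or the
# step budget runs out. All names are hypothetical stand-ins.

def run_agent(goal, context, max_steps=5):
    plan = make_plan(goal, context)
    result = None
    for _ in range(max_steps):
        result = execute(plan)
        ok, feedback = evaluate(result, goal)
        if ok:
            return result
        plan = revise(plan, feedback)  # loop back with what went wrong
    return result

# Toy stand-ins so the sketch runs: the plan "succeeds" once it has
# been revised enough times to cover the goal.
def make_plan(goal, context):
    return {"steps": 1, "goal": goal}

def execute(plan):
    return plan["steps"]

def evaluate(result, goal):
    needed = goal["required_steps"]
    return (result >= needed, f"need {needed - result} more steps")

def revise(plan, feedback):
    return {**plan, "steps": plan["steps"] + 1}

print(run_agent({"required_steps": 3}, context={}))  # prints 3
```

The point of the sketch is the shape: the agent doesn't stop at the first response, it feeds the failure back into the next attempt.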
What that looks like in practice: a ticket describing a feature comes in. The agent reads the ticket, reads the relevant codebase, reads past learnings from similar work, generates code, runs it, reads the output, and revises. It doesn't stop at the first response. It iterates.
What it doesn't do: understand what your users actually need. Make brand decisions. Recognize when a technical solution is architecturally wrong for reasons that don't show up in tests. Those are human jobs — and I'll come back to that.
The Nexus Workflow, Step by Step
Here's how a feature actually gets built, from ticket to pull request:
1. Jira ticket arrives — A "Ready for Dev" ticket comes in. This is the trigger. No ticket, no build.
2. Ticket validation — Nexus checks the ticket against a required fields checklist. Missing context? It flags the gap and waits. Garbage in, garbage out is a real risk.
3. Context assembly — The system combines the ticket description with the relevant portion of the codebase, architectural documentation, previous AI learnings from similar tasks, and project-specific coding standards. This context package is what the builder agent actually works with.
4. Builder agent runs — A Claude headless agent instance receives the context and builds the feature. It has access to the codebase, can run code, read outputs, and iterate.
5. Checklist validation — Nexus runs a mechanical checklist over the output: Does it compile? Do existing tests pass? Does the implementation match the spec?
6. Reality validator runs — A separate, fresh agent instance — with no knowledge of what the builder did — reviews the work. Its only job is to be skeptical. It regularly finds edge cases the builder didn't consider.
7. PR created and updated in Jira — If the validator passes the work, a pull request goes up and the Jira ticket moves to "In Review."
8. Learnings captured — Win or fail, the system logs what happened. Patterns get extracted. The next similar ticket benefits from this one.
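The steps above can be sketched as a single pipeline. To be clear about what this is: every function, field name, and return value here is a hypothetical stand-in for a Nexus stage, stubbed so the sketch runs end to end — not the real implementation.

```python
# The workflow above as a pipeline sketch. All names are hypothetical.

REQUIRED_FIELDS = ("summary", "acceptance_criteria", "codebase_area")

def process_ticket(ticket):
    # Steps 1-2: trigger + ticket validation. Missing context blocks the build.
    missing = [f for f in REQUIRED_FIELDS if not ticket.get(f)]
    if missing:
        return {"status": "blocked", "missing": missing}

    # Step 3: context assembly — ticket + codebase + docs + past learnings.
    context = {"ticket": ticket, "learnings": load_learnings(ticket)}

    # Step 4: builder agent runs with that context package.
    output = builder_agent(context)

    # Step 5: mechanical checklist — compile, existing tests, spec match.
    if not mechanical_checks(output):
        return {"status": "failed_checks"}

    # Step 6: reality validator — a fresh agent with no builder context.
    if not reality_validator(output):
        return {"status": "failed_validation"}

    # Steps 7-8: PR goes up, and learnings are captured either way.
    record_learnings(ticket, output)
    return {"status": "in_review", "pr": open_pull_request(output)}

# Toy stubs so the sketch executes.
def load_learnings(ticket): return []
def builder_agent(context): return {"diff": "...", "tests_pass": True}
def mechanical_checks(out): return out["tests_pass"]
def reality_validator(out): return True
def open_pull_request(out): return 42
def record_learnings(ticket, out): pass

print(process_ticket({"summary": "Add export", "acceptance_criteria": "CSV",
                      "codebase_area": "reports"}))
```

Notice that the validator sits on a separate branch from the builder — the design choice is that the skeptic never shares the optimist's context.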
Two things worth highlighting: self-healing and deduplication. If a step fails — a test, an API call, a deployment — the system retries with a correction prompt before escalating. Most failures resolve without human intervention. And before any agent starts work, Nexus cross-references Jira, GitHub, and its internal state to make sure the work hasn't already been done. No retry loops, no duplicate PRs.
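The self-healing side of that can be sketched as retry-with-correction: on failure, the step is re-run with the error fed back in as a correction prompt, and only after the retry budget is exhausted does it escalate to a human. The function names and the correction format are illustrative assumptions, not Nexus's actual API.

```python
# Sketch of self-healing retry: feed the failure back in as a
# correction prompt before escalating. Names are hypothetical.

def run_with_self_healing(step, context, max_retries=2):
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return step(context)
        except Exception as err:
            last_error = err
            # Append the failure as a correction prompt for the next attempt.
            context = dict(context, correction=f"Previous attempt failed: {err}")
    raise RuntimeError(f"Escalating to human review: {last_error}")

# Demo: a step that only succeeds once it sees a correction prompt.
def flaky_step(context):
    if "correction" not in context:
        raise ValueError("missing env var")
    return "done"

print(run_with_self_healing(flaky_step, {}))  # prints done
```

Most failures are transient in exactly this way — a flaky test, a timed-out API call — which is why a cheap retry layer resolves them without waking a human.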
Real-world output from this workflow: Taktik, a production customer support platform built to replace Intercom and Zendesk.
What AI Gets Right, and Where Humans Still Matter
Here's where I want to be direct, because "AI builds all our software" invites a particular kind of skepticism — and that skepticism is fair.
AI agents are genuinely good at a specific set of things: writing well-structured code when given clear requirements and sufficient context, applying consistent coding standards across a large codebase, running iterative loops without fatigue, and catching certain classes of bugs through independent validation. For a solo founder building B2B SaaS, this is transformative. Tasks that would take days compress to hours.
But the validation loop exists precisely because agents make mistakes. The builder hallucinates APIs that don't exist. It misunderstands edge cases. It occasionally produces code that passes tests and still does the wrong thing. The independent validator catches many of those issues. Human review catches the rest.
More importantly: there are entire categories of decisions that don't belong to an agent. What should we build next? Does this feature actually solve the problem customers are frustrated by? Is this the right architecture for where this product is going in two years? Those require judgment, relationship context, and honest conversation — things I write about in more detail in why we built Taktik.
The model isn't "AI does everything." It's AI handles the parts that scale mechanically. Humans handle the parts that require empathy and judgment. That division is intentional.
The Economics: Why This Changes What We Can Charge
Traditional SaaS companies spend 60 to 70 percent of revenue on engineering. That cost is structural — it gets passed directly to customers through pricing tiers, per-seat fees, and features locked behind expensive plans. The economics of headcount-based development almost guarantee it.
Our gross margins on software are close to 100 percent. Not because we're cutting corners on quality, but because the engineering overhead looks nothing like a traditional team. That creates a real choice: where does the delta go?
At most companies, it goes to growth marketing, investor returns, and executive compensation. We've made a different call. As I wrote in Stop Renting Software, the point of near-zero engineering overhead is to change what we can sustainably charge — and who we can afford to invest in.
Concretely, the savings go to two places. First: lower prices. Taktik costs dramatically less than Intercom or Zendesk for the features that most support teams actually use. Second: real people. Support, customer success, the relationships that make software actually useful rather than just theoretically capable.
If you're curious what that looks like in practice, our services page outlines how we apply this model to custom SaaS replacement for clients — not just our own products.
What We've Learned Building This Way (Honest Reflections)
Running this system in production teaches you things no whitepaper will. Here's one that's worth being specific about.
In October 2025, three consecutive pull requests failed. PRs #180, #181, and #175. Same pattern each time: the agent was doing real work, but in the wrong directory. The correct production codebase lived at /Users/adambullied/apps/zeron-feedback-service/. The agent kept working in /Users/adambullied/apps/zeron/feedback-service/ — a path from a deleted, older version of the monorepo. The code was fine. The path was wrong.
The fix wasn't a single patch. We implemented four layers of protection: path validation at configuration time so an invalid path can't be saved in the first place, dashboard warnings that make invalid paths impossible to miss, explicit critical warnings injected into the agent's context at task start, and updated documentation and checklists to catch the pattern before it runs.
That's the lesson in miniature: agentic software development isn't a set-and-forget system. It's a continuous improvement loop. Each incident makes the system more resilient. Each learning becomes part of the context that the next agent builds on. The quality today is a direct function of the failures we've worked through.
Context quality is the other thing I'd highlight. The output of an AI agent is roughly proportional to the quality of context it receives. Clear ticket requirements, well-maintained documentation, and explicit coding standards aren't nice-to-haves in this model — they're load-bearing. A vague ticket produces vague code.
Could This Model Work for Your Team?
The honest answer: it depends on what you're building and how much investment you're willing to make in the orchestration layer.
For Zeron's internal use, the model works because the infrastructure — Nexus, the context-building pipeline, the learning system, the validation loop — is already built and continuously improving. The first few months required real effort to get to a place where the system was more helpful than it was frustrating.
For teams considering something similar: the tooling is more accessible than it was eighteen months ago, but "agentic software development" still requires deliberate architecture. You need to think carefully about what context your agents receive, how validation works, and where human oversight sits in the loop.
For clients who want the output without building the infrastructure: that's what our custom SaaS replacement services are designed for. We apply the same workflow — Nexus, independent validation, continuous learning — to build bespoke software that replaces overpriced enterprise tools. The economics that make our own products affordable apply to client work too.
If any of this resonates — whether you're evaluating a product like Taktik or curious about what a custom build might look like — I'd rather have an honest conversation than a sales call. That's the model here.