One Main Thread

The thing about giving an AI agent too many tools is that it starts to feel like watching someone work with fifty browser tabs open. Sure, they have access to everything, but do they remember what they're doing? Are they making consistent decisions? Or are they just... overwhelmed?

I've been thinking about this problem a lot lately, because I've been building an assistant that actually feels like it's keeping up with me. Not in that "please wait while I process your request" way, but in the way a good colleague keeps up — quick responses, parallel work, no dropped threads. The trick turned out to be counterintuitive: give the main agent fewer tools, not more.

The Problem With Kitchen Sink Agents

When you give one agent access to everything — file operations, browser control, shell commands, git operations, host filesystem access, todos, history queries, and god knows what else — something breaks. Not technically. The agent still functions. But it starts making weird decisions.

It's decision fatigue at scale. Every turn of the conversation, the agent is choosing between forty different tools. Should it read that file now or later? Should it run this command or ask a clarifying question? The context window fills up with tool definitions and previous calls, and somewhere in that noise, the conversational thread gets muddy.

Worse, you lose consistency. One moment the agent is chatty and helpful, the next it's terse and mechanical because it's trying to juggle twelve different capabilities at once. It's like talking to someone who's also doing their taxes.

The Orchestrator Pattern (But Make It Feel Real)

The solution is old as management itself: delegate. Keep one main agent that handles the conversation and makes the decisions, and spin up worker agents for the heavy lifting. Orchestrator-worker, conductor-musician, manager-IC — pick your metaphor.

But here's where it gets interesting. Most implementations of this pattern feel like what they are: one agent explicitly spawning subprocesses. The main agent says "I am now delegating this task to a subprocess" and then sits there waiting, or worse, tells you it's waiting. That's not multitasking. That's just... tasking, but with extra steps and more self-awareness than necessary.

The insight we landed on was simpler and weirder: what if the main agent didn't really know it was orchestrating? What if, from its perspective, it was just genuinely multitasking?

It works like this: the main agent has a lean tool set — the conversational essentials, basic file operations, the ability to spawn subagents. When it encounters something heavy (research, complex analysis, parallel work), it uses that spawn tool and moves on. Immediately. It doesn't wait. It doesn't announce that it's delegating. It just... continues the conversation.

The subagent spins up in its own context, does the work, and reports back through the todo system. The main agent picks up the results later, when it's ready, without ever having blocked on them. From the main agent's perspective, it started several things and they all finished. True parallelism, subjectively experienced.

What This Feels Like In Practice

The first time you experience this, it's a little disorienting. You ask the agent to do three different things — say, analyze a codebase, research a technical concept, and draft a blog post. The agent says something like "starting those now" and then just... keeps talking to you. Responsive, present, tracking the conversation.

Five minutes later, you get a notification. The research is done. Then the analysis. Then the blog post. The agent picks up each result, incorporates it into the ongoing conversation, and moves forward. It never felt blocked. You never felt like you were waiting.

This is what I mean by the main agent not needing to know it's orchestrating. It's not thinking "I am a manager delegating tasks." It's thinking "I'm working on several things at once," which is closer to how human multitasking actually feels. You don't mentally fork processes. You just... start things and switch between them.

The Management Metaphor (Because It's Too Apt)

Good managers don't micromanage. They delegate, stay unblocked, and maintain context on multiple parallel workstreams without getting buried in any single one. They keep the meta-thread running: the relationships, the priorities, the decisions.

Bad managers either try to do everything themselves (and become a bottleneck) or delegate so thoroughly that they lose the thread (and become useless).

The main agent is a good manager. It holds the relationship with the user, makes judgment calls, maintains conversational continuity. It delegates the grunt work — the research, the computation, the parallel analysis. It doesn't try to do everything, but it never loses the plot.

The subagents are good ICs. They get clear tasks, have the tools to execute them, and report back with results. They don't need conversational finesse or decision-making authority. They just need to be thorough and reliable.

What Stays, What Goes

So what do you keep in the main agent's tool set, and what do you delegate?

Keep: anything that's part of the conversational flow. Reading files the user just mentioned. Writing quick notes. Basic git operations. Asking clarifying questions. Managing todos. Spawning subagents. The stuff that makes the conversation feel continuous and responsive.

Delegate: anything that takes more than a few seconds or could happen in parallel. Deep research. Complex analysis. Multiple file operations across a directory tree. Browser automation sequences. Long-running computations. Anything where you'd rather the agent stay responsive than block on completion.

The boundary isn't about capability, it's about latency and attention. If doing the thing would make the main agent less present in the conversation, delegate it.

The Todo Backflow

The other piece that makes this work is how results come back. Subagents don't interrupt the main agent mid-conversation. They drop their results into the todo system with a full debrief. The main agent picks them up when there's a natural break — when the user asks a follow-up, when the conversation pauses, when it's checking in on parallel work.

This is crucial. It means the main agent is never context-switching involuntarily. It's always pulling, never being pushed. It maintains control over its attention, which is what lets it stay coherent and conversational even while managing multiple workstreams.

In practice, this feels like having a really good team. You ask for several things, the team goes off and works, and they ping you when they're done. You're never blocked, never interrupted at a bad time, never losing track of who's doing what.

Why This Matters

There's something satisfying about building systems that mirror how good human collaboration works. Not because anthropomorphizing is always useful, but because sometimes the patterns we've evolved for managing complex coordination are actually pretty good.

Single-threaded agents with too many tools feel like being a team of one, trying to do everything yourself. You can do it, but you're constantly task-switching, losing context, making suboptimal decisions because you're spread too thin.

Orchestrator-worker with explicit delegation feels like being a micromanager. You're technically distributing the work, but you're so aware of the process that you're still mentally carrying all of it.

Orchestrator-worker where the main agent experiences it as true multitasking feels like being a good manager with a good team. You're holding the important thread — the relationships, the decisions, the continuity — while the actual work happens in parallel, almost invisibly. You're never blocked, never overwhelmed, never losing the plot.

And that's what you want from an assistant that's supposed to keep up with you. Not one that can do everything, but one that feels like it's actually there, in the conversation, while somehow also getting a bunch of stuff done in the background.

One main thread, many parallel workers, and a system that makes it feel natural. It's a small architectural choice, but it changes the entire experience of what it means to work with an agent that doesn't just execute tasks, but genuinely assists.