Building an AI Agent from Scratch

I work at Obvious.ai. We build agents for a living. Until last weekend, I had never actually built one myself.

I'd been upstream of the thing for months: reviewing architectures, asking about tool schemas, thinking about agent UX. You can do all of that without writing a tool loop yourself. It just starts to feel hollow after a while.

So: a Saturday morning, Cursor open, an Anthropic API key, and no plan beyond "make something that works."

By Sunday it was running. About 800 lines of TypeScript on Bun. I named it Kit.

The model part was easy

The tool loop is a while (response.stop_reason === "tool_use") and some message bookkeeping. Claude with clear tool schemas does what you'd expect. I spent maybe two hours on the core loop and it mostly just worked. That's not false modesty, it's genuinely the least interesting part of the project.

What I spent most of my time on was everything the user touches.

Sandbox and approval

Kit runs shell commands in Docker. The workspace mounts at /workspace, the agent gets its own filesystem, and I don't worry about it destroying anything on my machine. Wiring that up was the first real piece of work: persistent containers per session, network access controls, running as root inside the container because package managers are picky about permissions.

The harder design problem was the approval flow. When Kit wants to run a command, should it just go? Should it ask me first? I landed on a confirmation prompt by default, with an auto-approve toggle in settings for when I trust the task. Getting that to feel right in a terminal UI took more iteration than the entire tool loop.

Most of the work was UI

This is the part that surprised me. I built the agent in a few hours. I spent the rest of the weekend on the Ink TUI.

How tool calls display in the conversation. How much of a long command output to show before truncating. A status bar that tells you what the agent is doing without being noisy. Keyboard shortcuts that make sense. Making the input box not drop characters when you type fast (that one took longer than I'd like to admit).

I kept thinking I'd get back to the "real" agent work, the model stuff, the prompt engineering. But the UI kept being the thing that mattered. Every improvement to how Kit presented its work made the whole system feel more capable, even when the underlying agent hadn't changed at all. A better display of tool results made me trust it more. A cleaner input flow made me use it more.

The agent is the loop plus the model. But the product is the interface. I think most people building agents underinvest in that second part.

Where it stands

Kit runs in my terminal. It has a Docker sandbox, an Ink TUI, and a growing set of tools. The conversation history gets summarized periodically so the context window stays manageable. It does real work.

I don't know yet what it'll become. But the gap between "I work on agents" and "I built an agent" turned out to be about a weekend, and most of that weekend was spent on fonts, layouts, and keyboard shortcuts.