Give Your Agent A Notebook, Then A Team
Building agents for complex tasks is hard. What do you do with hard problems? You split them. So splitting your agent into many feels like the natural response. I followed that instinct, but learned that something else matters more when building complex agents.
I believe that the most relevant question to answer early isn’t single agent vs. multi-agent. It’s whether you’re managing your context correctly.
I learned this by building a multi-agent system with specialized agents coordinated by an orchestrator. It consumed six times as many tokens, took 25% longer, and produced results that were no better, sometimes slightly worse, than what a single agent could do. Splitting the work made my problem even harder to solve. The reason was, of course, context management.
Why multi-agent can be seductive
Dismantling the multi-agent system I had built, I realized that the specialized agents weren’t bringing unique capabilities. Everything they could do, a single agent could do by activating the right skill at the right moment. The orchestrator wasn’t coordinating genuine specialists. It was routing between prompt configurations and paying a coordination tax for the privilege.
So why did it feel like the right architecture? Because multi-agent gives you a context reset by design. When you hand a task to a sub-agent, it starts with a clean context window. That’s appealing when your primary agent’s context is cluttered and you want separation of concerns quickly. But a context reset is a blunt instrument. You don’t control what gets kept and what gets dropped. You’re just starting over and hoping the handoff message contains everything the sub-agent needs.
My focus shifted from the actual task logic to figuring out how to make agents pass the right information to each other. I was playing telephone with my agents. Although that was entertaining, I wondered whether it was the best investment of my time.
If the context reset is the main thing drawing you to multi-agent, you should ask whether you can get that more cheaply and with more control. Because the reset comes with real costs.
Context reset doesn’t come for free
Multi-agent introduces coordination problems that simply don’t exist in a single-agent system. Agents need to agree on shared state. Dependencies between tasks create sequencing constraints. Errors amplify across agent boundaries.
When a single agent reasons through a problem, you can read its chain of thought. When something goes wrong, the agent can trace back through the steps and find where the reasoning broke down. This is especially useful early in development.
Multi-agent makes this much more complicated. When an orchestrator delegates to a sub-agent, that sub-agent’s reasoning chain lives in a separate context. The orchestrator receives an output, not a process. If the output looks off, it’s difficult to understand why. You’re in even more trouble if the output looks plausible but the logic was off.
With a single agent, the agent has its own full history. It doesn’t need a handoff protocol. The “coordination” is the model attending to its own context, which is a whole challenge in its own right. When designing complex agentic systems, I have found it to be very helpful to free myself from the coordination problem in the early stages of development.
Admittedly, a single agent working through a complex task will accumulate context. Its window fills up, and irrelevant information competes with relevant information for attention. But these issues do not outweigh the challenges that come with multi-agent systems. So let’s fix context for the single agent first.
Focus on context hygiene instead
This isn’t a new idea. The importance of context management is well-established in the community. But multi-agent distracts from it by appearing to solve the problem (you get a context reset at every delegation boundary) while solving it poorly and uncontrollably. You don’t choose what gets reset. You’re outsourcing a design problem to an architectural pattern and hoping it works out. Instead, solve it directly. I think of this as giving your agent a notebook.
The notebook
Your agent should be able to think out loud, offload what it doesn’t need right now, and retrieve it when it does. Three moves make up the notebook: decide what to keep, write things down, and retrieve deliberately.
Decide what to keep
Not everything the agent produces belongs in active context. Tool call results, intermediate reasoning, previously executed tasks: most of this is useful once and then becomes noise. The agent should summarize what it learned from a step before moving on, rather than carrying the full output forward. Think of it like your meetings: a transcript is rarely helpful, but your notes most likely are.
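Here is a minimal sketch of that move, under a few assumptions. The names `ActiveContext`, `record_step`, and `first_line_summary` are hypothetical, and the summarizer is a trivial stand-in; in a real system you would probably ask the model itself to write the note. The point is only the shape: the full output goes into an archive, and a short note is all that stays in context.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ActiveContext:
    """Keeps short notes in context; full outputs live in an archive."""

    notes: list[str] = field(default_factory=list)
    archive: dict[str, str] = field(default_factory=dict)

    def record_step(
        self, step_id: str, raw_output: str, summarize: Callable[[str], str]
    ) -> None:
        # Store the transcript out of sight, keep only the note in view.
        self.archive[step_id] = raw_output
        self.notes.append(f"[{step_id}] {summarize(raw_output)}")

    def render(self) -> str:
        """Text that would be placed into the prompt for the next step."""
        return "\n".join(self.notes)


def first_line_summary(raw: str, max_chars: int = 80) -> str:
    """Stand-in summarizer (assumption): keep the first line, truncated."""
    stripped = raw.strip()
    first = stripped.splitlines()[0] if stripped else ""
    return first[:max_chars]


ctx = ActiveContext()
ctx.record_step(
    "search-1",
    "Found 3 matching files\n" + "verbose tool output " * 200,
    first_line_summary,
)
print(ctx.render())
```

The context now carries one line for that step, while the verbose output is still available in `ctx.archive` if it is ever needed again.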
Write things down
Instruct the agent to maintain a working log and a task list. The log captures what was considered, what was chosen, and why. The task list tracks what’s done, what’s next, and what’s blocked.
The log gives the agent a clean and structured record it can reference when it needs to. It stores reasoning summaries and decision traces. It helps the agent while it’s working on the task but, maybe even more importantly, it is gold for evaluating your agent afterwards, to create a learning loop from past executions. The task list helps the agent maintain a consistent state. In my experience, the best implementation is to persist this outside the context entirely and load a fresh snapshot at each iteration. The agent sees the current state of the world, not the full history of how it got there.
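To make this concrete, here is a rough sketch of a file-backed notebook. Everything in it is an assumption for illustration: the `Notebook` class, the JSON layout, the three task statuses, and the snapshot format are mine, not a standard. What matters is that state is written to disk on every change and that `snapshot()` rebuilds a compact view from disk each time, instead of replaying history.

```python
import json
from pathlib import Path

VALID_STATUSES = ("todo", "done", "blocked")


class Notebook:
    """File-backed working log and task list (hypothetical helper)."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        if not self.path.exists():
            self._save({"log": [], "tasks": {}})

    def _load(self) -> dict:
        return json.loads(self.path.read_text(encoding="utf-8"))

    def _save(self, state: dict) -> None:
        self.path.write_text(json.dumps(state, indent=2), encoding="utf-8")

    def log_decision(self, considered: list[str], chosen: str, why: str) -> None:
        """Append what was considered, what was chosen, and why."""
        state = self._load()
        state["log"].append({"considered": considered, "chosen": chosen, "why": why})
        self._save(state)

    def set_task(self, name: str, status: str) -> None:
        """Create or update a task with one of the allowed statuses."""
        if status not in VALID_STATUSES:
            raise ValueError(f"status must be one of {VALID_STATUSES}")
        state = self._load()
        state["tasks"][name] = status
        self._save(state)

    def snapshot(self, recent_decisions: int = 3) -> str:
        """Current state only: tasks by status plus the latest decisions."""
        state = self._load()
        lines = ["TASKS"]
        for status in VALID_STATUSES:
            names = [n for n, s in state["tasks"].items() if s == status]
            lines.append(f"  {status}: {', '.join(names) or '-'}")
        lines.append("RECENT DECISIONS")
        recent = state["log"][-recent_decisions:] if recent_decisions > 0 else []
        for entry in recent:
            lines.append(f"  chose {entry['chosen']} because {entry['why']}")
        return "\n".join(lines)
```

At the start of each iteration you would call `snapshot()` and put the result into the prompt. The full log stays on disk, where it is also available for evaluation after the run.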
Retrieve deliberately
Instead of everything being in context and hoping the model attends to the right parts, nothing is in context unless the agent actively pulls it back in. Passive accumulation becomes active retrieval. Give the agent a skill that lets it look things up from its own earlier work. Describe what’s available and when it might be useful. Let the agent decide when to reach for it. The notebook only works if the agent knows it can flip back through the pages.
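A lookup skill along these lines could be sketched as follows. The names (`NotebookIndex`, `LOOKUP_SKILL`, `notebook_lookup`) and the naive keyword scoring are assumptions for illustration; a real system might use embeddings or full-text search instead. Only the table of contents is shown to the agent by default, and the bodies come back when the agent asks for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    key: str
    summary: str
    body: str


class NotebookIndex:
    """Earlier work, searchable on demand (hypothetical helper)."""

    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def add(self, key: str, summary: str, body: str) -> None:
        self._entries[key] = Entry(key, summary, body)

    def table_of_contents(self) -> str:
        """Short listing for the prompt: keys and summaries, no bodies."""
        return "\n".join(f"- {e.key}: {e.summary}" for e in self._entries.values())

    def lookup(self, query: str, limit: int = 3) -> list[Entry]:
        """Rank entries by how many query terms they contain (naive scoring)."""
        terms = query.lower().split()
        scored = []
        for entry in self._entries.values():
            haystack = f"{entry.key} {entry.summary} {entry.body}".lower()
            score = sum(term in haystack for term in terms)
            if score:
                scored.append((score, entry))
        # Stable sort: ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for _, entry in scored[:limit]]


# Description handed to the agent so it knows the skill exists (assumed format).
LOOKUP_SKILL = {
    "name": "notebook_lookup",
    "description": (
        "Search your own earlier notes and tool outputs by keyword. "
        "Use it when you need details from a previous step."
    ),
}
```

The description matters as much as the function: it is how the agent learns what is available and when reaching for it makes sense.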
When multi-agent genuinely earns its place
There are, without doubt, things a single agent cannot do, no matter how good its context hygiene, most notably working on multiple tasks at the same time. If your task is decomposable, you might finally have an argument for a multi-agent system. There is research suggesting that multi-agent systems outperform single agents when dependencies between tasks are few. Multiple agents might also shine when the context scope required differs strongly between tasks. Think of the planning agent inside Claude Code: planning requires a very broad scope, while writing code usually happens within clearly defined boundaries inside your codebase.
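For the decomposable case, the fan-out can be sketched like this. It is an illustration under assumptions: `run_independent` is a hypothetical helper, `worker` stands in for whatever calls a sub-agent, and the subtasks are assumed to have no dependencies on each other. Each worker receives only its own brief, which is what gives it a narrow context scope.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_independent(
    subtasks: dict[str, str],
    worker: Callable[[str], str],
    max_workers: int = 4,
) -> dict[str, str]:
    """Run dependency-free subtasks in parallel and collect results by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker sees only its own brief, not the other subtasks.
        futures = {name: pool.submit(worker, brief) for name, brief in subtasks.items()}
        # result() blocks until that subtask finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}


def fake_agent(brief: str) -> str:
    """Placeholder for a real sub-agent call (assumption)."""
    return f"done: {brief}"


results = run_independent(
    {"docs": "summarize the README", "tests": "list failing tests"},
    fake_agent,
)
print(results)
```

As soon as one subtask needs the output of another, this simple shape stops working and the coordination costs described earlier return.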
Regardless of these features, single agent vs. multi-agent is the wrong first question. It looks like an architectural decision that will give your system structure and clarity. In practice, it’s a distraction from the thing that actually determines whether your system works: how you manage context.
Get context hygiene right and save the architecture question for later. Start by giving your agent a notebook.