AI / ENGINEERING

Async Agents Are a Game Changer (Part 1)

November 4, 2025 · 7 min read · By Silas Rhyneer

I once accidentally spawned 60 Claude agents at the same time.

My machine started grinding to a halt. Each agent was spinning up its own exploration subagents, which were spinning up their own subagents—exponentially multiplying like some kind of AI hydra. I needed to grep and kill every Claude process, but I couldn't remember the exact command. So I opened a new terminal tab and asked Claude for help.

Except Claude was now slow as hell because my entire system was choking on agent processes. I sat there watching my computer crawl, genuinely worried I was about to cook my machine and burn through my API credits in one catastrophic cascade.

I did eventually kill them all. And then I kept building.

*[Image: agent chaos visualization]*

Because despite the chaos, I'd stumbled onto something that fundamentally changed how I work: async agents unlock long-running autonomous orchestration: the ability to hand off complex, multi-step workflows and walk away for 15+ minutes while the system works, without drifting into chaos.

"Async agents enable sustained trust: hand off work and enter flow state elsewhere for 15+ minutes."

Flow state requires this sustained trust.

The Blocking Problem

Before async agents, I was doing batch parallelism. The workflow went like this:

  1. Create a comprehensive plan with tasks organized into batches
  2. Each batch contains independent tasks that can run in parallel
  3. Instruct Claude to spawn agents for all tasks in batch 1 simultaneously
  4. Wait for entire batch to complete
  5. Move to batch 2, repeat
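
The blocking behavior of this loop is easy to see in a sketch (hypothetical Python; `run_agent` is a stand-in for spawning a real agent, not the actual tooling):

```python
# Batch parallelism: every task in a batch runs concurrently, but the
# next batch cannot start until the whole current batch has returned.
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    # Stand-in for spawning and awaiting a real agent.
    return f"done: {task}"

def run_batches(batches: list[list[str]]) -> list[str]:
    results: list[str] = []
    with ThreadPoolExecutor() as pool:
        for batch in batches:
            # pool.map blocks here until the slowest task in the batch finishes.
            results.extend(pool.map(run_agent, batch))
    return results
```

Everything inside a batch overlaps, but the loop itself is serial: one straggler in batch 1 stalls all of batch 2.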

Claude spawns multiple agents in parallel as part of the same group of function calls—this is where the real compute advantage comes from. More agents running simultaneously means more work done faster.

But it's still fundamentally blocking.

You can't start batch 2 until batch 1 finishes, even if most of batch 1 is done and just waiting on one slow agent. The workflow waits. Claude waits. I wait.

The entire orchestration becomes as slow as the slowest agent in each batch—and I can't really do anything else because I'm context-locked to this task, ready to jump in when the next batch starts.

This is the fundamental constraint of synchronous orchestration: it demands continuous attention even when no attention is needed.

Enter Async

Async agents solve this by making "waiting" productive.

Instead of blocking on agent completion, Claude spawns agents and actively monitors their progress through a combination of per-agent progress files and blocking and non-blocking monitors.

The orchestrator agent stays active, making decisions, spawning new agents, handling blockers—all while other agents continue working in the background.

This unlocks something critical: I can walk away.

Not just for 30 seconds. For 15+ minutes. Because the orchestrator is handling the complexity, the dependency management, the error recovery. It doesn't need me sitting there ready to respond to every batch completion.
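
A minimal sketch of a non-blocking monitor, under assumed design details (the `agent-responses/{agent_id}.md` convention comes from my setup; the `DONE` sentinel here is a hypothetical stand-in for however an agent signals completion):

```python
# Poll progress files instead of blocking on any single agent.
import time
from pathlib import Path

RESPONSES = Path("agent-responses")

def poll_agents(agent_ids: list[str], interval: float = 5.0) -> dict[str, str]:
    """Return each agent's output once every agent has reported DONE."""
    done: dict[str, str] = {}
    while len(done) < len(agent_ids):
        for agent_id in agent_ids:
            if agent_id in done:
                continue
            report = RESPONSES / f"{agent_id}.md"
            if report.exists():
                text = report.read_text()
                if text.rstrip().endswith("DONE"):
                    done[agent_id] = text
                # The orchestrator can act on partial progress here:
                # spawn a fixer agent, reprioritize, handle a blocker.
        time.sleep(interval)
    return done
```

The point is the shape, not the code: between polls, the orchestrator is free to make decisions rather than sitting inside a blocking call.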

The Trust Threshold

Here's what changed: I can now delegate a complex refactor—say, fixing 15 type errors across 8 files—and trust that when I come back, the work is done correctly.

The orchestrator spawned specialized agents. It monitored their progress. When one hit a blocker, it analyzed the issue and spawned a fixer agent. When the build failed, it read the errors, identified the root cause, and delegated the fix.

All without me.

That's the trust threshold: can I enter flow state on something else and return to completed work?

With async orchestration, the answer is yes.

Implementation Reality

Building this wasn't about clever prompts or fancy tooling. It was about creating the right primitives:

  1. Agent outputs as files: Every agent writes its progress to agent-responses/{agent_id}.md
  2. Blocking and non-blocking monitors: await for dependencies, sleep + Read for progress checks
  3. Active orchestrator role: Never spawns agents and exits—stays engaged until work completes
  4. Build validation: Always runs the build after implementation, delegates fixes if needed
  5. Error recovery protocol: One mitigation attempt per blocker before escalating to user
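
A rough composition of primitives 2, 4, and 5 (hypothetical code, not the real orchestrator; `spawn_fixer_agent` is a stub standing in for agent delegation):

```python
import subprocess
import time
from pathlib import Path

def spawn_fixer_agent(errors: str) -> None:
    # Stub: in the real system this would delegate the errors to a fixer agent.
    pass

def wait_for(agent_id: str, timeout: float = 900.0) -> str:
    """Blocking monitor: wait until a dependency agent's report file appears."""
    report = Path("agent-responses") / f"{agent_id}.md"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if report.exists():
            return report.read_text()
        time.sleep(2.0)
    raise TimeoutError(f"{agent_id} did not report within {timeout}s")

def validate_build(build_cmd: list[str]) -> bool:
    """Build validation with one mitigation attempt before escalating to the user."""
    result = subprocess.run(build_cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return True
    spawn_fixer_agent(result.stderr)  # one mitigation attempt per blocker
    return subprocess.run(build_cmd, capture_output=True, text=True).returncode == 0
```

`wait_for` is the blocking monitor (for hard dependencies); the `sleep + Read` progress checks are its non-blocking cousin.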

These aren't sophisticated. They're boring primitives that compose into surprisingly powerful orchestration.

What This Unlocks

I'm now running workflows that would have been impossible with synchronous batch parallelism: multi-file refactors, build-and-fix loops, automatic error recovery.

All while I'm in flow state on something completely different.

The system doesn't demand my attention every 3 minutes. It works autonomously, fails gracefully, and only interrupts when it genuinely needs human judgment.

The Catch

This only works because I'm willing to accept that it might break things.

I'm in pre-production mode—building fast, iterating quickly, knowing that a broken build is fixable. The orchestrator doesn't have perfect judgment. Sometimes it makes the wrong call. Sometimes it fixes one error and introduces another.

But that's the trade-off: velocity over perfection.

And honestly? The error rate is lower than I expected. Most runs complete successfully. When they fail, they fail with clear context about what went wrong and why.

What's Next

I'm going to keep building on this foundation. There's a lot more to explore.

But the core insight stands: async agents enable sustained autonomous work at a scale that fundamentally changes the human-AI collaboration model.

From "I'm working with AI" to "AI is working while I'm doing something else."

That's the game changer.