CLOUDYETI · FIELD NOTES
Claude Code · Dynamic Workflows

Subagents vs Workflows: who holds the plan?

Everyone fixates on the headline number — up to a thousand agents. But the real difference isn't scale. It's where the plan lives. Get that, and everything else falls out of it.

↳ 6 MIN READ ↳ UPDATED MAY 2026 ↳ FOR THE LOCAL-AI CROWD
01 — The actual distinction

Subagents, skills, and workflows can all run a multi-step task. The split is who decides what runs next.

With subagents, Claude is the orchestrator. It decides turn by turn what to spawn, and every result lands back in its context window. The plan lives in Claude's head — durable only as long as the conversation.

With a workflow, the plan lives in a JavaScript script that a separate runtime executes. The loop, the branching, and every intermediate result sit in script variables — not in Claude's context. Claude only ever sees the final answer.

A workflow moves the plan into code. That one shift is what makes everything else possible.

The thousand-agent ceiling isn't an arbitrary limit — it's a consequence. Two hundred results from two hundred subagents would blow up a context window. In a workflow those results never touch it, so you can keep fanning out.
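
To make that concrete, here is a minimal sketch of the shape such a script might take. It is not the real workflow runtime API: `runAgent` is a hypothetical stand-in that fakes a worker, and `auditFiles` is an invented example task. The point is only where the data lives. Two hundred bulky results sit in a local array, and a small summary object is all that leaves the function.

```javascript
// Hypothetical stand-in for whatever call the workflow runtime exposes to
// launch one agent. Here it only fakes a bulky intermediate result.
async function runAgent(task) {
  return {
    task,
    notes: `long findings for ${task} `.repeat(50), // pretend this is large
    issues: task.length % 3,                        // fake issue count
  };
}

// Invented example task: audit many files, return one compact answer.
async function auditFiles(files) {
  // Fan out: one agent per file. Every result lands in this local array,
  // a plain script variable, not in the orchestrating model's context.
  const results = await Promise.all(files.map((f) => runAgent(`audit ${f}`)));

  // Reduce in code. Only this small object is handed back.
  const totalIssues = results.reduce((sum, r) => sum + r.issues, 0);
  return { filesChecked: results.length, totalIssues };
}

// Usage: 200 workers, one short summary.
const files = Array.from({ length: 200 }, (_, i) => `src/file${i}.js`);
auditFiles(files).then((summary) => console.log(summary));
```

However large `results` grows, the caller only ever receives the two-field summary.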

[Figure: fan-out from one orchestrator to many workers. In a workflow, results stay in script variables, never in Claude's context.]
02 — What you actually get

Same job, different machine.

Subagents

Claude holds the plan
  • Live delegation. Claude steers turn by turn — great when you want to course-correct mid-task.
  • Results fill context. Everything a worker finds comes back into Claude's window.
  • Reusable: the worker. You can save the worker definition, not the overall plan.
  • Interruption = restart. Stopping a turn means re-running it.
  • Scale: a few per turn. Bounded by what one conversation can coordinate.

Workflows

A script holds the plan
  • Codified orchestration. The loop and branching live in a script you can read and rerun.
  • Context stays clean. Intermediate results live in variables; only the final answer returns.
  • Reusable: the whole pipeline. Save the run as a /command you fire on every branch.
  • Resumable mid-run. Completed agents return cached results; the rest run live.
  • Scale: dozens to hundreds, up to a ceiling of 1,000 agents per run, all in the background.
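
The resumable behaviour can be pictured as a cache in front of each agent call. The sketch below is an assumption about the general shape, not the runtime's actual mechanism: `makeResumableRunner`, the `Map` cache, and the task ids are all hypothetical.

```javascript
// Hypothetical sketch: wrap an agent call so finished tasks are never re-run.
// A real runtime would persist the cache; a Map stands in for that here.
function makeResumableRunner(runAgent, cache = new Map()) {
  return async function run(taskId) {
    if (cache.has(taskId)) return cache.get(taskId); // completed earlier: reuse
    const result = await runAgent(taskId);           // not done yet: run live
    cache.set(taskId, result);                       // remember for next resume
    return result;
  };
}

// Usage: pretend "lint" finished before the run was interrupted.
const finished = new Map([["lint", "lint: ok"]]);
let liveCalls = 0;
const run = makeResumableRunner(async (id) => {
  liveCalls += 1;
  return `${id}: ok`;
}, finished);

Promise.all(["lint", "tests", "docs"].map((id) => run(id))).then((results) => {
  console.log(results, `live calls: ${liveCalls}`); // only 2 ran live
});
```

Contrast that with a subagent turn, where stopping means the whole turn is re-run.
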
The sleeper feature: it can enforce a quality pattern, not just more agents.

Because the script controls flow, it can have independent agents adversarially review each other's findings before anything is reported, or draft a plan from several angles and weigh them against each other. That's a structural guarantee baked into code — not "hope Claude remembers to double-check."
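
A sketch of that guarantee, under loud assumptions: the reviewers below are plain functions standing in for independent agents, and `reportReviewed` is a hypothetical name. What matters is the control flow. A finding cannot reach the report unless every reviewer signs off, because the script, not the model, decides what is returned.

```javascript
// Hypothetical: each reviewer stands in for an independent agent that tries
// to refute a finding and resolves to true only if the finding holds up.
async function reportReviewed(findings, reviewers) {
  const report = [];
  for (const finding of findings) {
    const verdicts = await Promise.all(reviewers.map((review) => review(finding)));
    // The gate is code: one rejection keeps the finding out of the report.
    if (verdicts.every((ok) => ok === true)) report.push(finding);
  }
  return report;
}

// Usage with toy reviewers (real ones would be agent calls).
const hasEvidence = async (f) => f.evidence.length > 0;
const isReproducible = async (f) => f.reproduced;
reportReviewed(
  [
    { id: "sql-injection", evidence: ["query.js:41"], reproduced: true },
    { id: "race-condition", evidence: [], reproduced: false },
  ],
  [hasEvidence, isReproducible]
).then((report) => console.log(report.map((f) => f.id))); // [ 'sql-injection' ]
```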

03 — At a glance

Three ways to read the same trade-off.

Where results live
  • Subagents: context window
  • Skills: context window
  • Workflows: script variables

What's repeatable
  • Subagents: the worker
  • Skills: the instructions
  • Workflows: the orchestration

On interruption
  • Subagents: restarts
  • Skills: restarts
  • Workflows: resumable

04 — "Wait, is /deep-research new?"

A bundled deep-research command is new. The idea isn't.

The confusion is fair — "deep research" has lived in a few forms. Only one of them ships inside Claude Code with nothing to install.

Earlier — the web app
The Research toggle in claude.ai
Powerful, but browser-only. No terminal command, no API endpoint — people resorted to driving Chrome from Claude Code to reach it.
Ongoing — community builds
DIY skills, plugins & MCP servers
Plenty of third-party /deep-research commands built on Exa, Firecrawl, or Semantic Scholar. They are things you bolt on, not something Anthropic shipped.
Now — first-party
Bundled /deep-research, built on workflows
Ships with Claude Code. Fans out searches across angles, cross-checks the sources, votes on each claim, and returns a cited report with the claims that didn't survive already filtered out.
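
These notes don't say how the bundled command tallies its votes, so treat the following as one plausible shape rather than the shipped logic. `filterByVote`, the vote arrays, and the strict-majority threshold are all assumptions made for illustration.

```javascript
// Hypothetical sketch of "vote on each claim": keep a claim only when a
// strict majority of independent checkers supports it.
function filterByVote(claims) {
  return claims
    .filter((c) => {
      const yes = c.votes.filter(Boolean).length;
      return yes * 2 > c.votes.length; // strict majority; ties are dropped
    })
    .map((c) => ({ text: c.text, sources: c.sources })); // cited survivors only
}

// Usage: each vote is one checker's verdict after cross-checking sources.
const report = filterByVote([
  { text: "Claim A", sources: ["source-1", "source-2"], votes: [true, true, false] },
  { text: "Claim B", sources: ["source-3"], votes: [true, false, false] },
  { text: "Claim C", sources: ["source-4"], votes: [true, false] },
]);
console.log(report.map((c) => c.text)); // [ 'Claim A' ]
```

The claims that lose the vote never appear in the output, which is the "already filtered out" behaviour described above.
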
05 — Does it cost extra?

It's in your subscription — but it eats the budget faster.

Separate charge?
No premium tier.
Runs count toward your plan's usage and rate limits like any other session. There's no "workflows tier" you pay extra for.
The catch
More tokens, faster.
Fanning out to many agents means a single run can burn meaningfully more tokens — and once you pass your limit, overage bills at standard API rates.

⌁ Where the real cost control lives

  • /model. Check it before a large run: a 20-agent fan-out on Opus is a very different bill than the same run on Sonnet.
  • Route stages. Ask Claude to send the grunt stages to a smaller model and reserve the strong one for the parts that need it.
  • /workflows. Stop a runaway or mis-scoped run any time without losing the work already completed.
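
The arithmetic behind those levers is simple enough to sketch. The rates below are made-up placeholders, not real prices, and `estimateCost`, the stage names, and the token counts are hypothetical. Plug in your own numbers; the shape of the comparison is the point.

```javascript
// PLACEHOLDER rates in dollars per million tokens. Not real pricing.
const RATE_PER_MTOK = { strong: 10, small: 1 };

// Each stage: how many agents, rough tokens per agent, and which model tier.
function estimateCost(stages) {
  return stages.reduce((total, s) => {
    const millions = (s.agents * s.tokensPerAgent) / 1e6;
    return total + millions * RATE_PER_MTOK[s.model];
  }, 0);
}

// A 20-agent fan-out, everything on the strong model.
const allStrong = estimateCost([
  { name: "everything", agents: 20, tokensPerAgent: 250000, model: "strong" },
]);

// Same 20 agents, grunt work routed to the small model.
const routed = estimateCost([
  { name: "search", agents: 16, tokensPerAgent: 250000, model: "small" },
  { name: "synthesis", agents: 4, tokensPerAgent: 250000, model: "strong" },
]);

console.log({ allStrong, routed }); // { allStrong: 50, routed: 14 }
```

With these placeholder rates, routing the 16 search agents to the small tier cuts the estimate from 50 to 14 without changing the agent count.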

Tokens are payroll — so a workflow is less a new line item and more a question of how many "employees" you put on the problem, and at what pay grade.

06 — So which do I reach for?

Steer live, or commit to code.

Reach for subagents

When you want to course-correct turn by turn, the path isn't fully known up front, or the task is small enough that one conversation can hold it. Flexibility over repeatability.

Reach for a workflow

When the plan is clear enough to commit to code, the scale would flood a context window, you'll rerun it (audits, migrations, reviews on every branch), or a wrong answer is costly enough to want adversarial cross-checks.
