LoopFlow, the tutorial
LoopFlow is a small natural-language DSL for loop engineering. A .loop file describes a self-correcting, human-gated coding workflow — its objective, the context it may read, the actions it may take, how it verifies itself, and when it stops. This page teaches the whole language from the first line to full A-to-Z pipelines, every section grounded in a real example you can run.
# run in chat: /loopflow run rate-limit.loop
# (the agent knows LoopFlow from AGENTS.md — your repo's memory)
git:
work on a branch
commit when the goal is met
open a pull request
loop "add API rate limiting":
goal: requests are rate-limited per API key
done when "pnpm test rate-limit" passes
look at: the API and its middleware, and the last failure
allow edits automatically, but ask me before pushes
each cycle: plan, then act, then observe
also: run a security check
when it fails: reflect, then plan again
after 6 tries: stop and warn "thrashing"
Left: a dozen messages, and it still pushed to main. Right: the whole job, written once — it runs, verifies, and stops only when it's really done.
npx @loop-lang/loop init # installs the /loopflow skill + AGENTS.md
# then, in a Claude Code chat:
/loopflow fix the failing test — done when the suite passes
Watch it plan → act → observe, reflect on a red test, and stop only when the check is green. Skeptical? → Why not just prompt? · Full setup in Getting started.
What is a loop basic
AI writes the code now — but you are still the conductor. Every coding task is really five decisions:
| Decision | In a .loop | Question it answers |
|---|---|---|
| Objective | goal: | What are we trying to do? |
| Context | look at: | What may the agent read first? |
| Actions | allow… / ask me before… | What may it do, and what needs a human? |
| Verification | done when | How do we know it worked? |
| Stopping | when… / after N tries | When do we stop — done, or thrashing? |
Here are all five decisions as one real loop — every line is one of the rows above:
loop "fix the failing test": # the work
goal: the cart total is correct with a coupon # Objective
look at: the checkout code, and the last failure # Context
allow edits automatically, ask me before pushes # Actions
done when the cart coupon tests pass # Verification
after 6 tries: stop and warn "stuck" # Stopping
When the model took over the building, those five decisions got buried in prompts. LoopFlow promotes them to first-class, editable knobs. At runtime they drive five phases — plan → act → observe → reflect → stop — the diagram above. You don't re-tune a prompt; you edit the loop.
- Edit the loop, not the prompt. The control structure is the artifact.
- You can't fake done. done when runs a real command — a test, a scanner, a script. The loop stops only when the world agrees.
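One way to picture the five knobs is as a plain record. The sketch below is illustrative only: the `LoopSpec` class, its field names, and the `missing_knobs` helper are hypothetical and are not LoopFlow's actual loop-spec IR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LoopSpec:
    """Hypothetical record of the five decisions (not LoopFlow's real IR)."""
    goal: str                                             # Objective
    look_at: List[str] = field(default_factory=list)      # Context
    allow: List[str] = field(default_factory=list)        # Actions that run freely
    ask_before: List[str] = field(default_factory=list)   # Actions that need a human
    done_when: Optional[str] = None                       # Verification
    max_tries: Optional[int] = None                       # Stopping (thrash guard)

    def missing_knobs(self) -> List[str]:
        """Name the decisions left unset, the way a soft linter might nudge."""
        missing = []
        if self.done_when is None:
            missing.append("done when")
        if self.max_tries is None:
            missing.append("after N tries")
        return missing


# The "fix the failing test" loop above, restated as data.
spec = LoopSpec(
    goal="the cart total is correct with a coupon",
    look_at=["the checkout code", "the last failure"],
    allow=["edits"],
    ask_before=["pushes"],
    done_when="the cart coupon tests pass",
    max_tries=6,
)
print(spec.missing_knobs())
```

Seen this way, editing the loop means changing one field, while the prompt text stays out of the picture.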
Prompt vs LoopFlow — why not just prompt? why
You could just say "fix the bug." So why write a loop? A prompt fires once and trusts the model's word that it's done. A loop verifies, self-corrects, and stops only when the work is provably finished.
| | Just prompting | A loop |
|---|---|---|
| "Done" means | the model says "done" | a real command passes — done when "…" |
| On failure | you notice, re-prompt, repeat | reflects on the failure, re-plans automatically |
| Stops | when the model stops typing | when the check is green — or warns after N tries |
| Risky actions | hope it asks first | gated; never pushes to main/master |
| Scope | wanders the codebase | look at: keeps it in your module |
| Repeatable | re-type it, get drift | re-run the same file, same shape |
| Shareable | a paragraph in Slack | a .loop in the repo, reviewable in a PR |
Same task, both ways
The prompt:
src/checkout and make sure nothing else breaks." → the agent edits, replies "Done — fixed the rounding." Did checkout.spec.ts::tax actually pass? The whole suite? You re-run it yourself. Failed? Re-prompt. (And it may have run git push on the way.)
The loop:
loop "fix the checkout tax test":
goal: the checkout tax test passes with no regressions
done when the checkout tax tests pass
look at: the checkout code, and the last failure
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
after 6 tries: stop and warn "tax fix thrashing"
Runs the test every cycle. Fails → reflects on why → fixes again. Stops only when the test is green (or warns after 6). Works on a branch, never touches main. The same file works tomorrow, and a teammate can read exactly what "done" meant.
- Prompting asks for an answer. A loop guarantees a result.
vs Claude Code's /loop and /goal why
Claude Code already has two looping built-ins — they're useful, and they're not the same tool. /loop is a scheduler (re-run a prompt every few minutes). /goal is the closest cousin: keep going until a condition holds. The catch with /goal — its condition is judged by a fast model reading the transcript; it can't run your test or open a file. So "done" is what the model says, not a command that passed.
| | /loop | /goal | LoopFlow |
|---|---|---|---|
| What it's for | run a prompt on a schedule | loop until a condition reads true | a verified, gated, reusable workflow |
| "Done" means | never — you stop it | a model judges your condition from the transcript | a real command passes — done when "pnpm test" passes, can't be faked |
| On failure | fires again next interval | next turn; no introspection | reflect on the failure, then re-plan (the back-edge) |
| Human gate mid-run | no | no — fully autonomous | yes — a human approves the plan first |
| Never push to main | no | no | built-in, unconditional |
| Reusable / shareable | no | no — ephemeral per session | a version-controlled .loop — run in any repo, save to your library |
| Multi-step | — | one condition | pipelines, flows, for each |
- /loop — polling and cadence ("check the deploy every 5 minutes").
- /goal — a quick, throwaway "keep going until it looks done" in this session.
- LoopFlow — when "done" must be provable, the loop must self-correct, a human gates the risky step, and you want to keep and reuse the workflow. A .loop is /goal with a real check, a retry, a gate, and a file.
Getting started setup
That's the why. From here down is the deep dive — set LoopFlow up once, then learn the language line by line.
Prerequisites: Node 18+, the Claude Code CLI (LoopFlow drives it), and a git repo to work in. Two ways to run a loop: inside a Claude Code chat (the bundled skill — recommended), or by hand in VS Code with the extension. One command installs everything:
1 · Install with npm
From your repo, run the installer. In one step it writes the /loopflow skill and AGENTS.md (the full language reference) into the project:
npx @loop-lang/loop init # install into this repo
npx @loop-lang/loop init --global # or: install the /loopflow skill for every repo
That gives you two things at once. /loopflow is now available in a Claude Code chat here. And AGENTS.md sits at the repo root — it travels with the project, so any agent that opens the repo already knows the LoopFlow language; it's the project's persistent memory of how to write a .loop. (Methods are shared the same way: use the <X> method pulls in a .loop preset another repo can reuse.)
2 · Run your first loop
/loopflow fix the failing auth test in src/auth, gate any database migration # writes a .loop
/loopflow run examples/fix_test.loop # runs it, in the chat
Describe the work and the skill writes the .loop; name a file and it runs the loop natively in the session — you watch every plan → act → observe → reflect step and answer human gates right in the chat. Prefer a terminal? The same files run headless via the CLI (loop-run run <file>) — see Running a loop.
That first command writes a file like this — yours to read, edit, and re-run:
# /loopflow "fix the failing auth test in src/auth, gate any database migration" writes:
loop "fix the failing auth test":
goal: the auth suite passes in src/auth
done when "pnpm test src/auth" passes
look at: the auth code, and the last failure
ask me before migrations
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
after 6 tries: stop and warn "stuck"
3 · Author in VS Code supported
A .loop is just text — let the agent draft it, or write one by hand. Install the LoopFlow extension from the VS Code Marketplace — Extensions ▸ search “LoopFlow” ▸ Install — then open any .loop file:
- Syntax highlighting for the whole language — goal, done when, gates, flow, for each, git:.
- Completion + formatting — keyword snippets so you type a loop from muscle memory, not a cheat sheet.
- A ▶ Run button on every loop (the LoopFlow: Run this loop CodeLens) — and you choose where it runs via the loop.runMode setting: open an interactive Claude Code session, or a headless output-panel trace.
- New from template (command palette → LoopFlow: New from template) — scaffold a ready-to-edit .loop bundle.
- A soft linter that nudges, never blocks — "this loop has no way to verify it's done", "add a thrash guard".
Same file, same engine: prompt the agent to generate a loop, or open VS Code and write the .loop yourself. The chat is the first-class way to run; the extension is the first-class way to author by hand.
Without a git: block, LoopFlow works on a branch and commits when the goal is met, and never pushes to main/master — see the git keyword.
Rather learn hands-on?
Two guided ways to get the language into your fingers before the deep dive below:
done when — how a loop verifies itself basic
The predicate is the spine of the whole idea. Four forms:
done when the test "billing.spec.ts::apostrophe" passes # a named test (runs via your test runner)
done when "pnpm test" passes # a shell command, success = exit 0
done when "semgrep --severity=high" finds nothing # exit 0 AND empty output
done when a human confirms "the UI looks right" # a person is the check
- A predicate is a real command run with your privileges — like an npm script or a Makefile target. So treat a .loop from an untrusted source as you would their shell scripts.
- finds nothing is how you say "this scanner must report zero" — it requires both exit 0 and empty output.
- The a human confirms "…" form is decided by a person — it's satisfied when you run the loop in conversation; the headless shell verifier returns human check required: … and never passes on its own.
- A loop with no done when has no machine check, so it must finish through a human path — a review gate, an approved plan-first pass (a plan-only loop), or an explicit when …: stop — otherwise it runs to the hard cap. Always give it a real check when one exists.
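The difference between passes and finds nothing can be sketched in a few lines of Python. This is an illustration of the two rules stated above, not LoopFlow's verifier: the function names are made up, and treating "output" as stdout with surrounding whitespace ignored is an assumption. The stand-in commands call the Python interpreter so the sketch runs anywhere.

```python
import subprocess
import sys
from typing import List


def passes(command: List[str]) -> bool:
    """`done when "<cmd>" passes`: success means exit code 0."""
    return subprocess.run(command, capture_output=True).returncode == 0


def finds_nothing(command: List[str]) -> bool:
    """`done when "<cmd>" finds nothing`: exit code 0 AND empty output.

    Assumption: "output" means stdout, ignoring surrounding whitespace.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0 and result.stdout.strip() == ""


# Stand-ins for a test runner or scanner (hypothetical, for illustration).
quiet_ok = [sys.executable, "-c", "pass"]                     # exit 0, no output
noisy_ok = [sys.executable, "-c", "print('1 finding')"]       # exit 0, reports something
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]   # non-zero exit

print(passes(quiet_ok), passes(noisy_ok), passes(failing))
print(finds_nothing(quiet_ok), finds_nothing(noisy_ok), finds_nothing(failing))
```

The middle case is the one that matters for scanners: a tool that exits 0 while still printing a finding satisfies passes but not finds nothing.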
The cycle and the reflect back-edge intermediate
each cycle: lists the steps, in order — any subset of plan, act, observe:
each cycle: plan, then act, then observe # full self-correcting unit
each cycle: act, then observe # skip planning — just do + check
- plan — read the look at: files, decide the smallest change toward the goal. (Runs read-only.)
- act — make the change, honoring the policy.
- observe — run the done when check and read pass/fail.
On a failed observe, when it fails: reflect, then plan again fires. reflect reads the failure output and writes a short diagnosis; that diagnosis becomes context for the next plan. This is the back-edge — the orange arc in the diagram — and it's the difference between an agent that retries blindly and one that learns from each miss.
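The cycle and its back-edge can be sketched as an ordinary loop. This is a minimal model under stated assumptions, not the LoopFlow engine: `run_loop` and the four callables are hypothetical names, and the toy task (a counter that must reach 3) stands in for real edits and a real `done when` check.

```python
from typing import Callable, Optional, Tuple


def run_loop(
    plan: Callable[[Optional[str]], str],
    act: Callable[[str], None],
    observe: Callable[[], Tuple[bool, str]],
    reflect: Callable[[str], str],
    max_tries: int = 6,
) -> Tuple[str, int]:
    """Plan, act, observe; on failure reflect and feed the diagnosis back."""
    diagnosis: Optional[str] = None
    for attempt in range(1, max_tries + 1):
        step = plan(diagnosis)        # the last diagnosis is context for this plan
        act(step)
        ok, output = observe()        # stands in for the `done when` check
        if ok:
            return "done", attempt
        diagnosis = reflect(output)   # the back-edge: read the failure, diagnose
    return "thrashing", max_tries     # the `after N tries` guard


# Toy task: the check goes green once the counter reaches 3.
state = {"value": 0}
plans_saw = []


def plan(diagnosis):
    plans_saw.append(diagnosis)
    return "increment"


def act(step):
    state["value"] += 1


def observe():
    ok = state["value"] >= 3
    return ok, "" if ok else f"value is {state['value']}, need 3"


def reflect(output):
    return f"still too low: {output}"


result = run_loop(plan, act, observe, reflect)
print(result)      # ('done', 3)
print(plans_saw)   # the first plan sees None; later plans see a diagnosis
```

The only thing separating this from a blind retry is the `diagnosis` variable: each failed observe changes what the next plan gets to read.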
Human gates human-in-the-loop
LoopFlow has five places a person steps in. The first two go inside a loop; the third is a stage gate; the fourth is a transition; the fifth is the per-run confirm prompt from the action policy (Section 4) — ask me before …, asked once and remembered.
a human approves the plan first # approve the plan before any acting
a human reviews before stopping # judge the result before the loop may stop
a human approves before provisioning # a hard, blocking gate (used on a stage)
when blocked: ask a human # unblock when the agent is stuck
- a human approves the plan first — high-stakes work where the plan must be right before touching anything.
- a human reviews before stopping — subjective "looks right" goals (UI, copy) where no command can decide done.
- a human approves before <X> — a blocking gate before a whole stage runs (deploys, provisioning).
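The fifth gate, the ask me before … confirm that is "asked once and remembered", behaves like a cached prompt. The sketch below is one way to model that; the `ConfirmPolicy` class and its method names are hypothetical, and how LoopFlow actually stores the answer is not described on this page.

```python
from typing import Callable, Dict, Set


class ConfirmPolicy:
    """Hypothetical model: ask once per gated action, remember the answer for the run."""

    def __init__(self, gated: Set[str], ask: Callable[[str], bool]) -> None:
        self.gated = gated                    # actions named in `ask me before ...`
        self.ask = ask                        # how a human is prompted
        self.answers: Dict[str, bool] = {}    # remembered answers for this run

    def permits(self, action: str) -> bool:
        if action not in self.gated:
            return True                       # ungated actions run without asking
        if action not in self.answers:
            self.answers[action] = self.ask(action)   # first time: ask the human
        return self.answers[action]           # afterwards: reuse the answer


prompts = []


def approve_everything(action: str) -> bool:
    """Stand-in for a human who says yes; records each prompt shown."""
    prompts.append(action)
    return True


policy = ConfirmPolicy(gated={"push"}, ask=approve_everything)
print(policy.permits("edit"), policy.permits("push"), policy.permits("push"))
print(prompts)   # the human was prompted for "push" exactly once
```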
Composing — pipeline, flow, for each scaling up
One loop handles one job. Three constructs scale it up — each a keyword with its own reference page:
- pipeline — run stages in order; a failing stage halts the rest. An epic → a pipeline, each story → a stage. (Example below.)
- flow — chain whole .loop files; each step's summary carries forward (discover → design → build).
- for each — run a template loop once per item in a YAML/Markdown plan — A-to-Z over every story.
pipeline "ship feature":
stage security:
goal: no high or critical vulnerabilities
done when "semgrep --severity=high" finds nothing
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
stage build:
goal: feature works and tests pass
a human approves the plan first
each cycle: act, then observe
done when "pnpm test" passes
stage ui:
goal: matches design, responsive at 375px
each cycle: plan, then act, then observe
a human reviews before stopping
stage deploy:
a human approves before provisioning
goal: infra live and healthchecks green
done when "./scripts/health.sh" passes
each cycle: act, then observe
examples/ship_feature.loop
The full grammar — pipelines, flow chains, and for each iteration, with worked examples — is in the manual and the keyword reference.
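The pipeline rule stated earlier, stages run in order and a failing stage halts the rest, is simple enough to sketch directly. The code below is an illustration, not LoopFlow's scheduler: `run_pipeline` and `stage` are hypothetical helpers, and each stage is reduced to a function that reports whether its check went green.

```python
from typing import Callable, List, Optional, Tuple


def run_pipeline(
    stages: List[Tuple[str, Callable[[], bool]]],
) -> Tuple[List[str], Optional[str]]:
    """Run stages in order; return (stages that passed, first failing stage or None)."""
    passed: List[str] = []
    for name, run_stage in stages:
        if not run_stage():
            return passed, name       # halt: later stages are never attempted
        passed.append(name)
    return passed, None


attempted = []


def stage(name: str, ok: bool) -> Tuple[str, Callable[[], bool]]:
    """Build a stand-in stage that records that it ran and reports `ok`."""
    def run() -> bool:
        attempted.append(name)
        return ok
    return name, run


passed, failed_at = run_pipeline([
    stage("security", True),
    stage("build", False),   # pretend the build stage's check never goes green
    stage("ui", True),
    stage("deploy", True),
])
print(passed, failed_at, attempted)
```

With the build stage failing, ui and deploy never appear in `attempted`, which is the property that keeps a deploy from running on top of a red build.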
In practice — real workflows walkthroughs
You don't write LoopFlow all day. You reach for it when a task has a clear "done." Here's how it slots into the work you already do.
A ticket from Jira (the daily driver)
You picked up PROJ-412 — "Applying a coupon can make the cart total negative." Turn the ticket into a loop:
- Describe it. In the chat: /loopflow PROJ-412: a coupon must never make the cart total negative; done when the coupon-floor tests pass — the skill writes the .loop. (Or write it yourself — the ticket's acceptance criterion becomes done when, in plain words.)
- What it produces:
loop "PROJ-412: coupon must not make the cart total negative":
goal: applying a coupon never produces a negative cart total
done when the coupon-floor tests pass
look at: the cart total logic and the coupon code, and the last failure
allow edits automatically, but ask me before migrations or pushes
each cycle: plan, then act, then observe
when it fails: reflect on which layer broke, then plan again
after 6 tries: stop and warn "PROJ-412 thrashing — check the spec"
- Run it: /loopflow run proj-412.loop. Watch plan → act → observe; answer the migration confirm if it asks.
- It lands on a branch and commits when the test passes (the default git). Add a git: block with push when done + open a pull request to get a PR — paste that link back into the ticket.
done when has something real to check.
Built with LoopFlow — Forge's sandbox runner case study
This isn't hypothetical. Forge — a ticket-driven implementation platform (you hand it a ticket, agents implement it) — is itself built with LoopFlow. One of its sharper-edged modules is the sandbox runner: the infrastructure that executes agent-written code in isolation, so untrusted code can run without ever touching the host. I built it as a pipeline, one provable stage at a time:
# examples/forge-sandbox.loop
pipeline "forge sandbox runner":
stage isolate:
goal: every run gets a fresh, network-less container with CPU and memory caps
look at: the runner service and the container config, and the last failure
each cycle: plan, then act, then observe
done when "pnpm test sandbox/isolation" passes
when it fails: reflect, then plan again
after 6 tries: stop and warn "isolation thrashing"
stage execute:
goal: run agent code, capture stdout, stderr and exit code, kill on timeout
each cycle: act, then observe
done when "pnpm test sandbox/execute" passes
stage harden:
goal: no container escape, no host filesystem or cloud-metadata access
also: a security scan
done when "pnpm test sandbox/security" passes
a human approves before enabling network egress
stage integrate:
goal: a real ticket's generated code runs end to end inside the sandbox
done when "pnpm test:e2e sandbox" passes
a human reviews before stopping
Why a pipeline, not one loop: a sandbox is only as trustworthy as the stage you trust least. isolate has to be green before execute is even attempted — a failing stage halts the rest, so the runner is never "half-isolated." harden pairs a security suite with a scan and gates network egress on a human — the one call I never wanted an agent to make alone. integrate won't declare done until a real ticket's generated code actually runs inside the box, with me reviewing before it stops.
Every stage carried its own done when, so "done" meant a green check, not a vibe — and when the escape test failed, the loop reflected on why and re-planned, instead of me re-prompting from scratch. The whole module is now a single file I re-run whenever the base image changes.
LoopFlow by role — where it earns its keep examples
Anywhere "done" is a command, LoopFlow fits. A few real shapes by role — each a runnable .loop you'd write in seconds (or have the agent write):
Backend
Ship an endpoint against its tests; gate the migration.
loop "add POST /orders":
goal: the endpoint creates an order and returns 201
done when "pytest tests/api/test_orders.py" passes
look at: the orders router and schema, and the last failure
ask me before I run a database migration
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
after 6 tries: stop and warn "orders endpoint stuck"
Mobile / frontend
Build a screen until its widget tests are green.
loop "build the login screen":
goal: the login screen validates input and matches the spec
done when "flutter test test/login_test.dart" passes
look at: the login widget and the design spec, and the last failure
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
after 6 tries: stop and warn "login screen stuck"
DevOps
A gated infra change — the scan must pass, and a human approves before it touches staging.
pipeline "harden the staging cluster":
stage scan:
goal: no high-severity misconfigurations in the manifests
done when "kube-score score manifests/" passes
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
stage apply:
goal: the change is live on staging
a human approves the plan first
done when "kubectl rollout status deploy/web -n staging" passes
QA
Turn a bug report into a reproducing test, then make it pass — "done" is the test, not a vibe.
loop "reproduce + fix BUG-481: coupon makes the total negative":
goal: a regression test reproduces the bug, then the fix makes it pass
done when "pnpm test cart/coupon" passes
look at: the cart total logic and the bug report
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
after 6 tries: stop and warn "BUG-481 stuck"
Security
A scan that must find nothing — save it to your library and run it in every repo.
loop "security pass on the auth module":
goal: no high or critical findings in src/auth
done when "semgrep --config p/owasp-top-ten --severity=high src/auth" finds nothing
also: a security scan
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
after 6 tries: stop and warn "findings remain"
done when, the reflect back-edge, a guard, and a gate where it's risky. Learn it once; it travels to whatever you build.
Running a loop intermediate
Every command below runs the same kind of file — here's the one referenced throughout this section:
# examples/fix_test.loop
loop "fix the failing checkout tax test":
goal: the tax line is correct at checkout
done when the checkout tax tests pass
each cycle: plan, then act, then observe
when it fails: reflect, then plan again
after 5 tries: stop and warn "stuck"
In conversation (recommended)
With the LoopFlow skill loaded, run it inside the chat — the assistant executes the cycle itself, narrates every step, and you answer gates inline:
/loopflow "fix the failing checkout tax test" # write a .loop from a request
/loopflow run examples/fix_test.loop # run it, right here in the chat
You stay in the chat the whole time: you watch each step, you answer the gate, and the loop ends on a real green test — not the assistant saying "done". This is the only mode where a long interactive discovery works, because the human is already in the loop.
Headless CLI
loop-run run examples/ship_feature.loop # drive Claude Code, glyph trace
loop-run show examples/ship_feature.loop # print the loop's flow as compact ASCII
loop-run ls # list every .loop in the repo + its shape
loop-run parse examples/ship_flow.loop # print the parsed spec (the loop-spec IR)
loop-run viz examples/ship_flow.loop # write a self-contained HTML schematic
# flags: --model , --out , --events (NDJSON for a UI host), --json
In VS Code
- Syntax highlighting, hover docs, and tab-completion for every construct.
- A ▶ Run CodeLens above each definition. When you click it, you choose where it runs (the loop.runMode setting): open a Claude Code session — an interactive run in the integrated terminal where you watch every step and answer gates in chat — or the VS Code output panel (headless, with a live trace and native gate dialogs). Set loop.runMode to ask (prompt each time), session, or output.
- A soft linter that nudges (never blocks): "this loop has no way to verify it's done", "add a thrash guard".
- LoopFlow: New from template — scaffold a ready-to-edit method bundle (a generic for each setup, or a BMAD A-to-Z one) into your workspace in one pick.
Troubleshooting
| Symptom | Fix |
|---|---|
| /loopflow isn't recognized in the chat | The skill isn't installed. Run npx @loop-lang/loop init (or --global) and reopen the chat. |
| The loop runs forever / hits the attempt cap | It has no real done when, so nothing can decide "done". Pin it to a test or command, and add an after N tries guard. |
| A done when a human confirms check never passes headless | Human predicates need a person — run it in the chat (or VS Code session mode), not the headless output panel. |
| push when done fails before the loop even starts | You're on main/master. LoopFlow refuses to push to a protected branch — switch to a feature branch (or drop push when done). |
| loop-run isn't found in the terminal | The headless CLI ships with @loop-lang/runtime: npm i -g @loop-lang/runtime. The chat (/loopflow) needs nothing extra. |
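The push when done row describes a check that happens before the first cycle. A rough sketch of that kind of preflight follows. It is an assumption about the shape of the check, not LoopFlow's code: `preflight` and `PROTECTED` are hypothetical names, while `git rev-parse --abbrev-ref HEAD` is the standard git command for reading the current branch.

```python
import subprocess
from typing import Optional

PROTECTED = {"main", "master"}   # the two branch names this page calls protected


def current_branch() -> str:
    """Ask git for the checked-out branch name (must be run inside a git repo)."""
    out = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


def preflight(branch: str, push_when_done: bool) -> Optional[str]:
    """Return a refusal message, or None when the run may start."""
    if push_when_done and branch in PROTECTED:
        return f"refusing to push to '{branch}': switch to a feature branch"
    return None


print(preflight("main", push_when_done=True))
print(preflight("feature/rate-limit", push_when_done=True))
print(preflight("main", push_when_done=False))
```

In a real repo the first argument would come from `current_branch()`; the examples pass branch names directly so the sketch runs without git.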
Your global library — save a loop, run it in any repo reuse
You wrote a loop that audits a repo for security holes. It verifies, it self-corrects, it never pushes to main. You'll want it again next week, in a different project. Don't copy the file around — save it to your global library and call it by name from anywhere.
The library is a folder Claude owns: ~/.claude/loopflow/, one <name>.loop per saved loop. It sits next to the skill you installed once, so it's there in every project. You never edit it by hand — you drive it through /loopflow in the chat:
/loopflow save this as security # store the current loop as ~/.claude/loopflow/security.loop
/loopflow list # what have I saved?
/loopflow run security # run my security loop against THIS repo
/loopflow remove security # delete it
Now open a brand-new project and type /loopflow run security. Claude reads your saved loop and runs it here — plan → act → observe, reflecting on each failure, stopping only when your done when check is green, asking before anything risky. Same loop, new repo, every guarantee intact. A bare name means your library; a path or a .loop file still means a local file, so the library never shadows a loop that lives in the repo.
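The name-versus-path rule can be written down as a small resolver. This is a sketch of the rule as described above, not the skill's implementation: `resolve` is a hypothetical function, and deciding "is this a path?" by looking for a path separator or a .loop suffix is an assumption.

```python
from pathlib import Path

# The library folder named above: ~/.claude/loopflow/
LIBRARY = Path.home() / ".claude" / "loopflow"


def resolve(target: str, library: Path = LIBRARY) -> Path:
    """Map a `/loopflow run <target>` argument to the file it would read."""
    looks_like_path = (
        "/" in target or "\\" in target or target.endswith(".loop")
    )
    if looks_like_path:
        return Path(target)              # a path or a .loop file: local file
    return library / f"{target}.loop"    # a bare name: the global library


print(resolve("security"))
print(resolve("proj-412.loop"))
print(resolve("examples/fix_test.loop"))
```

Because anything ending in .loop is treated as local, a repo file named security.loop and a library entry named security never collide.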
- Ask an LLM to "check this repo for security issues" and you get a fresh plan-then-execute every time — a new interpretation, no real verification, no memory of what "done" meant, and you re-type it in each project.
- A saved loop carries its done when check, its reflect-and-retry, its gates, and its look at scope with it. Running it re-applies all of that, identically, anywhere. It's a reusable guarantee you built once — not a prompt you re-type and re-trust.
/loopflow command live in the chat — that's where you watch each step and answer gates as the loop runs. VS Code is the second seat: handy for editing a .loop with highlighting and a ▶ Run button, but the loop still runs through Claude. If you only ever use one surface, use the chat.
Go deeper
This page is the tour. The depth lives next door:
- 📖 The full manual — every keyword, the CLI, how a run works, the loop-spec IR.
- 📖 Keyword reference — one page per keyword, with diagrams.
- 🛠️ Workshop — build a small todo app, hands-on.
- 🎮 LoopFlow Lab — learn loops by playing.
- ★ GitHub — the source, the open loop-spec, and issues.