The BOUND Test
Five gates that decide whether a task deserves an agent at all, applied before a single line of orchestration code.
The BOUND Test is a five-gate screen for deciding whether a task deserves an agent before you write orchestration code.
Key Takeaways
- The BOUND Test is five independent gates, all of which a task must pass before it deserves an agent: Bounded goal, Observable state, Usable tools, Narrow autonomy, Designed failure.
- A failed gate is an instruction, not a verdict. It tells you exactly what to build before you grant autonomy.
- Narrow autonomy is the operational answer to OWASP LLM06 Excessive Agency: refuse excessive functionality, permissions, and autonomy by default and grant each only against a justified need.
- Designed failure distinguishes failing safe (arrest and escalate) from failing silent (proceed confidently wrong). Engineer every foreseeable failure to fail safe and instrument the rest to fail loud.
- The introduction's incident failed all five gates. Running the test would not have killed the project; it would have produced a safe rung-4 agent that delivered the value without the surprise.
Read this beside bounded goals and task contracts, tools, permissions, and side effects, and principles of building AI agents when you turn the chapter into a production design.
I keep a one-page worksheet taped above my desk, metaphorically, and on more than one literal whiteboard. It has five gates. Before any team commits to building an agent, the task has to pass all five. Not most. All. A task that fails any single gate is not ready for an agent, and the gate it fails tells you exactly what to fix or whether to drop down the ladder to a workflow.
The worksheet is the BOUND Test, and the five gates are Bounded goal, Observable state, Usable tools, Narrow autonomy, and Designed failure. The letters are a memory aid, not the point. The point is that each gate is an independent failure mode, and agents in production fail at whichever gate you skipped.
I will define each gate operationally, give you the question that tests it, the evidence that passes it, and a real failure that comes from skipping it. Then I will give you the worksheet as an artifact you can copy.
B: Bounded goal
The question: Can you state the goal as a verifiable condition, with explicit success criteria, explicit out-of-scope, and a stopping rule?
An agent pursues a goal. If the goal is unbounded, the agent's loop is unbounded, and an unbounded loop in a system that costs money per step and can take real-world actions is a slow-motion incident. "Improve customer satisfaction" is not a goal an agent can have. "Resolve this specific ticket to one of these four terminal states, or escalate, within twelve tool calls" is.
A bounded goal has three parts. A success condition you can check ("the ticket is in state RESOLVED and the customer's stated problem maps to one of these resolution codes"). An explicit scope boundary ("you may issue refunds, change plan tiers, and reset passwords; you may not delete accounts, modify other customers' data, or change pricing"). And a stopping rule ("if you have not reached a terminal state in twelve steps or you encounter anything outside scope, escalate and stop").
The failure from skipping this gate is the wandering agent: the one that, given a vague goal, keeps "improving" forever, burning tokens, calling tools in increasingly creative combinations, until something irreversible happens or someone notices the bill. A bounded goal is the difference between a contractor with a scope of work and a contractor with a key to your house and the instruction to "make it nicer."
O: Observable state
The question: Can you reconstruct, after the fact, exactly what the agent saw, decided, and did at every step, and can you observe the relevant world state during the run?
This is the gate the demo that lied failed. The agent took an action, and the thirty thousand tokens of context that produced it were gone. Observability has two faces. Internal observability is the trace: every observation, reasoning step, tool call, tool result, and state transition, recorded in a form you can replay. External observability is your ability to see the relevant state of the world the agent is acting on: the account balance before and after, the ticket state, the config value.
The evidence that passes this gate is a trace schema that exists before the agent ships, not bolted on after the first incident. If you cannot answer "where did the plan change?" from your logs today, for a workflow you already run, you will not magically be able to answer it for an agent. Observable state is a precondition, which is why it is the O and not an afterthought.
The failure from skipping it is the unfixable incident. Something goes wrong, you cannot reconstruct why, so you cannot fix the cause, so you either turn the agent off entirely or leave it running while blind. Both are bad, and you chose between them by skipping a gate.
U: Usable tools
The question: For every tool the agent can call, do you know its permission scope, input schema, side effects, idempotency, rollback path, rate limits, and whether it logs?
This is the Tool Trust Contract, which gets its own chapter, but it is a BOUND gate because an agent is only as reliable as its tools. The applyCredit disaster was a U-gate failure: a tool with unclassified side effects (it wrote to the ledger and fired a webhook), no idempotency (a retry doubled the credit), and no rollback path. The model did nothing wrong. The tool was not usable, in the precise sense that its behavior under agent control was not characterized.
"Usable" does not mean "exists." Plenty of tools exist and are unusable by an agent because nobody classified their side effects. The evidence that passes this gate is a completed Tool Trust Contract for every tool in the agent's toolbox, with side effects classified as read, reversible-write, or irreversible-write, and irreversible writes either removed from the toolbox or gated.
The failure from skipping this gate is the amplified fragility I warned about earlier. Agents call tools more often, in more combinations, in response to more inputs than you tested. A tool that is 99 percent reliable under scripted use can be a steady source of incidents under an agent that calls it two hundred times a day in combinations the original author never imagined.
N: Narrow autonomy
The question: Is the autonomy you are granting the minimum needed for the value you want, and have you written down exactly what the agent may and may not decide?
This is the BOUND Test's connection to OWASP's Excessive Agency (LLM06), now a named, current entry in the OWASP Top 10 for LLM Applications. OWASP names three root causes that map almost exactly onto this gate: excessive functionality (the agent has tools it does not need), excessive permissions (the agent's tools have broader access than the task requires), and excessive autonomy (the agent acts without confirmation on high-impact operations). Narrow autonomy is the discipline of refusing all three by default and granting each one only against a specific, justified need.
Narrowness is the principle of least privilege applied to decisions, not just permissions. If the task only needs the agent to choose among reading data and drafting responses, do not give it the ability to decide to issue refunds, even if a refund tool exists in your codebase. The rung on the Autonomy Ladder is the coarse setting; narrow autonomy is the fine-grained version, decided per tool and per action class.
The evidence that passes this gate is an explicit autonomy specification: a list of decisions the agent is allowed to make unsupervised, a list it must escalate, and a list it can never make. If your answer to "what can it decide?" is "whatever it figures out it needs to," you have failed this gate, and you have built the excessive-agency vulnerability on purpose.
D: Designed failure
The question: When the agent is wrong, where does the failure land, and is that landing zone safe?
Every agent will be wrong. The question is never whether but where the wrongness goes. A designed failure means you have decided, in advance, what happens in each failure class: the agent gives up, the agent escalates to a human, the action is rolled back, the run is checkpointed and paused, the kill switch trips. An undesigned failure means the wrongness goes wherever the code happens to send it, which is often "commit the bad action and move on."
There is a crucial distinction inside this gate between failing safe and failing silent. A safe failure arrests: the agent stops, escalates, and leaves the world in a recoverable state. A silent failure proceeds: the agent does the wrong thing confidently and nobody notices until the consequences surface. Designed failure means engineering every failure mode you can foresee to fail safe, and instrumenting the ones you cannot foresee to at least fail loud.
The evidence that passes this gate is a failure taxonomy with a designed landing zone for each class, plus a default landing zone (usually "escalate and halt") for the classes you did not foresee. The failure from skipping this gate is the introduction's $14,000 credit: an agent that, when wrong, simply committed the wrong irreversible action because nobody had designed what should happen instead.
The worksheet
Here is the artifact. Run it as a gate review before building, and re-run it whenever the task, tools, or environment change. Each gate is pass or fail. There is no partial credit, because partial observability is no observability when the incident hits.
| Gate | Pass requires | Evidence artifact | If it fails |
|---|---|---|---|
| Bounded goal | Verifiable success condition + explicit scope + stopping rule | Goal/task contract (next chapter) | Bound the goal, or build a workflow |
| Observable state | Replayable internal trace + visibility into relevant world state | Trace schema deployed before launch | Build observability first; do not launch blind |
| Usable tools | Tool Trust Contract complete for every tool; irreversible writes gated or removed | Signed Tool Trust Contracts | Classify side effects; gate or remove irreversible tools |
| Narrow autonomy | Explicit allow/escalate/never decision lists; least privilege on tools and actions | Autonomy specification | Remove unneeded tools/permissions/decisions |
| Designed failure | Failure taxonomy with a safe landing zone per class + default halt-and-escalate | Failure taxonomy + landing zones | Design the landing zones before granting autonomy |
A task that passes all five is a candidate for an agent at the appropriate rung of the ladder. A task that fails any gate is not. The discipline is in treating the failures as instructions, not obstacles. "Fails the U gate" does not mean "give up." It means "go classify your tools' side effects, then come back," which is work you should have done anyway.
Worked example: the BOUND Test on the demo that lied
Let me run the introduction's support agent through the gates retroactively, because hindsight makes the gates vivid.
B, Bounded goal. Partial. "Fix the customer's invoice problem" had a fuzzy success condition (what counts as fixed?) and no stopping rule. The agent decided "fixed" meant "credit applied," which was its interpretation, not our specification. Fail. We should have specified terminal states and a refund cap.
O, Observable state. Hard fail. The thirty-thousand-token black hole. We had the first instruction and the last action and nothing in between. Fail. We could not even reconstruct that disputed_amount was null until we re-ran the API by hand.
U, Usable tools. Hard fail. applyCredit had unclassified side effects, no idempotency, no rollback. Fail. This alone should have stopped the launch.
N, Narrow autonomy. Fail. The agent could issue an arbitrary credit amount on any account. It needed, at most, the ability to propose credits, capped, on non-enterprise accounts, with anything above a threshold escalated. We granted excessive agency by default. Fail.
D, Designed failure. Hard fail. When the agent misread disputed_amount, there was no landing zone. The wrongness went straight to the ledger. Fail.
Five for five. The demo that lied was not an unlucky edge case. It was a task that failed every single gate of the BOUND Test, shipped because the demo was beautiful and nobody had a worksheet. The $14,000 was the tuition for not having one.
If we had run the test, we would not have killed the project. We would have done five concrete pieces of work (bound the goal, build the trace, classify the tools, narrow the autonomy, design the landing zones) and shipped a supervised agent at rung 4 that proposed credits for a human to approve. That agent would have delivered most of the value with none of the surprise. The BOUND Test does not say no to agents. It says not yet, and here is exactly what to fix first.
The next four chapters are the four gates that take the most engineering: bounded goals as task contracts, usable tools as the Tool Trust Contract, observable and controllable state, and planning that does not wander. Designed failure earns its own chapter near the end, with the kill switches and rollback runbooks.
