Alpesh Nakrani
Chapter 1 / Field Manuals

The Production Problem

Why the work fails after the demo and what must be made explicit first.

The Boundary

Support That Resolves Itself begins with a plain constraint: deflection hides whether the customer actually got the issue resolved. The subject is customer support automation, but the real work is deciding where the system is allowed to act, what it must prove, and who owns the result when the answer is wrong.

Deflection is the wrong metric; resolution is the right one. The aim is to build support where the machine closes the ticket and the human handles the exception. This chapter treats that premise as operating material. The goal is not novelty. The goal is a decision process that a team can explain before launch and defend after the first incident.

This manual is written for the team that has to operate customer support automation, not only present it. For support, success, and operations leaders, the useful boundary is the point where a model response becomes a business action. Before that point, experiments are cheap. After that point, errors have owners.

The First Principle

The first principle is simple: support automation should be judged by resolution, not avoidance. It sounds obvious, but most failures start when teams accept a pleasing example as proof of general behavior.

A responsible team writes down the decision before optimizing the system. The question is not whether customer support automation can produce an impressive answer. The question is whether it can support a real decision: which cases can be resolved without lowering trust.

What To Ignore

Ignore benchmark theater that does not resemble the work. Ignore demos with handpicked inputs. Ignore architecture diagrams that cannot name an owner for failure. The reliable path starts with the work people actually need completed.

The temptation is to widen scope quickly. Resist it. The faster route is to define a narrow promise, measure it honestly, and expand only when the evidence remains stable across new inputs.

How To Read The Book

Each chapter returns to one operating question: what would have to be true for this system to be trusted? The answer changes by domain, but the discipline does not.

Use the book as a working memo. Mark the policies that already exist, the measurements that are missing, and the assumptions that need a test before they become roadmap commitments.

Research Lens

The research base for Support That Resolves Itself matters because customer support automation sits between capability and consequence. Papers, benchmarks, and risk frameworks can show what is possible, but production teams still have to translate that evidence into decisions. This chapter treats research as a constraint on judgment, not as decoration.

The most useful research habit is to separate mechanism from outcome. A paper can show that a method improves a benchmark. It does not prove that the same method improves durable resolution rate in your product. That gap is where evaluation, sampling, and release discipline belong.

For this chapter, read external sources as pressure tests. If a source describes a known weakness, ask whether your system can observe that weakness. If a source describes a benchmark gain, ask whether your users send the same kind of work. If a source describes a risk, ask who owns it after launch.

Scope method

Start with a written task statement. It should name the user, the input, the expected output, the source of truth, and the action that follows. If any of those pieces is missing, the task is not ready for broad automation because the team cannot tell whether the result is good enough.
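
The task statement above can be kept as a small structured record so that a missing piece is visible before launch. The following is a minimal sketch, not a prescribed format: the class name, field names, and helper functions are hypothetical, and "missing" is assumed to mean an empty or whitespace-only field.

```python
from dataclasses import dataclass, fields


@dataclass
class TaskStatement:
    """Hypothetical record of a written task statement (names are illustrative)."""
    user: str = ""
    task_input: str = ""
    expected_output: str = ""
    source_of_truth: str = ""
    follow_up_action: str = ""


def missing_pieces(statement: TaskStatement) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [
        f.name
        for f in fields(statement)
        if not getattr(statement, f.name).strip()
    ]


def ready_for_broad_automation(statement: TaskStatement) -> bool:
    """Assumed rule: the task is ready only when no piece is missing."""
    return not missing_pieces(statement)
```

A statement that names everything except the follow-up action would report `follow_up_action` as missing and would not pass the readiness check.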

Next, define the control surface. For this topic, the control surface includes case classification, policy retrieval, action permissions, audit trails, and handoff. Each control should have a reason to exist and a way to be tested. A control that cannot be tested becomes process theater. A control that can be tested becomes part of the operating system.
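
One way to hold the control surface to that standard is to register each control with its reason and its test, then list the ones that have neither. This is a sketch under assumptions: the `Control` record and the `untestable_controls` helper are hypothetical, and a "test" is reduced here to any callable that returns a boolean.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Control:
    """Hypothetical entry in a control-surface registry."""
    name: str
    reason: str
    # A callable that exercises the control; None means no test exists yet.
    test: Optional[Callable[[], bool]] = None


def untestable_controls(controls: list[Control]) -> list[str]:
    """Return names of controls that lack a stated reason or a test."""
    return [
        c.name
        for c in controls
        if c.test is None or not c.reason.strip()
    ]
```

Any control returned by `untestable_controls` is a candidate for either a real test or removal.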

Finally, decide what the system does when the answer is not ready. The mature options are to ask for more context, return a partial answer with evidence, route to a person, or stop. The immature option is to keep generating until the output sounds confident.
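
The options above can be written as an explicit decision function so that "keep generating" is never a reachable branch. This is a minimal sketch: the input signals (`in_scope`, `has_required_context`, `evidence_count`, `answer_complete`) and the order in which they are checked are assumptions for illustration, not rules from this chapter.

```python
from enum import Enum


class NextStep(Enum):
    ANSWER = "answer"
    ASK_FOR_CONTEXT = "ask for more context"
    PARTIAL_WITH_EVIDENCE = "return a partial answer with evidence"
    ROUTE_TO_PERSON = "route to a person"
    STOP = "stop"


def choose_next_step(
    in_scope: bool,
    has_required_context: bool,
    evidence_count: int,
    answer_complete: bool,
) -> NextStep:
    """Pick one explicit outcome; no branch retries generation."""
    if not in_scope:
        return NextStep.STOP
    if not has_required_context:
        return NextStep.ASK_FOR_CONTEXT
    if evidence_count == 0:
        # Assumed rule: an ungrounded answer goes to a human, not the customer.
        return NextStep.ROUTE_TO_PERSON
    if not answer_complete:
        return NextStep.PARTIAL_WITH_EVIDENCE
    return NextStep.ANSWER
```

The design choice worth keeping, whatever the actual signals are, is that every path ends in a named outcome that can be logged and reviewed.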

Boundary evidence

Evidence should be collected at the same grain as the decision. If the decision is which cases can be resolved without lowering trust, the review set should contain examples that force that decision. A broad score is useful only after the team has inspected the cases that carry the most cost.

The strongest evidence combines observed user work, known edge cases, recent incidents, and synthetic pressure tests. Synthetic examples are useful when they fill a known gap. They are dangerous when they replace the real distribution the system must serve.

A good review record includes the input, the relevant context, the output, the expected answer, the judgment, and the fix. Without that record, quality work becomes memory work. With it, the team can see whether the system is learning, drifting, or merely changing shape.
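
A review record with those six fields might look like the sketch below. The `release` tag, the `"pass"`/`"fail"` labels, and the `pass_rate_by_release` helper are assumptions added for illustration; they are one simple way to see whether results move between versions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewRecord:
    """Hypothetical review record; the first six fields follow the text."""
    case_input: str
    context: str
    output: str
    expected_answer: str
    judgment: str  # assumed labels: "pass" or "fail"
    fix: str       # empty when no fix was needed
    release: str   # hypothetical tag for the system version reviewed


def pass_rate_by_release(records: list[ReviewRecord]) -> dict[str, float]:
    """Share of records judged "pass", grouped by release tag."""
    totals: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    for record in records:
        totals[record.release] += 1
        if record.judgment == "pass":
            passes[record.release] += 1
    return {release: passes[release] / totals[release] for release in totals}
```

Comparing pass rates across releases on the same review set is a rough signal only; the individual failing records still need to be read.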

Implementation Notes

Implementation should begin with the smallest useful workflow. The first version should be narrow enough that the team can replay every important failure. If replay is not possible, the system is not observable enough for serious use.

The second version should add volume without changing the promise. This is where durable resolution rate should be watched closely. If the metric improves while support tickets, corrections, or handoffs rise, the measurement is missing something important.
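
The check described above can be sketched as a comparison between two reporting periods. The chapter does not define durable resolution rate precisely, so this example assumes it is the share of automation-closed tickets that were not reopened within some window. The `PeriodSnapshot` fields and the choice to compare handoffs and corrections per closed ticket, so that added volume alone does not trigger the flag, are also assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodSnapshot:
    """Hypothetical counts for one reporting period."""
    closed_by_automation: int
    reopened_within_window: int
    handoffs: int
    corrections: int


def _per_closed(count: int, closed: int) -> float:
    """Rate per automation-closed ticket; 0.0 when nothing was closed."""
    return count / closed if closed else 0.0


def durable_resolution_rate(s: PeriodSnapshot) -> float:
    """Assumed definition: closed tickets that stayed closed, as a share."""
    stayed_closed = s.closed_by_automation - s.reopened_within_window
    return _per_closed(stayed_closed, s.closed_by_automation)


def measurement_looks_suspect(prev: PeriodSnapshot, curr: PeriodSnapshot) -> bool:
    """True when the metric improved while handoff or correction rates rose."""
    metric_improved = durable_resolution_rate(curr) > durable_resolution_rate(prev)
    handoffs_rose = _per_closed(curr.handoffs, curr.closed_by_automation) > _per_closed(
        prev.handoffs, prev.closed_by_automation
    )
    corrections_rose = _per_closed(
        curr.corrections, curr.closed_by_automation
    ) > _per_closed(prev.corrections, prev.closed_by_automation)
    return metric_improved and (handoffs_rose or corrections_rose)
```

A flagged period does not prove the metric is wrong. It marks the point where the team should inspect cases before trusting the number.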

The third version can expand scope only after the team knows which failures are acceptable, which failures require escalation, and which failures require rollback. Expansion without that knowledge creates a system that appears productive while quietly moving risk to the customer.
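
That gate can be made mechanical: expansion is blocked until every observed failure type has an agreed response. The sketch below assumes failures are tracked by a short label and that the team keeps a label-to-response mapping; the label strings and function names are hypothetical.

```python
from enum import Enum


class FailureResponse(Enum):
    ACCEPT = "acceptable"
    ESCALATE = "requires escalation"
    ROLLBACK = "requires rollback"


def unclassified_failures(
    observed: list[str], policy: dict[str, FailureResponse]
) -> list[str]:
    """Return observed failure labels that have no agreed response yet."""
    return sorted(set(observed) - set(policy))


def may_expand_scope(
    observed: list[str], policy: dict[str, FailureResponse]
) -> bool:
    """Assumed gate: expand only when every observed failure is classified."""
    return not unclassified_failures(observed, policy)
```

The list returned by `unclassified_failures` doubles as the agenda for the next review: each entry needs a decision before scope widens.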

Decision Review

At the end of the chapter, the team should be able to answer four questions. What promise are we making? What evidence supports it? What happens when the promise fails? Who has authority to change the promise? These questions are simple, but they expose most weak deployments.

The answer should not live only in a meeting note. It should appear in the evaluation suite, the release checklist, the incident process, and the product experience. Users do not need to see the internal machinery, but they do need to feel its discipline.

Support That Resolves Itself is ultimately about replacing vague confidence with accountable practice. The point is not to slow teams down. The point is to make speed repeatable, explainable, and safe enough to build a business on.

Operating table

The Production Problem operating table

Area | What to inspect | Decision evidence
Scope | Define the exact customer support automation task before expanding coverage. | Durable resolution rate
Evidence | Require examples that represent real work, not only ideal demos. | Which cases can be resolved without lowering trust
Owner | Name the person responsible when tickets close faster but customers reopen the same issue later. | Keep the human for judgment, exceptions, and policy changes

Chapter notes

What to carry forward

  • Define the work as an operating responsibility.
  • Use durable resolution rate as the anchor metric.
  • Make this decision explicit: which cases can be resolved without lowering trust?
  • Keep the human for judgment, exceptions, and policy changes.
  • Define the boundary before expanding scope.
  • Treat the first example as a clue, not proof.