Pricing Software That Thinks

Price around the work your software performs, named the way the customer already names it, not around the model calls underneath.

Research spine

This chapter is grounded in OpenAI API pricing, Anthropic prompt caching documentation, and the Stanford HAI 2025 AI Index Report.

Pricing software that thinks starts by pricing units of resolved work against the variable costs of inference, review, and risk.

A founder I worked with had built a contract-review product. The AI read inbound vendor agreements, flagged risky clauses, suggested redlines, and produced a summary memo the legal team could act on. Genuinely useful. The first pricing draft she showed me priced it "per 1,000 tokens of analysis," with a credit system on top that converted dollars into "AI credits" at a rate she could change later.

I asked her one question: when a customer's general counsel describes what your product did this month, what word do they use? She did not say tokens. She did not say credits. She said: "We reviewed forty-three contracts." That is the work unit. Forty-three reviewed contracts. Everything else, the tokens, the retrieval, the credits, is plumbing the customer should never have to think about, and pricing on the plumbing was her first mistake.

This chapter is about finding that word.

What a work unit is

A work unit is the smallest piece of completed, recognizable work that your customer would naturally count. It has three properties, and all three matter:

The customer already counts it. They have a word for it that predates your product. Reviewed contracts. Resolved tickets. Reconciled invoices. Generated reports. Qualified leads. Transcribed hours. You are not teaching them a new unit; you are pricing on one they already use to run their business.
It corresponds to value they can feel. One unit of it is worth something concrete to them: an hour of a paralegal's time, a support interaction handled, a month-end close that goes faster. They can put a rough dollar figure on it even if they have never tried.
It correlates with your cost. More units means more work performed by your software, which means more tokens, retrieval, and review. The correlation does not have to be perfect, but a work unit that has nothing to do with your cost will eventually betray you the way the seat did.

When all three hold, you have a unit you can build pricing on that is legible to the customer, aligned with their value, and connected to your cost. That alignment is the entire game. This is the Work Unit Model, and it is the foundation under almost every good AI pricing decision in the book.

Figure: The Work Unit Model. Many model calls and plumbing roll up into a single recognizable work unit the customer counts.
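A minimal sketch of that roll-up, assuming each model or retrieval call is logged with the work unit it served. The ModelCall record, its field names, and the dollar figures are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCall:
    """One piece of plumbing: a single model or retrieval call (hypothetical record)."""
    work_unit_id: str  # the contract this call was made for
    cost_usd: float    # metered cost of this call (illustrative)


def cost_per_work_unit(calls: list[ModelCall]) -> dict[str, float]:
    """Sum the cost of every call under the work unit it served."""
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[call.work_unit_id] += call.cost_usd
    return dict(totals)


calls = [
    ModelCall("contract-001", 0.25),
    ModelCall("contract-001", 0.5),
    ModelCall("contract-002", 0.125),
]
unit_costs = cost_per_work_unit(calls)
print(len(unit_costs), "reviewed contracts")  # the number the customer counts
print(unit_costs)                             # the number you track internally
```

The customer sees the count of keys; the per-key totals stay on your side of the bill and feed the cost analysis later in the worksheet.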

Why the unit you pick decides everything downstream

The work unit is not a packaging detail. It is the choice that constrains every other choice. Pick well and the rest of pricing gets easier; pick badly and you will spend years patching symptoms.

Consider what the unit determines:

Whether the customer can predict their bill. "We expect to review about fifty contracts a month" is a forecast a customer can make. "We expect to consume about 14 million tokens" is not. The first lets procurement budget; the second produces the bill shock we deal with later.
Whether your price tracks your cost. A unit correlated with work performed moves with your COGS. A unit decoupled from work, like the seat, drifts away from cost until margin breaks.
Whether you can run the Adoption Penalty Test cleanly. When the unit is recognizable, you can reason about whether charging per unit rewards or punishes adoption, because the customer understands what each unit is and what it is worth.
What your sales motion sounds like. "Buy 500 reviewed contracts a month" is a conversation a salesperson and a buyer can have. It maps to the buyer's world. Token bundles do not.

Intercom understood this when it priced its Fin AI agent at $0.99 per resolution rather than per message or per token, as documented in their outcome and resolution model. A resolution is a work unit a support leader already counts and values. It is not a coincidence that this became one of the most-cited AI pricing models of the era; it is a textbook work unit. We will scrutinize whether resolution is the right unit, and the verification problems it creates, in the outcome chapter. For now, notice the move: they priced on the thing the customer recognizes.

Finding your work unit: a discovery worksheet

Finding the unit is fieldwork, not a whiteboard exercise. Here is the worksheet I run with teams. Do it with real customers, not with the product team guessing.

Step 1: Capture the customer's own language. Sit in on three to five customer calls or read support transcripts. Write down, verbatim, how customers describe what your product did for them. Do not translate it into your internal vocabulary. You are looking for the noun they reach for. "It handled X." "It produced Y." "We got through Z of them."

Step 2: List candidate units. From that language, list every countable noun. A contract-review product might surface: contracts reviewed, clauses flagged, redlines suggested, memos produced, hours saved. A support product might surface: tickets resolved, conversations handled, articles drafted, escalations avoided.

Step 3: Score each candidate against the three properties. For each candidate, rate one to five on: does the customer count it, does it map to value, does it correlate with cost. A simple table:

Candidate unit	Customer counts it	Maps to value	Correlates with cost	Total
Contracts reviewed	5	5	4	14
Clauses flagged	2	3	5	10
Memos produced	4	4	4	12
Tokens consumed	1	1	5	7

Tokens always win the cost column and lose everything else. That is the tell that cost-correlation alone is a bad basis for a unit. You want the candidate with the best total, weighted toward customer recognition and value, because a unit the customer does not understand will not survive a renewal no matter how well it tracks your cost.
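One way to make "weighted toward customer recognition and value" concrete is to apply explicit weights to the table above. This is a sketch, not a prescribed formula: the scores are copied from the table, and the 2-2-1 weights are an assumption you should tune to your own judgment.

```python
# Scores from the table above, in column order:
# (customer counts it, maps to value, correlates with cost)
CANDIDATES = {
    "Contracts reviewed": (5, 5, 4),
    "Clauses flagged": (2, 3, 5),
    "Memos produced": (4, 4, 4),
    "Tokens consumed": (1, 1, 5),
}

# Assumed weights: recognition and value count double, cost counts once.
WEIGHTS = (2, 2, 1)


def weighted_total(scores, weights=WEIGHTS):
    """Weighted sum of the three property scores."""
    return sum(score * weight for score, weight in zip(scores, weights))


ranked = sorted(
    CANDIDATES,
    key=lambda name: weighted_total(CANDIDATES[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {weighted_total(CANDIDATES[name])}")
```

With these weights the ranking matches the unweighted table, but the gap between the leader and the token-based unit widens, which is the point of weighting.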

Step 4: Pressure-test the leader against edge cases. Take your top candidate and ask the hard questions. What counts as a unit when the work is partial? When the contract is two pages versus two hundred? When the customer uploads the same contract twice? When the AI does the work but the human rejects the output? These edge cases are where work-unit pricing either earns trust or loses it. You need crisp, documented, fair answers before you ship, not after the first dispute.
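Writing the answers down as an explicit counting rule is one way to keep them crisp. The sketch below encodes one illustrative policy, not a recommended one: partial reviews are not billed, a contract uploaded twice is billed once, and whether rejected output is billable is a switch. The Review record and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    content_hash: str        # fingerprint of the uploaded contract (hypothetical field)
    completed: bool          # False if the review stopped partway
    rejected_by_human: bool  # True if the reviewer discarded the output


def billable_units(reviews, bill_rejected=True):
    """Count billable reviews under one illustrative policy.

    - partial (incomplete) reviews are not billed
    - the same contract uploaded twice is billed once
    - rejected output is billed only if bill_rejected is True
    """
    seen = set()
    count = 0
    for review in reviews:
        if not review.completed:
            continue
        if review.rejected_by_human and not bill_rejected:
            continue
        if review.content_hash in seen:
            continue
        seen.add(review.content_hash)
        count += 1
    return count


reviews = [
    Review("a", completed=True, rejected_by_human=False),
    Review("a", completed=True, rejected_by_human=False),  # duplicate upload
    Review("b", completed=False, rejected_by_human=False),  # partial
    Review("c", completed=True, rejected_by_human=True),    # rejected output
]
print(billable_units(reviews))                       # policy: rejected output is billed
print(billable_units(reviews, bill_rejected=False))  # policy: rejected output is free
```

Whatever rules you choose, the value of writing them this way is that the definition of a unit in the contract and the definition in the billing system can be checked against each other.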

Step 5: Check the cost variance within a unit. This is the step teams skip and regret. Your work unit will not have a constant cost. A two-page contract and a two-hundred-page contract are both "one reviewed contract" but cost you very different amounts. You need to know the distribution. If the cost per unit varies by 2 to 1, a flat per-unit price is fine; the variance averages out across a customer's volume. If it varies by 50 to 1, a flat price is dangerous, because a customer whose mix skews toward the expensive end will torch your margin while paying the same as everyone else. The margin chapter goes deep on this; here, just measure it.
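A rough sketch of the measurement, assuming you already have a cost figure for each completed unit. The 2-to-1 and 50-to-1 thresholds come from the paragraph above; treating everything in between as a judgment call is an assumption of this sketch, and a max-to-min ratio is a blunt instrument that a percentile-based spread would improve on for noisy data.

```python
def cost_spread(unit_costs):
    """Ratio of the most expensive unit to the cheapest one."""
    if not unit_costs or min(unit_costs) <= 0:
        raise ValueError("need at least one unit, all with positive cost")
    return max(unit_costs) / min(unit_costs)


def flat_price_verdict(spread):
    """Map a spread to a rough verdict on flat per-unit pricing."""
    if spread <= 2:
        return "flat price looks fine"
    if spread >= 50:
        return "flat price is dangerous"
    return "in between: look at the customer mix"


# Illustrative per-contract costs in dollars, not real data.
short_contracts = [0.5, 0.75, 1.0]
mixed_contracts = [0.25, 0.5, 12.5]

for costs in (short_contracts, mixed_contracts):
    spread = cost_spread(costs)
    print(f"{spread:.0f} to 1: {flat_price_verdict(spread)}")
```

Run it per customer as well as across the whole base, since the risk described above is a single customer whose mix skews expensive.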

When the obvious unit is the wrong unit

The first candidate is not always right. A few patterns where the obvious unit fails:

The unit the customer counts but does not value uniformly. "Messages sent" is countable but a customer does not value a clarifying message the same as a resolved problem. Pricing per message means you get paid the same for noise as for value, and customers feel it. This is why per-message support pricing aged badly and per-resolution pricing aged well.

The unit that invites gaming. If you price per "report generated" and reports are cheap to generate, a customer optimizing their bill might generate fewer, fatter reports, or you might be tempted to encourage more, thinner ones. Any unit where the vendor's and customer's incentives diverge on how to chop the work into units will generate friction. Pick a unit where chopping it finer or coarser does not change anyone's behavior much.

The unit with runaway cost variance. Sometimes the unit the customer recognizes has 50-to-1 internal cost variance and no way to bucket it. "One research report" might be a five-minute lookup or a four-hour deep agentic run. If you cannot tier the unit (standard report versus deep report) so that each tier has manageable variance, you may need a hybrid: a work-unit price for the predictable majority plus a metered component for the heavy tail. That is not a failure of the model; it is the model telling you the work is genuinely heterogeneous.
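The hybrid can be sketched as a flat price per unit plus a metered charge that only applies past an allowance. Metering on run minutes, and every number below, is an assumption made for illustration; the meter could just as well be pages, sources consulted, or another driver of your cost.

```python
def hybrid_charge(run_minutes, unit_price, included_minutes, overage_per_minute):
    """Flat price per report, plus a metered charge for minutes beyond the allowance."""
    extra_minutes = max(0, run_minutes - included_minutes)
    return unit_price + extra_minutes * overage_per_minute


# Illustrative terms: $20 per report, 30 minutes included, $0.50 per extra minute.
quick_lookup = hybrid_charge(5, 20.0, 30, 0.5)
deep_run = hybrid_charge(240, 20.0, 30, 0.5)
print(quick_lookup)  # the predictable majority pays only the unit price
print(deep_run)      # the heavy tail pays for the extra work it consumed
```

Most customers never see the metered line, so the bill still reads in work units, while the four-hour run no longer costs the same as the five-minute one.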

Naming the unit, because naming matters

Once you have the unit, name it in the customer's language, not yours. This sounds trivial and is not. The credits chapter treats naming in depth, but the principle starts here: a customer who buys "reviewed contracts" understands what they bought and can check the bill against their own records. A customer who buys "AI credits" that convert to contracts at a rate you control is one rate change away from feeling deceived. Every layer of abstraction between the price and the work is a layer where trust leaks out.

So my contract-review founder's pricing became: plans sized in reviewed contracts per month, with a clear definition of what counts as a reviewed contract, tiers for standard versus complex contracts to handle cost variance, and overage priced per additional contract. No tokens on the page. No floating credits. The general counsel reads the bill and sees the same number they would say out loud: "We reviewed forty-three contracts." The plumbing is hidden, the value is legible, and the price moves with both value and cost.
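A sketch of how a bill under that structure might be computed. The plan shape is an assumption: it gives each tier its own monthly allowance and its own overage price, which is only one of several reasonable ways to combine tiers with a plan, and all prices are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    monthly_fee: float          # covers the included contracts
    included: dict[str, int]    # contracts included per tier
    overage: dict[str, float]   # price per additional contract, per tier


def monthly_bill(plan: Plan, reviewed: dict[str, int]) -> float:
    """Plan fee plus per-contract overage for each tier that exceeds its allowance."""
    total = plan.monthly_fee
    for tier, count in reviewed.items():
        extra = max(0, count - plan.included.get(tier, 0))
        total += extra * plan.overage[tier]
    return total


# Hypothetical plan and usage figures.
plan = Plan(
    monthly_fee=2000.0,
    included={"standard": 40, "complex": 5},
    overage={"standard": 40.0, "complex": 150.0},
)
within_plan = monthly_bill(plan, {"standard": 38, "complex": 5})   # 43 contracts
over_plan = monthly_bill(plan, {"standard": 45, "complex": 6})
print(within_plan, over_plan)
```

Every input to the bill is a count of reviewed contracts, so the customer can reconcile it against their own records without knowing anything about tokens.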

The honest limits

The Work Unit Model is the foundation, but it is not a universal solvent, and I will not pretend it is.

Some products do not have a clean work unit. A general-purpose copilot that helps with "whatever you are doing" resists a single unit because the work is genuinely heterogeneous; the copilot chapter handles that case specifically. Some work units are real but cannot be verified cheaply enough to bill on, which pushes you toward usage as a proxy or toward outcome pricing with attestation; those are later chapters too. And some customers, especially in early enterprise deals, will demand a seat or platform fee regardless, because that is what their procurement understands. In that case you build a hybrid with a platform component and a work-unit component.

The point of the model is not that every product prices on one perfect unit. The point is that you should always know what your work unit is, even if you do not price purely on it, because it is the thing that connects what the customer values to what you cost. Lose sight of it and you are back to billing plumbing.

Practical exercise

Take your product to the worksheet right now. Write down, in five minutes, the noun a customer used on your last call to describe what the product did. Then score your top three candidate units on the three-property table. If your highest-scoring unit is something other than what you currently charge for, you have found the most important pricing problem in your business, and the next several chapters are about how to act on it.

Key Takeaways

A work unit is the smallest piece of completed work the customer already counts, that maps to value, and that correlates with your cost.
The unit you pick determines predictability, margin alignment, the Adoption Penalty result, and what your sales motion sounds like.
Find the unit through fieldwork: capture the customer's own language, list candidates, score them, pressure-test edges, and measure within-unit cost variance.
Reject units the customer counts but does not value uniformly, units that invite gaming, and units with runaway cost variance unless you can tier them.
Name the unit in the customer's language; every layer of abstraction between price and work leaks trust.
Even when you do not price purely on the work unit, always know what it is, because it links customer value to your cost.

The Work Unit: The Thing the Customer Recognizes