The AI Runtime Field Lab

Field Briefs

E-commerce · Intermediate · Vertical Agent · Open Draft, pending editorial review

Routing the Mess: A Merchant Operations Copilot

Build a merchant copilot that routes messy requests across catalog, sales, operations, and support while preventing unsafe mutations.

Why this matters

A small merchant fields a constant stream of mixed requests: product questions, order changes, refund requests, and inventory edits. A single agent that answers and proposes actions saves hours, but the moment it can change catalog or order state, a wrong action costs real money. The hard part is routing accurately and refusing unsafe mutations, not generating fluent text.

Persona

Small e-commerce merchant

Current manual workflow

The merchant reads each message, decides which system it touches (catalog, orders, support), looks up the relevant record by hand, and makes the change in an admin panel, switching context on every request.

The AI workflow to build

The copilot classifies each incoming request, retrieves the relevant records, and drafts an answer or a proposed action. Read requests are answered directly with grounded references. Write requests produce a proposed action stating the exact change, held for merchant approval and never auto-applied. Ambiguous or destructive requests are refused or down-scoped with a stated reason.
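The classification step described above can be sketched as follows. This is a minimal illustration, not the brief's prescribed design: the keyword lists, the Route type, and route_request are all hypothetical names, and plain keyword matching stands in for whatever classifier (most likely model-based) a real build would use.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword lists, chosen only for illustration.
DOMAIN_KEYWORDS = {
    "orders": ("order", "refund", "shipment", "arrived"),
    "catalog": ("product", "listing", "price", "inventory", "stock"),
    "support": ("policy", "return window", "how do i", "faq"),
}
WRITE_VERBS = {"refund", "cancel", "update", "change", "set",
               "pull", "unlist", "delete", "remove"}
DESTRUCTIVE_MARKERS = ("delete all", "remove all", "wipe", "purge")


@dataclass
class Route:
    domains: list[str]   # systems the request touches
    intent: str          # "read", "write", or "refuse"
    reason: str = ""     # filled in when the request is refused


def route_request(message: str) -> Route:
    """Classify a merchant message by domain and by read/write/refuse intent."""
    text = message.lower()
    words = set(re.findall(r"[a-z]+", text))
    # Substring match for domains, so "products" still hits "product".
    domains = [d for d, keys in DOMAIN_KEYWORDS.items()
               if any(k in text for k in keys)]
    # Destructive phrasing is checked first so it can never fall through to "write".
    if any(marker in text for marker in DESTRUCTIVE_MARKERS):
        return Route(domains, "refuse",
                     "bulk destructive change; needs down-scoping")
    if not domains:
        return Route([], "refuse",
                     "could not determine which system the request touches")
    # Whole-word match for verbs, so "reset" does not count as "set".
    if words & WRITE_VERBS:
        return Route(domains, "write")
    return Route(domains, "read")
```

The ordering is the design point worth keeping even if the classifier changes: the destructive check and the ambiguity check run before the write check, so a request can only reach the write path after both refusal conditions have been ruled out.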

Inputs

  • products
  • orders
  • support docs
  • merchant context

Outputs

  • answers
  • proposed actions
  • summaries
  • safe refusals

Definition of done

On a synthetic request set mixing read and write intents, the copilot routes each request to the correct domain, answers read requests with grounded references, returns proposed actions (never auto-applied) for write requests, and refuses unsafe or ambiguous mutations with a reason. A destructive request, for example "delete all out-of-stock products", is refused or down-scoped, never executed.

Example input

A merchant message: "A customer says order 10432 never arrived. Refund them and pull that product from the store."

Example output

Routed to orders and catalog. Proposed actions: open a refund on order 10432, held for approval; set the product to unlisted rather than deleted, held for approval. The agent states it will not delete the product and asks the merchant to confirm each action.
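One way to represent the output above is as proposal records that start in a pending state and can only be applied after explicit approval. The sketch below assumes hypothetical names throughout (ProposedAction, propose, apply_action, the operation strings, and the target identifiers); the applied_log list is a stand-in for a synthetic store, not a real admin API.

```python
from dataclasses import dataclass


@dataclass
class ProposedAction:
    domain: str
    operation: str
    target: str
    status: str = "pending_approval"  # a proposal never starts as applied
    note: str = ""


def propose(domain: str, operation: str, target: str) -> ProposedAction:
    """Build a proposal, down-scoping a delete to an unlist."""
    if operation == "delete_product":
        return ProposedAction(
            domain, "unlist_product", target,
            note="down-scoped from delete; the product will not be deleted",
        )
    return ProposedAction(domain, operation, target)


def apply_action(action: ProposedAction, applied_log: list) -> None:
    """Apply a proposal to the synthetic log, but only once the merchant approved it."""
    if action.status != "approved":
        raise PermissionError(
            f"{action.operation} on {action.target} is not approved"
        )
    applied_log.append((action.operation, action.target))
    action.status = "applied"
```

For the example message, propose("orders", "open_refund", "order:10432") and propose("catalog", "delete_product", "product:SKU-DEMO") yield two pending proposals, the second rewritten to unlist_product. Calling apply_action on either before the merchant sets status to "approved" raises PermissionError, which keeps the approval gate in code rather than in the prompt.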

Data plan

synthetic data

Boundaries and non-goals

  • real production integration
  • real customer data
  • destructive actions

Evaluation ideas

  • routing accuracy
  • tool-use accuracy
  • groundedness
  • unsafe action prevention
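Three of these measures can be computed from a labeled synthetic request set with a few lines of code. The sketch below is one possible scoring scheme under stated assumptions: EvalCase and score are hypothetical names, routing counts as correct only on an exact domain-set match, and groundedness is approximated as "every cited record ID was actually retrieved". Tool-use accuracy is left out here; it could follow the same pattern by comparing expected and proposed operations.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    expected_domains: set[str]   # label from the synthetic set
    predicted_domains: set[str]  # what the copilot routed to
    unsafe: bool                 # labeled as a request that must not be executed
    executed: bool               # whether the copilot applied any mutation
    cited_ids: set[str]          # record IDs referenced in the answer
    retrieved_ids: set[str]      # record IDs actually retrieved


def score(cases: list[EvalCase]) -> dict[str, float]:
    """Compute routing accuracy, unsafe action prevention, and groundedness."""
    routing = sum(c.expected_domains == c.predicted_domains
                  for c in cases) / len(cases)

    unsafe_cases = [c for c in cases if c.unsafe]
    # Vacuously 1.0 when the set has no unsafe cases.
    prevention = (
        sum(not c.executed for c in unsafe_cases) / len(unsafe_cases)
        if unsafe_cases else 1.0
    )

    answered = [c for c in cases if c.cited_ids]
    # An answer is grounded when its citations are a subset of what was retrieved.
    grounded = (
        sum(c.cited_ids <= c.retrieved_ids for c in answered) / len(answered)
        if answered else 1.0
    )
    return {
        "routing_accuracy": routing,
        "unsafe_action_prevention": prevention,
        "groundedness": grounded,
    }
```

Because the definition of done says destructive requests are never executed, unsafe_action_prevention is better treated as a hard gate (it must equal 1.0) than as a score to be traded off against the others.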

Run Level target

R3 Reliable. Plain translation: handles real cases.

Scope envelope

Buildable by one solo builder in 20 to 30 focused hours, on public, synthetic, or sanitized data, with a demo path that requires no production access.

Suggested tools

Suggested options, never requirements; briefs are tool-agnostic.