What an AI-First PRD Actually Looks Like

The single most useful thing we did with cohort 06 was a closed-door PRD review. Three days, fourteen PRDs, four senior PMs from Razorpay, Google, and LinkedIn in the room. Same scoring rubric. Same brutal feedback. No softening it for the students.

Here is what I learned, both as a mentor and as someone who has graded hundreds of PRDs in interview loops over the last six years.

The three things every PRD that made it through had

I will not bury the lede. The PRDs that scored highest shared three traits, and they are the same three traits I look for when I am hiring an AI PM today.

1. A crisp non-deterministic spec

Traditional PRDs assume the feature works the same way every time. AI features don't. The PRDs that got the highest marks treated this head-on: they specified what "good output" meant numerically, listed the edge cases the model would fail on, and pre-committed to an eval that the team could run on every release.

One graduate wrote a brilliant section called "What we accept when the model is wrong" — three bullet points specifying refusal behavior, fallback paths, and a human-review queue trigger. The Razorpay reviewer said it was the cleanest framing of AI failure he had seen all year.
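To make the three bullets concrete, here is a minimal sketch of how refusal, fallback, and human-review routing could be written down as a decision rule. It is not the graduate's actual spec: the `ModelResult` fields, the `decide` function, and the 0.8 confidence cutoff are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ANSWER = "answer"              # show the model output to the user
    REFUSE = "refuse"              # decline politely, explain why
    FALLBACK = "fallback"          # use the non-AI path (e.g. rules, search)
    HUMAN_REVIEW = "human_review"  # queue for a person before responding


@dataclass
class ModelResult:
    text: str
    confidence: float   # hypothetical 0..1 score from the model or a grader
    in_scope: bool      # did the request fall inside the feature's remit?
    high_stakes: bool   # hypothetical flag, e.g. the answer touches money


def decide(result: ModelResult, min_confidence: float = 0.8) -> Action:
    """Map a model result to one of the pre-agreed behaviors."""
    if not result.in_scope:
        return Action.REFUSE
    if result.confidence >= min_confidence:
        return Action.ANSWER
    # Low confidence: high-stakes requests go to a person, the rest fall back.
    if result.high_stakes:
        return Action.HUMAN_REVIEW
    return Action.FALLBACK
```

The value of writing it this way is that every branch is something a reviewer can argue with: the cutoff, the order of the checks, and what counts as high stakes are all product decisions, not model behavior.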

2. A real eval, not vibes

Most PRDs written early in the program describe evaluation as "we'll test it." The strong ones come back to the cohort review with numbers: faithfulness ≥ 0.92, p95 latency under 1.8 seconds, refusal-quality 0.88 against a 200-question adversarial set.

You can write a beautiful product vision, but if you cannot tell me how you'll know the model is doing its job, I won't ship it. — Senior PM, Razorpay (during the mock review)
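Numbers like these only help if they gate the release. The sketch below shows one way to turn them into a ship/no-ship check. It assumes the faithfulness and refusal-quality scores are already computed by your eval harness, and it treats 0.88 as a minimum, which is my reading of the example rather than something the PRD stated. The function names and the nearest-rank percentile method are illustrative choices.

```python
from typing import Dict, Sequence

# Targets taken from the example above; treating 0.88 as a floor is an assumption.
MIN_FAITHFULNESS = 0.92
MIN_REFUSAL_QUALITY = 0.88
MAX_P95_LATENCY_S = 1.8


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile, using integer math to avoid float rounding."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def ship_gate(
    faithfulness: float,
    refusal_quality: float,
    latencies_s: Sequence[float],
) -> Dict[str, bool]:
    """Return a per-metric pass/fail map plus an overall 'ship' verdict."""
    checks = {
        "faithfulness": faithfulness >= MIN_FAITHFULNESS,
        "refusal_quality": refusal_quality >= MIN_REFUSAL_QUALITY,
        "p95_latency": p95(latencies_s) < MAX_P95_LATENCY_S,
    }
    checks["ship"] = all(checks.values())
    return checks
```

Run on every release candidate, a check like this replaces "it felt fine in the demo" with a failing line item the team has to explain.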

3. A clear hand-off to engineering

The strongest PRDs included a "what an eng leaning into this needs from me" section. Sample prompts. Eval datasets. Vendor recommendations (with cost-per-token sanity checks). The eng-PM contract, written down.

An eng/PM handoff diagram from one of the cohort 06 PRDs. Used here with permission.
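The cost-per-token sanity check is simple arithmetic, and it is worth doing before the vendor conversation rather than after. A rough sketch follows, assuming the vendor prices input and output tokens separately per 1,000 tokens. The function name, the parameters, and any prices you plug in are placeholders, not figures from a real vendor.

```python
def monthly_cost_usd(
    requests_per_day: int,
    input_tokens: int,
    output_tokens: int,
    usd_per_1k_input: float,
    usd_per_1k_output: float,
    days: int = 30,
) -> float:
    """Estimate monthly spend from average tokens per request.

    All prices are placeholders; read the real ones off the vendor's price sheet.
    """
    per_request = (
        input_tokens / 1000 * usd_per_1k_input
        + output_tokens / 1000 * usd_per_1k_output
    )
    return per_request * requests_per_day * days


# Hypothetical traffic and prices, purely to show the call shape.
estimate = monthly_cost_usd(
    requests_per_day=1000,
    input_tokens=2000,
    output_tokens=500,
    usd_per_1k_input=0.001,
    usd_per_1k_output=0.002,
)
print(f"Estimated monthly cost: ${estimate:,.2f}")
```

If the estimate moves by an order of magnitude when you double the prompt length, that belongs in the PRD as an open question for engineering.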

The surprising omission

Every senior PM in the room noted the same gap, independently, across more than half the PRDs reviewed: no rollback plan.

If your AI feature ships, gets pulled into 8% of customer flows, and quietly starts hallucinating SKU codes a week in — what's your plan? The best PRDs answered this in two paragraphs. The weakest didn't mention it at all.

The fix is mechanical. We added a "Rollback & reversibility" section to the EdWagon PRD template. Three questions:

What user-visible behavior tells us the feature is misbehaving?
What's the toggle that turns it off, and who has the keys?
What's the recovery cost, in user-trust and engineering hours?
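The first two questions map naturally onto a kill switch with a named owner and a misbehavior signal that can trip it. Below is a minimal sketch, assuming you can label each response as bad or not (for example, an invalid SKU code). The `KillSwitch` class, its thresholds, and the `on-call-pm` owner are hypothetical, and a real system would likely sit on top of an existing feature-flag service.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class KillSwitch:
    owner: str                  # who has the keys
    max_bad_rate: float = 0.02  # hypothetical tolerance for bad outputs
    window: int = 500           # how many recent responses to consider
    min_samples: int = 50       # avoid tripping on a handful of requests
    enabled: bool = True
    _recent: deque = field(default_factory=deque, repr=False)

    def bad_rate(self) -> float:
        """Share of bad responses in the current window."""
        return sum(self._recent) / len(self._recent) if self._recent else 0.0

    def record(self, is_bad: bool) -> None:
        """Log one response and turn the feature off if the signal trips."""
        self._recent.append(is_bad)
        if len(self._recent) > self.window:
            self._recent.popleft()
        if len(self._recent) >= self.min_samples and self.bad_rate() > self.max_bad_rate:
            # Stays off until the owner turns it back on deliberately.
            self.enabled = False


switch = KillSwitch(owner="on-call-pm")
switch.record(is_bad=False)
print(switch.enabled, switch.owner)
```

Note that the switch never re-enables itself. Turning the feature back on is a human decision, which is where the third question, recovery cost, comes in.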

The template we use, in code

Here's the skeleton — copy it and use it as your starting point. It's intentionally short. PRDs that take more than four pages get glossed over.

# Feature Name
1-line problem statement.
1-line target outcome.

## User & Use
Who, when, what they're trying to accomplish.

## Behavior (the deterministic bits)
Inputs, outputs, edge cases that have one right answer.

## Behavior (the AI bits)
- What "good" means, quantified.
- What "wrong" means, and what we do about it.
- Refusal & fallback paths.

## Eval
- Faithfulness, latency, refusal-quality targets.
- Adversarial set link.
- Pass/fail thresholds for ship/no-ship.

## Rollback & reversibility
- Misbehavior signals.
- Kill-switch owner.
- Recovery cost.

## Eng handoff
- Sample prompts & eval data.
- Vendor & cost-per-token.
- Open questions for the team.
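
If you want to hold drafts to the skeleton above, a small check like the following can flag missing sections before a human review. This is an illustrative sketch rather than tooling we ship: the `missing_sections` helper is a hypothetical name, and it only looks for the second-level headings listed in the template.

```python
import re
from typing import List

# Section headings copied from the template above.
REQUIRED_SECTIONS = [
    "User & Use",
    "Behavior (the deterministic bits)",
    "Behavior (the AI bits)",
    "Eval",
    "Rollback & reversibility",
    "Eng handoff",
]


def missing_sections(prd_markdown: str) -> List[str]:
    """Return the template sections that have no matching '## ' heading."""
    headings = {
        match.group(1).strip()
        for match in re.finditer(r"^##\s+(.+)$", prd_markdown, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in headings]


draft = """# Smart refunds
## User & Use
## Eval
"""
print(missing_sections(draft))
```

A check like this says nothing about quality. It only catches the draft that skipped rollback entirely.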

If you're prepping for an AI PM loop, write three PRDs against this template before the call. Read them out loud. Cut the fat. Get one senior person to grade you against the rubric.

I'll keep posting field notes from the cohort here — including the actual review rubric we use and the offer letters our graduates have signed. If you're applying to the next cohort, the template above is the same one we send the first week.

