Why most pilots aren't really pilots.
"Pilot" is one of the most abused words in B2B software. Most of what gets called a pilot is actually a soft commitment with a marketing-friendly label — six months, fifteen users, a steering committee, three success criteria that get redefined in month four when the numbers don't show up. Then it doesn't end. It just becomes the rollout.
That's not a pilot. That's a slow procurement decision wearing a costume. A real pilot has a yes-or-no answer at the end, a small enough footprint that you can pull it without political damage, and a single clear criterion you commit to before the first rep opens the app.
"Pick one region, three to five reps, 90 days. If adoption isn't above 70% voluntarily, we end it. No expansion until you say it works."
We don't run pilots this way because we're allergic to bigger commitments. We run them this way because a six-month, fifty-user "pilot" doesn't test the thing a pilot needs to test: whether reps voluntarily adopt the tool. That answer is clearest with a small group, a short clock, and one number to watch.
The pilot shape that works.
Four constraints. Each one is doing a specific job.
One region.
Not "two reps from each of three regions." Not "a cross-functional slice." One region. One sales manager. One territory. The reason: when something works (or doesn't), you want to know whether it was the product, the manager, the team culture, or the territory. With one region, you can attribute. With three sliced regions, you can't.
Three to five reps.
The mix matters more than the count. Pick one top performer, one steady middle rep, and one struggler (and optionally one or two more, but no more than five total). The signal isn't whether your top rep liked it — top reps adopt almost anything reasonably built. The signal is the spread across all three skill tiers.
Ninety days.
Long enough for the novelty to wear off and adoption to become real. Short enough that nobody's career is riding on it. Long enough that you can see the pattern at day 60 and decide. Short enough that "let's give it another quarter" doesn't become the default.
One exit criterion.
Voluntary capture above 70% by day 60. That's the number. Below it, end the pilot. Above it, you've earned the conversation about expansion. Pick the number on day one and write it down somewhere your future self can see it.
You can't move one without weakening the others. Two regions doubles the cost of attribution. Eight reps stops being voluntary. Six months means the test becomes the rollout by default. Three exit criteria becomes zero exit criteria as soon as one of them slips. The shape is a system.
The weekly cadence.
The weekly rhythm of a real pilot is small and protected. Most weeks should feel boring. If your pilot has a lot going on, it's not a pilot — it's a project.
Week 1 — Setup.
Reps onboarded. Three voice notes from each rep by Friday. That's the entire week-one bar. Nothing else. No data review, no manager dashboard, no leadership check-in.
Weeks 2–4 — Voluntary rhythm.
The believer manager runs a 15-minute coaching conversation with each rep individually, anchored in the rep's own captured notes. No group meetings about the pilot. No "how's it going?" emails from leadership. Capture rate logged weekly, internally, not shared with the reps.
Week 6 — First honest read.
Pull capture rates. If you're below 50%, find out why before week 8. If you're above 60%, you're on track for the day-60 number. Don't share the metric with the reps yet — once it becomes a target, it becomes a mandate.
Day 60 — The gate.
Voluntary capture rate is what it is. At or above 70% means continue. Below 70% means end the pilot. If you continue, don't expand at day 60: finish the 90 days on the footprint you started with, so you can see how adoption settles when the novelty's truly gone.
Day 90 — The decision.
Voluntary capture rate stabilized? Reps using it on their own without prompting? At least one concrete account-level win a rep can point to? If yes to all three, you've earned the conversation about expansion. If any of the three is a no, the answer is no.
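The week-6 read and the day-60 gate above come down to two threshold checks. Here is a minimal sketch, assuming the capture rate is already available as a percentage. The function names and the "watch" label for the 50% to 60% band are hypothetical, and treating exactly 70% as a pass is also an assumption.

```python
GATE_PCT = 70.0           # day-60 exit criterion
WEEK6_INVESTIGATE = 50.0  # below this at week 6: find out why before week 8
WEEK6_ON_TRACK = 60.0     # above this at week 6: on track for day 60


def week6_read(capture_pct: float) -> str:
    """First honest read at week 6."""
    if capture_pct < WEEK6_INVESTIGATE:
        return "investigate"
    if capture_pct > WEEK6_ON_TRACK:
        return "on track"
    # The 50-60% band is not spelled out in the text; "watch" is an assumption.
    return "watch"


def day60_gate(capture_pct: float, gate_pct: float = GATE_PCT) -> str:
    """Day-60 gate: continue to day 90 (no expansion) or end the pilot."""
    # Assumption: landing exactly on the gate counts as a pass.
    return "continue" if capture_pct >= gate_pct else "end"
```

Keeping the thresholds as named constants makes it harder to quietly move them in month two, which is the point of writing the number down on day one.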
The one number on day 60.
The temptation in any pilot is to track twelve metrics so you can find one that looks good. Don't. You're tracking one metric: voluntary capture rate — the percentage of customer visits in the pilot region where the rep left a voice note within 24 hours, without being prompted.
The reason it's voluntary capture and not visit count is that visit count is easy to game and tells you nothing about whether the tool fits. A rep who logs 30 stops a week and captures none of them is failing the pilot. A rep who logs 15 stops and captures 13 is passing it — with better data than the high-visit rep would've ever produced.
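The definition above can be turned into a small calculation. The sketch below is illustrative only: the `Visit` record, its field names, and the rule that a note timestamped before the visit does not count are all assumptions, not part of any real export format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

CAPTURE_WINDOW = timedelta(hours=24)  # "within 24 hours" from the definition


@dataclass
class Visit:
    """One customer visit. Field names are hypothetical."""
    rep: str
    visited_at: datetime
    note_at: Optional[datetime] = None  # when the voice note was left, if ever
    prompted: bool = False              # True if someone nudged the rep to capture


def is_voluntary_capture(visit: Visit) -> bool:
    """A visit counts only if an unprompted note landed inside the window."""
    if visit.note_at is None or visit.prompted:
        return False
    delay = visit.note_at - visit.visited_at
    # Assumption: a note timestamped before the visit does not count.
    return timedelta(0) <= delay <= CAPTURE_WINDOW


def voluntary_capture_rate(visits: Iterable[Visit]) -> float:
    """Percentage of visits with a voluntary capture; 0.0 when there are no visits."""
    visits = list(visits)
    if not visits:
        return 0.0
    captured = sum(is_voluntary_capture(v) for v in visits)
    return 100.0 * captured / len(visits)
```

With the two reps from the paragraph above, 30 visits with no notes comes out at 0%, and 13 captures across 15 visits comes out at about 87%.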
There's a second number worth watching, but only as a tiebreaker if day 60 lands close to the 70% line: follow-up close rate. Of the follow-ups reps committed to in weeks 1–4, what share actually closed by week 8? If that share is climbing, the tool's helping the rep, even if their raw capture rate is borderline. If it's flat or dropping, the capture isn't translating into action and the pilot is hollow.
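The tiebreaker can be sketched the same way. This assumes each follow-up is recorded with the pilot week it was committed and the week it was closed. The `FollowUp` record and the 2-point tolerance used to call a change "flat" are hypothetical placeholders, not recommended values.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FollowUp:
    """A follow-up a rep committed to. Field names are hypothetical."""
    committed_week: int                # pilot week the commitment was made
    closed_week: Optional[int] = None  # pilot week it was closed, None if still open


def follow_up_close_rate(follow_ups: Iterable[FollowUp],
                         committed_through: int = 4,
                         closed_by: int = 8) -> float:
    """Percent of follow-ups committed in weeks 1..committed_through that closed by closed_by."""
    cohort = [f for f in follow_ups if 1 <= f.committed_week <= committed_through]
    if not cohort:
        return 0.0
    closed = sum(
        1 for f in cohort
        if f.closed_week is not None and f.closed_week <= closed_by
    )
    return 100.0 * closed / len(cohort)


def trend(previous_pct: float, current_pct: float, tolerance: float = 2.0) -> str:
    """Label the change between two readings as "up", "flat", or "down".

    The tolerance is an arbitrary placeholder for "close enough to unchanged".
    """
    change = current_pct - previous_pct
    if change > tolerance:
        return "up"
    if change < -tolerance:
        return "down"
    return "flat"
```

Comparing successive weekly readings with `trend` gives the up / flat / down label, which only matters when the capture rate itself is too close to the line to call.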
Day 90: expand or stop.
At day 90, three outcomes are possible. Each has a clear next step.
Outcome A — Clean pass.
Voluntary capture 80%+, follow-up close rate trending up, at least one rep volunteering that they'd be annoyed if the tool got taken away. Expand to the next region. Same shape: one region, three to five reps, 90 days, 70% gate. You're not rolling out — you're running another pilot. The second pilot's job is to confirm the first one wasn't a fluke.
Outcome B — Borderline.
Capture between 60% and 75%, mixed signal on follow-up close, reps neutral on whether they'd miss the tool. Don't expand. Don't end. Run another 90 days on the same footprint and see if the floor moves. If it does, expand on day 180. If it doesn't, end on day 180.
Outcome C — Real fail.
Capture below 60% sustained, no rep-side advocacy, the believer manager has stopped opening Voze Web. End the pilot. Don't extend. Don't expand "just a little." Don't reframe the criteria. The pilot just gave you the most expensive answer you'll ever get for cheap — that this tool doesn't fit this team — and the worst thing you can do is keep paying to discover it more slowly.
Outcome B looks like a win because you're not at zero, but it's the most dangerous one. A 65% capture rate at day 90 that climbs to 72% by day 180 is real. A 65% rate that stays flat means 65% is your team's ceiling, and you're about to roll out a tool your reps merely tolerate. End it.
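The three outcomes can be written as one decision function. This is a sketch under stated assumptions, not a scoring rule: the bands above leave 75% to 80% unassigned, so the code treats anything at or above 60% that is not a clean pass as borderline, and it calls Outcome C on capture rate alone. The function and argument names are hypothetical.

```python
def day90_outcome(capture_pct: float, close_rate_trend: str, rep_advocacy: bool) -> str:
    """Return "A", "B", or "C" for the day-90 decision.

    close_rate_trend is "up", "flat", or "down". rep_advocacy is True when at
    least one rep says they would be annoyed to lose the tool.
    """
    if capture_pct >= 80.0 and close_rate_trend == "up" and rep_advocacy:
        return "A"  # clean pass
    if capture_pct < 60.0:
        # Assumption: capture below 60% is a fail regardless of the other signals.
        return "C"
    # Assumption: everything else, including the unassigned 75-80% band and
    # high capture without the other two signals, is treated as borderline.
    return "B"


NEXT_STEP = {
    "A": "run a second pilot in the next region",
    "B": "extend 90 days on the same footprint",
    "C": "end the pilot and write up what you learned",
}
```

Defaulting the ambiguous cases to B is deliberately conservative: it extends the pilot before it extends the footprint.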
The pilot template.
Copy this into whatever doc your team uses for sales projects. The whole pilot fits on one page on purpose.
Setup (Week 0)
- Pilot region: ___ (one)
- Believer manager: ___ (name)
- Reps: ___ (1 top, 1 middle, 1 struggler — up to 5 total)
- Start date: ___
- Day 60 gate date: ___
- Day 90 decision date: ___
- Exit criterion (voluntary capture rate at day 60): ___ % (default 70%)
- Tiebreaker (follow-up close rate trend): up / flat / down
Weekly check-in (15 minutes, manager only)
- Capture rate this week vs. last
- One coaching question per rep, anchored in their own notes
- One pattern from the territory feed
- Anything that smells like the pilot becoming a mandate — kill it
Day 60 gate
- Pull the voluntary capture rate
- ≥70% → continue to day 90, no expansion
- <70% → end the pilot
Day 90 decision
- Capture rate, follow-up close rate trend, rep advocacy → Outcome A / B / C
- A → run a second pilot in the next region
- B → extend 90 days, same footprint
- C → end, write up what you learned, move on
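The date blanks in the template follow from the start date. Here is a minimal sketch, assuming the start date counts as day 0, which is a convention this playbook does not specify. The function name and dictionary keys are hypothetical.

```python
from datetime import date, timedelta


def pilot_dates(start: date) -> dict:
    """Gate and decision dates, counting the start date as day 0 (an assumption)."""
    return {
        "start": start,
        "day_60_gate": start + timedelta(days=60),
        "day_90_decision": start + timedelta(days=90),
    }
```

For a pilot starting 6 January 2025, this gives a gate on 7 March 2025 and a decision on 6 April 2025. Put both on the calendar before week 1.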
A real pilot is small, time-boxed, and has one number.
- One region, three to five reps, 90 days, voluntary capture above 70% by day 60 — that's the shape.
- Don't track twelve metrics. Track voluntary capture rate; use follow-up close rate as a tiebreaker.
- Don't expand at day 60. Hold the footprint through day 90 so you see how adoption settles past the novelty window.
- Outcome B (borderline) is the dangerous one. Extend the pilot before you extend the footprint.
- If the pilot fails, end it. The data you got is worth more than the rollout you would've done anyway.