Last quarter, my team shipped a feature that scored a 47 on RICE. For context, our threshold for greenlighting work was 30. This thing sailed through prioritization review like it had diplomatic immunity.
Six weeks after launch, we killed it. Retention in the cohort that adopted it dropped 11% compared to the control. Support tickets tripled for that user segment. The feature technically did what it promised — it just made the product worse.
Here's the uncomfortable part: the RICE score wasn't wrong. Every input was defensible. The score just didn't ask the right questions.
How a 47 Becomes a Disaster
The feature was an automated weekly digest email for our B2B dashboard product. The math looked gorgeous:
Reach: 100% of active accounts. Every customer had an email address. Easy.
Impact: Medium-high. We had survey data showing users wanted "better visibility into their metrics without logging in."
Confidence: 80%. We'd seen digest emails work at three competitor products.
Effort: Two engineers, three weeks. Straightforward.
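For reference, the standard RICE arithmetic is reach × impact × confidence ÷ effort. Here's a minimal sketch of that calculation — the input numbers below are illustrative stand-ins chosen to land near our 47, not the actual values from our spreadsheet:

```python
def rice_score(reach, impact, confidence, effort):
    """Standard RICE: (reach * impact * confidence) / effort.

    reach      -- users or accounts affected per time period
    impact     -- typically 0.25 (minimal) to 3 (massive)
    confidence -- 0.0 to 1.0
    effort     -- person-months
    """
    return (reach * impact * confidence) / effort

# Hypothetical inputs in the spirit of the digest feature:
# medium-high impact, 80% confidence, 2 engineers x 3 weeks (~1.5 person-months).
score = rice_score(reach=44, impact=2.0, confidence=0.8, effort=1.5)
print(round(score))  # -> 47, comfortably above a greenlight threshold of 30
```

Note what the function signature admits: four numbers in, one number out. Nothing in it asks whether the reach wants the thing.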
What RICE didn't capture: our users were operations managers who already suffered from notification fatigue. Their inboxes were graveyards of unread alerts from Datadog, PagerDuty, Jira, Slack, and half a dozen internal tools. They didn't want another email. They wanted the dashboard to load faster so they could check it during their 7am standup and move on.
"Better visibility without logging in" meant "make the login-to-insight path take 15 seconds instead of 45," not "send me a PDF every Monday." We had the verbatim survey responses in front of us. We could have caught this. But survey responses are ambiguous, and we resolved the ambiguity in the direction that was easiest to build. Two engineers, three weeks, ship a digest — that's a clean ticket. Rearchitecting the dashboard's data layer to shave 30 seconds off load time? That's a quarter-long slog through caching strategies and API redesigns. The RICE spreadsheet made the easy option look like the smart option.
We heard what we wanted to hear, scored what we wanted to score, and built what was easy to build.
The Framework Isn't the Problem
I'm not here to trash RICE. RICE is fine. MoSCoW is fine. Value-vs-effort matrices are fine. The problem is that most teams treat the score like a judge's ruling — number comes out, conversation ends. Frameworks are supposed to start the conversation, not close it.
Three Patterns That Kill Good Teams
1. Anchoring on reach as a proxy for value. "It affects all users" is not the same as "all users want it." A settings page redesign has 100% reach. That doesn't make it your top priority. Reach tells you the blast radius, not the desirability. I've watched teams greenlight mediocre features purely because the denominator was big. A push notification preference screen touches every user. A broken onboarding flow touches every new user. Same reach score. Wildly different urgency. When you let reach dominate, you end up optimizing for surface area instead of depth of need — and you build a product that's wide and shallow, full of features nobody asked for twice.
2. Borrowing confidence from competitors. "Slack does it" is not a confidence score. Your users are not Slack's users. Your product context is different. Competitor features are data points, not validation.
3. Scoring effort in engineering-weeks, not in opportunity cost. Two engineers for three weeks doesn't sound bad until you realize those same engineers could have spent that time on the login-to-insight speed improvement that would've actually moved retention. RICE scores effort in absolute terms. Strategy requires you to score it in relative terms — what are you not building?
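To make the trap concrete, here's a sketch of two candidates competing for the same two engineers. All numbers are invented for illustration; the point is structural — per-effort scoring mechanically rewards the cheap option, and nothing in the formula surfaces what building it displaces:

```python
def rice(reach, impact, confidence, effort):
    # Standard RICE: value per unit of effort.
    return (reach * impact * confidence) / effort

# Hypothetical scores for two features fighting over one team's quarter:
digest = rice(reach=44, impact=2.0, confidence=0.8, effort=1.5)     # cheap: ~3 weeks
speed_fix = rice(reach=44, impact=3.0, confidence=0.6, effort=4.5)  # the quarter-long slog

# Dividing by absolute effort makes the digest the "obvious" winner...
print(digest > speed_fix)  # -> True

# ...but the relative question never appears in the formula:
# if the same engineers would otherwise be on the speed fix, shipping
# the digest costs you the speed fix. The spreadsheet compares each
# feature against zero, not against the best thing it displaces.
```

This is why the scores alone can't carry the decision — the displaced alternative has to be argued in the room.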
What We Do Now Instead
We still use RICE. But we added three forcing questions that happen after the score is calculated and before anything gets committed to a sprint:
"What's the user's current workaround, and is it actually bad?"
If users have a workaround that's tolerable, your feature is a nice-to-have disguised as a need. The digest email users had a workaround: they opened the dashboard. It took 45 seconds. Annoying, but not broken. The real fix was making the workaround better, not replacing it. This question alone would have killed the digest in the first meeting. We knew the workaround existed. We knew it was functional. We just convinced ourselves that "functional" wasn't good enough — while simultaneously proposing a solution that didn't improve the function at all, just added a new channel for the same slow data.
"If we ship this and it works, what do we stop hearing about?"
This one filters out vanity features fast. If you can't name the specific complaint, support ticket category, or churn reason that disappears when this ships, you might be building for a phantom problem. We couldn't answer this for the digest. Users weren't churning because they lacked emails. They were churning because the core experience was sluggish.
"Who on the team would bet their quarterly goal on this?"
Not "who thinks this is a good idea." Who would stake their performance review on it landing? This question separates intellectual agreement from genuine conviction. When we asked this about the digest, the room got quiet. That silence was the signal we ignored.
The Spreadsheet Comfort Zone
Frameworks feel rigorous. Numbers in cells, a score comes out, and a messy decision looks objective. It's satisfying the way color-coding your task board is satisfying — it mimics progress without requiring the hard, subjective judgment calls that actually matter.
Profit-First Doesn't Fix This Either
The 2026 trend toward "profit-first roadmaps" is an interesting evolution. Teams are tying prioritization directly to revenue impact and margin contribution rather than proxies like engagement or reach. That's a step forward, but it has the same trap: any single metric you optimize for will eventually mislead you if you stop questioning the assumptions underneath it.
The best PMs I've worked with use frameworks the way a doctor uses a checklist — as a safeguard against forgetting something obvious, not as a substitute for diagnosis. They score things, then they argue about the scores. They look at the ranked list and ask "does this feel right?" And when the spreadsheet says one thing and their gut says another, they don't immediately trust either one. They dig until they understand why there's a gap.
The Digest Postmortem
We pulled the digest after six weeks. The postmortem wasn't about the feature — it was about our decision process.
The fix was a norm: before any feature above a RICE score of 35 gets committed, one person has to argue against it for ten minutes. Not devil's advocate theater — a genuine attempt to find the strongest case for not building it. If nobody can articulate a compelling counter-argument, that's actually suspicious.
We shipped the dashboard speed improvement two sprints later. Median load time went from 8.2 seconds to 1.9 seconds. No RICE score needed. Every person on the team would have bet their quarter on it.
Sometimes the right answer is obvious. The framework's job is to keep you honest, not to keep you busy.