I’m getting close to the point where I’ve been not-a-product-manager for longer than I ever was a product manager, so take this with a grain of salt. But a piece of advice I’ve given to newbie PMs in the past is that it’s very useful to “plan for failure”. Three years too late, I’ve finally figured out a good way to distill what I mean by that!
Level 1: Just be mentally prepared for something to fail
The job of the product manager is to (within your remit) guide your teams to build things that will benefit the customer and company (ideally both at the same time.) You would be forgiven for assuming that the way to do that is to only ever build good stuff that works. The problem is that’s really really hard, and it’s especially hard to know what works before you actually do it.
In practice, the job ends up being much more about doing things you think will work, and then, when a significant portion of those fail to do what you want, learning from them to inform your next guess at what might work. At a previous employer, doing product management for a big global B2C e-commerce brand, we found that when our team ran experiments, on average about 30% would yield a measurable positive impact, about 20% would yield a measurable negative impact, and the other 50% would have no measurable impact at all. This was true not just for our team, but across other teams as well. And anecdotally, these numbers (plus or minus, say, 10%) seem about right for well-performing teams across different companies in different industries, particularly if your product is fairly mature and all the low-hanging fruit has already been picked. Somewhere in the neighborhood of 1/4 work, 1/4 fail, 1/2 have no impact seems like a pretty good rule of thumb (though I’d imagine in a younger product those numbers would skew differently.)
So the first level is: just mentally prepare yourself for the fact that only, say, 1/4 of your ideas will actually work. That on its own is not failure, that’s just the cost of doing business! Ideas are cheap and plentiful, but good ideas are rare! And you can’t really tell which is which without actually testing them!
Level 2: Think through specific failure scenarios
If you’ve already prepared yourself for the reality that your experiment is likely not to succeed, then it’s worth mapping out the possible ways it could fail and, most importantly, how you’ll know which one happened. You should already be thinking through your feature’s hypothesis and figuring out how you’ll prove that it worked in the success case. Now try doing that in reverse: if it fails or has no impact, what data will you look at to prove that? How will that data look different across the different ways it might fail? For instance, imagine an experiment where you’re adding a promotional card to a feed of deals that highlights a deal that’s on sale. How would the data look different if it failed because:
No one clicked on the card, but no other behavior changed
No one clicked on the card and they also stopped scrolling past it
People clicked on the card, but never bought the deal and abandoned
People clicked on the card, clicked back, and the extra delay broke their flow
People clicked on the card, but only people that were already going to buy, and now they bought a cheaper thing rather than the full-price thing they were about to buy
Etc etc etc
All of those would look different in data, and all of them suggest a different learning from the experiment.
If you think about the possible failure scenarios and start building your failure hypotheses early, then you can start thinking a few moves ahead. For instance, if it failed because people clicked the card but never bought the deal, then maybe that suggests the card idea is a plausible one, but you need to do a better job targeting the deal. If you start thinking about this early, then you save yourself time later when the experiment ends by pre-writing analysis queries, or pre-hypothesizing new features and tweaks you can do based on which kind of outcome you get, etc.
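To make “pre-writing analysis queries” concrete, here’s a hypothetical sketch of what a pre-written failure diagnosis for the promo-card experiment above could look like. All of the metric names and thresholds are made up for illustration, not from any real experiment:

```python
# Hypothetical sketch: map an experiment's metrics to the failure
# scenarios hypothesized up front, so the analysis is ready to run
# the day the test ends. Metric names and thresholds are invented.

def diagnose_promo_card(card_ctr, scroll_past_rate,
                        purchase_rate_clickers, avg_order_value_delta):
    """Classify a failed/neutral promo-card experiment.

    card_ctr: share of users who clicked the card
    scroll_past_rate: share of users who kept scrolling past the card
    purchase_rate_clickers: share of card-clickers who bought the deal
    avg_order_value_delta: change in average order value vs. control
    """
    if card_ctr < 0.01 and scroll_past_rate < 0.5:
        return "card ignored AND it broke the scroll flow"
    if card_ctr < 0.01:
        return "card ignored, no other behavior changed"
    if purchase_rate_clickers < 0.02:
        return "clicks but no purchases: targeting or landing problem"
    if avg_order_value_delta < 0:
        return "cannibalization: buyers traded down to the cheaper deal"
    return "no pre-hypothesized failure matched; dig deeper"

# Example: lots of clicks, almost no purchases
print(diagnose_promo_card(0.08, 0.9, 0.005, 0.0))
# -> clicks but no purchases: targeting or landing problem
```

The point isn’t the specific numbers; it’s that each branch corresponds to a failure hypothesis you wrote down before launch, with a next move already attached to it.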
Level 3: Pick your preferred failure
This is the real galaxy brain, and while by the end I was pretty good at consistently doing levels 1 and 2 with all my features, I only ever did level 3 a handful of times. But again, if you take as gospel that you’ll have more not-successes than you have successes, in some cases you can be strategic about what kind of not-success you want.
As an example, I was working on a new map-based search experience for a previous employer’s mobile app, and we were faced with a decision: when should the map be shown?
If you show the map when there are no or very few nearby deals, then it’s not very useful and a huge waste of space compared to a list! But the whole hypothesis of the feature was that showing the map would be more useful and preferable when there are nearby deals. So what’s the number of nearby deals (however we define “nearby”) that should trigger the map view?
At first the goal was to do some analysis and try to figure out the number that splits the difference and is the ideal tipping point between “list is best” and “map is best.” Unfortunately, the previous map feature was pretty hidden and underused at that point, so we just didn’t have much actual usage data to go on (we would have to rely on supply distributions and make guesses about how people might react, rather than actually seeing interactions.) Then, talking with (I think) my designer at the time, we realized… let’s be real, there’s a very low likelihood we’ll actually get the number right on the first try (25% at best, if the overall hit rate holds true here!) And talking with engineering, it became clear that adjusting this tipping-point number is trivial, the kind of thing that can be done in minutes¹.
Then the question became, if we think we’re going to get it wrong but it will be easy to change it in the future, which direction of wrong would we rather be? Do we want to show the map too often when the list would be better, or show it too rarely? And we concluded in this case we’ll get much more useful data from showing the map too often, because we’ll get to see how people react to it in more circumstances (including what circumstances cause them to flip back to the list or cause them to abandon their search!)
And so we deliberately went extreme and said we’re going to show the map if there was even one deal within 20 miles, which we were fairly sure was way way way too often and would lead to a lot of abandoned searches. And we were right! But we were able to look at which searches at which deal densities performed better or worse and tune that pretty quickly in an immediate follow-up. I think it would’ve been much harder to go the opposite direction, where we only showed the map if we were super confident there was enough density, because we wouldn’t have had the same amount of map interaction data we got from just showing it everywhere and would have had to inch towards the ideal number over a lot of different attempts.
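Part of what made this cheap to do was that the whole decision boiled down to a couple of tunable constants. A hypothetical sketch (none of these names or numbers are from the actual app) of why moving the tipping point was a one-line change:

```python
# Hypothetical sketch of the map/list tipping point as tunable constants.
# Flipping the experiment's "direction of wrong" is a one-line edit.

MAP_RADIUS_MILES = 20   # what counts as "nearby"
MIN_NEARBY_DEALS = 1    # deliberately extreme: map shows almost always

def should_show_map(deal_distances_miles):
    """Show the map view if enough deals fall within the radius."""
    nearby = sum(1 for d in deal_distances_miles if d <= MAP_RADIUS_MILES)
    return nearby >= MIN_NEARBY_DEALS

print(should_show_map([3.2, 45.0]))   # one deal within 20 miles -> True
print(should_show_map([45.0, 80.0]))  # no nearby deals -> False
```

Because the threshold lives in one place, tuning it after the experiment is trivial; the expensive part (per the footnote below) was the app-store release cycle, not the code.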
Now of course the flip side is that this meant the experiment failed, but in some circumstances a failure that gives you useful information is better than a neutral result (or even a success!) that doesn’t. If you’ve prepared ahead of time and know what to look for, you can use that deliberate failure to get to a success faster than you would by approaching it more cautiously.
That won’t always be the case, and it’s certainly politically risky lol, which is why I said I only did it rarely. But I hope that if I’d stayed in product longer I would’ve built up my skills (and confidence) in not just accepting and planning for failure, but actually using failure to my advantage by steering into the most useful kinds of failure.
So that’s the three levels of planning for failure: just be mentally prepared to fail, start thinking about how you’ll determine and handle different failure scenarios, and (where relevant) choose which kind of failure you want and optimize your experiment for that. Go forth and prosper or whatever!
¹ Well, minutes of engineering time; this was a mobile app, so the change still needed to go through mobile app store reviews and align with a broader release schedule, meaning it would require a new mobile version. But compared to other kinds of changes to the feature, it was still incredibly simple.