The Avoidable Mistake of Cherry Picking Data

Beginner, Theory

Discussing investments with others is often a minefield of competing information.  Two highly educated people can look at the exact same numbers and come to completely different conclusions about the results.  Information from primary sources even as reputable as Vanguard does not necessarily make the situation any better, as the same fund can appear wonderful in one tool but disappointing in another.  How on earth can an everyday investor find an appropriate portfolio when even the experts can’t agree on what is good and what is not?

The important takeaway is not that risk and returns are unknowable or that everyone who disagrees with you is bad at math.  The problem most often arises from the natural desire of humans to reduce complex problems down to a single straightforward number.  The real world is more complex than most people understand, and the assumptions by which they derive that number can greatly affect the results.  Whether they realize it or not, most investors — casual and professional alike — cherry pick data every day.

As an example, here’s a simple question:


What’s the average historical real return of emerging markets?


Let’s look at three different sources and see what we find.  I’ll let you know up front that I verified each source with my own calculations, so there are no major definition discrepancies involved.  All numbers are compound real returns.

First, let’s look up the long term returns for one of the most popular emerging market index funds, Vanguard’s VEIEX.


That’s easy, it’s obviously right there… hmm… wait… which number is it?  There’s a big difference between a 10-year return of 1.55% and a return since inception of 5.29%.  The SEC demands that all funds report results like this and I think it’s good that they do, because it’s already obvious that timeframe makes a big difference.  Erring on the side of more data, let’s use the 5.29% return since inception.  If we estimate that inflation was about 3% a year, then the real return must have been close to 2.3% per year.

Next, let’s try our good friend Google.  Searching for “emerging market historical returns”, one of the top results is a fellow financial blog that presents accurate historical data.  Cool!  According to its author’s calculations, the annualized return from 1988 to 2011 was 12.33%.  Discount that by the same 3% inflation estimate, and the real return was somewhere around 9.3% per year.
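
Subtracting inflation from a nominal return is a handy shortcut, but strictly speaking the two compound against each other.  Here is a minimal Python sketch of both approaches, applied to the two nominal figures quoted so far.  The function names are hypothetical, and the 3% inflation rate is only the rough estimate used above, not measured data.

```python
def real_return(nominal: float, inflation: float) -> float:
    """Exact real return from a nominal return and an inflation rate (decimals)."""
    return (1 + nominal) / (1 + inflation) - 1


def real_return_shortcut(nominal: float, inflation: float) -> float:
    """The simple subtraction shortcut used in the text."""
    return nominal - inflation


# ASSUMPTION: 3% is the article's rough inflation estimate, not measured data.
INFLATION_ESTIMATE = 0.03

quoted_nominal_returns = [
    ("VEIEX since inception", 0.0529),
    ("Blog figure, 1988-2011", 0.1233),
]

for label, nominal in quoted_nominal_returns:
    shortcut = real_return_shortcut(nominal, INFLATION_ESTIMATE)
    exact = real_return(nominal, INFLATION_ESTIMATE)
    print(f"{label}: shortcut {shortcut:.2%}, exact {exact:.2%}")
```

The exact figures come out a little lower than the shortcut (roughly 2.2% and 9.1%), but the gap between the two sources is just as wide either way.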

Finally, let’s fire up the Long Term Returns calculator here and find the CAGR since 1972.  Looking only at the top-line number for a moment, the return (which is already inflation-adjusted) was 10.2% per year.

So here’s a quick recap of our results for emerging markets:

  • Source #1: 2.3%
  • Source #2: 9.3%
  • Source #3: 10.2%

Ummm… that’s a huge difference in returns for the same index.  Which one is accurate?

Here’s the thing — they all are.

You see, index investing isn’t necessarily complicated but the math behind it doesn’t like to be reduced down to single numbers.  Emerging markets are a very volatile asset, and depending on when you look at them they can be all over the map.  Let’s look again at the timeframes that each source used:

  • Source #1: since the fund’s inception
  • Source #2: 1988-2011
  • Source #3: 1972 to the present


As you can see, each average return is completely accurate, but no one average return tells the full story.  The source with the most data has the most representative long-term number, but it hides the fact that emerging markets grew massively in the decade from 1983 to 1993 and have done pretty poorly for the last 20 years.  The shortest source includes all data since its index fund was founded, but excludes the remarkable run that drove people to start it in the first place.
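
To see how one return history can produce several "accurate" averages, here is a minimal Python sketch.  The yearly returns are invented purely for illustration (two boom years followed by two bust years) and are not emerging market data; the function names are hypothetical as well.

```python
from math import prod


def cagr(annual_returns):
    """Compound annual growth rate of a sequence of yearly returns (decimals)."""
    growth = prod(1 + r for r in annual_returns)
    return growth ** (1 / len(annual_returns)) - 1


def window_cagr(returns_by_year, start, end):
    """CAGR over the inclusive range of years [start, end]."""
    window = [r for year, r in sorted(returns_by_year.items()) if start <= year <= end]
    return cagr(window)


# HYPOTHETICAL data, invented for illustration only: year number -> real return.
hypothetical_returns = {1: 0.50, 2: 0.50, 3: -0.20, 4: -0.20}

for start, end in [(1, 4), (1, 2), (3, 4)]:
    result = window_cagr(hypothetical_returns, start, end)
    print(f"Years {start}-{end}: {result:.1%} per year")
```

All three results are computed correctly from the same series, yet they range from a steep loss to a spectacular gain depending only on where the window starts and stops.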

So which number should you use for your own decision making?  That’s where the cherry picking comes in.  Everyone has their own theory on which number is more valid, and it’s often impossible to come to a consensus.  When selected by someone with ulterior motives, accurate data can actually be quite deceptive.

Here’s my simple solution — why limit ourselves to only one number?  As the site tagline says, a picture is worth a thousand calculations.


Instead of looking at only one investment period, here’s a Portfolio Growth chart that plots the inflation-adjusted value of an emerging markets investment starting in every investing period since 1972.  Here you can see the huge spread between the best case and worst case historical scenarios.  Note how the lines are color coded by start date and how many endpoints fall on the bottom edge of the plot.  You can quickly see that emerging markets have seen some of their worst times recently.
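
The idea behind a chart like this can be sketched in a few lines of Python.  This is only an illustration of the general technique under my own assumptions, not the calculator’s actual implementation: the return series is invented, the function names are hypothetical, and returns are assumed to be annual and already inflation-adjusted.

```python
def growth_paths(annual_real_returns, start_value=1.0):
    """Build one growth path per possible start year.

    paths[i] holds the inflation-adjusted value of `start_value` invested at
    the beginning of year i, recorded at the start and after each later year.
    """
    paths = []
    for start in range(len(annual_real_returns)):
        value = start_value
        path = [value]
        for r in annual_real_returns[start:]:
            value *= 1 + r
            path.append(value)
        paths.append(path)
    return paths


def spread_at_horizon(paths, years):
    """Worst and best values after `years` years, among paths that long."""
    endings = [path[years] for path in paths if len(path) > years]
    return min(endings), max(endings)


# HYPOTHETICAL real returns, invented for illustration only.
hypothetical_returns = [0.50, 0.50, -0.20, -0.20, 0.10]

paths = growth_paths(hypothetical_returns)
worst, best = spread_at_horizon(paths, years=2)
print(f"After 2 years: worst start {worst:.2f}x, best start {best:.2f}x")
```

Plotting every path in `paths` on the same axes gives the kind of fan of lines described above, and `spread_at_horizon` puts a number on how far apart the luckiest and unluckiest start dates ended up.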

Looking at the big picture provides much more information than any single average can, and immediately offers valuable context to any investing discussion.  When faced with the visualized uncertainty of an investing decision, those simple averages tracking only a single line seem unimportant in comparison.  Emerging markets may be a perfectly reasonable addition to certain portfolios, but it isn’t necessarily because of a single desirable long-term average that you may never personally experience.

Pretty cool, huh?  If you find this kind of analysis as helpful as I do, then you’re in the right place.  I’ve long been fascinated by the topic of uncertainty and selection bias in investing, and it has been a major driving force in much of my own financial education.  Each of the calculators on this site tackles this issue in its own way, peeling back the layers of data and presenting returns in a way that helps you see the big picture.  Browse the charts in the sample portfolios, and the implications of studying investing from a start-date-independent perspective should become pretty obvious.  For example:

[Charts: Total Stock Market Pixel Chart and Permanent Portfolio Pixel Chart]

Some portfolios are much more sensitive than others to the selected timeframe.  Good risk management means truly understanding that and planning accordingly, rather than simply hoping the average you see will apply to you, assuming you can predict the next economic cycle, or resigning yourself to waiting out a painful investing period that you know will make you miserable.  You do have other options.

So the next time you read about the average return of a given portfolio, take a moment to think about what information is being left out.  While others argue over individual cherries, consider the entire tree.  By studying the good times, the bad times, and everything in between, your ultimate decision will be much wiser for the effort.