The New Year is officially underway. The holiday hangover is starting to wear off, resolutions have been made (and already broken), and bowl games are wrapping up. But most importantly, new 2016 data is available! I have actually been feverishly working for over a month now to prepare, and am proud to announce some exciting new updates to each and every calculator:
We now have 2016 data for every asset. In addition, data has been extended back another two years and now starts in 1970. So we now have a full 47 years of market data to study that covers all kinds of economic conditions.
There are more bond options! By popular request, I have added investment grade corporate, high yield corporate, and international bonds. You can definitely expect to see a few new portfolios including these assets in the near future.
I’ve organized the asset allocation column in the calculators to better communicate how the individual indices relate to one another, and now you can see which assets are actually subsets of others. In addition, I’ve made a handful of improvements to a few calculators to make them easier to use overall.
The quality of the source data is greatly improved, with new sources that better model the desired indices. The calculators also now include a nifty new system of verifying older data to ensure that the numbers are as accurate as possible, and they clearly communicate when data is trustworthy and when it is only estimated.
Quantity of data is certainly fun, but when it comes to trust it’s all about quality. So let’s start there.
The data that powers the site is pulled from all around the web, and the Simba team on the Bogleheads forum has spent a great deal of time and effort lately reviewing the original data sources for quality. In some cases we identified alternative indices that model the desired one better than before, and in others we found that the original sources were, to put it kindly, highly questionable. In a few extreme cases, we were even forced to eliminate many years of data because there are no good sources available.
Eliminating data, however, is a tough pill to swallow for a guy who likes to study the historical performance of portfolios. But that raised an interesting question: do a few years of missing data for an asset or two really change the final results all that much? To explore this, I’ve been experimenting with ways to model portfolios with missing data and measure the resulting error. It turns out that in some cases it does make a noticeable difference, but depending on the portfolio it does not necessarily have a major impact. What happens when you build that error checking logic directly into the calculators? You get some powerful and flexible new tools!
A really cool byproduct of this system is that it allows me to add more assets that were previously excluded simply because there was less available data. So investment grade corporate, high yield corporate, and international bonds are all in, as I can use a total bond market index as a reasonable stand-in when needed and calculate the point when that assumption no longer holds up for a specific asset allocation. And I also extended the verification system to any data point in the Simba spreadsheet not directly from an index fund or provider, so now you can feel confident that the data you see accurately follows your desired real-world portfolio regardless of original source.
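To make the stand-in idea concrete, here is a minimal sketch in Python. It is not the calculators' actual code, and the real approach is described in the methodology page. The function names, the made-up returns (in percent), and the choice of "average absolute gap over the overlapping years" as the error measure are all assumptions for illustration only.

```python
def backfill(asset, stand_in):
    """Use the asset's own returns where they exist and the stand-in elsewhere.

    Both arguments are {year: annual return in percent}. Returns the merged
    series plus the sorted list of years that came from the stand-in.
    """
    series = dict(stand_in)
    series.update(asset)  # real data always wins over the stand-in
    estimated_years = sorted(set(stand_in) - set(asset))
    return series, estimated_years


def mean_abs_gap(asset, stand_in):
    """Average absolute difference between the two series in overlapping years.

    This is one simple (assumed) way to gauge how far off the stand-in
    might be in the years where the asset has no data of its own.
    """
    overlap = set(asset) & set(stand_in)
    if not overlap:
        raise ValueError("no overlapping years to compare")
    return sum(abs(asset[y] - stand_in[y]) for y in overlap) / len(overlap)


def potential_error(weight, gap):
    """Possible portfolio-level error (in percentage points) from one stand-in."""
    return weight * gap


# Made-up numbers, purely illustrative (annual returns in percent).
total_bond = {1970: 10.0, 1971: 8.0, 1972: 4.0, 1973: 2.0}
high_yield = {1972: 10.0, 1973: -2.0}

series, estimated_years = backfill(high_yield, total_bond)
gap = mean_abs_gap(high_yield, total_bond)
print(estimated_years)            # years filled in from the stand-in
print(potential_error(0.2, gap))  # possible error for a 20% allocation
```

The point of the last function is that the same stand-in matters more or less depending on the asset allocation: a small weight shrinks the possible error, while a large weight can push it past whatever tolerance you are willing to accept.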
You can read about how the error checking works in the updated calculator methodology, but the end result is that you can now study any portfolio you like back to 1970 and the calculators will clearly communicate when numbers are verified to be dependable and when they are only estimated based on the best available alternative data. For example, here’s what the new Heat Map looks like:
In this case, the squares with the dark outlines are verified and the ones with no outlines are estimated. Early small cap value data was derived from my own Fama-French calculations, and the verification system determined that 30% SCV introduced too much potential error prior to 1976 (smaller percentages are not such a big deal). The estimated numbers are still very valuable for providing investing context, as ignoring the early ’70s just because you’re short a bit of good SCV data isn’t particularly wise and can lead to decisions influenced by start date bias. But understanding the dependability of the underlying calculations is also an important data point in any discussion and will keep you from leaning too heavily on precise CAGR numbers that may not be set in stone.
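As a rough illustration of why a 30% SCV allocation can be flagged while a smaller one is not, here is a hedged Python sketch of a per-year label. The 1976 cutoff comes from the example above; the 5-point data gap, the 1-point tolerance, the "TSM" placeholder asset, and the function itself are hypothetical and are not the calculators' real values or logic.

```python
def classify_years(weights, first_verified_year, gap, tolerance, years):
    """Label each year "verified" or "estimated" for one asset allocation.

    weights:             {asset: portfolio weight, e.g. 0.30}
    first_verified_year: {asset: first year with trusted data}; assets not
                         listed are treated as trusted for every year
    gap:                 {asset: assumed possible error of its early data,
                         in percentage points}
    tolerance:           largest portfolio-level error still called verified
    """
    labels = {}
    for year in years:
        # Only assets whose data is still untrusted in this year add error.
        error = sum(
            weight * gap.get(asset, 0.0)
            for asset, weight in weights.items()
            if year < first_verified_year.get(asset, year)
        )
        labels[year] = "verified" if error <= tolerance else "estimated"
    return labels


# Hypothetical inputs: only the 1976 cutoff is taken from the text above.
first_verified_year = {"SCV": 1976}
gap = {"SCV": 5.0}   # assumed, in percentage points
tolerance = 1.0      # assumed, in percentage points
years = range(1970, 1980)

heavy = classify_years({"SCV": 0.30, "TSM": 0.70}, first_verified_year, gap, tolerance, years)
light = classify_years({"SCV": 0.10, "TSM": 0.90}, first_verified_year, gap, tolerance, years)

print(heavy[1975], heavy[1976])  # 30% SCV flips from estimated to verified
print(light[1975])               # 10% SCV stays within the tolerance
```

With these assumed numbers, the 30% allocation carries a possible error of 1.5 points before 1976 and gets marked as estimated, while the 10% allocation carries only 0.5 points and stays verified for the whole period.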
Unfortunately, you may notice that the one unavoidable casualty in all of this is TIPS. Some very smart Bogleheads did a lot of research, and not only is the study that the old numbers depended on simply not reliable, but there also isn’t enough good data to even attempt to accurately backfill the numbers. I hope to eventually reintroduce TIPS, but not before I can do it in a way that I’ll be willing to vouch for the results. Good decisions require good data, and I’m committed to supporting both.
Of course the lack of TIPS will affect a few model portfolios, and I’ll be addressing that very shortly as I update the rest of the site. It will take a little time to get the Portfolios and Assets sections up to speed, but in the meantime take the all-new Calculators for a spin! All of the additional data should open up many new avenues of exploration.
Overall, I feel like the calculator updates are a huge step forward in data quality, diversity, and transparency. It has required a ton of effort and I hope you find the results as useful and interesting as I do. If you have any questions or spot an inevitable bug that I missed in the update process, please don’t hesitate to contact me.
Happy New Year, and happy portfolio hunting!