Building Portfolios that Beat their Benchmark: Measuring Nanometers with a Yardstick

Bob Veres

Despite constant admonitions against using historical performance as a guide to future returns, advisors routinely construct portfolios based on the track records of the underlying funds, with predictably spotty results. Such ineffective measures are pervasive. Someday, the advisors of the future will look back at the measures that we’re using today to evaluate investments and regard us as we would the Saxon architects of the 1100s, who defined a “yard” as the precise distance from the tip of English King Henry I’s nose to the end of his outstretched thumb.

Take the concept of beta, for example. You know what beta is; it measures an investment’s volatility (absurdly, down to two decimal places) against that of some benchmark index.  Chances are you think twice about putting client money in a mutual fund with a beta of 1.3 or more.  More recently, sophisticated portfolio managers have started creating a “risk budget” for client portfolios, using beta as their measuring tool.

Most of us use the same yardstick, applied somewhat differently, to compare individual funds’ track records to market returns.  How often have you looked at a graph of a fund’s performance vs. the S&P 500 over different time periods? 

This may produce meaningful results for funds that closely track indices (meaning they have a high R-squared), but of course the most interesting funds are run more creatively.  Do you really want to pay a management fee based on the entire fund portfolio when only a small part of it is actually deviating from the market?  And if you’re investing in a fund whose research staff is actively managing the entire portfolio, what confidence do you have that you’re actually looking at an extraordinary track record? Morningstar’s Don Phillips has memorably remarked that the best way to beat an index over the long term is to invest in things other than what the index is holding.

Meanwhile, using tools he co-developed with the Nobel Prize-winning economist Bill Sharpe, one advisor – whose approach I’m about to describe – has found that he can reliably outperform an appropriate benchmark. His work proves it is possible to build a portfolio knowledgeably. You just need the right tools to get the job done; the 21st century equivalent of yardsticks based on royal anatomy won’t cut it.

Next-generation style analysis

Gary Miller, founder and chief investment officer of Frontier Asset Management, LLC, in Sheridan, WY, has spent the last 25 years looking for ways to lend precision to the nose-to-thumb yardsticks to which we’ve all grown accustomed.  His alternative is something called “style analysis,” sometimes also called “factor analysis.”

Style analysis is fairly simple in theory.  You take the weekly or monthly returns of a fund portfolio, then you run a regression analysis on many different mixes of investment indices over the past three years, until you find the blend that best matches the portfolio’s results. Doing so allowed Miller to shine an X-ray through the veil of secrecy that surrounds funds, which only report their holdings quarterly, and it gave him a pretty good approximation of the asset mix in a fund portfolio.  Instead of comparing a fund manager to “the market,” he could now compare a manager’s track record to the index returns of the actual asset mix that the manager was investing in – and tease out an alpha factor that measured skill in stock selection.  Was the fund manager consistently investing in stocks that beat this customized benchmark, or not? 
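To make the idea concrete, here is a minimal sketch of the search described above, written in Python. It is not Miller's actual model: the index names, the return figures, the 5% weight grid and the function name best_blend are all invented for illustration, and the choice to score each mix by the variance of its gap to the fund is an assumption made so that a steady edge over the blend is not mistaken for a bad fit. It tries every long-only mix of three indices, keeps the one that tracks the fund most closely, and treats the average gap between the fund and that blend as a rough alpha.

```python
from itertools import product
from statistics import mean, pvariance


def best_blend(fund, indices, step=0.05):
    """Try every long-only index mix on a grid and keep the closest tracker.

    fund    -- list of periodic fund returns
    indices -- dict mapping index name -> list of returns (same length as fund)
    step    -- grid spacing for the weights (0.05 means 5% increments)
    """
    names = list(indices)
    n_steps = round(1 / step)
    best = None
    # Enumerate weights for all but the last index; the last takes the remainder.
    for combo in product(range(n_steps + 1), repeat=len(names) - 1):
        if sum(combo) > n_steps:
            continue  # these weights would add up to more than 100%
        units = list(combo) + [n_steps - sum(combo)]
        weights = [u / n_steps for u in units]
        blend = [
            sum(w * indices[name][t] for w, name in zip(weights, names))
            for t in range(len(fund))
        ]
        # Score the mix by the variance of (fund - blend): a constant gap
        # (possible skill) is not penalized, only period-to-period mismatch.
        tracking_var = pvariance([f - b for f, b in zip(fund, blend)])
        if best is None or tracking_var < best[0]:
            best = (tracking_var, dict(zip(names, weights)), blend)
    return best[1], best[2]


# Made-up monthly returns for three hypothetical indices.
indices = {
    "large_cap": [0.02, -0.01, 0.03, 0.01, -0.02, 0.04,
                  0.00, 0.01, -0.03, 0.02, 0.01, -0.01],
    "international": [0.01, 0.02, -0.02, 0.03, -0.01, 0.00,
                      0.02, -0.03, 0.01, 0.04, -0.02, 0.01],
    "gov_bonds": [0.004, 0.002, 0.005, -0.001, 0.003, 0.001,
                  0.006, 0.002, -0.002, 0.003, 0.004, 0.001],
}

# A synthetic fund: 60% large cap, 40% international, plus 0.1% a month of "skill".
fund = [
    0.6 * lc + 0.4 * intl + 0.001
    for lc, intl in zip(indices["large_cap"], indices["international"])
]

weights, benchmark = best_blend(fund, indices)
alpha = mean([f - b for f, b in zip(fund, benchmark)])
print(weights)
print(alpha)
```

Because the synthetic fund was built from a 60/40 mix that sits exactly on the grid, the search should recover that mix and report a monthly alpha close to 0.1%. Real fund data is far noisier, and a production tool would more likely use a constrained optimizer than a brute-force grid.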

Of course, this calibration mechanism had to be refined over the ensuing decades.  Any regression analysis relies on data from today, yesterday, the day before, six months ago and three years before that.  So if the manager moved out of government bonds in 2009, the factor analysis will still pick up echoes of them in the data. In other words, there may be ghostly images on that X-ray that simply aren’t relevant.  Miller has added algorithms that give more weight to recent data and less to older signals, and he also calculates the R-squared of the fund’s performance against the customized benchmark, which tells him whether he’s looking at precision or noise.
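The article does not say how Miller's weighting algorithms work, so the following Python sketch rests on two assumptions: that "more weight to recent data" can be approximated with an exponential decay governed by a half-life, and that R-squared is taken as the share of the fund's weighted return variance that the customized benchmark explains. The function names and return figures are hypothetical.

```python
def decay_weights(n_periods, half_life):
    """Weights for n_periods observations, oldest first, summing to 1.

    An observation half_life periods older than another counts half as much.
    """
    raw = [0.5 ** ((n_periods - 1 - i) / half_life) for i in range(n_periods)]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_var(values, weights):
    """Weighted variance around the weighted mean (weights must sum to 1)."""
    mu = sum(w * v for w, v in zip(weights, values))
    return sum(w * (v - mu) ** 2 for w, v in zip(weights, values))


def weighted_r_squared(fund, benchmark, weights):
    """Share of the fund's weighted return variance explained by the benchmark."""
    gaps = [f - b for f, b in zip(fund, benchmark)]
    return 1 - weighted_var(gaps, weights) / weighted_var(fund, weights)


# Made-up monthly returns, oldest first.
fund = [0.021, -0.008, 0.015, 0.030, -0.012, 0.007, 0.018, -0.004]
close_fit = [f - 0.001 for f in fund]  # benchmark that trails by a steady 0.1%
poor_fit = [0.010, 0.012, -0.005, 0.002, 0.015, -0.010, 0.001, 0.020]

w = decay_weights(len(fund), half_life=4)
r2_close = weighted_r_squared(fund, close_fit, w)
r2_poor = weighted_r_squared(fund, poor_fit, w)
print(r2_close, r2_poor)
```

A benchmark that shadows the fund month by month scores an R-squared near 1 even if the fund beats it by a constant margin, while a benchmark that moves independently scores far lower, which is the "precision or noise" signal. A shorter half-life makes old positions fade faster but leaves fewer effective data points to fit.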

The earliest versions of style analysis delivered a lot of “false positives” – that is, the best fit regression might include a percentage allocation to government bonds when, in fact, the manager in question had never invested in government bonds during his entire career.  The false positive might come from any number of sources: a large holding in a stock that behaves differently from the market (and may actually be sensitive to changes in bond rates), for example, or a temporary similarity in the return patterns of two or more investment categories.  (The last three months of 2008, when everything was in free fall, are a particularly stark recent example.)

Miller says that he now “cheats” – he looks at each fund’s quarterly portfolio disclosures and constrains the model by leaving out asset classes that the manager doesn’t own.  He also looks at regressions over different time periods, to weed out the false positives that appear and then vanish when certain assets are moving in tandem.

How does this work in the real world?  Let’s look at the Tweedy Browne Global Value Fund, which most of us would agree has been an extraordinarily good investment in recent years – the recent unpleasantness in Europe notwithstanding.  Below is a look at the fund’s performance from January 2008 through July 2012, compared with “the market” – in this case, the MSCI EAFE Value index.

If you were fortunate enough to identify this fund before 2008, you would have (to use an imprecise standard that King Henry might have appreciated) “beaten the market.”  But would you have picked this fund out of the myriad of alternatives?  Below is the fund’s performance, using the same crude comparison with “the market” over the previous five years, from January 2003 through the end of December 2007.