*Advisor Perspectives* welcomes guest contributions. The views presented here do not necessarily represent those of *Advisor Perspectives*.

*This is part two of a two-part article series. Please see article one **here**.*

Michael Edesess’ article, The Trend that is Ruining Finance Research, makes the case that financial research is flawed. In this two-part series, I examine the points that Edesess raised in some detail. His arguments have some merit. Importantly, however, his article fails to undermine the value of finance research in general. Rather, his points highlight that finance is a real profession requiring skills, education and experience that differentiate professionals from laymen.

Edesess’ case against so-called evidence-based investing rests on three general assertions. First, there is the very real issue with using a static t-statistic threshold when the number of independent tests becomes very large. Second, financial research is often conducted with a universe of securities that includes a large number of micro-cap and nano-cap stocks. These stocks often do not trade regularly and exhibit large overnight jumps in prices. They are also illiquid and costly to trade. Third, the regression models used in most financial research are poorly calibrated to form conclusions on non-stationary financial data with large outliers.

This article tackles the first issue, often called “p-hacking,” and proposes a framework to help those who embrace evidence-based investing make judicious decisions based on a more thoughtful interpretation of finance research. Part one of this series addressed the other two issues.

**P-hacking and scaling significance tests**

When Fama, French, Jegadeesh, et al. published the first factor models in the early 1990s, it was reasonable to reject the null hypothesis (no effect) with an observed t-statistic of 2. After all, the computational power and data at the time could not support data mining to the extent that it is now possible. Moreover, these early researchers were careful to derive their models very thoughtfully from first principles, lending economic credence to their results.

However, as Cam Harvey has so assiduously noted, the relevant t-statistic to signal statistical significance must expand through time to reflect the number of independent tests. He suggests that, based on several different approaches to the problem, current finance research should seek to exceed a t-statistic threshold of at least 3 to be considered significant. If the results are derived explicitly through data mining, or through multivariate tests, the threshold should be closer to 4, while results derived from first principles based on economic or behavioral conjecture, and with a properly structured hypothesis test, may be considered significant at thresholds somewhat below 3.
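To make the scaling concrete, the following minimal sketch shows how a significance threshold grows with the number of independent tests under a Bonferroni correction. This is only one standard multiple-testing adjustment, used here for illustration; it is not necessarily the method behind Harvey's recommended thresholds. The function name `bonferroni_t_threshold` is hypothetical, and the sketch assumes samples large enough for the t-statistic to be approximated by a standard normal.

```python
from statistics import NormalDist


def bonferroni_t_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Two-sided critical value when alpha is split evenly across num_tests.

    Assumption: the t-statistic is treated as approximately standard normal,
    which is reasonable only for large samples.
    """
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    per_test_alpha = alpha / num_tests  # Bonferroni: divide alpha by the test count
    return NormalDist().inv_cdf(1 - per_test_alpha / 2)


# Illustrative test counts only; these are not estimates of how many
# tests the finance literature has actually run.
for n in (1, 10, 100, 1000):
    print(f"{n:>5} tests -> |t| threshold of about {bonferroni_t_threshold(n):.2f}")
```

Under these assumptions, a single test recovers the familiar threshold near 2, while the threshold rises past 3 somewhere between ten and one hundred tests and reaches roughly 4 by one thousand. That is broadly in line with the range of thresholds discussed above, although Bonferroni is known to be a conservative adjustment.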

Harvey’s recommendations make tremendous sense. The empirical finance community – like so many other academic communities such as medicine and psychology – is guilty of propagating “magical thinking” for the sake of selling associated investment products, journal subscriptions and advertising. With few exceptions, journals only publish papers with interesting and significant findings. As a result, the true number of tests of significance in finance vastly exceeds the number of published journal articles.
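The effect of unreported tests can be illustrated with a small simulation. The sketch below generates hypothetical strategies whose true mean return is zero and counts how many still clear a t-statistic of 2 or 3 purely by chance. All of the parameters (2,000 strategies, 120 monthly observations, 5% monthly volatility, normally distributed returns) are assumptions chosen for illustration and are not drawn from any study.

```python
import random
from statistics import mean, stdev


def t_statistic(returns):
    """t-statistic for the null hypothesis that the mean return is zero."""
    return mean(returns) / (stdev(returns) / len(returns) ** 0.5)


def count_false_discoveries(num_strategies=2000, num_months=120, seed=0):
    """Count zero-edge strategies whose |t| exceeds 2 and 3 by chance alone."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    above_2 = above_3 = 0
    for _ in range(num_strategies):
        # Every strategy has a true mean of zero: any "signal" is noise.
        returns = [rng.gauss(0.0, 0.05) for _ in range(num_months)]
        t = abs(t_statistic(returns))
        above_2 += t > 2
        above_3 += t > 3
    return above_2, above_3


above_2, above_3 = count_false_discoveries()
print(f"Strategies with |t| > 2: {above_2} of 2000")
print(f"Strategies with |t| > 3: {above_3} of 2000")
```

If only the strategies that clear the lower bar were written up, a reader of the resulting papers would see a collection of apparently significant findings with no indication of how many tests were run to produce them. Raising the threshold to 3 sharply reduces the number of spurious survivors in this toy setting.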