In finance, statistical arbitrage (often abbreviated as Stat Arb or StatArb) is a class of short-term financial trading strategies that employ mean reversion models involving broadly diversified portfolios of securities (hundreds to thousands) held for short periods of time (generally seconds to days). These strategies are supported by substantial mathematical, computational, and trading platforms.
Broadly speaking, StatArb is any strategy that is bottom-up and beta-neutral in approach and uses statistical or econometric techniques to provide signals for execution. Signals are often generated through a contrarian mean reversion principle but can also be designed using factors such as lead/lag effects, corporate activity, and short-term momentum. This is usually referred to as a multi-factor approach to StatArb.
Because of the large number of stocks involved, the high portfolio turnover, and the fairly small size of the effects one is trying to capture, the strategy is often implemented in an automated fashion, and great attention is paid to reducing trading costs.
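The interaction between a small per-trade edge and trading costs can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, a gross edge of 5 basis points per round trip, one round trip of the full capital per trading day (250 per year), and three hypothetical cost levels. None of these figures comes from the sources cited in this article, and the function name `net_annual_return` is invented for the example.

```python
def net_annual_return(gross_edge_bps, cost_bps, trades_per_year):
    """Net annual return as a fraction of capital.

    Assumes (hypothetically) that the whole capital is turned over once
    per trade and that edge and cost are both quoted in basis points.
    """
    return (gross_edge_bps - cost_bps) * trades_per_year / 10_000


# Hypothetical inputs: 5 bps gross edge, 250 round trips per year.
for cost_bps in (2.0, 4.0, 6.0):
    net = net_annual_return(5.0, cost_bps, 250)
    print(f"cost {cost_bps:.0f} bps -> net {net:+.1%} per year")
```

Under these assumptions, a change of a few basis points in cost is enough to turn the same signal from profitable to loss-making.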
Statistical arbitrage has become a major force at both hedge funds and investment banks. Many bank proprietary operations now center to varying degrees around statistical arbitrage trading.
As a trading strategy, statistical arbitrage is a heavily quantitative and computational approach to securities trading. It involves data mining and statistical methods, as well as the use of automated trading systems.
Historically, StatArb evolved out of the simpler pairs trade strategy, in which stocks are put into pairs by fundamental or market-based similarities. When one stock in a pair outperforms the other, the underperforming stock is bought long and the outperforming stock is sold short, with the expectation that the underperforming stock will climb towards its outperforming partner.
Mathematically speaking, the strategy is to find a pair of stocks with high correlation, cointegration, or other common factor characteristics. Various statistical tools have been used in the context of pairs trading ranging from simple distance-based approaches to more complex tools such as cointegration and copula concepts.
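A minimal sketch of a distance-style signal for a single pair follows. It assumes that the spread is the difference of log prices and that a position is opened when the latest spread is more than a chosen number of standard deviations from its historical mean. The function name `pair_signal`, the threshold of 2.0, and the price series are illustrative assumptions and are not taken from any of the cited studies.

```python
import math
import statistics


def pair_signal(prices_a, prices_b, entry_z=2.0):
    """Return (z, action) for the latest observation of a pair.

    The spread is log(A) - log(B). Its z-score is measured against the
    mean and sample standard deviation of the earlier observations only.
    """
    spread = [math.log(a) - math.log(b) for a, b in zip(prices_a, prices_b)]
    history, latest = spread[:-1], spread[-1]
    mu = statistics.mean(history)
    sigma = statistics.stdev(history)
    z = (latest - mu) / sigma
    if z > entry_z:
        action = "short A, long B"   # A has outperformed B
    elif z < -entry_z:
        action = "long A, short B"   # A has underperformed B
    else:
        action = "no trade"
    return z, action


# Hypothetical prices: the two stocks track each other until A jumps.
a = [100.0, 101.0, 100.5, 101.5, 101.0, 108.0]
b = [50.0, 50.4, 50.3, 50.6, 50.5, 50.6]
z, action = pair_signal(a, b)
print(action)  # short A, long B
```

A cointegration or copula approach would replace the simple log-price difference with a fitted relationship, but the entry logic (trade when the pair has diverged unusually far) has the same shape.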
StatArb considers not pairs of stocks but a portfolio of a hundred or more stocks (some long, some short) that are carefully matched by sector and region to eliminate exposure to beta and other risk factors. Portfolio construction is automated and consists of two phases. In the first or "scoring" phase, each stock in the market is assigned a numeric score or rank that reflects its desirability; high scores indicate stocks that should be held long and low scores indicate stocks that are candidates for shorting. The details of the scoring formula vary and are highly proprietary, but, generally (as in pairs trading), they involve a short-term mean reversion principle so that, e.g., stocks that have done unusually well in the past week receive low scores and stocks that have underperformed receive high scores. In the second or "risk reduction" phase, the stocks are combined into a portfolio in carefully matched proportions so as to eliminate, or at least greatly reduce, market and factor risk. This phase often uses commercially available risk models like MSCI/Barra, APT, Northfield, Risk Infotech, and Axioma to constrain or eliminate various risk factors.
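The two phases described above can be sketched on a toy universe. This is an illustration under strong simplifying assumptions, not a description of any fund's proprietary method: the score is simply minus the past-week return relative to the universe average, and the "risk model" is a single market beta per stock rather than a commercial multi-factor model with sector and region matching. The tickers, returns, betas, and function names are all hypothetical.

```python
def mean_reversion_scores(past_returns):
    """Phase 1 ("scoring"): high score for recent underperformers."""
    avg = sum(past_returns.values()) / len(past_returns)
    return {name: -(r - avg) for name, r in past_returns.items()}


def neutral_weights(scores, betas):
    """Phase 2 ("risk reduction"): dollar- and beta-neutral weights.

    Removes from the score vector its components along the constant
    vector and along beta, then scales gross exposure to 1.
    """
    names = list(scores)
    n = len(names)
    s_mean = sum(scores[k] for k in names) / n
    b_mean = sum(betas[k] for k in names) / n
    s = [scores[k] - s_mean for k in names]   # demeaned scores
    b = [betas[k] - b_mean for k in names]    # demeaned betas
    coef = sum(x * y for x, y in zip(s, b)) / sum(y * y for y in b)
    raw = [x - coef * y for x, y in zip(s, b)]
    gross = sum(abs(x) for x in raw)
    return {k: x / gross for k, x in zip(names, raw)}


# Hypothetical past-week returns and market betas.
week_returns = {"AAA": 0.06, "BBB": -0.04, "CCC": 0.01, "DDD": -0.03}
betas = {"AAA": 1.1, "BBB": 1.2, "CCC": 0.8, "DDD": 0.9}

weights = neutral_weights(mean_reversion_scores(week_returns), betas)
for name, w in weights.items():
    print(f"{name}: {w:+.3f}")
```

In this example the recent winner AAA ends up with the largest short weight, and the weights sum to zero both in dollar terms and when multiplied by beta, up to floating-point rounding.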
Over a finite period of time, a low-probability market movement may impose heavy short-term losses. If such short-term losses exceed the funding the investor has available to meet interim margin calls, its positions may need to be liquidated at a loss even when the strategy's modeled forecasts ultimately turn out to be correct. The 1998 collapse of Long-Term Capital Management was a widely publicized example of a fund that failed due to its inability to post collateral to cover adverse market fluctuations.
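The funding risk described above can be made concrete with a small sketch. It assumes a hypothetical cumulative mark-to-market profit-and-loss path for a trade whose forecast is eventually correct, and a simplified rule that the position is closed as soon as the loss exceeds the available capital. The path, the capital levels, and the function name `run_trade` are invented for illustration.

```python
def run_trade(pnl_path, capital):
    """Walk a cumulative mark-to-market P&L path.

    Returns ("liquidated", pnl) at the first point where the loss
    exceeds the available capital, otherwise ("held", final_pnl).
    """
    for pnl in pnl_path:
        if -pnl > capital:
            return "liquidated", pnl
    return "held", pnl_path[-1]


# Hypothetical cumulative P&L: deep interim loss, eventual profit.
path = [-2.0, -5.0, -9.0, -4.0, 1.0, 6.0]

print(run_trade(path, capital=10.0))  # ('held', 6.0)
print(run_trade(path, capital=8.0))   # ('liquidated', -9.0)
```

The same trade ends with a profit or a realized loss depending only on how much capital is available to survive the interim drawdown.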
Statistical arbitrage is also subject to model weakness as well as stock- or security-specific risk. The statistical relationship on which the model is based may be spurious, or may break down due to changes in the distribution of returns on the underlying assets. Factors to which the model has exposure without accounting for it could become significant drivers of price action in the markets, and the inverse also applies. The existence of investment based upon the model may itself change the underlying relationship, particularly if enough entrants invest with similar principles. The exploitation of arbitrage opportunities itself increases the efficiency of the market, thereby reducing the scope for arbitrage, so continual updating of models is necessary.
On a stock-specific level, there is risk of M&A activity or even default for an individual name. Such an event would immediately invalidate the significance of any historical relationship assumed from empirical statistical analysis of the past data.
StatArb and systemic risk: events of summer 2007
During July and August 2007, a number of StatArb (and other quant-type) hedge funds experienced significant losses at the same time, which is difficult to explain unless there was a common risk factor. While the reasons are not yet fully understood, several published accounts blame the emergency liquidation of a fund that experienced capital withdrawals or margin calls. By closing out its positions quickly, the fund put pressure on the prices of the stocks it was long and short. Because other StatArb funds had similar positions, due to the similarity of their alpha models and risk-reduction models, the other funds experienced adverse returns. One account of the events describes how Morgan Stanley's highly successful StatArb fund, PDT, decided to reduce its positions in response to stresses in other parts of the firm, and how this contributed to several days of hectic trading.
In a sense, the fact of a stock being heavily involved in StatArb is itself a risk factor, one that is relatively new and thus was not taken into account by the StatArb models. These events showed that StatArb has developed to a point where it is a significant factor in the marketplace, that existing funds have similar positions and are in effect competing for the same returns. Simulations of simple StatArb strategies by Khandani and Lo show that the returns to such strategies have been reduced considerably from 1998 to 2007, presumably because of competition.
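A simple strategy of this general kind can be sketched as a contrarian rule in the spirit of the short-term return-reversal effect documented by Lehmann (see the reading list): each stock's weight is proportional to minus its previous-period return in excess of the equal-weighted average, so recent losers are bought and recent winners are sold. Treating this as the rule simulated by Khandani and Lo is an assumption made here for illustration; the returns and function names below are hypothetical.

```python
def contrarian_weights(prev_returns):
    """Weights proportional to minus last period's excess return.

    The excess is measured against the equal-weighted average, so the
    weights sum to zero (a dollar-neutral portfolio).
    """
    n = len(prev_returns)
    market = sum(prev_returns) / n
    return [-(r - market) / n for r in prev_returns]


def strategy_profit(prev_returns, next_returns):
    """Profit of the contrarian portfolio over the next period."""
    weights = contrarian_weights(prev_returns)
    return sum(w * r for w, r in zip(weights, next_returns))


# Hypothetical returns in which last period's moves partly reverse.
prev = [0.03, -0.01, -0.02]
nxt = [-0.01, 0.00, 0.01]

print(contrarian_weights(prev))
print(strategy_profit(prev, nxt) > 0)  # True
```

The rule earns a profit only when returns tend to reverse. If many funds trade it, the reversal they are all trying to capture shrinks, which is consistent with the competition explanation given above.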
It is a point of contention whether the common reduction in portfolio value should instead be attributed to a shared external cause. The 2007-2008 financial crisis also occurred at this time, and many, if not the vast majority, of investors of any form booked losses during this one-year time frame. The association of observed losses at hedge funds using statistical arbitrage is therefore not necessarily indicative of dependence among them. As more competitors enter the market and funds diversify their trades across more platforms than StatArb, it can be argued that there is no reason to expect the platform models to behave alike; their statistical models could be entirely independent.
Statistical arbitrage faces different regulatory situations in different countries or markets. In many countries where the securities or derivatives markets are not fully developed, investors find it infeasible or unprofitable to implement statistical arbitrage in local markets.
In China, quantitative investment, including statistical arbitrage, is not the mainstream approach to investment. A set of market conditions restricts the trading behavior of funds and other financial institutions. Restrictions on short selling and market stabilization mechanisms (e.g. daily price limits) pose serious obstacles when either individual or institutional investors try to implement the trading strategies implied by statistical arbitrage theory.
- Currency correlation
- Fourier-related transforms
- Machine learning
- Time series
- Volatility arbitrage
- Andrew W. Lo (2010). Hedge Funds: An Analytic Perspective (Revised and expanded ed.). Princeton University Press. p. 260. ISBN 978-0-691-14598-3.
- "Statistical Arbitrage". DayTradeTheWorld. 28 February 2020.
- Mahdavi Damghani, Babak (2013). "The Non-Misleading Value of Inferred Correlation: An Introduction to the Cointelation Model". Wilmott. 2013 (1): 50–61. doi:10.1002/wilm.10252.
- Rad, Hossein; Low, Rand Kwong Yew; Faff, Robert (2016-04-27). "The profitability of pairs trading strategies: distance, cointegration and copula methods". Quantitative Finance. 16 (10): 1541–1558. doi:10.1080/14697688.2016.1164337. ISSN 1469-7688. S2CID 219717488.
- Avellaneda, Marco (Spring 2011). "Risk and Portfolio Management; Statistical Arbitrage" (PDF). Courant Institute of Mathematical Sciences. Retrieved 2015-03-30.
- For example, Andrew Lo (op.cit.) states "the widespread use of standardized factor risk models such as those from MSCI/BARRA or North-field Information Systems ... will almost certainly create common exposures among those managers to the risk factors contained in such platforms"
- Lowenstein, Roger (2000). When Genius Failed: The Rise and Fall of Long-Term Capital Management. Random House. ISBN 978-0-375-50317-7.
- Mahdavi Damghani, Babak (2012). "The Misleading Value of Measured Correlation". Wilmott. 2012 (1): 64–73. doi:10.1002/wilm.10167. S2CID 154550363.
- Amir Khandani and Andrew Lo. What Happened to the Quants In August 2007?
- Scott Patterson (2010-01-22). "The Minds Behind the Meltdown". Wall Street Journal Online. Retrieved 2011-06-06.
- Amir Khandani and Andrew Lo. What Happened to the Quants in August 2007?: Evidence from Factors and Transactions Data
- Mahdavi Damghani, Babak (2013). "De-arbitraging With a Weak Smile: Application to Skew Risk". Wilmott. 2013 (1): 40–49. doi:10.1002/wilm.10201. S2CID 154646708.
- Avellaneda, M. and J.H. Lee: "Statistical arbitrage in the US equities market". A well-documented empirical study which confirms that StatArb profitability dropped after 2002 and 2003.
- Bertram, W.K., 2009, Analytic Solutions for Optimal Statistical Arbitrage Trading, Available at SSRN: https://ssrn.com/abstract=1505073.
- Bertram, W.K., 2009, Optimal Trading Strategies for Ito Diffusion Processes, Physica A, Forthcoming. Available at SSRN: https://ssrn.com/abstract=1371903. Presents a robust theoretical framework for statistical arbitrage trading.
- Richard Bookstaber: A Demon Of Our Own Design, Wiley (2006). Describes: the birth of Stat Arb at Morgan Stanley in the mid-1980s, out of the pairs trading ideas of Gerry Bamberger. The eclipse of the concept after the departure of Bamberger for Newport/Princeton Partners and of D.E. Shaw to start his own StatArb firm. And finally the revival of StatArb at Morgan Stanley under Peter Muller in 1992. Includes this comment (p. 194): “Statistical arbitrage is now past its prime. In mid-2002 the performance of stat arb strategies began to wane, and the standard methods have not recovered.”
- Jegadeesh, N., 1990, 'Evidence of Predictable Behavior of Security Returns', Journal of Finance 45, pp. 881–898. An important early article (along with Lehmann’s) about short-term return predictability, the source of StatArb returns.
- Kolman, Joe (1998). "Inside D. E. Shaw". Derivatives Strategy. Retrieved 23 June 2013.
- Lehmann, B., 1990, 'Fads, Martingales, and Market Efficiency', Quarterly Journal of Economics 105, pp. 1–28. First article in the open literature to document the short term return-reversal effect that early StatArb funds exploited.
- Ed Thorp: A Perspective on Quantitative Finance – Models for Beating the Market. Autobiographical piece describing Ed Thorp's stat arb work in the early and mid-1980s (see p. 5).
- Ed Thorp: Statistical Arbitrage, Wilmott Magazine, June 2008 (in six parts). More reminiscences from the early days of StatArb from one of its pioneers.