Statistical learning refers to a set of tools or models that help extract insights from datasets. Understanding statistical learning is critical in modern financial markets, even for non-quants (view post here). This is because statistical learning illustrates and replicates how the experiences of investors in markets shape their future behavior. In financial markets, statistical learning can enhance information efficiency in many ways. It can also directly predict returns, market direction, or the impact of specific events. Methods range from simple regression to complex neural networks. Simple models can deliver superior returns if they avoid “overfitting”, i.e. fitting a model so closely to past, often recent, experience that it captures noise rather than a stable relation. Success must be validated on “out-of-sample” test sets that played no role in the estimation of model parameters or the choice of hyperparameters (model structure).
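The out-of-sample principle can be illustrated with a minimal sketch. It assumes synthetic data and a plain chronological split; the helper names (`chronological_split`, `fit_ols`, `predict_ols`, `r_squared`) are hypothetical and not taken from any referenced post. The key point is that the test block lies strictly after the training block and is never used for fitting.

```python
import numpy as np

def chronological_split(n_obs, test_fraction=0.25):
    """Index arrays for an ordered train/test split (no shuffling)."""
    cut = int(n_obs * (1.0 - test_fraction))
    return np.arange(cut), np.arange(cut, n_obs)

def fit_ols(X, y):
    """Least-squares coefficients; the intercept is the first element."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict_ols(beta, X):
    return beta[0] + X @ beta[1:]

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic data (assumption): three signals, only the first one matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 0.5 * X[:, 0] + rng.normal(scale=1.0, size=400)

train, test = chronological_split(len(y))
beta = fit_ols(X[train], y[train])           # parameters see training data only
r2_in = r_squared(y[train], predict_ols(beta, X[train]))
r2_out = r_squared(y[test], predict_ols(beta, X[test]))
print(f"in-sample R2: {r2_in:.3f}, out-of-sample R2: {r2_out:.3f}")
```

If hyperparameters are also tuned, a third block (a validation set between training and test data) would be needed so that the final test set stays untouched.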
Linear regression remains the most popular tool for supervised learning in financial markets. It is appropriate if one can relate market returns to previously available information in a theoretically plausible functional form. In the macro trading space, mixed data sampling (MIDAS) regressions are a useful method for nowcasting economic trends and financial market variables, such as volatility (view post here). This type of regression allows one to combine time series of different frequencies. It limits the number of parameters to be estimated by tying the coefficients of the high-frequency lags to a parsimonious weighting function.
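As a rough illustration of how MIDAS keeps the parameter count low, the sketch below uses exponential Almon lag weights, one common weighting scheme. It assumes synthetic data, a coarse grid search instead of nonlinear least squares, and hypothetical names (`exp_almon_weights`, `fit_midas`). Twenty high-frequency lags are governed by two shape parameters plus an intercept and a slope.

```python
import numpy as np

def exp_almon_weights(n_lags, theta1, theta2):
    """Exponential Almon weights over lags 1..n_lags, normalized to sum to 1."""
    k = np.arange(1, n_lags + 1)
    z = theta1 * k + theta2 * k ** 2
    w = np.exp(z - z.max())                  # subtract max for numerical stability
    return w / w.sum()

def fit_midas(y, X_hf, theta_grid):
    """For each (theta1, theta2), aggregate the lags and fit intercept/slope by OLS."""
    best = None
    for theta1, theta2 in theta_grid:
        w = exp_almon_weights(X_hf.shape[1], theta1, theta2)
        A = np.column_stack([np.ones(len(y)), X_hf @ w])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(np.sum((y - A @ coef) ** 2))
        if best is None or sse < best["sse"]:
            best = {"theta": (theta1, theta2), "intercept": coef[0],
                    "slope": coef[1], "weights": w, "sse": sse}
    return best

# Synthetic data (assumption): each row of X_hf holds the 20 most recent
# high-frequency observations available for one low-frequency period.
rng = np.random.default_rng(0)
X_hf = rng.normal(size=(200, 20))
w_true = exp_almon_weights(20, 0.0, -0.05)
y = 1.0 + 2.0 * X_hf @ w_true + rng.normal(scale=0.5, size=200)

theta_grid = [(a, b) for a in np.linspace(-0.5, 0.5, 11)
              for b in np.linspace(-0.1, 0.0, 11)]
fit = fit_midas(y, X_hf, theta_grid)
print(fit["theta"], round(fit["slope"], 2))
```

An unrestricted regression on the same data would estimate 21 coefficients; here only four numbers are chosen, at the cost of imposing a smooth lag shape.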
Structural vector autoregression (SVAR) is a practical model class if one wishes to capture several interconnected time series processes. It studies the evolution of a set of linearly related observable time series variables, such as economic data or asset prices. SVAR assumes that all variables depend linearly, with fixed coefficients, on past values of the set and on new structural shocks. The method is useful for macro trading strategies (view post here) because it helps identify specific interpretable market and macro shocks (view post here). For example, SVAR can identify short-term policy, growth, or inflation expectation shocks. Once a shock is identified, it can be used for trading in two ways. First, one can compare the type of shock implied by markets with the actual news flow and detect fundamental inconsistencies. Second, different types of shocks may entail different subsequent asset price dynamics and, hence, form a basis for systematic strategies.
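The mechanics of shock identification can be sketched as follows. This is an illustration only: it assumes a VAR with one lag, synthetic data, and a recursive (Cholesky) identification, which is just one of several schemes and relies on the chosen ordering of variables. The function names and the interpretation of the three columns (policy, growth, inflation expectations) are hypothetical.

```python
import numpy as np

def fit_var1(Y):
    """OLS fit of Y_t = c + A Y_{t-1} + u_t; returns c, A and residuals u."""
    X = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])
    B, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
    resid = Y[1:] - X @ B
    return B[0], B[1:].T, resid

def recursive_shocks(resid):
    """Cholesky identification: u_t = P e_t with P lower triangular."""
    sigma = resid.T @ resid / len(resid)
    P = np.linalg.cholesky(sigma)
    shocks = np.linalg.solve(P, resid.T).T   # e_t = P^{-1} u_t
    return P, shocks

def impulse_responses(A, P, horizons):
    """Responses to one-standard-deviation structural shocks: A^h P for h = 0..horizons."""
    out, A_h = [], np.eye(A.shape[0])
    for _ in range(horizons + 1):
        out.append(A_h @ P)
        A_h = A @ A_h
    return np.array(out)

# Synthetic data (assumption): three series generated by a stable VAR(1).
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.1, 0.0], [0.0, 0.4, 0.1], [0.1, 0.0, 0.3]])
P_true = np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.2, -0.3, 1.0]])
Y = np.zeros((500, 3))
for t in range(1, 500):
    Y[t] = A_true @ Y[t - 1] + P_true @ rng.normal(size=3)

c_hat, A_hat, resid = fit_var1(Y)
P, shocks = recursive_shocks(resid)
irf = impulse_responses(A_hat, P, horizons=8)
print(np.round(P, 2))
```

The recovered `shocks` series are uncorrelated by construction, and `irf[h][i, j]` gives the estimated response of variable `i` to shock `j` after `h` periods, which is the kind of object one would compare across shock types.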
A particularly important practice of statistical learning for investment research is dimension reduction. This refers to methods that condense the bulk of the information in a large set of macroeconomic time series into a smaller set that distills what matters most for investors. In macroeconomics, many related data series add only limited incremental information. Cramming all of them into a prediction model undermines estimation stability and transparency. There are three types of statistical dimension reduction methods.
- The first type of dimension reduction selects a subset of “best” explanatory variables by means of regularization, i.e. shrinking coefficients by penalizing their magnitudes in the objective function used for statistical fit. Penalty functions that are linear in the absolute values of individual coefficients (L1 penalties) can set some of them exactly to zero. Classic methods of this type are Lasso and Elastic Net (view post here).
- The second type extracts a small set of latent background factors from all explanatory variables and then uses these background factors for prediction. This is the basic idea behind static and dynamic factor models. Factor models are one key technology behind nowcasting in financial markets, a modern approach to monitoring current economic conditions in real time (view post here). While nowcasting has mostly been used to predict forthcoming data reports, particularly GDP, the underlying factor models can produce much more useful information for the investment process, including latent trends, indications of significant changes in such trends, and estimates of the changing importance of various predictor data series (view post here).
- The third type generates a small set of functions of the original explanatory variables that historically would have retained their explanatory power and then deploys these for forecasting. This method is called Sufficient Dimension Reduction and is more suitable for non-linear relations (view post here).
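The first two types of dimension reduction can be sketched in a few lines. The example below is a simplified illustration on synthetic data: a Lasso fitted by coordinate descent (type one) and static factors extracted by principal components (type two, a basic stand-in for the factor models mentioned above). Function names such as `lasso_cd` and `pca_factors` are hypothetical, and Sufficient Dimension Reduction is not covered here.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]       # remove feature j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_scale[j]
            resid -= X[:, j] * beta[j]       # add the updated contribution back
    return beta

def pca_factors(X, n_factors):
    """Principal-component factors of standardized data via SVD."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    factors = U[:, :n_factors] * s[:n_factors]
    share = s[:n_factors] ** 2 / np.sum(s ** 2)   # variance share per factor
    return factors, Vt[:n_factors].T, share

# Synthetic data (assumption): 20 indicators driven by 2 latent factors.
rng = np.random.default_rng(2)
F = rng.normal(size=(300, 2))
X = F @ rng.normal(size=(2, 20)) + 0.5 * rng.normal(size=(300, 20))
y = 0.8 * X[:, 0] - 0.5 * X[:, 5] + rng.normal(scale=0.5, size=300)

Xs = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize before penalizing
yc = y - y.mean()
beta_lasso = lasso_cd(Xs, yc, lam=0.1)
factors, loadings, share = pca_factors(X, n_factors=2)
print("non-zero Lasso coefficients:", int(np.sum(beta_lasso != 0.0)))
print("variance share of 2 factors:", round(float(share.sum()), 2))
```

The penalty `lam=0.1` is an arbitrary choice for illustration; in practice it is a hyperparameter that would be selected on validation data, not on the test set.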
Dimension reduction methods not only help condense the information in predictors of trading strategies but also support portfolio construction. In particular, they are suited for detecting latent factors of a broad set of asset prices (view post here). These factors can be used to improve estimates of the covariance structure of these prices and, by extension, to improve the construction of a well-diversified minimum variance portfolio (view post here).
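A hedged sketch of this idea follows. It assumes synthetic returns, a principal-components factor structure with a diagonal idiosyncratic part, and the unconstrained closed-form minimum variance weights (short positions allowed). The names `factor_covariance` and `min_variance_weights` are hypothetical, and the approach is one simple variant, not necessarily the one used in the referenced posts.

```python
import numpy as np

def factor_covariance(returns, n_factors):
    """Covariance estimate: low-rank common part plus diagonal idiosyncratic variances."""
    R = returns - returns.mean(axis=0)
    S = R.T @ R / (len(R) - 1)               # sample covariance
    eigval, eigvec = np.linalg.eigh(S)       # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:n_factors]
    B = eigvec[:, top] * np.sqrt(eigval[top])    # factor loadings
    common = B @ B.T
    idio = np.clip(np.diag(S) - np.diag(common), 1e-12, None)
    return common + np.diag(idio)

def min_variance_weights(cov):
    """Fully invested minimum variance weights: w = C^{-1} 1 / (1' C^{-1} 1)."""
    raw = np.linalg.solve(cov, np.ones(len(cov)))
    return raw / raw.sum()

# Synthetic data (assumption): 10 assets driven by 2 common factors.
rng = np.random.default_rng(3)
f = rng.normal(size=(500, 2))
returns = f @ rng.normal(size=(2, 10)) + rng.normal(size=(500, 10))

cov = factor_covariance(returns, n_factors=2)
w = min_variance_weights(cov)
equal = np.full(10, 0.1)
print("min-variance portfolio variance:", round(float(w @ cov @ w), 3))
print("equal-weight portfolio variance:", round(float(equal @ cov @ equal), 3))
```

Imposing a factor structure reduces the number of free covariance parameters, which tends to make the inverse used in the weight formula more stable than the inverse of the raw sample covariance when there are many assets relative to observations.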