“Dimension reduction” condenses the information content of a multitude of data series into a small, manageable set of factors or functions. This reduction is important for forecasting with macro variables because many data series have only limited and highly correlated information content. There are three types of statistical methods. The first type selects a subset of “best” explanatory variables (view post here). The second type estimates a small set of latent background factors of all explanatory variables and then uses these background factors for prediction (Dynamic Factor Models). The third type generates a small set of functions of the original explanatory variables that historically would have retained their explanatory power and then deploys these for forecasting (Sufficient Dimension Reduction).

This post ties in with the lecture on information efficiency.

The below are excerpts from the paper. Headings and some other cursive text have been added, particularly in place of formulas, for convenience of reading.

### Dimension reduction: what is it and why do we need it?

Here, dimension reduction is the mapping of available data series to a lower-dimensional space, i.e. a smaller number of time series, such that uninformative variance in the data is discarded.

“[Dimension reduction addresses] the challenges of macro-forecasting in a data rich environment [when] a large set [of] explanatory variables is available to forecast a single variable [such as future economic trends or related asset returns]…All [explanatory variables] are potentially useful in forecasting. Yet, estimation…via [classical ordinary least squares regression] can be problematic when the set of explanatory variables is large relative to the sample size, or variables in the set of explanatory variables are nearly collinear, as is often the case in…macro forecasting.”
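The collinearity problem mentioned in the quote can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are synthetic, the noise scale of 0.01 is arbitrary, and the helper name `ols_coef_std` is hypothetical. It compares the condition number of the regressor matrix and the textbook OLS coefficient standard errors for nearly collinear versus unrelated regressors.

```python
import numpy as np

def ols_coef_std(X, sigma=1.0):
    """Textbook OLS standard errors: sigma * sqrt(diag((X'X)^-1))."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return sigma * np.sqrt(np.diag(XtX_inv))

rng = np.random.default_rng(0)
n = 200
common = rng.standard_normal(n)  # shared signal (synthetic, for illustration)

# Two nearly collinear regressors: the same signal plus small measurement noise.
X_collinear = np.column_stack([
    common + 0.01 * rng.standard_normal(n),
    common + 0.01 * rng.standard_normal(n),
])
# Two unrelated regressors for comparison.
X_independent = rng.standard_normal((n, 2))

cond_collinear = np.linalg.cond(X_collinear)
cond_independent = np.linalg.cond(X_independent)
se_collinear = ols_coef_std(X_collinear)
se_independent = ols_coef_std(X_independent)

print("condition numbers:", cond_collinear, cond_independent)
print("coefficient std errors:", se_collinear, se_independent)
```

With nearly identical regressors the individual coefficients are estimated far less precisely, even though the sample size is the same in both cases.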

“Let us take the example of a motorbike rider in racing competitions. Today, his position and movement gets measured by GPS sensor on bike, gyro meters, multiple video feeds and his smart watch. Because of respective errors in recording, the data would not be exactly same…Assume that an analyst sits with all this data to analyze the racing strategy of the biker – he or she would have a lot of variables or dimensions which are similar and of little (or no) incremental value. This is the problem of high unwanted dimensions and needs a treatment of dimension reduction.” [AnalyticsVidhya]

“Dimension reduction methods in regression fall into two categories: variable selection, where a subset of the original predictors is selected for modeling the response, and feature extraction, where linear combinations of the regressors, frequently referred to as ‘derived,’ replace the original regressors. The underlying assumption in variable selection is that the individual predictors have independent effects on the response, which is typically violated in econometric time series that have varying degrees of correlation.”

For methods of variable selection for macro trading strategies, view post here.

The two main types of feature extraction methods are summarized in the sections below.

### Dynamic Factor Models

“Stock and Watson [pioneers of dynamic factor models] after pointing out difficulties with [estimating the effects of a large number of similar regressors] resolve to use their factorial structure apparatus assuming from the outset that the forecast model [is a target variable that depends linearly on a small set of background variables or factors that govern the dynamics of the multitude of explanatory variables]. Stock and Watson showed…that the factors are identifiable and estimable by principal components.”

“The basic building blocks of the Dynamic Factor Model forecasting framework generalize the case of a classic factor structure assuming that [i] the set of explanatory variables is, up to idiosyncratic noise, driven by a small set of latent factors [and]…[ii] the forecasting model is a linear function of these factors allowing for response lags as predictors… An important feature of this framework is the assumption that the same factors that determine the marginal distribution of the explanatory variables also determine the conditional distribution of the target variable.”
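The two building blocks described in the quote are commonly written as follows. The notation is an assumption added here for illustration and is not taken from the paper: $x_t$ is the vector of $N$ explanatory variables, $f_t$ the vector of $r \ll N$ latent factors, $\Lambda$ the matrix of factor loadings, $e_t$ the idiosyncratic noise, $h$ the forecast horizon, and $\gamma(L)$ a lag polynomial on the target variable.

$$
x_t = \Lambda f_t + e_t, \qquad y_{t+h} = \beta' f_t + \gamma(L)\, y_t + \varepsilon_{t+h}
$$

The first equation is block [i], the factor structure of the explanatory variables. The second is block [ii], the linear forecasting model in the factors with lags of the target variable as additional predictors.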

“Factors can be identified and estimated via principal components…Principal Component Regression operates in two stages. First, the linear combinations that maximize the variance of the explanatory variables and are mutually orthogonal are extracted… Secondly, the target variable is regressed on the first couple principal components.”
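The two stages of Principal Component Regression can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the paper's implementation: the function names `pcr_fit` and `pcr_predict`, the two-factor data generating process, and the choice of two components are all assumptions made for the example. In practice the explanatory variables would usually be standardized first.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Stage 1: extract principal components. Stage 2: regress y on them."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # SVD of the centered data: rows of Vt are orthonormal directions,
    # ordered by the variance of X that they capture.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T          # shape (p, k)
    scores = Xc @ loadings                  # principal components, shape (n, k)
    coef, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return x_mean, y_mean, loadings, coef

def pcr_predict(model, X_new):
    x_mean, y_mean, loadings, coef = model
    return y_mean + (X_new - x_mean) @ loadings @ coef

# Synthetic panel: 20 series driven by 2 latent factors plus noise.
rng = np.random.default_rng(1)
n, p, r = 300, 20, 2
factors = rng.standard_normal((n, r))
X = factors @ rng.standard_normal((r, p)) + 0.1 * rng.standard_normal((n, p))
y = factors @ np.array([1.0, -0.5]) + 0.1 * rng.standard_normal(n)

model = pcr_fit(X, y, n_components=2)
fitted = pcr_predict(model, X)
r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
print("in-sample R^2 with 2 components:", r2)
```

Note that the components in stage 1 are chosen without looking at the target variable, which is the point taken up in the next quote.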

“Studies that exploit a factor structure in forecasting follow a recurrent theme: First the dimension of a large panel of data is reduced to a sufficiently ‘lean’ factor structure and then the factors are used to forecast a target variable [such as an asset return]. The reduction step has so far been largely disconnected from the targeting step…Targeting comes into the picture only after a condensed latent structure is estimated and is resolved by postulating a linear relationship between the target variable y and the factors.”

### Sufficient Dimension Reduction

“Sufficient Dimension Reduction (SDR) is a collection of novel tools for reducing the dimension of multivariate data in regression problems without losing inferential information on the distribution of a target variable…The reduction and targeting are carried out simultaneously as SDR identifies a sufficient function of the regressors [or explanatory variables] that preserves the information in the conditional distribution of [the targeted variables]…It is assumed that the data generating process directly generates the set of regressors without the mediation of latent factors, and conditions that ensure identification and estimation are placed directly on the marginal distribution of [the explanatory variables].”

“SDR works directly with observables and seeks to identify how many and which functions of the explanatory variables are needed to fully describe the conditional cumulative distribution function.”

“Specifically, SDR aims to identify and estimate functions of the predictors so that the conditional distribution of the future target variable with respect to the functions is equal to its conditional distribution with respect to the original explanatory variables. These functions are called reductions because they preserve all the information that the explanatory variables carry about the future target variable. Obviously, only if such functions are fewer than the number of original explanatory variables do they represent proper reductions… Reductions can be either linear or nonlinear functions of the panel data.”
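In symbols, the sufficiency condition described in the quote can be written as below. The notation is an assumption introduced here: $x_t$ is the vector of $p$ explanatory variables, $y_{t+h}$ the future target variable, $F$ a conditional distribution function, and $R$ the reduction.

$$
F\left(y_{t+h} \mid x_t\right) = F\left(y_{t+h} \mid R(x_t)\right), \qquad R: \mathbb{R}^{p} \to \mathbb{R}^{d}, \quad d < p
$$

A linear reduction is the special case $R(x_t) = B' x_t$ for a $p \times d$ matrix $B$.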

“In contrast to Dynamic Factor Models, SDR methods depart from the linear factor assumption. In fact, SDR methods thrive when the relationship between the target variable and the predictors contains nonlinearities.”
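One well-known SDR estimator for linear reductions is sliced inverse regression (SIR). The sketch below is a minimal illustration and should not be read as the paper's exact procedure. The function name `sir_directions`, the number of slices, the cubic link, and the use of independent rather than time-series data are all assumptions made for the example. It shows a single linear combination of the predictors being recovered even though the target depends on it nonlinearly.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1):
    """Sliced inverse regression: estimate directions B such that y
    depends on X (approximately) only through X @ B."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # Standardize predictors with the inverse square root of their covariance.
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    cov_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ cov_inv_sqrt
    # Slice the sample by the ordered target and average Z within each slice.
    slices = np.array_split(np.argsort(y), n_slices)
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors of the weighted covariance of slice means.
    _, vecs = np.linalg.eigh(M)              # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = cov_inv_sqrt @ top                   # back to the original scale
    return B / np.linalg.norm(B, axis=0)

# Synthetic example: y is a cubic function of one linear combination.
rng = np.random.default_rng(2)
n, p = 1000, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = 1.0 / np.sqrt(2.0)
y = (X @ beta_true) ** 3 + 0.5 * rng.standard_normal(n)

B = sir_directions(X, y, n_slices=10, n_directions=1)
cosine = abs(B[:, 0] @ beta_true)  # sign of the direction is not identified
print("|cosine| between estimated and true direction:", cosine)
```

SIR relies on the slice means of the predictors moving with the target, so it can miss directions when the link is symmetric around zero (for example a pure square). Other SDR estimators exist for such cases.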

“In summary, our SDR approach is complementary to the Dynamic Factor Model framework…[It] is likely a better organizing framework for interpreting empirical results if the final objective is prediction…However, if the purpose is to identify the basic forces driving a panel of variables, the DFM framework remains a very effective device.”