Quantitative methods for macro information efficiency

Financial markets are not information efficient with respect to macroeconomic information because data are notoriously ‘dirty’, relevant economic research is expensive, and establishing stable relations between macro data and market performance is challenging. However, statistical programming and packages have prepared the ground for great advances in macro information efficiency. The quantitative path to macro information efficiency leads through three stages. The first is elaborate, in-depth data wrangling that turns raw macro data (and even textual information) into clean and meaningful time series whose frequency and time stamps accord with market prices. The second stage is statistical learning, be it supervised (to validate logical hypotheses) or unsupervised (to detect patterns). The third stage is realistic backtesting to verify the value of the learning process and to assess the commercial viability of a macro trading strategy.

Basic points

Statistical programming

The rise of statistical programming is enabling the expansion of data science for the macro trading space. Data science is simply a set of methods to extract knowledge from data. The methods do not necessarily have to be complicated and the datasets do not necessarily have to be “big”. Indeed, in the macroeconomic space data do not expand as rapidly as in other areas, because data frequency is typically low (often just monthly and quarterly), the set of relevant countries and currency areas is limited, and there is not much scope for adding data from experimental settings.

With modern tools it is easy for portfolio management to incorporate data science. It takes essentially three things: [1] API access to relevant economics and markets databases, such as Refinitiv, Macrobond, Bloomberg, Haver, Quandl, Trading Economics and so forth, [2] research into what the economic and market time series actually mean and how that should affect market prices, and [3] a suitable programming platform to work with the data. The latter typically calls for either R or Python:

  • The R project provides a programming language and work environment for statistical analysis. It is not just for programmers (view post here). Even with limited coding skills, R outclasses Excel spreadsheets and boosts information efficiency. Like Excel, the R environment is built around data structures, albeit far more flexible ones. Operations on data are simple and efficient, particularly for import, wrangling, and complex transformations. The tidyverse, a collection of packages that facilitate data science within R, is particularly useful for the purpose of macro trading research (view post here). Moreover, R is a functional programming language. This means that functions can use other functions as arguments, making code succinct and readable. Specialized “functions of functions” map elaborate coding subroutines to data structures. Finally, R users have access to a repository of almost 15,000 packages of functions for all sorts of operations and analyses.
  • Python is an interpreted, object-oriented, high-level programming language, which is also highly convenient for data analysis through libraries such as NumPy (foundational package for scientific computing), SciPy (collection of packages for scientific computing), pandas (structures and functions to work with data), matplotlib (package for plots and other 2D data visualizations), and IPython (robust and productive environment for interactive and exploratory computing). Python is exceptionally popular among programmers. Unlike domain-specific languages, such as R, Python is not only suitable for research and prototyping but also for building production systems.
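
As a flavour of the latter option, the snippet below uses pandas to turn a small set of hypothetical monthly CPI levels into an annualized inflation series with month-end time stamps, so that it can be aligned with market prices. All numbers are invented for illustration.

```python
import pandas as pd

# Hypothetical monthly CPI levels indexed by reference month (assumed numbers).
cpi = pd.Series(
    [100.0, 100.4, 100.9, 101.1, 101.8, 102.0],
    index=pd.period_range("2023-01", periods=6, freq="M"),
    name="cpi",
)

# Annualized month-on-month inflation, re-indexed to month-end dates so the
# series can be matched against daily market price data.
infl = cpi.pct_change().mul(12 * 100).rename("infl_ar_pct")
infl.index = infl.index.to_timestamp(how="end").normalize()

print(infl.round(2))
```

The same few lines replace what would be a fragile multi-column construction in a spreadsheet.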

N.B.: A newer open-source statistical programming language, Julia, appeared in 2012. It distinguishes itself from R and Python through greater speed. However, it has not yet established itself in terms of community and packages to an extent comparable with R and Python.

Statistical programming allows the construction of quantamental systems (view post here). A quantamental system combines customized high-quality databases and statistical language code in order to systematically investigate relations between market returns and plausible predictors. The term “quantamental” refers to a joint quantitative and fundamental approach to investing. The purpose of a quantamental system is to increase the information efficiency of investment managers, support the development of robust algorithmic trading strategies and to reduce costs of quantitative research.

Beyond system building, statistical programming objects facilitate many common analytical tasks in asset management. For example, predictive power scores, which are based on Decision Tree models, are an alternative to correlation matrices for quick data exploration (view post here). Unlike correlation, this score also captures non-linear relations, categorical data, and asymmetric relations.
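
A simplified analogue of such a score can be sketched with scikit-learn: a shallow decision tree is evaluated out-of-fold against a naive median forecast on a deliberately non-linear relation that plain correlation misses. The data and the exact score definition below are illustrative assumptions, not the published methodology.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 500)
y = x ** 2 + rng.normal(0, 0.05, 500)  # non-linear link, near-zero correlation

# Pearson correlation misses the quadratic relation almost entirely.
corr = np.corrcoef(x, y)[0, 1]

# Simplified predictive power score: 1 - MAE(model) / MAE(naive median),
# with the tree's predictions generated out-of-fold.
pred = cross_val_predict(DecisionTreeRegressor(max_depth=4, random_state=0),
                         x.reshape(-1, 1), y, cv=4)
mae_model = np.mean(np.abs(y - pred))
mae_naive = np.mean(np.abs(y - np.median(y)))
pps = max(0.0, 1 - mae_model / mae_naive)

print(f"correlation: {corr:.2f}, predictive power score: {pps:.2f}")
```

The correlation is close to zero while the tree-based score is clearly positive, which is exactly the asymmetry the post describes.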

“No programming language is perfect. There is not even a single best language; there are only languages well suited or perhaps poorly suited for particular purposes.”

Herbert Mayer

Statistical wrangling

In order to be relevant for portfolio management, macro information must be relatable to market prices, both conceptually and in terms of timing. This is an essential part of tradable economics, the technology for building systematic trading strategies based on economic data (view post here). Alas, relating macro and asset return data is often obstructed by many deficiencies in the former:

  • Short history: Many economic data series, particularly in emerging economies, have only 5-20 years of history, which does not allow assessing their impact across many business cycles. Often this necessitates combining them with older discontinued series or substitutes that markets had followed in the more distant past.
  • Revisions: Most databases simply record economic time series in their recently revised state. However, initial and intermediate releases of many economic indicators, such as GDP or business surveys, may have looked significantly different. This means that the information recorded for the past is often not the information that was actually available at the time.
  • Time inconsistency: Unlike for market data, time of reference and time of availability for economic data are not the same. The information for industrial production in January may only be available in late March. This information is typically not embedded in the databases of the main service providers.
  • Calendar effects: Many economic data series are strongly influenced by seasonal patterns, working day numbers, and school holiday schedules. While some series are calendar adjusted by the source, the adjustment is typically incomplete and not comparable across countries.
  • Distortions: Almost all economic data are at least temporarily distorted relative to the concept they are meant to measure. For example, inflation data are often affected by one-off tax changes and administered price hikes. Production and balance sheet data often display disruptions due to strikes or natural disasters and sudden breaks due to changes in methodology.
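
The time-inconsistency point above can be illustrated with a small pandas sketch that re-stamps a series from reference periods to approximate availability dates; the series values and the eight-week publication lag are assumptions for illustration.

```python
import pandas as pd

# Monthly industrial production indexed by reference period (assumed data).
ip = pd.Series([98.2, 99.1, 99.6],
               index=pd.period_range("2024-01", periods=3, freq="M"),
               name="ip")

# Assumed publication lag of roughly eight weeks: the January figure only
# becomes tradable information in late March.
PUB_LAG = pd.Timedelta(weeks=8)
release_dates = ip.index.to_timestamp(how="end").normalize() + PUB_LAG

pit = ip.copy()
pit.index = release_dates
print(pit)  # values stamped by availability, not by reference period
```

In practice the lag differs by indicator and country and should be based on actual release calendars, but the re-stamping logic is the same.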

Therefore, in order to make economic data tradable, they require an elaborate, in-depth process of data wrangling. Generally, data wrangling means the transformation of raw, irregular data into a clean, tidy dataset. In many sciences, this simply requires reformatting and relabelling. For macroeconomics, data wrangling takes a lot more.

  • Common technical procedures include [1] splicing different series across time according to pre-set rules, [2] combining various revised versions of series into a single ‘available at the time’ series, and [3] assigning publication time stamps to the periodic updates of time series.
  • Additional standard statistical procedures for economic data include seasonal and standard calendar adjustment (view post here), special holiday pattern adjustment, outlier adjustment and flexible filtering of volatile series. Seasonal adjustment is largely the domain of official software, and there are modules in R and Python that provide access to these. Beyond that, there are specialized packages in R and Python that assist with other types of adjustments.
  • Beyond that, machine learning methods can be used to replicate the statistical procedures that would have been applied in the past. Unlike market price trends, macroeconomic trends or states are hard to track in real time. Conventional econometric models are immutable and not backtestable, because they are built with hindsight and do not aim to replicate perceived economic trends of the past (even if their parameters are sequentially updated). Machine learning can remedy this. For example, a practical approach is “two-stage supervised learning” (view post here). The first stage is scouting features; the second stage evaluates candidate models.
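
The first of the technical procedures above, splicing, can be sketched in pandas as follows; the two overlapping series and the ratio-based link rule are illustrative assumptions.

```python
import pandas as pd

# Assumed example: a discontinued business survey ("old") overlaps with its
# successor ("new") in March 2019; all values are invented for illustration.
old = pd.Series([50.0, 52.0, 54.0],
                index=pd.period_range("2019-01", periods=3, freq="M"))
new = pd.Series([108.0, 110.0, 112.0],
                index=pd.period_range("2019-03", periods=3, freq="M"))

# Splice rule: rescale the old history to the new series' level at the first
# overlapping period, then keep the new series from that point onwards.
link = new.index[0]
ratio = new.loc[link] / old.loc[link]
spliced = pd.concat([old[old.index < link] * ratio, new])

print(spliced)
```

Whether levels should be linked by ratio or by difference depends on the nature of the series (index levels versus balances, for example) and belongs in the pre-set rules.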

Market data are generally easier to wrangle than economic data but also suffer from major deficiencies. The most common issues are missing or bad price data. There is – at present – no commercial database for a broad range of generic financial returns (as opposed to mere price series). Nor are there widely used packages of functions that specifically wrangle financial return data across asset classes.

News and comments are major drivers of asset prices, maybe more so than conventional price and economic data. Yet it is impossible for any financial professional to read and analyse the vast and growing flow of written information. This is becoming the domain of natural language processing, a technology that supports the quantitative evaluation of humans’ natural language (view post here). It delivers textual information in a structured form that makes it usable for financial market analysis. A range of useful tools is now available for extracting and analysing financial news and comments.
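
At its simplest, this structuring can be illustrated with a bag-of-words score in plain Python; the word lists and headlines below are invented, and practical applications would rely on proper NLP libraries and much richer lexicons.

```python
# A minimal bag-of-words sentiment score for macro news headlines. The word
# lists and headlines are illustrative assumptions, not a production lexicon.
POSITIVE = {"beats", "upgraded", "strong", "accelerates"}
NEGATIVE = {"misses", "downgraded", "weak", "contracts"}

def headline_score(text: str) -> int:
    """Count positive minus negative tokens in a lower-cased headline."""
    tokens = text.lower().split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

headlines = [
    "GDP growth beats forecasts as manufacturing accelerates",
    "Retail sector weak as consumer spending contracts",
]
scores = [headline_score(h) for h in headlines]
print(scores)  # structured numeric output from unstructured text
```

The point is the pipeline shape, not the scoring rule: free text goes in, a numeric time series that can be related to market prices comes out.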

“The fact that data science exists as a field is a colossal failure of statistics…Data munging and manipulation is hard and statistics has just said that’s not our domain.”

Hadley Wickham

Statistical learning

Statistical learning refers to a set of tools for modelling and understanding complex datasets. Understanding statistical learning is critical in modern financial markets, even for non-quants (view post here). This is because statistical learning provides guidance on how the experiences of investors in markets shape their future behaviour. Statistical learning works with complex datasets in order to forecast returns or to estimate the impact of specific events. Methods range from simple regression to complex machine learning. Simplicity can deliver superior returns if it avoids “overfitting”, i.e. gearing models to recent experiences. Success must be measured in “out-of-sample” predictive power after a model has been selected and estimated.
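
The overfitting point can be illustrated with a toy experiment: a flexible high-degree polynomial always fits the training sample at least as well as a simple line, yet typically predicts worse on held-out data. All numbers are simulated assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=60)
y = 0.3 * x + rng.normal(size=60)  # weak true signal, mostly noise (assumed)

x_tr, y_tr = x[:40], y[:40]        # "experience" the model is fitted to
x_te, y_te = x[40:], y[40:]        # genuinely out-of-sample data

def fit_mse(degree):
    """Fit a polynomial of the given degree in-sample; return train/test MSE."""
    coefs = np.polyfit(x_tr, y_tr, degree)
    return (np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2),
            np.mean((np.polyval(coefs, x_te) - y_te) ** 2))

for degree in (1, 12):
    tr, te = fit_mse(degree)
    print(f"degree {degree:2d}: train MSE {tr:.2f}, test MSE {te:.2f}")
```

The degree-12 fit is guaranteed to look better in-sample; judging it on the held-out points is what reveals the overfitting.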

Machine learning is based on statistical learning methods but partly automates the construction of forecast models through the study of data patterns, the selection of the best functional form for a given level of complexity, and the selection of the best level of complexity for out-of-sample forecasting. Machine learning can add efficiency to classical asset pricing models, such as factor models (forthcoming post), and macro trading rules, mainly because it is flexible, adaptable, and generalizes knowledge well (view post here). Beyond speed and convenience, machine learning methods are highly useful for macro trading research because they enable backtests that are based on methods rather than on specific factors. Backtests of specific factors are mostly invalid because the factor choice is typically shaped by historical experiences.

A standard statistical learning workflow for macro trading involves model training, model validation and method testing. In particular, it [1] selects form and parameters of trading models, [2] chooses the best of a set of models, based on past out-of-sample performance, and [3] assesses the value of the deployed learning method based on further out-of-sample results. A convenient technology is the ‘list-column workflow’ based on the tidyverse packages in R (view post here).
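
A bare-bones version of this three-step workflow can be sketched in Python as follows; the momentum rule, the candidate lookbacks, and the simulated returns are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
ret = rng.normal(0.02, 1.0, 1000)  # assumed daily returns of a traded asset

def momentum_pnl(returns, lookback):
    """PnL of a sign-of-trailing-mean trading rule with a given lookback."""
    out = []
    for t in range(lookback, len(returns) - 1):
        signal = np.sign(np.mean(returns[t - lookback:t]))
        out.append(signal * returns[t + 1])
    return np.array(out)

# [1]/[2]: select the model (here just a lookback parameter) on a validation
# window, based on past out-of-sample mean PnL ...
train, test = ret[:700], ret[700:]
candidates = {lb: momentum_pnl(train, lb).mean() for lb in (5, 20, 60)}
best = max(candidates, key=candidates.get)

# [3]: ... then judge the *method* on genuinely unseen data.
holdout_pnl = momentum_pnl(test, best).mean()
print(f"chosen lookback: {best}, holdout mean PnL: {holdout_pnl:.4f}")
```

The separation matters: the holdout result measures the learning method, not a lucky parameter choice.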

Machine learning and expert domain knowledge are not rivals but complementary. Domain expertise is critical for the quality of featurization, the choice of hyperparameters, the selection of training and test samples, and the choice of regularization strategy.

Machine learning is conventionally divided into three main fields: supervised learning, unsupervised learning, and reinforcement learning.

  • In supervised learning the researcher posits input variables and output variables and uses an algorithm to learn which function maps the former to the latter. This principle underlies the majority of statistical learning applications in financial markets. A classic example is the assessment of what the change in interest rate differential between two countries means for the dynamics of their exchange rate.
    Supervised learning can be divided into regression, where the output variable is a real number, and classification, where the output variable is a category, such as “policy easing” or “policy tightening” for central bank decisions. An important subset of supervised machine learning is ensemble methods, i.e. machine learning techniques that combine several base models in order to produce one optimal prediction. Ensemble methods include bagging, random forest and gradient boosting. They have been shown to produce superior predictive power for credit spread forecasts (view post here) and – in the case of random forest – for equity reward-risk timing (view post here).
    Also, artificial neural networks have become increasingly practical for (supervised and unsupervised) macro trading research. This is a popular machine learning method that consists of layers of data-processing units, connections between them and the application of weights and biases that are estimated based on training data. For example, neural networks can principally be used to estimate the state of the market at daily or higher frequency based on an appropriate feature space, i.e. data series that characterize the key features of a market (view post here). Beyond that, neural networks can be used to detect lagged correlations between different asset prices (view post here) or market price distortions (view post here).
  • Unsupervised learning only knows input data. Its goal is to model the underlying structure or distribution of the data in order to learn previously unknown patterns. Applications of unsupervised machine learning techniques include clustering (partitioning the data set according to similarity), anomaly detection, association mining and dimension reduction (see below). More specifically, unsupervised learning methods have been proposed to classify market regimes, i.e. persistent clusters of market conditions that affect the success of trading factors and strategies (view post here).
  • Reinforcement learning is a specialized application of (deep) machine learning that interacts with the environment and seeks to improve on the way it performs a task so as to maximize its reward (view post here). The computer employs trial and error. The model designer defines the reward but gives no clues as to how to solve the problem. Reinforcement learning holds potential for trading systems because markets are highly complex and quickly changing dynamic systems. Conventional forecasting models have been notoriously inadequate. A self-adaptive approach that can learn quickly from the outcome of actions may be more suitable. Reinforcement learning can benefit trading strategies directly, by supporting trading rules, and indirectly by supporting the estimation of trading-related indicators, such as real-time growth (view post here).
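
As a minimal illustration of the unsupervised branch, the sketch below clusters a synthetic two-feature market state (rolling volatility and trailing return) into candidate regimes with k-means; the regime-generating numbers are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
# Assumed feature space: (volatility, trailing return) for 400 days, drawn
# from two synthetic regimes (calm bull vs volatile bear).
calm = np.column_stack([rng.normal(0.5, 0.1, 200), rng.normal(0.05, 0.02, 200)])
stress = np.column_stack([rng.normal(2.0, 0.3, 200), rng.normal(-0.10, 0.05, 200)])
features = np.vstack([calm, stress])

# Cluster market conditions without labels; the clusters are candidate regimes.
X = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# The clustering should broadly separate the two generating regimes.
print(np.bincount(labels[:200], minlength=2), np.bincount(labels[200:], minlength=2))
```

Real applications would use richer feature sets and address the serial dependence of market states, but the logic is the same: no labels go in, regime assignments come out.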

Linear regression remains the most popular tool for supervised learning in financial markets (apart from informal chart and correlation analysis). It can be the appropriate model if it relates market returns to previous available information in a theoretically plausible functional form. For example, mixed data sampling (MIDAS) regressions are a particularly useful method for nowcasting economic trends and financial market variables, such as volatility (view post here). These regressions allow combining time series of different frequencies and limit the number of parameters that need to be estimated.
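
The gist of MIDAS can be sketched in a few lines: daily observations are condensed into a single quarterly regressor through exponential Almon weights governed by one decay parameter, which is then chosen by grid search. The simulated data and the decay and slope values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n_q, n_d = 80, 60  # quarters and daily observations per quarter (assumed)
daily = rng.normal(size=(n_q, n_d))

def exp_almon(theta, n):
    """Exponential Almon lag weights, normalized to sum to one."""
    w = np.exp(theta * np.arange(n))
    return w / w.sum()

# Simulated quarterly target that loads on the daily data with decaying
# weights; the decay of 0.05 and slope of 2.0 are illustrative assumptions.
y = 2.0 * daily @ exp_almon(0.05, n_d) + rng.normal(0, 0.05, n_q)

# MIDAS estimation sketch: grid-search the single decay parameter and fit
# slope and intercept by ordinary least squares at each grid point.
best = None
for theta in np.linspace(-0.2, 0.2, 41):
    z = daily @ exp_almon(theta, n_d)  # one weighted regressor per quarter
    coefs = np.polyfit(z, y, 1)
    sse = np.sum((np.polyval(coefs, z) - y) ** 2)
    if best is None or sse < best[0]:
        best = (sse, theta, coefs[0])

print(f"estimated decay: {best[1]:.2f}, estimated slope: {best[2]:.2f}")
```

The parsimony is the point: one decay parameter replaces sixty unrestricted daily coefficients.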

Structural vector autoregression (SVAR) is a practical model class for empirical macroeconomics. It studies the evolution of a set of linearly related observable time series variables, such as economic data or asset prices, assuming that all variables depend in fixed proportion on past values of the set and new structural shocks. The method can also be employed for macro trading strategies (view post here) because it helps to identify specific market and macro shocks (view post here). For example, SVAR can identify short-term policy, growth or inflation expectation shocks. Once a shock is identified, it can be used for trading in two ways. First, one can compare the type of shock implied by markets with the actual news flow and detect fundamental inconsistencies. Second, different types of shocks may entail different types of subsequent asset price dynamics and may form a basis for systematic strategies.
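
A stylized version of the identification step can be sketched with numpy: estimate a reduced-form VAR(1) by least squares, then recover structural shocks with a recursive (Cholesky) scheme. The simulated coefficients and the recursive ordering are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
T, k = 500, 2

# Simulate two linearly related series (say, a growth proxy and a yield);
# coefficient values are invented for illustration.
A = np.array([[0.5, 0.1], [0.2, 0.4]])        # VAR(1) coefficients
P_true = np.array([[1.0, 0.0], [0.5, 0.8]])   # structural impact matrix
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = A @ y[t - 1] + P_true @ rng.normal(size=k)

# Reduced-form VAR(1) by least squares: y_t = A_hat y_{t-1} + u_t.
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T
U = Y - X @ A_hat.T

# Recursive (Cholesky) identification: u_t = P e_t, under the assumption that
# the first variable does not react to the second within the period.
P_hat = np.linalg.cholesky(np.cov(U.T))
shocks = U @ np.linalg.inv(P_hat).T

print(np.round(P_hat, 2))  # should be close to the assumed impact matrix
```

The recovered shock series is what would be compared with the actual news flow or fed into a systematic strategy.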

One important area of statistical learning for investment research is dimension reduction. This refers to methods that condense the bulk of the information of a large set of macroeconomic time series into a smaller set that distills the relevant trends for investors. In macroeconomics there are many related data series that have only limited incremental relevant information value. There are three types of statistical dimension reduction methods.

  • The first type selects a subset of “best” explanatory variables (Elastic Net or Lasso, view post here).
  • The second type selects a small set of latent background factors of all explanatory variables and then uses these background factors for prediction. This is the key idea behind static and dynamic factor models. Factor models are one key technology behind nowcasting in financial markets, a modern approach to monitoring current economic conditions in real-time (view post here). While nowcasting has mostly been used to predict forthcoming data reports, particularly GDP, the underlying factor models can produce a lot more useful information for the investment process, including latent trends, indications of significant changes in such trends, and estimates of the changing importance of various predictor data series (view post here).
  • The third type generates a small set of functions of the original explanatory variables that historically would have retained their explanatory power and then deploys these for forecasting (Sufficient Dimension Reduction).
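
The second type can be sketched with principal components extracted by singular value decomposition: a simulated panel of twelve related series is condensed into two estimated latent factors. All data are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
T, n = 300, 12

# Assumed panel: 12 related macro series driven by two latent factors.
factors = rng.normal(size=(T, 2))
loadings = rng.normal(size=(2, n))
panel = factors @ loadings + 0.3 * rng.normal(size=(T, n))

# Principal components via SVD of the demeaned panel: a statistical estimate
# of the latent background factors.
Z = panel - panel.mean(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)
pcs = Z @ Vt[:2].T  # two estimated factors condensing all 12 series

print(f"variance explained by two components: {explained[:2].sum():.0%}")
```

Dynamic factor models used in nowcasting add time-series dynamics and handling of ragged data edges on top of this static core.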

Dimension reduction methods not only help to condense information of predictors of trading strategies, but also support portfolio construction. In particular, they are suited for detecting latent factors of a broad set of asset prices. These factors can be used to improve estimates of the covariance structure of these prices and – by extension – to improve the construction of a well-diversified minimum variance portfolio (view post here).
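
Given a covariance estimate (however obtained), the minimum variance weights follow in closed form; the three-asset covariance matrix below is an illustrative assumption.

```python
import numpy as np

# Assumed annualized covariance of three assets; in practice this would come
# from a factor-based estimate as described above. Numbers are illustrative.
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])

# Closed-form minimum variance weights: w = inv(Cov) 1 / (1' inv(Cov) 1).
ones = np.ones(3)
w = np.linalg.solve(cov, ones)
w /= w.sum()

port_var = w @ cov @ w
print(np.round(w, 3), f"portfolio vol: {np.sqrt(port_var):.1%}")
```

The quality of the weights is only as good as the covariance estimate, which is why the factor-based shrinkage matters.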

“When data volume swells beyond a human’s ability to discern the patterns in it…we need a new form of intelligence.”

Mansour Raad


Backtesting

Backtesting is meant to assess how well a trading strategy idea would have worked in the past. Statistical programming makes it easy to backtest trading strategies. However, its computational power and convenience can also be corrosive to the investment process, due to its tendency to discover temporary patterns while data samples for cross-validation are limited. Moreover, the business of algorithmic trading strategies, unfortunately, provides strong incentives for overfitting models and embellishing backtests (view post here). Similarly, academic researchers in the field of trading factors often feel compelled to resort to data mining in order to produce publishable ‘significant’ empirical findings (view post here).

Good backtests require sound principles and integrity (view post here). Sound principles should include [1] formulating a logical economic theory upfront, [2] choosing sample data upfront, [3] keeping the model simple and intuitive, and [4] limiting try-outs when testing ideas. Realistic performance expectations of trading strategies should be based on a range of plausible versions of a strategy, not an optimized one. Bayesian inference works well for that approach, as it estimates both the performance parameters and their uncertainty. The most important principle of all is integrity: aiming to produce good research rather than good backtests and to communicate statistical findings honestly rather than selling them.
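
One way to express performance as a plausible range rather than a point, in the spirit of the above (using a simple bootstrap rather than full Bayesian inference), is sketched below; the simulated return stream is an assumption.

```python
import numpy as np

rng = np.random.default_rng(6)
# Assumed daily returns of a backtested strategy (illustrative numbers).
ret = rng.normal(0.0004, 0.01, 1500)

def sharpe(x):
    """Annualized Sharpe ratio from daily returns (252 trading days)."""
    return np.mean(x) / np.std(x) * np.sqrt(252)

# Bootstrap the Sharpe ratio to report performance as a range, not a point.
boots = np.array([sharpe(rng.choice(ret, size=ret.size, replace=True))
                  for _ in range(2000)])
lo, hi = np.percentile(boots, [5, 95])

print(f"point estimate: {sharpe(ret):.2f}, 90% range: [{lo:.2f}, {hi:.2f}]")
```

The width of the range is usually sobering relative to the headline number, which is precisely the point about realistic expectations.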

The evaluation of a trading strategy typically relies on statistical metrics. Many measures are incomplete and can be outright misleading. An interesting concept is the discriminant ratio (‘D-ratio’), which measures an algorithm’s success in improving risk-adjusted returns versus a related buy-and-hold portfolio (forthcoming post).

“If you torture the data long enough, it will confess to anything.”

Ronald Coase