Nowcasting GDP growth

Financial markets have long struggled with tracking GDP growth trends in a timely and consistent fashion. However, over the past decade statistical methods for “nowcasting” various economies have improved considerably, benefiting macro trading strategies. Dynamic factor models have become the method of choice for this purpose: they extract the common underlying factor behind timely economic reports and translate the information of many data series into a single underlying trend. The estimation process may look daunting, but its basics are intuitive and the calculation can be carried out in the statistical programming language R.

For sources and links see the end of this post.

The post ties in with this site’s lecture on macroeconomic trends.

The following are excerpts from various papers. Emphasis and italic text have been added. Mathematical formulas have been replaced by text.

Solving the nowcasting problem

“Macroeconomic indicators tend to be released with substantial delays, and this is especially true for gross domestic product (GDP). To deal with this issue, policy institutions have traditionally used simple forecasting models and judgement to predict the current state of the economy, as well as that of the recent past. This process is now commonly referred to as nowcasting.” [Chernis and Sekkel, 2017]

“Macro-econometricians face data sets that have hundreds or even thousands of series, but the number of observations on each series is relatively short, for example 20 to 40 years of quarterly data…Dynamic factor models…[have] received considerable attention in the past decade because of their ability to model simultaneously and consistently data sets in which the number of series exceeds the number of time series observations…[The] central empirical finding that a few factors can explain a large fraction of the variance of many macroeconomic series has been confirmed by many studies.” [Stock and Watson, 2010]

“The problem of nowcasting lies in estimating GDP in the interval of time between the beginning of the reference quarter and its official release, exploiting the information coming from other higher frequency variables. More formally, the nowcast of GDP can be defined as the orthogonal projection of quarterly GDP on the available information set, which contains mixed-frequency variables and is characterized by a ‘ragged edge’ structure given that the time of the last available information varies from series to series. Each time new information arrives, a new nowcast is produced.” [Bragoli, Metelli, and Modugno, 2014]

“Dynamic factor models (DFMs) are particularly appealing for addressing the characteristic problems of nowcasting. First, there is typically a large number of relevant variables available. Second, the release schedule can be different for each variable, which results in unbalanced data sets. Finally, most leading indicators are monthly, while most often the variable of interest is quarterly. Traditional models can struggle with these challenges, while a DFM handles them elegantly…Papers have shown that DFM nowcasts not only outperform simple benchmarks and other competing nowcasting approaches, such as bridge models and mixed-data sampling (MIDAS) regressions, but also often produce nowcasts that are on par with those of professional forecasters.” [Chernis and Sekkel, 2017]

Dynamic factor models in a nutshell

“Large-dimensional dynamic factor models have become popular in empirical macroeconomics…Factor models can cope with many variables without running into scarce degrees of freedom problems often faced in regression-based analyses. A second advantage of factor models is that idiosyncratic movements which possibly include measurement error and local shocks can be eliminated…Dynamic factor models were traditionally used to construct economic indicators and for forecasting.” [Breitung and Eickmeier, 2005]

“The model we use in order to compute the nowcast and the news is a dynamic factor model…It exploits the fact that there is a large amount of co-movement among macroeconomic data series, and hence that relatively few factors can explain the dynamics of many variables…The model can be written as a system with two types of equations: a measurement equation linking the observed series (i.e. economic indicators) to a latent state process [latent factors], and the transition equation, which describes the state process dynamics.” [Bragoli, Metelli, and Modugno, 2014]

“The premise of a dynamic factor model is that a few latent dynamic factors drive the co-movements of a high-dimensional vector of time-series variables, which is also affected by a vector of mean-zero idiosyncratic disturbances. These idiosyncratic disturbances arise from measurement error and from special features that are specific to an individual series (the effect of a Salmonella scare on restaurant employment, for example). The latent factors follow a time series process, which is commonly taken to be a vector autoregression [a model based on linear interdependencies among several time series].” [Stock and Watson, 2010]
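The two equations described above can be made concrete with a small simulation. This is a minimal sketch in Python (the post itself points to R), assuming a single latent factor that follows an AR(1) process rather than a full vector autoregression. All names and parameter values (`phi`, `loadings`, the noise scale) are illustrative assumptions and are not taken from the quoted papers.

```python
import numpy as np

rng = np.random.default_rng(0)

T, N = 200, 8            # time periods and number of observed series (illustrative)
phi = 0.8                # assumed AR(1) persistence of the latent factor
loadings = rng.uniform(0.5, 1.5, size=N)   # assumed factor loadings

# Transition equation: the latent factor follows an AR(1) process.
factor = np.zeros(T)
for t in range(1, T):
    factor[t] = phi * factor[t - 1] + rng.normal()

# Measurement equation: each series = loading * factor + idiosyncratic noise.
idiosyncratic = rng.normal(scale=0.5, size=(T, N))
observed = factor[:, None] * loadings[None, :] + idiosyncratic

# Co-movement: the cross-sectional average tracks the latent factor closely.
corr = np.corrcoef(observed.mean(axis=1), factor)[0, 1]
print(f"correlation of cross-sectional mean with latent factor: {corr:.2f}")
```

Because the idiosyncratic disturbances largely average out across series, even a plain cross-sectional mean recovers most of the factor in this toy setting.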

“An important motivation for considering dynamic factor models is that, if one knew the [latent] factors and if the disturbances are Gaussian, then one can make efficient forecasts for an individual variable [such as GDP] using the population regression of that variable on the lagged factors and lags of that variable. Thus, the forecaster gets the benefit of using all variables by using only [a small set of latent] factors.” [Stock and Watson, 2010]
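As a rough illustration of this point, the sketch below regresses a target variable on the lagged factor and its own lag, treating the factor as known. The data-generating process and its coefficients (0.5 and 0.2) are assumptions made up for the example; in practice the factor would have to be estimated first.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 300
factor = np.zeros(T)
target = np.zeros(T)
for t in range(1, T):
    factor[t] = 0.7 * factor[t - 1] + rng.normal()
    # Assumed process: the target depends on last period's factor and its own lag.
    target[t] = 0.5 * factor[t - 1] + 0.2 * target[t - 1] + rng.normal(scale=0.3)

# Regress target_t on a constant, factor_{t-1}, and target_{t-1}.
X = np.column_stack([np.ones(T - 1), factor[:-1], target[:-1]])
y = target[1:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# One-step-ahead forecast from the last observed values.
forecast = coef @ np.array([1.0, factor[-1], target[-1]])
print("estimated coefficients:", np.round(coef, 2))
print("one-step-ahead forecast:", round(float(forecast), 2))
```

The regression uses only three regressors, however many series went into constructing the factor, which is the economy of parameters the excerpt refers to.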

“Nowcasters are frequently interested in the impact of each new data point. For example, it might be interesting to know what the impact of the latest industrial production figure is for the GDP forecast. Furthermore, the nowcasting environment is characterized by a large set of variables that can arrive at a high frequency. This results in the nowcaster studying a sequence of nowcasts that can be updated very frequently, reflecting the steady stream of new information arriving. The DFM framework…allows us to study this so-called ‘news’…We have a way of quantifying the change in information set and the average impact of each variable.” [Chernis and Sekkel, 2017]
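The arithmetic behind “news” can be sketched in a few lines: the nowcast revision caused by a release is the unexpected part of that release multiplied by a model-implied weight. In a full DFM the weight would come from the Kalman filter. Here it is simply assumed, and the function name and all numbers are hypothetical.

```python
def nowcast_revision(actual, expected, weight):
    """Revision to the nowcast = weight * (released value - model's expectation)."""
    news = actual - expected          # the unexpected part of the release
    return weight * news

# Hypothetical numbers for illustration only.
old_nowcast = 2.0                     # GDP growth nowcast before the release, in %
revision = nowcast_revision(actual=1.2, expected=0.4, weight=0.25)
new_nowcast = old_nowcast + revision
print(f"news impact: {revision:+.2f} pp, updated nowcast: {new_nowcast:.2f}%")
```

A release that matches the model's expectation exactly carries no news and leaves the nowcast unchanged, however important the indicator is.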

How to estimate dynamic factor models

“Factor estimation…focused on time-domain methods in which latent factors could be estimated directly…Theoretical econometric research…over the past decade has made a great deal of progress, and a variety of methods are now available for the estimation of the factors and of the number of factors…The first generation consisted of low-dimensional (small number of time series) parametric models estimated in the time domain using Gaussian maximum likelihood estimation and the Kalman filter [state space model]…The second generation of estimators entailed nonparametric estimation with a large number of time series using cross-sectional averaging methods, primarily principal components and related methods…The third generation uses these consistent nonparametric estimates of the factors to estimate the parameters of the state space model used in the first generation [allowing it to expand to a large number of time series].” [Stock and Watson, 2010]
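The second-generation idea, estimating the factor by principal components, can be sketched as follows. The panel is simulated, so the estimate can be compared with the true factor. The dimensions, persistence and loadings are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 200, 30

# Simulate one AR(1) factor and a panel of N noisy series that load on it.
true_factor = np.zeros(T)
for t in range(1, T):
    true_factor[t] = 0.8 * true_factor[t - 1] + rng.normal()
panel = np.outer(true_factor, rng.uniform(0.5, 1.5, size=N)) + rng.normal(size=(T, N))

# Standardize each series, then take the first principal component via SVD.
standardized = (panel - panel.mean(axis=0)) / panel.std(axis=0)
U, S, Vt = np.linalg.svd(standardized, full_matrices=False)
pc_factor = U[:, 0] * S[0]

# Share of total variance explained by the first component.
explained = S[0] ** 2 / np.sum(S ** 2)
# The sign of a principal component is arbitrary, so compare in absolute value.
corr = abs(np.corrcoef(pc_factor, true_factor)[0, 1])
print(f"variance explained by first component: {explained:.2f}")
print(f"|correlation| with true factor: {corr:.2f}")
```

In a third-generation approach, an estimate like `pc_factor` would serve as a starting point for fitting the state-space parameters, a step this sketch does not attempt.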

“MARSS stands for Multivariate Auto-Regressive State-Space. The MARSS package is an R package for estimating the parameters of linear MARSS models with Gaussian errors…The MARSS package allows you to easily fit time-varying constrained and unconstrained MARSS models with or without covariates to multivariate time-series data via maximum-likelihood using primarily an expectation-maximization algorithm.” [Holmes, Ward, and Scheuerell, 2014]

“[One can] use MARSS to do dynamic factor analysis (DFA), which allows to look for a set of common underlying trends among a relatively large set of time series…Here we are trying to explain temporal variation in a set of observed time series using linear combinations of a [much smaller] set of hidden random walks…DFA model is a type of MARSS model…The general idea is that the observations are modeled as a linear combination of hidden trends [latent factors] and factor loadings plus some offsets [constant parameters]. The DFA model…and the standard MARSS model…are equivalent.” [Holmes, Ward, and Scheuerell, 2014]

Examples for emerging and developed economies

Brazil

“The Brazilian statistical office publishes real GDP growth two months after the end of the quarter. The aim of the statistical model…is to predict GDP before the official figures are published by taking advantage of the information in the flow of economic data releases that precede them and updating our prediction with each successive data release. We include in our model those variables whose headline number is reported by the main statistical sources and central banks; in addition we consider those indicators monitored by financial markets and by the press… The peculiarity of the Brazilian data set is the fact that it includes two indicators that are strictly related to the target variable (quarterly GDP). The first is the monthly nominal GDP, published by the BCB, based on monthly indicators for economic activity and prices. The second is the economic activity index (EAI), also published by the BCB…The rest of the variables can be divided into four categories: surveys, labor, production/demand, and trade indicators…We choose the transformations that guarantee stationarity of the variables.

Given that most of the indicators we include in our model are characterized by missing data at the beginning of the sample (as it is in the case of the consumer confidence index, which starts in September 2005, or the PMI, which starts in February 2006) and by a “ragged edge” structure, due to unsynchronized data releases at the end of the sample, we adapt the estimation procedure to the presence of arbitrary patterns of missing data.

The nowcasting model for Brazil… proves the relevance of updating GDP forecasts to take advantage of the flow of data releases. Institutional forecasts, which in Brazil are revised as often as once a week, perform as well as model-based forecasts. This result…proves that pure judgment turns out to be no more accurate to the scope of forecasting.”

[Bragoli, Metelli, and Modugno, 2014]
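The handling of arbitrary missing-data patterns can be illustrated with a Kalman filter that, at each date, updates only on the series that are actually observed. This is a minimal sketch for a single AR(1) factor with known parameters. The function name, the parameter values and the simulated data are assumptions, and the papers estimate the parameters rather than fixing them.

```python
import numpy as np

def kalman_filter_one_factor(data, loadings, phi, noise_var):
    """Filter a single AR(1) factor from a panel that contains NaNs.

    Assumed model: data[t, i] = loadings[i] * f[t] + e[t, i] and
    f[t] = phi * f[t-1] + u[t], with Var(u) = 1 and Var(e) = noise_var.
    """
    T = data.shape[0]
    f, P = 0.0, 1.0 / (1.0 - phi ** 2)   # start from the stationary distribution
    filtered = np.zeros(T)
    for t in range(T):
        # Prediction step.
        f, P = phi * f, phi ** 2 * P + 1.0
        # Update step: use only the series observed at time t.
        seen = ~np.isnan(data[t])
        if seen.any():
            lam = loadings[seen]
            # Scalar-state update written in precision (information) form.
            precision = 1.0 / P + lam @ lam / noise_var
            f = (f / P + lam @ data[t, seen] / noise_var) / precision
            P = 1.0 / precision
        filtered[t] = f
    return filtered

rng = np.random.default_rng(3)
T, N, phi, noise_var = 150, 5, 0.8, 0.5
loadings = np.ones(N)
factor = np.zeros(T)
for t in range(1, T):
    factor[t] = phi * factor[t - 1] + rng.normal()
data = factor[:, None] * loadings + rng.normal(scale=noise_var ** 0.5, size=(T, N))

data[:40, 0] = np.nan      # a series that starts late in the sample
data[-2:, 1:] = np.nan     # ragged edge: only series 0 is out for the last two months

estimate = kalman_filter_one_factor(data, loadings, phi, noise_var)
corr = np.corrcoef(estimate, factor)[0, 1]
print(f"correlation of filtered estimate with true factor: {corr:.2f}")
```

Dates with fewer observed series simply produce a less precise update, so no observation has to be discarded or imputed beforehand.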

Canada

“We estimate an approximate dynamic factor model (DFM) for Canada and evaluate its nowcasting performance for Canadian GDP…When building a nowcasting model, it is important to find variables that are: i) helpful to predict GDP growth; ii) timely; and iii) updated frequently (e.g., monthly). To help us meet these criteria, we choose…a mix of hard and soft indicators. We also include commodity prices and a set of US economic indicators because of Canada’s close economic ties with the United States.

Recent papers…show that medium-sized data sets (i.e., with 10-30 variables) perform equally well as models with larger data sets with over 100 variables. With these considerations in mind, we select 23 predictors of the Canadian economy…Of the 23 variables that we include, 14 are domestic, 6 are US and the remaining 3 are the Bank of Canada non-energy commodity price index, WTI oil prices, and Global Purchasing Manager’s Index (PMI). The domestic variables cover most of the standard nowcasting variables: car sales, PMI, merchandise trade, housing variables, and various real activity measures…We transform the series to ensure stationarity….

Furthermore, several series published by Statistics Canada have been re-based or undergone definitional changes…To overcome this obstacle, series that suffer from this problem are simply spliced together with the corresponding older series….There are two notable peculiarities in Canadian macroeconomic data: first, the data are released with a larger delay relative to other developed economies, and second, Canada has a monthly GDP indicator…

Through a pseudo real-time exercise using data from the first quarter of 1980 to the second quarter of 2016, we show that the root-mean-squared forecast error (RMSFE) of the DFM improves steadily as more information is released.”

[Chernis and Sekkel, 2017]
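The evaluation metric mentioned above is simple to compute. The sketch below calculates the RMSFE for nowcasts made at three stages of the information flow. All figures are hypothetical and chosen only to show the pattern of errors shrinking as more data arrive.

```python
import math

def rmsfe(forecasts, actuals):
    """Root-mean-squared forecast error over paired forecasts and outcomes."""
    errors = [f - a for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical quarterly GDP growth outcomes and nowcasts made at three stages.
actual = [0.5, 0.8, -0.2, 0.6]
nowcasts = {
    "start of quarter": [0.1, 0.3, 0.4, 0.2],
    "mid quarter": [0.3, 0.6, 0.1, 0.4],
    "after quarter end": [0.4, 0.7, -0.1, 0.6],
}
for stage, values in nowcasts.items():
    print(f"{stage}: RMSFE = {rmsfe(values, actual):.3f}")
```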

Indonesia

“We produce ‘predictions’ of the current state of the Indonesian economy by estimating a dynamic factor model on a dataset of eleven indicators over the time period 2002 to 2014.

Indonesia’s GDP data…[suffer from a] number of issues in the data that need to be carefully tackled even before starting any monitoring process…There is no single long series available from official statistical sources…the Asian financial crisis in the late 1990s stands out as a unique episode for Indonesia’s growth path with GDP falling by more than 15% in 1998. Then, since 2002 the growth rate stabilized somewhat at a yearly rate of around 5%, with low volatility. As a consequence of this pattern, we exclude the Asian financial crisis years from our sample and use the series only from 2002…Growth series exhibits a marked seasonal pattern, and yet, there is no seasonally adjusted GDP series available from official sources…In order to deal with the two issues highlighted, we construct a long time series for GDP by using over-a-year-ago growth rates.

To gauge the current state of the Indonesian economy in real-time, we need to construct a prediction of GDP growth before the official data is released. This means that at each point in time we want to predict not only the current and next quarter estimates for GDP growth (henceforth nowcast and forecasts), but also, wherever the official data has not been published yet, the past quarter GDP growth (backcast).

We find that relying on market ‘revealed preferences’ for certain indicators on the Indonesian economy is an effective guide to choosing what variables to include…Since Bloomberg constantly ranks the analysts’ demand for these alerts by constructing a relevance index for each macroeconomic indicator, we can select variables based on this relevance index…For Indonesia only a relatively small number of macroeconomic series are tracked in real-time by the markets…We end up with a data set of ten macroeconomic indicators…While Business Tendency Index are quarterly series, the remaining are at monthly frequency…There are substantial differences between series in terms of their publication delay. For example, the PMI for developing economies is published just four days after the reference month, while data on imports are released a month after the reference month.

Despite using a relatively narrow set of variables, when focusing on the year-on-year growth rate, the dynamic factor model nowcast error falls by 35% compared to the benchmark AR…Our model predictions perform compared to those of experts’ forecast surveyed by Bloomberg. Still, our ‘pseudo-real-time’ forecasting performance is comparable to the one achieved in a truly ‘real-time’ setting by the median Bloomberg survey.”

[Luciani et al, 2015]
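The use of over-a-year-ago growth rates to sidestep missing seasonal adjustment can be illustrated with a short calculation. The quarterly levels below are hypothetical: they combine a fixed seasonal pattern with 5% annual growth, so the seasonal factor cancels when each quarter is compared with the same quarter a year earlier.

```python
def year_on_year_growth(levels, periods_per_year=4):
    """Percent change versus the same period one year earlier."""
    return [
        100.0 * (levels[t] / levels[t - periods_per_year] - 1.0)
        for t in range(periods_per_year, len(levels))
    ]

# Hypothetical quarterly GDP levels: a seasonal pattern on top of 5% annual growth.
seasonal = [0.95, 1.00, 1.05, 1.00]
levels = [100.0 * 1.05 ** (t // 4) * seasonal[t % 4] for t in range(12)]

growth = year_on_year_growth(levels)
print([round(g, 2) for g in growth])
```

Quarter-on-quarter changes in the same series would swing with the seasonal pattern, while the year-on-year rates do not.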

New Zealand

“We make real-time factor model forecasts for real GDP growth [and inflation]…we construct factor model forecasts using real-time panels of data consisting of almost 2000 series…The paper then examines the marginal impact of each of the 21 release blocks on predictions for real GDP growth and CPI inflation.

We found that, for some horizons, the factor model achieved forecast accuracy comparable to the Reserve Bank of New Zealand for real GDP growth…Analyzing the marginal value of 21 different data releases revealed that surveys of business opinion – the QSBO and the NBBO – were important determinants of how the factor model predictions, and the uncertainty around those predictions, evolve through each quarter. The importance of the surveys of business opinion to forecasting in New Zealand appears not only due to their timeliness, but also to the underlying quality of the data.”

[Matheson, 2007]

Norway

“We produce predictions of Norwegian GDP. To this end, we estimate a Bayesian dynamic factor model on a panel of fourteen variables (all followed closely by market operators) ranging from 1990 to 2011. By means of a pseudo real-time exercise, we show that the Bayesian dynamic factor model performs well both in terms of point forecast and in terms of density forecasts. Results indicate that our model outperforms standard univariate benchmark models, that it performs as well as the Bloomberg survey, and that it outperforms the predictions published by the Norges Bank in its Monetary Policy Report.

To evaluate the performance of our model, we perform a pseudo real-time out-of-sample exercise. Backcasts, nowcasts, and forecasts are produced according to a recursive scheme, where the first sample starts in January 1990 and ends in December 2005. More specifically, starting from January 2006, we construct real-time vintages by replicating the pattern of data availability implied by the stylized calendar. Every time new data are released, the model updates the backcast, the nowcast, and the forecast of GDP growth rate based only on information actually available at that time; that is, in each quarter, the model produces a sequence of predictions, where the prediction of GDP growth rate is obtained.”

[Luciani and Ricci, 2014]
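The pseudo real-time exercise described above can be sketched by masking final data according to a stylized release calendar. The function name `make_vintage`, the publication lags and the random panel are all assumptions for illustration. The sketch reproduces only the pattern of data availability and, like any pseudo real-time exercise, ignores data revisions.

```python
import numpy as np

def make_vintage(panel, publication_lags, as_of):
    """Return the data a forecaster would have seen at period `as_of`.

    panel: (T, N) array of final data. publication_lags[i] is the number of
    periods after the reference period at which series i is released.
    """
    vintage = np.full((as_of + 1, panel.shape[1]), np.nan)
    for i, lag in enumerate(publication_lags):
        last_available = as_of - lag          # latest reference period already released
        if last_available >= 0:
            vintage[: last_available + 1, i] = panel[: last_available + 1, i]
    return vintage

rng = np.random.default_rng(4)
panel = rng.normal(size=(24, 3))              # 24 months of hypothetical final data
lags = [0, 1, 2]                              # e.g. a survey, a production index, trade data

# Recursive scheme: the sample start stays fixed while the end moves forward.
for as_of in range(20, 24):
    vintage = make_vintage(panel, lags, as_of)
    available = (~np.isnan(vintage)).sum(axis=0)
    print(f"month {as_of}: observations available per series = {available}")
```

Each vintage would then be passed to the model to produce the backcast, nowcast and forecast for that date, using nothing released later.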