Heterogeneous autoregressive models of realized volatility have become a popular standard in financial market research. They use high-frequency volatility measures and the assumption that traders with different time horizons perceive, react to, and cause different types of volatility components. A key hypothesis is that volatility over longer time intervals has a stronger impact on short-term volatility than vice versa. This leads to an additive volatility cascade and a simple model in autoregressive form that can be estimated with ordinary least squares regression. Natural extensions include weighted least-squares estimations, the inclusion of jump-components and the consideration of index covariances. Research papers report significant improvement of volatility forecasting performance compared to other models, across equity, fixed income, and commodity markets.

See paper references at the end of this post.
The post below consists of quotes from the papers. Italic text and text in brackets have been added for clarity.
The post ties in with this site’s summary on systematic value generation through macro trends, particularly the section on financial market indicators.

An idea to model financial returns volatility

“Autocorrelations of [return volatility measures] show very strong persistence that lasts for long time periods (months). Return distributions at different horizons show fat tails and tail crossover, i.e. return probability density functions are leptokurtic [i.e. have heavier tails] with shapes depending on the time scale and present a very slow convergence to the normal distribution as time scales increase…The observed data contain noticeable fluctuations in the size of price changes at all time scales, while standard GARCH and stochastic volatility short-memory models appear like white noise once aggregated over longer time periods. Hence, growing interest in long-memory processes has emerged in financial econometrics.” [Corsi]

“Over the last two decades, measures based on high-frequency data such as realized volatility were found to dominate traditional ARCH-type models…Realized volatility is a more efficient estimator of return volatility.” [Hizmeri/Izzeldin/Nolte/Pappas]

“We propose an additive cascade model of different volatility components each of which is generated by the actions of different types of market participants.” [Corsi]

“The so-called Heterogeneous Market Hypothesis…recognizes the presence of heterogeneity across traders…At one end of the spectrum we have dealers, market makers, and intraday speculators, with very high intraday frequency as a trading horizon. At the other end, there are institutional investors, such as insurance companies and pension funds who trade much less frequently and possibly for larger amounts…Agents with different time horizons perceive, react to, and cause different types of volatility components…Simplifying a bit, we can identify three primary volatility components: the short-term traders with daily or higher trading frequency, the medium-term investors who typically rebalance their positions weekly, and the long-term agents with a characteristic time of one or more months.” [Corsi]

“It has been recently observed that volatility over longer time intervals has a stronger influence on volatility over shorter time intervals than conversely…The overall pattern that emerges is a volatility cascade from low frequencies to high frequencies. This can be economically explained by noticing that for short-term traders the level of long-term volatility matters because it determines the expected future size of trends and risk.” [Corsi]

How HAR realized volatility models work

“This additive volatility cascade leads to a simple AR-type model in the realized volatility with the feature of considering volatilities realized over different time horizons. We thus term this model, Heterogeneous Autoregressive model of Realized Volatility (HAR-RV).” [Corsi]

“The heterogeneous autoregressive model was designed to parsimoniously capture the strong persistence typically observed in realized volatility and has become the workhorse of this literature due to its consistently good forecasting performance…[It] is perhaps the most popular benchmark model for forecasting return volatility. It is often estimated using raw realized variance and ordinary least squares (OLS).” [Clements/Preve]

“[We define] the latent partial volatility…as the volatility generated by a certain market component…[The model contains an] additive cascade of partial volatilities, each having a first-order autoregressive structure. To simplify, we consider a hierarchical model with only three volatility components corresponding to time horizons of one day, one week, and one month.” [Corsi]
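Written out, the three-component cascade reduces to a single regression of next-day realized volatility on its daily, weekly and monthly averages. The following is a sketch in the spirit of Corsi's notation; the 5-day and 22-day window lengths are the conventional trading-day choices for "one week" and "one month" and are stated here as an assumption:

```latex
RV^{(d)}_{t+1d} = c + \beta^{(d)} RV^{(d)}_{t} + \beta^{(w)} RV^{(w)}_{t} + \beta^{(m)} RV^{(m)}_{t} + \omega_{t+1d},
\qquad
RV^{(w)}_{t} = \frac{1}{5} \sum_{i=0}^{4} RV^{(d)}_{t-id},
\qquad
RV^{(m)}_{t} = \frac{1}{22} \sum_{i=0}^{21} RV^{(d)}_{t-id}
```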

“The model for the unobserved partial volatility processes…at each level of the cascade is assumed to be a function of the past realized volatility experienced at the same time scale (the AR(1) component) and, due to the asymmetric propagation of volatility, of the expectation of the next-period values of the longer-term partial volatilities (the hierarchical component)…The economic interpretation is that each volatility component in the cascade corresponds to a market component that forms expectations for the next period’s volatility based on the observation of the current realized volatility and on the expectation for the longer horizon volatility…[This] can be seen as a three-factor stochastic volatility model, where the factors are directly the past realized volatilities viewed at different frequencies.” [Corsi]
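The regression form of the model can be sketched in a few lines of Python. This is a minimal illustration rather than any author's implementation: the function names `har_features`, `har_design` and `fit_har_ols` are hypothetical, the 5- and 22-day windows are the conventional choices, and the input series is synthetic.

```python
import numpy as np

def har_features(rv, t, week=5, month=22):
    """Regressors observed at day t: constant, daily RV, weekly and monthly averages."""
    rv = np.asarray(rv, dtype=float)
    return np.array([
        1.0,
        rv[t],
        rv[t - week + 1 : t + 1].mean(),
        rv[t - month + 1 : t + 1].mean(),
    ])

def har_design(rv, week=5, month=22):
    """Stack the regressors for every day with a full monthly window and a next-day target."""
    rv = np.asarray(rv, dtype=float)
    days = range(month - 1, len(rv) - 1)
    X = np.array([har_features(rv, t, week, month) for t in days])
    y = rv[month:]  # next-day realized volatility for each row of X
    return X, y

def fit_har_ols(rv):
    """Ordinary least squares estimate of [c, beta_d, beta_w, beta_m]."""
    X, y = har_design(rv)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic, purely illustrative realized-volatility series
rng = np.random.default_rng(0)
rv = np.abs(rng.normal(1.0, 0.2, size=500))

beta = fit_har_ols(rv)
# One-day-ahead forecast from the most recent observation
forecast = har_features(rv, len(rv) - 1) @ beta
```

The same design matrix can be reused with longer-horizon targets (for example the average realized volatility over the next week) by changing `y`.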

Extensions and modifications

“The original HAR model is often estimated using realized volatility and the method of ordinary least squares (OLS). However, given stylized facts of raw realized volatility, such as spikes, outliers, conditional heteroskedasticity, and non-Gaussianity, and the well-known properties of OLS (highly sensitive to outliers, suboptimal in the presence of conditional heteroskedasticity or non-Gaussianity), this [calls for] straightforward improvements…[Thus] it is proposed to use the method of weighted least squares (WLS), or least absolute deviations (LAD), for estimating the HAR model. It is also proposed to replace realized volatility with logarithmic range, a simpler low-frequency data-based volatility proxy, when using the HAR in instances where realized volatility is not readily available…The benefits of replacing OLS with WLS or LAD are particularly clear for longer forecast horizons.” [Clements/Preve]
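Weighted least squares can be computed by scaling each observation by the square root of its weight and then running OLS. The sketch below does this for the HAR regressors. The weighting scheme (inverse of the lagged daily realized volatility) is only an illustrative assumption and is not claimed to be the scheme used by Clements and Preve; the helper names and the synthetic data are hypothetical as well.

```python
import numpy as np

def har_design(rv, week=5, month=22):
    """HAR regressors (constant, daily, weekly, monthly) and next-day targets."""
    rv = np.asarray(rv, dtype=float)
    t = np.arange(month - 1, len(rv) - 1)
    weekly = np.array([rv[i - week + 1 : i + 1].mean() for i in t])
    monthly = np.array([rv[i - month + 1 : i + 1].mean() for i in t])
    X = np.column_stack([np.ones(len(t)), rv[t], weekly, monthly])
    return X, rv[t + 1]

def wls(X, y, weights):
    """Minimise sum_i weights[i] * (y[i] - X[i] @ beta)**2 via rescaled OLS."""
    sw = np.sqrt(np.asarray(weights, dtype=float))
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Synthetic, positive and spiky series (illustrative only)
rng = np.random.default_rng(1)
rv = np.exp(rng.normal(0.0, 0.5, size=400))

X, y = har_design(rv)
beta_ols = wls(X, y, np.ones(len(y)))  # equal weights reproduce OLS
beta_wls = wls(X, y, 1.0 / X[:, 1])    # assumed scheme: down-weight high-volatility days
```

Down-weighting observations that follow volatile days limits the influence of spikes on the coefficient estimates, which is the motivation given in the quote above.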

“The latent one-day integrated variance can be consistently estimated by the one-day realized variance…The realized volatility measure is defined as the sum of the squared returns within a [trading] day. Range-based volatility estimators are an important class of estimators that require less data than the realized volatility…[A popular] range-based estimator used here is the simple log-range estimator [based on] the daily intraday high and low log-prices respectively.” [Clements/Preve]
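Both proxies are simple to compute. The sketch below uses hypothetical function names and a made-up price path; the Parkinson scaling of the squared log-range is a standard textbook result added for comparison, and the quote does not say whether the paper applies it.

```python
import math

def realized_variance(prices):
    """Sum of squared intraday log returns for one trading day."""
    logs = [math.log(p) for p in prices]
    return sum((b - a) ** 2 for a, b in zip(logs, logs[1:]))

def log_range(high, low):
    """Difference between the daily high and low log-prices."""
    return math.log(high) - math.log(low)

def parkinson_variance(high, low):
    """Range-based variance proxy: squared log-range divided by 4 * ln(2)."""
    return log_range(high, low) ** 2 / (4.0 * math.log(2.0))

# Made-up intraday price path for a single day
prices = [100.0, 100.5, 99.8, 100.2, 101.0]
rv = realized_variance(prices)
lr = log_range(max(prices), min(prices))
pv = parkinson_variance(max(prices), min(prices))
```

The realized variance needs the full intraday path, whereas the range-based proxies need only the day's high and low, which is why the range is attractive when high-frequency data are unavailable.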

“An easily implemented, and OLS estimated, extension of the HAR model dubbed the HARQ model…accounts for the error with which realized volatility is estimated by using realized quarticity…This framework allows for less weight to be placed on historical observations of RV when the measurement error captured by realized quarticity is higher.” [Clements/Preve]
N.B.: Quarticity is the condition of being quartic, i.e. of the fourth degree, as in a fourth-degree polynomial. Integrated quarticity is a critical ingredient for inference on return variation and for the reliability of jump test procedures, so estimating it precisely matters. Inference on the presence of jumps should, as a matter of routine, use jump-robust measures of integrated quarticity.
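In a commonly used form of the HARQ model, the coefficient on daily realized variance is allowed to vary with the square root of realized quarticity, which amounts to adding the interaction term sqrt(RQ_t) * RV_t to the HAR regressors. The sketch below follows that form under stated assumptions: realized quarticity is computed as (M / 3) times the sum of fourth powers of the M intraday returns, the function names are hypothetical, and the intraday returns are simulated.

```python
import numpy as np

def realized_quarticity(intraday_returns):
    """RQ = (M / 3) * sum of fourth powers of the M intraday returns."""
    r = np.asarray(intraday_returns, dtype=float)
    return len(r) / 3.0 * np.sum(r ** 4)

def harq_design(rv, rq, week=5, month=22):
    """HAR regressors plus the interaction sqrt(RQ_t) * RV_t (third column)."""
    rv, rq = np.asarray(rv, dtype=float), np.asarray(rq, dtype=float)
    t = np.arange(month - 1, len(rv) - 1)
    weekly = np.array([rv[i - week + 1 : i + 1].mean() for i in t])
    monthly = np.array([rv[i - month + 1 : i + 1].mean() for i in t])
    X = np.column_stack(
        [np.ones(len(t)), rv[t], np.sqrt(rq[t]) * rv[t], weekly, monthly]
    )
    return X, rv[t + 1]

# Simulated returns in percent: 300 days x 78 intraday bars (illustrative only)
rng = np.random.default_rng(2)
sigma = 0.1 * np.exp(rng.normal(0.0, 0.3, size=300))
returns = rng.normal(size=(300, 78)) * sigma[:, None]

rv = np.sum(returns ** 2, axis=1)
rq = np.array([realized_quarticity(day) for day in returns])
X, y = harq_design(rv, rq)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # estimated by OLS
```

A negative coefficient on the interaction term would reduce the weight on the latest daily observation when its measurement error is high, which is the mechanism described in the quote.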

“We propose a new class of realized volatility based forecasting models, by generalizing the basic HAR model with the (co)variances and semi-(co)variances from the market index (SPY-ETF) to form a more general ex-ante forecasting model. The use of the index can be motivated by a CAPM type-argument whereby the volatility of the stock is known to be driven by stock-specific idiosyncratic factors as well as by systematic factors pertaining to the market portfolio…We show that the [extended] class of [HAR] models substantially outperforms the basic HAR-RV model.” [Hizmeri/Izzeldin/Nolte/Pappas]
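The building blocks named in the quote can be computed from synchronized intraday returns of the stock and the index. The sketch below shows standard definitions of realized semivariances, realized covariance and realized semicovariances. The quote does not spell out which of these enter the paper's regressions or in what form, so the function names and the toy returns are assumptions for illustration only.

```python
def realized_semivariances(returns):
    """Split realized variance into parts from positive and negative returns."""
    pos = sum(r * r for r in returns if r > 0)
    neg = sum(r * r for r in returns if r < 0)
    return pos, neg

def realized_covariance(stock_returns, index_returns):
    """Sum of products of synchronized intraday returns."""
    return sum(s * m for s, m in zip(stock_returns, index_returns))

def realized_semicovariances(stock_returns, index_returns):
    """Split realized covariance by the signs of the two returns."""
    pairs = list(zip(stock_returns, index_returns))
    pos = sum(s * m for s, m in pairs if s > 0 and m > 0)
    neg = sum(s * m for s, m in pairs if s < 0 and m < 0)
    mixed = sum(s * m for s, m in pairs if s * m < 0)
    return pos, neg, mixed

# Toy synchronized intraday returns for a stock and the market index
stock = [0.01, -0.02, 0.03]
index = [0.02, -0.01, -0.01]

semi_pos, semi_neg = realized_semivariances(stock)
cov = realized_covariance(stock, index)
cov_pos, cov_neg, cov_mixed = realized_semicovariances(stock, index)
```

By construction the two semivariances add up to the realized variance and the three semicovariances add up to the realized covariance, so they can replace the aggregate measure as regressors without losing information.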

Empirical findings from equity markets

“Heterogeneous Autoregressive Realized Volatility outperforms ARCH-family models no matter the index and the time horizon, confirming that the realized volatility is by far a more precise measure of volatility than conditional variance. Also, log-realized volatilities are to be preferred in using the HAR-RV given the lognormal distribution of realized volatility.” [Mastro]

“In spite of its simplicity, the proposed model is able to produce rich dynamics for returns and volatilities which closely reproduce what we observe in empirical data.” [Corsi]

Empirical findings from bond markets

“We examine the predictive power of the Heterogeneous Autoregressive model on Treasury bond return volatility of major European government bond markets…Short term and medium-term volatility is a robust and statistically significant predictor of the term structure of intraday volatility of bonds with maturities ranging from 1-year up to 30-years. When decomposing volatility into its continuous and discontinuous (jump) component, we find that the jump tail risk component is a significant predictor of bond market volatility…Large jumps in realised bond market volatility tend to coincide with monetary policy announcements…Overall, our HAR models exhibit extraordinary in-sample and out-of-sample forecasting power with in-sample R²s ranging from 50% to 80% and out-of-sample R²s ranging from 20% to 75%.” [Gencay Özbekler/Kontonikas/Triantafyllou]
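One widely used way to split realized variance into a continuous and a jump part relies on bipower variation, which is robust to rare large returns. The sketch below illustrates that approach. It is an assumption that this is comparable to the decomposition in the quoted paper, whose exact jump measure is not given in the quote; the function names and the toy returns are hypothetical.

```python
import math

def bipower_variation(returns):
    """BV = (pi / 2) * sum of |r_i| * |r_(i-1)| over adjacent intraday returns."""
    return (math.pi / 2.0) * sum(
        abs(a) * abs(b) for a, b in zip(returns, returns[1:])
    )

def split_continuous_jump(returns):
    """Return (continuous, jump) with jump = max(RV - BV, 0) and continuous = RV - jump."""
    rv = sum(r * r for r in returns)
    jump = max(rv - bipower_variation(returns), 0.0)
    return rv - jump, jump

# Toy day with one large return, e.g. around a policy announcement
returns = [0.001, 0.001, 0.05, 0.001, 0.001]
continuous, jump = split_continuous_jump(returns)
```

The two components can then enter a HAR-type regression as separate regressors in place of total realized variance.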

Empirical findings from commodity markets

“We focus on five important agricultural commodities…namely, corn, rough rice, soybeans, sugar, and wheat and we produce forecasts for 1-day to 66-days ahead…Our out-of-sample results…strongly suggest that the simple HAR model significantly outperforms…autoregressive models [for forecasting volatility]. However, none of the HAR extensions is able to generate forecasts that are statistically significantly better compared to the simple HAR model.” [Degiannakis/Filis/Klein/Walther]

“We examine the role of both volatility implied from the OVX and observable market variables when forecasting volatility for the WTI futures market. We apply the simple Heterogeneous Autoregressive (HAR) model on realized volatility itself. Additionally, two fundamentally different types of variables are used…the forward-looking implied volatility index and other exogenous market variables including volume, open interest, daily returns and the slope of the futures curve…We find that including information from the OVX significantly improves the day-ahead and week-ahead volatility forecasts. Second, the exogenous market variables improve volatility forecasts for daily, weekly and monthly horizons.” [Molnar/Haugom/Langeland/Westgaard]
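Adding exogenous predictors to the HAR model only requires extra columns in the regression. The sketch below appends one hypothetical forward-looking series, standing in for an implied volatility index, to the standard HAR regressors and compares in-sample fit. The data are simulated so that the extra series is informative by construction, and the function name `har_x_design` is an assumption.

```python
import numpy as np

def har_x_design(rv, exog, week=5, month=22):
    """HAR regressors plus exogenous columns observed on the same day t."""
    rv = np.asarray(rv, dtype=float)
    exog = np.asarray(exog, dtype=float).reshape(len(rv), -1)
    t = np.arange(month - 1, len(rv) - 1)
    weekly = np.array([rv[i - week + 1 : i + 1].mean() for i in t])
    monthly = np.array([rv[i - month + 1 : i + 1].mean() for i in t])
    X = np.column_stack([np.ones(len(t)), rv[t], weekly, monthly, exog[t]])
    return X, rv[t + 1]

# Simulated data: the "implied volatility" is a noisy view of next-day RV
rng = np.random.default_rng(3)
rv = np.exp(rng.normal(0.0, 0.4, size=400))
iv = np.roll(rv, -1) * np.exp(rng.normal(0.0, 0.1, size=400))

X, y = har_x_design(rv, iv)
beta_x, *_ = np.linalg.lstsq(X, y, rcond=None)            # HAR with exogenous term
beta_har, *_ = np.linalg.lstsq(X[:, :4], y, rcond=None)   # plain HAR benchmark

ssr_x = np.sum((y - X @ beta_x) ** 2)
ssr_har = np.sum((y - X[:, :4] @ beta_har) ** 2)
```

In-sample fit can never worsen when a regressor is added, so a comparison of this kind is meaningful only out of sample, which is how the quoted papers evaluate their forecasts.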

References

Clements, Adam and Daniel Preve (2019), “A Practical Guide to Harnessing the HAR Volatility Model”
Corsi, Fulvio (2009), “A Simple Approximate Long-Memory Model of Realized Volatility”, Journal of Financial Econometrics, Vol. 7, No. 2, 174–196
Degiannakis, Stavros, George Filis, Tony Klein and Thomas Walther (2019), “Forecasting Realized Volatility of Agricultural Commodities”, forthcoming in International Journal of Forecasting
Gencay Özbekler, Ali, Alexandros Kontonikas, and Athanasios Triantafyllou (2020), “Volatility Forecasting in European Government Bond Markets”
Hizmeri, Rodrigo, Marwan Izzeldin, Ingmar Nolte and Vasileios Pappas (2019), “A Generalized Heterogeneous Autoregressive Model using the Market Index”
Mastro, Daniele (2014), “Forecasting Realized Volatility: ARCH-type models versus the HAR-RV model”
Molnar, Peter, Erik Haugom, Henrik Langeland and Sjur Westgaard (2014), “Forecasting Volatility of the U.S. Oil Market”