The statistical term ‘fat tails’ refers to probability distributions with relatively high probability of extreme outcomes. Fat tails also imply strong influence of extreme observations on expected future risk. Alas, they are a plausible and common feature of financial markets. A summary article by Nassim Taleb reminds practitioners that fat tails typically invalidate methods and conventions applied in quantitative finance. Standard in-sample estimates of means, variance and typical outliers of financial returns are erroneous, as are estimates of relations based on linear regression. The inconsistency between the evidence of fat tails and the ongoing dominant usage of conventional statistics in markets is plausibly a major source of inefficiency and trading opportunities.

Taleb, Nassim Nicholas (2018), “The Statistical Consequences of Fat Tails: Research and Commentary”, Technical Incerto, Vol 1.

The post ties in with SRSV’s summary lecture on setback risk, particularly the section on exit risk and shock effect indicators.
It also ties in with the previous week’s post on “The importance of volatility of volatility”, since the realization of fat tails and unpredictability of return variance is a key source of the vol-of-vol risk premium.

The quotes below are from the book chapter. Emphasis and italic text have been added. Some technical terms and terms that can only be understood in the context of the book have been replaced.

### What are fat tails?

“If we take a [normal or Gaussian] distribution…and start fattening it, then the number of departures away from one standard deviation drops. The probability of an event staying within one standard deviation of the mean is 68 per cent. As the tails fatten, to mimic what happens in financial markets for example, the probability of an event staying within one standard deviation of the mean rises to between 75 and 95 per cent. So note that as we fatten the tails we get higher peaks, smaller shoulders, and a higher incidence of a very large deviation.” N.B.: If an asset return is simply governed by high- and low-variance regimes (each with a normal distribution), the combination will have fat tails.
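The N.B. on variance regimes can be checked with a small simulation. The sketch below is illustrative only: the regime probability and the two standard deviations are assumed values, not figures from the book. It mixes two zero-mean normal distributions and measures the share of observations within one standard deviation as well as the kurtosis (which is 3 for a normal distribution).

```python
import random
import statistics


def simulate_regime_mixture(n=100_000, p_high=0.1, sd_low=1.0, sd_high=4.0, seed=42):
    """Draw returns from a two-regime mixture of zero-mean normals.

    p_high, sd_low and sd_high are hypothetical parameters chosen for illustration.
    """
    rng = random.Random(seed)
    return [
        rng.gauss(0.0, sd_high if rng.random() < p_high else sd_low)
        for _ in range(n)
    ]


def share_within_one_sd(xs):
    """Fraction of observations within one sample standard deviation of the mean."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(abs(x - mu) <= sd for x in xs) / len(xs)


def kurtosis(xs):
    """Fourth central moment divided by the squared variance."""
    mu = statistics.fmean(xs)
    m2 = statistics.fmean((x - mu) ** 2 for x in xs)
    m4 = statistics.fmean((x - mu) ** 4 for x in xs)
    return m4 / m2 ** 2


mixture = simulate_regime_mixture()
print("share within one standard deviation:", round(share_within_one_sd(mixture), 3))
print("kurtosis:", round(kurtosis(mixture), 2))
```

Under these assumed parameters the mixture should show a share within one standard deviation well above the Gaussian 68 per cent and a kurtosis well above 3, even though each regime on its own is normal.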

“[For] a class of distributions that is not fat-tailed… the probability of two 3-standard deviations events occurring is considerably higher than the probability of one single 6-standard deviations event… In other words, for something bad to happen, it needs to come from a series of very unlikely events, not a single one. This is the logic of normal distributions…[By contrast] for [fat-tailed] distributions, ruin is more likely to come from a single extreme event than from a series of bad episodes. This logic underpins classical risk theory but has been forgotten by economists in recent times.”
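The contrast between "a series of unlikely events" and "a single extreme event" can be made concrete with tail probabilities. This is a hedged sketch: the Pareto tail exponent of 3 and the thresholds of 30 and 60 are assumptions for illustration, and the Pareto thresholds are expressed in units of the minimum value rather than in standard deviations.

```python
import math


def normal_tail(k):
    """P(Z > k) for a standard normal variable."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def pareto_tail(x, alpha=3.0, x_min=1.0):
    """P(X > x) for a Pareto variable; alpha and x_min are assumed values."""
    return (x_min / x) ** alpha if x > x_min else 1.0


# Thin tails: two independent 3-sigma events versus one 6-sigma event.
two_moderate_normal = normal_tail(3.0) ** 2
one_extreme_normal = normal_tail(6.0)
print("normal: two 3-sigma events", two_moderate_normal, "one 6-sigma event", one_extreme_normal)

# Fat tails: two independent exceedances of 30 versus one exceedance of 60.
two_moderate_pareto = pareto_tail(30.0) ** 2
one_extreme_pareto = pareto_tail(60.0)
print("pareto: two exceedances of 30", two_moderate_pareto, "one exceedance of 60", one_extreme_pareto)
```

For the normal distribution the pair of moderate events is far more likely than the single extreme one; for the assumed Pareto tail the ordering reverses.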

“With fat tail distributions, extreme events away from the centre of the distribution play a very large role. Black swans are not more frequent, they are more consequential. The fattest tail distribution has just one very large extreme deviation, rather than many departures from the norm.”

### How does quantitative finance deal with fat tails?

“The traditional statistician’s approach to fat tails has been to assume a different distribution but keep doing business as usual, using the same metrics, tests, and statements of significance…They fall into logical inconsistencies. Once we leave the…zone for which statistical techniques were designed, things no longer work as planned.”

“Statistical estimation is based on two elements: the central limit theorem (which is assumed to work for ‘large’ sums, thus making about everything conveniently normal) and the law of large numbers, which reduces the variance of the estimation as one increases the sample size. However, [when tails are fat]…convergence can be very, very slow; it is distribution dependent…Life happens in the ‘preasymptotics’ [a state where observed data cannot give us great certainty on mean, variance and so forth].”

### What are the consequences of fat tails?

“In a world of normal distributions, no observation can really change the statistical properties. In a world of fat-tailed distributions, the tails (the rare events) play a disproportionately large role in determining the properties [of estimated distributions and associated risk].”

“Insurance can only work in a world of thin distributions; you should never write an uncapped insurance contract if there is a risk of catastrophe. The point is called the catastrophe principle.”

“The consequence of moving out of the… zone [of well-behaved distributions] is [that] we encounter convergence problems…The law of large numbers, when it works, works too slowly in the real world…This is more shocking than you think as it cancels most statistical estimators… The law of large numbers tells us that as we add observations the mean becomes more stable, the rate being the square root of the number of observations. [The figure below] shows that it takes many more observations under a fat-tailed distribution (on the right hand side) for the mean to stabilize…While it takes 30 observations in the Gaussian to stabilize the mean up to a given level, it takes $10^{11}$ observations in the [fat-tailed] Pareto [distribution] to bring the sample error down by the same amount.”

“The mean of the distribution will not correspond to the sample mean, particularly if the distribution is skewed (or one-tailed). In fact, there is no fat tailed distribution in which the mean can be properly estimated directly from the sample mean, unless we have orders of magnitude more data than we do (people in finance still do not understand this).”
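The slow convergence of the sample mean can be illustrated with a rough simulation. The sketch below is not the book's calculation: it assumes a Pareto tail exponent of 1.2 (finite mean, infinite variance), uses the median absolute error over repeated samples as an ad hoc stability measure, and compares sample sizes of 30 and 3000.

```python
import random
import statistics

ALPHA = 1.2  # assumed Pareto tail exponent (mean exists, variance does not)
PARETO_MEAN = ALPHA / (ALPHA - 1.0)  # true mean for minimum value 1


def draw_gaussian(rng):
    return rng.gauss(0.0, 1.0)


def draw_pareto(rng):
    return rng.paretovariate(ALPHA)


def median_abs_error(draw, true_mean, n, trials=200, seed=0):
    """Median absolute gap between sample mean and true mean over many samples of size n."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        sample_mean = statistics.fmean(draw(rng) for _ in range(n))
        errors.append(abs(sample_mean - true_mean))
    return statistics.median(errors)


results = {}
for n in (30, 3000):
    results[n] = (
        median_abs_error(draw_gaussian, 0.0, n),
        median_abs_error(draw_pareto, PARETO_MEAN, n),
    )
    print(n, "gaussian error:", results[n][0], "pareto error:", results[n][1])

gaussian_improvement = results[30][0] / results[3000][0]
pareto_improvement = results[30][1] / results[3000][1]
print("error reduction from 100x more data:", gaussian_improvement, "vs", pareto_improvement)
```

With a hundred times more data the Gaussian error should shrink roughly tenfold, in line with the square-root rule, while the Pareto error under the assumed exponent shrinks far less.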

“Standard deviations and variance are not useable… They fail out of sample, even when they exist.”

“Linear least-square regression does not work…[due to the] failure of the Gauss-Markov theorem [according to which ordinary least squares estimators are the best linear unbiased estimators under certain conditions]… Either we need a lot, a lot of data to minimize the squared deviations or…the second moment does not exist.”

“The practice of reading into descriptive statistics may be acceptable under thin tails (as sample sizes do not have to be large), but never so under fat tails, except… in the presence of a large deviation.
Let us illustrate one of the problems of thin-tailed thinking with a real world example. People quote so-called ‘empirical’ data to tell us we are foolish to worry about Ebola when only two Americans died of Ebola in 2016. We are told that we should worry more about deaths from diabetes or people tangled in their bedsheets. Let us think about it in terms of tails. If we were to read in the newspaper that 2 billion people have died suddenly, it is far more likely that they died of Ebola than of smoking, diabetes or tangled bedsheets… It is naïve empiricism to compare these processes, to suggest that we worry too much about Ebola.”

“There is no such thing as a typical large deviation. Conditional on having a ‘large’ move, the magnitude of such a move is not defined, especially under serious fat tails (the power law tails class). This is associated with the catastrophe principle…In the Gaussian world, the expectation of a movement, conditional that the movement exceeds 4 standard deviations, is about 4 standard deviations. For a power law it will be a multiple of that.”
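The conditional expectation beyond a threshold has a closed form in both worlds, which makes the claim easy to check. In the sketch below the Gaussian case uses the standard result that the mean beyond k equals the density at k divided by the tail probability, and the Pareto case uses the standard result that the mean beyond k equals k times alpha / (alpha - 1). The tail exponent of 1.5 is an assumption for illustration, and the Pareto threshold is in generic units since the variance is infinite at that exponent.

```python
import math


def gaussian_conditional_mean(k):
    """E[Z | Z > k] for a standard normal: density at k divided by tail probability."""
    density = math.exp(-0.5 * k * k) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(k / math.sqrt(2.0))
    return density / tail


def pareto_conditional_mean(k, alpha):
    """E[X | X > k] for a Pareto tail with exponent alpha > 1 and k above the minimum."""
    if alpha <= 1.0:
        raise ValueError("the conditional mean is infinite for alpha <= 1")
    return alpha * k / (alpha - 1.0)


threshold = 4.0
print("gaussian:", gaussian_conditional_mean(threshold))
print("pareto (assumed alpha = 1.5):", pareto_conditional_mean(threshold, alpha=1.5))
```

The Gaussian conditional mean sits only a little above the threshold itself, whereas the Pareto conditional mean is a fixed multiple of the threshold (three times, for the assumed exponent) no matter how high the threshold is set.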

“Option risks are never mitigated by dynamic hedging. This might be technical for non-finance people but the entire basis of finance rests on the possibility and necessity of dynamic hedging, both of which will be shown to be erroneous.”

### What are serious fat tails (Pareto tails)?

“The Pareto Law… is simply defined as: say X is a random variable. For x sufficiently large, the probability of exceeding 2x divided by the probability of exceeding x is no different from the probability of exceeding 4x divided by the probability of exceeding 2x, and so forth. This property is called ‘scalability’…So if we have a Pareto (or Pareto-style) distribution, the ratio of people with \$ 16 million compared to \$ 8 million is the same as the ratio of people with \$ 2 million and \$ 1 million.”
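Scalability can be verified directly from the survival function. The sketch below assumes a Pareto tail exponent of 1.16 (a value often associated with the "80/20" rule; it is an assumption here, not a figure from the quote) and compares the doubling ratio at several levels with the same ratio for a standard normal distribution.

```python
import math


def pareto_survival(x, alpha=1.16, x_min=1.0):
    """P(X > x) for a Pareto variable; alpha and x_min are assumed values."""
    return (x_min / x) ** alpha if x > x_min else 1.0


def normal_survival(x):
    """P(Z > x) for a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def doubling_ratio(survival, x):
    """Probability of exceeding 2x relative to the probability of exceeding x."""
    return survival(2.0 * x) / survival(x)


for level in (1.0, 2.0, 4.0, 8.0):
    print(
        level,
        "pareto:", doubling_ratio(pareto_survival, level),
        "normal:", doubling_ratio(normal_survival, level),
    )
```

The Pareto ratio is the same at every level (it equals 2 raised to the power of minus alpha), while the normal ratio collapses rapidly as the level rises, which is why a thin-tailed distribution is not scalable.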

“Although this distribution often has no mean and no standard deviation we can still understand it; in fact, we can understand it much better than we do with more standard statistical distributions. But because it has no mean we have to ditch the statistical textbooks and do something more solid, more rigorous, even if it seems less mathematical.”

“A Pareto distribution has no higher moments: moments either do not exist or become statistically more and more unstable… If we do not know anything about the fourth moment, we know nothing about the stability of the second moment. It means we are not in a class of distribution that allows us to work with the variance, even if it exists. This is finance. For silver futures, in 46 years 94 per cent of the kurtosis came from one observation. We cannot use standard statistical methods with financial data.”
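The dominance of a single observation in the kurtosis can be reproduced on simulated data. The sketch below does not use silver futures data. It assumes, purely for illustration, a Student-t distribution with 3 degrees of freedom (whose fourth moment does not exist) and measures the largest single contribution to the sum of fourth-power deviations, which is one common way to express "the share of kurtosis from one observation".

```python
import math
import random
import statistics


def draw_student_t(rng, df):
    """Student-t draw built from a normal and a chi-square (gamma) variable."""
    z = rng.gauss(0.0, 1.0)
    chi_square = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi_square / df)


def max_share_of_fourth_moment(xs):
    """Largest single observation's share of the sum of fourth-power deviations."""
    mu = statistics.fmean(xs)
    fourth_powers = [(x - mu) ** 4 for x in xs]
    return max(fourth_powers) / sum(fourth_powers)


rng = random.Random(7)
n = 10_000  # hypothetical sample size, roughly four decades of daily returns
gaussian_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
fat_tailed_sample = [draw_student_t(rng, df=3.0) for _ in range(n)]

gaussian_share = max_share_of_fourth_moment(gaussian_sample)
fat_tailed_share = max_share_of_fourth_moment(fat_tailed_sample)
print("gaussian max share:", gaussian_share)
print("student-t(3) max share:", fat_tailed_share)
```

In the Gaussian sample no single observation should account for more than a small fraction of the fourth moment, while in the fat-tailed sample one observation typically carries a large part of it.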

### How to work with fat tails?

“While there is a lot of uncertainty and opacity about the world, and an incompleteness of information and understanding, there is little, if any, uncertainty about what actions should be taken based on such an incompleteness, in any given situation… Our grandmothers understand fat tails. These are not so scary; we figured out how to survive by making rational decisions based on deep statistical properties.”

“We do not know the variance. But we can work very easily with Pareto distributions. They give us less information, but nevertheless, it is more rigorous if the data are uncapped or if there are any open variables.”

“The trick is to estimate the distribution and then derive the mean. This is called plug-in estimation… Once we have figured out the distribution, we can estimate the statistical mean. This works much better than observing the sample mean. For a [one-tailed] Pareto distribution, for instance, 98% of observations are below the mean. There is a bias in the mean. But once we know we have a Pareto distribution, we should ignore the sample mean and look elsewhere.” 
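A minimal sketch of plug-in estimation follows. It rests on several assumptions not stated in the quote: the data are exactly Pareto with a known minimum value of 1, the true tail exponent is 1.2, and the exponent is fitted by maximum likelihood (sample size divided by the sum of log observations). The mean is then derived from the fitted exponent as alpha * x_min / (alpha - 1) and compared with the plain sample mean.

```python
import math
import random
import statistics

ALPHA_TRUE = 1.2  # assumed true tail exponent
X_MIN = 1.0       # assumed known minimum value
TRUE_MEAN = ALPHA_TRUE * X_MIN / (ALPHA_TRUE - 1.0)


def pareto_sample(rng, n):
    return [X_MIN * rng.paretovariate(ALPHA_TRUE) for _ in range(n)]


def plug_in_mean(xs, x_min=X_MIN):
    """Fit the tail exponent by maximum likelihood, then derive the mean from it."""
    alpha_hat = len(xs) / sum(math.log(x / x_min) for x in xs)
    if alpha_hat <= 1.0:
        return math.inf  # the fitted distribution has no finite mean
    return alpha_hat * x_min / (alpha_hat - 1.0)


def compare_estimators(n=5000, trials=100, seed=1):
    """Median absolute error of the sample mean and of the plug-in mean."""
    rng = random.Random(seed)
    sample_mean_errors, plug_in_errors = [], []
    for _ in range(trials):
        xs = pareto_sample(rng, n)
        sample_mean_errors.append(abs(statistics.fmean(xs) - TRUE_MEAN))
        plug_in_errors.append(abs(plug_in_mean(xs) - TRUE_MEAN))
    return statistics.median(sample_mean_errors), statistics.median(plug_in_errors)


sample_mean_error, plug_in_error = compare_estimators()
print("true mean:", TRUE_MEAN)
print("median abs error, sample mean:", sample_mean_error)
print("median abs error, plug-in mean:", plug_in_error)
```

Under these assumptions the plug-in estimate should land closer to the true mean than the sample mean does. With real return data the distributional form and the minimum value would themselves have to be estimated, which this sketch does not attempt.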