Matthew Steel

A Few Arguments for the Standard Deviation Should Be Retired

Nassim Taleb's case against the standard deviation at Edge.org has sparked some debate. Despite Taleb's (uncharacteristic) restraint, though, the discussion has generated more heat than light. This post clarifies a few points of discussion, rebutting some poor arguments in the hope that we can focus on the better ones.

Taleb makes two main points. He believes that the mean of absolute deviations ("MAD") is a more intuitive and more useful measure of variation than the standard deviation ("STD"), and he believes that the name "standard deviation" (an apparent historical accident) is itself misleading.

Most objections to the article attempt to demonstrate common uses in which the STD measure is more convenient. Taleb admits that such cases exist (and even enumerates a few), but his argument would be undermined if they were pervasive. As might be expected, the most damaging objections are the most difficult to substantiate.

To begin:

Standard deviations provide good "rule of thumb" confidence intervals for normally distributed data.

Taleb's article wasn't terribly technical, but it did contain a counter-argument in anticipation of this objection. To wit:

In the Gaussian world, STD is about ~1.25 time MAD, that is, the square root of (Pi/2)
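For readers who want to see where that factor comes from, here is a short sketch of the calculation. The symbols are my own shorthand rather than Taleb's: d stands for the MAD, and z is the standardized variable (x - mu) / sigma.

$$d = E|X-\mu| = \sigma \int_{-\infty}^{\infty} |z| \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \, dz = \frac{2\sigma}{\sqrt{2\pi}} \int_{0}^{\infty} z e^{-z^2/2} \, dz = \sigma\sqrt{\frac{2}{\pi}}$$

$$\frac{\sigma}{d} = \sqrt{\frac{\pi}{2}} \approx 1.2533$$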

MAD-based confidence intervals are very similar to STD-based ones. In particular, the probability of falling within a given number of MADs of the mean is invariant under changes to the distribution's parameters. Below is a table of two-tailed probabilities for observations lying within integer multiples of the STD and of the MAD:

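| Multiple (k) | Within k STDs | Within k MADs |
|--------------|---------------|---------------|
| 1            | 0.682689      | 0.575063      |
| 2            | 0.954500      | 0.889460      |
| 3            | 0.997300      | 0.983319      |
| 4            | 0.999937      | 0.998585      |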
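The table can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function names are mine, and it assumes the Gaussian relationship MAD = STD * sqrt(2/pi).

```python
import math

# For a normal distribution, MAD = STD * sqrt(2/pi) (about 0.798 * STD).
MAD_PER_STD = math.sqrt(2.0 / math.pi)


def prob_within_std(k):
    """Two-tailed P(|X - mu| < k * STD) for a normal X."""
    return math.erf(k / math.sqrt(2.0))


def prob_within_mad(k):
    """Two-tailed P(|X - mu| < k * MAD) for a normal X."""
    # k MADs is the same interval as k * MAD_PER_STD standard deviations.
    return prob_within_std(k * MAD_PER_STD)


for k in range(1, 5):
    print(f"{k}  {prob_within_std(k):.6f}  {prob_within_mad(k):.6f}")
```

Neither function takes the mean or the scale as an argument, which is the invariance point in code form: the probabilities depend only on the multiple k.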

The best argument to be made for the use of STD here is that the 2-sigma number is round-ish, and if we want 95% confidence intervals it's easier to multiply by 2 than by 2.5. This is a reasonable objection, if a little unfair: we probably use 95% confidence intervals because they lie two standard deviations from the mean.
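To put numbers on "2 versus 2.5", the sketch below computes the exact multipliers for a two-tailed 95% interval under a normal distribution. It uses `statistics.NormalDist` from the standard library (Python 3.8+); the variable names are mine.

```python
import math
from statistics import NormalDist

# Number of standard deviations that bounds the central 95% of a normal.
z95 = NormalDist().inv_cdf(0.975)

# The same interval expressed in MADs, using STD = MAD * sqrt(pi/2).
mads95 = z95 * math.sqrt(math.pi / 2.0)

print(f"95% interval: {z95:.4f} STDs, or {mads95:.4f} MADs")
```

So "2" rounds up from roughly 1.96, and "2.5" rounds up from roughly 2.46. Both rules of thumb are slightly conservative.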

Without looking at the appropriateness of particular values for first-cut confidence intervals, I don't think an argument either way can be terribly convincing.

The standard deviation is fundamentally related to the normal distribution.

This objection covers a lot of ground, but it essentially boils down to the idea that the usual parametrization of the normal distribution by its mean and its standard deviation is "natural". Under this parametrization, the distribution's probability density function is

$$\frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

In terms of the MAD ("d" in the equation below), it's

$$\frac{1}{d\pi} e^{-\frac{(x-\mu)^2}{d^2\pi}}$$
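For anyone who wants to check the substitution, here is the intermediate step. It assumes only the Gaussian relationship quoted earlier (STD is the square root of pi/2 times MAD) and plugs it into the normalizing constant and the exponent's denominator:

$$\sigma = d\sqrt{\frac{\pi}{2}}$$

$$\sigma\sqrt{2\pi} = d\sqrt{\frac{\pi}{2}}\sqrt{2\pi} = d\pi, \qquad 2\sigma^2 = 2 d^2 \cdot \frac{\pi}{2} = d^2\pi$$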

This is pretty damning. Parametrizing the variation of the normal distribution by STD instead of MAD complicates its PDF (or at least makes it no simpler). It seems difficult to hold that MAD is any less natural or fundamental to the normal distribution, at least by this particular measure.

(Of course, this isn't a strong argument for the MAD either.)

The standard deviation "penalizes" observations further from the mean more severely, and this is a useful property.

The use of the idea of a "penalty" here betrays some confusion between the standard deviation (a measure of spread) and mean squared deviation (a measure of a model's goodness-of-fit). They're very closely related, but they aren't the same thing.
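Terminology aside, the effect the objection points to is real and easy to see on a toy sample. The sketch below is illustrative only: the data are made up, and `mean_abs_dev` is my own helper rather than a library function. It compares how much each measure of spread grows when a single point is pushed far from the rest.

```python
from statistics import fmean, pstdev


def mean_abs_dev(xs):
    """Mean absolute deviation of xs about its own mean."""
    m = fmean(xs)
    return fmean([abs(x - m) for x in xs])


# Made-up samples: identical except for the last observation.
base = [-2, -1, 0, 1, 2]
with_outlier = [-2, -1, 0, 1, 10]

std_growth = pstdev(with_outlier) / pstdev(base)
mad_growth = mean_abs_dev(with_outlier) / mean_abs_dev(base)

print(f"STD grows by a factor of {std_growth:.2f}")
print(f"MAD grows by a factor of {mad_growth:.2f}")
```

On this sample the STD grows somewhat more than the MAD, because squaring weights the distant point more heavily. Whether that extra sensitivity is desirable in a measure of spread is exactly what is in dispute.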

Taleb does not argue for the use of mean absolute error in modelling applications instead of the mean squared error. (Nor for the use of Manhattan distances over Euclidean distances.) And just as well — the mean squared error is lovely, and it has wonderful, magical properties. Mean absolute error, by contrast, is only used by practitioners with loose morals and bad taste in music.

But this is largely beside the point. We should be asking two questions:

I don't know the answer to those questions, but I will agree with a restricted argument for MAD in contexts involving lay audiences. Taleb's example of daily temperature variations is instructive — the meaning and properties of the MAD statistic are certainly more intuitive than the STD's.

TL;DR: I don't know whether Mutually Assured Destruction is better than Sexually Transmitted Diseases, but at least it's a more interesting discussion than the damn pi/tau "debate".