**The extreme value fallacy**

If you take a number of samples of a random variable and
put them in order of magnitude, the *extreme* values are the *largest*
and *smallest*. These extreme values exhibit special distributions of their
own, which depend on the **distribution** of the original variate and the **number**
of ranked samples from which they were drawn. The fallacy occurs when the
extremes are treated as though they were single samples from the original
distribution.

A very common example is the *birth month fallacy*,
which recurs in the media several times a year. It usually takes the form of a
headline such as **People born in July are more likely to get toe-nail cancer**
or more fancifully **Cancerians are more likely to get toe-nail cancer**.
What the “researchers” have done is look at the statistics for each of the
twelve months, pick out the biggest, and then marvel that it seems large
compared with what you would expect for a random month. The expectation (or mean)
for the largest value actually increases (logarithmically) with the number of
samples from which it was drawn.
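This growth of the expected maximum is easy to demonstrate by simulation. The sketch below (illustrative, not from the original text) uses unit-exponential samples, for which the mean of the largest of *n* values is exactly the harmonic number H(*n*) ≈ ln *n* + 0.5772, so the logarithmic growth is plain to see:

```python
import math
import random

random.seed(1)

def mean_of_largest(n, trials=20000):
    """Average the largest of n unit-exponential samples over many trials.

    For this distribution the true mean of the maximum is the harmonic
    number H(n) = 1 + 1/2 + ... + 1/n, roughly ln(n) + 0.5772.
    """
    total = 0.0
    for _ in range(trials):
        total += max(random.expovariate(1.0) for _ in range(n))
    return total / trials

# The "biggest of twelve months" is noticeably larger than a single
# random month, and the effect keeps growing with the sample count.
for n in (1, 12, 144):
    print(n, round(mean_of_largest(n), 2))
```

Treating that largest-of-twelve figure as though it were a single sample is precisely the fallacy described above.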

For a fully worked example from a real media story see the case of anorexia.

There are, of course, cases where the statistics of
extremes is paramount. The *distribution of largest values* applies in
cases such as floods or peak annual temperatures, and of course to all forms of record;
while the *distribution of smallest* values applies to strength of
materials problems, where the principle of a chain being as strong as its
weakest link dominates, or to such phenomena as droughts or the duration of
human life. The rigorous theory was fully worked out, starting in the 1920s
with the great R A Fisher and refined in the 1950s by E J Gumbel, the great
authority on the subject. Despite this, engineers and scientists long
continued to apply the normal distribution to phenomena to which it could not
possibly apply, such as the breakdown of electrical insulation.

The mathematics for, say, calculating the expected value
for the largest of a given number of samples from a known distribution is rather
complicated, but thanks to a neat piece of mathematics by Gumbel involving L’Hôpital’s
rule, the most likely value (the mode) is easy to calculate. For *n*
samples from a distribution *F*(*x*)
the *characteristic largest value* is defined as the value for *x* at
which *F*(*x*)=1-1/*n*. It can be shown that the characteristic
largest value is a good approximation to the mode of the distribution of
largest values. Since inverse distributions are available in such packages as
MathCad®, this is easy to calculate, which is why the mode has been used in
these pages to illustrate such phenomena as records.
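The same inverse-distribution calculation can be sketched in a few lines of Python; `statistics.NormalDist` stands in here for the MathCad inverse-distribution functions mentioned above, with the standard normal chosen purely as an illustration:

```python
from statistics import NormalDist

def characteristic_largest(n, dist=NormalDist()):
    """Characteristic largest value of n samples: the x at which
    F(x) = 1 - 1/n, i.e. x = F^{-1}(1 - 1/n)."""
    return dist.inv_cdf(1 - 1 / n)

# For standard-normal samples, the most likely largest value
# creeps upward slowly as n grows.
for n in (12, 100, 10000):
    print(n, round(characteristic_largest(n), 3))
```

For 100 standard-normal samples this gives about 2.33 standard deviations, already well beyond what one would call "large" for a single sample.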

Another important quantity is the return period, which is
the average time interval between occurrences of a particular value, say the
annual maximum flood. The return period for a distribution *F*(*x*),
where the samples of *x* are regularly spaced in time, is simply 1/(1-*F*(*x*)).
One of the mainstays of myths such as global warming is holding up examples of
extreme floods or heat waves, when these often have a return period of only
about a century. This is a doubled-up extreme value fallacy, as they select
the largest value not only in time but also in space (i.e. geographical location).
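As a minimal worked example of the formula (the 0.99 figure is an assumption chosen for illustration): an annual maximum with cumulative probability *F*(*x*) = 0.99 is the familiar "1-in-100-year" event.

```python
def return_period(F_x):
    """Average number of regularly spaced samples (e.g. years) between
    occurrences of a value x with cumulative probability F(x)."""
    return 1.0 / (1.0 - F_x)

print(return_period(0.99))   # an annual maximum exceeded about once a century
print(return_period(0.999))  # a "1-in-1000-year" event
```

Seen this way, a century-scale flood somewhere on a large continent in any given year is not remarkable at all.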