**Binomial Distribution**

Let’s start off with the tossing of a coin, calling one outcome H for heads and the other T for tails. If we toss it once, there are four events to which we can assign numbers representing their probability:

Neither heads nor tails 0

Heads ½

Tails ½

Either heads or tails 1

Thus our events H and T have evens probability, which we call ½. Now suppose we toss the coin twice. We have four combinations that can occur: HH, HT, TH, TT. Now we have no reason to suppose that any of these is less probable than another, so we assign them equal probabilities, totalling 1, and grouping them we get the following:

Two heads ¼

One of each ½

Two tails ¼

Because “one of each” can happen in two ways (HT or TH), we give each way an equal probability of ¼ and add them; so we have distributed our probabilities in the proportions 1,2,1. If we now toss the coin three times per experiment, the eight equally probable sequences fall into four possible combinations:

Three heads 1/8

Two heads and a tail 3/8

Two tails and a head 3/8

Three tails 1/8

There are three ways of getting each two-plus-one combination: TTH, THT, HTT and HHT, HTH, THH. So we have now distributed our probabilities in the proportions 1,3,3,1 and divided by the total, 8, as the total probability of the events has to be 1. A pattern is emerging, and this pattern is known as Pascal’s triangle, though it was known to Chinese algebraists in 1303. Try it for four or five tossings and you can verify it yourself:

1

1 1

1 2 1

1 3 3 1

1 4 6 4 1

1 5 10 10 5 1

………………………..

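Each number is found by adding the two numbers on either side above it. Thus, without any complicated calculations, we can predict the probability of any outcome for any number of coin tossings: take the number in the triangle and divide it by the total of its row, which for *n* tossings is 2^{n}.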
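The rule of adding the two numbers above is easy to try out on a computer. The following is only a minimal sketch in Python; the function names `pascal_row` and `count_heads_by_enumeration` are made up for this illustration. The first builds a row of the triangle by the adding rule, and the second lists every possible sequence of tosses and counts the heads, so that the two can be compared.

```python
from itertools import product


def pascal_row(n):
    """Return row n of Pascal's triangle, counting the top row as row 0."""
    row = [1]
    for _ in range(n):
        # Each inner entry is the sum of the two entries above it;
        # the ends of every row are always 1.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row


def count_heads_by_enumeration(n):
    """List all 2**n toss sequences and count how many contain r heads."""
    counts = [0] * (n + 1)
    for sequence in product("HT", repeat=n):
        counts[sequence.count("H")] += 1
    return counts


if __name__ == "__main__":
    for n in range(6):
        row = pascal_row(n)
        # Dividing each entry by the row total gives the probabilities.
        probabilities = [entry / sum(row) for entry in row]
        print(n, row, probabilities)
```

For small numbers of tossings the brute-force count and the triangle row should agree entry for entry, which is the "try it yourself" check done by machine.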

Each number in the triangle represents the number of ways
of selecting *r* things from *n* different things, where *n* is
the number of tossings (the number of rows down from the top, counting the top row as 0) and *r* is the
horizontal position from the left, also counting from 0. As a mathematical shorthand we write this ^{n}C_{r},
and we can see from the symmetry of the triangle that it has the important
property that ^{n}C_{r} = ^{n}C_{n-r}.
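As a small illustration, Python's standard library provides `math.comb(n, r)` (available from Python 3.8), which returns the number of ways of choosing *r* things from *n*. The sketch below, with *n* = 5 chosen arbitrarily, rebuilds one row of the triangle from it and checks the symmetry property, counting both the row and the position from 0.

```python
from math import comb

n = 5  # arbitrary row chosen for illustration

# Row n of the triangle, reading r from 0 on the left to n on the right.
row = [comb(n, r) for r in range(n + 1)]
print(row)

# Symmetry: choosing r things is the same as choosing which n - r to leave out.
symmetric = all(comb(n, r) == comb(n, n - r) for r in range(n + 1))
print(symmetric)
```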

That is all very well, but what if our coin is biased and
the probabilities of H and T are unequal, though of course they still sum to unity? Let us set
the probability of H = *p* and the probability of T = *q*, so that *p* + *q* = 1.

Then we get for the double tossing:

Two heads *p*^{2}

One of each 2*pq*

Two tails *q*^{2}

For the triple tossing we get:

Three heads *p*^{3}

Two heads and a tail 3*p*^{2}*q*

Two tails and a head 3*pq*^{2}

Three tails *q*^{3}

We can extend this to any number of tossings, *n*, and
have therefore established, by common sense, the Bernoulli theorem:

**If the probability of success
in each trial is p, then the probability of r successes in n
trials is ^{n}C_{r} p^{r} q^{n-r}.**
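The formula translates directly into a few lines of code. This is a sketch only, and the function name `binomial_probability` is invented for the illustration; it takes *q* to be 1 − *p*, as above.

```python
from math import comb


def binomial_probability(n, r, p):
    """Probability of exactly r successes in n trials, each with success probability p."""
    q = 1 - p
    # Number of ways of placing the r successes, times the probability of each way.
    return comb(n, r) * p**r * q ** (n - r)


if __name__ == "__main__":
    # Triple tossing of a fair coin: r = 0, 1, 2, 3 heads.
    for r in range(4):
        print(r, binomial_probability(3, r, 0.5))
```

With *p* = ½ this reproduces the proportions 1,3,3,1 divided by 8 that we found for three tossings of a fair coin.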

Let us illustrate this numerically. If our coin was biased, the probability of a head might be 0.4 and of a tail 0.6, so wherever we see an H we put 0.4 and wherever we see a T we put 0.6. If the idea of a biased coin seems difficult, imagine a black bag containing four discs marked H and six discs marked T. You shake it up, make a selection, identify it, then replace it. Then for two selections the probabilities are modified to:

Two heads 0.4 × 0.4 = 0.16

One of each 2 × 0.4 × 0.6 = 0.48

Two tails 0.6 × 0.6 = 0.36

So for any combination all we do is multiply the
probabilities together and then multiply by the number of ways we can get that
particular combination. We check our calculations by making sure the probabilities add up
to 1 (here 0.16 + 0.48 + 0.36 = 1). We can do this for three, four, ten, one thousand or any number of
tossings and calculate the exact probabilities. This is the **binomial
distribution** and that is all it is, common sense. Of course it is rather
tiring to toss a coin one thousand times, so we get a computer to do it.
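Getting a computer to do the tossing might look something like the sketch below. It is an illustration under stated assumptions rather than a definitive method: the function name `simulate_head_counts`, the number of experiments and the fixed seed are all choices made here, and the biased coin is imitated by comparing a random number between 0 and 1 with *p*.

```python
import random
from math import comb


def simulate_head_counts(n, p, experiments, seed=0):
    """Toss a biased coin n times per experiment; return the fraction of experiments with r heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * (n + 1)
    for _ in range(experiments):
        # A toss counts as a head when the random number falls below p.
        heads = sum(rng.random() < p for _ in range(n))
        counts[heads] += 1
    return [count / experiments for count in counts]


def exact_probabilities(n, p):
    """Exact probabilities of 0..n heads from the formula nCr * p^r * q^(n-r)."""
    q = 1 - p
    return [comb(n, r) * p**r * q ** (n - r) for r in range(n + 1)]


if __name__ == "__main__":
    simulated = simulate_head_counts(n=2, p=0.4, experiments=20000)
    exact = exact_probabilities(n=2, p=0.4)
    for r, (s, e) in enumerate(zip(simulated, exact)):
        print(r, "heads: simulated", round(s, 3), "exact", round(e, 3))
```

The simulated fractions will not match the exact values perfectly, but with many experiments they should settle close to 0.36, 0.48 and 0.16 for no heads, one head and two heads.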

It is fairly obvious that the average number of heads we
are going to get is *np*, so this is the average of a binomial
distribution. The standard deviation is the square root of *npq*.
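If the average and standard deviation do not seem obvious, they can be checked by working them out directly from the probabilities. The sketch below does this for *n* = 10 and *p* = 0.4; the function name `binomial_distribution` is made up for the illustration, and the mean and variance are computed from their ordinary definitions rather than from the shortcuts.

```python
from math import comb, sqrt


def binomial_distribution(n, p):
    """Return the probabilities of 0, 1, ..., n successes in n trials."""
    q = 1 - p
    return [comb(n, r) * p**r * q ** (n - r) for r in range(n + 1)]


n, p = 10, 0.4
probabilities = binomial_distribution(n, p)

# Mean: each possible number of heads weighted by its probability.
mean = sum(r * pr for r, pr in enumerate(probabilities))

# Variance: the weighted average of the squared distance from the mean.
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probabilities))

print("mean from the distribution:", mean, "compared with np:", n * p)
print("standard deviation:", sqrt(variance), "compared with sqrt(npq):", sqrt(n * p * (1 - p)))
```

Apart from small rounding differences in the arithmetic, the two ways of calculating each quantity should agree.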

Here are the numbers for n = 10, p = 0.4: