How do Relative Risk and Odds Ratio compare?
In the FAQ on Relative Risk, advice was given against the use of the Odds Ratio, without explanation. The reason is that an explanation needs more mathematics than many readers are comfortable with. The question keeps recurring, however, so an explanation is offered here. If you prefer, you can skip the next section and proceed to the discussion below.
Forget about statistics for the moment.
Imagine you are working on a study of two physical variables, x and y, and the way they track each other. You are offered the choice of one of two possible functions, the simple ratio and a rather more complicated one. Let's call them f and g:
f(x, y) = x / y
g(x, y) = [x / (1 - x)] / [y / (1 - y)]
Some thoughts might occur to you:
Why use a complicated function when a simple one would do (Occam's razor)?
The function f is well known to anyone with a modest experience of functions of two variables. Without having to plot it, we know that it forms a series of straight lines in one plane and a series of rectangular hyperbolae in the orthogonal plane.
On the other hand, g is not nearly so well behaved. It has a double pole at (1,0) and a double zero at (0,1). In the regions near these points it reveals an exaggerated response to changes and errors.
If you know two of f, x and y, it is easy to calculate the other.
This is less straightforward in the case of g, where the equation must first be rearranged to isolate the unknown.
Likewise, if we know f or g we cannot calculate the other without knowing x or y.
If we did not have a computer handy, we would not even consider the choice of g.
If some workers are using f and others g, it is difficult to compare their results.
You might well come to the conclusion that g has nothing going for it. Now substitute for x the estimated probability, p, that an event occurs in a target group and for y the probability, q, that an event occurs in a control group. We have then defined the Relative Risk as f, and the Odds Ratio as g.
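The points made about f and g can be checked numerically. The following is a minimal sketch, assuming f is the plain ratio x/y and g is the ratio of x/(1-x) to y/(1-y); the sample points are hypothetical and chosen only for illustration.

```python
def f(x, y):
    """Simple ratio (the form assumed here for the Relative Risk)."""
    return x / y


def g(x, y):
    """Ratio of odds (the form assumed here for the Odds Ratio)."""
    return (x / (1 - x)) / (y / (1 - y))


# Two hypothetical points that share the same simple ratio.
a = (0.2, 0.1)
b = (0.8, 0.4)
print("f:", f(*a), f(*b))  # identical simple ratios
print("g:", g(*a), g(*b))  # g differs, so f alone does not determine g

# The same small shift in x, applied close to the double pole at (1, 0).
near = (0.98, 0.02)
shift = 0.01
f_change_near = f(near[0] + shift, near[1]) / f(*near)
g_change_near = g(near[0] + shift, near[1]) / g(*near)
print("relative change in f:", f_change_near)
print("relative change in g:", g_change_near)
```

Both points give a simple ratio of 2, yet g is about 2.25 at the first and about 6 at the second. Near the pole, a shift of 0.01 in x moves f by roughly one per cent while g roughly doubles.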
Here they are plotted on a plane of probabilities.
[Figure: The Relative Risk]
[Figure: The Odds Ratio]
Note the different ranges of the dependent variables for the same ranges of probabilities.
Relative Risk and Odds Ratio are defined as:
RR = p / q
OR = [p / (1 - p)] / [q / (1 - q)]
where p is the estimated probability of an event occurring in a target group and q is the estimated probability of the same event occurring in a control group. RR is a simple and well understood ratio, while OR is unnecessarily complicated and difficult to unravel, though it tends to produce numbers that are more impressive to the less numerate.
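As a worked illustration, the sketch below computes both measures from a pair of event counts, assuming the standard definitions RR = p/q and OR = [p/(1-p)]/[q/(1-q)]. The counts are invented for the example, and the function names are hypothetical. The last function uses the algebraic identity RR = OR / (1 - q + q·OR), which shows that converting one measure into the other requires q as well.

```python
def relative_risk(p, q):
    """RR: probability in the target group over probability in the control group."""
    return p / q


def odds_ratio(p, q):
    """OR: odds in the target group over odds in the control group."""
    return (p / (1 - p)) / (q / (1 - q))


def rr_from_or(or_value, q):
    """Recover RR from OR; the control probability q must also be known."""
    return or_value / (1 - q + q * or_value)


# Hypothetical counts, invented purely for illustration.
target_events, target_total = 30, 100
control_events, control_total = 10, 100

p = target_events / target_total    # estimated probability in the target group
q = control_events / control_total  # estimated probability in the control group

rr = relative_risk(p, q)
or_value = odds_ratio(p, q)
print("RR:", rr)
print("OR:", or_value)
print("RR recovered from OR and q:", rr_from_or(or_value, q))
```

With these counts RR is about 3 while OR is about 3.86, so the same data yield a larger figure when reported as an Odds Ratio.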
For these reasons, you might consider it rather eccentric to apply the Odds Ratio, unless it were your desire to be obscure or misleading.
So why does the Odds Ratio exist? Well, a motivating force for the development of probability theory was the popularity of games of chance, particularly among the aristocrats who were often the patrons of early mathematicians. Tartaglia applied the new theory of permutations to the throwing of dice as early as 1523.
Odds are concerned with the fair return in a game of chance. If the probability is p and the odds are d, then they are related by:
d = p / (1 - p)
Thus the chance of throwing a six with a fair die is 1/6, while the odds are 1:5 and a fair bet would be 5:1 against. In gambling, of course, the published odds also contain the profit element of the bookmaker or casino. The concept of a fair bet has no relevance to statistical inference in science.
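The die example can be reproduced with exact fractions. This is a small sketch, assuming the usual relation d = p/(1-p) between probability and odds; the function names are hypothetical.

```python
from fractions import Fraction


def odds_from_probability(p):
    """Odds in favour of an event with probability p."""
    return p / (1 - p)


def probability_from_odds(d):
    """Inverse relation: probability of an event with odds d in favour."""
    return d / (1 + d)


p_six = Fraction(1, 6)                 # chance of throwing a six with a fair die
d_six = odds_from_probability(p_six)   # odds in favour, as an exact fraction
print("odds in favour:", d_six)
print("fair bet against:", 1 / d_six, "to 1")
print("back to probability:", probability_from_odds(d_six))
```

Working in `Fraction` rather than floating point keeps the 1:5 and 5:1 figures exact.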
We see that d is the type of ratio that exaggerates changes and errors. Furthermore, since q is representative of the general population, it should be (within the bounds of sampling error) constant; so the Odds Ratio will always seem to produce results that are more impressive than does Relative Risk. It is easy to see that (apart from the trivial zero case) d is always greater than p.
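Both claims in the preceding paragraph follow from a line of algebra each. The derivation below is a sketch, assuming the usual definitions d = p/(1-p), RR = p/q and OR = [p/(1-p)]/[q/(1-q)], with p and q strictly between 0 and 1.

```latex
% Odds always exceed the probability they are derived from:
d - p = \frac{p}{1-p} - p = \frac{p^{2}}{1-p} > 0 \qquad (0 < p < 1)

% The Odds Ratio is the Relative Risk multiplied by a correction factor:
\mathrm{OR} = \frac{p/(1-p)}{q/(1-q)} = \frac{p}{q}\cdot\frac{1-q}{1-p}
            = \mathrm{RR}\cdot\frac{1-q}{1-p}
```

When p > q the factor (1-q)/(1-p) is greater than 1, so OR > RR > 1. When p < q it is less than 1, so OR < RR < 1. In either direction the Odds Ratio lies further from 1 than the Relative Risk.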
The fact that OR and RR exist side by side in certain sections of the literature of applied statistics is a generator of confusion and an impediment to fair comparison. It is difficult to see any justification for the use of odds in what purports to be scientific study. It is just another example of the misleading effects of statistical computing packages when they are used without understanding or disinterest.