Tuesday, December 11, 2007

probability and public policy VI (appendix)

We present here the calculations relevant to the above argument.

Politician Q is a frequentist, so he simply reads the probability of heads off the data: 1 head in 4 tosses gives P(H) = 1/4.

Politicians R and S are Bayesians, but in order to simplify the problem, we need to distinguish their hypotheses about the underlying probability from their degree of confidence in those hypotheses. Let us say that each politician judges the probability of their favorite hypothesis to be 9/10. Call politician R's hypothesis (that the coin is fair) A and politician S's hypothesis (that the coin is heavily biased towards Tails) B. Furthermore, we simplify by assuming that A and B are mutually exclusive and jointly exhaustive, i.e. there are no other possible hypotheses under consideration for either politician (we revisit this assumption in the sequel).

A = hypothesis that P(H) = 1/2 (i.e. that the coin is fair)
B = hypothesis that P(H) = 1/100 (i.e. that the coin is biased strongly against H)
D = THTT (our data)

Then, using subscripts R and S for the beliefs of the respective politicians, we set

P_R(A) = 9/10, so
P_R(B) = 1/10, and
P_S(B) = 9/10, so
P_S(A) = 1/10

It is important to note here that the particular order of heads and tails is irrelevant; only the number of heads matters, which is what allows us to apply the binomial distribution to calculate P(D|A) and P(D|B) (these likelihoods are the same regardless of which politician's beliefs we consider).

[apologies: due to the apparent incompatibility of Blogger with HTML math tags, I will use n{choose}k as shorthand for the standard notation]

Binomial distribution as a function of n = number of trials, k = number of positives (in this case heads), p = P(H):

f(k; n, p) = (n{choose}k) p^k (1-p)^(n-k)

Where,
(n {choose} k) = n! / (k!(n-k)!)

so, for us,

(4 {choose} 1) = 4! / (1!(3)!) = 4

which gives, writing p_X for the probability of heads under hypothesis X ∈ {A, B}:
P(D|X) = 4p_X^1(1-p_X)^3 = 4p_X(1-p_X)^3

Therefore:

P(D|A) = 4(1/2)(1/2)^3 = 1/4

and

P(D|B) = 4(1/100)(99/100)^3 = 4(1/100)(970,299/1,000,000) = 3,881,196/100,000,000 ≈ (3.8x10^6)/(10^8) ≈ 4/100
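
[For readers who want to check the arithmetic, here is a short Python sketch that computes both likelihoods from the binomial formula; the function and variable names (likelihood, p_A, p_B) are just illustrative and not part of the argument.]

    from math import comb

    def likelihood(p, n=4, k=1):
        # binomial probability of k heads in n tosses given P(H) = p
        return comb(n, k) * p**k * (1 - p)**(n - k)

    p_A = 1/2    # P(H) under hypothesis A (fair coin)
    p_B = 1/100  # P(H) under hypothesis B (heavily biased towards tails)

    print(likelihood(p_A))  # 0.25        = 1/4
    print(likelihood(p_B))  # 0.03881196  ≈ 4/100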

In order to apply Bayes' Rule, we also need the value of P(D). Unlike the conditional probabilities just calculated, P(D) will differ for each politician's probability distribution.

P_R(D) = P_R(A)P_R(D|A) + P_R(B)P_R(D|B) = (9/10)(1/4) + (1/10)(4/100) = (9/40) + (4/1000) ≈ 1/4

P_S(D) = P_S(A)P_S(D|A) + P_S(B)P_S(D|B) = (1/10)(1/4) + (9/10)(4/100) = (1/40) + (36/1000) = (25/1000) + (36/1000) ≈ 6/100

[excuse the rough rounding, but it won't change the qualitative result even if it introduces small inaccuracies]
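
[Likewise, a few lines of Python make the totals easy to verify; with the likelihoods rounded as above, the sums come out to 0.229 and 0.061.]

    # priors and (rounded) likelihoods from above
    P_R_A, P_R_B = 9/10, 1/10   # politician R's priors
    P_S_A, P_S_B = 1/10, 9/10   # politician S's priors
    P_D_given_A = 1/4
    P_D_given_B = 4/100

    P_R_D = P_R_A * P_D_given_A + P_R_B * P_D_given_B
    P_S_D = P_S_A * P_D_given_A + P_S_B * P_D_given_B
    print(P_R_D)  # 0.229 ≈ 1/4
    print(P_S_D)  # 0.061 ≈ 6/100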

Now we can simply apply Bayes' Rule to determine how strongly politicians R and S will believe their respective favorite theories after being presented with the evidence D:

P_R(A|D) = P_R(D|A)P_R(A)/P_R(D) ≈ (1/4)(9/10)/(1/4) = 9/10

P_S(B|D) = P_S(D|B)P_S(B)/P_S(D) ≈ (4/100)(9/10)/(6/100) = (2/3)(9/10) = 6/10
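
[Putting the whole update into Python with unrounded likelihoods gives P_R(A|D) ≈ 0.98 and P_S(B|D) ≈ 0.58; the rough rounding above slightly understates R's final confidence but leaves the qualitative picture intact. The two-hypothesis posterior function below is just a convenience for this example.]

    from math import comb

    def likelihood(p, n=4, k=1):
        return comb(n, k) * p**k * (1 - p)**(n - k)

    def posterior(prior_hyp, p_hyp, prior_alt, p_alt):
        # Bayes' rule with two mutually exclusive, exhaustive hypotheses
        num = likelihood(p_hyp) * prior_hyp
        return num / (num + likelihood(p_alt) * prior_alt)

    # politician R: 9/10 in A (P(H) = 1/2), 1/10 in B (P(H) = 1/100)
    print(posterior(9/10, 1/2, 1/10, 1/100))   # ≈ 0.983
    # politician S: 9/10 in B, 1/10 in A
    print(posterior(9/10, 1/100, 1/10, 1/2))   # ≈ 0.583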

It should be clear that we can increase politician S's certainty in his original conclusion in the face of this evidence by increasing his initial degree of belief in that conclusion. Note that even if we include more than two possible hypotheses, this conclusion still holds (any additional hypotheses will be weighted so weakly by each agent that such a small amount of evidence will not appreciably affect them). The point here is just (to reiterate) that given sparse or ambiguous evidence and sufficiently strong priors, it is quite possible for rational agents to disagree dramatically about the pertinent conclusion to draw from the data.
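
[To see how S's posterior tracks his prior, the same calculation can be run over a range of initial degrees of belief in B; the particular prior values below are only illustrative.]

    from math import comb

    def likelihood(p, n=4, k=1):
        return comb(n, k) * p**k * (1 - p)**(n - k)

    # politician S's posterior confidence in B as a function of his prior in B
    for prior_B in (0.5, 0.9, 0.99, 0.999):
        num = likelihood(1/100) * prior_B
        post = num / (num + likelihood(1/2) * (1 - prior_B))
        print(prior_B, round(post, 3))
    # 0.5 -> 0.134, 0.9 -> 0.583, 0.99 -> 0.939, 0.999 -> 0.994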
