## Archive for August, 2004

### Mr Jenner on Channel 9

Wednesday, August 25th, 2004

Robert Scoble, a Technical Evangelist at Microsoft, just came around and interviewed me (and a few other people on our team) for Channel 9 - look for the video to be there in a week or two.

If I'd known I was going to be on TV I would have brushed my hair this morning.

### A new approach to the Monty Hall problem

Tuesday, August 24th, 2004

Reams and reams have been written about the Monty Hall problem, but no-one seems to have mentioned a simple fact which, once realised, makes the whole thing seem intuitive.

The Monty Hall show is a (possibly fictional, I'm not sure) TV gameshow. One couple have beaten all the others to the final round with their incredible skill at answering questions on general knowledge and popular culture, and now have a chance to win a Brand New Car. There are three doors. The host explains that earlier, before the couple arrived, a producer on the show rolled a dice. If a 1 or a 4 was rolled, the car was placed behind the red door. If a 2 or a 5 was rolled, it was placed behind the blue door and if a 3 or a 6 was rolled, it was placed behind the yellow door.

The host invites the couple to pick which door they think the car is behind. He then opens one of the other two doors and there's no car behind the door! (He knows where the car is, so he can always arrange for this to happen). Then the host asks the couple if they want to change their mind about which door they think the car is behind. Should they change? Does it make a difference.

Most people's first reaction is that it can't matter. How can it? The car has a one in three chance of being behind each of the doors.

No-one would argue that the car has anything but a probability of 1/3 of being in behind the door the couple picked (say it's the red door). But when the host opens the blue door, magic happens. The probability of the car being behind the blue door suddenly goes to zero. The probability can't vanish (otherwise there would only be a 2/3 probability of there being a car at all) and it can't go to the red door so this ghostly 1/3 probability-of-there-being a car goes to the yellow door. The car now has a 2/3 probability of being behind the yellow door. "Poppycock!" most people would say. Probability isn't this "magic stuff" that can travel between doors. But the correct answer is that the couple should change doors - the car really does have a 2/3 probability of being behind the yellow door.

If you're in doubt, you could simulate the situation with a computer program, run it lots of time for the choices "never change doors" and "always change doors" and see what fraction of the time in each case the couple wins the car. You will find that changing makes you win 2/3 of the time, and sticking 1/3. Or you could enumerate the possibilities:

1/3: Couple picks correct door in the first place. If they change, they lose.
2/3: Couple picks the wrong door. The other wrong door is then eliminated, so if they change, they win.

So changing has a 2/3 probablity of winning. This reasoning sounds like a more plausible argument for changing doors.

The key to this matter, and what makes the whole thing confusing to those who don't realise this, is that probability depends on what you know. If you think about this for a while, it becomes obvious. A fair coin, when tossed, has a 50% probability of landing on heads. However, once the event has happened, the probablity collapses to 0% (if it landed tails up) or 100% (if it landed heads up). Let event A be the tossing of a coin at noon, and success defined by the coin landing heads up. At five seconds to noon the probability of success is 0.5. At five seconds past noon, when everybody can see that the coin landed heads up, the probability of success is 1.0. If the coin is tossed and it rolls under the sofa, then at five seconds past there is still a 50% chance of success. Although the coin has landed, no-one knows what the result is. Probability depends on what you know. If you know nothing about the coin, the probability of success is 0.5.

Suppose a neutral third party is the only one to see the coin, and says "I'm not going to tell you what it says, but I'm going to roll a dice (behind your back, so you can't see it). If it comes up even, I'll say "heads", whatever the coin says. If it comes up odd, I'll say what the coin says. But I won't tell you whether the dice came up odd or even." Suppose this third party then tells says "heads". There's a 50% chance that this was because he rolled an even number and a 50% chance that that's what the coin really said. What is the probability of success now? Well, we can enumerate the possibilities again and notice that of the four equally likely possibilities (Heads+even, Heads+odd, Tails+even, Tails+odd) the only one we've eliminated is Tails+odd, since in that case he would have said "tails". Of the remaining three possibilities (which are still equally likely), two of them involve success so the probability of success is 2/3. We can check this as follows: He says "heads" three out of four times, so the probability of success is (2/3)*(3/4)+0*(1/4)=2/4=1/2 (since we know there is 0 probability of success if he says "tails"). This is the answer we expected.

We conclude that, by cleverness, we can do a "partial collapse" of the probability by finding out a bit of information (if not all of it). In this case the knowledge that the neutral third party said "heads" doesn't give as much information about the state of the coin as seeing the coin itself - it doesn't tell us for definite whether we have heads or not, but it does impart enough information to change the probability.

This is exactly what happens in the Monty Hall problem. The host imparts some information to the couple about which door the car is behind, but not enough to tell for the couple to tell for definite which door the car is behind - just enough to shift the probability in favour of the door which they would choose if they opted to "change". If it was a complete probability collapse (i.e. if he opened any two doors) no-one would be in any doubt as to whether they should change or not. It's just because the probability has only partially collapsed that people get confused.

Justin sent me this email:

I read your paper on "Monte Hall Strikes Back." and absorbed that probability depends on what you know. Following is the question that I made up and is having trouble "partial collapsing" it. Maybe you can help me out with an insight:

There are two doors, door #1 and door #2 behind which two real numbers are written at random. You get a prize if you choose the door with larger real number. At this point, the probability of winning a prize is 1/2. However, you get a chance. You first choose a door, and Monty shows you the number behind that door. What should you do in order to do better than 1/2? Or is it even possible to do better than 1/2?

The answer to the question hinges on how these two real numbers are chosen. If all real numbers are equally likely, you can never do better than 1/2 because for any real number x the size of the set of real numbers smaller than x is exactly the same size as the set of real numbers larger than x (this is easy to prove, just pair them up: for all y>0, pair up x-y with x+y).

Of course, the game can't work like that in the real world because most of the real numbers are extremely large (either positive or negative) and require more atoms than there are in the universe to write down.

Suppose we have a more reasonable probability distribution for $x$, $P(x <= r<= x+dx) = f(r)dx[/latex]. Then, given the value behind the first door, $x$, you can calculate the probability of $y$ (the number behind the second door) being less than $x$, $\displaystyle P(y < x) = \int_{-\infty}^xf(r)dr = F(x)$, and switch doors if $P(y\frac{1}{2}) = F(x)$ P(win; F(x)<\frac{1}{2}) = 1-F(x)$ $\displaystyle P(win) = \int_{-\infty}^zf(x)(1-F(x))dx+\int_z^\infty f(x)F(x)dx$ where $F(z) = \frac{1}{2}$. Integrating by parts gives $P(win) = \frac{3}{4}$. However, the probability of winning using the optimal strategy after you know x depends on the distribution and on the value of x, but can be anywhere between 1/2 and 1.