Wednesday, December 4, 2013

Understanding Monty Hall, and learning Python

Finally back with a post after a long hiatus!

I've finished the Master's that has been sucking the hours out of my life for the last 2 years and suddenly realized that I needed to add some better programming skills to my CV. I started doing the excellent R and interactive Python courses on Coursera. They give a good introduction and provide challenging assignments to help students build their knowledge.

I decided to code up some of the algorithms and concepts I've learned over the last 2 years to improve my understanding of them and to develop my coding skills. The first concept I decided to tackle was that of variable change, best captured in the Monty Hall problem.

The Monty Hall problem is a great example of where intuition is at odds with statistics. The player has to guess which of 3 doors a prize is behind. Once the player has picked, the game show host, who knows where the prize is, opens one of the two unselected doors to show that it is empty. He then offers the player the choice to stick with the original door or switch to the remaining one.
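To make the host's move concrete, here is a minimal sketch of a single round. The names `host_opens` and `play_round` are made up for illustration, and the sketch assumes the host always knows where the prize is and always opens an empty door the player did not pick.

```python
import random

DOORS = (0, 1, 2)

def host_opens(prize, choice, rng=random):
    """Return a door the host may open: never the prize, never the player's pick."""
    candidates = [d for d in DOORS if d != prize and d != choice]
    # Two candidates exist only when the player's first pick is the prize door.
    return rng.choice(candidates)

def play_round(switch, rng=random):
    """Play one game and return True if the player ends up on the prize door."""
    prize = rng.choice(DOORS)
    choice = rng.choice(DOORS)
    opened = host_opens(prize, choice, rng)
    if switch:
        # Move to the one door that is neither the first pick nor the opened door.
        choice = next(d for d in DOORS if d != choice and d != opened)
    return choice == prize

if __name__ == "__main__":
    print("Switching won this round:", play_round(switch=True))
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which is handy when comparing results between runs.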

The most common response is that it shouldn't make any difference: you chose 1 door out of 3 (a 33% chance), and now that only 2 doors are left the odds look like 50/50. You may even feel that your odds have improved. So sticking with your initial choice should be as good a choice as switching to the other door.
This is actually wrong! It took me a long time to understand why, and the reason is that we get very tied up thinking about winning. Instead, let's think about losing. When the game started you had a 2 in 3 chance of picking the wrong door, which means you probably did. In that case the host has just removed the only other losing door, so the remaining door must hide the prize. It therefore makes the most sense to switch away from the initial (and likely wrong) door you chose in the first place.
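The "think about losing" argument can be checked by counting cases rather than simulating. The sketch below is my own illustration (the function name `exact_win_probabilities` is hypothetical) and assumes the prize and the first pick are each equally likely to be any of the 3 doors.

```python
from fractions import Fraction
from itertools import product

DOORS = range(3)

def exact_win_probabilities():
    """Enumerate every equally likely (prize, first pick) pair and count wins."""
    stick_wins = 0
    switch_wins = 0
    cases = list(product(DOORS, DOORS))
    for prize, choice in cases:
        if choice == prize:
            # First pick was right: sticking wins, switching loses.
            stick_wins += 1
        else:
            # First pick was wrong: the host removes the other empty door,
            # so the door left over is the prize and switching wins.
            switch_wins += 1
    total = len(cases)
    return Fraction(stick_wins, total), Fraction(switch_wins, total)

stick, switch = exact_win_probabilities()
print("Stick wins with probability", stick)
print("Switch wins with probability", switch)
```

Of the 9 combinations, 3 have the first pick on the prize and 6 do not, so this prints 1/3 for sticking and 2/3 for switching.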

Statistical evidence shows that contestants who switch from their initial choice win about 2 out of every 3 times. This is what I wanted to code up and demonstrate with Python. The table below demonstrates the results I collected over several runs with the Python code I developed.


I've included the code snippet below, which you can paste into Python or save as a .py file and execute.
I'd appreciate any feedback, whether the post was helpful to you or you have improvements to suggest.

<CODE>
import random

def setPrize():
    # Hide the prize (1) behind one random door; the other two stay empty (0).
    tmp=[0,0,0]
    door=random.randrange(0,3)
    tmp[door] = 1
    return tmp
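# finalDoorz[0] is the door the player first chose, finalDoorz[1] is the door
# left after the host opens a losing one; 1 marks the prize.
finalDoorz = [0,0]
n=100

# Float counters so the divisions at the end give fractions under Python 2 as well.
switchnwin=0.0
switchnlose=0.0
sticknwin=0.0
sticknlose=0.0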

for i in range(0,n):
    prize = setPrize()
    #Player chooses their first door
    choice = random.randint(0,2)
   

    #We put the player's choice in position 0 and the other remaining door in position 1
    if prize[choice]==1:
        finalDoorz[0]=1
        finalDoorz[1]=0
    else:
        finalDoorz[0]=0
        finalDoorz[1]=1


    #Second part; stick or switch; 0 stays with the player's choice, 1 switches it.
    choiceb= random.randint(0,1)

    if finalDoorz[choiceb]==1:
        status="WIN"
    else:
        status="LOSE"

    if choiceb==0:
        switched="NO"
    else:
        switched="YES"

    if finalDoorz[choiceb]==1 and choiceb==0:
        sticknwin +=1
    elif finalDoorz[choiceb]==1 and choiceb==1:
        switchnwin +=1
    elif finalDoorz[choiceb]==0 and choiceb==1:
        switchnlose +=1
    elif finalDoorz[choiceb]==0 and choiceb==0:
        sticknlose +=1

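# Each figure is a fraction of all n games, so the four values add up to 1.
# Parenthesized print calls run under both Python 2 and Python 3.
print("Number of Observations: " + str(n))
print("Switch and Win: " + str(switchnwin/n))
print("Stick and Win: " + str(sticknwin/n))
print("Switch and Lose: " + str(switchnlose/n))
print("Stick and Lose: " + str(sticknlose/n))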
</CODE>
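One thing to watch when reading the output: each counter in the script above is divided by n, the total number of games, and the player switches in only about half of them. So "Switch and Win" should settle near 1/3 (half the games times a 2/3 win rate) rather than 2/3. The sketch below is my own variation, not the original script, and divides by the number of games played under each strategy instead. The names `simulate` and `seed` are assumptions for illustration.

```python
import random

def simulate(n, seed=None):
    """Return (stick_win_rate, switch_win_rate) over n random games."""
    rng = random.Random(seed)
    wins = {"stick": 0, "switch": 0}
    games = {"stick": 0, "switch": 0}
    for _ in range(n):
        prize = rng.randrange(3)
        choice = rng.randrange(3)
        strategy = rng.choice(["stick", "switch"])
        games[strategy] += 1
        first_pick_correct = (choice == prize)
        # Sticking wins when the first pick was right; switching wins when it
        # was wrong, because the host has already opened the other empty door.
        if first_pick_correct == (strategy == "stick"):
            wins[strategy] += 1
    # Guard against a strategy that was never drawn (only plausible for tiny n).
    return tuple(
        wins[s] / games[s] if games[s] else float("nan")
        for s in ("stick", "switch")
    )

stick_rate, switch_rate = simulate(100000, seed=1)
print("Win rate when sticking:  %.3f" % stick_rate)
print("Win rate when switching: %.3f" % switch_rate)
```

With a large n the two rates should come out close to 1/3 and 2/3, which is the "switchers win 2 out of 3" result stated directly.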

