This is a pretty well known problem. (First problem the professor in my undergrad algorithms class ever gave to us)
SPOILER
p = probability of heads
1 - p = probability of tails.
flip the coin twice:
head head = p^2
head tail/tail head = 2p(1-p)
tail tail = (1 - p)^2
Notice the two in the middle there. We can just say head tail = Heads and Tail Heads = Tails. If we get HH or PP, just do it again.
Ok, now call the above procedure, our "randomization algorithm"
we can calculate the expected running time -- our algorithm succeeds with probability 2p(1-p). The number of trials we need to run our algorithm for is a geometric random variable. Thus, the expected running time is 1/(2p(1-p)).
thanks for writing this. the formula used in the webpage, (HH+TT+HT)/TH, appears to be off by one. i calculated it myself from the series, i.e. sum(n=0 to inf, 2(n+1)2pq(p^2+q^2)^n), and through the algebra i work it out to the same as yours (doubled).
i'm not sure how the author's formula was derived, but i get the same result calculating the series for a slightly different (but wrong) algorithm: just keep flipping until HT or TH appears (not flipping by pairs). this series is sum(n=1 to inf, (n+1)(qp^n + pq^n)), i.e. sum(n=0 to inf, (n+1)(qp^n + pq^n)) - (p+q). the series works out to 1/(p(1-p)), but that last term would account for the one-off difference.
Good point. I am going off of the original algo by von Neumann, which throws away both flips if you get HH or TT. There are many others, and I don't see any reason against your optimization, if the flips are truly independent.
yeah, indeed -- but i think the optimization is actually broken. if you are just waiting for an HT or a TH in a sequence of flips, then you will just wind up with H..HT or T..TH, so that the very first flip decides the outcome. this is how i first (much too hastily) read the algorithm, but, somewhat surprisingly, found that the expectation matched exactly.
SPOILER
p = probability of heads
1 - p = probability of tails.
flip the coin twice:
head head = p^2
head tail/tail head = 2p(1-p)
tail tail = (1 - p)^2
Notice the two in the middle there. We can just say head tail = Heads and Tail Heads = Tails. If we get HH or PP, just do it again.
Ok, now call the above procedure, our "randomization algorithm"
we can calculate the expected running time -- our algorithm succeeds with probability 2p(1-p). The number of trials we need to run our algorithm for is a geometric random variable. Thus, the expected running time is 1/(2p(1-p)).