How many coin flips until you get a head?

Posted by Diego Assencio on 2014.05.21 under Mathematics (Statistics and probability)

Suppose you have a coin which, when flipped, yields a head with probability $p$ and a tail with probability $q = 1-p$ (if the coin is fair, then $p = q = 1/2$). How many times do you need to flip this coin on average until you get a head?

To answer this question, consider the probability $P(n)$ of obtaining a head only on the $n$-th coin flip. In other words, $P(n)$ is the probability of obtaining $n-1$ consecutive tails and then a head: $$ P(n) = q^{n-1} p $$ The average number of coin flips we need until we get a head is then: $$ E_p[n] = \sum_{n=1}^\infty nP(n) = \sum_{n=1}^\infty nq^{n-1} p \label{post_6c1cfa313064329046317358d2aa22c0_mean_n} $$ All we need to do now is to compute the infinite sum on the right-hand side of equation \eqref{post_6c1cfa313064329046317358d2aa22c0_mean_n}. To achieve that goal, consider the following sequence: $$ e_k = \sum_{n=1}^k nq^{n-1} p = p\sum_{n=1}^k nq^{n-1} \label{post_6c1cfa313064329046317358d2aa22c0_ek} $$ The sequence $e_k$ is such that $E_p[n] = \lim_{k \rightarrow \infty} e_k$. It is closely related to the geometric series. In fact, consider the sequence $g_k$ which is obtained by taking the first $k$ terms of the geometric series as its $k$-th element (below $z$ is any real number of our choice): $$ g_k = \sum_{n=0}^k z^n = \displaystyle\frac{1 - z^{k+1}}{1 - z} $$ Taking the derivative of $g_k$ with respect to $z$ yields: $$ \displaystyle\frac{d g_k}{dz} = \sum_{n=1}^k nz^{n-1} = -\frac{(k+1)z^k}{1 - z} + \frac{1 - z^{k+1}}{(1 - z)^2} \label{post_6c1cfa313064329046317358d2aa22c0_dgk_dz} $$ where the term with $n = 0$ was omitted since it is identically zero. Comparing equations \eqref{post_6c1cfa313064329046317358d2aa22c0_ek} and \eqref{post_6c1cfa313064329046317358d2aa22c0_dgk_dz}, we see that: $$ e_k = p\sum_{n=1}^k nq^{n-1} = p\displaystyle\frac{d g_k}{dz}\Bigg|_{z = q} $$ In other words, $e_k$ is equal to $p$ times the derivative of $g_k$ with respect to $z$ computed at $z = q$: $$ e_k = p\left(-\frac{(k+1)q^k}{p} + \frac{1 - q^{k+1}}{p^2}\right) = -(k+1)q^k + \frac{1 - q^{k+1}}{p} \label{post_6c1cfa313064329046317358d2aa22c0_ek2} $$ If $p = 0$, we will never get a head and the problem is therefore of little interest. For $p \gt 0$, $q \lt 1$ and therefore $q^k \rightarrow 0$ as $k \rightarrow \infty$. Using this fact on equation \eqref{post_6c1cfa313064329046317358d2aa22c0_ek2}, we obtain: $$ \boxed{ E_p[n] = \lim_{k\rightarrow \infty} e_k = \displaystyle\frac{1}{p} } $$ Interestingly, the answer for this problem is extremely simple: the average number of times we need to flip the coin until we get a head is just $1/p$. If $p = 1$, every coin flip results in a head, so $E_{p=1}[n] = 1$. As $p$ approaches $0$, the coin becomes more and more biased towards tails, so the average number of times we need to flip it until we get a head increases and, not surprisingly, diverges as $p \rightarrow 0$. Since for a fair coin we have $p = 1/2$, then: $$ \boxed{ E_{p=1/2}[n] = 2 } $$ so we need on average to flip the coin only twice to get a head.


Julien on Sep 02, 2016:
There's a simpler way to do this. Call W the wait time for a head. After one flip, we hit heads with probablity p, and we go back to our initial state with probability q. Thus


W &= 1 + p*0 + q*W\\

W &=1 + qW\\

W - qW &=1\\

Diego Assencio on Sep 02, 2016:
@Julien: Thank you very much for your comment! I had already seen this approach before, but to be honest, I do not find it easier to understand than the one presented here (especially for a beginner in this topic).

The way I see it, defining $P(n)$ as above is something that anyone with basic knowledge of probability theory can understand, and the definition of $E_p(n)$ is, well, just the well-known definition of the mean value of $n$. The rest is just legwork :-)
Julien on Sep 03, 2016:
Thank you for the reply Diego. In your method, the reader must understand the following:

1. How to compute an infinite sum
2. How to take a limit
3. How to take a derivative
4. How to either know the derivation of or take at face value the definition of expectation
5. How to know the geometric series evaluation

That's a lot more than "basic knowledge of probability theory"! In fact, I'd challenge you on this: You implicitly use the formula for the sum of the geometric series, so your method must be at least as complicated as that. In my method, the proof essentially is identical to the proof for the sum of the geometric series. Here's another way to write my method:

W &= 1 + q(1 + q(1 + q( ...\\
W &= 1 + q + q^2 + q^3 + ...\\
Wq &= q + q^2 + q^3 + ...\\
W - Wq &= 1 \\
W(1 - q) &= 1\\
W &= 1/p\\

That's essentially the proof of the sum of geometric series, right? So isn't your method a strict superset in complexity of my method?
Diego Assencio on Sep 03, 2016:
@Julien: You're right. I underestimated the complexity of the "legwork" on my derivation. I also fixed your comment -- hope you don't mind :-)

I still don't see how equation (10) on your first comment (or equation (16) on the second one) is a trivial fact which can be used as a starting point on the derivation. I can only see it is correct using equations (1) and (2):

$\begin{eqnarray}\displaystyle E_p[n] &=& \sum_{n=1}^\infty nq^{n-1} p
\\[5pt] & = & p + \sum_{n=2}^\infty nq^{n-1}p
\\[5pt] & = & p + \sum_{n=1}^\infty (n+1)q^{n}p
\\[5pt] & = & p + \sum_{n=1}^\infty nq^{n}p + \sum_{n=1}^\infty q^{n}p
\\[5pt] & = & p + q\sum_{n=1}^\infty nq^{n-1} p + \sum_{n=1}^\infty q^{n} p
\\[5pt] & = & p + qE_p[n] + \left(\frac{1}{1-q} - 1\right)p
\\[5pt] & = & p + qE_p[n] + 1 - p
\\[5pt] & = & qE_p[n] + 1

Now $E_p[n] = 1/p$ follows trivially. In any case, this is indeed easier than the proof I presented above, so thanks for that :-)
Ilya on Apr 16, 2018:
For more drama, one can count the average number of shots in the game of Russian roulette before it ends. p=1/N, where N is the number of chambers in the revolver, and the average number of shots would be N.
Yvo Menezes on Jun 03, 2021:
Could you give more details on why the term $(k + 1)q^k \rightarrow 0 \space \text{when} \space k \rightarrow \infty \space \text{and} \space 0 \le q \le 1 $? Isn't it indeterminate? Thank you very muck for this article.
Diego Assencio on Jun 06, 2021:
@Yvo: The only interesting values for $q$ are within the range $0 \lt q \lt 1$ because $q = 0$ means every coin toss always yields a head while $q = 1$ means every coin toss always yields a tail.

For $0 \lt q \lt 1$, the fact that $kq^k \rightarrow 0$ as $k \rightarrow \infty$ is a well-known result from Calculus. From that, it follows that $(k + 1)q^k \rightarrow 0$ as $k \rightarrow \infty$ as well.