Lemon Candy and Dynamic Programming

Over at TheDailyWtf, hidden among some comments was an interesting dynamic programming problem:

Consider this problem:

George bought a sack of 100 pieces of candy at the store. 90 of the pieces are lemon flavored and ten are cherry flavored. Of the two, George prefers the lemon flavored candies.

Every day George randomly picks a piece of candy out of the bag. If it is lemon flavored, he eats it and puts the bag away for the next day.

But if the candy he chose is cherry flavored, he puts it back in the bag and then randomly picks a candy out of the bag and eats it regardless of the flavor. In other words, he’ll only put a piece of candy back at most once per day.

What are the odds that when one piece of candy remains, it will be lemon flavored?

I posed the problem at a company where I used to work. All but one person tried to do it recursively. The remaining person tried to do it using an Excel spreadsheet!!!

Maybe I (and some of the other posters on that thread) are morons, but Excel (or in my case, OpenOffice) seemed like a fine way to solve it, so I did.

spreadsheet (excel format) | spreadsheet (openoffice format) — note that the work was done in OpenOffice Calc, and there is no guarantee that the Excel sheet will work identically.

Eating A Cherry

Note that we can only lose a cherry if we draw a cherry, then replace, then draw a cherry again. Since this is drawing with replacement, it’s just the square of the odds of picking a cherry at any point. The probability of eating a Cherry at any point is given by:

 P(Cherry)^2  =  (C/(C+L))^2

This is calculated for all points the “lose cherry” sheet.

Eating A Lemon

To lose a lemon, either we:

  1. draw a lemon then eat it
  2. draw a cherry, then draw a lemon.

A way to simplify this down is to view this a “not taking a cherry” ( ^Cherry, or !Cherry -> 1 – Cherry, since we eat either a Lemon, or Cherry, and by the law of total probability, those must sum to one). This calculation is show in the “lose lemon” sheet.

Combining the information…

Suppose C,L are the number of cherry and lemon candies remaining at some point after the first round. The configuration space for the previous round had to therefore be or , since we had to eat a candy to get to . Thus, the probability of getting to C,L (for all C<10 and L<90) is:

    P(C+1,L) * P(eating a cherry)  +  P(C,L+1) * P(eating a lemon)

We also know that P(10,90) = 1. The “combined” sheet shows the “dynamic” properly conditional probabilities for all points in the space. Note that the first row and column have special treatment. Another way of handling this would have been to have special “0” rows and columns in the “lose cherry” and “lose lemon” sheets.


“What are the odds that when one piece of candy remains, it will be lemon flavored?”

Looking at the <1,0> (L91 – 1 Lemon) and <0,1> (K92 – 1 cherry) cells, we see that they have probability of 0.08192 and 0.91818 . Dividing these to get odds yields ~ .089:1 odds for, or 11.2222:1 against.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s