Lemon Candy and Dynamic ProgrammingPosted: December 22, 2008
Consider this problem:
George bought a sack of 100 pieces of candy at the store. 90 of the pieces are lemon flavored and ten are cherry flavored. Of the two, George prefers the lemon flavored candies.
Every day George randomly picks a piece of candy out of the bag. If it is lemon flavored, he eats it and puts the bag away for the next day.
But if the candy he chose is cherry flavored, he puts it back in the bag and then randomly picks a candy out of the bag and eats it regardless of the flavor. In other words, he’ll only put a piece of candy back at most once per day.
What are the odds that when one piece of candy remains, it will be lemon flavored?
I posed the problem at a company where I used to work. All but one person tried to do it recursively. The remaining person tried to do it using an Excel spreadsheet!!!
Maybe I (and some of the other posters on that thread) are morons, but Excel (or in my case, OpenOffice) seemed like a fine way to solve it, so I did.
Eating A Cherry
Note that we can only lose a cherry if we draw a cherry, then replace, then draw a cherry again. Since this is drawing with replacement, it’s just the square of the odds of picking a cherry at any point. The probability of eating a Cherry at any point is given by:
P(Cherry)^2 = (C/(C+L))^2
This is calculated for all points the “lose cherry” sheet.
Eating A Lemon
To lose a lemon, either we:
- draw a lemon then eat it
- draw a cherry, then draw a lemon.
A way to simplify this down is to view this a “not taking a cherry” ( ^Cherry, or !Cherry -> 1 – Cherry, since we eat either a Lemon, or Cherry, and by the law of total probability, those must sum to one). This calculation is show in the “lose lemon” sheet.
Combining the information…
Suppose C,L are the number of cherry and lemon candies remaining at some point after the first round. The configuration space for the previous round had to therefore be or , since we had to eat a candy to get to . Thus, the probability of getting to C,L (for all C<10 and L<90) is:
P(C+1,L) * P(eating a cherry) + P(C,L+1) * P(eating a lemon)
We also know that P(10,90) = 1. The “combined” sheet shows the “dynamic” properly conditional probabilities for all points in the space. Note that the first row and column have special treatment. Another way of handling this would have been to have special “0” rows and columns in the “lose cherry” and “lose lemon” sheets.
“What are the odds that when one piece of candy remains, it will be lemon flavored?”
Looking at the <1,0> (L91 – 1 Lemon) and <0,1> (K92 – 1 cherry) cells, we see that they have probability of 0.08192 and 0.91818 . Dividing these to get odds yields ~ .089:1 odds for, or 11.2222:1 against.