Please help me with my problem ... I worry about the following type of
random distribution:
Supposing a million people, without reference to each other's choices,
each choose a 'random' number between 1 and 1,000,000.
a percent of numbers will be chosen 0 times
b percent of numbers will be chosen 1 time
c percent of numbers will be chosen 2 times
etc.
How can I find a, b, c, d etc., and what is the formula governing
these values if the ratio between guessers and numbers is varied?
Many thanks if you can help.
Whew!
OK -- we're going to start with a _vastly_ simplified version of this problem. This will let us examine the mechanics of this problem without getting buried in hopelessly large numbers of calculations . . .
=== Five people randomly choosing 1, 2, 3, 4, or 5. ===
First of all, there are 5 * 5 * 5 * 5 * 5 different possible ways of the five numbers being chosen. (I'm going to stick with permutations, rather than combinations, so that we can keep track of the individual people.) That's 3125 total outcomes.
Of those -- some of them have no repeats. The permutation would look like 5 * 4 * 3 * 2 * 1 = 120 ways. So, from this possibility, we get that this distribution . . .
0% of numbers will be chosen 0 times
100% of numbers will be chosen 1 time
0% of numbers will be chosen 2 times
0% of numbers will be chosen 3 times
0% of numbers will be chosen 4 times
0% of numbers will be chosen 5 times
. . . has a 120 / 3125 probability (0.0384) of happening.
Now -- what if one number is chosen twice? We'd calculate the number of possibilities by multiplying 5 * _1_ * 4 * 3 * 2, times one more factor of 10. The "1" that I underlined is because the second person needs to choose the same number that the first person picked. (Then the other three people need to choose new numbers.) The extra factor of 10 is because it doesn't have to be the first two people who chose the same number -- and there are 10 ways that you can select two people from the group of five. (That's 5 C 2 for you mathematicians!) Anyway -- that's a total of 1200 ways. So, from _this_ possibility, we get that this distribution . . .
20% of numbers will be chosen 0 times
60% of numbers will be chosen 1 time
20% of numbers will be chosen 2 times
0% of numbers will be chosen 3 times
0% of numbers will be chosen 4 times
0% of numbers will be chosen 5 times
. . . has a 1200 / 3125 probability (0.384) of happening.
Now -- what if we get "three of a kind"? We'd calculate the number of possibilities by multiplying 5 * 1 * 1 * 4 * 3, times a factor of 10. The extra factor of 10 in _this_ case is because there are 10 ways that you can select _three_ people to choose the same number from the group of five. (5 C 2 equals 5 C 3 -- you mathematicians know why!) Anyway -- that's a total of 600 ways. So, from _this_ possibility, we get that this distribution . . .
40% of numbers will be chosen 0 times
40% of numbers will be chosen 1 time
0% of numbers will be chosen 2 times
20% of numbers will be chosen 3 times
0% of numbers will be chosen 4 times
0% of numbers will be chosen 5 times
. . . has a 600 / 3125 probability (0.192) of happening.
_Four_ of a kind?
5 * 1 * 1 * 1 * 4, times an extra factor of 5. (5 C 4 and all . . .)
Total of 100 ways. So . . .
60% of numbers will be chosen 0 times
20% of numbers will be chosen 1 time
0% of numbers will be chosen 2 times
0% of numbers will be chosen 3 times
20% of numbers will be chosen 4 times
0% of numbers will be chosen 5 times
. . . has a 100 / 3125 probability (0.032) of happening.
_FIVE_ of a kind?!?
5 * 1 * 1 * 1 * 1 = 5 ways. (All 1s, all 2s, all 3s, all 4s, or all 5s . . .)
80% of numbers will be chosen 0 times
0% of numbers will be chosen 1 time
0% of numbers will be chosen 2 times
0% of numbers will be chosen 3 times
0% of numbers will be chosen 4 times
20% of numbers will be chosen 5 times
. . . has a 5 / 3125 probability (0.0016) of happening.
Now -- there are still a couple that we haven't talked about. What about _two_ pair?
5 * 1 * 4 * 1 * 3, times an extra factor of 10, times _another_ extra factor of 3, _and_ divided by 2.
=== What?!? ===
Well -- the first extra factor is because there's 10 ways of picking two people to choose the first pair.
The _second_ extra factor is because there's 3 ways of picking two _more_ people for the _second_ pair from the three people who are left after the first pair was picked.
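The dividing by two takes care of double counting. Picking Alice and Bob as the "first" pair and Carol and Dave as the "second" pair gives exactly the same outcomes as picking Carol and Dave first and Alice and Bob second, so every two-pair outcome has been counted twice . . .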
(Got all that?)
Anyway -- 900 ways.
40% of numbers will be chosen 0 times
20% of numbers will be chosen 1 time
40% of numbers will be chosen 2 times
0% of numbers will be chosen 3 times
0% of numbers will be chosen 4 times
0% of numbers will be chosen 5 times
. . . has a 900 / 3125 probability (0.288) of happening.
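The two-pair count is the easiest one to get wrong, so here is a minimal Python sketch that checks it two ways: once with the multiplication described above, and once by walking through every possible outcome. The constant names (`PEOPLE`, `NUMBERS`) and the "sorted repeat pattern" trick are my own choices for illustration, not part of the original argument.

```python
from collections import Counter
from itertools import product
from math import comb

PEOPLE = 5   # assumed: five people, as in the simplified problem
NUMBERS = 5  # assumed: each picks one of 1..5

# Multiplication from the text: 5 * 4 * 3 ways to hand out the numbers,
# C(5,2) * C(3,2) ways to pick the two pairs of people, halved because
# the order in which the two pairs are picked does not matter.
formula_count = 5 * 4 * 3 * comb(5, 2) * comb(3, 2) // 2

# Brute force: visit all 5**5 outcomes and keep those where the tallies,
# once sorted, read [1, 2, 2] (one loner plus two pairs).
brute_count = sum(
    1
    for picks in product(range(1, NUMBERS + 1), repeat=PEOPLE)
    if sorted(Counter(picks).values()) == [1, 2, 2]
)

print(formula_count, brute_count, NUMBERS ** PEOPLE)
```

Both counts should agree, which is a useful sanity check before trusting the same style of multiplication on bigger cases.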
Almost there. One more possibility. Full house . . .
5 * 1 * 1 * 4 * 1, times an extra factor of 10. (Ten ways to choose the three people who picked the three of a kind. After they've been chosen, there are only two people left to pick the pair, so no extra factors are needed for that.)
200 ways.
60% of numbers will be chosen 0 times
0% of numbers will be chosen 1 time
20% of numbers will be chosen 2 times
20% of numbers will be chosen 3 times
0% of numbers will be chosen 4 times
0% of numbers will be chosen 5 times
. . . has a 200 / 3125 probability (0.064) of happening.
Now notice -- if you add up all of the ways (120 + 1200 + 600 + 100 + 5 + 900 + 200 = 3125), or all of the probabilities (which sum to 1), you'll see that we have, in fact, accounted for every possibility.
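For anyone who would rather let a computer do the bookkeeping, the following sketch enumerates all 3125 outcomes and groups them by repeat pattern, so that (2, 1, 1, 1) is "one pair", (3, 2) is "full house", and so on. The pattern encoding and variable names are assumptions made for this illustration only.

```python
from collections import Counter
from itertools import product

PEOPLE = 5   # assumed: five people
NUMBERS = 5  # assumed: numbers 1..5

pattern_counts = Counter()
for picks in product(range(1, NUMBERS + 1), repeat=PEOPLE):
    # e.g. picks (3, 3, 1, 5, 3) -> tallies {3: 3, 1: 1, 5: 1}
    #      -> pattern (3, 1, 1), i.e. three of a kind
    pattern = tuple(sorted(Counter(picks).values(), reverse=True))
    pattern_counts[pattern] += 1

total = NUMBERS ** PEOPLE
for pattern, ways in sorted(pattern_counts.items(), key=lambda kv: -kv[1]):
    print(pattern, ways, ways / total)
```

Seven patterns come out, one per case worked through above, and their counts add up to the full 3125.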
But we're not done. What you really wanted to know was the _expected value_ for this problem. So we now need to take each _possibility_ and weight it by its _probability_. Put simply, take each little table from this answer, multiply the percentage in each row by the probability for the entire table, and then add up the matching rows across all seven tables. (I won't take you through the process in painstaking detail -- here are the results . . .)
Over the long run . . .
32.768% of numbers will be chosen 0 times
40.96% of numbers will be chosen 1 time
20.48% of numbers will be chosen 2 times
5.12% of numbers will be chosen 3 times
0.64% of numbers will be chosen 4 times
0.032% of numbers will be chosen 5 times
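The weighted averaging can also be done by brute force, which is handy for trying other small cases. The sketch below is only an illustration: `expected_shares` and `binomial_share` are hypothetical helper names, and the closed-form cross-check (a binomial count with success chance 1/numbers) is my addition rather than something derived in this answer.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def expected_shares(people, numbers):
    """Long-run share of numbers chosen exactly k times, for k = 0..people."""
    totals = [0] * (people + 1)
    for picks in product(range(numbers), repeat=people):
        hits = Counter(picks)
        for number in range(numbers):
            totals[hits[number]] += 1  # a Counter reports 0 for unchosen numbers
    slots = numbers ** people * numbers  # outcomes times numbers per outcome
    return [Fraction(t, slots) for t in totals]


def binomial_share(people, numbers, k):
    """Closed-form cross-check (an assumption added here, not from the answer)."""
    p = Fraction(1, numbers)
    return comb(people, k) * p**k * (1 - p) ** (people - k)


shares = expected_shares(5, 5)
for k, share in enumerate(shares):
    print(f"{float(share) * 100:.3f}% of numbers chosen {k} times")

# The enumeration and the closed form should agree exactly.
assert shares == [binomial_share(5, 5, k) for k in range(6)]
```

Calling `expected_shares(4, 7)` handles four people choosing from seven numbers. The enumeration visits `numbers ** people` outcomes, so it is only practical for small cases; the million-person version is far out of reach for this approach.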
Now -- that's the general idea. If you'd like to use this background information to get started on your million-person, million-number problem -- great! I, on the other hand, am going to stop here.
However -- if you'd like me to come back and try this out for varying numbers (e.g. four people choosing from seven numbers), add an additional comment to your question, and I'll come back and add an edit to this answer . . .
:-)