## Random values problem

For the discussion of math. Duh.

Moderators: gmalivuk, Moderators General, Prelates

### Random values problem

Say I pick a random number from 0-N, and do this over and over.

How can I find out when I'll be sure to have picked every number from {1,2,3... N}?

Thoughts: It is impossible to be 100% sure that Ive picked every number, but I can determine a 99% or 95% confidence interval if I run a simulation of picking numbers and treat it as a large sample discrete distributions...

I'm looking for the correct way to go about solving this.
FatTony52

Posts: 1
Joined: Sat Jul 28, 2007 5:48 pm UTC

I assume you mean you're picking random integers from 0-N, as the reals are dense.

The first thing that comes to mind is the first time you pick an integer, you have an N/N chance of picking a new one, the second time you have an (N-1)/N...So the expected number of attempts is the sum of the inverses of these fractions: N/N + N/(N-1) + N/(N-2)...= (sum i from 0 to N-1) N/(N-i) = N^2 * ((sum i from 0 to N-1) 1/(N-i)) (forgive my notation). Of course, this doesn't answer your question, it just gives you the expected number of times you have to do it. I'm pretty sure that there's no way to be 100% sure short of picking an infinite number of times, but using standard deviations, you could figure out how many times you need to do it to come as close as you want.

...Hm. That didn't help much. Oh well. *posts*

Blatm

Posts: 638
Joined: Mon Jun 04, 2007 1:43 am UTC

Right. There's certainly no way to be certain that you'll see every coupon after a certain number of steps, since for any finite k it's possible (albeit increasingly unlikely) that you'll just get k copies of the same coupon in the first k steps.

On the other hand, the problem of determining the expected value of how long this would take is called the Coupon Collector's Problem. Shockingly, neither Wolfram nor Wikipedia has anything more than a cursory statement of the problem, but if memory serves me right, it comes out to roughly N * ln(N) as the expected number of tries to get all N coupons. I don't know what the standard deviation is, however, so I'm not sure how you'd calculate the 95% or 99% confidence thresholds.

I'm sure you should be able to find some resources out there, though, now that you've got the name of the problem.
Buttons

Posts: 858
Joined: Wed May 02, 2007 3:27 pm UTC
Location: Somerville