## Random xkcd comic problem

For the discussion of math. Duh.

Moderators: gmalivuk, Moderators General, Prelates

### Random xkcd comic problem

I'm sure it's been thought of before, but I can't find it. Here's the question:

If you were to start out on xkcd.com, and press the random button (assuming true randomness) to see different comics, how many times on average would I have to press it to see every single comic? I need to give the problem a second thought, and I'm betting it's not too difficult to figure out, but in the meantime I want to see what others come up with.
thehodapp

Posts: 2
Joined: Thu Feb 02, 2012 10:39 am UTC

### Re: Random xkcd comic problem

At first glance this looks like the birthday problem, where the number of "days" is the number of comics accessible from the "Random" button, and you want to know how many independent samples you'd need before you expect the same one twice.

Only kink I can think of is that maybe "Random" won't give you the same comic twice in a row, which ever so slightly changes the probabilities from complete independence.
In the future, there will be a global network of billions of adding machines.... One of the primary uses of this network will be to transport moving pictures of lesbian sex by pretending they are made out of numbers.
Spoiler:
gmss1 gmss2

gmalivuk
Archduke Vendredi of Skellington the Third, Esquire

Posts: 20294
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

### Re: Random xkcd comic problem

Yes, that definitely looks like it. I have a working program that simulates the problem, and when I get the chance I'll put up some graphs. I'd like to see a mathematical model for this problem though, so I'll page my way through that Wikipedia link and see if I can figure it out.
thehodapp

Posts: 2
Joined: Thu Feb 02, 2012 10:39 am UTC

### Re: Random xkcd comic problem

What's your stopping criterion: (1) you stop when you've seen every comic at least once, or (2) you stop when there's at least one comic you've seen twice?

(2) is the birthday problem; (1) is the coupon collector problem.
skullturf

Posts: 532
Joined: Thu Dec 07, 2006 8:37 pm UTC
Location: Delaware

### Re: Random xkcd comic problem

Ah, yeah, I misread the OP. The problem described there is definitely coupon collector, rather than birthday.
In the future, there will be a global network of billions of adding machines.... One of the primary uses of this network will be to transport moving pictures of lesbian sex by pretending they are made out of numbers.
Spoiler:
gmss1 gmss2

gmalivuk
Archduke Vendredi of Skellington the Third, Esquire

Posts: 20294
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

### Re: Random xkcd comic problem

The way it would work is this: assume that you have n comics, it only takes 1 try to find a comic you've never read before if you haven't started yet, but then the average amount of time to find another (different) comic is n/(n-1), as 1 comic has already been read. Now to read the xth comic (x<n) you need n/(n-x) as there are n comics, but only n-x comics that you haven't read yet. This works out to n*Hn, where Hn is the nth harmonic number (harmonic numbers are the sum of the first few reciprocals of natural numbers). If you approximate this you get E(n)=n*(ln(n)+y)+0.5, where y is the euler-macheroni constant.
I have discovered a truly marvelous proof of this, which this margin is too narrow to contain.
tomtom2357

Posts: 555
Joined: Tue Jul 27, 2010 8:48 am UTC

### Re: Random xkcd comic problem

skullturf wrote:What's your stopping criterion: (1) you stop when you've seen every comic at least once, or (2) you stop when there's at least one comic you've seen twice?

(2) is the birthday problem; (1) is the coupon collector problem.

I interpreted it as (1) and tried messing around with it for awhile. I was worried when I saw the first response mentioning the birthday problem since I was like "wtf, have I just been being stupid?" since I had spent a decent amount of time poking around the problem trying get a feel for it before looking at the responses. Based on the wiki page it's a fairly deep problem as far as really characterizing what is going on with the stopping time, but the expected value and variance are trivial to get.
theodds

Posts: 23
Joined: Sat Jun 04, 2011 3:30 pm UTC

### Re: Random xkcd comic problem

This was a fun problem. For completeness, as of this weekend, there are 1011 xkcd comics.

Thus, the expected number of clicks to see all 1011 is, on average, 1011*H_1011 = 1011*7.4964 whose closest integer is 7579.

So, about 7579 clicks to see all xkcd comics. This is suboptimal when compared with say, the previous button.
dissonant

Posts: 63
Joined: Sat Jan 24, 2009 10:33 am UTC

### Re: Random xkcd comic problem

A complication: the xkcd random-comic script keeps track of the last 200 comics its randomly chosen, and won't pick one of those.
While no one overhear you quickly tell me not cow cow.

phlip
Restorer of Worlds

Posts: 6958
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia

### Re: Random xkcd comic problem

Does that really change anything?

The first 200 give you unique comics. Then you have (n-200)/((n-200)-x) for each comic x>=200, from tomtom2357s solution
Bomaz

Posts: 22
Joined: Wed Dec 26, 2007 7:06 pm UTC

### Re: Random xkcd comic problem

Bomaz wrote:Does that really change anything?

The first 200 give you unique comics. Then you have (n-200)/((n-200)-x) for each comic x>=200, from tomtom2357s solution

Not quite. keeping the "last 200" implies that it doesn't stay the same. That is, the oldest comic on the list gets booted out when you
hit random.
http://en.wikipedia.org/wiki/DSV_Alvin#Sinking wrote:Researchers found a cheese sandwich which exhibited no visible signs of decomposition, and was in fact eaten.
Sagekilla

Posts: 385
Joined: Fri Aug 21, 2009 1:02 am UTC
Location: Long Island, NY

### Re: Random xkcd comic problem

Sure, but the 200 in the list are, by definition, ones you've seen before, and they'll be distinct from each other. So there are always (n-200) comics it could choose, of which (n-x) are the ones you're yet to see. So you'll expect (n-200)/(n-x) clicks to see a new comic when you've already seen x of them (for x >= 200). (Note not ((n-200)-x) in the denominator, as that 200 is already part of the value of x.)
While no one overhear you quickly tell me not cow cow.

phlip
Restorer of Worlds

Posts: 6958
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia