Probabilities question (d20 vs. 20 coins)

For the discussion of math. Duh.

Moderators: gmalivuk, Moderators General, Prelates

Rijor
Posts: 1
Joined: Fri Feb 01, 2013 7:05 am UTC

Probabilities question (d20 vs. 20 coins)

Postby Rijor » Fri Feb 01, 2013 7:13 am UTC

So, I was trying to figure out the probability of rolling 1d20 and getting a perfect roll vs. flipping 20 coins and getting 20 out of 20 heads.
At first I was thinking that the two would have the same probability (most likely because I haven't taken a math or statistics class in a while) but then I realized, they really wouldn't be.

The D20 is obvious, 1:20 to get a perfect 20. But what about the coins? I haven't taken math or statistics in a long time, and have forgotten this stuff, but how would I find that probability?

User avatar
gmalivuk
GNU Terry Pratchett
Posts: 26439
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There
Contact:

Re: Probabilities question (d20 vs. 20 coins)

Postby gmalivuk » Fri Feb 01, 2013 10:05 pm UTC

The results differ by a factor of about 50,000.

To get 20 fair coins to all come up heads, the first one needs to be heads (1/2 chance), the second needs to be heads (1/2 chance), ..., and the 20th has to be heads (1/2 chance). The coins are all independent, so to find the probability that all these things happen at once, we multiply, to get a total probability of 1/2^20, or 1 in 1,048,576.
Unless stated otherwise, I do not care whether a statement, by itself, constitutes a persuasive political argument. I care whether it's true.
---
If this post has math that doesn't work for you, use TeX the World for Firefox or Chrome

(he/him/his)

User avatar
PM 2Ring
Posts: 3646
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

Re: Probabilities question (d20 vs. 20 coins)

Postby PM 2Ring » Sat Feb 02, 2013 4:45 am UTC

FWIW, it's possible to flip coins to generate a random event with a probability of 1/20 but the procedure isn't guaranteed to terminate after a finite number of flips because 20 isn't a perfect power of 2.

Here's one way to do it with 8 coins.
Spoiler:
The 8th row of Pascal's triangle is
1 8 28 56 70 56 28 8 1, which sums to 28 = 256

We want to find a combination of coin tosses that has p = 1/20
1 / 20 = 10 / 200 = (1 + 1 + 8) / (256 - 56)

So we toss the 8 coins. If we get (say) 5H3T (5 heads and 3 tails), that toss is discarded and we re-toss all 8 coins until we get any other combination apart from 5H3T.
If we get 8H, or 7H1T, or 8T, then we've "won" with p = 1/20.
With any other combination, we've lost with p = (8+2*28+56+70) / 200 = 19 / 20.

User avatar
snowyowl
Posts: 464
Joined: Tue Jun 23, 2009 7:36 pm UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby snowyowl » Mon Feb 04, 2013 3:59 pm UTC

Simpler: Toss 5 coins, in order. Interpret them as a number in binary, every head representing a 1 and every tails representing a 0. This gives you a number between 0 and 31. If it is not actually between 1 and 20, re-roll. Otherwise you're done: you have an answer with the same probability distribution as the roll of a d20.
The preceding comment is an automated response.

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3058
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby Qaanol » Mon Feb 04, 2013 8:00 pm UTC

PM 2Ring wrote:FWIW, it's possible to flip coins to generate a random event with a probability of 1/20 but the procedure isn't guaranteed to terminate after a finite number of flips because 20 isn't a perfect power of 2.

Here's one way to do it with 8 coins.
Spoiler:
The 8th row of Pascal's triangle is
1 8 28 56 70 56 28 8 1, which sums to 28 = 256

We want to find a combination of coin tosses that has p = 1/20
1 / 20 = 10 / 200 = (1 + 1 + 8) / (256 - 56)

So we toss the 8 coins. If we get (say) 5H3T (5 heads and 3 tails), that toss is discarded and we re-toss all 8 coins until we get any other combination apart from 5H3T.
If we get 8H, or 7H1T, or 8T, then we've "won" with p = 1/20.
With any other combination, we've lost with p = (8+2*28+56+70) / 200 = 19 / 20.

That is extremely clever, setting it up so just a single entry of Pascal’s triangle represents the “unacceptable” portion. What are the limitations of this method?

Given a fraction 0 ≤ q ≤ 1, does there always exist a row n of Pascal’s triangle such that a subset of the numbers on that row add up to q·2n (meaning, a q’th of the total for that row)? Can we put a meaningful bound on the number of terms required to include/exclude?

snowyowl wrote:Simpler: Toss 5 coins, in order. Interpret them as a number in binary, every head representing a 1 and every tails representing a 0. This gives you a number between 0 and 31. If it is not actually between 1 and 20, re-roll. Otherwise you're done: you have an answer with the same probability distribution as the roll of a d20.

That works, but using 6 coins converges much faster (and 10 even faster than that, and 14 faster still, etc.) There’s also a way to make use of the information from the “missed” flips to lower the expected number of flips even further.
wee free kings

User avatar
Sizik
Posts: 1212
Joined: Wed Aug 27, 2008 3:48 am UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby Sizik » Mon Feb 04, 2013 11:04 pm UTC

I suppose you could keep flipping coins until the last 5 flips form a number 20 or less.
gmalivuk wrote:
King Author wrote:If space (rather, distance) is an illusion, it'd be possible for one meta-me to experience both body's sensory inputs.
Yes. And if wishes were horses, wishing wells would fill up very quickly with drowned horses.

User avatar
dudiobugtron
Posts: 1098
Joined: Mon Jul 30, 2012 9:14 am UTC
Location: The Outlier

Re: Probabilities question (d20 vs. 20 coins)

Postby dudiobugtron » Mon Feb 04, 2013 11:24 pm UTC

Sizik wrote:I suppose you could keep flipping coins until the last 5 flips form a number 20 or less.

That doesn't work, sadly. Some patterns are more common than others using this method, so you wouldn't get an even probability distribution.

Check out this thread for more discussion:
viewtopic.php?f=3&t=492
Image

User avatar
PM 2Ring
Posts: 3646
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

Re: Probabilities question (d20 vs. 20 coins)

Postby PM 2Ring » Tue Feb 05, 2013 6:01 am UTC

snowyowl wrote:Simpler: Toss 5 coins, in order. Interpret them as a number in binary, every head representing a 1 and every tails representing a 0. This gives you a number between 0 and 31. If it is not actually between 1 and 20, re-roll. Otherwise you're done: you have an answer with the same probability distribution as the roll of a d20.

That method's good if you actually want to emulate a D20 with coins. But as Qaanol said, it'd be better to use a larger number of coins and divide your answer down, so you aren't re-tossing so much, eg with 6 coins you only need to re-toss 4 permutations.

I still prefer my method if you're only interested in generating 1 in 20 events, since you don't need to toss the coins in order (or alternatively, use distinguishable coins), and my method's easier to implement by non-mathematicians, since no binary arithmetic is required. :) But I guess you could just write powers of two on the heads side of each coin.


Qaanol wrote:That is extremely clever, setting it up so just a single entry of Pascal’s triangle represents the “unacceptable” portion.


Thanks!
I was looking at Pascal's triangle, preparing to write a program to do a brute-force search, when that solution just popped out at me. So I didn't bother with the program. :)

Qaanol wrote:What are the limitations of this method?

Given a fraction 0 ≤ q ≤ 1, does there always exist a row n of Pascal’s triangle such that a subset of the numbers on that row add up to q·2n (meaning, a q’th of the total for that row)?

Well, that's not going to work if q isn't a binary fraction. :)

With my method we want to find n and two disjoint subsets P and R of Rown, such that sum(P)/(2n - sum(R)) = q. We want to know if such an n always exists, and if multiple solutions exist we want to find the values of n that minimize the number of elements in R.

I don't know how to attack that analytically, so I might try a bit of brute-force searching and see what patterns I can spot. I'm hoping that something related to Fermat's Little Theorem (or should I say, the Fermat–Euler theorem) will pop up, since we can always find an infinite number of solutions n that satisfy k | (2n - 1) for any odd k.

tomtom2357
Posts: 563
Joined: Tue Jul 27, 2010 8:48 am UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby tomtom2357 » Thu Feb 07, 2013 7:47 am UTC

Wouldn't you rather have the way that does it in the least average amount of flips?
I have discovered a truly marvelous proof of this, which this margin is too narrow to contain.

User avatar
4=5
Posts: 2073
Joined: Sat Apr 28, 2007 3:02 am UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby 4=5 » Thu Feb 07, 2013 8:35 pm UTC

well the average number of flips for 5 coins would be
5 + 5*12/32 + 5*(12/32)^2 ...

and since the limit of 1 + 1/n + 1/n^2 ... is n/(n-1)
12/32 => ( 32/12) /(32/12 - 12/12)
32/12 / 20/12
32/12 * 12/20
32/20
so the average number of coin flips to simulate a d20 with 5 coins is 5 * 32/20

so for an arbitrary number of coins greater than 5 it'll take an average of n*(2^n)/20 flips
For 5 coins we expect 8 flips which also happens to be the least flips we can expect for a whole number of coins

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3058
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby Qaanol » Thu Feb 07, 2013 11:10 pm UTC

4=5 wrote:well the average number of flips for 5 coins would be
5 + 5*12/32 + 5*(12/32)^2 ...

and since the limit of 1 + 1/n + 1/n^2 ... is n/(n-1)
12/32 => ( 32/12) /(32/12 - 12/12)
32/12 / 20/12
32/12 * 12/20
32/20
so the average number of coin flips to simulate a d20 with 5 coins is 5 * 32/20

so for an arbitrary number of coins greater than 5 it'll take an average of n*(2^n)/20 flips
For 5 coins we expect 8 flips which also happens to be the least flips we can expect for a whole number of coins

With 6 coins, using the simple “mod 20” method, discarding values above 60, the expected number of flips is 6.4

We can lower that slightly by making use of the 4 left-over values. (Of course, with 5 coins, we can lower the EV of 8 substantially.)

For 6 coins we can do the following:
Spoiler:
Flip 6 coins. If the result is in the range 0-59, reduce it mod 20 and we’re done. (Add 1 if desired to get 1-20 rather than 0-19.)

Otherwise, we had a remainder in the range 0-3, call it X. There are four equally likely values for X, so we can divide the range 0-19 into four equal parts and use X to pick which one we’re in. There are 5 values in each of those quarters, so we just need to pick a number from 0-4 now.

Flip 4 coins until the result is not 15. So it is in the range 0-14. Reduce it mod 5 and call it Y. We are done, and our output is 5X+Y+1.


The expected number of flips is 6 + (4/64)*(4)*(16/15) = 6+4/15 = 6.2666…

This is the same EV as if we flipped 2 coins to begin, used that to divide 20 by 4, then flipped 4 coins repeatedly until we didn’t get 15, to decide among the remaining 5 options.

Edit: I believe the best possible strategy has EV of 5.6 flips to pick a number from 1 to 20 uniformly at random.

Spoiler:
i) Flip 2 coins, call the result X, and use it as the 5s digit for the answer written in base 5. Now we need to find the units digit, which means picking from 5 options uniformly at random.

ii) Flip 3 coins. If the result is 1-5, use it as-is and we’re done.

Otherwise, there are three possible remainders. Use them.

iii) Flip 1 coin. This has two possibilities, which combined with the three possible remainders gives 6 options, label them 1-6. If we are in an option 1-5, use it as-is and we’re done. Otherwise GOTO step ii.


Calculation of the EV for that method:

Spoiler:
We have 4 states:
“A” is the start
“B” is when we need to pick from 5 options and have nothing
“C” is when we need to pick from 5 options and have 3 options “floating”
“D” is the end

We’ll use those letters without quotes to mean the expected number of flips to get from the given state to “D”:
A = 2 + B (from “A” we flip 2 coins and end up in “B”)
B = 3 + (3/8)C + (5/8)D (from “B” we flip 3 coins and with probability 3/8 go to “C”, the rest of the time we reach “D”)
C = 1 + (1/6)B + (5/6)D (from “C” we only flip 1 coin)
D = 0

The rest is linear algebra, showing (A, B, C, D) = (5.6, 3.6, 1.6, 0)


I don’t have a proof of optimality yet, but my gut says we’re using all available information as well as possible at each coin flip.
wee free kings

User avatar
PM 2Ring
Posts: 3646
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Mid north coast, NSW, Australia

Re: Probabilities question (d20 vs. 20 coins)

Postby PM 2Ring » Sun Feb 10, 2013 11:02 am UTC

PM 2Ring wrote:With my method we want to find n and two disjoint subsets P and R of Rown, such that sum(P)/(2n - sum(R)) = q. We want to know if such an n always exists, and if multiple solutions exist we want to find the values of n that minimize the number of elements in R.

I don't know how to attack that analytically, so I might try a bit of brute-force searching and see what patterns I can spot.


tomtom2357 wrote:Wouldn't you rather have the way that does it in the least average amount of flips?


Yeah, I guess so. Which means I need to minimize the sum of the elements in R.
(EDIT: To be more precise, I need to minimize 2n/(2n - r), where r is the sum of the elements in R.)

Anyway, I have some preliminary results. I haven't bothered to do any minimization as yet. I've simply been generating the subsets, making fractions from them, then counting the total number of fractions that can be formed for any number of coins up to (and including) a given n. I also check how well they cover the space by seeing what's the highest Farey sequence is a subset of the set of generated fractions.

The first column is n, the number of coins tossed. The second column is the count of all possible fractions that can be generated by up to n coins (0 and 1 are not included in these counts). The third column is the number of the highest Farey sequence contained in the current pool of fractions.

Code: Select all

 1:  1, 2
 2:  5, 4
 3:  17, 5
 4:  49, 8
 5:  121, 9
 6:  417, 12
 7:  941, 19
 8:  2633, 21
 9:  8101, 22
10:  23299, 26
11:  47251, 34
12:  127039, 39
13:  252219, 39
14:  556953, 60
15:  2079193, 76


The numbers seem to be growing nicely, although I suppose that the lack of increase in the Farey number for 13 coins is a little concerning, but it seems to recover nicely with n=14. Overall, these results do give me some confidence that any fraction can be generated by my method.

FWIW, these sequences aren't in the OEIS.
Last edited by PM 2Ring on Mon Feb 11, 2013 12:47 am UTC, edited 1 time in total.

Combinatorics123
Posts: 113
Joined: Thu Oct 25, 2012 10:45 pm UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby Combinatorics123 » Sun Feb 10, 2013 7:40 pm UTC

I used tossing 5 coins (values from 1 to 32) 20 times mod 20 vs d20 (randbetween(1,20))
Here is what I obtained after 998 trials

coins.gif

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3058
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby Qaanol » Mon Feb 11, 2013 1:46 am UTC

I’ve calculated the EV to simulated dice of up to 214=16,384 sides with coins, using what I believe is the most efficient method (keep a running total of how-many-sided of a die you are currently simulating, which doubles with each coin flip, until you equal or exceed the value n you’re going for. At that point, reduce mod n if you’re within a multiple of n, otherwise let the number of remainder options be your new running total and continue.)

Here are plots in standard and semilog coordinates, showing the EV number of flips to simulate each die:

2_14s.png
2_14s.png (6.71 KiB) Viewed 1834 times

2_14 Semilog.png
2_14 Semilog.png (7.81 KiB) Viewed 1834 times


My program is actually a bit more general, and can find the EV number of rolls to simulate an n-sided die with a k-sided die. The plots look similar, with the interesting features at powers of k.

The next generalization would be how to simulate an n-sided die using a set of dice with {k1, k2, …, km} sides. Even for the case of m=2, I do not immediately see a straightforward way to determine when it is most advantageous to roll which die.

Edit: My first guess would be, find out how many rolls of the bigger die it takes to have the number of options equal or exceed n (so, ceil(logk_b(n)),) then see how many of those kb’s you can replace with ka’s without dropping below n. That way you get the smallest remainder possible without wasting an extra roll.

With 3 or more k-values though, you still have the same goal (get a product of k’s as close to n without going under as you can, using exactly ceil(logk_max(n)) dice) but I haven’t put in the time to try to find a good approach.

Edit 2: What we really want is, with ceil(logk_max(n)) rolls, for the total number of options T (the product of the number of sides on the die for each roll) to minimize the quantity (T mod n) / n.
Last edited by Qaanol on Tue Feb 12, 2013 9:25 pm UTC, edited 1 time in total.
wee free kings

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3058
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Probabilities question (d20 vs. 20 coins)

Postby Qaanol » Tue Feb 12, 2013 6:28 pm UTC

I updated my code from Matlab to C, and it runs a lot faster now.

The command-line syntax is “./<binary name> <max-n> <k>”, which outputs a list of the EV number of rolls to simulate a die of each number of sides from 1 through n, using a k-sided die. You can send the output to a file if you like.

I still have not commented the code much, so it may not be obvious what’s going on. Essentially, we are solving a really simple linear system. The things we care about are, “In state j, how many flips will it take to exceed n options, what will the remainder mod n be for the following state, and what are the odds of going to that next state (rather than being done.)” We go until we get a loop back to a state we already saw.

You can visualize this as a whole bunch of linear equations, aj = ej + pj*aj+1 and then the very last one being aj_max = ej_max + pj_maxaL, where each e is the exponent (number of flips) required to exceed n options, and L is the state we loop back to. We want to find a1, the EV number of flips from the first state.

Also, since we can’t readily calculate how many distinct powers of m there are mod n, but we know there are less than n of them, I decided to preallocate arrays long enough to hold N-1 entries, where N is the maximum N we’re interested in. Then, in looping through n from 2 through N, the same arrays get reused each time. That’s why I pass pointers to my helper function, so it can use the arrays.

If you modify the code by commenting out the section between the blank comment lines (with the clock stuff) and uncomment the two line thereafter, then it will just find EV for the specific n and k that are input, rather than all n up to the first arg.

There is no error checking, and you can easily break this code with bad input. Both command-line args should be integers 2 or higher.

main.c

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "diesim.h"

int main (int argc, const char * argv[]) {
   unsigned int n = atoi(argv[1]);   // Sides on die to simulate
   unsigned int k = atoi(argv[2]);   // Sides on die to roll
   
   double *prob = malloc((n-1)*sizeof(double));
   unsigned int *expon = malloc((n-1)*sizeof(int));
   unsigned int *value = malloc((n-1)*sizeof(int));
   unsigned int *locs = malloc((n-1)*sizeof(int));
   unsigned int *deja = calloc((n-1), sizeof(int));
   double val;
   
   //
   clock_t t = clock();
   printf("n\tk = %d\n1\t%f\n", k, 1.0);
   for (unsigned int ii = 2; ii <= n; ii = ii + 1) {
      val = diesim(ii, k, prob, expon, value, locs, deja);
      printf("%d\t%f\n", ii, val);
   }
   t = clock() - t;
   printf("Time elapsed = %f seconds\n", (float)t/CLOCKS_PER_SEC);
   //
   
//   double val = diesim(n, k, prob, expon, value, locs, deja);
//   printf("%f\n", val);
   
    return 0;
}


diesim.h

Code: Select all

#ifndef __QN_DIE_SIM
#define __QN_DIE_SIM

double diesim(unsigned int const n, unsigned int const k, double * const prob,
           unsigned int * const expon, unsigned int * const value,
           unsigned int * const locs, unsigned int * const deja);

#endif


diesim.c

Code: Select all

#include "diesim.h"

double diesim(unsigned int const n, unsigned int const k, double * const prob,
           unsigned int * const expon, unsigned int * const value,
           unsigned int * const locs, unsigned int * const deja)
{
   unsigned int nextv = 1;
   unsigned int power = 0;
   unsigned int pos = n;
   unsigned int ii;
   double val;
   
   for (ii = 0; ii < n-1; ii = ii + 1) {
      locs[nextv-1] = ii;
      deja[nextv-1] = n;
      expon[ii] = 1;
      value[ii] = nextv;
      power = k*nextv;
      while (power < n) {
         expon[ii] = expon[ii] + 1;
         power = k*power;
      }
      nextv = power % n;
      prob[ii] = (double)nextv / (double)power;
      if (nextv == 0) {
         break;
      } else if (deja[nextv-1] == n) {
         pos = locs[nextv-1];
         break;
      }
   }
   unsigned int d = ii;
   if (d == n-1) {
      d = d - 1;
   }
   
   val = (double)expon[ii];
   if (pos == d) {
      val = val / (1.0 - prob[pos]);
   } else if (pos == n) {
      pos = d;
   } else {
      double pt = prob[d];
      for (ii = d-1; ii > pos; ii = ii - 1) {
         pt = pt * prob[ii];
         val = val * prob[ii] + (double)expon[ii];
      }
      val = ((double)expon[pos] + val*prob[pos]) / (1.0 - pt*prob[pos]);
   }
   for (ii = pos-1; ii != -1; ii = ii - 1) {
      val = (double)expon[ii] + val*prob[ii];
   }
   
   return val;
}
wee free kings


Return to “Mathematics”

Who is online

Users browsing this forum: No registered users and 5 guests