"Random access" PRNG

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

"Random access" PRNG

Postby Nexuapex » Mon Jul 14, 2008 7:19 pm UTC

I've been poking around PRNGs, and am trying to find one that allows random access without a (relatively) time-consuming reseeding process. For example, the Mersenne twister algorithms I've seen regenerate a 624-word internal state every time the seed changes, which is not something you want to do every time you generate a number. The reason I don't want an internal state running around is, say, generating a random bitmap of extreme size. It would be convenient to store only the seed plus any changes the user makes to the bitmap, instead of the entire state of the picture (especially if many parts will never change), and it would be extremely useful in case the bitmap doesn't actually fit in memory (i.e. it is infinite). But you wouldn't want to reseed the PRNG constantly as the user scrolls around (as you'd need to reseed the generator one time for each visible row or column per frame, if the visible rect is moving around).

Basically, my question is if anyone knows about a PRNG that generates good random (preferably 64-bit) integers from an arbitrary and variable-sized seed (say an array of integers that might add or remove elements between calls) in O(n) relative to the number of elements changed in the array (instead of the total number of elements). For the above example, it would be a variable-sized array of integers as a seed, with the last two elements being the x and y coordinates of the pixel to be generated. Or something that can be used to help with the above example, if my thoughts are leading me in the wrong direction.

Any thoughts, or corrections as to my line of thinking for this?
“It’s my estimation that every man ever got a statue made of him was one kind of a son of a bitch or another.”
User avatar
Nexuapex
 
Posts: 22
Joined: Thu Sep 06, 2007 3:01 am UTC
Location: Northwest Iowa

Re: "Random access" PRNG

Postby Puck » Mon Jul 14, 2008 10:50 pm UTC

I'm very confused as to why you want to reseed the random number generator at all. Unless I'm misreading your post, you should probably just use one seed and then keep asking the RNG for the next random number.

Can you explain in more detail why you think reseeding is needed?
22/7 wrote:If I could have an alternate horn that would yell "If you use your turn signal, I'll let you in" loud enough to hear inside another car, I would pay nearly any amount of money for it.
Puck
 
Posts: 615
Joined: Tue Nov 27, 2007 7:29 pm UTC

Re: "Random access" PRNG

Postby headprogrammingczar » Mon Jul 14, 2008 11:21 pm UTC

I would do a method similar to one-time pad encryption. Take a huge-normous file (i.e. your OS's hibernate file) and compress it, then encrypt it. Then make whatever algorithm you want to do random access lookup on the file. As for multi-variable parametrization, you can use matrix operations to flatten the arguments into an integer. It would be nice to know what it is for, and whatever it happens to be for, come up with a solution yourself before implementing any suggestions here.
<quintopia> You're not crazy. you're the goddamn headprogrammingspock!
<Weeks> You're the goddamn headprogrammingspock!
<Cheese> I love you
User avatar
headprogrammingczar
 
Posts: 2953
Joined: Mon Oct 22, 2007 5:28 pm UTC
Location: Beaming you up

Re: "Random access" PRNG

Postby Xanthir » Mon Jul 14, 2008 11:49 pm UTC

headprogrammingczar wrote:I would do a method similar to one-time pad encryption. Take a huge-normous file (i.e. your OS's hibernate file) and compress it, then encrypt it. Then make whatever algorithm you want to do random access lookup on the file. As for multi-variable parametrization, you can use matrix operations to flatten the arguments into an integer. It would be nice to know what it is for, and whatever it happens to be for, come up with a solution yourself before implementing any suggestions here.


The method used to do random-access on the file will either be/contain a useful PRNG by itself (thus removing the need to carry around a huge file and do lookups on it), or will be/contain a poor PRNG (thus making it useless, as it won't do properly random lookups on the document).

All your suggestion does is remap one set of numbers to another set of predetermined random numbers. The target set is already random - there's no need to throw more randomness at it (and in fact, you're likely to reduce the randomness). It's exactly equivalent and much easier to just eat the random file you've generated from start to finish - it'll generate a sequence of similar or greater randomness.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))
User avatar
Xanthir
My HERO!!!
 
Posts: 3989
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex

Re: "Random access" PRNG

Postby Nexuapex » Tue Jul 15, 2008 1:02 am UTC

I can't use a machine-specific file, the seed must be machine-independent.

Some more detail: imagine the pseudocode that would draw a portion of this generated image:

Code: Select all
int stored_seed = get_stored_seed();
int x, y;
for (y = origin_y; y < viewport_height; y++) {
   srand(stored_seed ^ y);
   for (x = 0; x < origin_x; x++) {
      rand();
   }
   for (x = origin_x; x < viewport_width; x++) {
      plot(x - origin_x, y - origin_y, rand() % TONE_RANGE);
   }
}

To start my random number generation at any arbitrary (origin_x, origin_y) and get consistent results, I need to call srand() once per row and rand() constantly to shift to where my row begins. I could also do this:

Code: Select all
int stored_seed = get_stored_seed();
int x, y;
for (y = origin_y; y < viewport_height; y++) {
   for (x = origin_x; x < viewport_width; x++) {
      srand(stored_seed ^ x ^ y);
      plot(x - origin_x, y - origin_y, rand() % TONE_RANGE);
   }
}

That involves an srand() for every rand(). By that point I'm worried.

What I'm looking for, I guess, is basically a hash function for an array that is statistically random. I'd like to be able to do this, or something like it:

Code: Select all
size_t size;
int * stored_seeds = get_stored_seeds(&size);
rand_buf * state = new_rand(stored_seeds, size);
int coords[2];
for (coords[0] = origin_y; coords[0] < viewport_height; coords[0]++) {
   for (coords[1] = origin_x; coords[1] < viewport_width; coords[1]++) {
      plot(x - origin_x, y - origin_y, get_rand(state, coords, sizeof(coords)) % TONE_RANGE);
   }
}
“It’s my estimation that every man ever got a statue made of him was one kind of a son of a bitch or another.”
User avatar
Nexuapex
 
Posts: 22
Joined: Thu Sep 06, 2007 3:01 am UTC
Location: Northwest Iowa

Re: "Random access" PRNG

Postby Berengal » Tue Jul 15, 2008 8:14 am UTC

So to rephrase, you want to be able to find the n-th term of a prng without calculating all terms <n?

I think I remember reading something about this, but I can't remember exactly what it was about or where. I'll try looking for it again.
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.
User avatar
Berengal
Superabacus Mystic of the First Rank
 
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway

Re: "Random access" PRNG

Postby segmentation fault » Tue Jul 15, 2008 5:37 pm UTC

poll the sound card. its always generating random white noise.
people are like LDL cholesterol for the internet
User avatar
segmentation fault
 
Posts: 1770
Joined: Wed Dec 05, 2007 4:10 pm UTC
Location: Nu Jersey

Re: "Random access" PRNG

Postby Berengal » Tue Jul 15, 2008 7:47 pm UTC

segmentation fault wrote:poll the sound card. its always generating random white noise.

Good idea, though you might get problems if you want the same term twice. If that's the case then you'd have to store each retrieved value, or generate everything at once and store it in a file as previously suggested. Or use a prng where you can calculate an arbitrary term.

Edit: Come to think of it, could you use a hash? I'm not quite up to date on which hash would make the best random numbers, but you should get consistent return values from identical input.
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.
User avatar
Berengal
Superabacus Mystic of the First Rank
 
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway

Re: "Random access" PRNG

Postby Nexuapex » Thu Jul 17, 2008 4:32 pm UTC

Berengal wrote:So to rephrase, you want to be able to find the n-th term of a prng without calculating all terms <n?

Yep.

I could use a hash, I'm betting most hashes would do statistical randomness pretty well.

...but finding a hash function for an array that didn't turn {1} into a hash of 1, and {2} into a hash of 2, etc. might be a problem. I wouldn't know where to start looking.
“It’s my estimation that every man ever got a statue made of him was one kind of a son of a bitch or another.”
User avatar
Nexuapex
 
Posts: 22
Joined: Thu Sep 06, 2007 3:01 am UTC
Location: Northwest Iowa

Re: "Random access" PRNG

Postby Berengal » Thu Jul 17, 2008 7:25 pm UTC

What about md5 or sha-1? Being cryptographically secure, they should provide output indistinguishable from true randomness. In theory that is, and for your use as well.
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.
User avatar
Berengal
Superabacus Mystic of the First Rank
 
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway

Re: "Random access" PRNG

Postby Nexuapex » Wed Jul 30, 2008 1:18 am UTC

Well, MD5 and SHA-1 probably won't work straight off the bat, as they aren't really designed for this use (doesn't MD5 need to pad the input to a multiple of 256 bytes or something like that?). But I'll look into those, they're very close to what I want.
“It’s my estimation that every man ever got a statue made of him was one kind of a son of a bitch or another.”
User avatar
Nexuapex
 
Posts: 22
Joined: Thu Sep 06, 2007 3:01 am UTC
Location: Northwest Iowa

Re: "Random access" PRNG

Postby Berengal » Wed Jul 30, 2008 4:08 am UTC

Padding shouldn't be that huge an issue. Just repeat the pattern until you get above the closest multiple of 256 (or whatever it is) and cut the rest of. It might provide the same hash a couple of times too much at first (with 5, 55, 555 etc. all giving the same one), but if you don't need really random numbers, just ones that are random enough, it should work fine (or you could start counting at 1000).
It is practically impossible to teach good programming to students who are motivated by money: As potential programmers they are mentally mutilated beyond hope of regeneration.
User avatar
Berengal
Superabacus Mystic of the First Rank
 
Posts: 2707
Joined: Thu May 24, 2007 5:51 am UTC
Location: Bergen, Norway

Re: "Random access" PRNG

Postby Unparallelogram » Wed Jul 30, 2008 4:53 am UTC

Why not let your (i,j)-th pseudo-random number be hash(seed + i + j) or something, + being concatenation? Cryptographic hashes give very good pseudo-randomness, but of course simpler hashes are cheaper computationally if that matters to you.
Unparallelogram
 
Posts: 338
Joined: Mon Jul 28, 2008 4:16 am UTC

Re: "Random access" PRNG

Postby Nexuapex » Thu Jul 31, 2008 10:44 pm UTC

Unparallelogram wrote:Why not let your (i,j)-th pseudo-random number be hash(seed + i + j) or something, + being concatenation? Cryptographic hashes give very good pseudo-randomness, but of course simpler hashes are cheaper computationally if that matters to you.

It does matter somewhat.

Is there a better site than Wikipedia to go for information about hashing methods, cryptographic and otherwise?

Edit: Actually, anyone know of a cryptographic-ish hash that I could do hash(hash(seed, i), j) with, instead of hash(seed + i + j)?
“It’s my estimation that every man ever got a statue made of him was one kind of a son of a bitch or another.”
User avatar
Nexuapex
 
Posts: 22
Joined: Thu Sep 06, 2007 3:01 am UTC
Location: Northwest Iowa

Re: "Random access" PRNG

Postby Yakk » Fri Aug 01, 2008 3:20 pm UTC

Build "zones" of, say, ~1024 units (32x32). Your "world" becomes a bunch of patches of that size.

Hash the location of the zone into a seed for your random number generator.

Build each 32x32 patch by calling rand 1024 times in a particular order.

This requires 1 hashing and 1 seeding every ~1000 calls to rand. For a given pixel, it requires an average of 512 calls to rand -- however, odds are when examining an image is that you work on a reasonably contiguous region.

If you want a much much much larger region (ie, bigger than your hash function), you can do something like the above in a quad tree (or oct tree) hierarchy.

If this is for, example, a game, you could add an "extra dimension" for independent dimensions of the terrain.

(I believe SC2, among many other games, uses a technique somewhat like this to efficiently generate a rather large amount of data consistently.)
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
 
Posts: 10038
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: "Random access" PRNG

Postby petschge » Tue Aug 05, 2008 6:47 am UTC

You can use the BBS generator. It is named after the inventors Blum, Blum and Shub. Please note that the algorithm return pseudorandom bits, not bytes. Now for a short description:

* Generate two primes p, q of sufficient size which satisfy p \equiv 3 \pmod{4} and q \equiv 3 \pmod{4}.
* Calculate n = p * q
* Pick a seed s which is smaller than n and coprime to n
* Calculate x_0 = s^2 \pmod{n}
* Iterate x_i = (x_{i-1})^2 \pmod{n} and use the least significant bit of x_i

Now if you want to get x_j directly compute x_j = x_0 ^{(2^j mod ((p - 1)*(q-1)))}

I hope this helps a little.
petschge
 
Posts: 1
Joined: Tue Aug 05, 2008 6:26 am UTC

Re: "Random access" PRNG

Postby Cosmologicon » Tue Aug 05, 2008 11:42 pm UTC

petschge wrote:Now if you want to get x_j directly compute x_j = x_0 ^{(2^j mod ((p - 1)*(q-1)))}

But calculating 2j mod whatever is an O(j) operation, isn't it?
User avatar
Cosmologicon
 
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA

Re: "Random access" PRNG

Postby Yakk » Wed Aug 06, 2008 4:35 am UTC

Calculating 2^j mod n takes about lg(j) time, using a naive algorithm.

You could probably do even better.

(Let x_0 = 1 and x_{i+1} = {x_i}^2 for i >= 0. Each squaring takes about O(1) time in a specific modulus.

Now, 2^j = Product(i from X)( x_i ) for some set X with |X| <= lg(j), simply by expressing j as a binary number, and multiplying together the appropriate x_i.

This requires you to calculate lg(j) x_i terms, then multiply together some subset of them. Overall: O(lg(j)) time.

As noted, I'm not convinced that this is the fastest algorithm to solve this problem.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
 
Posts: 10038
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: "Random access" PRNG

Postby Cosmologicon » Wed Aug 06, 2008 4:51 am UTC

Yep good call. I don't know what I was thinking, probably computing x^(2^j). But that's not needed here.
User avatar
Cosmologicon
 
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA

Re: "Random access" PRNG

Postby Yakk » Wed Aug 06, 2008 3:54 pm UTC

Modular arithmetic is really fast. Even x^(2^j) mod k can be done in O(lg(j)) time if you can factor k (factoring k is sufficient, but not required).

Simply calculate 2^j mod phi(k), then calculate x^[ 2^j mod phi(k) ], which takes O(lg(k)) time.

At this level, I should be admitting that operations on numbers up to size k isn't constant time. Which slows things down. :-)

(As an example, multiplying together two lg(k) digit numbers takes various amounts of time depending on how sophisticated you want to be. The more sophisticated, the worse the constant factor, until you eventually do the multiplication in the frequency domain.)
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
 
Posts: 10038
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: "Random access" PRNG

Postby godrik » Sun Aug 10, 2008 3:27 pm UTC

(sorry for the inconvienience, formulas are written in latex. I believe the explanation are still easy to read.)


I would suggest to consider the problem of generating a one dimensional array of $n$ element of a finite field $F$ of size $p$.
Let us consider $P_0$ and $P_1$ two PRNG that maps an element of $F$ to another one.
Let $k$ be the element of the array you want to generate, we define $A_i (k) = P_0$ if the $i$-th bit of $k$ is $0$ an $A_i (k) = P_1$ if the $i$-th bit of $k$ is $1$.
An array is identified by a single element of $F$, namely the seed $s$.
We generate $k$ in $\log_2 (n)$ step by applying iteratively $A_i$ on the seed.

For instance, for $k=00101$, the element is $P_0 (P_0 (P_1 (P_0 (P_1 (s)))))$.
Using such a technique you can generate any of the array in $O(log_2 (n))$ call to a PRNG storing only $1$ element of $F$.

I believe this is one of the most reasonable answer to the problem. However, i am not quite sure of the property of such a generator. One should take care on the choice of the functions $P_0$ and $P_1$, they should not respect the property $P_0 (P_1 (x)) = P_1 (P_0 (x))$ since this would mean that $k=011001$ and $k=110001$ have the same value.
Using two exponentiation such as in RSA would lead to this problem. I believe Linear Congruential Generators (LCG) does not respect such a property and could be a valid choice. Perhaps it would even be better to use a LCG and an exponentiation technique.

To speed up the computations of the value, one could cache the result of the inner call to $A_i$.
godrik
 
Posts: 23
Joined: Wed Aug 06, 2008 10:37 am UTC


Return to Computer Science

Who is online

Users browsing this forum: Bing [Bot], Google Feedfetcher and 3 guests