I have completed my preliminary analysis and posted it

here. I shall recreate the post below. I'm eager to hear suggestions and corrections!

Post:

I am NOT a statastician, but I do have a strong math background. If

you notice any errors in my reasoning or methodology, please tell me.

After busting out the stats book and talking to some of the smart folk

at the xkcd message boards, I developed a rudimentary test for

comparing the effectiveness of scramblers. I have since conducted a

few experiments to compare the WCA megaminx scrambler and a scrambler

which behaves a little differently. (Also tested a standard 3x3x3

scrambler). I will explain the methodology first, so if you just want

the results, skip to the end.

The test I conducted was the simplest I could imagine, I applied

scrambles to the puzzle and kept track of how often each color landed

in each position. The assumption: Given a scramble algorithm

generator that generates an n move scrambling algorithms, if we take a

large number of solved puzzles and apply those algorithms, each

position should end up with each color a roughly equal number of times.

That is if an n move scramble from a generator truly approximates

random, all colors should have equal probability of showing up in all

locations. Essentially this is a test for how flat the distribution

of colors over locations is.

Methodology: Apply many scrambles to a cube and track how often each

color shows up in each position. The standard deviation of this list

should tell us approximately how well the scrambles approximate true

random.

So a list would look like this for 10 scrambles on a cube:

Position | R O G B Y W

0 | 3 2 1 2 2 0

1 | 4 1 4 0 0 1

....

47 | 1 1 2 1 3 2

Ignore the position column and take the standard deviation of the

rest. Centers are ignored as well since they are always the same

after scrambling. On the megaminx, after a scramble was applied, the

puzzle was reoriented so that the same center color was on top and the

same was on the front face every time, this prevents puzzle rotations

skewing the results.

The expected standard deviations which are used as a baseline were NOT

calculated directly, as I could not figure out how. Instead, they

were taken from an average of very long scramble algorithms which were

presumed to be random based on the data at hand. This bugs me, so if

you can figure out how to directly calculate the expected stdDev,

please let me know.

On to the results:

Each of these cases is conducted with 1 million (10^6) trials. Turns

is the number of non-puzzle rotating turns (except in the WCA megaminx

scrambler case, where puzzle rotations are counted). StdDev is the

observed standard deviation. Expected std dev is the standard

deviation which a truly random scramble generator should give.

Confidence is the % confidence that this scrambler / length

combination approximates true random in the long term (essentially the

ratio of expected stdDev to observed.) The standard practice I

believe is to take 95% confidence as the threshold for rejecting the

null hypotheses that sigma = s. Thus if the confidence is less than

95%, by convention we can assume the scramble is not truly random. In

practice, I would be willing to go as low as 93% or so. The custom

3x3x3 and megaminx scramblers are the same as from the excel sheet

generator I posted a few days ago.

3x3x3 Cube, generic scrambler (avoids redundant turns, etc):

Code: Select all

`Turns | Std Dev | Expected Std Dev | Confidence`

1 |264953.60| 360 | 0.1%

10 | 26320.07| 360 | 1.4%

20 | 4274.47| 360 | 8.4%

30 | 794.28| 360 | 45.3%

35 | 514.75| 360 | 69.9%

40 | 387.39| 360 | 92.9%

45 | 347.40| 360 | 96.5%

50 | 341.04| 360 | 94.7%

75 | 350.58| 360 | 97.4%

100 | 372.78| 360 | 96.6%

500 | 347.09| 360 | 96.4%

1000 | 355.14| 360 | 98.7%

When a range for the length was used, the results were pretty

miserable. I have been converted to an advocate for fixed length

scrambles. when allowed to range from 30-40 moves, the CI was only 73.1%!

Based on these results my scrambles for practice will always be at

least 40 moves, and I'd rather use 45.

On to the megaminx. It should be noted that the entirety of the

testing I did was in Java, and as such I had to modify the code from

the wca scrambler to java code. I'm not 100% familiar with

javascript, but I do think that my conversion is accurate. All code

used in this project is available on request.

Megaminx:

WCA Scrambler

Code: Select all

`Turns | Std Dev | Expected Std Dev | Confidence`

10 |108483.94| 279 | 0.3%

50 | 21986.84| 279 | 0.9%

70 | 12559.95| 279 | 1.7% *official scramble length

100 | 5546.20| 279 | 5.0%

150 | 1451.23| 279 | 19.2%

200 | 469.90| 279 | 59.4%

210 | 398.26| 279 | 70.1%

220 | 351.96| 279 | 79.3%

230 | 312.99| 279 | 89.2%

240 | 307.22| 279 | 90.9%

250 | 294.32| 279 | 94.8%

260 | 285.37| 279 | 97.8%

270 | 280.47| 279 | 99.5%

300 | 278.30| 279 | 99.7%

400 | 275.66| 279 | 98.7%

500 | 269.56| 279 | 96.6%

1000 | 264.62| 279 | 94.8%

This is the heart of why I'm posting. Based on this analysis the

official wca scrambler does not appear to be random until you reach

250 turns or more. And the official scramble length of 70 turns gives

a feeble 1.7% confidence of randomness.

Now lets examine a different scramble generator the big difference is

that puzzle rotations are randomly interspersed and do not count

against the move count. So a 10 move scramble may in fact be 15 moves

long. As such, for a scramble of length n, there will be at least

n/10 more non-puzzle rotating moves than in an n move wca scramble.

So an 80 move scramble here should only be compared to 88 or more move

wca scrambles. Also 1/5 twists are allowed as opposed to just 2/5:

Code: Select all

`Turns | Std Dev | Expected Std Dev | Confidence`

10 | 91955.36| 279 | 0.3%

50 | 5529.65| 279 | 5.0%

70 | 1572.39| 279 | 17.8%

100 | 380.54| 279 | 73.4%

110 | 315.88| 279 | 88.4%

115 | 287.75| 279 | 97.0%

120 | 291.12| 279 | 95.9%

125 | 279.61| 279 | 99.8%

150 | 273.86| 279 | 98.1%

200 | 282.68| 279 | 98.8%

500 | 279.16| 279 |100.0%

1000 | 286.71| 279 | 97.4%

Clearly this converges toward random much more quickly, in fact around

twice as quickly. I would put 115 moves as the bare minimum and will

probably henceforth use 125 moves myself. I suspect the difference

comes from having so many extra puzzle rotations, not from having 1/5

and 2/5 turns included, though I have not yet tested this.

There you have it. I will probably be posting this to the wca boards

sometime soon, as the megaminx situation concerns me a bit.

I will consider in the future analyzing the wca scramble lengths /

generators for the 4x4x4 and 5x5x5 cubes as well, but writing the

scramble tracking code is mind numbingly tedious and my wife is

complaining about me spending too much time on this

. As I

mentioned, if you are better at stats than I am (not hard to do) and

there is any error in my assumptions or methodology, please LET ME

KNOW! All source code used is available upon request.

I'm going to bed now, cheers.

-Daniel

In questions of science, the authority of a thousand is not worth the humble reasoning of a single individual.

Galileo