## A probability question


Qaanol
The Cheshirest Catamount
Posts: 3065
Joined: Sat May 09, 2009 11:55 pm UTC

### A probability question

Here’s the procedure:

• We choose n numbers from a standard normal distribution, and sort them so x₁ ≤ x₂ ≤ ⋯ ≤ xₙ.
• Then we find the midpoint of each consecutive pair, mᵢ = (xᵢ + xᵢ₊₁) / 2.
• These midpoints partition the real line into intervals, one of which, call it J, contains 0.
• If we choose another number from the same normal distribution, it has probability pᵢ of landing in the i-th interval, which can be expressed with erf().
• Let H be the interval with highest probability (ties are vanishingly rare so we ignore them).
• Let L be the interval with lowest probability.

What are the probabilities that H = J, and that L = J?

In other words, how likely is it that the highest-probability interval contains 0, and how likely is it that the lowest-probability interval contains 0?
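The procedure above can be sketched as a small Monte Carlo simulation. This is not the original poster's program, just a minimal illustration under the stated definitions; the names `normal_cdf`, `one_trial` and `estimate` are made up for this sketch.

```python
import math
import random

def normal_cdf(t):
    """Standard normal CDF via erf; also works for +/- infinity."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def one_trial(n, rng):
    """Run the procedure once and return (H == J, L == J)."""
    xs = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    mids = [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    # n - 1 midpoints give n intervals, counting the two unbounded ends.
    edges = [-math.inf] + mids + [math.inf]
    probs = [normal_cdf(hi) - normal_cdf(lo) for lo, hi in zip(edges, edges[1:])]
    # The interval containing 0 is indexed by the number of midpoints below 0.
    j = sum(1 for m in mids if m < 0.0)
    h = max(range(n), key=probs.__getitem__)  # highest-probability interval
    l = min(range(n), key=probs.__getitem__)  # lowest-probability interval
    return h == j, l == j

def estimate(n, trials, seed=0):
    """Estimate (P(H = J), P(L = J)) from `trials` independent runs."""
    rng = random.Random(seed)
    h_hits = l_hits = 0
    for _ in range(trials):
        h_is_j, l_is_j = one_trial(n, rng)
        h_hits += h_is_j
        l_hits += l_is_j
    return h_hits / trials, l_hits / trials
```

Calling `estimate(3, 200_000)` returns the two estimated frequencies for the n = 3 case; changing the first argument covers other values of n.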

I am most interested in the n = 3 case, so I wrote a Monte Carlo program to estimate it, and the results are consistently close to 51.5% for H = J, and 22.2% for L = J. But I’d like to know the exact values, and ideally have an explanation for why.
wee free kings

CorruptUser
Posts: 10409
Joined: Fri Nov 06, 2009 10:12 pm UTC

### Re: A probability question

That second number looks suspiciously like 2/9. I'd have to think about it some more, but P(H=J) > P(L=J) for all but the trivial case makes sense. However, it would seem to me that according to your definition, there would be cases where NO interval contains 0, where all xᵢ are either above or below 0, with probability 2·(0.5^n).

gmalivuk
GNU Terry Pratchett
Posts: 26592
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There

### Re: A probability question

It's a partition of the real line, including everything above the highest mᵢ in one interval and everything below the lowest mᵢ in another.
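Spelled out, the partition and the interval probabilities can be written as follows. The conventions m_0 = −∞ and m_n = +∞ and the symbols I_i and Φ are notation assumed here, not taken from the thread.

```latex
I_i = (m_{i-1},\, m_i), \qquad
p_i = \Phi(m_i) - \Phi(m_{i-1}), \qquad i = 1, \dots, n,
\qquad m_0 = -\infty,\quad m_n = +\infty,
\qquad
\Phi(t) = \tfrac{1}{2}\left[ 1 + \operatorname{erf}\!\left( \tfrac{t}{\sqrt{2}} \right) \right].
```

With these conventions there are exactly n intervals and the p_i sum to 1, so some interval always contains 0.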

Qaanol
The Cheshirest Catamount
Posts: 3065
Joined: Sat May 09, 2009 11:55 pm UTC

### Re: A probability question

I had a silly typo in my simulation code.

The *actual* results are consistently close to 83.1% for H = J, and 6.4% for L = J.
wee free kings

jestingrabbit
Factoids are just Datas that haven't grown up yet
Posts: 5967
Joined: Tue Nov 28, 2006 9:50 pm UTC
Location: Sydney

### Re: A probability question

83.1 ~ 83.33... = 5/6

Take it up to 4 and see what happens? Guessing what's going on from one data point is going to be pretty impossible.
ameretrifle wrote:Magic space feudalism is therefore a viable idea.

Qaanol
The Cheshirest Catamount
Posts: 3065
Joined: Sat May 09, 2009 11:55 pm UTC

### Re: A probability question

jestingrabbit wrote:
> 83.1 ~ 83.33... = 5/6
>
> Take it up to 4 and see what happens? Guessing what's going on from one data point is going to be pretty impossible.

I’ll try that when I have time to update the code.

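I’m trying to find an analytical solution. With n IID normal variates, the distribution of the sample point in ℝⁿ is spherically symmetric. The all-same-sign case is trivial, as CorruptUser alludes to: J is then the unbounded end interval that covers the whole half-line on the opposite side of 0, so its probability exceeds 1/2 and H = J.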

With n = 3, the remaining 6 octants are interchangeable, so we can just look at x, y positive, z negative. And WLOG we can take x > y, so we only have to handle half an octant. If additionally y > -z, which is 1/3 of the remaining region, then both midpoints are positive, so J contains the entire negative half-line and H = J.
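The bookkeeping for n = 3 can be written out as below. Here q is a symbol introduced only for this sketch: the probability that H = J given mixed signs with the lone-sign value not smallest in magnitude (in the half-octant above, -z > y).

```latex
\begin{aligned}
P(\text{all same sign}) &= 2 \cdot \left(\tfrac{1}{2}\right)^{3} = \tfrac{1}{4}, \\
P(\text{mixed signs, lone-sign value smallest in magnitude}) &= \tfrac{3}{4} \cdot \tfrac{1}{3} = \tfrac{1}{4}, \\
P(H = J) &= \tfrac{1}{4} + \tfrac{1}{4} + \tfrac{3}{4} \cdot \tfrac{2}{3} \cdot q = \tfrac{1}{2} + \tfrac{q}{2}.
\end{aligned}
```

So P(H = J) is at least 1/2, and if the simulated 83.1% is accurate, q would be roughly 0.66.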

So that leaves us with x > -z > y and -z > x > y, where J is the middle interval. We can integrate the 3D Gaussian density over the part of that region where H = J, but in spherical coordinates it is difficult to determine the radius at which to stop along each direction, since that appears to involve finding the inverse of a difference of erfs.
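Short of a closed form, the fraction of this leftover region where H = J can at least be estimated numerically. The sketch below is an illustration only, with made-up names (`phi`, `middle_case_fraction`); it samples magnitudes directly, assuming x > y > 0 > z and rejecting draws with -z ≤ y.

```python
import math
import random

def phi(t):
    """Standard normal CDF expressed with erf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def middle_case_fraction(trials, seed=0):
    """Estimate P(H = J | x > y > 0 > z and -z > y) by rejection sampling."""
    rng = random.Random(seed)
    kept = hits = 0
    while kept < trials:
        # Two positive values, ordered so that x > y.
        x, y = sorted((abs(rng.gauss(0.0, 1.0)), abs(rng.gauss(0.0, 1.0))),
                      reverse=True)
        w = abs(rng.gauss(0.0, 1.0))  # w stands for -z
        if w <= y:
            continue  # y > -z: already settled, H = J there
        kept += 1
        m1 = (y - w) / 2.0  # midpoint of z and y, negative here
        m2 = (x + y) / 2.0  # midpoint of y and x, positive
        p_left = phi(m1)
        p_right = 1.0 - phi(m2)
        p_mid = phi(m2) - phi(m1)  # J is the middle interval
        hits += p_mid > max(p_left, p_right)
    return hits / kept
```

If the symmetry reduction in this post is right, the overall P(H = J) would be 1/2 plus half of the value this returns, which gives a cross-check against a direct simulation.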
wee free kings