Here’s the procedure:
• We choose n numbers from a standard normal distribution, and sort them so x_1 ≤ x_2 ≤ ⋯ ≤ x_n.
• Then we find the midpoint of each consecutive pair, m_i = (x_i + x_{i+1}) / 2.
• These midpoints partition the real line into intervals, one of which, call it J, contains 0.
• If we choose another number from the same normal distribution, it lands in the i-th interval with probability p_i, which can be expressed with erf().
• Let H be the interval with highest probability (ties are vanishingly rare so we ignore them).
• Let L be the interval with lowest probability.
What are the probabilities that H = J, and that L = J?
In other words, how likely is it that the highest-probability interval contains 0, and how likely is it that the lowest-probability interval contains 0?
I am most interested in the n = 3 case, so I wrote a Monte Carlo program to estimate it, and the results are consistently close to 51.5% for H = J, and 22.2% for L = J. But I’d like to know the exact values, and ideally have an explanation for why.
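For anyone who wants to reproduce the experiment, here is a minimal sketch of the procedure above in Python. It is not the original simulation code; the function names (normal_cdf, interval_probs, trial, estimate) and the default trial count and seed are illustrative assumptions. Each p_i is computed as a difference of standard normal CDF values at consecutive midpoints, with the CDF written in terms of erf().

```python
import math
import random
from bisect import bisect_right


def normal_cdf(t):
    """Standard normal CDF, expressed with erf()."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def interval_probs(xs):
    """Return (midpoints, interval probabilities) for a sample xs."""
    xs = sorted(xs)
    # Midpoint of each consecutive pair of sorted values.
    mids = [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    # The midpoints cut the real line into len(xs) intervals;
    # each probability is a difference of CDF values at the cut points.
    cdfs = [0.0] + [normal_cdf(m) for m in mids] + [1.0]
    probs = [hi - lo for lo, hi in zip(cdfs, cdfs[1:])]
    return mids, probs


def trial(n, rng):
    """Run one trial; return (H == J, L == J) as booleans."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mids, probs = interval_probs(xs)
    j = bisect_right(mids, 0.0)  # index of the interval containing 0
    h = max(range(len(probs)), key=probs.__getitem__)
    l = min(range(len(probs)), key=probs.__getitem__)
    return h == j, l == j


def estimate(n, trials=100_000, seed=0):
    """Monte Carlo estimates of P(H = J) and P(L = J)."""
    rng = random.Random(seed)
    h_hits = l_hits = 0
    for _ in range(trials):
        hj, lj = trial(n, rng)
        h_hits += hj
        l_hits += lj
    return h_hits / trials, l_hits / trials


if __name__ == "__main__":
    print(estimate(3))
```

Calling estimate(4) or higher runs the same experiment for larger n, since nothing in the sketch is specific to n = 3.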
A probability question
wee free kings
- CorruptUser
- Posts: 9585
- Joined: Fri Nov 06, 2009 10:12 pm UTC
Re: A probability question
That second number looks suspiciously like 2/9. I'd have to think about it some more, but P(H=J) > P(L=J) for all but the trivial case makes sense. However, it would seem to me that according to your definition, there would be cases where NO interval contains 0, where all Xi are either above or below 0, with probability 2*(.5^N).
- gmalivuk
- GNU Terry Pratchett
- Posts: 26039
- Joined: Wed Feb 28, 2007 6:02 pm UTC
- Location: Here and There
Re: A probability question
It's a partition of the real line, including everything above the highest mi in one interval and everything below the lowest in the other.
Re: A probability question
I had a silly typo in my simulation code.
The *actual* results are consistently close to 83.1% for H = J, and 6.4% for L = J.
wee free kings
- jestingrabbit
- Factoids are just Datas that haven't grown up yet
- Posts: 5965
- Joined: Tue Nov 28, 2006 9:50 pm UTC
- Location: Sydney
Re: A probability question
83.1 ~ 83.33... = 5/6
Take it up to 4 and see what happens? Guessing what's going on from one data point is going to be pretty impossible.
ameretrifle wrote:Magic space feudalism is therefore a viable idea.
Re: A probability question
jestingrabbit wrote:83.1 ~ 83.33... = 5/6
Take it up to 4 and see what happens? Guessing what's going on from one data point is going to be pretty impossible.
I’ll try that when I have time to update the code.
I’m trying to find an analytical solution. With n IID normal variates, their joint distribution in ℝ^n is spherically symmetric. The all-same-sign case is trivial, as CorruptUser alludes to.
With n=3, the remaining 6 octants are interchangeable, so we can just look at x, y positive, z negative. And WLOG we can take x>y, so we only have to handle half an octant. If additionally y>-z, which is 1/3 of the remaining region, then H = J: both midpoints are positive, so J is the leftmost interval and holds more than half the probability.
So that leaves us with x>-z>y and -z>x>y. We can integrate the 3D gaussian distribution over that region, but it is difficult to determine the radius at which to stop since that appears to involve finding the inverse of the difference of erfs.
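As a sanity check on this case split, here is a rough Python sketch that sorts random n=3 samples into the cases above and tallies how often H = J in each. The case labels and function names (classify, h_equals_j, tally) are made up for illustration. "lone" means the one value whose sign differs from the other two, so "lone_smallest" corresponds to y>-z, "lone_middle" to x>-z>y, and "lone_largest" to -z>x>y, together with their mirror images.

```python
import math
import random


def normal_cdf(t):
    """Standard normal CDF, expressed with erf()."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def h_equals_j(xs):
    """True if the highest-probability interval is the one containing 0."""
    xs = sorted(xs)
    mids = [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    cdfs = [0.0] + [normal_cdf(m) for m in mids] + [1.0]
    probs = [hi - lo for lo, hi in zip(cdfs, cdfs[1:])]
    j = sum(m <= 0.0 for m in mids)  # index of the interval containing 0
    return probs[j] == max(probs)


def classify(xs):
    """Label a 3-sample by sign pattern and magnitude order (hypothetical labels)."""
    signs = [x > 0 for x in xs]
    if all(signs) or not any(signs):
        return "same_sign"
    # The lone value is the one whose sign differs from the other two.
    lone_positive = sum(signs) == 1
    w = abs([x for x, s in zip(xs, signs) if s == lone_positive][0])
    pair = sorted(abs(x) for x, s in zip(xs, signs) if s != lone_positive)
    if w < pair[0]:
        return "lone_smallest"
    if w < pair[1]:
        return "lone_middle"
    return "lone_largest"


def tally(trials=100_000, seed=0):
    """Map each case to (fraction of samples, fraction of those with H = J)."""
    rng = random.Random(seed)
    counts, hits = {}, {}
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(3)]
        label = classify(xs)
        counts[label] = counts.get(label, 0) + 1
        hits[label] = hits.get(label, 0) + h_equals_j(xs)
    return {k: (counts[k] / trials, hits[k] / counts[k]) for k in counts}


if __name__ == "__main__":
    for label, (share, p_hj) in sorted(tally().items()):
        print(label, round(share, 3), round(p_hj, 3))
```

If the reasoning holds, the "same_sign" and "lone_smallest" rows should each cover about a quarter of the samples with H = J every time, which already pins P(H = J) at 1/2 or more. The other two rows are the part that still needs the integral.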
wee free kings