2118: "Normal Distribution"
title text: "It's the NORMAL distribution, not the TANGENT distribution"
50%? No way.
Much fewer than 50% of the pixels are gray, and more than 80% are white. Very few are black. The pixel production department is running out of white pixels, and HR warns us that we may be sued for pixel hiring bias.
 80watt Hamster
 Posts: 12
 Joined: Tue May 13, 2014 1:17 pm UTC
Re: 2118: "Normal Distribution"
keithl wrote:Much fewer than 50% of the pixels are gray, and more than 80% are white.
Is that for the whole image? To the eye, the distribution looks much more even under the curve.
Any input as to what about this, exactly, would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.
80watt Hamster
Re: 2118: "Normal Distribution"
Normal normal for comparison:
80watt Hamster wrote:Any input as to what about this, exactly, would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.
Normally you would mark a horizontal sweep around the middle to indicate a useful range of values (in which 50% of random samples will lie, or where a random sample will end up with 50% chance). And the middle helpfully corresponds to the mean value.
Now the vertical midpoint is 1/(2σ√(2π)), half the peak height, and both the height of the lines and the corner points, I think, will tell you nothing. But the worst part is that it makes no sense as a visual aid; rather, it's closer to visual AIDS.
I wonder how far apart those lines are, though.
Last edited by Flumble on Fri Mar 01, 2019 9:55 pm UTC, edited 1 time in total.
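For comparison, the conventional shading described above (vertical cuts around the mean that enclose 50% of samples) is easy to compute. This is only a minimal sketch, assuming a standard normal (mean 0, sd 1) and Python's standard-library `statistics.NormalDist`; the helper name `central_interval` is made up for illustration.

```python
from statistics import NormalDist

def central_interval(mass=0.5, mu=0.0, sigma=1.0):
    """Return (low, high) enclosing `mass` of the probability,
    symmetric about the mean."""
    dist = NormalDist(mu, sigma)
    tail = (1.0 - mass) / 2.0          # probability left in each tail
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)

low, high = central_interval()
# The cut points are the quartiles, roughly +/-0.6745 sd from the mean.
print(f"central 50%: {low:.4f} to {high:.4f}")
```

Passing a different `sigma` simply scales both cut points, which is why this kind of region is usually quoted in units of the standard deviation.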
Re: 2118: "Normal Distribution"
80watt Hamster wrote:Any input as to what about this, exactly, would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.
It doesn't demonstrate anything at all, which is why it would annoy statisticians. Randall has shaded an arbitrary portion of the distribution as if it's meaningful, but it's not.
Re: 2118: "Normal Distribution"
80watt Hamster wrote:Any input as to what about this, exactly, would annoy statisticians? As a non-statistician, it confuses me because I don't know what someone would be trying to demonstrate by highlighting the given region.
Statisticians sometimes highlight the "50%" center of the distribution with vertical lines and a gray area to the left and right of the peak. The horizontal cuts are abnormal and meaningless. If this graph represents the likelihood of some variable, say the probability (vertical axis) of a person being X centimeters tall (horizontal axis), then the shaded area doesn't represent something measurable, because probability is an aggregate, not an individual, quantity.
Either that, or the lines indicate where the lost airplane is presumed to have crashed into the mountain, and where to send the search parties first.
Note: I sold electronic products based on statistics. I've learned that, with enough data, no distributions are exactly Gaussian, and typically the "tails" are fatter on one side or both. If a fat tail represents too-far-from-average transistors on a billion-transistor silicon integrated circuit, that leads to excessive production test failures and expensive field returns. Bart Kosko's popular science book "Noise" is an excellent introduction to "fat tails".
Re: 2118: "Normal Distribution"
It's independent of the variance, is the great thing.
Re: 2118: "Normal Distribution"
With the midpoint at 52.7%, I think this is a clear example of grade inflation.
Re: 2118: "Normal Distribution"
keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian
Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...
Re: 2118: "Normal Distribution"
rmsgrey wrote:keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian
Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...
What if we made a plot of all those distributions based on how Gaussian they are...
Re: 2118: "Normal Distribution"
https://en.wikipedia.org/wiki/Bean_machine
Galton boards, also known as bean machines, are in quite a few museums, but none of them demonstrate the XKCD Demarcation of the Normal Distribution.
We could make a modified Galton board that did split the marbles into two roughly equal groups using Randall's arbitrary technique.
Since it would be meaningless, we could install the machine in an art museum, rather than a science museum.
We won't do this because it would require a lot of tedious effort, and not demonstrate anything interesting. Separating the central 52% without losing marbles would be a bit of an engineering challenge.
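A software version avoids the engineering problem, though it is just as meaningless. The sketch below is an assumption-laden toy, not a model of any real museum exhibit: each marble takes `rows` fair left/right bounces, and the band runs between (1-p)/2 and (1+p)/2 of the tallest column, with p = 0.52682 borrowed from the calculation later in this thread. The function names `galton_counts` and `band_fraction` are invented for this example.

```python
import random

def galton_counts(marbles=200_000, rows=200, seed=2118):
    """Simulate a bean machine and return the marble count in each bin."""
    rng = random.Random(seed)
    counts = [0] * (rows + 1)
    for _ in range(marbles):
        # Each of the `rows` random bits is one left/right bounce;
        # the number of 1-bits is the bin the marble lands in.
        counts[bin(rng.getrandbits(rows)).count("1")] += 1
    return counts

def band_fraction(counts, p=0.52682):
    """Fraction of all marbles lying between the two horizontal cuts."""
    peak = max(counts)                      # height of the tallest column
    lower = peak * (1.0 - p) / 2.0
    upper = peak * (1.0 + p) / 2.0
    # In each column, count only the marbles stacked between the cuts.
    in_band = sum(min(c, upper) - lower for c in counts if c > lower)
    return in_band / sum(counts)

counts = galton_counts()
print(f"fraction of marbles in the band: {band_fraction(counts):.3f}")
```

With enough rows and marbles the printed fraction should land close to one half, since the column heights approximate the normal curve.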
 Soupspoon
 You have done something you shouldn't. Or are about to.
 Posts: 4060
 Joined: Thu Jan 28, 2016 7:00 pm UTC
 Location: 531
Re: 2118: "Normal Distribution"
Code:
⊥
⊥
⊥
⊥
⊥
⊥
⊥
⊥ ⊥
⊥ ⊥ ⊥
⊥⊥
⊥
⊥
⊥
⊥
⊥
⊥
Re: 2118: "Normal Distribution"
keithl wrote:Either that, or the lines indicate where the lost airplane is presumed to have crashed into the mountain, and where to send the search parties first.
Good news! A search party found the airplane! All the passengers survived, except for the three who were eaten. The passengers also ate the search party. Then the passengers were eaten by the other search parties.
We will publish a paper on the nutritional value of searchers and passengers soon. There will be Gaussian distributions for all the essential nutrients, after we eat enough of the researchers who disagree.
Re: 2118: "Normal Distribution"
rmsgrey wrote:keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian
Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...
Of course. If you discard all the data that doesn't fit your distribution, your distribution is perfect.
Re: 2118: "Normal Distribution"
sotanaht wrote:rmsgrey wrote:keithl wrote:I've learned that, with enough data, no distributions are exactly Gaussian
Something I've learned is that, with enough processing and discarding of outliers, all distributions are Gaussian to within any reasonable margin of error...
Of course. If you discard all the data that doesn't fit your distribution, your distribution is perfect.
Does a single data point fit a Gaussian distribution? Why or why not?
To be considered, all answers must be written on the back of a $100 bill and mailed to the contest address by April 1st ± sigma
https://app.box.com/witthoftresume
Former OTTer
Vote cellocgw for President 2020. #ScienceintheWhiteHouse http://cellocgw.wordpress.com
"The Planck length is 3.81779e-33 picas." - keithl
"Earth weighs almost exactly π milliJupiters" - what-if #146, note 7
 Soupspoon
 You have done something you shouldn't. Or are about to.
 Posts: 4060
 Joined: Thu Jan 28, 2016 7:00 pm UTC
 Location: 531
Re: 2118: "Normal Distribution"
sotanaht wrote:Of course. If you discard all the data that doesn't fit your distribution, your distribution is perfect.
Ironically my little 'joke' above (in whose creation I realised I hadn't even added the Math::Trig module here until now) was the third run of the script to create. The first two runs showed a clear and tight x=y grouping of the scatter. Given it was a randomised scattering in polar coordinates, I knew it wasn't an error in my implementation (assuming no peculiar resonances of the inbuilt PRNG values around what effectively became 45 and 225 degrees on every second call for a value!), but I went on until it lost the pattern anyway. Because that was the mood I was in. And I still trusted a script to be more random in distribution (with all the possibilities of a pattern sneaking through via fluke) than any attempt to make it up manually.

 Posts: 4
 Joined: Mon Mar 04, 2019 7:45 pm UTC
Re: 2118: "Normal Distribution"
I'm surprised no one's explained Randall's maths.
Call h the height of the Normal distribution pdf at zero (the mean), i.e. h = 1/sqrt(2π).
Then the horizontal lines meet the pdf at, say,
(±a, h(1-p)/2) and
(±b, h(1+p)/2)
(lower line and upper line respectively).
If p = 0.52682, then a = 1.69790 and b = 0.73479 (assuming an sd of 1), and using the cumulative distribution, give or take a couple of rectangles, we get that the shaded area is 0.50000 as claimed. p rounds to 52.7%.
There you go, a bit of Maths pulled me out of 10+ years of lurking on this thread to actually register and post!
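The calculation above can be checked numerically. What follows is a rough sketch rather than the poster's own working: it assumes sd = 1, takes the two lines at heights h(1-p)/2 and h(1+p)/2, adds the central rectangle to the two "wings" between b and a, and finds p by plain bisection. The names `crossing`, `shaded_area` and `solve_p` are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()        # standard normal, sd = 1 as in the post
H = STD.pdf(0.0)          # peak height, 1/sqrt(2*pi)

def crossing(height):
    """x >= 0 at which the pdf equals `height` (0 < height <= H)."""
    return math.sqrt(-2.0 * math.log(height / H))

def shaded_area(p):
    """Area under the curve between the two horizontal lines."""
    lower = H * (1.0 - p) / 2.0
    upper = H * (1.0 + p) / 2.0
    a = crossing(lower)   # where the lower line meets the curve
    b = crossing(upper)   # where the upper line meets the curve
    core = 2.0 * b * (upper - lower)                   # full-height rectangle
    wings = 2.0 * ((STD.cdf(a) - STD.cdf(b)) - lower * (a - b))
    return core + wings

def solve_p(target=0.5, tol=1e-12):
    """Bisection on p; the shaded area grows steadily as the band widens."""
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if shaded_area(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

p = solve_p()
a = crossing(H * (1.0 - p) / 2.0)
b = crossing(H * (1.0 + p) / 2.0)
print(f"p = {p:.5f}, a = {a:.5f}, b = {b:.5f}")
```

The result should agree with the figures quoted in the post. Changing the standard deviation stretches the curve horizontally by σ and squashes it vertically by 1/σ, so the same p applies for any variance, as noted earlier in the thread.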