## Why standard deviation?

TennysonXII
Posts: 14
Joined: Thu Dec 31, 2009 9:36 pm UTC

### Why standard deviation?

I'm taking a Psych Stats course this semester and we're getting into measures of variability. I'm understanding all of the math and terminology, but one thing is bothering me: why do we use the standard deviation?

I get that the standard deviation is the square root of the mean of squared deviations, and that for a normal distribution 68.27% of scores will fall within one standard deviation of the mean, 95.45% within two standard deviations, etc., but it still seems like a rather arbitrary measure. I've read the merged thread from ages ago (apologies if this should have gone there, the rules weren't clear on thread necromancy), but nothing there gave a satisfactory explanation. Most of the answers were either "It makes sense when doing higher math" or "Because it works". When I asked my professor about it, he muttered something about needing to see the proof of the normal curve and sort of apologetically admitted to not being able to give me the answer. If someone could really break it down for me like I'm five, I'd appreciate it a lot.

Yesila
Posts: 221
Joined: Sun Dec 16, 2007 11:38 am UTC

### Re: Why standard deviation?

Are you wondering why we don't just use the variance instead of taking its square root? Or are you wondering why we care about such things at all? Or is this a question of why we like this measure of the "spread" or "dispersion" of the data as opposed to some other way of measuring it?

arbiteroftruth
Posts: 481
Joined: Wed Sep 21, 2011 3:44 am UTC

### Re: Why standard deviation?

Basically, if you think of the mean as being a value that defines what the "typical" value of a given sample will be, the standard deviation is a value that defines the "typical" amount that a given sample will vary away from that mean. If you want to define a value to give you this information, the first instinct might be to simply measure each sample's distance from the mean and average all the distances. The problem is that if you do this just by subtracting the mean from each value, then values below the mean yield negative distance, and when you sum up all the distances, the negatives will cancel out the positives and you get zero.

So you need a formula that will pertain to the magnitude of the distance from the mean without regard for the direction of that distance. The most direct function to do this would be absolute value. You would take the absolute value of the distance between each sample and the mean, and average the resulting values together. On its own there's nothing wrong with this convention, but although absolute value by definition does what we want in this application, it's not on its own a simple algebraic function. So we should find a way to express a similar process in simple algebraic terms for ease of use.

Fortunately, another way to express |x| is (x^2)^(1/2), where you only take the positive square root. Squaring and then immediately taking the positive square root is completely identical to taking the absolute value, but if we split up the squaring and the taking of the square root with some intermediate step, then we have a function that does not simplify down to absolute value, but serves the same basic purpose in our application. In our case, the only other step we're worried about is the averaging of the values, so that's what we put between the squaring and the taking of the square root.

So the final result is that we define the standard deviation by first squaring all the distances from each sample to the mean, then taking the mean of all the resulting values, then taking the square root of the result.
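
A minimal Python sketch of the steps described above, assuming a small made-up data set (the numbers and variable names are illustrative and not from the thread). It shows the signed distances cancelling, then the absolute-value route, then the square, average, root route:

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # made-up example data
mean = sum(scores) / len(scores)        # 5.0

# Signed distance of each value from the mean.
deviations = [x - mean for x in scores]

# Negatives cancel positives, so the plain sum is useless as a spread measure.
signed_total = sum(deviations)

# Route 1: drop the sign with absolute value, then average.
mean_abs_dev = sum(abs(d) for d in deviations) / len(deviations)

# Route 2: square, average (this is the variance), then take the square root.
variance = sum(d * d for d in deviations) / len(deviations)
std_dev = math.sqrt(variance)

print(signed_total)   # 0.0
print(mean_abs_dev)   # 1.5
print(std_dev)        # 2.0

# The library's population standard deviation follows the same recipe.
print(statistics.pstdev(scores))
```

Both routes give a "typical distance from the mean"; they just disagree on the number (1.5 versus 2.0 here), because squaring weights the larger distances more heavily.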

I don't know if that reasoning has any bearing on official reasons why this is considered the best convention, but it's an explanation that makes the existing convention at least make intuitive sense to me.

Macbi
Posts: 941
Joined: Mon Apr 09, 2007 8:32 am UTC
Location: UKvia

### Re: Why standard deviation?

We want some way of measuring the spread of a distribution. There are many ways of doing this, but the standard deviation is the favourite because it shows up in the Gaussian distribution. The Gaussian is everyone's favourite distribution because it shows up all over the place, especially coming out of the central limit theorem.

mister k
Posts: 643
Joined: Sun Aug 27, 2006 11:28 pm UTC
Contact:

### Re: Why standard deviation?

TennysonXII wrote:I'm taking a Psych Stats course this semester and we're getting into measures of variability. I'm understanding all of the math and terminology, but one thing is bothering me: why do we use the standard deviation?

I get that the standard deviation is the square root of the mean of squared deviations, and that for a normal distribution 68.27% of scores will fall within one standard deviation of the mean, 95.45% within two standard deviations, etc., but it still seems like a rather arbitrary measure. I've read the merged thread from ages ago (apologies if this should have gone there, the rules weren't clear on thread necromancy), but nothing there gave a satisfactory explanation. Most of the answers were either "It makes sense when doing higher math" or "Because it works". When I asked my professor about it, he muttered something about needing to see the proof of the normal curve and sort of apologetically admitted to not being able to give me the answer. If someone could really break it down for me like I'm five, I'd appreciate it a lot.

Well, we need a measure of the variation observed, and the standard deviation gives us this. It has lots of nice properties, as others have mentioned, chief among them that it is coupled with the normal distribution, which we can often use as our error distribution (we might have to fiddle with the data a bit first, but we can do that in understood ways). There are distributions where the standard deviation is a bit less meaningful: for any heavily skewed distribution (one with more weight on one side of the mean than on the other), it gives us less intuitive information.

kbltd
Posts: 32
Joined: Wed Jul 25, 2007 2:47 pm UTC
Location: Indoors

### Re: Why standard deviation?

I'd say don't get too distracted by the behaviour of the normal distribution. The standard deviation applies to any distribution.

Dason
Posts: 1311
Joined: Wed Dec 02, 2009 7:06 am UTC
Location: ~/

### Re: Why standard deviation?

Like arbiteroftruth mentioned, the standard deviation isn't the only way to measure spread, but it is the most convenient. The mean absolute deviation (MAD), which is the average of the absolute values of the differences from the mean, is actually used for some things. But it isn't an easy quantity to do math with. The standard deviation (or rather the variance...) is a lot easier to work with because it does pretty much what we want and has a couple useful properties (the variance is additive for independent random variables, which is really nice). It also has a really natural interpretation for the normal distribution.

The variance is also one of the central moments of a distribution, which are interesting and useful for describing various features of a distribution. Some distributions have really interesting mean-to-variance relationships (in the Poisson the mean and the variance are always the same). I don't know if we get some of those interesting things with the MAD.
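
To make the additivity point concrete, here is a small Python sketch using two fair six-sided dice as the independent random variables. The dice and the helper names are assumptions chosen for illustration; exact fractions are used so there is no rounding to argue about:

```python
from fractions import Fraction
from itertools import product

def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

def variance(values):
    """Mean squared deviation of equally likely outcomes."""
    values = list(values)
    m = mean(values)
    return mean((v - m) ** 2 for v in values)

def mean_abs_dev(values):
    """Mean absolute deviation of equally likely outcomes."""
    values = list(values)
    m = mean(values)
    return mean(abs(v - m) for v in values)

die = [Fraction(k) for k in range(1, 7)]
# All 36 equally likely totals of two independent dice.
two_dice = [a + b for a, b in product(die, repeat=2)]

print(variance(die), variance(two_dice))          # 35/12 35/6
print(mean_abs_dev(die), mean_abs_dev(two_dice))  # 3/2 35/18
```

The variance of the total is exactly twice the variance of one die, while the MAD of the total (35/18) is not twice the MAD of one die (which would be 3). At least in this example, the MAD has no equally simple rule for sums.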

nomadiq
Posts: 18
Joined: Wed Apr 27, 2011 8:57 pm UTC

### Re: Why standard deviation?

Macbi wrote:The Gaussian is everyone's favourite distribution because it shows up all over the place, especially coming out of the central limit theorem.

Macbi is right, but I'd like to point out that other distributions also show up all over the place. The Gaussian is not everywhere. While you can measure the standard deviation of any set of data, that calculated number has limited meaning outside of Gaussian distributions. If your data is not Gaussian distributed, be very careful how you interpret your standard deviation.

TennysonXII
Posts: 14
Joined: Thu Dec 31, 2009 9:36 pm UTC

### Re: Why standard deviation?

Dunno if I'm asking the right question here. Happens all the time. Let me try a different way.

The interquartile range makes sense to me. Four chunks of 25 is an easy way to deal with data, so a long time ago someone arbitrarily said "Let's break up the data that way!" Hell, even the MAD makes sense to me. What I don't understand is why someone would go, "Let's take the square root of the mean of squared deviations, that will really work!" What's so special about the standard deviation? How is it that x% of scores will always show up within y standard deviations of the mean on a normal curve?

mister k
Posts: 643
Joined: Sun Aug 27, 2006 11:28 pm UTC
Contact:

### Re: Why standard deviation?

TennysonXII wrote:Dunno if I'm asking the right question here. Happens all the time. Let me try a different way.

The interquartile range makes sense to me. Four chunks of 25 is an easy way to deal with data, so a long time ago someone arbitrarily said "Let's break up the data that way!" Hell, even the MAD makes sense to me. What I don't understand is why someone would go, "Let's take the square root of the mean of squared deviations, that will really work!" What's so special about the standard deviation? How is it that x% of scores will always show up within y standard deviations of the mean on a normal curve?

Well, look at the way the normal distribution is defined! The shape of the curve depends on the standard deviation! Given that, of course there are going to be lots of nice properties there.

If the mean absolute difference makes sense to you, why not the square? In general it's not a ridiculous thing to square distances. Let's think about the distance between two points in a two-dimensional plane. We can measure how far east one point is from the other, and how far north. So what's the distance? Well, Pythagoras' theorem says that a^2 = b^2 + c^2 for a right-angled triangle with hypotenuse a, so the distance between the two points is the square root of the sum of the squares of the east-west difference and the north-south difference.

We can extend this to any number of dimensions, so in three-dimensional space we can take the left-right distance, the up-down distance, and the distance in depth, square and sum them, then take the root to get the distance between the points. This is called the Euclidean distance.

If we have a set of n data points and their mean, we can (kinda) think of the deviations as the coordinates of a displacement in n-dimensional space, so a measure of the data's distance from the mean is given by the Euclidean distance.
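
A short Python sketch of that picture, assuming a made-up data set: treat the n data values as one point in n-dimensional space, treat "every value equals the mean" as another point, and measure the straight-line distance between them. Dividing by sqrt(n), so the result does not grow just because there are more points, gives the standard deviation:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up example data
n = len(data)
mean = sum(data) / n

# Distance between the point (x_1, ..., x_n) and the point (mean, ..., mean).
# math.dist computes the square root of the sum of squared coordinate differences.
distance = math.dist(data, [mean] * n)

# Rescale by sqrt(n) to get a per-observation figure.
std_dev = distance / math.sqrt(n)

print(distance)
print(std_dev, statistics.pstdev(data))
```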

skullturf
Posts: 556
Joined: Thu Dec 07, 2006 8:37 pm UTC
Location: Chicago
Contact:

### Re: Why standard deviation?

I also had a bit of trouble with the notions of variance and standard deviation when I was first introduced to them. It wasn't really intuitive to me why we were squaring.

What made more intuitive sense to me at that time was: take the average of the absolute values of the deviations from the mean.

My instructor explained it by saying the absolute value isn't a nice function to do calculus with, so we use squares instead.

At the time, that almost seemed a little bit like a cop-out to me -- are we using squares just because it's convenient, and not because it captures some intuitive notion of typical deviation from the mean?

Perhaps one possible answer to your question is the following:

In general, when measuring how far a set of values is from some other set of values, the square root of the sum of the squares of the differences is in fact a natural measure to use.

Euclidean distance in two, three, or more dimensions is a square root of a sum of squares of differences.

EDIT: I was scooped by mister k. Maybe my remarks are still of some use.

mfb
Posts: 950
Joined: Thu Jan 08, 2009 7:48 pm UTC

### Re: Why standard deviation?

TennysonXII wrote:Dunno if I'm asking the right question here. Happens all the time. Let me try a different way.

The interquartile range makes sense to me. Four chunks of 25 is an easy way to deal with data, so a long time ago someone arbitrarily said "Let's break up the data that way!" Hell, even the MAD makes sense to me. What I don't understand is why someone would go, "Let's take the square root of the mean of squared deviations, that will really work!" What's so special about the standard deviation? How is it that x% of scores will always show up within y standard deviations of the mean on a normal curve?

You could multiply every error measure you use by the same positive real factor and nothing substantial would change. You would only lose the interpretation as the standard deviation, since the rescaled quantity would not be the standard deviation any more.

>> How is it that x% of scores will always show up within y standard deviations of the mean on a normal curve?
The Gaussian distribution always has the same shape. The only changes are a shift of the whole distribution and a change of the width. Just rescale the axes and you get the same graph again. This statement is true for many (but not all) distributions.

The square has mathematical reasons: for example, general error propagation, where for independent errors you can square the individual errors, add them, take the square root and get the standard deviation of the combined value. That is not possible with other definitions of deviations (again, except with multiples of the standard deviation).
It is linked to the distance in an n-dimensional space, where you have to square the individual differences, too.
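
A rough simulation of that error-propagation rule in Python, assuming two independent normally distributed measurement errors with standard deviations 3 and 4 (the numbers, seed, and sample size are arbitrary choices for illustration). Adding the errors in quadrature predicts a combined standard deviation of 5, not 7:

```python
import math
import random
import statistics

random.seed(0)          # fixed seed so the run is repeatable
N = 100_000

sigma_a, sigma_b = 3.0, 4.0
a = [random.gauss(10.0, sigma_a) for _ in range(N)]
b = [random.gauss(20.0, sigma_b) for _ in range(N)]

# Quantity of interest: the sum of the two independent measurements.
total = [x + y for x, y in zip(a, b)]

# Square the individual errors, add them, take the square root.
predicted = math.sqrt(sigma_a**2 + sigma_b**2)   # 5.0
observed = statistics.pstdev(total)

print(predicted, observed)   # observed should land close to 5
```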

kbltd wrote:I'd say don't get too distracted by the behaviour of the normal distribution. The standard deviation applies to any distribution.

Show me the standard deviation of a Cauchy distribution, please.

gmalivuk
GNU Terry Pratchett
Posts: 26818
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There
Contact:

### Re: Why standard deviation?

TennysonXII wrote:What's so special about the standard deviation?
If you have two independent sets of data, and you want to look at how their sums are distributed, variance is nice because the variance of the sum is the sum of the variances. I don't know that MAD or IQR or the other variability measurements have any such nice properties, do they?

TennysonXII
Posts: 14
Joined: Thu Dec 31, 2009 9:36 pm UTC

### Re: Why standard deviation?

So let me see if I'm understanding a little better now:

Why does the standard deviation always fall along the same place in a normal distribution? Because data always falls along the same place in a normal distribution. That's kind of the definition of a normal distribution. If all the data line up the same way, then all measures of variability will line up the same way too.

Why do we use the standard deviation as opposed to some other measure of variability? Because it plays nicely with the normal curve and with other more complicated stuff that I'll find out about later.

I guess the only question I have left is why the standard deviation always falls at 34.1% above the mean as opposed to always falling at some other point. I don't see the relationship there.

Xanthir
My HERO!!!
Posts: 5423
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex
Contact:

### Re: Why standard deviation?

TennysonXII wrote:I guess the only question I have left is why the standard deviation always falls at 34.1% above the mean as opposed to always falling at some other point. I don't see the relationship there.

That's the "definition of the normal distribution" thing. The normal distribution has a particular shape, such that the area within 1 stdev on either side of the mean is about 68% of the total area (about 34.1% on each side). It just falls out of the math.
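
A quick way to see it "fall out of the math" is to ask Python's standard library for the area directly. This sketch uses `statistics.NormalDist`; the three (mean, stdev) pairs and the helper names are arbitrary examples, not anything from the thread:

```python
from statistics import NormalDist

def fraction_within(mu, sigma, k=1.0):
    """Area of Normal(mu, sigma) within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

def fraction_mean_to_plus(mu, sigma, k=1.0):
    """Area between the mean and k standard deviations above it."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu)

for mu, sigma in [(0, 1), (100, 15), (-3.2, 0.5)]:
    print(f"mu={mu}, sigma={sigma}: "
          f"{fraction_within(mu, sigma):.4f} within 1 sd, "
          f"{fraction_mean_to_plus(mu, sigma):.4f} between mean and +1 sd")
# Every line reports 0.6827 and 0.3413, whatever mu and sigma are.
```

Calling `fraction_within(mu, sigma, k=2)` gives the 95.45% figure from the first post in the same way.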

Yakk
Poster with most posts but no title.
Posts: 11129
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: Why standard deviation?

How about I hit you with a theory hammer?

https://secure.wikimedia.org/wikipedia/ ... ral_moment

So we have this family of "moments about the mean".

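The first 3 of which (not counting zero) are linear, in the sense that they add up when you sum independent variables. Linear things are good. The first is trivially zero (it is taken about the "mean"), the second is the "variance", and the third is what the "skewness" of the distribution of data is built from (kurtosis comes from the fourth).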

The standard deviation is the square root of the variance, because that brings its "scale" back in line with the scale of the values from the data (and/or the mean).

If you take a look at the equation for the normal curve:
https://secure.wikimedia.org/wikipedia/ ... stribution
you'll see those sigma squared terms ([imath]\sigma^2[/imath]). Sigma squared is the variance. With the possible exception of a factor of 2, it should be pretty clear that the shape of a normal curve is very naturally a function of the mean ([imath]\mu[/imath]) and the variance.

The CDF describes how the integral of the normal curve behaves. You'll notice the sqrt of [imath]\sigma^2[/imath] there? That means that the "spread" of the sum of the area under the curve varies with the square root of the variance -- [imath]\sigma[/imath]. So if you want to know what percentage of elements are within some window, the width of the window is scaled by sigma.
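
For reference, these are the standard textbook forms of the two formulas being pointed at, written with the same symbols (they are supplied here for convenience, not copied from the linked pages):

```latex
% Density of the normal curve: sigma^2 appears inside the exponent and the prefactor.
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% CDF: the argument is the distance from the mean divided by sqrt(2 sigma^2).
F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x-\mu}{\sqrt{2\sigma^2}}\right)\right]

% Fraction of the area within k standard deviations of the mean:
% mu and sigma cancel, so it depends on k alone.
P\bigl(|X-\mu| \le k\sigma\bigr) = F(\mu + k\sigma) - F(\mu - k\sigma)
                                 = \operatorname{erf}\!\left(\frac{k}{\sqrt{2}}\right)
```

The last line is the reason the percentages are the same for every normal curve: for k = 1 it evaluates to about 0.6827.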

As to why the normal curve is important, well, if you average up a bunch of uncorrelated random variables, the result ends up moving towards a normal curve.

antonfire
Posts: 1772
Joined: Thu Apr 05, 2007 7:31 pm UTC

### Re: Why standard deviation?

TennysonXII wrote:I guess the only question I have left is why the standard deviation always falls at 34.1% above the mean as opposed to always falling at some other point. I don't see the relationship there.
It turns out that, if you start with pretty much any distribution with standard deviation sigma, sample that distribution N times (where N is pretty large), and take the average, then about 34.1% of the time that average will land between the mean of the distribution and sigma/sqrt(N) above it (so about 68.3% of the time it will be within sigma/sqrt(N) of the mean on either side).

Our nice definition of "standard deviation" comes first, from which we can work out the esoteric-seeming number 34.1%. (And the rest of the esoteric-seeming numbers that come from the esoteric-seeming formula exp(-(x/sigma)^2/2)/sqrt(2 pi sigma^2).) If you want to see how it's worked out, see a proof of the central limit theorem.
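
This claim can be checked empirically. The Python sketch below assumes an exponential distribution with rate 1 as the starting distribution (mean 1, standard deviation 1, and visibly not bell-shaped); the seed, N, and number of trials are arbitrary choices:

```python
import math
import random

random.seed(1)   # fixed seed so the run is repeatable

# Exponential distribution with rate 1: mean 1, standard deviation 1, heavily skewed.
mu, sigma = 1.0, 1.0
N = 400          # samples per average
trials = 4_000   # number of averages to compute

window = sigma / math.sqrt(N)
hits_within = 0  # average within sigma/sqrt(N) of mu, on either side
hits_above = 0   # average between mu and mu + sigma/sqrt(N)

for _ in range(trials):
    avg = sum(random.expovariate(1.0) for _ in range(N)) / N
    if abs(avg - mu) <= window:
        hits_within += 1
        if avg >= mu:
            hits_above += 1

print(hits_within / trials)  # should come out near 0.683
print(hits_above / trials)   # should come out near 0.341
```

The fractions are only approximate: there is sampling noise, and for finite N the skew of the starting distribution has not completely washed out.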

jtheoph
Posts: 2
Joined: Thu Sep 22, 2011 8:58 pm UTC

### Re: Why standard deviation?

I'd never understood why standard deviation was so widely used instead of the simpler and more intuitive mean variation (the average of the absolute values of the deviations from the mean). As far as I can tell the only difference between the two is that standard deviation tends to magnify the importance of outliers due to averaging the square of the values rather than the values themselves. If the goal is to magnify the outliers, squaring seems arbitrary since you might as well cube all the values, average them, then take the cube root. This would result in a different number, but not a better number, for describing the dispersion of the data.

There's an interesting page that argues that mean variation is actually better than standard deviation in real life data since it is less likely to magnify error values. However, the main advantage of mean variation is that it has a clear, intuitive meaning which makes it more useful to the people interpreting the data.
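
A small Python sketch of the outlier effect being described, assuming a made-up data set with one suspicious value appended (the numbers and helper name are illustrative only). It compares how much each measure grows when the outlier is added:

```python
import statistics

def mean_abs_dev(values):
    """Average of the absolute deviations from the mean."""
    m = statistics.fmean(values)
    return statistics.fmean([abs(v - m) for v in values])

clean = [10, 11, 9, 10, 12, 8, 10, 10]   # made-up measurements
with_outlier = clean + [30]              # one wild value, e.g. a recording error

sd_ratio = statistics.pstdev(with_outlier) / statistics.pstdev(clean)
mad_ratio = mean_abs_dev(with_outlier) / mean_abs_dev(clean)

print(f"standard deviation grows by a factor of {sd_ratio:.2f}")
print(f"mean absolute deviation grows by a factor of {mad_ratio:.2f}")
```

Both measures jump, but the standard deviation jumps more, because the outlier's deviation is squared before averaging.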

Link wasn't allowed for some reason... here it is (copy/paste/fix):

http ://www.leeds.ac.uk/educol/documents/00003759.htm

Link removed from user's first post.
Last edited by jtheoph on Thu Sep 22, 2011 9:37 pm UTC, edited 1 time in total.

gmalivuk
GNU Terry Pratchett
Posts: 26818
Joined: Wed Feb 28, 2007 6:02 pm UTC
Location: Here and There
Contact:

### Re: Why standard deviation?

jtheoph wrote:As far as I can tell the only difference between the two is that standard deviation tends to magnify the importance of outliers due to averaging the square of the values rather than the values themselves. If the goal is to magnify the outliers, squaring seems arbitrary since you might as well cube all the values, average them, then take the cube root. This would result in a different number, but not a better number, for describing the dispersion of the data.
You know what would be neat? You reading the responses people have already posted in this thread prior to repeating the claim that standard deviation doesn't seem to be a good measure.

jtheoph wrote:it has a clear, intuitive meaning which makes it more useful to the people interpreting the data.
This is only true if the people interpreting the data place a lot of value on clear, intuitive meanings. Rather than, say, being able to do other kinds of analysis on the data, which are difficult or impossible with as badly-behaved a function as absolute value.

Do you also have a suggestion to replace skewness and (excess) kurtosis? Because those are currently defined in terms of the standard deviation.

TennysonXII
Posts: 14
Joined: Thu Dec 31, 2009 9:36 pm UTC

### Re: Why standard deviation?

antonfire wrote:Our nice definition of "standard deviation" comes first, from which we can work out the esoteric-seeming number 34.1%. (And the rest of the esoteric-seeming numbers that come from the esoteric-seeming formula exp(-(x/sigma)^2/2)/sqrt(2 pi sigma^2).) If you want to see how it's worked out, see a proof of the central limit theorem.

So what you're saying is that once I understand exp(-(x/sigma)^2/2)/sqrt(2 pi sigma^2), I'll understand the whole shebang. And in order to do that, I need to go find "Proof of the Central Limit Theorem for Dummies." That works for me, and thanks to everyone for helping me work this out. This is my first statistics course, and so far it's all really interesting and intuitively useful. That's something I couldn't have said about any other math I've worked through. My career decision to be a researcher is looking like an excellent fit.

jtheoph
Posts: 2
Joined: Thu Sep 22, 2011 8:58 pm UTC

### Re: Why standard deviation?

gmalivuk wrote:Do you also have a suggestion to replace skewness and (excess) kurtosis? Because those are currently defined in terms of the standard deviation.

You can use the Bowley coefficient of skewness as a robust alternative to the traditional definition of skewness. Kurtosis can also be calculated without relying on standard deviation as explained here: http://weber.ucsd.edu/~hwhite/pub_files/hwcv-092.pdf.

mfb
Posts: 950
Joined: Thu Jan 08, 2009 7:48 pm UTC

### Re: Why standard deviation?

If you have several values from a normal distribution, the probability for certain numbers depends on the sum of squares of their deviations from the mean value only. Try to get this nice feature with other definitions of the width of a distribution.
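
A small Python check of that feature, assuming a normal distribution with mean 0 and standard deviation 1 and three hand-picked samples (all names and numbers here are illustrative). The joint density is computed the long way, as a product of individual densities, so nothing about sums of squares is built in:

```python
import math
from statistics import NormalDist

def joint_density(values, mu, sigma):
    """Joint density of independent draws from Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return math.prod(dist.pdf(v) for v in values)

def sum_sq_dev(values, mu):
    return sum((v - mu) ** 2 for v in values)

mu, sigma = 0.0, 1.0
sample_a = [3.0, 4.0, 0.0]    # squared deviations: 9 + 16 + 0 = 25
sample_b = [-5.0, 0.0, 0.0]   # squared deviations: 25 + 0 + 0 = 25
sample_c = [1.0, 1.0, 1.0]    # squared deviations: 1 + 1 + 1 = 3

for s in (sample_a, sample_b, sample_c):
    print(sum_sq_dev(s, mu), joint_density(s, mu, sigma))
```

Samples a and b look nothing alike, yet they have the same sum of squared deviations and therefore the same joint density; sample c has a smaller sum and a higher density.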

Stickman
Posts: 90
Joined: Fri Mar 07, 2008 11:55 am UTC
Location: Decatur, Ga

### Re: Why standard deviation?

jtheoph wrote:There's an interesting page that argues that mean variation is actually better than standard deviation in real life data since it is less likely to magnify error values. However, the main advantage of mean variation is that it has a clear, intuitive meaning which makes it more useful to the people interpreting the data.

http ://www.leeds.ac.uk/educol/documents/00003759.htm

There are a few interesting points here, but there is a bit half-way where Gorard claims that any [infinite] super-population (that is, infinite population used to approximate a large finite population) must necessarily have an infinite variance. At that point, it became clear that he doesn't really understand how parameter estimation works, which invalidates most of his main points about the efficiency of the mean deviation (and I pretty much gave up at that point).

gorcee
Posts: 1501
Joined: Sun Jul 13, 2008 3:14 am UTC

### Re: Why standard deviation?

Standard deviation is also easier to calculate when it comes to studying stochastic processes, i.e., via the Cameron-Martin theorem, a solution for the polynomial chaos expansion coefficients (c_i) yields mu = c_0, sigma^2 = sum_{i=1..P} c_i^2 (assuming an orthonormal basis). So you can obtain statistical moments without actually having to generate statistics.

Dynotec
Posts: 89
Joined: Fri Dec 15, 2006 2:13 pm UTC
Contact:

### Re: Why standard deviation?

It depends on the data you're looking at. If the excess kurtosis of your data is zero (i.e., roughly normal), then fitting with least squares and thinking in terms of standard deviations is optimal under a bunch of criteria (efficiency of statistical inference, etc.). Generally, most data is going to be normal due to the CLT, with maybe one or two easily detectable outliers from contamination.

But if your data is Laplace distributed, then it's theoretically optimal to use mean absolute error. More generally, I remember reading an article claiming that the correct measure of dispersion is an L_p norm that depends on the kurtosis of the distribution in question according to some formula.
