## Question about statistical hypothesis test


ludw
Posts: 11
Joined: Thu Nov 29, 2007 10:55 pm UTC

### Question about statistical hypothesis test

Today we had a class on hypothesis tests, but there's one thing that I don't understand.

We have two sets of hypotheses:
H0: u = 397
H1: u != 397
and,
H2: u = 397
H3: u > 397

In the example we worked on we had z0 = 1.7.
We wanted a significance level of 5%, so a = 0.05.

In the first set of hypotheses we have la/2 = 1.9600,
and since z0 is between -la/2 and la/2 we cannot reject H0.

In the second set of hypotheses we have la = 1.6449,
and since z0 > la we can reject H2.

The thing I can't understand is: why can we say that u > 397 when we can't say that u != 397?
Is there something terribly wrong in what I'm doing, or is there something I have misinterpreted?

Note:
I don't know if the forum supports UTF-8, so instead of Greek letters I used Latin ones.
u - small mu
a - small alpha
l - small lambda
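Edit: in case it helps anyone, here's a quick sketch of the two tests in Python using only the standard library (the numbers z0 = 1.7, a = 0.05, la/2 = 1.9600 and la = 1.6449 are the ones from the example above):

```python
from statistics import NormalDist

# Values from the example: z0 = 1.7, a = 0.05
z0 = 1.7
a = 0.05
inv_phi = NormalDist().inv_cdf  # inverse standard normal CDF

# Two-sided test (H0 vs H1): reject when |z0| > la/2
z_crit_two = inv_phi(1 - a / 2)   # la/2 ~ 1.9600
reject_h0 = abs(z0) > z_crit_two  # False: cannot reject H0

# One-sided test (H2 vs H3): reject when z0 > la
z_crit_one = inv_phi(1 - a)       # la ~ 1.6449
reject_h2 = z0 > z_crit_one       # True: reject H2

print(z_crit_two, reject_h0)
print(z_crit_one, reject_h2)
```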

Kizyr
Posts: 2070
Joined: Wed Nov 15, 2006 4:16 am UTC
Location: Virginia

### Re: Question about statistical hypothesis test

ludw wrote:The thing I can't understand is: why can we say that u > 397 when we can't say that u != 397?
Is there something terribly wrong in what I'm doing, or is there something I have misinterpreted?

It's pretty simple, really. It's the basic difference between a one-tailed and a two-tailed test. I'll try to explain this as best I can; hopefully I don't end up sounding confusing (I'm not a teacher, so it's hard to translate some of my thoughts into words on this--sorry). Anyway...

With the test of u != 397, the null cannot be rejected if x-bar is within 1.96 standard errors above or below 397.
With the test of u > 397, the null cannot be rejected unless x-bar is more than 1.64 standard errors above 397.
(x-bar = the sample's mean; the standard error is the standard deviation of x-bar)

The key thing to note is the 1.96 vs 1.64. The reason the threshold changes is that a two-tailed test has more to prove than a one-tailed test. In a two-tailed test, you're worried about values both above and below your hypothesized mean; in a one-tailed test, you're only worried about the values on one side of it. Basically, since you have more to prove with the (u != 397) test than with the (u > 397) test, the threshold is higher.
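You can see the same thing with p-values; here's a rough sketch (assuming z0 = 1.7 from the original post) showing that the same statistic clears the one-tailed 5% bar but not the two-tailed one:

```python
from statistics import NormalDist

z0 = 1.7
phi = NormalDist().cdf  # standard normal CDF

p_one_tailed = 1 - phi(z0)        # ~0.0446 < 0.05 -> reject
p_two_tailed = 2 * (1 - phi(z0))  # ~0.0891 > 0.05 -> do not reject
print(p_one_tailed, p_two_tailed)
```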

...does this make sense so far? I can try to clarify, but since I'm not an educator, it's a bit difficult to word things right. KF
~Kizyr

ludw
Posts: 11
Joined: Thu Nov 29, 2007 10:55 pm UTC

### Re: Question about statistical hypothesis test

Yeah, that makes sense.

But it still bugs me that u > 397 – in my mind at least – seems to imply that u != 397.
Or is it wrong to think like that about these tests?

nwinches
Posts: 6
Joined: Sat Nov 10, 2007 9:24 am UTC

### Re: Question about statistical hypothesis test

That's one thing that really bugged me about my stats class this year as well. Effectively, what you're testing against in the second case is not u = 397 but u <= 397. Since you're still using a = 0.05, you can allow a 5% chance of it being greater than 397 to reject H2, whereas in the first case you only allow a 2.5% chance of it being greater (and a 2.5% chance of it being less) in order to reject H0.

Kizyr
Posts: 2070
Joined: Wed Nov 15, 2006 4:16 am UTC
Location: Virginia

### Re: Question about statistical hypothesis test

ludw wrote:But it still bugs me that u > 397 – in my mind at least – seems to imply that u != 397.
Or is it wrong to think like that about these tests?

Well, it's not that it's wrong to think like that about it, but that the test itself is more nuanced than that.

Think of it this way... You have a true mean (u), and you're trying to figure out what that true mean is by measuring the mean of a sample (x-bar).

Keeping this in mind, go back to the first test:
H0: u = 397
H1: u != 397

You want to know if the true mean u != 397. This is true if you can infer, based on your sample (x-bar), that the true mean is 396, 395, 394, etc., or 398, 399, 400, etc. This will happen if x-bar is significantly above or below 397.

For the second test:
H0: u <= 397 (fix'd, because together the hypotheses should cover every possibility)
H1: u > 397

You want to know if the true mean u > 397. This is true if you can infer, based on your sample (x-bar), that the true mean is 398, 399, 400, etc., but not 396, 395, 394, etc. This will happen if x-bar is significantly above 397 only.

Because you're using a sample (x-bar) to infer what the true mean (u) is, in the second test you're only concerned with the range above 397 that x-bar has to fall outside of; if x-bar looks like it's below 397, it still doesn't reject the null hypothesis. Meanwhile, in the first test, you're concerned with the range both above and below 397 that x-bar has to fall outside of. In the first (two-tailed) test, the range x-bar has to escape in order to reject the null is wider.
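In terms of x-bar, the two "cannot reject" regions look something like this (a sketch; the standard error of 2.0 is a made-up value, just for illustration):

```python
from statistics import NormalDist

mu0 = 397.0
se = 2.0  # hypothetical standard error of x-bar, purely for illustration
z = NormalDist().inv_cdf

# Two-tailed: fail to reject while x-bar stays inside both bounds
two_tailed = (mu0 - z(0.975) * se, mu0 + z(0.975) * se)

# One-tailed: fail to reject for ANY x-bar up to the single upper bound
one_tailed_upper = mu0 + z(0.95) * se

print(two_tailed)        # roughly (393.08, 400.92)
print(one_tailed_upper)  # roughly 400.29
```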

...I really hope I haven't confused you further by this. But that second paragraph I started off with is perhaps one of the most important things you want to keep in mind when covering anything in stat. KF
~Kizyr

TemperedMartensite
Posts: 34
Joined: Tue Nov 20, 2007 4:32 am UTC

### Re: Question about statistical hypothesis test

Speaking from experience, the best introduction to stats has to be "The Cartoon Guide to Statistics". It clarifies the little things, like hypothesis testing. Of course you actually need a real textbook too, since it does its very best to ignore F-distributions etc. Other than that, Kizyr did a good job explaining; the second sentence especially.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

### Re: Question about statistical hypothesis test

Ah, time to confuse the issue.

Classical hypothesis testing is bunk!

Don't believe me? Let us compare 2 simple tests.

Couple A decides to have 8 kids. They have 7 sons and a daughter.

Couple B decides to have kids until they have both a son and a daughter, then they'll stop. They have 7 sons then a daughter.

Let's test the hypothesis that they are equally likely to have sons and daughters with both couple A and couple B.

There is one way that couple A could have 8 sons, and 8 ways they could have 7 sons and a daughter. There is one way they could have 8 daughters and 8 ways they could have 7 daughters and a son. Therefore there are 18 combinations of kids that are at least as unusual as what we saw, out of the 256 combinations of genders of kids they could have had. Under the null hypothesis all are equally likely, so the odds of seeing something at least as unusual as what we saw are 18/256, about 7%. At 95% confidence we would not reject the hypothesis that couple A is equally likely to have sons and daughters.

Under the null hypothesis, the only ways that couple B could get a result at least as unusual as what they saw is if their first 7 kids are all sons, or their first 7 kids are all daughters. The odds of that are 2/2^7 = 2/128 = 1.5625%. At 95% confidence we would reject the null hypothesis that couple B is equally likely to have sons and daughters.
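Don't trust my arithmetic? Both tail probabilities can be checked exactly (a quick sketch using exact rational arithmetic; comb(n, k) counts the ways of choosing which of the kids are, say, daughters):

```python
from fractions import Fraction
from math import comb

# Couple A: 8 kids; "at least as unusual" = 7 or 8 kids of either sex
extreme_a = 2 * (comb(8, 8) + comb(8, 7))  # (1 + 8) ways for each sex = 18
p_a = Fraction(extreme_a, 2 ** 8)          # 18/256, about 7.03%

# Couple B: stop at the first child of the second sex; "at least as
# unusual" = first 7 kids all the same sex
p_b = Fraction(2, 2 ** 7)                  # 2/128 = 1.5625%

print(float(p_a), float(p_b))
```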

Wait a minute! Didn't we just see 2 different couples do the same thing? Why are we drawing different conclusions? Well we're drawing different conclusions because we are given different statements about their behaviour. We therefore have a different set of probabilities to compare, and so we come to different answers.

Wait a minute! Should the extra information make a difference in our conclusions? Well, actually not. Suppose you start with some set of prior expectations about the universe. Such as, "I think there is a 90% chance that they are equally likely to have sons and daughters, a 5% chance that they'll have twice as many sons as daughters, and a 5% chance that they'll have twice as many daughters as sons." After observing the experiment your prior beliefs should get modified by experience. The exact way that you modify them is determined by Bayes' Theorem. And you can show that for any set of prior expectations, the conclusions that you make after the experiment will be identical for couples A and B!

The conclusion? Hypothesis testing causes us to factor in information that provably should be irrelevant to our final decisions!

Huh? How can this be?

Spoiler:
Everything that I've said is true, and is at the heart of a debate among statisticians about how to best do hypothesis testing. The ones who do not like classical hypothesis testing are called Bayesians. They have a number of alternatives. Their alternatives are, unfortunately, significantly more complicated to use and understand than hypothesis testing.

Hypothesis testing wins in the real world based on the fact that it is widely accepted and simple, no matter the subtle issues that might exist in it.

The true heart of the problem is that people want to use statistics to answer an impossible question. Namely, "Now that I've done this experiment, what are the odds that theory X is true?" That is impossible to answer. Hypothesis testing gets around it by asking a related question that you can answer: "Under the null hypothesis, what are the odds of getting evidence at least as strong against it as what I got?" Bayesian statistics gets around it by explicitly not answering the question, and instead giving you pretty graphs saying things like, "If you had this type of prior expectation, here is how your beliefs would change based on the evidence."
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

TemperedMartensite
Posts: 34
Joined: Tue Nov 20, 2007 4:32 am UTC

### Re: Question about statistical hypothesis test

Obviously you called other data into play when basing your conclusion on it. Not to mention a small n that invalidates your conclusion: normality cannot even be assumed, since n < ~30. Besides that, you are factoring in information that in no way belongs there. By factoring in the order of your data, you are testing a completely different hypothesis; it should have remained an nCr calculation, as in your first example, due to your null hypothesis AND the implicit assumption of normality.

Either way I agree that hypothesis testing is pretty sketchy. Only time I would ever use it would be for stats class.

Also, I thought an argument like that ended with "And that's what separates the Bayes from the Men".

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

### Re: Question about statistical hypothesis test

TemperedMartensite wrote:Obviously you called other data into play when basing your conclusion on it. Not to mention a small n that invalidates your conclusion: normality cannot even be assumed, since n < ~30. Besides that, you are factoring in information that in no way belongs there. By factoring in the order of your data, you are testing a completely different hypothesis; it should have remained an nCr calculation, as in your first example, due to your null hypothesis AND the implicit assumption of normality.

Nowhere did I use normal approximations. All probability calculations were exact so small n does not enter into it. The reason why ordering of the data entered in for couple B was due to experimental design - they were going to stop having kids after they had both a boy and a girl, so, for instance, "boy, girl" would just stop there and we'd never have known whether the next 50 might have been boys.
TemperedMartensite wrote:Either way I agree that hypothesis testing is pretty sketchy. Only time I would ever use it would be for stats class.

You'd be amazed at how useful it is in the real world...
TemperedMartensite wrote:Also I thought an argument like that ended with "And thats what separates the Baye's from the Men"

You should be ashamed of yourself! (And yes, I'll have to use that line sometime.)
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

TemperedMartensite
Posts: 34
Joined: Tue Nov 20, 2007 4:32 am UTC

### Re: Question about statistical hypothesis test

btilly wrote:
TemperedMartensite wrote:Either way I agree that hypothesis testing is pretty sketchy. Only time I would ever use it would be for stats class.

You'd be amazed at how useful it is in the real world...

Every time I have even thought about using it, I always realize that it would be much easier to take the mathematical model I have created and simulate with it, attempting to reproduce my results on a much, much larger scale. I imagine that certain sciences would lend themselves far better to hypothesis testing – unlike mine.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

### Re: Question about statistical hypothesis test

TemperedMartensite wrote:
btilly wrote:
TemperedMartensite wrote:Either way I agree that hypothesis testing is pretty sketchy. Only time I would ever use it would be for stats class.

You'd be amazed at how useful it is in the real world...

Every time I have even thought about using it, I always realize that it would be much easier to take the mathematical model I have created and simulate with it, attempting to reproduce my results on a much, much larger scale. I imagine that certain sciences would lend themselves far better to hypothesis testing – unlike mine.

By contrast I have personally done hypothesis tests whose outcome resulted in millions of dollars for my employer. But then again you can't form a mathematical model of user behaviour on a website.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

TemperedMartensite
Posts: 34
Joined: Tue Nov 20, 2007 4:32 am UTC

### Re: Question about statistical hypothesis test

btilly wrote:By contrast I have personally done hypothesis tests whose outcome resulted in millions of dollars for my employer. But then again you can't form a mathematical model of user behaviour on a website.

By contrast to that, I have successfully simulated the deposition of tidal rhythmites laid down approximately 100 million years ago, proving that the area was tidal, so the sand (full of oil, of course) is probably awesome, without a lot of mud and carbonaceous crap.

Good times with statistics.

Cosmologicon
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA

### Re: Question about statistical hypothesis test

btilly wrote:Wait a minute! Should the extra information make a difference in our conclusions? Well, actually not. Suppose you start with some set of prior expectations about the universe. Such as, "I think there is a 90% chance that they are equally likely to have sons and daughters, a 5% chance that they'll have twice as many sons as daughters, and a 5% chance that they'll have twice as many daughters as sons." After observing the experiment your prior beliefs should get modified by experience. The exact way that you modify them is determined by Bayes' Theorem. And you can show that for any set of prior expectations, the conclusions that you make after the experiment will be identical for couples A and B!

I believe you, but I found it surprising, so I tried to show it for your example. I failed. Can you tell me where I went wrong?

H1 is the hypothesis that sons and daughters are equally likely (prior of 90%)
H2 is the hypothesis that sons are twice as likely as daughters (prior of 5%)
Since the other one is symmetric, I'll just double the values for H2

For couple A:
Event C is having at least 7/8 children the same sex.
P(C) = P(C|H1) P(H1) + 2 P(C|H2) P(H2) = (18/256)(9/10) + 2(1297/6561)(1/20) = 697457/8398080
P(H1|C) = P(C|H1) P(H1) / P(C) = (18/256)(9/10) / P(C) = 531441/697457 = 0.762

For couple B:
Event C is their first 7 children all being the same sex
P(C) = P(C|H1) P(H1) + 2 P(C|H2) P(H2) = (1/64)(9/10) + 2(43/729)(1/20) = 9313/466560
P(H1|C) = P(C|H1) P(H1) / P(C) = (1/64)(9/10) / P(C) = 6561/9313 = 0.705
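In case anyone wants to check my working, here it is as a Python sketch with exact fractions (H3 is the twice-as-many-daughters hypothesis, which I folded in by symmetry above but spell out here):

```python
from fractions import Fraction as F
from math import comb

priors = {"H1": F(9, 10), "H2": F(1, 20), "H3": F(1, 20)}
p_son = {"H1": F(1, 2), "H2": F(2, 3), "H3": F(1, 3)}

def posterior_h1(likelihood):
    """P(H1 | C) by Bayes' Theorem, for a given P(C | H) function."""
    num = likelihood("H1") * priors["H1"]
    return num / sum(likelihood(h) * priors[h] for h in priors)

# Couple A: C = at least 7 of the 8 kids are the same sex
def like_a(h):
    p, q = p_son[h], 1 - p_son[h]
    return sum(comb(8, k) * p**k * q**(8 - k) for k in (0, 1, 7, 8))

# Couple B: C = the first 7 kids are all the same sex
def like_b(h):
    p, q = p_son[h], 1 - p_son[h]
    return p**7 + q**7

print(posterior_h1(like_a))  # 531441/697457, about 0.762
print(posterior_h1(like_b))  # 6561/9313, about 0.705
```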

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

### Re: Question about statistical hypothesis test

TemperedMartensite wrote:
btilly wrote:By contrast I have personally done hypothesis tests whose outcome resulted in millions of dollars for my employer. But then again you can't form a mathematical model of user behaviour on a website.

By contrast to that, I have successfully simulated the deposition of tidal rhythmites laid down approximately 100 million years ago, proving that the area was tidal, so the sand (full of oil, of course) is probably awesome, without a lot of mud and carbonaceous crap.

Good times with statistics.

And you'll have better times if you understand it than if you don't.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

Cosmologicon
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA

### Re: Question about statistical hypothesis test

Sorry to bump this thread, but I wonder if you missed my question above? I checked my math and I don't see anything wrong, but it's quite possible I'm misapplying Bayes' Theorem.

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

### Re: Question about statistical hypothesis test

Cosmologicon wrote:Sorry to bump this thread, but I wonder if you missed my question above? I checked my math and I don't see anything wrong, but it's quite possible I'm misapplying Bayes' Theorem.

Sorry, I had missed the question.

To properly apply Bayes' Theorem you need to input the complete observation. For both couples, you observed the same thing: 7 sons, then a daughter. You therefore perform the same calculation to work out the conditional probability of the observed outcome under any particular hypothesis, and therefore you come out with the same conclusion.

The extra information about their intentions does not enter into the calculation in any way.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

Cosmologicon
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA

### Re: Question about statistical hypothesis test

Thanks a lot for responding, but that's it?? You said "you can show that for any set of prior expectations, the conclusions that you make after the experiment will be identical for couples A and B!" If you ignore the intentions of the couples, isn't that a completely trivial statement? What's there left to show? I have to say, with the italics and exclamation point and all, you made it sound to me like there was something special going on.

If you ignore the intentions and go only on observations, then you'll get the same result no matter whether you're using Bayesian analysis or hypothesis testing. So what does this example prove?

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

### Re: Question about statistical hypothesis test

Cosmologicon wrote:Thanks a lot for responding, but that's it?? You said "you can show that for any set of prior expectations, the conclusions that you make after the experiment will be identical for couples A and B!"? If you ignore the intentions of the couples, isn't that a completely trivial statement? What's there left to show? I have to say, with that italics and exclamation point and all, you made it sound to me like there was something special going on.

If you ignore the intentions and go only on observations, then you'll get the same result no matter whether you're using Bayesian analysis or hypothesis testing. So what does this example prove?

You're missing that you can't go only on observations with hypothesis testing. You have to take intentions into account in some way. You might not think about how you're bringing intentions into it, but you are doing so implicitly, whether you like it or not.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.

Cosmologicon
Posts: 1806
Joined: Sat Nov 25, 2006 9:47 am UTC
Location: Cambridge MA USA

### Re: Question about statistical hypothesis test

Well, I don't see why that's the case at all. In the example you gave with the two couples there was nothing forcing you to take their intentions into account. You could just as easily have ignored their intentions, in which case you'd analyze them both as you would couple A. And conversely there's nothing with Bayesian analysis that forces you to exclude intentions. You could just as easily do it like I did, incorrectly.

It also seems circular the way you've decided that it's incorrect to include intentions in the first place. You say that's the case because Bayesian analysis proves they don't matter. But they only don't matter because Bayesian analysis doesn't include them. And it only doesn't include them because it's incorrect to!

Logodaedalus
Posts: 34
Joined: Thu Aug 16, 2007 11:21 pm UTC

### Re: Question about statistical hypothesis test

I have to say I'm with Cosmologicon here. I don't think this problem says anything about Bayesian vs. Frequentist statistics -- I think it just highlights the fact that the information you use to arrive at a conclusion affects your conclusion.

Furthermore, I'm inclined to say that the knowledge about the couples' behavior *should* influence our conclusion, and that it's perfectly possible to construct a Bayesian model that does so. In fact, any respectable Bayesian would likely tell you that it would be irresponsible not to use all available information.

EDIT: to be more specific, the fact that couple B made it to their 8th child *is valuable information*, and counts as evidence against the null hypothesis. If we knew a priori that they were going to have eight children, then the mere presence of 8 children would be without evidential value. So "intention" is critical, since it affects the generating mechanism based on which we're inferring.
"I feel like a quote out of context"

btilly
Posts: 1877
Joined: Tue Nov 06, 2007 7:08 pm UTC

### Re: Question about statistical hypothesis test

Cosmologicon wrote:Well, I don't see why that's the case at all. In the example you gave with the two couples there was nothing forcing you to take their intentions into account. You could just as easily have ignored their intentions, in which case you'd analyze them both as you would couple A. And conversely there's nothing with Bayesian analysis that forces you to exclude intentions. You could just as easily do it like I did, incorrectly.

Let me address those in reverse order.

As you note, you applied Bayes' Theorem incorrectly. In Bayesian analysis you're supposed to apply it correctly. Meaning you apply it to exactly what you observed. If you do that, then there is no place for your notions about what you think is relevant to affect your analysis.

You got different numbers because for couple B you didn't include the data point of the 8th child, a daughter. However, that is a data point that you have, and therefore a correctly done Bayesian analysis must include it.
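To make that concrete, here is a sketch of the calculation done on the complete observation, 7 sons then a daughter. Any design-dependent constant (such as a binomial coefficient for a fixed-8-kids design) would multiply every hypothesis's likelihood equally and cancel in the ratio, so both couples give the same posterior:

```python
from fractions import Fraction as F

priors = {"H1": F(9, 10), "H2": F(1, 20), "H3": F(1, 20)}
p_son = {"H1": F(1, 2), "H2": F(2, 3), "H3": F(1, 3)}

# The complete observation, for BOTH couples: S,S,S,S,S,S,S then D
def likelihood(h):
    p = p_son[h]
    return p**7 * (1 - p)

den = sum(likelihood(h) * priors[h] for h in priors)
posterior_h1 = likelihood("H1") * priors["H1"] / den
print(posterior_h1)  # one number, whichever couple produced the data
```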

Now to hypothesis testing. In hypothesis testing you set up a universe of possible answers, and you're comparing what you did observe with what you think you might have observed instead. When you try to ignore intentions, you're incorrectly analyzing what you might have observed instead. In short, while you can try to ignore intentions, by doing so you've violated the hypothesis testing methodology.
Cosmologicon wrote:It also seems circular the way you've decided that it's incorrect to include intentions in the first place. You say that's the case because Bayesian analysis proves they don't matter. But they only don't matter because Bayesian analysis doesn't include them. And it only doesn't include them because it's incorrect to!

No, it is not circular. It only looks that way.

If you have 2 hypotheses, and a priori beliefs about their relative likelihood, there is no question about how you should modify your beliefs in the face of evidence. You should modify them in accordance with Bayes' Theorem. You should also include all of the evidence that you can include, because it is all pertinent.

But once you include all of the evidence that you can include and apply Bayes' Theorem, you get an answer. There is no room to get a different answer, you have the one answer. And that answer has absolutely no room to be affected by differences in experimental setup. This method of drawing inferences is provably correct.

What about hypothesis testing? Hypothesis testing answers a different question. It tells you, "Under the null hypothesis, how likely is it that I'd have seen something at least as unlikely as what I saw?" It explicitly is not telling us how likely the null hypothesis is. It is merely answering a question that we think relates.

No theorem of probability theory tells us that hypothesis testing is a correct way to draw inferences. No theorem can possibly do so because we apply it in situations where it is impossible to quantify the odds that the null hypothesis is correct. It is a methodology that we use and hope works well. However results like this one demonstrate that it pulls in factors that we don't really want to pull in when drawing inferences.

Which is at least worth thinking about.
Logodaedalus wrote:Furthermore, I'm inclined to say that the knowledge about the couples' behavior *should* influence our conclusion, and that it's perfectly possible to construct a Bayesian model that does so. In fact, any respectable Bayesian would likely tell you that it would be irresponsible not to use all available information.

Fine. Tell me how to construct a calculation using Bayes' Theorem that does so.
Logodaedalus wrote:EDIT: to be more specific, the fact that couple B made it to their 8th child *is valuable information*, and counts as evidence against the null hypothesis. If we knew a priori that they were going to have eight children, then the mere presence of 8 children would be without evidential value. So "intention" is critical, since it affects the generating mechanism based on which we're inferring.

The fact that they made it to their 8th child is useful information. However we have more information to include, namely that they stopped at their 8th child. When you factor in all of the information we have, you get the same exact answer for couples A and B.
Some of us exist to find out what can and can't be done.

Others exist to hold the beer.