## 3 Statistics Questions

### 3 Statistics Questions

I do not have that p-with-a-hat symbol, so I am going to use p instead.

First, if 1-p = q and E = Zc * square root of (pq/n), then it follows that E = Zc * square root of ((p - p^2)/n), so why should I even bother finding q?

Second, if I am using p to estimate p, and P(-E < p-p < E) = c, then it follows P(-E < 0 < E) = c, so why is the former given instead of the latter?

Third:
In the population p1, in couples that have not been able to conceive, infertility problems have been attributed to the men 50% of the time and the women 50% of the time. In the population p2, in couples that have not been able to conceive, infertility problems have been attributed to the men 60% of the time and the women 40% of the time. This change is not because men in p2 are more likely to have fertility problems than men in p1; rather, women in p2 are less likely to have fertility problems than women in p1. How much more likely is a woman in p1 to have a fertility problem than a woman in p2?

The reason I cannot find the answer is that I do not know how a couple whose fertility problems were attributed to both the man and the woman is counted.

### Re: 3 Statistics Questions

Can you just go ahead and define all your variables to start with? All of those letters can have different meanings in different contexts.

More generally, "why should I bother" questions are sometimes answered with, "In many cases, you shouldn't." If you're not using q for anything else, don't worry about what it is and just do everything in terms of p. You're listing identities and definitions, which are not all going to be equal in any one problem. There are multiple different ways to express the same information, and different ways are useful in different situations.
### Re: 3 Statistics Questions

For 1, pq requires less ink to write than p-p^2. Concise notation is a beautiful thing. Maybe concise means, "Let's use fewer symbols / less ink." But maybe it means, "Let's use fewer distinct symbols." Either is absolutely fine. You can use p-p^2 all over the place, completely ignore q. No problem. This distinction is just aesthetic.
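To see that the two forms really are interchangeable, here is a minimal sketch that computes the margin of error both ways. The function names and the sample numbers (p = 0.4, n = 500, Zc = 1.96) are made up for illustration and are not from the original question.

```python
import math

def margin_with_q(p, n, z_c):
    """Margin of error written with q = 1 - p."""
    q = 1 - p
    return z_c * math.sqrt(p * q / n)

def margin_without_q(p, n, z_c):
    """Same margin of error written only in terms of p."""
    return z_c * math.sqrt((p - p**2) / n)

# Hypothetical values chosen only for illustration.
e1 = margin_with_q(0.4, 500, 1.96)
e2 = margin_without_q(0.4, 500, 1.96)
print(e1, e2)  # the two values agree up to floating-point rounding
```

Whether you carry q around or not changes nothing numerically; it only changes how much you have to write.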

For 2, P(-E < 0 < E) is not equal to c, it's equal to 1. Of course zero is between a positive and a negative number. That is an absolute fact!
The thing that is interesting is that you have a certain confidence that the population value is within some range of the sample value.
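A rough simulation can make the distinction concrete. The sketch below assumes a true population proportion (0.3), a sample size (500), and Zc = 1.96 for roughly 95% confidence; all of these are hypothetical values picked for illustration. It draws many samples and counts how often the sample proportion lands within E of the population proportion.

```python
import math
import random

def coverage(p_true, n, z_c, trials, seed=0):
    """Fraction of simulated samples where -E < p_hat - p_true < E."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute the sample proportion.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        # Margin of error computed from the sample proportion.
        e = z_c * math.sqrt(p_hat * (1 - p_hat) / n)
        if -e < p_hat - p_true < e:
            hits += 1
    return hits / trials

# Hypothetical population value and sample size.
rate = coverage(p_true=0.3, n=500, z_c=1.96, trials=2000)
print(rate)  # should land near the confidence level, not at 1
```

The two p's in the original statement are different quantities (one from the sample, one from the population), which is why their difference is not simply 0.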

For three, of course you don't. If you knew, you would just check the couple, you wouldn't be making an inference. So instead, if you have a couple from group 1, you are deciding whose fault it is with a coin toss, and for group 2, with a weighted coin toss.
### Re: 3 Statistics Questions

For #3, they're saying that the ratio changed, but the actual number of men with fertility problems in p2 didn't change; it's exactly the same as p1. So the question is, what # of women with fertility problems in p2 do you need so that you get the right ratio but have the same # of men? Then, compare that with the # of women in p1 to get the answer.

So, say in p1 there are 1000 men and 1000 women, and 10% of the men have fertility problems (so 100 infertile men total). This means there are also 100 infertile women, due to the 50/50 ratio. Then in p2 we again have 1000 men and 1000 women, and still 10% of the male population is infertile (100 total). How many women are infertile, to make the men/women ratio 60/40?
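One way to check the arithmetic, as a sketch only: it assumes every case is attributed to exactly one partner (the original question leaves open how couples with both partners affected are counted), that both populations have the same number of women, and it uses the 100 infertile men from the example above. The helper name `infertile_women` is made up for illustration.

```python
from fractions import Fraction

def infertile_women(infertile_men, men_share):
    """Women's count that gives men the stated share of all cases.

    Assumes each case is attributed to exactly one partner.
    """
    return infertile_men * (1 - men_share) / men_share

men = Fraction(100)  # same number of infertile men in p1 and p2

women_p1 = infertile_women(men, Fraction(1, 2))  # 50/50 split
women_p2 = infertile_women(men, Fraction(3, 5))  # 60/40 split

# With equally many women in each population, the ratio of counts
# is also the ratio of likelihoods.
ratio = women_p1 / women_p2
print(women_p1, women_p2, ratio)
```

Under those assumptions, the ratio tells you how many times as likely a woman in p1 is to have a fertility problem as a woman in p2.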
