University of Minnesota, Twin Cities School of Statistics Stat 3011 Rweb Textbook (Wild and Seber)
In Section 8.5 of Wild and Seber a valiant attempt is made to clarify some extremely important issues which are generally ignored by other introductory statistics books (for which they deserve cheers). But last year's experience was that more was needed than just those few pages. Hence this page.
Statistical accuracy varies as the square root of the sample size (sample sizes when two samples are involved).
What the sample size of a poll is depends on what you are talking about. If a proportion is for a subgroup, then the relevant sample size is the size of the subgroup, not the whole sample.
Wild and Seber mention this on p. 350, but it needs more emphasis.
If the margin of error of a poll is reported as 3%, then that margin of error only applies to questions involving the whole sample.
For questions involving a subgroup, you must multiply the margin of error by the square root of the whole sample size over the subgroup size. For example, if the margin of error is reported as 3% and a question involves a subgroup that is only one-tenth of the sample, then the margin of error must be multiplied by sqrt(10) = 3.162. So the relevant margin of error for proportions in this subgroup is 9.5% rather than 3%.
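The subgroup adjustment is easy to check in R. This is only a sketch: the function name subgroup_moe is made up for illustration (it is not part of R or Rweb), and the sample size of 1000 is an assumption, since only the ratio of whole sample to subgroup matters.

```r
# Margin of error for a subgroup, given the margin of error reported
# for the whole sample. Hypothetical helper, not a built-in function.
subgroup_moe <- function(moe, n_whole, n_sub) {
    moe * sqrt(n_whole / n_sub)
}

# Reported margin of error 3%, subgroup is one-tenth of the sample.
# The 1000 and 100 are assumed sizes; any 10-to-1 ratio gives the same answer.
subgroup_moe(0.03, 1000, 100)
```

The result is about 0.095, that is, 9.5%.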
In Figure 8.5.1 and Table 8.5.5 Wild and Seber make a valiant attempt to clarify a rather confusing situation. Past experience says their discussion is clear as mud. So here's my try.
There are three situations, labeled (a), (b), and (c) in Table 8.5.5 in Wild and Seber. All we need is to be able to identify the situations as they arise so we can use the correct formula. Here's my description.
Type (a): two separate polls. Each poll is done independently of the other, which is why Wild and Seber call this "two independent samples". Note that this is the case we have known about since the end of Chapter 7. So far it is the only interval for a difference of proportions that we have studied. It is also the only kind of interval the R function prop.test knows about.
The following two kinds of intervals are completely new. We have never seen them before.
Type (b): Of course, if there's only one poll, there is only one sample size. The phrase "several response categories" vaguely refers to the fact that we are comparing different answers ("response categories") to the same question.
Type (c): Of course, if there's only one poll, there is only one sample size. The phrase "many yes/no items" is rather misleading. Whether the questions are yes/no or multiple guess is irrelevant. The point is that we are comparing answers to different questions.
Suppose in two polls a month apart 50% of likely voters said they favored Jones in the first poll and only 45% in the second poll. Both polls had a sample size of 1000. What is a 2 se interval for the difference?
The result from Rweb is (0.005, 0.095). Since this interval only contains positive numbers, it looks like the true population proportion has decreased in the month between the two polls.
Note: this is case (a) in Wild and Seber's classification, the one we have long known how to do.
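For anyone who wants to redo the calculation by hand, here is a minimal sketch in R. The function name interval_a is hypothetical; the standard error is the usual two-independent-samples formula, which I am assuming is what the Rweb calculation used.

```r
# 2 se interval for p1 - p2 from two independent samples (type (a)).
# Hypothetical helper; standard error is
#   sqrt(p1 (1 - p1) / n1 + p2 (1 - p2) / n2)
interval_a <- function(p1, p2, n1, n2) {
    se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    (p1 - p2) + c(-2, 2) * se
}

# Jones at 50% in the first poll and 45% in the second, n = 1000 each.
round(interval_a(0.50, 0.45, 1000, 1000), 3)
```

This gives (0.005, 0.095). Calling prop.test on the same counts would give a slightly different interval, since by default it uses a 95% confidence level (1.96 rather than 2 standard errors) and a continuity correction.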
Another note: if the proportions involve subgroups, then n1 and n2 are the subgroup sizes, not the whole sample sizes! (See also the first section of this page and the last section of this page.)
Suppose in one poll 50% of likely voters said they favored Jones, 45% said they favored Smith and 5% were undecided. The poll had a sample size of 1000. What is a 2 se interval for the difference between Jones and Smith?
The result from Rweb is (-0.012, 0.112).
Note that this is very different from (quite a bit wider than) the interval for the type (a) example, although everything is the same except the type. The sample sizes are all the same (1000). The sample proportions p1 and p2 are the same. The only thing different is the formula for the standard error of p1 - p2, which is very different for type (a) and type (b) problems.
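Here is a sketch of the type (b) calculation in R. The function name interval_b is hypothetical, and the standard error formula is the one I take to be in Table 8.5.5 for this case, sqrt((p1 + p2 - (p1 - p2)^2) / n); it is the standard result for two categories of the same question.

```r
# 2 se interval for p1 - p2 when both proportions are answers to the
# same question in one sample of size n (type (b)).
# Hypothetical helper; assumed standard error is
#   sqrt((p1 + p2 - (p1 - p2)^2) / n)
interval_b <- function(p1, p2, n) {
    se <- sqrt((p1 + p2 - (p1 - p2)^2) / n)
    (p1 - p2) + c(-2, 2) * se
}

# Jones 50%, Smith 45%, one poll with n = 1000.
round(interval_b(0.50, 0.45, 1000), 3)
```

This gives (-0.012, 0.112), wider than the type (a) interval because the two proportions come from the same respondents and move against each other.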
Another note: if the proportions involve a subgroup, then n is the subgroup size, not the whole sample size! (See also the first section of this page and the last section of this page.)
Suppose in one poll 50% of respondents said they liked Dilbert and 45% said they liked Doonesbury. The poll had a sample size of 1000. What is a 2 se interval for the true population difference between liking for these two comic strips?
Note that these are answers to two different questions. Some people like both. Some people like neither. Some only one or the other. So this is a problem of type (c).
The result from Rweb is (-0.012, 0.112).
Here the type (b) and type (c) standard error formulas give exactly the same result, so the intervals for our two examples are exactly the same.
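A sketch of the type (c) calculation in R follows. The function name interval_c is hypothetical, and the standard error is the one I take to be in Table 8.5.5 for this case, with q1 = 1 - p1 and q2 = 1 - p2: sqrt((min(p1 + p2, q1 + q2) - (p1 - p2)^2) / n).

```r
# 2 se interval for p1 - p2 when the proportions are answers to two
# different questions in one sample of size n (type (c)).
# Hypothetical helper; assumed standard error is
#   sqrt((min(p1 + p2, q1 + q2) - (p1 - p2)^2) / n)
interval_c <- function(p1, p2, n) {
    q1 <- 1 - p1
    q2 <- 1 - p2
    se <- sqrt((min(p1 + p2, q1 + q2) - (p1 - p2)^2) / n)
    (p1 - p2) + c(-2, 2) * se
}

# Dilbert 50%, Doonesbury 45%, one poll with n = 1000.
round(interval_c(0.50, 0.45, 1000), 3)
```

Since p1 + p2 = 0.95 is smaller than q1 + q2 = 1.05 here, the minimum picks p1 + p2 and the result is (-0.012, 0.112).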
The type (b) and type (c) standard error formulas give quite different results if p1 and p2 are both large. Change them to 0.80 and 0.75 in both examples and see how different the results are.
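The comparison can be sketched in R as follows. The names se_b and se_c are hypothetical, and the formulas are the ones I take to be in Table 8.5.5. Answers to a single question cannot really add up to more than 100%, so for type (b) this is purely an exercise with the formula.

```r
# Standard errors of p1 - p2 for one sample of size n.
# Hypothetical helpers; formulas assumed from Table 8.5.5.
se_b <- function(p1, p2, n) {
    sqrt((p1 + p2 - (p1 - p2)^2) / n)
}
se_c <- function(p1, p2, n) {
    sqrt((min(p1 + p2, (1 - p1) + (1 - p2)) - (p1 - p2)^2) / n)
}

# Same result when p1 + p2 is less than 1 ...
c(b = se_b(0.50, 0.45, 1000), c = se_c(0.50, 0.45, 1000))

# ... but quite different when both proportions are large.
c(b = se_b(0.80, 0.75, 1000), c = se_c(0.80, 0.75, 1000))
```

With 0.80 and 0.75 the type (c) standard error is roughly half the type (b) one, because the minimum now picks q1 + q2 = 0.45 instead of p1 + p2 = 1.55.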
Another note: if the proportions involve a subgroup, then n is the subgroup size, not the whole sample size! (See also the first section of this page and the last section of this page.)
Wild and Seber call these "mental adjustments" (p. 350). They are invaluable in reading about polls in newspapers and magazines or even watching TV news about polls.
Let's redo the three examples above the quick and dirty way.
Suppose the reported margins of error for the polls are 3% (which is about right for sample size 1000).
For the type (a) example the interval is 5% plus or minus 1.5 times 3%, which is 5% plus or minus 4.5%.
That's (0.005, 0.095), which happens to be the same as the exact calculation (!) to this many significant figures.
For the type (b) and (c) examples the interval is 5% plus or minus 2 times 3%, which is 5% plus or minus 6%.
That's (-0.01, 0.11), which is not so different from the result (-0.012, 0.112) of the exact calculation.
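The quick and dirty arithmetic above, written out in R as a sketch. The variable names are made up; the multipliers 1.5 and 2 are the ones used in the text.

```r
# Reported margin of error and observed difference of proportions.
moe <- 0.03
d <- 0.50 - 0.45

# Type (a): difference plus or minus 1.5 times the margin of error.
quick_a <- d + c(-1, 1) * 1.5 * moe

# Types (b) and (c): difference plus or minus 2 times the margin of error.
quick_bc <- d + c(-1, 1) * 2 * moe

round(quick_a, 3)
round(quick_bc, 3)
```

These come out as (0.005, 0.095) and (-0.01, 0.11).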
It's not so easy to do in your head the square roots required for subgroup calculations (first section of this page).
In their section on "mental adjustments" Wild and Seber suggest just taking the closest fraction that is a perfect square. So for our example above, instead of using sqrt(10) = 3.162, they say to use sqrt(9) = 3.
However, if you don't know all the perfect squares, this may still be hard. Maybe it's best to use a calculator.
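To see how little the perfect-square shortcut costs in the example above, here is a small sketch in R (the variable names are made up):

```r
# Reported margin of error for the whole sample.
moe <- 0.03

# Exact adjustment for a subgroup one-tenth of the sample,
# and the rough version using the nearby perfect square 9.
exact <- moe * sqrt(10)
rough <- moe * sqrt(9)

round(c(exact = exact, rough = rough), 3)
```

The exact figure is about 9.5% and the rough one is 9%, which is close enough for reading a newspaper.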
If the proportions in a difference of proportions problem involve subgroups, then the n1 and n2 for a type (a) problem or the n for a type (b) problem are the subgroup sizes, not the whole sample sizes!
The "quick and dirty" calculation or "mental adjustment" is to just apply both adjustments.
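As a sketch of what applying both adjustments looks like, here is a made-up example in R: a reported margin of error of 3% and a subgroup that is one-quarter of the sample. These numbers and the variable names are assumptions for illustration only, and for the type (a) case I am assuming the subgroup is the same fraction of both polls.

```r
# Reported margin of error for the whole sample (assumed).
moe <- 0.03

# First adjustment: subgroup is one-quarter of the sample,
# so multiply by sqrt(4) = 2.
moe_sub <- moe * sqrt(4)

# Second adjustment: multiply by 1.5 for a type (a) difference,
# or by 2 for a type (b) or (c) difference.
moe_diff_a <- 1.5 * moe_sub
moe_diff_bc <- 2 * moe_sub

c(subgroup = moe_sub, type_a = moe_diff_a, type_bc = moe_diff_bc)
```

So a 3% margin of error becomes 6% for the subgroup, and then 9% or 12% for a difference of proportions within it.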