University of Minnesota, Twin Cities School of Statistics Stat 3011 Rweb Textbook (Wild and Seber)

Stat 3011 (Geyer) In-Class Examples (Chapter 8)

General Instructions
Computer-Produced Confidence Intervals

General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions (that's already done for you). You do not have to select a dataset (that's already done for you).

Computer-Produced Confidence Intervals

R has many functions that produce confidence intervals and their related hypothesis tests. We don't know what hypothesis tests are yet. They are the subject of Chapter 9 in Wild and Seber. The only reason we mention tests now, is that the R procedures, although they do both tests and confidence intervals are named after the test. The two such functions we will need in this chapter are called t.test and prop.test.

One-Sample Confidence Intervals for Means (Example 8.2.1 in Wild and Seber)

Basic Usage of the `t.test` function

To make a 95% confidence interval for the population mean using a random sample that is in an R dataset named fred you just say t.test(fred).

For an example we will use the data for Example 8.2.1 in Wild and Seber, which is in the file passtime.txt.

Since the output is rather complicated, we copy it below with the part about the confidence interval highlighted

   Rweb:> t.test(passtime)

            One Sample t-test

   data:  passtime
   t = 21667.79, df = 19, p-value = < 2.2e-16
   alternative hypothesis: true mean is not equal to 0
   95 percent confidence interval:
    24.82615 24.83095             
   sample estimates:
   mean of x
    24.82855

You can ignore the rest of the output (until we cover tests in the next chapter).

Changing the Confidence Level

For a confidence level different from 95%, use the conf.level optional argument to the t.test function. This is documented on the R on-line help for t.test.

For example, to get a 99% confidence interval for the same data do

Comparison With Hand Calculation

Just out of curiosity, let's check that the t.test function actually calculates the same interval we would do by hand.

Indeed. Exactly the same.

Computer-Aided Hand Calculation

Note: In order to use the t.test function, you must have the whole data set. There is no version of the function that produces an interval from just the sample mean and standard deviation.

So if you are given just the sample mean, sample standard deviation, and sample size and told to calculate a confidence interval, you will have to do it by hand like we did in Chapter 7. You can't use the computer, except as a fancy calculator as in the example above. We call this computer-aided hand calculation.

Question: The sample of 20 passage times had sample mean 24.8286 and sample s. d. 0.0051. What is the 95% confidence interval for the population mean?

Answer:

Note that the confidence interval calculated by the expression in line 5 uses only the numbers 24.8286, 0.0051, and 20. It does not use the data vector passtime. And, except for rounding error in the fifth decimal place, it agrees with the result computed by t.test.

We could, of course boil this whole calculation down to one line, if we want to.

You may think having to do some intervals by hand is an imposition. Why not do them all using the computer? But hand calculation is a useful skill. There are a variety of situations where you are not given the whole data, perhaps you are reading a scientific paper in which the authors provide the sample mean and standard deviation but no confidence interval, and you want the confidence interval. Then you must do it by hand. Or perhaps the authors provide a confidence interval for one confidence level and you want one for a different confidence level. Again you must do it by hand.

One-Sample Confidence Intervals for Proportions (Example 8.3.1 in Wild and Seber)

Basic Usage of the `prop.test` function

To make a 95% confidence interval for the population proportion using a random sample of size n in which there were x "successes" (yeses, votes for Fred, whatever is being counted, so the sample proportion is x / n), you just say prop.test(x, n).

For an example we will use the data for Example 8.3.1 in Wild and Seber, in which n = 200 and the sample proportion is reported as 70%, so x is apparently 0.70 * 200 = 140.

Hence to calculate the confidence interval we do

Since the output is rather complicated, we copy it below with the part about the confidence interval highlighted

   Rweb:> prop.test(140, 200)

            1-sample proportions test with continuity correction

   data:  140 out of 200, null probability 0.5
   X-squared = 31.205, df = 1, p-value = 2.322e-08
   alternative hypothesis: true p is not equal to 0.5
   95 percent confidence interval:
    0.6306108 0.7615577           
   sample estimates:
     p
   0.7

You can ignore the rest of the output (until we cover tests in the next chapter).

Changing the Confidence Level

For a confidence level different from 95%, use the conf.level optional argument to the prop.test function. This is documented on the R on-line help for prop.test.

For example, to get a 99% confidence interval for the same data do

Comparison With Hand Calculation

Just out of curiosity, let's check that the prop.test function actually calculates the same interval we would do by hand. Surprise! It doesn't, as a comparison with the text of Example 8.3.1 in Wild and Seber clearly shows.

Interval calculated by prop.test (rounded to 3 decimal places) (0.631, 0.762)
Interval calculated by Wild and Seber (Example 8.3.1) (0.636, 0.764)
There's not much difference (only in the third decimal place), but the difference is not just sloppy calculation or rounding. R is doing something different from what the textbook describes.

It's a mystery. But before we can say more about the mystery, we'll have to cover something completely different. What prop.test actually does is too complicated to explain here. We'll just look at how well it works.

How Bad are Intervals for Proportions?

Wild and Seber in Section 8.3 and Appendix A3 make a shocking claim about the sample sizes necessary for acceptable confidence intervals for proportions using the type of interval described in the text and in our review of Chapter 7 in Wild and Seber.

We saw in the preceding section that prop.test doesn't use the ordinary interval. Presumably it uses something better. How much better?

The widely used rule of thumb disparaged in the footnote on p. 338 in Wild and Seber says that we can consider we are in ``large sample'' territory if both the number of successes and the number of failures are at least five. Does the rule of thumb work with the interval computed by prop.test?

The Rweb commands below calculate the actual coverage probability of the interval produced by prop.test and the ordinary interval (estimate plus or minus 1.96 standard errors).

The black jagged line in the plot is the coverage probability of the ordinary interval as a function of the true probability p.
The red jagged line in the plot is the coverage probability of the interval produced by prop.test (with a minor bug fix) as a function of the true probability p.
The green line is the nominal coverage probability of the intervals.

As can be seen from the plot, the ordinary interval does terrible and the prop.test interval (with bug fix) does fine.

Various things to change

Change n in the first line.
Change conf.level in the second line.
Change p in the third line to get a better look at a subinterval of (0, 1).

For example, change the first three lines to

n <- 30
conf.level <- 0.95
p <- seq(0.5, 0.7, 0.0005)

Looking at the n = 30 case shows that the prop.test interval (with bug fix) except for a small set of true population proportions near 0.32 and 0.68.

Conclusion

The ordinary interval (estimate plus or minus 1.96 standard errors) should not be used unless the sample size is really large (See Section 8.3 and Appendix A3 in Wild and Seber).
The prop.test interval (with bug fix) works well (is generally conservative) for sample sizes as small as 30 at the 90% and 95% confidence levels.

A bug fix for `prop.test`

The prop.test does something really stupid when the sample proportion is exactly zero.

   Rweb:> prop.test(0, 30) 
 
            1-sample proportions test with continuity correction  
 
   data:  0 out of 30, null probability 0.5  
   X-squared = 28.0333, df = 1, p-value = 1.192e-07  
   alternative hypothesis: true p is not equal to 0.5  
   95 percent confidence interval:
    0.003043053 0.141320480       
   sample estimates: 
   p  
   0

This means if the true population proportion p is actually below this interval, that is, 0 < p < 0.003043, then the confidence interval never covers the true p. So the nominal coverage probability is 95% and the actual coverage probability is 0%. Ooof! That's a whopping error.

You should change the interval so that the lower endpoint is zero (only when the sample proportion is zero). That is, you read the highlighted interval in the output, but you use

   95 percent confidence interval:
    0.0 0.141320480

The prop.test makes a similar dumb mistake when the sample proportion is exactly one.

   Rweb:> prop.test(30, 30) 
 
            1-sample proportions test with continuity correction  
 
   data:  30 out of 30, null probability 0.5  
   X-squared = 28.0333, df = 1, p-value = 1.192e-07  
   alternative hypothesis: true p is not equal to 0.5  
   95 percent confidence interval:
    0.8586795 0.9969569           
   sample estimates: 
   p  
   1

Change the confidence interval (only when the sample proportion is exactly one) to

   95 percent confidence interval:
    0.8586795 1.0

Two-Sample Confidence Intervals for Means (Example 8.4.1 in Wild and Seber)

To make a 95% confidence interval for the difference of two population means using two independent random samples that are in R dataset named fred and sally you just say t.test(fred, sally) (or the same thing with the names swapped).

For an example we will use the data for Example 8.4.1 in Wild and Seber, which is in the file thiol.txt. Unfortunately, this data file does not present the data in the form we want, so we have to do a little bit of work first.

Since the output is rather complicated, we copy it below with the part about the confidence interval highlighted

   Rweb:> t.test(rheumatoid, normal) 
 
            Welch Two Sample t-test  
 
   data:  rheumatoid and normal  
   t = 8.4772, df = 5.253, p-value = 0.0002937  
   alternative hypothesis: true difference in means is not equal to 0  
   95 percent confidence interval:
    1.082205 2.004938             
   sample estimates: 
   mean of x mean of y  
    3.465000  1.921429

You can ignore the rest of the output (until we cover tests in the next chapter).

Note that R is using the Welch procedure mentioned on p. 340 in Wild and Seber, which is better than the degrees of freedom calculation they explain (as they say). The Right Thing is too complicated to do by hand but easy for the computer.

Note: The R on-line help for t.test gives all the optional arguments, including conf.level, which is used to get a confidence level different from 95%.

Two-Sample Confidence Intervals for Proportions (Example 8.5.1 in Wild and Seber)

To make a 95% confidence interval for the difference of two population proportions using two independent random samples sizes n₁ and n₂ in which there were x₁ and x₂ "successes" (yeses, votes for Fred, whatever is being counted, so the sample proportions are x₁ / n₁ and x₂ / n₂) you just say

   prop.test(c(x1, x2), c(n1, n2))

For an example we will use the data for Example 8.5.1 in Wild and Seber. Unfortunately, this example does not present the data in the form we want, it gives only rounded sample proportions not the actual counts. The fault is not Wild and Seber's but that of the authors of the original study (too much rounding makes sensible reanalysis of the data impossible).

Let us just suppose the data were

x n
592 870
53 130
These numbers are consistent with the reported percentages (but so would be any number between 588 and 595 instead of 592). Then the following calculates the confidence interval.

x	n
592	870
53	130

Since the output is rather complicated, we copy it below with the part about the confidence interval highlighted

   Rweb:> prop.test(c(592, 53), c(870, 130)) 
 
            2-sample test for equality of proportions with continuity correction  
 
   data:  c(592, 53) out of c(870, 130)  
   X-squared = 35.5686, df = 1, p-value = 2.462e-09  
   alternative hypothesis: two.sided  
   95 percent confidence interval:
    0.1783704 0.3671645           
   sample estimates: 
      prop 1    prop 2  
   0.6804598 0.4076923

You can ignore the rest of the output (until we cover tests in the next chapter).

Note that as in the one-sample case, the prop.test function is doing something more complicated than the procedure described by Wild and Seber. Thus the warning that extremely large sample sizes are needed (p. 342 in Wild and Seber) does not apply to the intervals produced by prop.test.

For once, however, we will restrain ourselves and not produce the computations to see exactly how well prop.test does.

Note: The R on-line help for prop.test gives all the optional arguments, including conf.level, which is used to get a confidence level different from 95%.

Interval calculated by `prop.test` (rounded to 3 decimal places)	(0.631, 0.762)
Interval calculated by Wild and Seber (Example 8.3.1)	(0.636, 0.764)

Stat 3011 (Geyer) In-Class Examples (Chapter 8)

Contents

One-Sample Confidence Intervals for Means (Example 8.2.1 in Wild and Seber)

One-Sample Confidence Intervals for Proportions (Example 8.3.1 in Wild and Seber)

Conclusion

Two-Sample Confidence Intervals for Means (Example 8.4.1 in Wild and Seber)

Two-Sample Confidence Intervals for Proportions (Example 8.5.1 in Wild and Seber)