PC Labs for SO5041: Week 6

Week 6: Sampling distributions and confidence intervals

Sampling distributions

Links to the sampling distribution applications:

Confidence Intervals

Confidence intervals are bands around the point estimate (e.g. sample mean, median, proportion) within which we are reasonably confident the true population value lies. "Reasonably" often means 95% confident, or 99% confident, which is to say that if we drew many samples and built an interval from each, about 95 or 99 out of every hundred intervals, respectively, would contain the true value.

We calculate a CI as the point estimate (e.g. sample mean) plus or minus Z times the standard error.

The Standard Error is estimated as the sample standard deviation divided by the square root of the sample size.

Z depends on the Confidence Coefficient: it is the z score from the standard normal distribution for which 95% or 99% of the distribution lies in the range -Z to +Z. For 95% we want to find the z score corresponding to a "right tail" of 0.025 (the right and left tails together make 0.05 = 1 - 0.95). For 99% we want a right tail of 0.005 (half of 1%).
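As an alternative to reading Z from a table, the calculation described above can be sketched in Stata itself. This is only an illustration: the figures (mean 50, standard deviation 12, sample size 400) and the scalar names starting with ex_ are made up for the example and are not taken from the exercises. invnormal() is Stata's inverse of the cumulative standard normal distribution.

```stata
* Hypothetical figures, chosen for illustration only
scalar ex_mean = 50
scalar ex_sd   = 12
scalar ex_n    = 400

* Standard error: standard deviation divided by the square root of n
scalar ex_se = ex_sd / sqrt(ex_n)

* Z for a right tail of 0.025 (about 1.96) and of 0.005 (about 2.576)
scalar ex_z95 = invnormal(0.975)
scalar ex_z99 = invnormal(0.995)
display ex_z95
display ex_z99

* Point estimate plus or minus Z times the standard error
display "95% CI: " (ex_mean - ex_z95*ex_se) " to " (ex_mean + ex_z95*ex_se)
display "99% CI: " (ex_mean - ex_z99*ex_se) " to " (ex_mean + ex_z99*ex_se)
```

Note that the 99% interval comes out wider than the 95% one: being more confident costs precision.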

A table of the standard normal distribution is available here. See also the online calculator.

  1. Mean age for a sample of voters is calculated as 34.2, with a standard deviation of 10.7. The sample size is 1000:
    • Calculate the confidence interval for 95% confidence
    • Calculate the confidence interval for 99% confidence
    • Repeat the exercise assuming the sample size was actually 2000, for both confidence levels
  2. Load the slsextract.dta file as follows:
. use http://teaching.sociology.ul.ie/so5041/labs/slsextract.dta

Find the mean of gross earnings, and construct a 95% and a 99% confidence interval – all the information is available through the summarize command.

  3. With the same variable, do ci mean grsearn. This is Stata's way of calculating the confidence interval for the gross earnings variable. How do the results compare with your estimate?
  4. Do help ci and see if you can figure out how to get ci to give you a 99% confidence interval.
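The idea that everything needed is available through summarize can be sketched on a different data set, so as not to give the exercise away. The example below assumes the auto data set that ships with Stata and its price variable; neither is part of this lab. Be aware that sysuse auto, clear replaces the data in memory, so reload slsextract.dta afterwards.

```stata
* Load an example data set shipped with Stata (this replaces the data in memory)
sysuse auto, clear

* summarize reports the mean, standard deviation and number of observations
summarize price

* The same numbers are stored in r(mean), r(sd) and r(N); list them
return list

* Lower and upper limits of a 95% interval built from the stored results
display r(mean) - invnormal(0.975) * r(sd) / sqrt(r(N))
display r(mean) + invnormal(0.975) * r(sd) / sqrt(r(N))
```

The same pattern should carry over to grsearn once the lab data are loaded again.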

Proportions

When we are constructing the CI for a proportion (e.g. percent voting yes, proportion female, percent unemployed) we have a shortcut: the standard deviation of a proportion is the square root of p times q, where q is 1 minus p (proportion voting no, or male, or not unemployed). Use that information in the following:

  1. From a sample of 1600, 43% say they will vote against the EU Constitution: construct a 99% confidence interval
  2. Using the data set already downloaded, calculate the proportion unemployed (include those looking for a first job). Construct a confidence interval around your point estimate.
  3. Interpret your findings.
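The shortcut for proportions described above can be sketched in Stata as follows. The figures (a proportion of 0.30 in a sample of 900) and the scalar names are hypothetical, chosen only to show the steps, and are not those of the exercises.

```stata
* Hypothetical figures, chosen for illustration only
scalar ex_p     = 0.30
scalar ex_nprop = 900

* q is 1 minus p; the standard deviation is the square root of p times q
scalar ex_q      = 1 - ex_p
scalar ex_sdprop = sqrt(ex_p * ex_q)

* Standard error: standard deviation divided by the square root of n
scalar ex_seprop = ex_sdprop / sqrt(ex_nprop)

* 95% interval: p plus or minus Z times the standard error
display "95% CI: " (ex_p - invnormal(0.975)*ex_seprop) " to " (ex_p + invnormal(0.975)*ex_seprop)
```

Work with p as a proportion between 0 and 1 rather than as a percentage, and convert back to percent at the end if you prefer.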