# PC Labs for SO5041: Week 6

## Week 6: Sampling distributions and confidence intervals

### Sampling distributions

Links to the sampling distribution applications:

### Confidence Intervals

Confidence intervals are bands around the *point estimate* (e.g. sample mean, median, proportion) within which we are reasonably confident the true population value lies. "Reasonably" usually means 95% or 99% confident: if we drew many samples and built an interval from each in the same way, about 95 (or 99) out of every hundred of those intervals would contain the true value.

We calculate a CI as the point estimate (e.g. sample mean) plus or minus *Z* times the *standard error*.

The Standard Error is estimated as the sample standard deviation divided by the square root of the sample size.
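In symbols, the two statements above can be written as follows. The notation is an assumption of this sketch rather than something fixed by the lab: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, and $n$ the sample size.

$$
\mathrm{SE} = \frac{s}{\sqrt{n}}, \qquad \mathrm{CI} = \bar{x} \pm Z \times \mathrm{SE}
$$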

Z depends on the Confidence Coefficient, and is the z score from the standard normal distribution for which 95 or 99% of the distribution is in the range -Z to +Z. For 95% we want to find the z score corresponding to a "right tail" of 0.025 (add the right and left tails to get 0.05 = 1 - 95%). For 99% we want a right tail of 0.005 (half of 1%).
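As an alternative to the table, Stata can report these z scores itself. The sketch below uses the built-in `invnormal()` function, then applies the result to made-up numbers (a mean of 50, a standard deviation of 12 and a sample of 400, all hypothetical and unrelated to the exercises). The scalar names are arbitrary choices for the illustration.

```stata
* invnormal(p) returns the z score with cumulative probability p to its left,
* so a right tail of 0.025 corresponds to p = 0.975.
scalar z95_demo = invnormal(0.975)
scalar z99_demo = invnormal(0.995)
display "Z for 95%: " z95_demo
display "Z for 99%: " z99_demo

* Hand calculation with hypothetical numbers: mean 50, sd 12, n 400.
scalar se_demo = 12 / sqrt(400)
scalar lo_demo = 50 - z95_demo * se_demo
scalar hi_demo = 50 + z95_demo * se_demo
display "95% CI: " lo_demo " to " hi_demo
```

With these numbers the standard error is 0.6, so the 95% interval is roughly 50 plus or minus 1.18.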

A table of the standard normal distribution is available here. See also the online calculator.

- Mean age for a sample of voters is calculated as 34.2, with a standard deviation of 10.7. The sample size is 1000:
  - Calculate the confidence interval for 95% confidence
  - Calculate the confidence interval for 99% confidence
  - Repeat the exercise assuming the sample size was actually 2000, for both confidence levels

- Load the `slsextract.dta` file as follows:

      . use http://teaching.sociology.ul.ie/so5041/labs/slsextract.dta

  Find the mean of gross earnings, and construct a 95% and a 99% confidence interval. All the information is available through the `summarize` command.
- With the same variable, do `ci mean grsearn`. This is Stata's way of calculating the confidence interval for the gross earnings variable. How do the results compare with your estimate?
- Do `help ci` and see if you can figure out how to get the `ci` command to give you a 99% confidence interval.
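One way to pull the numbers out of `summarize` is through the results it stores in `r()`. The sketch below illustrates this on Stata's built-in `auto` data and its `price` variable, which are stand-ins chosen for the example and not part of the lab. Note that `sysuse auto, clear` discards whatever data are in memory, so reload the lab file afterwards.

```stata
* Illustration on Stata's built-in auto data (an assumption: not the lab data).
* Warning: the clear option drops the data currently in memory.
sysuse auto, clear
summarize price

* summarize leaves its results behind in r(): r(mean), r(sd) and r(N).
scalar se_price = r(sd) / sqrt(r(N))
scalar lo95_price = r(mean) - invnormal(0.975) * se_price
scalar hi95_price = r(mean) + invnormal(0.975) * se_price
display "95% CI by hand: " lo95_price " to " hi95_price

* Stata's own interval, for comparison.
ci means price
```

The interval from `ci` is based on the t distribution rather than the standard normal, so it comes out slightly wider than the hand calculation. The difference shrinks as the sample size grows.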

### Proportions

When we are constructing the CI for a proportion (e.g. percent voting yes, proportion female, percent unemployed) we have a shortcut: the standard deviation of a proportion is the square root of p times q, where q is 1 minus p (the proportion voting no, or male, or not unemployed). The standard error is then that standard deviation divided by the square root of the sample size, as before. Use that information in the following:

- From a sample of 1600, 43% say they will vote against the EU Constitution: construct a 99% confidence interval
- Using the data set already downloaded, calculate the proportion unemployed (include looking for first job). Construct a confidence interval around your point estimate.
- Interpret your findings.
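The shortcut for proportions can be sketched in Stata with a few `display` lines. The numbers here are made up for illustration (30% of a hypothetical sample of 500 answering yes) and are not the exercise values. The scalar names are arbitrary.

```stata
* Hypothetical numbers: 30% of a sample of 500 answer yes.
scalar p_demo = 0.30
scalar n_demo = 500

* Standard deviation of a proportion is sqrt(p*q), with q = 1 - p;
* dividing by sqrt(n) gives the standard error.
scalar se_p_demo = sqrt(p_demo * (1 - p_demo)) / sqrt(n_demo)
scalar lo_p_demo = p_demo - invnormal(0.975) * se_p_demo
scalar hi_p_demo = p_demo + invnormal(0.975) * se_p_demo
display "Standard error: " se_p_demo
display "95% CI: " lo_p_demo " to " hi_p_demo
```

For a 99% interval, replace `invnormal(0.975)` with `invnormal(0.995)`.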