# PC Labs for SO5041: Week 6

## 1 Week 6: Sampling distributions and confidence intervals

### 1.1 Sampling distributions

Links to the two sampling distribution applications:

### 1.2 Confidence Intervals

Confidence intervals are bands around a point estimate (e.g. sample mean, median, proportion) within which we are reasonably sure the true population value lies. "Reasonably" often means 95% sure, or 99% sure, which is to say that if we drew many samples and built an interval from each one, about 95 or 99 out of every hundred intervals, respectively, would contain the true value.

We calculate a CI as the point estimate (e.g. sample mean) plus or minus Z times the standard error.

The Standard Error is estimated as the sample standard deviation divided by the square root of the sample size.
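The two definitions above can be written compactly. The symbols are assumptions introduced here for illustration, not notation from the lab sheet: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size and $Z$ the z score for the chosen confidence level.

$$
\text{CI} = \bar{x} \pm Z \times \text{SE}, \qquad \text{SE} = \frac{s}{\sqrt{n}}
$$

Because $n$ sits under a square root, quadrupling the sample size only halves the standard error.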

Z depends on the Confidence Coefficient: it is the z score from the standard normal distribution for which 95% or 99% of the distribution lies in the range -Z to +Z. For 95% we want the z score corresponding to a "right tail" of 0.025 (the right and left tails add up to 0.05 = 1 - 0.95). For 99% we want a right tail of 0.005 (half of 1%).
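As an alternative to a printed table, Stata can look up these z scores itself. A minimal sketch, using Stata's built-in `invnormal()` function, which takes the cumulative probability to the left of Z (that is, one minus the right tail):

```stata
* Z for 95% confidence: cumulative probability 1 - 0.025 = 0.975
display invnormal(0.975)

* Z for 99% confidence: cumulative probability 1 - 0.005 = 0.995
display invnormal(0.995)
```

The two results should be close to the familiar 1.96 and 2.58.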

A table of the standard normal distribution is available here. See also the online calculator.
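The calculation described above can be sketched in Stata with `scalar` and `display`. The numbers are hypothetical, chosen only to illustrate the steps, and the `ex_` names are made up for this example (the prefix is there so that they are unlikely to clash with variable names in any data set you have loaded).

```stata
* Hypothetical summary statistics (not taken from the lab data)
scalar ex_n    = 400
scalar ex_mean = 50
scalar ex_sd   = 12

* Standard error: sample SD divided by the square root of the sample size
scalar ex_se = ex_sd / sqrt(ex_n)

* Z for 95% confidence: right tail of 0.025, so cumulative probability 0.975
scalar ex_z = invnormal(0.975)

* Interval limits: point estimate minus/plus Z times the standard error
scalar ex_lower = ex_mean - ex_z * ex_se
scalar ex_upper = ex_mean + ex_z * ex_se

display "SE = " ex_se
display "95% CI: " ex_lower " to " ex_upper
```

With these made-up figures the standard error is 0.6, so the 95% interval runs from roughly 48.8 to 51.2. Swapping 0.975 for 0.995 would give the wider 99% interval.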

1. Mean age for a sample of voters is calculated as 34.2, with a standard deviation of 10.7. The sample size is 1000:
   - Calculate the confidence interval for 95% confidence
   - Calculate the confidence interval for 99% confidence
   - Repeat the exercise assuming the sample size was actually 2000, for both confidence levels
2. Load the `slsextract.dta` file as follows:
```
. use http://teaching.sociology.ul.ie/so5041/labs/slsextract.dta
```

Find the mean of gross earnings, and construct a 95% and a 99% confidence interval – all the information is available through the `summarize` command.

3. With the same variable, do `ci grsearn` (in Stata 14 and later the documented form is `ci means grsearn`). This is Stata's way of calculating the confidence interval for the gross earnings variable. How do the results compare with your estimate?
4. Do `help ci` and see if you can figure out how to get `ci` to give you a 99% confidence interval.

### 1.3 Proportions

When we are constructing the CI for a proportion (e.g. percent voting yes, proportion female, percent unemployed) we have a shortcut: the standard deviation of a proportion is the square root of p times q, where q is 1 minus p (proportion voting no, or male, or not unemployed). Use that information in the following:

1. From a sample of 1600, 43% say they will vote against the EU Constitution: construct a 99% confidence interval.
2. Using the data set already downloaded, calculate the proportion unemployed (count those looking for their first job as unemployed). Construct a confidence interval around your point estimate.
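Putting the shortcut together with the standard error rule from section 1.2, the standard error of a proportion is $\sqrt{pq/n}$. A minimal sketch in Stata follows, with hypothetical numbers that are not taken from either exercise. The `ex_` names are made up for this example and are prefixed so that they are unlikely to be mistaken for variables in the loaded data set.

```stata
* Hypothetical sample: 30% answer "yes" in a sample of 900
scalar ex_p = 0.30
scalar ex_q = 1 - ex_p
scalar ex_n = 900

* Standard deviation of a proportion is sqrt(p*q);
* dividing by sqrt(n) gives the standard error
scalar ex_se = sqrt(ex_p * ex_q) / sqrt(ex_n)

* Z for 95% confidence (right tail of 0.025)
scalar ex_z = invnormal(0.975)

* Interval limits: proportion minus/plus Z times the standard error
scalar ex_lower = ex_p - ex_z * ex_se
scalar ex_upper = ex_p + ex_z * ex_se

display "SE = " ex_se
display "95% CI: " ex_lower " to " ex_upper
```

For these made-up figures the interval comes out at roughly 0.27 to 0.33, or 27% to 33%.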