PC Labs for SO5041: Week 6
Week 6: Sampling distributions and confidence intervals
Sampling distributions
Links to the sampling distribution applications:
Confidence Intervals
Confidence intervals are bands around the point estimate (e.g. sample mean, median, proportion) within which we are reasonably confident the true population value lies. "Reasonably" often means 95% confident, or 99% confident, which is to say that if we drew many samples and built an interval from each one, about 95 or 99 out of every hundred of those intervals, respectively, would contain the true value.
We calculate a CI as the point estimate (e.g. sample mean) plus or minus Z times the standard error.
The Standard Error is estimated as the sample standard deviation divided by the square root of the sample size.
Z depends on the Confidence Coefficient, and is the z score from the standard normal distribution for which 95 or 99% of the distribution is in the range -Z to +Z. For 95% we want to find the z score corresponding to a "right tail" of 0.025 (add the right and left tails to get 0.05 = 1 - 95%). For 99% we want a right tail of 0.005 (half of 1%).
A table of the standard normal distribution is available here. See also the online calculator.
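As an alternative to the table, Stata's invnormal() function returns the z score below which a given share of the standard normal distribution lies. The sketch below looks up both critical values and then works through the formula for a made-up sample (mean 50, standard deviation 12, n = 400); those figures are assumptions chosen only for illustration and are not taken from any data set used in this lab. The commands are shown without the leading "." prompt.

```stata
* Critical values: the z score with 0.975 (or 0.995) of the distribution below it
display invnormal(0.975)
display invnormal(0.995)

* Hypothetical sample: mean 50, standard deviation 12, n = 400
* Standard error = 12 / sqrt(400) = 0.6
* Lower and upper limits of the 95% interval
display 50 - invnormal(0.975) * 12 / sqrt(400)
display 50 + invnormal(0.975) * 12 / sqrt(400)
```

The first two lines should give values of about 1.96 and 2.58, and the interval for the made-up sample runs from roughly 48.8 to 51.2.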
- Mean age for a sample of voters is calculated as 34.2, with a standard deviation of 10.7. The sample size is 1000:
- Calculate the confidence interval for 95% confidence
- Calculate the confidence interval for 99% confidence
- Repeat the exercise assuming the sample size was actually 2000, for both confidence levels
- Load the slsextract.dta file as follows:
. use http://teaching.sociology.ul.ie/so5041/labs/slsextract.dta
Find the mean of gross earnings, and construct a 95% and a 99% confidence interval. All the information you need is available through the summarize command.
- With the same variable, do ci mean grsearn. This is Stata's way of calculating the confidence interval for the gross earnings variable. How do the results compare with your estimate?
- Do help ci and see if you can figure out how to get the ci command to give you a 99% confidence interval.
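For the by-hand calculation on gross earnings, one possible approach is to let Stata do the arithmetic using the results that summarize leaves behind in r(mean), r(sd) and r(N). This is only a sketch of the plus-or-minus-Z formula described earlier, for the 95% case; the scalar names xbar, se_hat and zcrit are arbitrary choices made here, not anything defined by the data set.

```stata
* Run summarize quietly and keep the pieces needed for the interval
quietly summarize grsearn
scalar xbar   = r(mean)
* Estimated standard error: sample standard deviation over root n
scalar se_hat = r(sd) / sqrt(r(N))
* z score for 95% confidence (right tail of 0.025)
scalar zcrit  = invnormal(0.975)
display "Lower limit: " xbar - zcrit * se_hat
display "Upper limit: " xbar + zcrit * se_hat
```

Changing 0.975 to 0.995 in the invnormal() call gives the corresponding 99% limits.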
Proportions
When we are constructing the CI for a proportion (e.g. percent voting yes, proportion female, percent unemployed) we have a shortcut: the standard deviation of a proportion is the square root of p times q, where q is 1 minus p (proportion voting no, or male, or not unemployed). As before, the standard error is that standard deviation divided by the square root of the sample size. Use that information in the following:
- From a sample of 1600, 43% say they will vote against the EU Constitution: construct a 99% confidence interval
- Using the data set already downloaded, calculate the proportion unemployed (include those looking for a first job). Construct a confidence interval around your point estimate.
- Interpret your findings.
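The shortcut for proportions can be sketched in Stata with a few scalar and display commands. The figures used here (20% of a sample of 400) are made up for illustration and are not the answer to either exercise; the scalar names p_hat, q_hat and se_p are likewise arbitrary.

```stata
* Hypothetical figures for illustration: 20% of a sample of 400
scalar p_hat = 0.20
scalar q_hat = 1 - p_hat
* Standard deviation of a proportion is sqrt(p*q); divide by sqrt(n) for the standard error
scalar se_p = sqrt(p_hat * q_hat) / sqrt(400)
* 95% interval: point estimate plus or minus z times the standard error
display p_hat - invnormal(0.975) * se_p
display p_hat + invnormal(0.975) * se_p
```

With these made-up numbers the standard error is 0.02, so the interval runs from roughly 0.16 to 0.24.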