Links to the two sampling distribution applications:
Confidence intervals are bands around a point estimate (e.g. sample mean, median, proportion) within which we are reasonably sure the true population value lies. "Reasonably" usually means 95% sure or 99% sure: if we drew many samples and built an interval from each in the same way, about 95 (or 99) out of every hundred of those intervals would contain the true value.
We calculate a CI as the point estimate (e.g. sample mean) plus or minus Z times the standard error.
The Standard Error is estimated as the sample standard deviation divided by the square root of the sample size.
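As a rough sketch of the standard error calculation in Stata, assuming the dataset containing grsearn is already loaded: summarize leaves the sample standard deviation and the number of non-missing observations in r(sd) and r(N). The scalar name se_hat is made up for illustration.

```stata
* Sketch: standard error of the mean of grsearn
* (assumes the dataset containing grsearn is already in memory)
quietly summarize grsearn

* sample standard deviation divided by the square root of the sample size
scalar se_hat = r(sd) / sqrt(r(N))
display "Standard error of the mean: " se_hat
```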
Z depends on the confidence coefficient: it is the z score from the standard normal distribution for which 95% or 99% of the distribution lies in the range -Z to +Z. For 95% we want the z score corresponding to a "right tail" of 0.025 (the right and left tails together make 0.05 = 1 - 0.95). For 99% we want a right tail of 0.005 (half of 1%).
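As an alternative to a printed table, Stata's invnormal() function returns the z score below which a given share of the standard normal distribution lies, so a right tail of 0.025 corresponds to invnormal(0.975). The sketch below looks up both values and then builds a 95% interval from a made-up mean of 250 and a made-up standard error of 4. Those two numbers and all the scalar names are purely illustrative and do not come from the survey data.

```stata
* z for a 95% interval (right tail of 0.025): displays 1.960
display %5.3f invnormal(0.975)
* z for a 99% interval (right tail of 0.005): displays 2.576
display %5.3f invnormal(0.995)

* Hypothetical point estimate and standard error, for illustration only
scalar xbar_demo = 250
scalar se_demo   = 4
scalar z95       = invnormal(0.975)

* point estimate plus or minus Z times the standard error
scalar lo_demo = xbar_demo - z95*se_demo
scalar hi_demo = xbar_demo + z95*se_demo
display "95% CI: " %6.2f lo_demo " to " %6.2f hi_demo
```

With these made-up inputs the interval runs from 242.16 to 257.84.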
A table of the standard normal distribution is available here.
In Stata, enter ci grsearn. This is Stata's way of calculating the confidence interval for the gross earnings variable. How do the results compare with your estimate?
Enter help ci and see if you can figure out how to get the ci command to give you a 99% confidence interval.
When we are constructing the CI for a proportion (e.g. percent voting yes, proportion female, percent unemployed) we have a shortcut: the standard deviation of a proportion is the square root of p times q, where q is 1 minus p (the proportion voting no, or male, or not unemployed). Use that information in the following:
In Stata, with the School-Leavers' Survey data, calculate the proportion who are either unemployed or looking for a first job. Using the formula, calculate the standard error and confidence interval.
Construct a new variable so that it is equal to 1 for
unemployed/looking for first job, and 0 otherwise. Use the Stata
ci command to calculate the confidence interval around
it. Compare this with your result from the formula.
What do you see, and why might this happen?
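For reference, the arithmetic of the proportion formula can be sketched with made-up numbers rather than the School-Leavers' Survey data: a hypothetical proportion of 0.2 from a hypothetical sample of 400. The standard deviation is the square root of p times q, and dividing it by the square root of the sample size gives the standard error, just as for a mean. All scalar names and values here are illustrative assumptions.

```stata
* Hypothetical values, not taken from the survey
scalar p_demo = 0.2
scalar n_demo = 400

* standard deviation of a proportion: sqrt(p*q) = sqrt(0.2*0.8) = 0.4
scalar sd_demo = sqrt(p_demo * (1 - p_demo))
* standard error: sd / sqrt(n) = 0.4 / 20 = 0.02
scalar se_p = sd_demo / sqrt(n_demo)

* 95% interval using the normal z score
scalar z95  = invnormal(0.975)
scalar lo_p = p_demo - z95*se_p
scalar hi_p = p_demo + z95*se_p
display "95% CI: " %6.4f lo_p " to " %6.4f hi_p
```

With these inputs the interval is 0.1608 to 0.2392. The same steps apply to the survey proportion once you substitute your own p and n.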