1. Week 2 Lab
1.1. Adding capabilities to Stata
Lots of people write additional procedures for Stata, and many of these are easily available. See help net and help ssc for an overview. We are going to use one such add-on, TAB_CHI, today. Do net search tab_chi first, just to see how to search. One place it is found is the Stata web site, and another is the RePEc economics article archive. Either of the two following commands should install it:
net install tab_chi
or
ssc install tab_chi
The former looks on the Stata website, the latter on the Statistical Software Components archive.
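If you re-run a lab do-file often, you may prefer to install the package only when it is missing. A minimal sketch, assuming the package provides a command called tabchi (as used in this lab) and that SSC is reachable from your machine:

```stata
* Check whether tabchi is already on the ado-path; install from SSC only if not
capture which tabchi
if _rc != 0 {
    ssc install tab_chi
}
```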
Once it is installed, load the NLSW88 data and try the following:
sysuse nlsw88, clear
tabchi occupation union
tabchi occupation union, noe noo adj
1.2. Putting tables into Stata
We can enter tables into Stata quite easily using the following strategy. Given a table that looks like this:
Gender | Agree | Disagree | Total |
Male   |   122 |      223 |   345 |
Female |   268 |     1632 |  1900 |
we can put it into Stata like this:
input gender att count
1 1 122
1 2 223
2 1 268
2 2 1632
end
label define gndr 1 "Male" 2 "Female"
label values gender gndr
label define agree 1 "Agree" 2 "Disagree"
label values att agree
tab gender att [freq=count]
Run this syntax, and run a χ2 test. Do help tab if you need a hint for the χ2 test.
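For a quick look at a small table, Stata's immediate form of tabulate takes the cell counts directly, with rows separated by a backslash. This is a sketch of an alternative, not a replacement for the input approach: it does not create labelled variables, so the rows and columns are simply numbered.

```stata
* Same 2x2 table, typed as cell counts: row 1 = Male, row 2 = Female
tabi 122 223 \ 268 1632
```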
1.3. An ordinal view
Re-generate this table in Stata, using the method above:
                   |              qual
             class |   Univ  2nd level  Incomplet |     Total
-------------------+---------------------------------+----------
          Prof/Man |   1025       1566        767 |      3358
Routine non-manual |    124        687        713 |      1524
    Skilled manual |     31        483        464 |       978
    Semi/unskilled |     18        361        716 |      1095
-------------------+---------------------------------+----------
             Total |   1198       3097       2660 |      6955

Source: British Household Panel Survey 2001
Note that both variables have an ordinal interpretation.
- Calculate the correlation and the Spearman Rank Correlation:
corr class qual [freq=n]
expand n
spearman class qual
(Note: [freq=n] makes most commands act as if there were n cases for every row of data. For commands that don't do so, expand n turns each data row into n rows. Then, [freq=n] isn't needed.)
- Run and interpret the gamma test
- What does it tell you? Compare the pattern of association shown in the adjusted residuals with the gamma, and consider which gives you the better summary.
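One way to set this up is sketched here. The variable names class, qual and n match the commands in this section; the numeric codes (1 to 4 for class, 1 to 3 for qual, in the order shown in the table) are an assumption, and value labels are left out for brevity. The gamma option of tabulate reports Goodman and Kruskal's gamma with its asymptotic standard error. Run this before expand n, since [freq=n] on expanded data would count each case more than once.

```stata
* Enter the class by qual table as one row per cell, with n as the cell count
clear
input class qual n
1 1 1025
1 2 1566
1 3  767
2 1  124
2 2  687
2 3  713
3 1   31
3 2  483
3 3  464
4 1   18
4 2  361
4 3  716
end
* Cross-tabulation with Goodman and Kruskal's gamma
tab class qual [freq=n], gamma
```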
1.4. Spurious association and suppression
Use the scouting example to explore an association that differs when you take account of more variables.
do http://teaching.sociology.ul.ie/so5032/labs/church.do
tab s d [freq=n], row
tab3way s d c [freq=n]
(Note that tab3way will need to be installed the first time: ssc install tab3way.)
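If tab3way cannot be installed, a similar view is available with built-in commands. A sketch, assuming the variables s, d, c and n are created by church.do as the commands in this section suggest:

```stata
* One s-by-d table for each level of c (presumably the church variable)
bysort c: tab s d [freq=n], row
```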
Start by calculating a measure of association for the scouting by delinquency table (odds ratios would be a good idea, since this is 2×2). Then do the same for each of the three subtables in the church by scouting by delinquency table. Finally, figure out how the three subtables without association add up into a 2-way table with association.
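For a 2×2 table with cells a, b in the first row and c, d in the second, the odds ratio is (a×d)/(b×c). As an illustration only, here it is for the gender by attitude table from section 1.2, since the scouting counts are not reproduced in this handout; substitute the cell counts from your own tab output:

```stata
* Odds ratio for a 2x2 table: (a*d)/(b*c)
* Cells from the gender by attitude table in 1.2, used purely as an example
display (122*1632)/(223*268)
```

This comes to roughly 3.33: the odds of agreeing are about 3.3 times higher for men than for women in that table. An odds ratio of 1 would indicate no association.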
1.5. Tables and complex association
do http://teaching.sociology.ul.ie/so5032/labs/dpbig.do
Agresti uses data on race and the death penalty (loaded by the command above) to illustrate the possible complexity of a three-way relationship. The data classify the sentence handed down in murder trials in Florida by defendant's race and victim's race.
Look first at the defendant/penalty table, then at the three-way table (i.e., defendant/penalty controlling for victim's race). Calculating odds ratios would also be useful here. What is going on with this data set?