1. Week 2 Lab

1.1. Adding capabilities to Stata

Many people write add-on commands for Stata, and many of these are freely available. See help net and help ssc for an overview. Today we will use one such add-on, tab_chi. First run net search tab_chi, just to see how searching works. The package is available in more than one place, including the Stata web site and the RePEc economics archive. Either of the following two commands should install it:

net install tab_chi


ssc install tab_chi

The former looks on the Stata web site, the latter in the Statistical Software Components (SSC) archive.
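If you are not sure whether the package is already installed, a small check like the following can avoid reinstalling it. This is only a sketch: it relies on which returning a non-zero return code when it cannot find the tabchi command.

```stata
* Look for the tabchi command; capture suppresses the error if it is missing
capture which tabchi
if _rc {
    * Not found, so fetch the package from the SSC archive
    ssc install tab_chi
}
```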

Once it is installed, load the NLSW88 data and try the following:

sysuse nlsw88, clear

tabchi occupation union
tabchi occupation union, noe noo adj
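For comparison, Stata's built-in tabulate can report observed and expected counts with a Pearson chi-squared test, though it has no option for residuals, which is one reason for installing tabchi. A minimal sketch on the same data:

```stata
sysuse nlsw88, clear

* Built-in two-way table: observed counts, expected counts, Pearson chi-squared
tabulate occupation union, chi2 expected
```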

1.2. Putting tables into Stata

We can enter tables into Stata quite easily using the following strategy. Given a table that looks like this:

         Agree  Disagree  Total
Male       122       223    345
Female     268      1632   1900

we can put it into Stata like this:

* Drop any data in memory, then enter one row per cell of the table
clear
input gender att count
1 1  122
1 2  223
2 1  268
2 2 1632
end

label define gndr 1 "Male" 2 "Female"
label values gender gndr
label define agree 1 "Agree" 2 "Disagree"
label values att agree

tab gender att [freq=count]

Run this syntax, and run a χ2 test. Do help tab if you need a hint for the χ2 test.
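For a quick check without entering a data set at all, the immediate form tabi accepts the cell counts directly, with rows separated by a backslash. A sketch using the same four counts as above (the rows and columns are unlabelled in the output):

```stata
* Immediate two-way table: row 1 = Male, row 2 = Female
* Columns: Agree, Disagree
tabi 122 223 \ 268 1632, chi2
```

The result should match what you get from the labelled data set, which is a useful way to confirm the table was typed in correctly.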

1.3. An ordinal view

Re-generate this table in Stata, using the method above:

                   |               qual
             class |      Univ  2nd level  Incomplet |     Total
          Prof/Man |      1025       1566        767 |      3358 
Routine non-manual |       124        687        713 |      1524 
    Skilled manual |        31        483        464 |       978 
    Semi/unskilled |        18        361        716 |      1095 
             Total |      1198       3097       2660 |      6955 

Source: British Household Panel Survey 2001

Note that both variables have an ordinal interpretation.
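One way of entering this table, following the pattern of section 1.2, is sketched here. The numeric codes and the label names clslab and qlab are assumptions made for illustration; the variable names class, qual and n are the ones used in the commands that follow. The codes are assigned in table order so that both variables keep their ordinal interpretation.

```stata
clear
input class qual n
1 1 1025
1 2 1566
1 3  767
2 1  124
2 2  687
2 3  713
3 1   31
3 2  483
3 3  464
4 1   18
4 2  361
4 3  716
end

* Label names are arbitrary; value labels copy the table as printed
label define clslab 1 "Prof/Man" 2 "Routine non-manual" 3 "Skilled manual" 4 "Semi/unskilled"
label values class clslab
label define qlab 1 "Univ" 2 "2nd level" 3 "Incomplet"
label values qual qlab

* Check the result against the printed table
tab class qual [freq=n]
```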

  • Calculate the correlation and the Spearman Rank Correlation:
corr class qual [freq=n]
expand n
spearman class qual

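(Note: [freq=n] makes most commands act as if there were n cases for every row of data. For commands that do not accept weights, such as spearman, expand n turns each data row into n identical rows. Once the data are expanded, [freq=n] must no longer be used, since each case would then be counted n times over.)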

  • Run and interpret the gamma test.
  • What does it tell you? Compare the gamma with the pattern of association shown in the adjusted residuals, and consider which gives you the better summary.
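A possible way of producing both pieces of output is sketched here. It assumes that expand n has already been run, so that each row is a single case and no weight is needed; on unexpanded data, add [freq=n] to the tab command instead.

```stata
* Goodman and Kruskal's gamma, with its asymptotic standard error
tab class qual, gamma

* Adjusted residuals only, suppressing observed and expected counts
tabchi class qual, noe noo adj
```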

1.4. Spurious association and suppression

Use the scouting example to explore an association that differs when you take account of more variables.

tab s d [freq=n], row
tab3way s d c [freq=n]

(Note that tab3way will need to be installed the first time: ssc install tab3way.)

Start by calculating a measure of association for the scouting by delinquency table (odds ratios would be a good idea, since the table is 2×2). Then do the same for each of the three subtables in the church by scouting by delinquency table. Finally, work out how three subtables without association add up to a two-way table with association.
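The odds ratios can be worked out by hand from the printed tables, or computed along the following lines. This is a sketch under assumptions: s and d are taken to have two categories each, c is taken to be a numeric variable identifying the subtables, and n holds the cell counts. The matrix name M is arbitrary. The matcell() option of tab saves the cell frequencies, and the odds ratio is the cross-product ratio of the four cells.

```stata
* Overall 2x2 table: save the cell counts in matrix M
tab s d [freq=n], matcell(M)
display "Odds ratio, all cases: " M[1,1]*M[2,2]/(M[1,2]*M[2,1])

* Repeat within each level of c
levelsof c, local(clevels)
foreach lev of local clevels {
    quietly tab s d [freq=n] if c == `lev', matcell(M)
    display "Odds ratio, c = `lev': " M[1,1]*M[2,2]/(M[1,2]*M[2,1])
}
```

An odds ratio near 1 in every subtable, alongside an overall odds ratio well away from 1, is the pattern to look for.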

1.5. Tables and complex association


Agresti uses data on race and the death penalty (use code above) to illustrate the possible complexity of a three-way relationship. The data classify the sentence handed down in murder trials in Florida by defendant's race and victim's race.

Look first at the defendant/penalty table, then at the three-way table (i.e., defendant/penalty controlling for victim's race). Calculating odds ratios would also be useful here. What is going on in this data set?
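The two views can be produced roughly as follows. The variable names defendant, victim, penalty and n are hypothetical, since the data set's actual names are not given here; substitute whatever names your data use.

```stata
* Hypothetical variable names: defendant, victim, penalty, and cell count n

* Two-way table, ignoring victim's race
tab defendant penalty [freq=n], row

* The same table separately for each category of victim's race
bysort victim: tab defendant penalty [freq=n], row
```

Comparing the row percentages in the first table with those in the by-group tables shows whether the direction of the association changes once victim's race is taken into account.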

Author: Brendan Halpin

Created: 2022-02-22 Tue 15:02