# SO5032: Lab Materials

## 1 Week 4 Lab

### 1.1 Mental health

The do-file below contains code relating a mental impairment score to SES (socioeconomic status) and a negative life-events score. Run it as follows:

    do http://teaching.sociology.ul.ie/so5032/mental.do

Fit the regression model predicting impairment from the other two variables. Interpret the model.

### 1.2 Predicted values

Using the regression results, calculate the predicted
values by hand (with a calculator) for the first few cases, i.e. plug
their values on the independent variables into the fitted equation.
Then, after running the regression, run `predict` *var* to get Stata
to generate predicted values. Were your calculations correct?

- Do a scatter plot of the predicted values versus the observed values
- Are the predicted values close to the real ones?
- Calculate the correlation between the predicted and observed values, and relate it to the R^{2} from the regression
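
The steps above can be sketched in Stata as follows. Because the variable names created by `mental.do` are not shown here, this sketch uses Stata's shipped `auto` data as a stand-in; `price`, `weight`, `mpg` and the new variables `byhand` and `yhat` are illustrative choices, not part of the lab. Swap in the mental-health variables when you do the exercise.

```stata
* Stand-in data: Stata's built-in auto dataset (an assumption, not the lab data)
sysuse auto, clear

* Fit a regression with two explanatory variables
regress price weight mpg

* "By hand" prediction: constant plus each slope times the case's value
generate byhand = _b[_cons] + _b[weight]*weight + _b[mpg]*mpg

* Let Stata generate the predicted values
predict yhat

* Compare observed, by-hand and Stata predictions for the first few cases
list price byhand yhat in 1/5

* Observed versus predicted
scatter price yhat

* The squared correlation can be compared with R-squared from the regression
correlate price yhat
display "r^2 = " r(rho)^2
display "R^2 = " e(r2)
```

With a single dependent variable and an OLS fit, the squared correlation between observed and predicted values equals the R^{2} reported by `regress`, which is what the last two lines let you check.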

### 1.3 Adjusted R^{2}

F-tests can be used to test a model globally, and also to compare
two nested models, one of which has extra variables. An approximate
but quicker way to do this is to look at Adjusted R^{2}, which is
R^{2} scaled to take account of the number of cases and the
number of parameters, in a calculation similar to that for the
F-statistic. Unlike R^{2}, Adjusted R^{2} *can* fall as variables
are added to the model, if their contribution is too small.
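
For reference, the usual textbook definition of Adjusted R^{2} is given below. The symbols are chosen here for illustration: n is the number of cases and k the number of explanatory variables, not counting the constant.

$$
R^{2}_{\mathrm{adj}} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - k - 1}
$$

Adding a variable raises k and so increases the fraction on the right; Adjusted R^{2} goes up only if R^{2} rises by enough to offset that penalty.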

### 1.4 F-tests

Stata's regression output presents the result of an F-test against the null model (top-right of output) but doesn't do incremental F-tests. A handy add-on for this can be installed using

    ssc install ftest

Using it means you need to fit a model, store its details, fit another and compare the two:

    use http://teaching.sociology.ul.ie/so5032/labs/agresticounties, clear
    reg c u
    estimates store urban
    reg c u i hs
    ftest urban

Interpret that result, and compare it with the result of testing
`reg c i hs` against `reg c i hs u`.
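
To see what the incremental test is doing, the F-statistic can also be computed by hand from the residual sums of squares that `regress` stores in `e(rss)` and `e(df_r)`. This is a sketch using the same data and models as above; the scalar names (`rss_red`, `df_full`, `Fstat` and so on) are made up for illustration. You can compare the result with the `ftest` output.

```stata
use http://teaching.sociology.ul.ie/so5032/labs/agresticounties, clear

* Reduced model: keep its residual sum of squares and residual df
quietly regress c u
scalar rss_red = e(rss)
scalar df_red  = e(df_r)

* Full model: adds i and hs
quietly regress c u i hs
scalar rss_full = e(rss)
scalar df_full  = e(df_r)

* Number of parameters added by the full model
scalar df_num = df_red - df_full

* F = (drop in RSS per added parameter) / (residual mean square of full model)
scalar Fstat = ((rss_red - rss_full)/df_num) / (rss_full/df_full)

* Upper-tail probability from the F distribution
scalar pval = Ftail(df_num, df_full, Fstat)

display "Numerator df   = " df_num
display "Denominator df = " df_full
display "F              = " Fstat
display "p              = " pval
```

A large F (small p) suggests that the added variables jointly improve the fit beyond what the reduced model achieves.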

### 1.5 GPA

Agresti's GPA data set is available as follows:

    use http://teaching.sociology.ul.ie/so5032/agrestiGPA.dta, clear

Examine it, and fit a regression
model explaining college GPA using whatever explanatory variables
you think might matter. Use t-tests, delta-F tests and adjusted
R^{2} to help choose.

### 1.6 Note: Dummy variables

If you have a categorical explanatory variable, you can enter it as a set of n-1 "dummy" variables, where n is the number of values. A dummy variable is a variable taking the values 0 and 1, indicating that the original variable takes the appropriate value:

| Original | d1 | d2 | d3 |
|---|---|---|---|
| 1 | 1 | 0 | 0 |
| 2 | 0 | 1 | 0 |
| 3 | 0 | 0 | 1 |
| 4 | 0 | 0 | 0 |

In this example, the original variable takes the values 1 to 4. There are three dummy variables, d1 to d3, taking the values 0 and 1, each corresponding to one value of the original variable. For value 4 of the original variable, all three dummy variables have the value 0, so category 4 is the reference category. Once the dummy variables are entered in a regression analysis, each parameter estimate is interpreted as the effect on the dependent variable of being in that category compared with category 4.

You can create dummy variables easily in Stata:

    tab pa, gen(d)

However, you don't need to. You can simply use "factor notation":
`reg co hi ag i.pa`. This gives the same result as
putting in `d2` and `d3`, because by default Stata's factor notation
leaves out the lowest category as the reference. Try both ways to
satisfy yourself.
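
The equivalence of the two approaches can be sketched as follows. This uses Stata's shipped `auto` data as a stand-in, with `rep78` playing the role of the categorical variable; the data set, the variables and the scalar names `fit_dummies` and `fit_factor` are assumptions for illustration, not part of the lab data.

```stata
* Stand-in data: Stata's built-in auto dataset (an assumption, not the lab data)
sysuse auto, clear

* rep78 has five categories; create one dummy per category: d1 ... d5
tabulate rep78, generate(d)

* Enter n-1 dummies by hand, leaving out d1 so category 1 is the reference
regress price weight d2 d3 d4 d5
scalar fit_dummies = e(r2)

* Factor notation: Stata builds the dummies itself, omitting the lowest category
regress price weight i.rep78
scalar fit_factor = e(r2)

* The two fits can be compared directly
display "R^2 with dummies:         " fit_dummies
display "R^2 with factor notation: " fit_factor

* To use a different reference category, here the last one, set the base
regress price weight ib5.rep78
```

The `ib5.` prefix changes only which category the others are compared with; the overall fit of the model is unchanged.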