# PC Labs for SO5041: Week 12

## 1. MA Lab Materials

### 1.1. Week 12 Lab

#### 1.1.1. Linear Regression

Do `sysuse nlsw88` to load the National Longitudinal Study of Women data set that comes with Stata. Look at `wage`, the hourly wage rate. Predict `wage` using `grade`:

`reg wage grade`

Write out the `Y = a + bx` equation. Calculate the predicted value for `grade=0` and `grade=20`, and draw the line on a graph (on paper).
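To check the hand calculation, the following is a minimal sketch that assumes the commands are run from a do-file. After `regress`, Stata stores the intercept as `_b[_cons]` (the `a` in the equation) and the slope as `_b[grade]` (the `b`).

```stata
* Load the example data shipped with Stata and inspect the two variables
sysuse nlsw88, clear
summarize wage grade

* Simple linear regression of wage on grade
regress wage grade

* Predicted wage = a + b*grade, using the stored coefficients
display "Predicted wage at grade 0:  " (_b[_cons] + _b[grade]*0)
display "Predicted wage at grade 20: " (_b[_cons] + _b[grade]*20)
```

The two displayed values are the end points of the line to draw on paper.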

#### 1.1.2. R-squared

Consider the following list of variables:

- `age`
- `ttl_exp`, total lifetime work experience
- `tenure`, tenure in current job
- `grade`, years of education
- `union`, whether a member of a union

Let's consider `wage` as the "dependent variable", to be explained by the others (ignoring `union` for the moment as it only has two values). Create scatterplots for `wage` (on the Y-axis) compared with each of the other variables. Consider the correlations too (e.g., `corr age wage`). Can you see much of a relationship?
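One way to produce all the scatterplots and correlations in a single pass is sketched below. The graph names (`sc_age` and so on) are arbitrary labels chosen here so that each plot stays open in its own window; they are not part of the lab instructions.

```stata
sysuse nlsw88, clear

* One scatterplot per candidate explanatory variable, wage on the Y-axis
foreach v of varlist age ttl_exp tenure grade {
    scatter wage `v', name(sc_`v', replace)
}

* Correlation matrix (uses only observations complete on all variables)
corr wage age ttl_exp tenure grade

* Pairwise correlations, using all available observations for each pair
pwcorr wage age ttl_exp tenure grade, obs
```

The first column of either correlation table gives the correlation of each variable with `wage`.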

Now do regression analyses: `reg wage` *varname*, with each of the other variables **one at a time** as the independent variable. There are two things to look at: the $R^2$ figure and the parameter estimate (B for the independent variable, along with its significance). Which variables affect wage much? Do any not affect it at all?
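The four regressions can also be run in a loop and summarised on one line each. This is only a sketch for comparing the models side by side: it suppresses the full output with `quietly`, so remove that prefix to see the usual regression table and p-values.

```stata
sysuse nlsw88, clear

foreach v of varlist age ttl_exp tenure grade {
    quietly regress wage `v'
    * Slope, its t statistic, and R-squared for this model
    display "`v'" _col(12) "b = " %7.3f _b[`v'] "   t = " %6.2f _b[`v']/_se[`v'] "   R2 = " %6.4f e(r2)
}
```

Here `e(r2)` is the R-squared that `regress` stores after each fit, and `_se[]` holds the standard error of the named coefficient.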

Interpret the results: in each case ask the question, "what happens to the predicted value of wage if the value of X changes by one unit?" For two different values of the independent variable (X), calculate the predicted value of wage, see where these fall on the scatterplot, and see where the regression line would lie. Does it seem like a good summary of the relationship?
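As an illustration using `grade` as X, the sketch below computes two predicted values and overlays the fitted line on the scatterplot. The values 12 and 16 are arbitrary picks for this example; any two values within the range of the data would do.

```stata
sysuse nlsw88, clear
regress wage grade

* Predicted wage at two values of X (12 and 16 are arbitrary choices)
display "Predicted wage at grade 12: " (_b[_cons] + _b[grade]*12)
display "Predicted wage at grade 16: " (_b[_cons] + _b[grade]*16)

* The same predictions, with standard errors and confidence intervals
margins, at(grade=(12 16))

* Scatterplot with the fitted regression line overlaid
twoway (scatter wage grade) (lfit wage grade)
```

The two predictions differ by four times the slope, since the predicted value changes by `_b[grade]` for each one-unit change in `grade`.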

If $R^2$ is big, the independent variable "explains" the dependent variable "a lot". However, it is possible for $R^2$ to be small and yet for the independent variable to have a systematic effect (i.e. a very low p-value for significance): this independent variable may be only one thing among many that affect the dependent variable.

- Union effects

Test the effect of `union` on wage. Use a t-test in the first instance, and then fit a regression. Compare the results.

Do the same relating `grade` to `union`. Note that unionised workers tend to earn more and be better educated. Could it be that the union effect is simply due to them being better educated? That is, for workers with similar education, does union status matter?

Fit the wage/grade regression for unionised and non-unionised workers separately, and think about the results (make scatterplots too): do `reg wage grade if union==0`, `reg wage grade if union==1`.

- Two explanatory variables

You can also fit a model with both union status and grade explaining wage. Fit a regression with both `grade` and `union` as explanatory variables. Interpret the parameter estimates. Draw the regression lines for union members and non-union members.

Compare your results to the previous separate regressions, and the t-test.
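The union exercises can be sketched as a single do-file, shown below. The variable name `wage_hat` and the legend labels are choices made for this example only. The `///` line continuation works in a do-file but not when typed at the command line.

```stata
sysuse nlsw88, clear

* Union effect on wage: t-test, then the equivalent regression
ttest wage, by(union)
regress wage union

* Is education related to union status?
ttest grade, by(union)
regress grade union

* Wage/grade relationship within each group, with scatterplots
regress wage grade if union==0
regress wage grade if union==1
twoway (scatter wage grade) (lfit wage grade), by(union)

* Both explanatory variables in one model
regress wage grade union

* Predicted wages from the two-variable model, drawn as one line per group
predict wage_hat
twoway (line wage_hat grade if union==0, sort) ///
       (line wage_hat grade if union==1, sort), ///
       legend(order(1 "Non-union" 2 "Union"))
```

By default `ttest` assumes equal variances in the two groups, so its t statistic should match the one on `union` in `regress wage union` in size, though the sign may be reversed because `ttest` reports the difference as group 0 minus group 1. In the two-variable model the two lines are parallel: they share the `grade` slope and differ in height by the `union` coefficient.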