SO5032: Lab Materials
1. Week 7 Lab
1.1. Non-linearity
1.1.1. Sketch a quadratic function
Using pen and paper and/or Excel, plot the curve \(Y=20+0.75X-0.03X^2\).
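If you would like to check your sketch in R rather than Excel, the following is a minimal sketch using base graphics. The function name `quad` and the plotting range of 0 to 40 for X are assumptions chosen for illustration, not part of the exercise.

```r
# The quadratic from the exercise: Y = 20 + 0.75X - 0.03X^2
quad <- function(x) 20 + 0.75*x - 0.03*x^2

# Turning point: the slope 0.75 - 2*0.03*X is zero here
xmax <- 0.75 / (2*0.03)
ymax <- quad(xmax)
c(xmax, ymax)

# Draw the curve over an assumed range of X and mark the turning point
curve(quad, from=0, to=40, xlab="X", ylab="Y")
abline(v=xmax, lty=2)
```

Because the coefficient on the squared term is negative, the curve rises to the turning point and falls after it.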
1.1.2. Age example
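Age often has a non-linear effect, for example on income. The code below compares four ways of handling it: a single linear term, separate regressions for younger and older respondents, a quadratic term, and age grouped into ten-year bands.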
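# read.dta() comes from foreign; ggplot2 draws the plots
library(foreign)
library(ggplot2)
ex <- read.dta("http://teaching.sociology.ul.ie/so5032/example1.dta")
# Keep respondents aged 15 or over with income of at most 6000
ex <- ex[ex$age>=15,]
ex <- ex[ex$income<=6000,]
ggplot(data=ex, aes(x=age, y=income)) + geom_point(color="blue", alpha=0.1)
# Single linear effect of age
mod0 <- lm(data=ex, income ~ age)
# Separate linear fits for younger and older respondents
modyoung <- lm(data=subset(ex, age<=34), income ~ age)
modold <- lm(data=subset(ex, age>34), income ~ age)
# Quadratic: age plus age squared
ex$age2 <- ex$age^2
modquad <- lm(data=ex, income ~ age + age2)
# Grouped: age rounded to the nearest ten, entered as a factor
ex$ageg <- factor(10*round(ex$age/10))
modgroup <- lm(data=ex, income ~ ageg)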
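One way to read a quadratic model like the one above is to find the age at which predicted income peaks. The following is a minimal sketch on simulated data; the data frame `sim`, its coefficients, and the model name `simquad` are assumptions made for illustration and are not the lab data.

```r
set.seed(1)
# Simulated stand-in for the lab data (an assumption, not the real file)
sim <- data.frame(age = sample(15:80, 500, replace=TRUE))
sim$income <- 200 + 90*sim$age - sim$age^2 + rnorm(500, sd=300)

# I() squares age inside the formula, so no extra column is needed
simquad <- lm(income ~ age + I(age^2), data=sim)
b <- coef(simquad)

# Predicted income peaks where the slope b1 + 2*b2*age equals zero
peak <- -b[["age"]] / (2*b[["I(age^2)"]])
peak
```

The same calculation should carry over to the lab's quadratic model, using its coefficients on age and age2 in place of the simulated ones.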
To plot models:
1: Scatter with predicted values
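# newdata=ex returns one prediction per row of ex, so the assignment
# works even if lm() dropped rows with missing values
ex$pred0 <- predict(mod0, newdata=ex)
ggplot(data=ex, aes(x=age, y=income)) + geom_point(color="blue", alpha=0.1) +
  geom_point(aes(x=age, y=pred0), color="red")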
2: Scatter with one or more lines as functions
ggplot(data=ex, aes(x=age, y=income)) + geom_point(color="blue", alpha=0.1) +
  geom_function(data=ex, fun=function (x)
    mod0$coefficients["(Intercept)"] +
    mod0$coefficients["age"]*x) +
  geom_function(data=ex, color="red", fun=function (x)
    modyoung$coefficients["(Intercept)"] +
    modyoung$coefficients["age"]*x)
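Writing out each line as a function of its coefficients becomes awkward for curved or grouped models. An alternative, sketched here on simulated data, is to predict over a grid of ages and draw the predictions with `geom_line`. The names `sim`, `grid`, `simlin` and `simquad` are assumptions for illustration, and the simulated incomes are not the lab data.

```r
library(ggplot2)

set.seed(2)
# Simulated stand-in for the lab data (an assumption, not the real file)
sim <- data.frame(age = sample(15:80, 500, replace=TRUE))
sim$income <- 200 + 90*sim$age - sim$age^2 + rnorm(500, sd=300)

simlin  <- lm(income ~ age, data=sim)
simquad <- lm(income ~ age + I(age^2), data=sim)

# Predict over an evenly spaced grid of ages so the curves are smooth
grid <- data.frame(age = seq(15, 80, by=1))
grid$lin  <- predict(simlin,  newdata=grid)
grid$quad <- predict(simquad, newdata=grid)

# Scatter of the data with the linear fit (black) and quadratic fit (red)
p <- ggplot(sim, aes(x=age, y=income)) +
  geom_point(color="blue", alpha=0.1) +
  geom_line(data=grid, aes(y=lin)) +
  geom_line(data=grid, aes(y=quad), color="red")
p
```

Because `predict` does the arithmetic, the same pattern works for any model whose predictors can be built from the grid.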
Compare the results of the grouped model with this:
aggregate(income ~ ageg, ex, mean)
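The comparison works because a model with a single factor predictor reproduces the group means: the intercept is the mean of the reference group, and each coefficient is another group's difference from it. A minimal sketch on simulated data (the data frame `sim` and the names `simgroup`, `fromModel` and `groupMeans` are assumptions for illustration):

```r
set.seed(3)
# Simulated stand-in for the lab data (an assumption, not the real file)
sim <- data.frame(age = sample(15:80, 500, replace=TRUE))
sim$income <- 200 + 90*sim$age - sim$age^2 + rnorm(500, sd=300)
sim$ageg <- factor(10*round(sim$age/10))

simgroup <- lm(income ~ ageg, data=sim)
b <- coef(simgroup)

# Intercept = mean of the reference group; intercept + coefficient = other group means
fromModel <- c(b[[1]], b[[1]] + b[-1])
groupMeans <- aggregate(income ~ ageg, sim, mean)

# Side by side: the two columns of means should agree
cbind(groupMeans, fromModel=unname(fromModel))
```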
1.1.3. Model a non-linear relationship
Load the Week 6 R file at https://teaching.sociology.ul.ie/so5032/so5032unit06.R, and look for the reference to brgnp (second half).
Use it to fit models predicting birth rate using GNP as
- a linear effect
- a quadratic effect (GNP plus squared GNP)
- logged GNP and
- a grouped effect.
Consider the fit of the four models.
Plot the four predicted values as lines/curves on the same graph: how do they compare? Plot the residuals as well.
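As a starting point, the four specifications and a simple fit comparison might look like the sketch below. It runs on simulated data: the variable names `gnp` and `birthrate`, the group cut-points, and the simulated relationship are all assumptions, so substitute the names and data from the Week 6 file.

```r
set.seed(4)
# Simulated country-level data; gnp and birthrate are hypothetical names
sim <- data.frame(gnp = exp(runif(100, log(200), log(30000))))
sim$birthrate <- 90 - 7*log(sim$gnp) + rnorm(100, sd=4)
# Assumed cut-points for the grouped version
sim$gnpg <- cut(sim$gnp, breaks=c(0, 1000, 5000, Inf))

mods <- list(
  linear  = lm(birthrate ~ gnp, data=sim),
  quad    = lm(birthrate ~ gnp + I(gnp^2), data=sim),
  logged  = lm(birthrate ~ log(gnp), data=sim),
  grouped = lm(birthrate ~ gnpg, data=sim)
)

# Adjusted R-squared and AIC for each model
fit <- data.frame(
  adjR2 = sapply(mods, function(m) summary(m)$adj.r.squared),
  AIC   = sapply(mods, AIC)
)
fit

# Residuals against fitted values for one of the models
plot(fitted(mods$logged), resid(mods$logged))
abline(h=0, lty=2)
```

The simulated data were generated from a log relationship, so the ranking of models here says nothing about which specification suits the real data.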