SO5032: Lab Materials

1. Week 7 Lab

1.1. Non-linearity

1.1.1. Sketch a quadratic function

Using pen and paper and/or Excel, plot the curve \(Y=20+0.75X-0.03X^2\).
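If you want to check your sketch, the same curve can be drawn in R. This is a minimal sketch, not part of the original lab: the helper name `quad` and the plotting range 0 to 40 are assumptions chosen for illustration. It also uses the standard result that the turning point of \(a + bX + cX^2\) lies at \(X = -b/(2c)\).

```r
# Quadratic from the exercise: Y = 20 + 0.75X - 0.03X^2
# (quad is a hypothetical helper name, not from the lab files)
quad <- function(x) 20 + 0.75 * x - 0.03 * x^2

# Turning point of a + bX + cX^2 is at X = -b / (2c)
x_turn <- -0.75 / (2 * -0.03)
y_turn <- quad(x_turn)

# The range 0 to 40 is an arbitrary choice for illustration
curve(quad, from = 0, to = 40, xlab = "X", ylab = "Y")
abline(v = x_turn, lty = 2)  # dashed vertical line at the maximum
```

Because the coefficient on \(X^2\) is negative, the curve rises to a maximum and then falls.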

1.1.2. Age example

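Age often has a non-linear effect on outcomes such as income. How should we handle it? The code below fits a single straight line, separate lines for younger and older respondents, a quadratic, and a grouped (categorical) version of age.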

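library(foreign)   # read.dta() for Stata files
library(ggplot2)
ex <- read.dta("http://teaching.sociology.ul.ie/so5032/example1.dta")

# Keep respondents aged 15+ with income up to 6000.
# subset() also drops rows where the condition is NA.
ex <- subset(ex, age >= 15)
ex <- subset(ex, income <= 6000)

ggplot(data=ex, aes(x=age, y=income)) + geom_point(color="blue", alpha=0.1)

# One straight line, then separate lines for age <= 34 and age > 34
mod0 <- lm(data=ex, income ~ age)
modyoung <- lm(data=subset(ex, age<=34), income ~ age)
modold <- lm(data=subset(ex, age>34), income ~ age)

# Quadratic: age plus age squared
ex$age2 <- ex$age^2
modquad <- lm(data=ex, income ~ age + age2)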

# Grouped: round age to the nearest 10 and treat it as categorical
ex$ageg <- factor(10*round(ex$age/10))
modgroup <- lm(data=ex, income ~ ageg)

To plot the models:

1: Scatter with predicted values

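# newdata=ex gives one prediction per row of ex, even if lm() dropped rows with missing values
ex$pred0 <- predict(mod0, newdata=ex)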
ggplot(data=ex, aes(x=age, y=income)) +
  geom_point(color="blue", alpha=0.1) +
  geom_point(aes(x=age, y=pred0), color="red")

2: Scatter with one or more lines as functions

ggplot(data=ex, aes(x=age, y=income)) + geom_point(color="blue", alpha=0.1) +
  geom_function(data=ex, fun=function (x)
    mod0$coefficients["(Intercept)"] +
    mod0$coefficients["age"]*x) +
  geom_function(data=ex, color="red", fun=function (x)
    modyoung$coefficients["(Intercept)"] +
    modyoung$coefficients["age"]*x)
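The same approach extends to a curve. The sketch below is an illustration only: it uses simulated data in a hypothetical data frame `sim` rather than the real example1.dta, and the names `simquad` and `quadfun` are invented here. It fits the same specification as modquad and draws the fitted quadratic with geom_function.

```r
library(ggplot2)

# Simulated stand-in for the lab data (NOT the real example1.dta)
set.seed(1)
sim <- data.frame(age = runif(500, 15, 80))
sim$income <- 200 + 90 * sim$age - 1 * sim$age^2 + rnorm(500, sd = 300)

# Same specification as modquad in the lab
sim$age2 <- sim$age^2
simquad <- lm(income ~ age + age2, data = sim)
b <- coef(simquad)

# Predicted income as a function of age, built from the coefficients
quadfun <- function(x) b["(Intercept)"] + b["age"] * x + b["age2"] * x^2

p <- ggplot(sim, aes(x = age, y = income)) +
  geom_point(color = "blue", alpha = 0.1) +
  geom_function(fun = quadfun, color = "red")
print(p)
```

With the real data, replacing `sim` with ex and `simquad` with modquad should give the corresponding curve for the lab model.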

Compare the coefficients of the grouped model (modgroup) with the mean income in each age group:

aggregate(income ~ ageg, ex, mean)
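To see why the two agree, the following sketch rebuilds the group means from the coefficients of a grouped model. It uses simulated data in a hypothetical data frame `sim`, not the lab data, and the names `simgroup`, `means` and `rebuilt` are invented for the illustration. With R's default treatment coding, the intercept is the mean of the reference group and each other coefficient is that group's mean minus the reference mean.

```r
# Simulated stand-in for the lab data (NOT the real example1.dta)
set.seed(2)
sim <- data.frame(age = round(runif(400, 15, 84)))
sim$income <- 500 + 20 * sim$age + rnorm(400, sd = 200)

# Same grouping rule as the lab: round age to the nearest 10
sim$ageg <- factor(10 * round(sim$age / 10))
simgroup <- lm(income ~ ageg, data = sim)

means <- aggregate(income ~ ageg, sim, mean)

# Intercept = mean of the reference (first) group;
# each other coefficient = that group's mean minus the reference mean
b <- coef(simgroup)
rebuilt <- c(b[1], b[1] + b[-1])

# Side by side: group means and the values rebuilt from the model
cbind(means, from_model = unname(rebuilt))
```

The two columns should match, which is a useful check that the grouped model is simply estimating one mean per age group.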

1.1.3. Model a non-linear relationship

Load the Week 6 R file at https://teaching.sociology.ul.ie/so5032/so5032unit06.R, and look for the reference to brgnp in the second half of the file.

Use it to fit models predicting birth rate using GNP as

  • a linear effect
  • a quadratic effect (GNP plus squared GNP)
  • logged GNP and
  • a grouped effect.

Consider the fit of the four models.

Plot the predicted values from the four models as lines or curves on the same graph. How do they compare? Plot the residuals as well.
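One possible way to organise the comparison is sketched below. Everything in it is an assumption: the data are simulated, and the names `d`, `gnp`, `birthrate`, `gnp2`, `loggnp` and `gnpg` are hypothetical, so substitute the variables you find in the Week 6 file. Grouping by quartiles is just one arbitrary choice of groups.

```r
# Hypothetical stand-in data: variable names and values are invented
set.seed(3)
d <- data.frame(gnp = exp(runif(150, log(200), log(30000))))
d$birthrate <- 90 - 8 * log(d$gnp) + rnorm(150, sd = 4)

d$gnp2 <- d$gnp^2
d$loggnp <- log(d$gnp)
# Quartile groups are one arbitrary way to group GNP
d$gnpg <- cut(d$gnp, breaks = quantile(d$gnp, 0:4 / 4), include.lowest = TRUE)

mods <- list(
  linear    = lm(birthrate ~ gnp, data = d),
  quadratic = lm(birthrate ~ gnp + gnp2, data = d),
  logged    = lm(birthrate ~ loggnp, data = d),
  grouped   = lm(birthrate ~ gnpg, data = d)
)

# Same outcome in every model, so adjusted R-squared and AIC are comparable
fit <- data.frame(
  adj_r2 = sapply(mods, function(m) summary(m)$adj.r.squared),
  aic    = sapply(mods, AIC)
)
fit
```

Higher adjusted R-squared and lower AIC indicate better fit. Keeping the models in a named list also makes it easy to loop over them later with predict() and residuals() for the plots.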

Author: Brendan Halpin

Created: 2026-03-10 Tue 11:52
