UCAS, ethnicity and admission rates

UCAS, the UK university admissions clearing house, have released data relating to ethnicity and admissions to English universities, in part in response to Vikki Boliver’s research in Sociology suggesting that members of ethnic minorities are less likely to be admitted to Russell Group universities.

The analysis note with the release is sober and correct, showing a mostly consistent pattern of offer rates for ethnic minority students being lower (but not far lower) than expected. However, UCAS’s press release seems to have suggested that the effect is almost explained away, and attributes it to ethnic minority students disproportionately applying to courses with low acceptance rates. This does not seem to be the case.

Update: see also next blog entry.

The data comes as a spreadsheet, with numbers of applicants and the offer rate broken down by

  • Tariff band (institutions grouped in 3 bands by typical entry requirements, a proxy for the institution’s prestige)
  • Subject area (25 groups)
  • Predicted A-level points (10-17, excludes the highest band)
  • Ethnicity (white/non-white).

We can recover the number of offers by multiplying the rate by the number of applicants and rounding, so we can regard this as a 4-way table containing the number of trials and the number of successes in each cell (i.e., number of applicants and number of offers). This lends itself to a grouped logistic regression.
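As a rough illustration of that recovery step, here is a minimal Python sketch (the post's own analysis is in Stata, further down). The applicant count and rate are invented, not taken from the UCAS file, and the helper name `recover_offers` is hypothetical.

```python
def recover_offers(applicants: int, offer_rate: float) -> int:
    """Recover the integer number of offers from a published offer rate."""
    # The published rate is a rounded proportion, so the product is
    # generally not an integer and has to be rounded back to a count.
    return int(round(applicants * offer_rate))


# Hypothetical cell: 240 applicants with a published offer rate of 0.783
applicants = 240
rate = 0.783
offers = recover_offers(applicants, rate)

# Each cell then holds (trials, successes) for a grouped logistic regression
print(offers, applicants - offers)  # prints: 188 52
```

The recovered count is only exact if the published rate carries enough decimal places for the size of the cell; with a coarsely rounded rate and a very large cell, the result could be off by a few offers.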

If we fit a model with the full interaction between tariff, subject and predicted points, but without ethnicity, the predicted offer rate shows what we should expect if white and non-white applicants had the same outcomes, controlling for what they apply for and for the measure of ability.
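Because this model has a separate parameter for every tariff × subject × points combination and no ethnicity term, its fitted probability in each combination is simply the pooled offer rate across both ethnic groups. The Python sketch below uses that shortcut rather than fitting a regression, on invented rows (not the UCAS data); the function name `rates_for` is hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (tariff, subject, points, ethnicity, applicants, offers)
rows = [
    ("Higher", "Law", 14, "White", 800, 640),
    ("Higher", "Law", 14, "Non-white", 200, 140),
    ("Lower", "Law", 12, "White", 300, 240),
    ("Lower", "Law", 12, "Non-white", 100, 85),
]

# Pool applicants and offers within each tariff x subject x points cell,
# ignoring ethnicity: this is what the no-ethnicity model fits.
pooled = defaultdict(lambda: [0, 0])
for tariff, subject, points, _eth, n, k in rows:
    pooled[(tariff, subject, points)][0] += n
    pooled[(tariff, subject, points)][1] += k


def rates_for(ethnicity, tariff):
    """Return (predicted, observed) offer rates for one group in one tariff band."""
    apps = expected = observed = 0.0
    for t, s, p, eth, n, k in rows:
        if eth == ethnicity and t == tariff:
            cell_n, cell_k = pooled[(t, s, p)]
            apps += n
            expected += n * cell_k / cell_n  # offers expected at the pooled rate
            observed += k
    return expected / apps, observed / apps


predicted, observed = rates_for("Non-white", "Higher")
print(f"predicted {predicted:.2f}, observed {observed:.2f}")
# prints: predicted 0.78, observed 0.70
```

With many cells per tariff band, the predicted rate for a group is the applicant-weighted average of the pooled cell rates, which is how differences in what each group applies for are taken into account.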

The table gives, by tariff band and ethnicity, the mean predicted offer rate (first row of each cell) and the mean observed offer rate (second row).

-----------------------------------
          |       Ethnicity
Tariff    | Non-white  White  Total
----------+------------------------
1. Higher |   0.75     0.78    0.78
          |   0.70     0.79    0.78
          |
2. Medium |   0.87     0.85    0.85
          |   0.85     0.85    0.85
          |
3. Lower  |   0.85     0.83    0.84
          |   0.86     0.83    0.84
          |
Total     |   0.82     0.82    0.82
          |   0.80     0.82    0.82
-----------------------------------

From this we see a modest but nonetheless distinct pattern: non-white applicants have an observed offer rate of 70% in high tariff universities, where given their characteristics the predicted rate (if ethnicity has no effect) is 75%. For medium tariff universities, the observed rate is 85% against a predicted 87%, and for low tariff universities ethnic minority applicants receive slightly more offers than predicted (86% against 85%).

If we add ethnicity to this model, as a single effect, the odds ratio for whites versus non-whites is 1.229 (CI 1.217-1.241). If we allow that to vary by tariff, the OR is 1.526 for high tariff, 1.188 for medium and 0.813 for low.
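To get a feel for the scale of these odds ratios, here is a small Python sketch that turns two offer rates into a crude odds ratio. It uses the rounded observed rates for higher tariff universities from the table above; the function names are hypothetical, and the result is an unadjusted figure, not the model estimate.

```python
def odds(p: float) -> float:
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """Odds of an offer for group A relative to group B."""
    return odds(p_a) / odds(p_b)


# Observed offer rates in higher tariff universities (rounded, from the table)
white, non_white = 0.79, 0.70
print(round(odds_ratio(white, non_white), 2))  # prints: 1.61
```

The crude ratio differs from the 1.526 reported for high tariff universities because the model estimate is adjusted for subject and predicted points, and because the table rates are rounded to two decimal places.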

These are relatively modest effects, but they are clear and strongly significant. How they arise is another question, but the data show that they are not explained by where ethnic minority students apply, nor by their predicted A-level performance.

While it is to UCAS’s credit that they have made this data available, it is incomplete, excluding applications to Oxbridge, medicine, and those with A*A*A* predicted performance.

Code to carry out the analysis is below:

// Sep 18 2015 12:36:44
// See https://www.ucas.com/corporate/data-and-analysis/analysis-notes

// The data is a table for 2010-2014 of applications and offers by
//  - ethnicity (white/other)
//  - tariff band (higher/medium/lower)
//  - subject area, and
//  - predicted A-levels points, from 10-17

// I treat it as a 4-way table, where cells contain n offers out of m applications
// thus grouped logistic regression is appropriate.

insheet using additional-data-file-analysis-note-2015-05.csv

// Data gives an acceptance rate: turn it into numbers by multiplying by group size
gen offers = round(output_app * output_act)

// Create integer versions of variables
encode ethnicity_group, gen(ethnicity)
encode tariffband , gen(tar)
encode subject_group, gen(subj)
gen points = real(substr(pred, 1,2)) - 10


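// The three-way interaction has many parameters, so raise the matrix size limit
set matsize 6000

// Null model: full points x tariff x subject interaction, no ethnicity term
glm offers i.points##i.tar##i.subj, family(binomial output_app)

// Predicted number of offers in each cell (the default prediction after glm)
predict n1

// Convert the predicted count to a predicted offer rate
gen prate = n1/output_applications

// Mean predicted and observed offer rates by tariff band and ethnicity,
// weighted by number of applicants
table tariff ethnicity [freq=output_ap], c(mean prate mean output_ac) format(%5.2f) row col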
