I had an interesting exchange by e-mail with Maggie Smith, the analyst at UCAS responsible for the note discussed in my previous blog entry. She tells me that the conclusion I derived from the released data (the small but significant ethnicity effect) is very much attenuated when the data are broken down by provider, and that the published UCAS analysis is based on such disaggregated data.
The released data, however, pools across providers, with the cells in the table being the crossing of subject (25), tariff (3), predicted points (8) and ethnicity (2). It is understandable that the data broken down by provider are not released.
Along with the raw data at this level, UCAS also provided synthetic data based on the disaggregated data, in the form of predicted offer rates, assuming no ethnicity effect. In my analysis I ignored this because I wanted to run an independent analysis on raw data, one appropriate to making inferences from count data.
UCAS’s argument is that within these cells, ethnic minority applicants disproportionately choose more competitive courses. At a statistical level this is entirely plausible, though it doesn’t seem to be replicated at the level of tariff or subject group in the released data. That is, ethnic minority students are less likely to apply to high tariff institutions, and the pattern across subject group is inconclusive (there is a weak tendency for minorities to apply for less competitive subject groups). However, it is quite common for patterns to change at different levels of aggregation.
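To make the aggregation point concrete, here is a minimal sketch in Python using made-up counts (these are not UCAS figures, and the course labels are hypothetical). Within each course the two groups have identical offer rates, yet the pooled rates differ because one group applies mostly to the more competitive course.

```python
# Hypothetical counts, invented for illustration only (not UCAS data):
# (course, ethnicity) -> (applications, offers)
COUNTS = {
    ("competitive", "white"): (100, 50),
    ("competitive", "minority"): (300, 150),
    ("less competitive", "white"): (300, 270),
    ("less competitive", "minority"): (100, 90),
}


def offer_rate(counts, ethnicity, course=None):
    """Offer rate for one group, within a single course or pooled over all courses."""
    apps = 0
    offers = 0
    for (c, e), (a, o) in counts.items():
        if e == ethnicity and (course is None or c == course):
            apps += a
            offers += o
    return offers / apps


# Within each course the rates are equal by construction.
for course in ("competitive", "less competitive"):
    print(course,
          offer_rate(COUNTS, "white", course),
          offer_rate(COUNTS, "minority", course))

# Pooled over courses, a gap appears purely from where each group applies.
print("pooled", offer_rate(COUNTS, "white"), offer_rate(COUNTS, "minority"))
```

In this toy case the pooled rates are 0.8 and 0.6 even though neither course treats the groups differently, which is the mechanism UCAS describes. Whether the real provider-level data behave this way is exactly what cannot be checked from the released table.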
Using the released data, we can assess the relative contribution of different features to statistically explaining the ethnicity difference in offer rates. In the table below I present offer rates for ethnic minority students under a variety of models. The first model assumes offer rates are constant within tariff band, and we see an expected 78% rate for high tariff compared with an actual rate of 70%. The second model allows offer rates to differ by subject within tariff, which reduces the expected rate to 76%. Allowing this to vary further by predicted A-level points reduces the expected rate a little more, to 75%. Allowing a single ethnicity effect (common across all tariff/subject/points combinations) reduces this only to 73%, still short of the observed 70%, showing that in fact the ethnicity effect is quite variable.
Predicted acceptance rates for ethnic minority applicants
| Tariff | Model: tariff | Model: tariff × subject | Model: tariff × subject × points | Model: tariff × subject × points + ethnicity | Observed |
|---|---|---|---|---|---|
| Higher | 0.78 | 0.76 | 0.75 | 0.73 | 0.70 |
| Medium | 0.85 | 0.87 | 0.87 | 0.85 | 0.85 |
| Lower | 0.84 | 0.85 | 0.85 | 0.83 | 0.86 |
| Total | 0.82 | 0.83 | 0.82 | 0.80 | 0.80 |
Thus we see that, in the aggregated data, the tariff band and subject group that ethnic minority students apply to account for some of their lower acceptance rate at high-tariff institutions, and their predicted A-level points account for a little more of the difference, but we are left with a significant (and variable) ethnicity effect. Without access to the disaggregated data, we can accept but not verify UCAS’s claim that this ethnicity penalty is much more attenuated in the provider-level data set.
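For readers who want to see how expected rates of this kind can be obtained from cell counts, here is a minimal Python sketch. It is not the code behind the table: it assumes that, for the models without an ethnicity term, the fitted offer rate in each stratum is simply the pooled rate across both ethnic groups, and it applies those rates to the minority applicants. The counts, the `CELLS` layout and the function names are all hypothetical, and the model with a common ethnicity effect is not covered, since that needs a fitted regression rather than simple pooling.

```python
from collections import defaultdict

# Hypothetical toy cells, invented for illustration (NOT the released UCAS data).
CELLS = [
    {"tariff": "higher", "subject": "A", "ethnicity": "white", "apps": 100, "offers": 80},
    {"tariff": "higher", "subject": "A", "ethnicity": "minority", "apps": 50, "offers": 35},
    {"tariff": "higher", "subject": "B", "ethnicity": "white", "apps": 100, "offers": 60},
    {"tariff": "higher", "subject": "B", "ethnicity": "minority", "apps": 150, "offers": 75},
]


def expected_rate(cells, strata, group="minority"):
    """Expected offer rate for `group` if offer rates depend only on `strata`."""
    apps = defaultdict(int)
    offers = defaultdict(int)
    # Pool both ethnic groups to estimate one offer rate per stratum.
    for c in cells:
        key = tuple(c[s] for s in strata)
        apps[key] += c["apps"]
        offers[key] += c["offers"]
    # Apply the pooled stratum rates to the group's own applications.
    expected_offers = 0.0
    group_apps = 0
    for c in cells:
        if c["ethnicity"] != group:
            continue
        key = tuple(c[s] for s in strata)
        expected_offers += c["apps"] * offers[key] / apps[key]
        group_apps += c["apps"]
    return expected_offers / group_apps


def observed_rate(cells, group="minority"):
    """Actual offer rate for `group`."""
    rows = [c for c in cells if c["ethnicity"] == group]
    return sum(c["offers"] for c in rows) / sum(c["apps"] for c in rows)


print(round(expected_rate(CELLS, ["tariff"]), 3))             # 0.625
print(round(expected_rate(CELLS, ["tariff", "subject"]), 3))  # 0.597
print(round(observed_rate(CELLS), 3))                         # 0.55
```

As in the table, adding a stratifying factor moves the expected rate towards the observed one, and whatever gap remains is what the stratifying factors fail to explain.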
While it is entirely reasonable not to release the provider-level data publicly, it could be helpful to release them to interested researchers outside UCAS. It would also help to make fuller data available, including Oxbridge, medicine, and A*A*A* students, categories currently excluded from the data.