There are a staggering number of different names for these models, with different disciplines using different terminology. In the language used in this course, fixed effects are varying coefficients (which can be slopes or intercepts) that are implemented by creating group dummies, random effects are varying coefficients (again, slopes or intercepts) which are drawn from a common distribution, and multilevel models are models which include any of these effects.
Unfortunately, one of the most widely used packages in R for multilevel models uses some confusing terminology. The lme4 package uses the terms fixed effects and random effects¹ differently: its fixed effects are the regular, non-varying coefficients, while its random effects are the varying coefficients (the group-level intercepts and slopes).
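To make the two vocabularies concrete, here is a minimal sketch (using simulated toy data, not the radon data introduced below) of the two formula styles: group dummies for what this course calls fixed effects, and an lme4 varying intercept for random effects.
library(lme4)
# Toy data, purely illustrative: outcome y, predictor x, grouping variable g
set.seed(1)
d <- data.frame(x = rnorm(90), g = rep(c('a', 'b', 'c'), each = 30))
d$y <- 2 * d$x + rep(c(-1, 0, 1), each = 30) + rnorm(90)
# Varying intercepts via group dummies (fixed effects in this course's language)
fe <- lm(y ~ x + as.factor(g), data = d)
# Varying intercepts drawn from a common distribution (lme4's random effects)
re <- lmer(y ~ x + (1 | g), data = d)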
Multilevel models allow us to do two very powerful things: control for unobserved variation across groups in grouped data, and control for the fact that group level variables don’t actually introduce as many independent observations as we think they do. Load the radon data from https://www.stat.columbia.edu/~gelman/arm/examples/radon/srrs2.dat, and log transform the radon activity variable (replace any values of 0 with .1 to avoid problems in the transformed variable). Drop any observations with a value of 9 for floor, as this is a missing data code and treating it as a value will bias our estimates. Use state2 as the state variable for all subsequent examples.
sr <- read.delim('https://www.stat.columbia.edu/~gelman/arm/examples/radon/srrs2.dat', sep = ',')
sr$radon <- log(ifelse(sr$activity == 0, .1, sr$activity))  # recode 0s to .1 before logging
sr <- sr[which(sr$floor < 9), ]  # floor == 9 is a missing data code
Run a standard linear model (the complete pooling model) explaining the level of radon as a function of floor, and a model with fixed effects, using states as the group. Remember to generate group dummies for the fixed effects model.
pooled <- lm(radon ~ floor, data = sr)
fix_int <- lm(radon ~ floor + as.factor(state2), data = sr)    # varying intercepts via group dummies
fix_slope <- lm(radon ~ floor * as.factor(state2), data = sr)  # dummies interacted with floor: varying slopes and intercepts
We can also estimate a fixed effects model with varying slopes instead of intercepts, although again, interpreting and presenting these results is not the easiest task. Create group dummies like before, except this time interact them with floor. This produces a model with varying slope and intercept fixed effects; this is closest to the completely separate regression approach we saw last week with plyr. We could also omit the group dummies as a separate predictor, yielding a model with varying slope fixed effects. The figure below shows the regression lines for each model. In the next section we’ll see how to create these plots for random effects models.
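As a rough sketch of those last two steps (the object names fix_slope_only, fit_fe, and fe_plot are our own, not from the original), the slope-only model and a plot of the fitted lines from the interaction model might look like this:
fix_slope_only <- lm(radon ~ floor + floor:as.factor(state2), data = sr)  # slopes vary, intercept shared
library(ggplot2)
sr$fit_fe <- predict(fix_slope)  # fitted values from the varying slope and intercept model
fe_plot <- ggplot(data = sr, aes(x = floor, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit_fe)) +
theme_bw()
fe_plot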
If we want to estimate a varying intercept model by county, there are some difficulties we run into. There are 386 counties in the data, so this model would be very difficult to read and work with. As an alternative, we might want to estimate a random effects model. Doing so would let us describe the distribution that the varying intercepts are drawn from, allowing us to better understand what a random intercept of -.72 means relative to the mean of the distribution the intercepts are drawn from, and to leverage similarities across the 386 groups. We can use the lmer() function in the lme4 package to accomplish this. This function produces a model object of class lmerMod. Estimate a random effects model with radon levels explained by floor and a varying intercept by state (it’s much easier to understand how these models work with 9 groups instead of 386).
library(lme4)
no_pooling_lme <- lmer(radon ~ floor + (1 | state2), data = sr)
This model assumes that there is a different baseline level of radon based on differences in state2. When you call the model object, or use summary() on it, you’ll get the usual estimates and standard errors for your fixed effects (your regular coefficients), but only the variance and standard deviation for your random effects (the varying intercepts). Since we assume that these intercepts are all drawn from the same distribution, we can describe the set of intercepts using a standard deviation or variance. If we want to examine what these intercepts actually are, we can use the coef() function. Using coef() on an lmerMod is convenient, because it will automatically combine any fixed and random effects (remember what these mean in lme4…), and then report the resulting coefficients by group membership.
coef(no_pooling_lme)$state2
## (Intercept) floor
## AZ 0.654 -0.68
## IN 1.061 -0.68
## MA 0.732 -0.68
## MI 0.083 -0.68
## MN 1.251 -0.68
## MO 0.776 -0.68
## ND 1.715 -0.68
## PA 1.232 -0.68
## WI 1.117 -0.68
If we want to extract the individual fixed and random effects, we can do so using fixef() and ranef(), respectively.
head(ranef(no_pooling_lme)$state2)
## (Intercept)
## AZ -0.30
## IN 0.10
## MA -0.23
## MI -0.87
## MN 0.29
## MO -0.18
fixef(no_pooling_lme)
## (Intercept) floor
## 0.96 -0.68
We can also access the standard errors of the fixed and random effects using se.fixef() and se.ranef() from the arm package. The REextract() function in merTools allows us to easily extract both the estimates and standard errors of random effects.
library(arm)
se.ranef(no_pooling_lme)$state2
## (Intercept)
## AZ 0.027
## IN 0.023
## MA 0.025
## MI 0.069
## MN 0.029
## MO 0.023
## ND 0.025
## PA 0.021
## WI 0.047
se.fixef(no_pooling_lme)
## (Intercept) floor
## 0.155 0.023
library(merTools)
REextract(no_pooling_lme)
## groupFctr groupID (Intercept) (Intercept)_se
## 1 state2 AZ -0.30 0.027
## 2 state2 IN 0.10 0.023
## 3 state2 MA -0.23 0.025
## 4 state2 MI -0.87 0.069
## 5 state2 MN 0.29 0.029
## 6 state2 MO -0.18 0.023
## 7 state2 ND 0.76 0.025
## 8 state2 PA 0.27 0.021
## 9 state2 WI 0.16 0.047
We can even combine the fixed and random effects to recover the reported effects returned by coef(). Let’s try this for Missouri, comparing the effect of both the intercept and floor.
fixef(no_pooling_lme)[1] == coef(no_pooling_lme)$state2[6,1]
fixef(no_pooling_lme)[1] + ranef(no_pooling_lme)$state2[6,] == coef(no_pooling_lme)$state2[6, 1]
fixef(no_pooling_lme)[2] == coef(no_pooling_lme)$state2[6, 2]
## (Intercept)
## FALSE
## (Intercept)
## TRUE
## floor
## TRUE
An important thing to note here is that the intercept is affected by the random effect. In this case, the random effect term (1 | state2) gives each state its own deviation from a shared baseline, so we need to combine the fixed and random effect of the intercept to get its total effect for any given state. In other words, the intercept reported by fixef() does not correspond to any particular state in the data; it is the baseline that all of the states are relative to. However, this model doesn’t have a random effect varying slope term, so the fixed effect of floor is its total effect.
This is a lot of words to describe what’s happening in a random effects varying intercept model, but sometimes it’s easier to understand with a picture. Use the predict() function on our lmerMod object to get fitted values and then plot radon and floor values with the regression line for each state.
sr$fit <- predict(no_pooling_lme)
ggplot(data = sr, aes(x = floor, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit)) +
theme_bw() +
theme(legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
Now let’s add some county-level data to our data so that we can include effects as a function of county-level differences. Load the data file of county level info from https://www.stat.columbia.edu/~gelman/arm/examples/radon/cty.dat.
Before we can merge the county-level data onto the individual-level data, we need to have common identifiers. These data are from the US, so they use FIPS codes. The data only have separate county and state level FIPS codes. The formula for full FIPS codes is $\text{state FIPS} \times 1000 + \text{county FIPS}$. Create full FIPS codes in both the individual and county level data, then merge them together.
cty <- read.delim('https://www.stat.columbia.edu/~gelman/arm/examples/radon/cty.dat', sep = ',')
sr$fips <- sr$stfips * 1000 + sr$cntyfips
cty$fips <- cty$stfips * 1000 + cty$ctfips
ml_data <- merge(sr, cty)
Write a model where radon is explained by floor and typebldg with a varying intercept by Uppm, which is a continuous predictor measured at the county level. There are 467 counties in the data, but 463 different values of Uppm, indicating that some counties have the same value. This means that Uppm is an actual substantively interesting measurement and not just a group ID variable.
mlm <- lmer(radon ~ floor + typebldg + (1 | Uppm), data = ml_data)
Note that in this case, the baseline case where Uppm = 0 makes sense because this is a measurement, and we have 165 cases in our data where Uppm actually does equal zero. Thus, coef(mlm)$Uppm[1, 1] represents the actual intercept in these cases, which is a combination of fixef(mlm)[1] and ranef(mlm)$Uppm[1, ].
coef(mlm)$Uppm[1, 1] == fixef(mlm)[1] + ranef(mlm)$Uppm[1,]
## (Intercept)
## TRUE
The real advantage of lme4, however, is its ability to specify varying slope random effects models. In these models, the effect of a coefficient depends on group membership. Write a model where radon is explained by typebldg, but with a coefficient that varies by state2. We can think of this model as an interaction, since the coefficient for the effect of typebldg on radon is a function of the value of state2.
mlm <- lmer(radon ~ (typebldg | state2), data = ml_data)
As with the varying intercept model above, we can access the fixed, random, and total effects separately. Compare the typebldg coefficient for Indiana in the random effect component to the typebldg coefficient for Indiana returned by coef().
coef(mlm)$state2
## typebldg (Intercept)
## AZ -0.029 0.022
## IN -0.158 0.889
## MA -0.094 0.765
## MI 0.043 -0.369
## MN -0.225 1.342
## MO -0.192 0.710
## ND -0.202 1.829
## PA -0.160 1.328
## WI -0.156 0.828
fixef(mlm)
## (Intercept)
## 0.14
ranef(mlm)$state2
## (Intercept) typebldg
## AZ -0.11 -0.029
## IN 0.75 -0.158
## MA 0.63 -0.094
## MI -0.51 0.043
## MN 1.21 -0.225
## MO 0.57 -0.192
## ND 1.69 -0.202
## PA 1.19 -0.160
## WI 0.69 -0.156
ranef(mlm)$state2[2, 2] == coef(mlm)$state2[2, 1]
## [1] TRUE
Just like we can use plots to visually explore the effect of random intercepts, we can do the same with random slopes. Use the predict() command again to create a scatter plot of radon levels against building type, and include the linear regression line for each state.
ml_data$fit <- predict(mlm)
slope_plot <- ggplot(data = ml_data, aes(x = typebldg, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit)) +
theme_bw() +
theme(legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
slope_plot
In this case, the coefficient of typebldg is equal to the random effect because we did not specify a fixed effect for it in the model formula. If we did, then we would have to combine the appropriate random effect with the fixed effect to obtain our total effect. Try this with a model that explains radon as a function of floor and typebldg, with a varying coefficient on typebldg by Uppm.
mlm <- lmer(radon ~ floor + typebldg + (typebldg | Uppm), data = ml_data)
summary(mlm)
## Linear mixed model fit by REML ['lmerMod']
## Formula: radon ~ floor + typebldg + (typebldg | Uppm)
## Data: ml_data
##
## REML criterion at convergence: 35058
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.594 -0.610 -0.001 0.605 5.176
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Uppm (Intercept) 0.30489 0.5522
## typebldg 0.00225 0.0474 -0.64
## Residual 0.88247 0.9394
## Number of obs: 12631, groups: Uppm, 463
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.2100 0.0356 34.02
## floor -0.7049 0.0222 -31.73
## typebldg -0.1142 0.0177 -6.45
##
## Correlation of Fixed Effects:
## (Intr) floor
## floor -0.204
## typebldg -0.562 -0.061
ranef(mlm)$Uppm[2, 2] == coef(mlm)$Uppm[2, 3]
## [1] FALSE
fixef(mlm)[3] + ranef(mlm)$Uppm[2, 2] == coef(mlm)$Uppm[2, 3]
## typebldg
## TRUE
We can also specify a random effects model without a varying intercept term. Try writing a model that explains the radon level as a function of typebldg with a varying slope by state2. Look at the summary of the model object and notice that there is no longer an intercept listed under the random effects. This means that observations in each group will have their own slope, but use the overall intercept. Plotting this model clearly illustrates the difference from a model with an intercept in the random effect term.
mlm <- lmer(radon ~ (typebldg - 1 | state2), data = ml_data)
ml_data$fit <- predict(mlm)
slope_plot_noint <- ggplot(data = ml_data, aes(x = typebldg, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit)) +
labs(title = 'Shared Intercept, Random Slopes', x = 'building type') +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5),
legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
slope_plot <- slope_plot +
labs(title = 'Random Intercepts, Random Slopes', x = 'building type') +
theme(plot.title = element_text(hjust = 0.5))
ml_data$fit <- predict(lm(radon ~ typebldg, data = ml_data))
slope_plot_pool <- ggplot(data = ml_data, aes(x = typebldg, y = radon, color = state2)) +
geom_point(alpha = .25, position = 'jitter', size = .75) +
geom_line(aes(y = fit), color = 'darkgrey') +
labs(title = 'Shared Intercept, Shared Slope', x = 'building type') +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5),
legend.position = 'right',
plot.background = element_blank(),
panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.border = element_blank())
## arrange in order of increasing similarity
gridExtra::grid.arrange(slope_plot, slope_plot_noint, slope_plot_pool, nrow = 3)
By default, lme4 assumes that random effects sharing the same grouping factor are correlated. This is based on the assumption of unit heterogeneity, i.e., that there are unobserved factors at the group level that lead to differences in the effect of certain variables. If we think that there are group level effects, but that these effects are independent of one another, we can model them independently by using two pipes instead of one in the random effect term in the formula. This is similar to, but not quite the same as, specifying two different random effect terms. If we suppressed the intercept in all random effect terms, then they would be equivalent, e.g. (typebldg + stratum - 1 || Uppm) = (typebldg - 1 | Uppm) + (stratum - 1 | Uppm).
mlm <- lmer(radon ~ floor + (typebldg + stratum | Uppm), data = ml_data)
summary(mlm)$varcor
## Groups Name Std.Dev. Corr
## Uppm (Intercept) 0.5376
## typebldg 0.0858 -0.25
## stratum 0.0172 -0.09 -0.20
## Residual 0.9400
mlm <- lmer(radon ~ floor + (typebldg + stratum || Uppm), data = ml_data)
summary(mlm)$varcor
## Groups Name Std.Dev.
## Uppm (Intercept) 0.5116
## Uppm.1 typebldg 0.0709
## Uppm.2 stratum 0.0153
## Residual 0.9403
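As a quick check of the equivalence claimed above (a sketch; the object names m_sep and m_dbl are ours, not from the original), we can fit both intercept-suppressed forms and compare their variance components:
m_sep <- lmer(radon ~ floor + (typebldg - 1 | Uppm) + (stratum - 1 | Uppm), data = ml_data)
m_dbl <- lmer(radon ~ floor + (typebldg + stratum - 1 || Uppm), data = ml_data)
summary(m_sep)$varcor  # the two sets of variance components should match
summary(m_dbl)$varcor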
Use the WDI package to download GDP per capita (NY.GDP.PCAP.KD) and gross domestic savings (NY.GNS.ICTR.ZS) data for the Nordic countries from 1960-2016. Fit varying intercept, varying slope, and varying intercept and slope random effects models that explain GDP as a function of savings, with country as the group. Produce plots illustrating the regression lines for each country in each model.
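As a starting point, a sketch of the download step and one of the three models might look like the following (the ISO-2 country codes and the object names nordics and wdi_int are our assumptions; the remaining models and plots are left as part of the exercise):
library(WDI)
nordics <- WDI(country = c('DK', 'FI', 'IS', 'NO', 'SE'),
indicator = c('NY.GDP.PCAP.KD', 'NY.GNS.ICTR.ZS'),
start = 1960, end = 2016)
# Varying intercept by country; the other models replace (1 | country) with
# (NY.GNS.ICTR.ZS - 1 | country) or (NY.GNS.ICTR.ZS | country)
wdi_int <- lmer(NY.GDP.PCAP.KD ~ NY.GNS.ICTR.ZS + (1 | country), data = nordics)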
¹ Metaphor borrowed from Johannes Karreth.