Hi there,
In multiple regression, one way to view R^2 is as the square of the
correlation between the observed y's and the fitted y's.
Suppose you fit a multilevel model with a random intercept for each
cluster. Would it be valid to compute an R^2 from predictions that use
the fixed effects plus the estimated group intercepts, thereby reducing
the residuals?
I suspect this has been done and, given its absence from the lmer
output, refuted somewhere. A reference would be lovely.
Code below.
Many thanks,
Andy
#################################################################
require(lme4)
# First simulate some data (seed set so the example is reproducible)
set.seed(1)
people = 100
obs = 4
# The random intercepts:
sub_noise = data.frame(id = 1:people, rand_int = rnorm(people,0,60))
# Merge everything
id = rep(1:people,obs)
thedata = data.frame(id)
thedata = merge(sub_noise, thedata)
thedata$x = rep(1:obs,people)
# Setup the relationship between x and y
thedata$y = 23*thedata$x + 20 + thedata$rand_int +
  rnorm(people*obs, 0, 20) # plus residual noise
# Have a look
plot(y~x, data = thedata)
# Now fit a standard regression model
lm1 = lm(y ~ x, data = thedata)
summary(lm1)
# Use the model to get the R^2
predicted = coef(lm1)[1] + thedata$x * coef(lm1)[2]
plot(thedata$y, predicted)
# It's a noisy mess
cor(thedata$y, predicted)^2
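# Sanity check: for an ordinary regression with an intercept, this
# squared correlation equals the R^2 that summary(lm1) reports
summary(lm1)$r.squared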
# Now how about shifting each person's predictions using their own
# estimated (shrunken) group intercept?
lmer1 = lmer(y ~ x + (1|id), data=thedata)
summary(lmer1)
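# (Optional check on the simulation, not part of the question: the
# estimated random-intercept and residual SDs should sit near the true
# values of 60 and 20 used above)
VarCorr(lmer1)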
# Get the random intercepts and stick them in a table
ran_effects = data.frame(id = as.integer(rownames(ranef(lmer1)$id)),
                         b = ranef(lmer1)$id[, 1])
ran_effects_data = merge(thedata, ran_effects)
# Now compute the conditional predictions: fixed effects plus each
# person's estimated intercept
predicted.ml = fixef(lmer1)[1] + ran_effects_data$x * fixef(lmer1)[2] +
  ran_effects_data$b
# Compare against the y's from the merged data so the rows line up
plot(ran_effects_data$y, predicted.ml)
cor(ran_effects_data$y, predicted.ml)^2
# Looks much nicer
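# Cross-check (a sketch, relying on lme4's fitted() behaviour): fitted()
# on a merMod object returns conditional fitted values, i.e. fixed
# effects plus the conditional modes of the random intercepts, so this
# should agree closely with predicted.ml computed by hand above
cor(thedata$y, fitted(lmer1))^2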