I have two questions about the built-in function step. Ultimately I want to
apply a lm fitting and subsequent step procedure to thousands of data sets
groups by a factor defined as a unique ID.
Q1. The code below creates a data.frame comprising three marginally noisy
surfaces. The code below fails when I use step in a function but summary
seems to show the model fits are legitimate. Any ideas on what I'm doing
wrong?
# y = x1 + x2 for grp A
# y = 2 + 2x1 + 4x2 for grp B
# y = 3 + 2x1 + 4x2 + 6x1x2 for grp C
ind <- matrix(runif(200), ncol=2, dimnames=list(c(),
c("x1","x2")))
d1 <- data.frame(ind,
y=ind[,"x1"]+ind[,"x2"]+rnorm(100,0,0.05),
grp=rep("A",100))
d2 <- data.frame(ind,
y=2+2*ind[,"x1"]+4*ind[,"x2"]+rnorm(100,0,0.05),grp=rep("B",
100))
d3 <- data.frame(ind,
y=3+2*ind[,"x1"]+4*ind[,"x2"]+6*ind[,"x1"]*ind[,"x2"]+rnorm(100,0,0.05),grp=rep("C",100))
data2 <- rbind(d1,d2,d3)
# Fit each surface by grp
model <- y ~ x1*x2
fits <- lapply(split(data2, list(data2$grp), drop=TRUE),
function(x){lm(model, data = x)})
lapply(fits, function(x){summary(x)})
# FAIL
lapply(fits, function(x){step(x)})
# Output
Start: AIC=-601.41
y ~ x1 * x2
Df Sum of Sq RSS AIC
- x1:x2 1 0.00159 0.23 -602.71
<none> 0.23 -601.41
Error in as.data.frame.default(data) :
cannot coerce class "lm" into a data.frame
I get a different error if I use step directly on the first fit in the list.
step(fits[[1]])
# Output
Start: AIC=-601.41
y ~ x1 * x2
Df Sum of Sq RSS AIC
- x1:x2 1 0.00159 0.23 -602.71
<none> 0.23 -601.41
Error in eval(predvars, data, env) :
numeric 'envir' arg not of length one
However step works if I do the fit outside of a function.
fit <- lm(model, data=data2[which(data2$grp=="A"),])
step(fit)
# Output
Start: AIC=-601.41
y ~ x1 * x2
Df Sum of Sq RSS AIC
- x1:x2 1 0.00159 0.23 -602.71
<none> 0.23 -601.41
Step: AIC=-602.71
y ~ x1 + x2
Df Sum of Sq RSS AIC
<none> 0.23 -602.71
- x2 1 8.36 8.58 -241.54
- x1 1 9.55 9.78 -228.47
Call:
lm(formula = y ~ x1 + x2, data = data2[which(data2$grp == "A"), ])
Coefficients:
(Intercept) x1 x2
0.001457 0.999313 1.010880
Q2. How can I setup step to exclude the intercept. In this example above
both the intercept and interaction terms are insignificant. I have tried
setting direction to "both" and "forward" but that
didn't help.
Thanks for your help.
--
View this message in context:
http://www.nabble.com/Error-when-using-step-tp24142487p24142487.html
Sent from the R help mailing list archive at Nabble.com.