Dear all,
I have used the following code, but every time I run it the result does not
contain coefficients for all the variables in the predictor set.
# code
rm(list=ls())
library(caret)
# generating response and design matrix
X<-matrix(rnorm(50*100),nrow=50)
y<-rnorm(50*1)
# Applying caret package
con<-trainControl(method="cv",number=10)
data<-NULL
data<- train(X,y, "lasso", metric="RMSE",tuneLength = 10,
trControl = con)
coefs<-predict(data$finalModel,s=data$bestTune$.fraction, type
="coefficients", mode ="fraction")$coef
coefs
This is the output I got. As you can see, some of the predictors are
missing, e.g. V4, V6 and V7:
         V1          V2          V3          V5          V8          V9
 0.00000000  0.00000000  0.00000000  0.00000000  0.00000000  0.00000000
        V10         V11         V13         V14         V15         V17
 0.06165530  0.02693335  0.00000000  0.00000000  0.00000000 -0.15699831
        V19         V22         V24         V26         V27         V28
 0.00000000  0.00000000  0.00000000  0.00000000  0.00000000  0.00000000
        V33         V35         V36         V37         V39         V41
 0.00000000  0.00000000  0.00000000  0.00000000  0.00000000  0.00000000
        V42         V43         V45         V46         V47         V48
 0.00000000  0.00000000  0.00000000 -0.01881011  0.00000000  0.00000000
        V49         V50         V51         V52         V54         V55
 0.00000000  0.00000000  0.00000000  0.00000000  0.00000000  0.00000000
        V56         V57         V58         V60         V61         V64
 0.00000000  0.00000000  0.00000000  0.00000000 -0.02772797  0.01659148
        V65         V66         V67         V72         V74         V75
 0.00000000  0.00000000  0.00000000  0.00000000  0.00000000  0.21293642
        V77         V78         V79         V81         V84         V85
 0.00000000  0.00000000  0.00000000  0.04849013  0.04563922  0.00000000
        V86         V88         V91         V94         V95         V99
 0.00000000  0.00000000  0.00000000  0.06291593  0.00000000  0.00000000
       V100
 0.00000000
Thanks in advance
--
Linda Garcia
Linda,
Thanks for the example.
I did this to make it more reproducible:
set.seed(1)
X<-matrix(rnorm(50*100),nrow=50)
y<-rnorm(50*1)
dimnames(X)
# name all 100 columns V1..V100 (ncol, not nrow; no space in the names)
colnames(X) <- paste("V", 1:ncol(X), sep = "")
# Applying caret package
set.seed(2)
con<-trainControl(method="cv",number=10)
data<-NULL
data<- train(X,y, "lasso", metric="RMSE",tuneLength =
10, trControl = con)
I see your point here, but this code gives the same results:
library(elasticnet) # enet() lives here; lambda = 0 gives the lasso
fit2 <- enet(X, y, lambda = 0)
predict(fit2, mode = "fraction", s = data$bestTune$.fraction,
type = "coefficients")$coef
(at least train() names the predictors).
To me, it looks like enet is doing some filtering:
> dim(X)
[1] 50 100
> length(fit2$meanx)
[1] 56
This appears to be independent of caret. I would contact the elasticnet
package maintainer off-list and ask.
Max
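
Until the maintainer confirms what enet does, one possible workaround is to
compare the names in the returned vector with the full set of predictor names
and pad the gaps. The sketch below is only an illustration: it uses a toy
11-column case with the first few values from the output above, and it
assumes (this is not confirmed anywhere in the thread) that an omitted
predictor simply never entered the lasso path and so has a coefficient of
zero. The names all_names, missing_vars and full_coefs are made up for the
example.

```r
# Toy stand-in for the named vector returned by predict(...)$coef.
# Names and values are the first eight entries of the output shown above.
coefs <- c(V1 = 0, V2 = 0, V3 = 0, V5 = 0, V8 = 0, V9 = 0,
           V10 = 0.06165530, V11 = 0.02693335)

# Full set of predictor names (hypothetical 11-column design for this toy case).
all_names <- paste("V", 1:11, sep = "")

# Predictors that have no entry in the returned vector.
missing_vars <- setdiff(all_names, names(coefs))
print(missing_vars)

# Start from a zero vector with one slot per predictor, then fill by name.
# ASSUMPTION: an omitted predictor has coefficient zero; check this with the
# elasticnet maintainer before relying on it.
full_coefs <- setNames(numeric(length(all_names)), all_names)
full_coefs[names(coefs)] <- coefs
print(full_coefs)
```

In the real session, all_names would be built from ncol(X) in the same way
(or taken from colnames(X) if the matrix has column names), and coefs would
be the vector extracted from data$finalModel.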