Martin Spindler
2014-Feb-13 20:23 UTC
[R] Standardisation of variables with Lasso (glmnet)
Dear all,

I am working with glmnet, but the problem also arises in all other Lasso implementations. It is usually recommended to standardize the variables and to include an intercept, and this works well with the built-in options:

    library(glmnet)
    x <- matrix(rnorm(10000), ncol=50)
    y <- rnorm(200)
    cv.out =cv.glmnet(x,y, alpha =1, intercept=T , standardize=T)
    coef <- coef(cv.out, s = "lambda.min")
    ind1 <- which(coef>0)
    coef[ind1,]

But when I try to do the standardization by hand:

    xs <- apply(x,2, function(x) (x-mean(x))/sqrt(var(x)))
    ys <- y - mean(y)
    cv.out =cv.glmnet(xs,ys, alpha =1, intercept=F , standardize=F)
    coef <- coef(cv.out, s = "lambda.min")
    ind1 <- which(coef>0)
    coef[ind1,]

the following error appears:

    > cv.out =cv.glmnet(xs,ys, alpha =1, intercept=F , standardize=F)
    Error in elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian, :
      NA/NaN/Inf in foreign function call (arg 5)

My question is therefore: what am I doing wrong, and what is the "best practice" with the Lasso (intercept yes/no, standardisation by hand, ...)?

Thank you very much in advance for your efforts and replies!

Best,
Martin
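
Since the error message complains about NA/NaN/Inf values, one thing worth ruling out is that the hand-standardized inputs contain non-finite entries, which would happen for instance if a column were constant and its standard deviation zero. The following is only a minimal sketch in base R and does not by itself explain the error. The helper name standardize_cols and the seed are hypothetical, not from the post. It also assumes a 1/n variance for the scaling, which, as far as I understand, is the convention glmnet uses internally, whereas sqrt(var(x)) uses 1/(n-1).

```r
set.seed(1)                              # assumed seed, only for reproducibility
x <- matrix(rnorm(10000), ncol = 50)     # 200 x 50 design matrix, as in the post
y <- rnorm(200)

# Hypothetical helper: centre each column and scale it by its 1/n standard
# deviation (an assumption about the convention; var() would use 1/(n-1)).
standardize_cols <- function(m) {
  centred <- sweep(m, 2, colMeans(m), "-")
  sds <- sqrt(colMeans(centred^2))       # 1/n standard deviation per column
  if (any(sds == 0)) stop("constant column: scaling would produce NaN")
  sweep(centred, 2, sds, "/")
}

xs <- standardize_cols(x)
ys <- y - mean(y)

# Sanity checks to run before passing xs and ys to the fitting routine.
stopifnot(is.matrix(xs), identical(dim(xs), dim(x)))
stopifnot(all(is.finite(xs)), all(is.finite(ys)))
stopifnot(length(ys) == nrow(xs))
```

If all three checks pass, the inputs themselves are finite and correctly shaped, so the cause of the error would have to be looked for elsewhere, for example in the installed glmnet version or in how the intercept option is handled.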