Nathan Stephens
2012-Apr-23 15:58 UTC
[R] glmnet sparse matrix error: dim specifies too large an array
I'm running into an unexpected error using the glmnet and Matrix packages. I have a matrix that is 8 million rows by 100 columns with 75% of the entries being zero. When I run a vanilla glmnet logistic model on my server with 300 GB of RAM, the task completes in 20 minutes:

> x # 8 million x 100 matrix
> model1 <- glmnet(x,y,'binomial',alpha=1) # run time 20 minutes

But if I convert the matrix to a sparse matrix using the Matrix package, the model does not run at all:

> x2 <- Matrix(x,sparse=T) # 75% sparse
> model2 <- glmnet(x2,y,'binomial',alpha=1) # error
Error in array(0, c(n, p)) : 'dim' specifies too large an array

This result is the opposite of what I might have expected. The non-sparse data runs fine, but the sparse data fails because it is "too large". Is this a glmnet issue or an R memory issue? Is there a way to fix this in glmnet?

--Nathan
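
A minimal sketch for narrowing this down, under stated assumptions. As far as I understand, array() raises this message when the product of the requested dimensions is larger than R can index, so a first check is whether n * p for the quoted sizes stays below .Machine$integer.max, and whether the sparse copy still reports the same dim() as the dense matrix. The small matrix x_small below is hypothetical stand-in data with roughly 75% zeros, not the data from the post, and the sketch does not call glmnet at all.

```r
library(Matrix)  # recommended package, normally shipped with R

# Dimensions quoted in the post
n <- 8e6
p <- 100

# Cell count of a dense n x p array, computed in double precision
# so that the product itself cannot overflow an integer.
cells <- n * p
fits_in_int <- cells <= .Machine$integer.max

# Hypothetical small stand-in for x: about 75% of entries are zero
set.seed(1)
x_small <- matrix(rbinom(1000 * 10, 1, 0.25) * rnorm(1000 * 10), nrow = 1000)
x2_small <- Matrix(x_small, sparse = TRUE)

# The sparse copy should report the same dimensions as the dense one
same_dims <- identical(dim(x_small), dim(x2_small))

# Share of zero entries actually stored as zeros in the sparse copy
zero_share <- 1 - nnzero(x2_small) / length(x_small)

print(c(cells = cells, limit = .Machine$integer.max))
print(c(fits_in_int = fits_in_int, same_dims = same_dims))
print(zero_share)
```

If the 8 million by 100 cell count fits and dim(x2) matches dim(x), the values of n and p that reach array(0, c(n, p)) inside the sparse code path would be the next thing to inspect, for example with traceback() right after the error.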