Daniel Wiesmann
2010-Jul-05 21:17 UTC
[R] Memory problem in multinomial logistic regression
Dear All

I am trying to fit a multinomial logistic regression to a data set with a size of 94279 by 14 entries. The data frame has one "sample" column which is the categorical variable, and the number of different categories is 9. The size of the data set (as a csv file) is less than 10 MB.

I tried to fit a multinomial logistic regression, either using vglm() from the VGAM package or mlogit() from the mlogit package.

In both cases the estimation crashes because I do not have enough memory, although the free memory before starting the regression is more than 2GB. The regression functions eat up all of my memory.

Does anyone know why this relatively small data set leads to memory problems, and how I could work around my problem?

thank you for your help,

Daniel
Charles C. Berry
2010-Jul-06 04:40 UTC
[R] Memory problem in multinomial logistic regression
On Mon, 5 Jul 2010, Daniel Wiesmann wrote:

> Dear All
>
> I am trying to fit a multinomial logistic regression to a data set with
> a size of 94279 by 14 entries. The data frame has one "sample" column
> which is the categorical variable, and the number of different
> categories is 9. The size of the data set (as a csv file) is less than
> 10 MB.

First, do

    str( your.data.frame )

so we can be sure that you do not have a factor lurking among your regressors.

Then report the calls you used for vglm() and mlogit().

It might not hurt to construct the model.matrix() first and check on it with object.size().

Also try

    for (i in levels(your.data.frame$sample)){
        print( glm(I(sample==i) ~ . , your.data.frame, family=binomial) )}

just to check on your data. If that loop fails all bets are off.

HTH,

Chuck

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
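
The checks suggested in the reply above can be sketched roughly as follows. The original data were never posted, so this uses a small synthetic data frame; `dat`, its size, and the column names `X1` to `X13` are assumptions made purely for illustration. It is a sketch of the diagnostics only, not a fix for the vglm() or mlogit() memory use.

```r
# Hypothetical stand-in for the poster's data: one 9-level "sample" factor
# plus 13 numeric regressors (the real data set had 94279 rows).
set.seed(1)
n <- 2000
dat <- data.frame(
  sample = factor(sample(letters[1:9], n, replace = TRUE)),
  matrix(rnorm(n * 13), nrow = n)   # columns are auto-named X1..X13
)

# 1. Look at the column types. A factor (or character) regressor with many
#    levels becomes one dummy column per level in the design matrix.
str(dat)
regressors <- dat[names(dat) != "sample"]
is_categorical <- vapply(regressors,
                         function(x) is.factor(x) || is.character(x),
                         logical(1))
print(names(regressors)[is_categorical])

# 2. Build the design matrix once and check its dimensions and footprint.
X <- model.matrix(sample ~ ., data = dat)
print(dim(X))
print(object.size(X), units = "Mb")

# 3. Fit one binomial model per category as a sanity check on the data.
fits <- list()
for (i in levels(dat$sample)) {
  fits[[i]] <- glm(I(sample == i) ~ ., data = dat, family = binomial)
}
print(vapply(fits, function(f) f$converged, logical(1)))
```

If step 1 turns up a categorical regressor, or step 2 reports far more columns than the data frame has, that would be a first place to look before blaming the fitting functions themselves.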