Debuggers,

I wrote to r-help about this and was appropriately told off by Peter Dalgaard. I append that mail in case you have not seen it.

Following Peter's advice I have attempted to simplify the problem. First note that the following does *not* fail (by which I mean crash, as in generate a memory access violation):

> tmp<-matrix(c(1,0,0,1,1,1),2,3)
> dimnames(tmp)<-list(NULL,c('yvar','x1','x2'))
> lm(tmp[,'yvar']~tmp[,'x1']+tmp[,'x2'])
> summary(.Last.value)

I tried to cut down my original data set to just the first ten rows to make it manageable to transmit. Of course then when I ran lm() there were NA estimates. Thus I wasn't totally surprised that summary() would have trouble. But, unlike the above, it crashes fatally. Thinking to reproduce this very simply, I used (sorry quick and dirty, I know there's a way to use paste to give the model formula):

> tmp<-matrix(c(1,0,0,1,rep(1,56)),2,30)
> dimnames(tmp)<-list(NULL,paste('x',1:30,sep=''))
> lm(tmp[, "x1"] ~ tmp[, "x2"] + tmp[, "x3"] + tmp[, "x4"] + tmp[,"x4"] + tmp[, "x5"] + tmp[, "x6"] + tmp[, "x7"] + tmp[, "x8"] + tmp[, "x9"] + tmp[, "x10"] + tmp[, "x11"] + tmp[, "x12"] + tmp[, "x13"] + tmp[, "x14"] + tmp[, "x15"] + tmp[, "x16"] + tmp[, "x17"] + tmp[, "x18"] + tmp[, "x19"] + tmp[, "x20"]+tmp[, "x21"]+tmp[, "x22"]+tmp[, "x23"]+tmp[, "x24"]+tmp[, "x25"]+tmp[, "x26"]+tmp[, "x27"]+tmp[, "x28"]+tmp[, "x29"]+tmp[, "x30"])
> summary(.Last.value)

But this has no problem. So it doesn't seem to be singularity of X or the length of the model formula at fault (my problem data has 27 variables).

What follows now is what *does* give a fault. The data (in sasch2) is truncated to just the first 10 rows.
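A minimal sketch of the paste() route mentioned above for generating the model formula, rather than typing every term by hand. It assumes a current R installation; the names `term_strs`, `rhs`, `form`, `fit` and `s` are illustrative only and do not come from the original message.

```r
# Rebuild the 2 x 30 test matrix used in the message.
tmp <- matrix(c(1, 0, 0, 1, rep(1, 56)), 2, 30)
dimnames(tmp) <- list(NULL, paste("x", 1:30, sep = ""))

# One 'tmp[, "xk"]' string per predictor column (x2 ... x30),
# joined with " + " to form the right-hand side of the model.
term_strs <- paste('tmp[, "', colnames(tmp)[-1], '"]', sep = "")
rhs <- paste(term_strs, collapse = " + ")

# Turn the assembled text into a formula object and fit it.
form <- as.formula(paste('tmp[, "x1"] ~', rhs))
fit <- lm(form)

# Most coefficients are NA because X is singular, as in the message.
s <- summary(fit)
```

Building the formula as text keeps the call identical in structure to the hand-typed one, so it should exercise the same code paths in lm() and summary().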
I made it so the modified dataset is called sasch2 so that I could cut and paste the exact same lm() call:

> tmp<-sasch2[1:10,]
> holdsasch2<-sasch2
> sasch2<-tmp
> dump('sasch2','bugs.dump')
> lm(sasch2[, "ddiff"] ~ sasch2[, "td30"] + sasch2[, "td60"] +sasch2[, "td90"] + sasch2[,
+ "td120"] + +sasch2[, "td180"] + sasch2[, "td240"] + sasch2[, "td300"] + sasch2[, "td360"] +
+ sasch2[, "td420"] + sasch2[, "td480"] + sasch2[, "db1"] + sasch2[, "db1.5"] + sasch2[, "db2"] +
+ sasch2[, "db2.5"] + +sasch2[, "db3.5"] + sasch2[, "db4"] + sasch2[, "db4.5"] + sasch2[, "db5"] +
+ sasch2[, "db5.5"] + +sasch2[, "db6"] + sasch2[, "db6.5"] + sasch2[, "db7"] + sasch2[, "db7.5"] +
+ sasch2[, "db8"] + +sasch2[, "db8.5"] + sasch2[, "db9"] + sasch2[, "db9.5"])
> summary(.Last.value)

Dr Watson tells me the access violation is 0xc0000005 at address 0x2020200 (whatever that means; I last used machine code on a PDP 8).

The data as dumped now follows (though it is anonymised, please continue to treat the data as confidential). After that is my original report to r-help.

I'd be very happy to provide any other info you might need to know. PD suggested running on Unix but my version there is hopelessly out of date since I started using NT. I hope I've given you enough here to try it for yourself on the latest unix version without too much bother.
"sasch2" <- structure(c(1, 1, 1, 2, 2, 3, 4, 5, 5, 6, 3, 2, 0.5, 0, 2, 4, 6, 1.5, 6, 5, 225, 175, 180, 255, 140, 140, 236, 315, 90, 190, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 5, 7, 3, 3, 5, 4, 2.5, 4, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 611231, 611231, 611231, 611041, 611041, 611553, 605829, 604881, 604881, 612966, 4, 4, 4, 4, 4, 4, 3, 3, 3, 2, 25, 25, 25, 29, 29, 29, 31, 28, 28, 18, 289, 289, 289, 279, 279, 289, 296, 268, 268, 281, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 655, 655, 655, 780, 780, 180, 347, 295, 295, 240, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 0, 0, 0, 0, 0, 5, 5, 5, 4, 4, 4, 4, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 3, 3, 1, 2, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 4, 4, 5, 5, 3, 4, 3, 3, 2, 234, 234, 234, 24, 24, 12, 17, 127, 127, 27, 1, 1, 1, 1, 1, 3, 3, 1, 1, 1, 2, 2, 2, 2, 2, 4, 4, 2, 2, 0, 300, 300, 300, 100, 100, 200, 500, 200, 200, 400, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 809782, 809782, 809782, 809790, 809790, 
809797, 809813, 809832, 809832, 809840, 3600, 3600, 3600, 3740, 3740, 3280, 3090, 3100, 3100, 4110, 4, 4, 4, 9, 9, 9, 9, 10, 10, 9, 9, 9, 9, 10, 10, 10, 9, 10, 10, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7.38, 7.38, 7.38, 7.34, 7.34, 7.27, 7.33, 7.3, 7.3, 7.29, NA, NA, NA, -3.4, -3.4, -5.8, -6.4, -2, -2, -7.2, 2, 2, 2, 3, 3, 5, 4, 2.5, 2.5, 3, 225, 225, 225, 255, 255, 140, 236, 315, 315, 190, 5, 5, 5, 3, 3, 9, 10, 4, 4, 8, 400, 400, 400, 395, 395, NA, NA, 405, 405, 210, 7, 7, 7, 5, 5, NA, NA, 10, 10, 10, 580, 580, 580, NA, NA, NA, NA, NA, NA, NA, 7.5, 7.5, 7.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 315, 315, 190, NA, NA, NA, NA, NA, NA, NA, 4, 4, 8, 440, 440, 440, 390, 390, NA, NA, NA, NA, NA, NA, NA, NA, 3, 3, NA, NA, NA, NA, NA), .Dim = c(10, 83), .Dimnames = list(NULL, c("pxid", "ddiff", "tdiff", "td30", "td60", "td90", "td120", "td180", "td240", "td300", "td360", "td420", "td480", "td2000", "dbase", "db1", "db1.5", "db2", "db2.5", "db3", "db3.5", "db4", "db4.5", "db5", "db5.5", "db6", "db6.5", "db7", "db7.5", "db8", "db8.5", "db9", "db9.5", "lwh.num", "partogram", "age", "gestation", "cervix", "effaced", "membranes", "rd.interval", "action.line", "synto", "synto.time", "rom", "blood", "ctg", "fbs", "fbs.no", "iupc", "amnioinf", "ves", "anaesth", "delivery.mode", "perineum", "blood.loss", "transfus", "drugs", "retained.pl", "baby.no", "weight", "apgar1", "apgar5", "scbu", "ph.ven", "be.ven", "dilat1", "time2", "dilat2", "time3", "dilat3", "time4", "dilat4", "time5", "dilat5", "time6", "dilat6", "time7", "dilat7", "srom.time", "srom.dilat", "epi.time", "epi.dilat"))) ************************************************************************ Under NT 4.0, using Version 0.63.2 Beta (Jan 12, 1999): Not sure if this is a 
bug or a feature (forcing me to program less clumsily) so I'll report it here rather than to bugs.

With a medium size data set (1700 observations, 70 explanatory variables) and plenty of memory, specifically

> gc()
          free   total
Ncells  886738 1000000
Vcells 7912909 8388608

I get a fatal error when attempting summary() on the fit of an lm() on a large-ish set of dummy variables (stored in a matrix):

Call:
lm(formula = sasch2[, "ddiff"] ~ sasch2[, "td30"] + sasch2[, "td60"] + sasch2[, "td90"] + sasch2[, "td120"] + +sasch2[, "td180"] + sasch2[, "td240"] + sasch2[, "td300"] + sasch2[, "td360"] + sasch2[, "td420"] + sasch2[, "td480"] + sasch2[, "db1"] + sasch2[, "db1.5"] + sasch2[, "db2"] + sasch2[, "db2.5"] + +sasch2[, "db3.5"] + sasch2[, "db4"] + sasch2[, "db4.5"] + sasch2[, "db5"] + sasch2[, "db5.5"] + +sasch2[, "db6"] + sasch2[, "db6.5"] + sasch2[, "db7"] + sasch2[, "db7.5"] + sasch2[, "db8"] + +sasch2[, "db8.5"] + sasch2[, "db9"] + sasch2[, "db9.5"])

I get estimates OK, but summary() collapses. However, if I do the same thing less clumsily, by writing all the relevant variables to a new data frame, and then calling

Call:
lm(formula = ddiff ~ ., data = dtmp)

I get not only the estimates but can also summary() with no problem.

Any ideas why? Seems to be memory-linked, because I can lm() and summary() the matrix versions using only the sasch2[,'td*'] or db* variable sets.

Simon Fear

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
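The data-frame workaround described in the r-help report above can be sketched as follows. This is only an illustration: the matrix `m` is a made-up stand-in for sasch2 (the column names are borrowed from the report, the values are random), and `keep` is a hypothetical name for the vector of columns to retain; `dtmp` is the name used in the report.

```r
# Toy stand-in for the real matrix; values are invented for illustration.
set.seed(1)
m <- cbind(ddiff = rnorm(20),
           td30  = rbinom(20, 1, 0.5),
           td60  = rbinom(20, 1, 0.5),
           db1   = rbinom(20, 1, 0.5))

# Pick the response and the predictors by name, then convert the
# sub-matrix to a data frame so lm() can use the short 'ddiff ~ .' form.
keep <- c("ddiff", "td30", "td60", "db1")
dtmp <- as.data.frame(m[, keep])

fit <- lm(ddiff ~ ., data = dtmp)
s <- summary(fit)
```

With this form the call stored in the fit is just `lm(formula = ddiff ~ ., data = dtmp)`, however many predictor columns `dtmp` holds.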
Guido Masarotto
1999-Mar-09 20:27 UTC
summary() of lm() problem (PR#135)
On Tue, Mar 09, 1999 at 02:20:59PM +0100, fears@roycastle.liv.ac.uk wrote:
> .......................................................................

Dear Simon,

I have just now tried your example. My tentative conclusion is that it is due to the problem described in r-bugs in the message with id 101. In short: when the R console is a windowed one (certainly under Windows, but I suspect also under the Mac), printing makes use of a buffer with a fixed, hard-coded length. Indeed:

(i) I got a segmentation fault using rw0632;
(ii) everything works without problems in my pre-release rw0633, where the problem has been 'cured' simply by enlarging the buffer (of course, this will NOT be the definitive solution); but
(iii) I got a segmentation fault in rw0633 as well if I reset the length of the buffer to the R-0.63.2 one.

We hope to make rw0633 available towards the end of the week. Brian Ripley and I have introduced a lot of Windows-specific changes, and this is the reason for the delay.

guido
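If the fixed-length print buffer is indeed the cause, the amount of text that summary() has to print becomes relevant. The sketch below compares how many characters the two styles of formula occupy once deparsed. That the printed Call is the text which overflowed the buffer is an assumption made here for illustration, not something stated in this thread; `vars` is only a subset of the columns in the report, and `call_chars` is a hypothetical helper.

```r
# Long form: one 'sasch2[, "name"]' term per column, as in the failing call.
# (Only a subset of the td* columns is used here, to keep the sketch short.)
vars <- c("td30", "td60", "td90", "td120", "td180", "td240", "td300",
          "td360", "td420", "td480")
long_rhs <- paste('sasch2[, "', vars, '"]', sep = "", collapse = " + ")
long_form <- as.formula(paste('sasch2[, "ddiff"] ~', long_rhs))

# Short form, as used with a data frame.
short_form <- ddiff ~ .

# Total number of characters in the deparsed formula text.
call_chars <- function(f) sum(nchar(deparse(f)))

call_chars(long_form)
call_chars(short_form)
```

The long form grows with every added column while the short form stays the same length, which would be consistent with the data-frame version printing without trouble.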
Guido Masarotto
1999-Mar-09 20:54 UTC
summary() of lm() problem (PR#135)
On Tue, Mar 09, 1999 at 08:36:18PM -0000, Simon Fear wrote:
> Perhaps I might avoid wasting your time if I knew where to locate:

Don't worry. Bug reports are vital for the entire project.

>
> > My tentative conclusion is that
> > it is due to problem described in r-bugs in message with id 101
>
> I haven't seen reference in the FAQ to an archive of bug reports? Have I
> just missed it?

But, yes, there is an archive of bug reports. From the CRAN main page (e.g., http://www.ci.tuwien.ac.at/R/contents.html) just follow the 'Bug tracking system' link.

guido

(ps: I am taking the liberty of posting this reply to the mailing list to point out the existence of the archive).