R-Users,

I hope this is not a uniformed question, but I am a little lost.

I ran into a problem this morning and I was wondering if anyone had
seen it before. I was trying to summarize each column of a data set
(150,000 rows, ~50mb, so it was a relatively big file) imported from a text
file using the below code;

    data.summary <- read.csv("c:/summary.txt", sep="")
    data.summary <- as.matrix(data.summary)
    my.summary <- function(x){
      return(c(min=min(x),max=max(x), mean=mean(x)))}
    apply(data.summary, 2, my.summary)

And I got this weird error that I can not find out anything about?

    "Process R unknown signal at Wed Apr 14 08:17:22 2004"

Have you seen anything like this before? Do you think it is the size of
the dataset that is causing the problem, since the same code works for
25000 rows (~17mb) and gives the correct results (I cross-checked in
SAS and EXCEL).

I was using R 1.8.0 in Xemacs.

TIA,
Bret Collier

Bret Collier <bacolli at uark.edu> writes:

> R-Users,
>
> I hope this is not a uniformed question, but I am a little lost.

Don't worry, they all look alike... ;-)

> I ran into a problem this morning and I was wondering if anyone had
> seen it before. I was trying to summarize each column of a data set
> (150,000 rows, ~50mb, so it was a relatively big file) imported from a text
> file using the below code;
>
> data.summary <- read.csv("c:/summary.txt", sep="")
> data.summary <- as.matrix(data.summary)
> my.summary <- function(x){
> return(c(min=min(x),max=max(x), mean=mean(x)))}
> apply(data.summary, 2, my.summary)
>
>
> And I got this weird error that I can not find out anything about?
>
>
> "Process R unknown signal at Wed Apr 14 08:17:22 2004"
>
>
> Have you seen anything like this before? Do you think it is the size of
> the dataset that is causing the problem, since the same code works for
> 25000 rows (~17mb) and gives the correct results (I cross-checked in
> SAS and EXCEL).

Running out of memory and having the OS intervening could give that
kind of message. Or bad RAM. In the first case look up how to set the
memory limits, in the other, change machines to verify.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
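
To judge whether memory is a plausible culprit, it can help to estimate how much space the matrix needs before reading the full file. The sketch below assumes an all-numeric matrix (8 bytes per cell) and uses a made-up column count, since the post gives the number of rows but not the number of columns. On Windows builds of R the limit was historically adjusted with the --max-mem-size startup flag or memory.limit(); check the help pages for your version, as these are not available everywhere.

```r
# Hypothetical dimensions: the post states 150,000 rows; the column
# count is NOT given, so n.cols is an assumption for illustration only.
n.rows <- 150000
n.cols <- 40

# A double-precision matrix needs roughly 8 bytes per cell.
est.bytes <- n.rows * n.cols * 8
est.mb <- est.bytes / 2^20

# Compare the rule of thumb against a small real matrix of the same width.
small <- matrix(0, nrow = 1000, ncol = n.cols)
small.bytes <- as.numeric(object.size(small))

# gc() reports what R is currently using, after a garbage collection.
mem.now <- gc()
```

Keep in mind that as.matrix() on a data frame creates a second copy of the data, and apply() with a user function creates further temporaries, so peak usage can be several times the estimate.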

Bret Collier <bacolli at uark.edu> writes:

> I hope this is not a uniformed question, but I am a little lost.
>
> I ran into a problem this morning and I was wondering if anyone had
> seen it before. I was trying to summarize each column of a data set
> (150,000 rows, ~50mb, so it was a relatively big file) imported from a text
> file using the below code;
>
> data.summary <- read.csv("c:/summary.txt", sep="")
> data.summary <- as.matrix(data.summary)
> my.summary <- function(x){
> return(c(min=min(x),max=max(x), mean=mean(x)))}
> apply(data.summary, 2, my.summary)

Peter responded about the error. You may be able to circumvent the
error by using

    apply(data.summary, 2, range)

to get the minimum and maximum and

    colMeans(data.summary)

to get the means. Those are internal functions and will generate less
overhead (and fewer copies) than calls to your own function.
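
The two calls suggested above can be combined into the same min/max/mean table that my.summary produced. This is a minimal sketch on simulated data; the small random matrix is a stand-in for the contents of c:/summary.txt, and the name col.summary is introduced here only for illustration.

```r
set.seed(1)

# Stand-in for the matrix read from the text file (simulated, not the
# poster's data): 1000 rows, 5 named numeric columns.
data.summary <- matrix(rnorm(1000 * 5), nrow = 1000, ncol = 5,
                       dimnames = list(NULL, paste0("V", 1:5)))

# apply(..., range) returns a 2 x ncol matrix: row 1 = min, row 2 = max.
rng <- apply(data.summary, 2, range)

# colMeans() computes all column means in one call.
mns <- colMeans(data.summary)

# Assemble the same layout as the original my.summary output.
col.summary <- rbind(min = rng[1, ], max = rng[2, ], mean = mns)

# Original approach, kept only to compare results on the small example.
my.summary <- function(x) c(min = min(x), max = max(x), mean = mean(x))
reference <- apply(data.summary, 2, my.summary)
```

On the small example both approaches should agree, which can be confirmed with all.equal(col.summary, reference).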