I am dealing with a huge matrix in R (20 columns, 54000 rows) and have lots of missing values within the dataset, which are currently displayed as the value "-999.00". I am trying to create a new matrix (or change the existing one) to display these values as "NA" so that I can then perform the necessary analysis on the columns within the matrix.

The matrix name is temp and the column names are t1 to t20 inclusive.

I have tried the following command:

    temp$t1[temp$t1 == -999.00] <- NA

and it returns a segmentation fault. Can someone tell me what I am doing wrong?

Thanks
Laura

Laura Quinn <laura at env.leeds.ac.uk> writes:

> I am dealing with a huge matrix in R (20 columns, 54000 rows) and have
> lots of missing values within the dataset which are currently displayed as
> the value "-999.00" I am trying to create a new matrix (or change the
> existing one) to display these values as "NA" so that I can then perform
> the necessary analysis on the columns within the matrix.
>
> The matrix name is temp and the column names are t1 to t20 inclusive.
>
> I have tried the following command:
>
> temp$t1[temp$t1 == -999.00] <- NA
>
> and it returns a segmentation fault, can someone tell me what I am doing
> wrong?

Not telling us which system and which version you are using, and not giving us a reproducible example... OK, the latter can be tricky, but does it happen all the time? Only after doing X? Also if you deal with a subset of data?

The command as such should work as far as I can see, and segmentation faults should basically not happen unless the user has been messing about at the C code level. (BTW, that's a data frame, not a "matrix", I assume.)

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
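
The details requested in this reply (R version, platform, object class, and a small reproducible case) could be gathered along the following lines. This is only a sketch: the 10 x 4 data frame is a hypothetical stand-in for the real 54000 x 20 data, and the positions of the -999 values are invented for illustration.

```r
# Report the R version and platform, as asked for in the reply above.
print(R.version.string)
print(R.version$platform)

# Hypothetical small stand-in for the real data set (assumption:
# the real object is a data frame with numeric columns t1, t2, ...).
set.seed(1)
temp <- as.data.frame(matrix(round(rnorm(40), 2), nrow = 10, ncol = 4))
names(temp) <- paste0("t", 1:4)
temp$t1[c(2, 5)] <- -999   # invented missing-value codes

# Check what kind of object temp actually is before using `$`.
print(class(temp))

# The command from the original post, tried on the small example.
temp$t1[temp$t1 == -999.00] <- NA
print(sum(is.na(temp$t1)))
```

If the small case runs cleanly, repeating it on growing subsets of the real data is one way to narrow down when the crash appears.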

I cannot explain the segmentation fault, but try this instead (which works for matrices):

    temp[which(temp == -999, arr.ind = TRUE)] <- NA

Are you sure temp is a matrix and not a data frame? Use class(temp) to find out.

Also, if you are getting these "-999.00" values because you have read files containing them, it might be easier to code the missing values when reading in. Try

    read.table(file = "lala.txt", na.strings = "-999.00")

--
Adaikalavan Ramasamy

-----Original Message-----
From: Laura Quinn [mailto:laura at env.leeds.ac.uk]
Sent: Tuesday, October 07, 2003 8:04 PM
To: r-help at stat.math.ethz.ch
Subject: [R] Beginner's query - segmentation fault
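
Both suggestions in this reply can be tried on a toy case. The sketch below assumes a small numeric matrix with made-up values and a temporary file standing in for "lala.txt"; neither comes from the original data. Note that na.strings is compared against the text of each field, so it is written here exactly as the code appears in the file.

```r
# Hypothetical numeric matrix with -999 as the missing-value code.
m <- matrix(c(1.5, -999, 3.2, 4.1, -999, 6.0), nrow = 3,
            dimnames = list(NULL, c("t1", "t2")))
print(class(m))

# which(..., arr.ind = TRUE) returns a two-column (row, col) index
# matrix, which can be used directly to index a matrix.
idx <- which(m == -999, arr.ind = TRUE)
m[idx] <- NA
print(m)

# Alternative: declare the code as missing while reading the file.
# A temporary file stands in for the real input (assumption).
tmp_file <- tempfile(fileext = ".txt")
writeLines(c("t1 t2", "1.50 -999.00", "-999.00 4.10", "3.20 6.00"),
           tmp_file)
d <- read.table(tmp_file, header = TRUE, na.strings = "-999.00")
unlink(tmp_file)
print(d)
```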

On Tue, 7 Oct 2003, Laura Quinn wrote:

> I am dealing with a huge matrix in R (20 columns, 54000 rows) and have
> lots of missing values within the dataset which are currently displayed as
> the value "-999.00" I am trying to create a new matrix (or change the
> existing one) to display these values as "NA" so that I can then perform
> the necessary analysis on the columns within the matrix.
>
> The matrix name is temp and the column names are t1 to t20 inclusive.
>
> I have tried the following command:
>
> temp$t1[temp$t1 == -999.00] <- NA
>
> and it returns a segmentation fault, can someone tell me what I am doing
> wrong?

Well, R should not segfault, so there is a bug here somewhere. However, I don't think what you have described can actually work. Is temp really a matrix? If so, temp$t1 will return NULL, and you should get an error message.

If temp is a matrix,

    temp[temp == -999.00] <- NA

will do what you want. If, as is more likely, temp is a data frame with all columns numeric, there are several ways to do this, e.g.

    temp[] <- lapply(temp, function(x) ifelse(x == -999, NA, x))
    temp[as.matrix(temp) == -999] <- NA  # only in recent versions of R

as well as explicit looping over columns.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
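
The matrix and data frame cases described in this reply can be compared side by side. The sketch below uses invented values and hypothetical object names (m, df, df1 to df3). One difference from the behaviour described in 2003: in current versions of R, `$` on a matrix signals an error rather than returning NULL.

```r
# Hypothetical data: the same values as a matrix and as a data frame.
vals <- c(1.5, -999, 3.2, 4.1, -999, 6.0)
m  <- matrix(vals, nrow = 3, dimnames = list(NULL, c("t1", "t2")))
df <- as.data.frame(m)

# `$` is list-style indexing; on a matrix, current R signals an error.
res <- try(m$t1, silent = TRUE)
print(inherits(res, "try-error"))

# Matrix: a logical index over the whole object.
m[m == -999] <- NA

# Data frame, option 1: replace column by column with lapply().
df1 <- df
df1[] <- lapply(df1, function(x) ifelse(x == -999, NA, x))

# Data frame, option 2: index with a logical matrix.
df2 <- df
df2[as.matrix(df2) == -999] <- NA

# Data frame, option 3: explicit loop over the columns.
df3 <- df
for (nm in names(df3)) {
  col <- df3[[nm]]
  col[col == -999] <- NA
  df3[[nm]] <- col
}

# All three data frame approaches should agree.
print(all.equal(df1, df2))
print(all.equal(df2, df3))
```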

Laura Quinn wrote:

> I am dealing with a huge matrix in R (20 columns, 54000 rows) and have
> lots of missing values within the dataset which are currently displayed as
> the value "-999.00" I am trying to create a new matrix (or change the
> existing one) to display these values as "NA" so that I can then perform
> the necessary analysis on the columns within the matrix.
>
> The matrix name is temp and the column names are t1 to t20 inclusive.
>
> I have tried the following command:
>
> temp$t1[temp$t1 == -999.00] <- NA
>
> and it returns a segmentation fault, can someone tell me what I am doing
> wrong?

The crash for this inappropriate usage has already been fixed for R-1.7.1, so you are using an outdated version, I guess.

1. If temp is a matrix, you have to use matrix indexing, not data.frame or list indexing, see the manuals. Now, we have got the (still wrong) line

    temp[temp[ ,"t1"] == -999.00, "t1"] <- NA

2. Use "is.na(x) <- TRUE" instead of "x <- NA":

    is.na(temp[temp[ ,"t1"] == -999.00, "t1"]) <- TRUE

Or change all values "-999" to "NA" in the whole matrix by

    is.na(temp[temp == -999.00]) <- TRUE

Uwe Ligges

> Thanks
> Laura
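
The is.na<- forms from this reply can be checked on a small matrix. The values and the names m, m1, m2 and m3 below are assumptions for illustration. The last variant, which passes the logical index as the replacement value, is an additional spelling that is not taken from the message.

```r
# Hypothetical numeric matrix with -999 as the missing-value code.
m <- matrix(c(1.5, -999, 3.2, 4.1, -999, 6.0), nrow = 3,
            dimnames = list(NULL, c("t1", "t2")))

# One column only: matrix indexing by row condition and column name.
m1 <- m
is.na(m1[m1[, "t1"] == -999, "t1"]) <- TRUE
print(m1)   # only column t1 is changed; t2 still holds its -999

# Whole matrix at once.
m2 <- m
is.na(m2[m2 == -999]) <- TRUE
print(m2)

# Additional spelling (not from the message): give is.na<- the
# logical index directly as its value.
m3 <- m
is.na(m3) <- m3 == -999
print(all.equal(m2, m3))
```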