michael watson (IAH-C)
2005-Jan-14 11:20 UTC
[R] Replacing NAs in a data frame using is.na() fails if there are no NAs
Hi

This is a difference between the way matrices and data frames work, I guess. I want to replace the NA values in a data frame by 0, and the code works as long as the data frame in question actually includes an NA value. If it doesn't, there is an error:

df <- data.frame(c1=c(1,1,1),c2=c(2,2,NA))
df[is.na(df)] <- 0
df

df <- data.frame(c1=c(1,1,1),c2=c(2,2,2))
df[is.na(df)] <- 0
df

Any help would be appreciated. I could just convert the data frame to a matrix, execute the code, then convert it back to a data frame, but that appears long-winded.

Thanks
Mick
Barry Rowlingson
2005-Jan-14 11:49 UTC
[R] Replacing NAs in a data frame using is.na() fails if there are no NAs
michael watson (IAH-C) wrote:

> Any help would be appreciated. I could just convert the data frame to a
> matrix, execute the code, then convert it back to a data frame, but that
> appears long winded.

Slightly less long-winded (but probably a worse solution than some R-guru is about to give you) would be to stick a row of NAs at the end of the dataframe, do your replacement, then remove the last row.

This slightly reminds me of the old joke about algorithms for hunting elephants in Africa, which involve placing a known elephant at the Cape Of Good Hope so that the algorithm is guaranteed to terminate, like all good algorithms should...

But probably better to test for NAs in the dataframe beforehand:

if (any(is.na(f))) f[is.na(f)] = 0

Baz
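The guard suggested above can be wrapped in a small helper. This is only a sketch: the function name `replace_na_with_zero` is hypothetical and not from the thread, and it assumes every column can sensibly hold a 0 (numeric data).

```r
# Hypothetical helper (name not from the thread): replace NAs by 0,
# skipping the assignment entirely when there is nothing to replace.
replace_na_with_zero <- function(f) {
  ind <- is.na(f)          # logical matrix with the same shape as f
  if (any(ind)) {          # guard: only assign when at least one NA exists
    f[ind] <- 0
  }
  f
}

# The two data frames from the original question.
with_na    <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, NA))
without_na <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, 2))

res_with    <- replace_na_with_zero(with_na)
res_without <- replace_na_with_zero(without_na)
```

Computing `is.na(f)` once and reusing it avoids scanning the data frame twice.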
Dimitris Rizopoulos
2005-Jan-14 11:50 UTC
[R] Replacing NAs in a data frame using is.na() fails if there are no NAs
Hi Mick,

try the following:

dat[] <- lapply(dat, function(x) ifelse(is.na(x), 0, x))
dat

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
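For comparison, here is a column-wise variant that uses `replace()` instead of `ifelse()`. The swap is not from the thread; it is a sketch that assumes numeric columns and uses the example data from the original question. As far as I know, `replace()` only touches the selected elements and leaves the rest of the vector alone, and assigning into `dat[]` keeps the result a data frame.

```r
# Data frame with one NA, as in the original question.
dat <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, NA))
dat[] <- lapply(dat, function(x) replace(x, is.na(x), 0))

# Data frame without any NA: the same call should be a no-op.
clean <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, 2))
clean_after <- clean
clean_after[] <- lapply(clean_after, function(x) replace(x, is.na(x), 0))
```

Because each column is handled as an ordinary vector, the empty-selection case never reaches the data frame assignment method.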
Prof Brian Ripley
2005-Jan-14 11:56 UTC
[R] Replacing NAs in a data frame using is.na() fails if there are no NAs
On Fri, 14 Jan 2005, michael watson (IAH-C) wrote:

> If it doesn't, there is an error:
>
> df <- data.frame(c1=c(1,1,1),c2=c(2,2,2))
> df[is.na(df)] <- 0

As always, look at the objects:

> is.na(df)
     c1    c2
1 FALSE FALSE
2 FALSE FALSE
3 FALSE FALSE

so there is nothing to replace by 0. What you should have is

ind <- is.na(df)
df[ind] <- rep(0, sum(ind))

to give the right number of replacements.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
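A sketch applying the two lines above to data frames with one, several, and no NAs. The wrapper name `fill_na_zero` and the second test data frame are assumptions for illustration only.

```r
# Hypothetical wrapper around the two-line recipe.
fill_na_zero <- function(df) {
  ind <- is.na(df)
  df[ind] <- rep(0, sum(ind))   # one replacement value per NA, possibly none
  df
}

one_na  <- fill_na_zero(data.frame(c1 = c(1, 1, 1),  c2 = c(2, 2, NA)))
many_na <- fill_na_zero(data.frame(c1 = c(1, NA, 1), c2 = c(NA, 2, NA)))

no_na_input <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, 2))
no_na       <- fill_na_zero(no_na_input)
```

The right-hand side always has exactly as many elements as there are cells to fill, so the zero-NA case supplies a zero-length value.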
michael watson (IAH-C)
2005-Jan-14 12:03 UTC
[R] Replacing NAs in a data frame using is.na() fails if there are no NAs
Thank you for the answers. Checking for NAs using any() will help.

I just thought it was worth mentioning because the behaviour of data frames (which throw an error) is different to the behaviour of matrices (which don't), and that might not be expected.

mat <- matrix(c(1,1,1,1),nrow=2,ncol=2)
mat[is.na(mat)] <- 0

mat <- matrix(c(1,1,1,NA),nrow=2,ncol=2)
mat[is.na(mat)] <- 0
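The matrix case can be checked directly: with no NAs the logical index selects zero elements, and the assignment leaves the matrix unchanged rather than raising an error. A minimal sketch; the names `before`, `selected`, `n_selected` and `unchanged` are only for illustration.

```r
mat <- matrix(c(1, 1, 1, 1), nrow = 2, ncol = 2)
before <- mat

selected   <- mat[is.na(mat)]   # elements picked out by the logical index
n_selected <- length(selected)  # how many elements would be replaced

mat[is.na(mat)] <- 0            # assigning to an empty selection
unchanged <- identical(mat, before)
```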
Sean Davis
2005-Jan-14 12:05 UTC
[R] Replacing NAs in a data frame using is.na() fails if there are no NAs
Mick,

The actual error is telling:

> df <- data.frame(c1=c(1,1,1),c2=c(2,2,2))
> df[is.na(df)] <- 0
Error in "[<-.data.frame"(`*tmp*`, is.na(df), value = 0) :
        rhs is the wrong length for indexing by a logical matrix

If you look at is.na(df), you will see that it is all FALSE, of course. The right-hand side (rhs) can't be assigned to a vector of length 0 (the length of df[is.na(df)] if there are no NAs), hence the error. An easy workaround is to check whether there are any NAs first.

tmp <- is.na(df)      # logical matrix marking the NA cells
if (sum(tmp)) {       # only execute if there is at least one NA
    df[tmp] <- 0
}

Sean
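To see what a given R installation does, the failing call can be wrapped in `tryCatch()`. This is a sketch, not part of the thread. The outcome is assumed to depend on the R version, since later releases may simply return the data frame unchanged when nothing is selected. The names `n_selected`, `n_supplied` and `outcome` are illustrative.

```r
df  <- data.frame(c1 = c(1, 1, 1), c2 = c(2, 2, 2))
ind <- is.na(df)

n_selected <- sum(ind)     # cells selected by the logical matrix (none here)
n_supplied <- length(0)    # length of the replacement value

# Record either "no error" or the text of the error that was raised.
outcome <- tryCatch({
  df[ind] <- 0
  "no error"
}, error = function(e) conditionMessage(e))
```

Comparing `n_selected` with `n_supplied` shows the length mismatch that the error message in the post complains about.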