This is probably a simple problem, but I don't see a solution.

I have a data.frame with a number of columns in which I would like to set 0 to NA. That is, I want

df1[, 144:157] <- NA  if  df1[, 144:157] == 0
df1[, 190:198] <- NA  if  df1[, 190:198] == 0

but I cannot figure out a way to do this.

cata <- c(1, 1, 6, 1, 1, NA)
catb <- c(1, 2, 3, 4, 5, 6)
doga <- c(3, 5, 3, 6, 4, 0)
dogb <- c(2, 4, 6, 8, 10, 12)
rata <- c(NA, 9, 9, 8, 9, 8)
ratb <- c(1, 2, 3, 4, 5, 6)
bata <- c(12, 42, NA, 45, 32, 54)
batb <- c(13, 15, 17, 19, 21, 23)
id <- c('a', 'b', 'b', 'c', 'a', 'b')
site <- c(1, 1, 4, 4, 1, 4)
mat1 <- cbind(cata, catb, doga, dogb, rata, ratb, bata, batb)

data1 <- data.frame(site, id, mat1)
data1

# Obviously this works fine for one column

data1$site[data1$site == 1] <- NA ; data1

but I cannot see how to do this with indices that would allow me to do more than one column in the data.frame.

At one point I even tried something like this

a <- c("site")
data1$a[data1$a == 1] <- NA

which seems to produce a corrupt data.frame.

I am sure it is simple but I don't see it.

Any help would be much appreciated.
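A note on the last attempt in the question: `$` does not evaluate a variable, so data1$a looks for a column literally named "a". The sketch below, on a cut-down copy of the example data (the two-column data1 here is an assumption made for brevity), shows the `[[` form, which does accept a column name held in a variable.

```r
# Cut-down version of the poster's data (assumption: two columns are enough
# to illustrate the point).
data1 <- data.frame(site = c(1, 1, 4, 4, 1, 4),
                    doga = c(3, 5, 3, 6, 4, 0))

a <- "site"   # column name held in a variable

# `$` takes its right-hand side literally, so this looks for a column
# called "a", which does not exist here.
no_such_column <- is.null(data1$a)

# `[[` evaluates its argument, so the variable can be used on both sides.
data1[[a]][data1[[a]] == 1] <- NA

data1
```

The same `[[` form works inside a loop over several column names, which is one way to extend the single-column idiom to a set of columns.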
Erik Iverson
2007-Feb-07 22:57 UTC
[R] setting a number of values to NA over a data.frame.
John -

Your initial problem uses 0, but the example uses 1 for the value that gets an NA. My solution uses 1 to fit with your example. There may be a better way, but try something like

data1[3:5] <- data.frame(lapply(data1[3:5], function(x) ifelse(x==1, NA, x)))

The data1[3:5] is just a test subset of columns I chose from your data1 example. Notice it appears twice, once on each side of the assignment operator.

In English: apply to each column of the data frame (which is a list) a function that returns NA if the element is 1 and the value otherwise, then turn the modified list into a data.frame and assign it back to those columns of data1.

See the help files for lapply and ifelse if you haven't seen those before.

Maybe someone has a better way?

Erik
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
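The lapply/ifelse suggestion can be adapted to the original question (0 rather than 1, and more than one block of columns). The following is a minimal sketch, not a definitive version: the vector cols is a hypothetical stand-in for the real ranges 144:157 and 190:198, and the data are rebuilt from the example in the question.

```r
# Rebuild the example data from the original post.
data1 <- data.frame(
  site = c(1, 1, 4, 4, 1, 4),
  id   = c("a", "b", "b", "c", "a", "b"),
  cata = c(1, 1, 6, 1, 1, NA),
  catb = c(1, 2, 3, 4, 5, 6),
  doga = c(3, 5, 3, 6, 4, 0),
  dogb = c(2, 4, 6, 8, 10, 12),
  rata = c(NA, 9, 9, 8, 9, 8),
  ratb = c(1, 2, 3, 4, 5, 6),
  bata = c(12, 42, NA, 45, 32, 54),
  batb = c(13, 15, 17, 19, 21, 23)
)

# Hypothetical column set standing in for c(144:157, 190:198).
cols <- c(3:5, 9:10)

# Replace 0 with NA column by column; values that are already NA stay NA
# because ifelse() returns NA when the test itself is NA.
data1[cols] <- lapply(data1[cols], function(x) ifelse(x == 0, NA, x))

data1
```

Assigning the list returned by lapply straight into data1[cols] avoids the intermediate data.frame() call; columns outside cols are left untouched.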
John Kane wrote:
> This is probably a simple problem but I don't see a
> solution.
>
> I have a data.frame with a number of columns where I
> would like 0 <- NA

Hi John,
You might have a look at "toNA" in the prettyR package. Wait for version 1.0-4, just uploaded, as I have fixed a bug in that function.

Jim
Hi

Strange. It works for me without any problem.

> zeta
  tepl tio2 al2o3  iep
1   60    1   3.5 5.65
2   60    1   2.0 5.00
3   60    1   1.0 5.30
4   60    0   2.0 4.65
5   40    1   3.5 5.20
6   40    1   2.0 4.85
7   40    0   3.5 5.70
8   40    0   2.0 5.25
> zeta[zeta==1]<-NA
> zeta
  tepl tio2 al2o3  iep
1   60   NA   3.5 5.65
2   60   NA   2.0 5.00
3   60   NA    NA 5.30
4   60    0   2.0 4.65
5   40   NA   3.5 5.20
6   40   NA   2.0 4.85
7   40    0   3.5 5.70
8   40    0   2.0 5.25
> str(zeta)
'data.frame': 8 obs. of 4 variables:
 $ tepl : int 60 60 60 60 40 40 40 40
 $ tio2 : num NA NA NA 0 NA NA 0 0
 $ al2o3: num 3.5 2 NA 2 3.5 2 3.5 2
 $ iep  : num 5.65 5 5.3 4.65 5.2 4.85 5.7 5.25
>

HTH
Petr

On 7 Feb 2007 at 16:57, Erik Iverson wrote:

> There may be a better way, but try something like
>
> data1[3:5] <- data.frame(lapply(data1[3:5], function(x) ifelse(x==1,
> NA, x)))

Petr Pikal
petr.pikal at precheza.cz
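A short sketch of what the whole-frame assignment does, using zeta retyped by hand from the printout in this message (so the reconstruction itself is an assumption). Counting the NAs per column afterwards shows that every column containing a 1 is affected, not only tio2.

```r
# zeta retyped from the console output in the message.
zeta <- data.frame(
  tepl  = c(60L, 60L, 60L, 60L, 40L, 40L, 40L, 40L),
  tio2  = c(1, 1, 1, 0, 1, 1, 0, 0),
  al2o3 = c(3.5, 2, 1, 2, 3.5, 2, 3.5, 2),
  iep   = c(5.65, 5, 5.3, 4.65, 5.2, 4.85, 5.7, 5.25)
)

# zeta == 1 is a logical matrix over the whole data frame, so the
# replacement is applied to every column at once.
zeta[zeta == 1] <- NA

# Number of NAs introduced in each column.
na_per_column <- colSums(is.na(zeta))
na_per_column
```

This form is convenient when the value should be recoded everywhere; it gives no control over which columns are changed.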
Charilaos Skiadas
2007-Feb-09 13:40 UTC
[R] setting a number of values to NA over a data.frame.
Once again I forgot to reply to the whole list....

On Feb 9, 2007, at 8:39 AM, Charilaos Skiadas wrote:

> On Feb 9, 2007, at 8:13 AM, John Kane wrote:
>
>> The problem is that my dataframe has 1s in about 50%
>> of the columns and I only want it to apply to a few
>> specified columns. My explanation may not have been
>> clear enough.
>>
>> Using your example, I want all values of 1 in tio2 set to NA
>> but not any values in al2o3, whereas zeta[zeta==1]<-NA
>> is also changing al2o3[3] to NA.
>>
>
> You need to index the zeta in zeta==1 in the same way as you do
> with the zeta outside.
> I think the point is that if you do zeta[,cols][zeta==1] <- NA,
> then the recycling of NA to obtain the correct number of elements
> is done based on the elements in zeta[,cols]. But since zeta==1 is
> a much longer vector than zeta[,cols], zeta[,cols][zeta==1]
> has a number of NA objects attached to its end, and hence now has
> a longer length than the recycled NA that is supposed to replace it.
> But perhaps someone more expert in the internals can explain it in
> greater detail, if the above is not right. In the meantime, the
> following seems to work:
>
> > y <- rbinom(20, 1, 1/2)
> > dim(y) <- c(5,4)
> > colnames(y) <- c("one", "two", "three", "four")
> > x <- as.data.frame(y)
> > cl <- c("two", "three")
> > x[,cl][x[,cl]==1] <- NA
> > x
>   one two three four
> 1   0   0    NA    0
> 2   0   0     0    0
> 3   1   0     0    0
> 4   0  NA     0    1
> 5   1   0     0    1
>
>> Thanks
>
> Haris

Haris
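The same indexing pattern can be tried on data shaped like the original example, which already contains NAs. This is a sketch under assumptions: the data frame is a cut-down copy of data1 from the question, the names in cols are chosen for illustration, and the explicit !is.na() guard is an addition that keeps NA out of the logical mask.

```r
# Cut-down copy of the poster's data (assumption: four columns suffice).
data1 <- data.frame(
  site = c(1, 1, 4, 4, 1, 4),
  doga = c(3, 5, 3, 6, 4, 0),
  dogb = c(2, 4, 6, 8, 10, 12),
  bata = c(12, 42, NA, 45, 32, 54)
)

# Columns to recode (illustrative choice).
cols <- c("doga", "dogb", "bata")

# Build the mask from the same column subset that is indexed on the
# left-hand side, so both have identical dimensions.
target <- data1[cols]
mask <- !is.na(target) & target == 0

# Only cells in the selected columns that equal 0 are replaced.
data1[cols][mask] <- NA

data1
```

Because the mask and the indexed subset come from the same cols, columns such as site are never examined, even if they contain the value being recoded.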
Gregor Gorjanc
2007-Feb-12 08:22 UTC
[R] setting a number of values to NA over a data.frame.
Jim Lemon <jim <at> bitwrit.com.au> writes:
> Hi John,
> You might have a look at "toNA" in the prettyR package. Wait for version
> 1.0-4, just uploaded, as I have fixed a bug in that function.

There is also a set of generic functions exactly for such cases: unknownToNA(), NAToUnknown() and isUnknown() in the gdata package.

Gregor
--- Gregor Gorjanc <gregor.gorjanc at bfro.uni-lj.si> wrote:
> There is also a set of generic functions exactly for
> such cases: unknownToNA(),
> NAToUnknown() and isUnknown() in gdata package.
>
> Gregor

Thanks very much. I have a wealth of approaches now.