StellathePug
2011-Aug-23 16:29 UTC
[R] Replacing NAs in one variable with values of another variable
Hello everyone,

I am trying to figure out a way of replacing missing observations in one of the variables of a data frame by values of another variable. For example, assume my data is X:

X <- as.data.frame(matrix(c(9, 6, 1, 3, 9, "NA", "NA", "NA", "NA", "NA",
                            6, 4, 3, "NA", "NA", "NA", 5, 4, 1, 3), ncol=2))
names(X) <- c("X1", "X2")

I want to change X1 so that instead of the missing values it uses the values in X2 (regardless of whether these are missing). So my X1 should become

X$X1 <- c(9, 6, 1, 3, 9, "NA", 5, 4, 1, 3)

I have searched online for a while and looked at the manuals, and the best (unsuccessful) attempt I have come up with is

X$X1[X$X1=="NA"] <- X$X2

which produces the following X1

X$X1 <- c(9, 6, 1, 3, 9, 6, "NA", 3, "NA", "NA")

and generates the following warnings:

Warning messages:
1: In `[<-.factor`(`*tmp*`, X$X1 == "NA", value = c(5L, 3L, 2L, 6L, :
  invalid factor level, NAs generated
2: In x[...] <- m :
  number of items to replace is not a multiple of replacement length

I think that my error is that it is ignoring the non-missing values of X1 and the dimensions don't match. But what I want my code to do is to look at the rows of X1 and see if each one is a missing value; if it is, replace it with the value that is in the same row of X2; if it's not missing, leave it as is.

What am I doing wrong?

Thank you very much!
Rita

--
View this message in context: http://r.789695.n4.nabble.com/Replacing-NAs-in-one-variable-with-values-of-another-variable-tp3763269p3763269.html
Sent from the R help mailing list archive at Nabble.com.
Ista Zahn
2011-Aug-23 18:05 UTC
[R] Replacing NAs in one variable with values of another variable
Hi,

On Tue, Aug 23, 2011 at 12:29 PM, StellathePug wrote:
> I want to change X1 so that instead of the missing values it uses the values
> in X2 (regardless of whether these are missing).

Note that you don't have any missing values in X, as "NA" != NA

> I think that my error is that it is ignoring the non-missing values of X1
> and the dimensions don't match. But what I want my code to do is to look at
> the rows of X1, see if it's a missing value; if it is, replace it with the
> value that is in the row of X2; if it's not missing, leave it as is.

Here are two solutions, one that is a correction to your first attempt, and another using ifelse:

X$X1[X$X1=="NA"] <- X$X2[X$X1=="NA"]

X$X1 <- ifelse(X$X1 == "NA", X$X2, X$X1)

Best,
Ista

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
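A minimal sketch of the "NA" != NA point, offered as an illustration rather than as part of the original reply. It assumes the columns are read as character (stringsAsFactors = FALSE is passed explicitly; the warning in the original post shows the columns were factors there) and adds an as.numeric conversion step that does not appear in the thread. The helper names n_missing_before and miss are made up for this example.

```r
# Build the data as in the original post. Mixing numbers with the string "NA"
# inside c() coerces every element to character.
X <- as.data.frame(
  matrix(c(9, 6, 1, 3, 9, "NA", "NA", "NA", "NA", "NA",
           6, 4, 3, "NA", "NA", "NA", 5, 4, 1, 3), ncol = 2),
  stringsAsFactors = FALSE  # assumption: keep the columns as character, not factor
)
names(X) <- c("X1", "X2")

# The string "NA" is ordinary text, so is.na() does not flag it.
n_missing_before <- sum(is.na(X$X1))

# Assumed extra step: convert both columns to numeric so that the text "NA"
# becomes a real missing value.
X[] <- lapply(X, as.numeric)

# Subset both sides with the same logical index so the lengths match.
miss <- is.na(X$X1)
X$X1[miss] <- X$X2[miss]
print(X)
```

Indexing both X1 and X2 with the same logical vector is what removes the "number of items to replace" warning: the original attempt selected five positions on the left but supplied all ten values of X2 on the right.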
StellathePug
2011-Aug-24 13:35 UTC
[R] Replacing NAs in one variable with values of another variable
Thank you Dan and Ista!

Both of you are correct: I should have used NA rather than "NA" in my example. So the correct code should be:

X <- as.data.frame(matrix(c(9, 6, 1, 3, 9, NA, NA, NA, NA, NA,
                            6, 4, 3, NA, NA, NA, 5, 4, 1, 3), ncol=2))
names(X) <- c("X1", "X2")
X$X1[is.na(X$X1)] <- X$X2[is.na(X$X1)]

The last line replaces the missing observations of X1 with the corresponding values of X2. The ifelse approach also works.

Thank you very much, again!
Rita

--
View this message in context: http://r.789695.n4.nabble.com/Replacing-NAs-in-one-variable-with-values-of-another-variable-tp3763269p3765317.html
Sent from the R help mailing list archive at Nabble.com.
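For comparison, here is a hedged sketch of the ifelse variant written with is.na(), checked against the indexed assignment. The data frame is built directly from numeric vectors for brevity, and the names by_index, by_ifelse and miss are illustrative only; they are not from the thread.

```r
# Same values as in the thread, with real NA values in numeric columns.
X <- data.frame(
  X1 = c(9, 6, 1, 3, 9, NA, NA, NA, NA, NA),
  X2 = c(6, 4, 3, NA, NA, NA, 5, 4, 1, 3)
)

# Approach 1: indexed assignment on a copy of X1.
by_index <- X$X1
miss <- is.na(by_index)
by_index[miss] <- X$X2[miss]

# Approach 2: ifelse() picks X2 where X1 is missing and X1 otherwise.
by_ifelse <- ifelse(is.na(X$X1), X$X2, X$X1)

# Both approaches should agree element by element, including the NA in row 6,
# where X2 is missing as well.
print(all.equal(by_index, by_ifelse))
```

The indexed form changes only the missing positions and keeps the column's type and attributes, while ifelse() builds a whole new vector, which can matter for columns such as factors or dates.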