Christopher Desjardins
2011-Jun-27 16:56 UTC
[R] Recoding several variables into one use the most recent data
Hi,

I have the following data management issue. I am trying to combine multiple years of ethnicity data into one variable called ethnic. The data looks similar to the following:

id ethnic07 ethnic08 ethnic09 ethnic10
1  1        1        1        1
2  1        1        2        2
3  3        4        4        NA
4  2        3        NA       NA

So, what I'd like to do is create a variable called 'ethnic' and I'd like to have this variable be filled with the most recent data available. So ethnic10 would have the highest priority, then ethnic09, followed by ethnic08, and finally ethnic07. So the ethnic variable based on the data above would look like the following:

ethnic
1
2
4
3

I thought an ifelse() statement might work but I seem to be writing over my data every time I do this.

Thanks,
Chris
David Winsemius
2011-Jun-27 17:48 UTC
[R] Recoding several variables into one use the most recent data
On Jun 27, 2011, at 12:56 PM, Christopher Desjardins wrote:

> Hi,
> I have the following data management issue. I am trying to combine
> multiple years of ethnicity data into one variable called ethnic.
> The data looks similar to the following
>
> id ethnic07 ethnic08 ethnic09 ethnic10
> 1  1        1        1        1
> 2  1        1        2        2
> 3  3        4        4        NA
> 4  2        3        NA       NA
>
> So, what I'd like to do is create a variable called 'ethnic' and I'd
> like to have this variable be filled with the most recent data
> available. So ethnic10 would have the highest priority, then
> ethnic09, followed by ethnic08, and finally ethnic07. So the ethnic
> variable based on the data above would look like the following:
>
> ethnic
> 1
> 2
> 4
> 3

> rd.txt <- function(txt, header=TRUE, ...) {
+     rd <- read.table(textConnection(txt), header=header, ...)
+     closeAllConnections()
+     rd }
> tail_noNA <- function(x,n) tail(x[!is.na(x)], n)
> tail_noNA(c(1,2,3,NA),1)
[1] 3
> dat <- rd.txt("id ethnic07 ethnic08 ethnic09 ethnic10
+ 1 1 1 1 1
+ 2 1 1 2 2
+ 3 3 4 4 NA
+ 4 2 3 NA NA")
> apply(dat, 1, tail_noNA, 1)
[1] 1 2 4 3
> as.matrix(apply(dat, 1, tail_noNA, 1))
     [,1]
[1,]    1
[2,]    2
[3,]    4
[4,]    3

> I thought an ifelse() statement might work but I seem to be writing
> over my data every time I do this.
>
> Thanks,
> Chris

David Winsemius, MD
West Hartford, CT
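
One thing to watch with the apply() call above is that it runs over every column of dat, id included, so a row whose ethnicity columns were all NA would come back with its id value rather than NA. The following is only a sketch of one way to guard against that, not part of the original reply. It assumes the year columns are named as in the example, and the names year_cols, last_non_na and ethnic2 are made up for illustration. The second half shows an ifelse() version that tests the running result for NA at each step, which may be one way to avoid overwriting values that were already filled in.

```r
# Example data from the thread, rebuilt with data.frame() for self-containment
dat <- data.frame(
  id       = 1:4,
  ethnic07 = c(1, 1, 3, 2),
  ethnic08 = c(1, 1, 4, 3),
  ethnic09 = c(1, 2, 4, NA),
  ethnic10 = c(1, 2, NA, NA)
)

# Year columns ordered oldest to newest; id is deliberately left out
# (year_cols is an illustrative name, not from the thread)
year_cols <- c("ethnic07", "ethnic08", "ethnic09", "ethnic10")

# Illustrative helper: last non-missing value of x, or NA if all are missing
last_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA else x[length(x)]
}

# Row-wise version, restricted to the year columns
dat$ethnic <- apply(dat[year_cols], 1, last_non_na)

# Vectorized ifelse() version: start from the newest year, then fill only
# the entries that are still NA from each older year in turn
ethnic2 <- dat$ethnic10
for (col in rev(year_cols)[-1]) {
  ethnic2 <- ifelse(is.na(ethnic2), dat[[col]], ethnic2)
}
```

Both dat$ethnic and ethnic2 should agree on the example data. The loop version avoids apply() altogether, which may matter on larger data sets.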