Hi,
I have the following data:

> data[1:20,c(1,2,20)]
idr schyear year
  1       8    0
  1       9    1
  1      10   NA
  2       4   NA
  2       5   -1
  2       6    0
  2       7    1
  2       8    2
  2       9    3
  2      10    4
  2      11   NA
  2      12    6
  3       4   NA
  3       5   -2
  3       6   -1
  3       7    0
  3       8    1
  3       9    2
  3      10    3
  3      11   NA

What I want to do is replace the NAs in the year variable so that I get the following:

idr schyear year
  1       8    0
  1       9    1
  1      10    2
  2       4   -2
  2       5   -1
  2       6    0
  2       7    1
  2       8    2
  2       9    3
  2      10    4
  2      11    5
  2      12    6
  3       4   -3
  3       5   -2
  3       6   -1
  3       7    0
  3       8    1
  3       9    2
  3      10    3
  3      11    4

I have no idea how to do this. For each subject (idr), it needs to either add 1 to the previous year value if the NA is preceded by a value in year, or subtract 1 from the next year value if the NA comes before a year value.

Does that make sense? I could do this in Excel but I am at a loss for how to do this in R. Please reply to me as well as the list if you respond.

Thanks!
Chris

Hello,

Try the following. I've called your data frames 'dat' (the original) and 'dat2' (the desired result).

# First your datasets, see ?dput
# dput(dat)
dat <- structure(list(idr = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), schyear = c(8L, 9L, 10L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L), year = c(0L, 1L, NA, NA, -1L, 0L, 1L, 2L, 3L, 4L,
NA, 6L, NA, -2L, -1L, 0L, 1L, 2L, 3L, NA)), .Names = c("idr",
"schyear", "year"), class = "data.frame", row.names = c(NA, -20L))

# dput(dat2)
dat2 <- structure(list(idr = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), schyear = c(8L, 9L, 10L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L), year = c(0L, 1L, 2L, -2L, -1L, 0L, 1L, 2L, 3L, 4L,
5L, 6L, -3L, -2L, -1L, 0L, 1L, 2L, 3L, 4L)), .Names = c("idr",
"schyear", "year"), class = "data.frame", row.names = c(NA, -20L))

# Now the code
# Fill each NA from its neighbour within one subject's rows:
# a leading NA takes the next value minus 1, any other NA takes
# the previous value plus 1.
fun <- function(x){
    for(i in which(is.na(x$year))){
        if(i == 1)
            x$year[i] <- x$year[i + 1] - 1L
        else
            x$year[i] <- x$year[i - 1] + 1L
    }
    x
}

result <- do.call(rbind, lapply(split(dat, dat$idr), fun))
# rbind() on a split list prefixes the row names with the group
# label (e.g. "1.1"), so reset them before comparing
rownames(result) <- NULL
all.equal(result, dat2)

Hope this helps,

Rui Barradas

On 03-11-2012 17:14, Christopher Desjardins wrote:
> Hi,
> I have the following data:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
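
A note on checking a result built with do.call(rbind, lapply(split(...), ...)): rbind() on a named list of data frames prefixes the row names with the group label, so all.equal() against a data frame with default row names can report an attribute difference even when every value matches. Below is a minimal sketch with a small made-up data frame; the names d, combined and fixed are only illustrative and do not come from the thread.

```r
# Small hypothetical data frame with two groups of two rows each
d <- data.frame(g = c(1L, 1L, 2L, 2L), v = c(10L, 20L, 30L, 40L))

# Split by group and bind back together without changing any values
combined <- do.call(rbind, split(d, d$g))

# Row names now carry the group label as a prefix
print(rownames(combined))

# Comparison including attributes: reports a row.names difference
print(all.equal(combined, d))

# Resetting the row names lets the comparison pass
fixed <- combined
rownames(fixed) <- NULL
print(isTRUE(all.equal(fixed, d)))
```

If only the values matter, resetting the row names before the comparison is one simple way to get a clean TRUE/FALSE answer.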

x <- read.table(text = "idr schyear year
1 8 0
1 9 1
1 10 NA
2 4 NA
2 5 -1
2 6 0
2 7 1
2 8 2
2 9 3
2 10 4
2 11 NA
2 12 6
3 4 NA
3 5 -2
3 6 -1
3 7 0
3 8 1
3 9 2
3 10 3
3 11 NA", header = TRUE)

# you did not specify if there might be multiple contiguous NAs,
# so there are a lot of checks to be made
x.l <- lapply(split(x, x$idr), function(.idr){
    # check for all NAs -- just return indeterminate state
    if (sum(is.na(.idr$year)) == nrow(.idr)) return(.idr)
    # repeat until all NAs have been fixed; takes care of contiguous ones
    while (any(is.na(.idr$year))){
        # find all the NAs
        for (i in which(is.na(.idr$year))){
            if ((i == 1L) && (!is.na(.idr$year[i + 1L]))){
                .idr$year[i] <- .idr$year[i + 1L] - 1
            } else if ((i > 1L) && (!is.na(.idr$year[i - 1L]))){
                .idr$year[i] <- .idr$year[i - 1L] + 1
            } else if ((i < nrow(.idr)) && (!is.na(.idr$year[i + 1L]))){
                .idr$year[i] <- .idr$year[i + 1L] - 1
            }
        }
    }
    return(.idr)
})
do.call(rbind, x.l)

The last call prints:

     idr schyear year
1.1    1       8    0
1.2    1       9    1
1.3    1      10    2
2.4    2       4   -2
2.5    2       5   -1
2.6    2       6    0
2.7    2       7    1
2.8    2       8    2
2.9    2       9    3
2.10   2      10    4
2.11   2      11    5
2.12   2      12    6
3.13   3       4   -3
3.14   3       5   -2
3.15   3       6   -1
3.16   3       7    0
3.17   3       8    1
3.18   3       9    2
3.19   3      10    3
3.20   3      11    4

On Sat, Nov 3, 2012 at 1:14 PM, Christopher Desjardins <cddesjardins at gmail.com> wrote:
> Hi,
> I have the following data:
> [...]

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
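
A possible loop-free alternative, offered only as a sketch: in the example data, year - schyear is the same for every observed row of a subject (-8, -6 and -7 for idr 1, 2 and 3). If that holds in the full data, which the original post does not state and is an assumption here, each missing year is just schyear plus that per-subject offset. The column name year_filled and the variable offset are illustrative names, not from the thread.

```r
# Example data from the thread
x <- data.frame(
  idr     = c(1L, 1L, 1L, rep(2L, 9), rep(3L, 8)),
  schyear = c(8:10, 4:12, 4:11),
  year    = c(0L, 1L, NA, NA, -1L, 0L, 1L, 2L, 3L, 4L, NA, 6L,
              NA, -2L, -1L, 0L, 1L, 2L, 3L, NA)
)

# ASSUMPTION: year - schyear is constant within each idr.
# Take the first observed difference per subject and repeat it
# over all of that subject's rows; stays NA if nothing is observed.
offset <- ave(x$year - x$schyear, x$idr, FUN = function(d) {
  if (all(is.na(d))) NA_integer_ else d[!is.na(d)][1]
})

# Fill only the missing years; observed values are left untouched
x$year_filled <- ifelse(is.na(x$year), x$schyear + offset, x$year)

print(x)
```

Because the fill depends only on schyear and the offset, runs of contiguous NAs need no special handling. It will give wrong answers if schyear skips or repeats values relative to year, so it is worth checking that the offset really is constant within each subject before relying on it.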