Janko Thyson
2011-May-23 10:59 UTC
[R] Remove duplicate elements in lists via recursive indexing
Dear list,

I'm trying to solve something pretty basic here, but I can't really come up with a good solution. Basically, I would just like to remove duplicated named elements in lists via their respective recursive indexes (given that I have a routine that identifies these recursive indexes). Here's a little example:

# VECTORS
# Here, it's pretty simple to remove duplicated entries
y <- c(1,2,3,1,1)
idx.dupl <- which(duplicated(y))
y <- y[-idx.dupl]
# /

# LISTS
x <- list(a=list(a.1.1=1, a.1.1=2, a.1.1=3))

x[[c(1,1)]]
x[[c(1,2)]] # Should be removed.
x[[c(1,3)]] # Should be removed.

# Let's say a 'checkDuplicates' routine would give me:
idx.dupl <- list(c(1,2), c(1,3))

# Remove first duplicate:
x[[idx.dupl[[1]]]] <- NULL
x
# Problem:
# Once I remove the first duplicate, my duplicate index would have to be
# updated as well, as there is no third element anymore.
x[[idx.dupl[[2]]]] <- NULL

# So something like this would not work:
sapply(idx.dupl, function(x.idx){
    x[[x.idx]] <<- NULL
})
# /

Sorry if I'm missing something totally obvious here, but do you have any idea how to solve this?

Thanks a lot,
Janko
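One possible way around the shifting-index problem described above is to delete from the highest index down, so that each removal only shifts elements that have already been dealt with. The following is a minimal sketch, not a general solution: it assumes that idx.dupl (the output of the hypothetical 'checkDuplicates' routine) lists sibling indexes in ascending order, so that rev() visits them from last to first.

```r
x <- list(a = list(a.1.1 = 1, a.1.1 = 2, a.1.1 = 3))

# Assumed output of the hypothetical 'checkDuplicates' routine,
# with sibling indexes in ascending order.
idx.dupl <- list(c(1, 2), c(1, 3))

# Delete from the last index to the first: removing element 3 does not
# change the position of element 2, so every index is still valid when used.
for (idx in rev(idx.dupl)) {
  x[[idx]] <- NULL
}

str(x)
```

If the indexes could arrive in arbitrary order or point into different sub-lists, they would need to be sorted first so that later siblings are always removed before earlier ones.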
Timothy Bates
2011-May-23 12:23 UTC
[R] Remove duplicate elements in lists via recursive indexing
Dear Janko,

I think this requires a for loop. The approach I took here was to mark the dups, then dump them all in one hit:

testData = expand.grid(letters[1:4], c(1:3))
testData$keep = FALSE
uniqueIDS = unique(testData$Var1)
for (thisID in uniqueIDS) {
    # match() returns the row of the first occurrence of this ID
    firstCaseOnly = match(thisID, testData$Var1)
    testData[firstCaseOnly, "keep"] = TRUE
}
# Keep only the rows marked above
(testData = testData[testData$keep, ])
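The "mark the duplicates, then drop them all at once" idea can also be applied directly to the nested list from the original question, which avoids recursive indexes altogether. The sketch below assumes that "duplicate" means "same name within the same parent list"; dropDuplicatedNames is a hypothetical helper written for illustration, not an existing R function, and it is only meant for plain nested lists (a data frame passed through it would come back as a plain list).

```r
# Hypothetical helper: at every level of a nested list, keep only the
# first element carrying each name.
dropDuplicatedNames <- function(lst) {
  if (!is.list(lst)) return(lst)
  nms <- names(lst)
  if (!is.null(nms)) {
    # Mark duplicates in one pass; unnamed elements ("") are always kept.
    keep <- !duplicated(nms) | nms == ""
    lst <- lst[keep]
  }
  # Recurse into the remaining elements.
  lapply(lst, dropDuplicatedNames)
}

x <- list(a = list(a.1.1 = 1, a.1.1 = 2, a.1.1 = 3))
x <- dropDuplicatedNames(x)
str(x)
```

Because all duplicates at a level are dropped in a single subsetting step, no index ever has to be updated after a removal.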