Hi Everyone,

I'm working on a thorny subsetting problem involving a list of lists. I've put a dput of the data here:

https://gist.githubusercontent.com/aronlindberg/b916dee897d051ac5be5/raw/a78cbf873a7e865c3173f943ff6309ea688c653b/dput

I can get one instance of the element I want this way:

> input[[67]]$content[[1]]$sha
[1] "58cf43ecdc1beb7e1043e9de612ecc817b090f15"

However, I need to use an lapply function to loop over all of the items of the list. I've tried something like this:

get_shas <- function(input){
  x <- sapply(input, "[[", "content")
  y <- sapply(x, "[[", "sha")
  return(y)
}

sha_lists <- lapply(commit_lists, get_shas)

However, this doesn't work. When I run each of the lapply commands "manually" it returns NULL for every list, and when I run the whole apply function it says:

Error in FUN(X[[1L]], ...) : subscript out of bounds

I've tried reading the sections on lists and subsetting in Hadley's Advanced R, but I still cannot figure it out. Can anyone help or offer a pointer?

Best,
Aron

--
Aron Lindberg
Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io
On 20/02/15 08:45, Aron Lindberg wrote:
> Hi Everyone,
>
> I'm working on a thorny subsetting problem involving list of lists.

If you think this is "thorny" you ain't seen nothin' yet! But note that you've got a list of lists of lists ... i.e. the nesting is at least 3 deep.

> I've put a dput of the data here:
>
> https://gist.githubusercontent.com/aronlindberg/b916dee897d051ac5be5/raw/a78cbf873a7e865c3173f943ff6309ea688c653b/dput

Thank you for creating a reproducible example.

> I can get one instance of the element I want this way:
>
> > input[[67]]$content[[1]]$sha
> [1] "58cf43ecdc1beb7e1043e9de612ecc817b090f15"
>
> However, I need to use a lapply function to loop over all of the
> items of the list. I've tried something like this, but it doesn't
> work:
>
> get_shas <- function(input){
>   x <- sapply(input, "[[", "content")
>   y <- sapply(x, "[[", "sha")
>   return(y)
> }
>
> sha_lists <- lapply(commit_lists, get_shas)
>
> However, this doesn't work. When I run each of the lapply commands
> "manually" it returns NULL for every list, and when I run the whole
> apply function it says:
>
> Error in FUN(X[[1L]], ...) : subscript out of bounds
>
> I've tried reading the sections on lists and subsetting in Hadley's
> Advanced R, but I still cannot figure it out. Can anyone help or
> offer a pointer?

At least part of the problem is that for some values of "i", input[[i]]$content[[1]] is a list (with an entry named "sha") and for others it is a character vector.

I don't follow your function get_shas() completely, so I started from scratch:

foo <- function (x){
  sapply(x, function(y){
    z <- y$content[[1]]
    if(is.list(z)) z$sha else NA
  })
}

I find that foo(input) gives a vector of length 100, 81 entries of which are NA. Entry number 67 at least agrees with what was shown in your email.

HTH

cheers,

Rolf Turner

--
Rolf Turner
Technical Editor ANZJS
Department of Statistics
University of Auckland
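
A small sketch of where the "subscript out of bounds" error likely comes from, using made-up toy data rather than the gist (the object `toy` and its values are hypothetical). It relies on a documented difference in `[[`: looking up a missing name returns NULL on a list but is an error on an unnamed atomic vector.

```r
# Hypothetical toy data mimicking the two shapes Rolf describes.
toy <- list(
  list(content = list(list(sha = "abc123", message = "first"))),
  list(content = list("Not Found"))
)

# On a list, a missing name yields NULL rather than an error;
# here 'sha' lives one level deeper than 'content'.
null_on_list <- is.null(toy[[1]]$content[["sha"]])

# On an unnamed character vector, the same lookup is an error.
msg <- tryCatch(toy[[2]]$content[[1]][["sha"]],
                error = function(e) conditionMessage(e))

# Rolf's approach: check the type before extracting.
foo <- function(x) {
  sapply(x, function(y) {
    z <- y$content[[1]]
    if (is.list(z)) z$sha else NA
  })
}
result <- foo(toy)
```

On this toy input, `null_on_list` is TRUE, `msg` holds the error text, and `result` has one sha followed by one NA.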
Aron Lindberg <aron.lindberg <at> case.edu> writes:
>
> Hi Everyone,
>
> I'm working on a thorny subsetting problem involving list of lists. I've put a dput of the data here:
>
> https://gist.githubusercontent.com/aronlindberg/b916dee897d051ac5be5/raw/a78cbf873a7e865c3173f943ff6309ea688c653b/dput

IIUC, you want the value of every list element that is named "sha" and that name will only apply to atomic objects.

If so, this should do it.

> input <- dget("/tmp/dpt")
> shas <- unlist( input, use.names=FALSE )[ grepl( "sha", names(unlist(input)))]
> input[[67]]$content[[1]]$sha
[1] "58cf43ecdc1beb7e1043e9de612ecc817b090f15"
> which(input[[67]]$content[[1]]$sha == shas )
[1] 194

HTH,

Chuck
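
The flattening approach can be tried on a tiny made-up list to see what it selects. This is only a sketch: `toy`, the field `sha_url`, and the anchored pattern are assumptions for illustration, not part of the gist data or of Chuck's code. Note that `grepl("sha", ...)` keeps every leaf whose flattened name contains "sha" anywhere, so a stricter pattern may be wanted if other fields share those letters.

```r
# Hypothetical toy data: two commits in one item, one error string,
# and one item with empty content.
toy <- list(
  list(content = list(list(sha = "abc123", sha_url = "u1"),
                      list(sha = "def456"))),
  list(content = list("Not Found")),
  list(content = list())
)

# unlist() flattens every leaf and builds names from the nesting path.
flat <- unlist(toy)

# Loose match, as in the thread: any name containing "sha".
loose <- unname(flat[grepl("sha", names(flat))])

# Stricter match (an assumption): the last name component is exactly "sha".
strict <- unname(flat[grepl("(^|\\.)sha$", names(flat))])
```

Here `loose` also picks up the `sha_url` value, while `strict` keeps only the two sha strings. Empty and character-only items need no special handling because they contribute no leaf with a matching name.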
Thanks Chuck and Rolf.

While Rolf's code works on the dput that I actually gave you (a smaller subset of the full dataset), it fails on the larger dataset, because there are further exceptions: input[[i]]$content[[1]] is sometimes a list and sometimes a character vector, and sometimes input[[i]]$content simply returns list().

Chuck's solution, however, bypasses this and works on the full dataset (which was 8 MB, which is why I didn't upload it as a gist).

Best,
Aron

On Fri, Feb 20, 2015 at 12:44 AM, Charles Berry <ccberry at ucsd.edu> wrote:
> IIUC, you want the value of every list element that is named "sha" and
> that name will only apply to atomic objects.
> If so, this should do it.
>
> > shas <- unlist( input, use.names=FALSE )[ grepl( "sha", names(unlist(input)))]
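
For anyone who prefers one value per top-level item instead of a flattened vector, here is a minimal sketch of an extractor that tolerates all three shapes mentioned in this message. The helper name `get_sha` and the toy data are hypothetical; the length check is needed because `list()[[1]]` is itself a "subscript out of bounds" error.

```r
# Hypothetical helper: return the first commit's sha, or NA if unavailable.
get_sha <- function(item) {
  content <- item$content
  # Case 1: content is list() (or missing), so there is nothing to index.
  if (length(content) == 0) return(NA_character_)
  first <- content[[1]]
  # Case 2: a proper commit list with a 'sha' entry.
  # Case 3: anything else (e.g. a character vector) becomes NA.
  if (is.list(first) && !is.null(first$sha)) first$sha else NA_character_
}

# Hypothetical toy data covering the three shapes.
toy <- list(
  list(content = list(list(sha = "abc123"))),
  list(content = list("Not Found")),
  list(content = list())
)

# vapply() guarantees a character vector with one entry per item.
shas <- vapply(toy, get_sha, character(1))
```

This keeps the result aligned with the input (position i of `shas` belongs to `toy[[i]]`), which the unlist() approach does not.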