Thanks Chuck and Rolf.

While Rolf's code also works on the dput that I actually gave you (a smaller subset of the full dataset), it failed on the larger dataset because there are further exceptions: input[[i]]$content[[1]] is sometimes a list, sometimes a character vector, and sometimes input[[i]]$content simply returns list().

Chuck's solution, however, bypasses this and works on the full dataset (which was 8 MB, which is why I didn't upload it as a gist).

Best,
Aron

-- 
Aron Lindberg
Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io

On Fri, Feb 20, 2015 at 12:44 AM, Charles Berry <ccberry at ucsd.edu> wrote:
> Aron Lindberg <aron.lindberg <at> case.edu> writes:
>>
>> Hi Everyone,
>>
>> I'm working on a thorny subsetting problem involving list of lists. I've put a dput of the data here:
>>
>> https://gist.githubusercontent.com/aronlindberg/b916dee897d051ac5be5/raw/a78cbf873a7e865c3173f943ff6309ea688c653b/dput
>>
> IIUC, you want the value of every list element that is named "sha" and
> that name will only apply to atomic objects.
>
> If so, this should do it.
>
> > input <- dget("/tmp/dpt")
> > shas <- unlist( input, use.names=FALSE )[ grepl( "sha", names(unlist(input)))]
> > input[[67]]$content[[1]]$sha
> [1] "58cf43ecdc1beb7e1043e9de612ecc817b090f15"
> > which(input[[67]]$content[[1]]$sha == shas )
> [1] 194
>
> HTH,
> Chuck
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]
Hmm... Chuck's solution may actually be problematic, because there are several entries which at the deepest level are called "sha" but should not be included, such as:

    input[[67]]$content[[1]]$commit$tree$sha

and

    input[[67]]$content[[1]]$parents[[1]]$sha

It's only the "sha" entries that fit the following subsetting pattern that should be included:

    input[[i]]$content[[1]]$sha[1]

It's getting thornier!

To be fair to Rolf's solution (which probably can be updated to solve the problem), I've posted the complete dput here:

https://gist.githubusercontent.com/aronlindberg/92700c04c88ff112e4f7/raw/0f3cd8468f4dc82267be3cec72d53a7a04f5c449/dput.R

-- 
Aron Lindberg
Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io

On Fri, Feb 20, 2015 at 8:25 AM, Aron Lindberg <aron.lindberg at case.edu> wrote:
> Thanks Chuck and Rolf.
>
> While Rolf's code also works on the dput that I actually gave you (a smaller subset of the full dataset), it failed to work on the larger dataset, because there are further exceptions:
> input[[i]]$content[[1]] is sometimes a list, sometimes a character vector, and sometimes input[[i]]$content simply returns list().
>
> Chuck's solution however bypasses this and works on the full dataset (which was 8mb, which is why I didn't upload it as a gist).
>
> Best,
> Aron
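The subsetting pattern described in this message, together with the three shapes reported earlier in the thread (a list, a character vector, or an empty list()), can be sketched as a small guarded accessor. This is only an illustration on made-up data, not code from the thread: the helper name get_top_sha and the toy input are hypothetical, and the sketch assumes every top-level entry is itself a list.

```r
# Hypothetical helper (not from the thread): return
# entry$content[[1]]$sha[1] when that exact path exists, else NA.
# Assumes `entry` is a list, as in the thread's input[[i]].
get_top_sha <- function(entry) {
  content <- entry[["content"]]
  # content may be list() or missing altogether
  if (!is.list(content) || length(content) == 0L) return(NA_character_)
  first <- content[[1L]]
  # content[[1]] may be a character vector rather than a list
  if (!is.list(first) || !("sha" %in% names(first))) return(NA_character_)
  sha <- first[["sha"]]
  if (is.character(sha) && length(sha) >= 1L) sha[[1L]] else NA_character_
}

# Made-up data covering the three shapes mentioned in the thread
input <- list(
  list(content = list(list(
    sha     = "aaa111",
    commit  = list(tree = list(sha = "bbb222")),
    parents = list(list(sha = "ccc333"))
  ))),
  list(content = list("just a string")),
  list(content = list())
)

# One result per top-level entry, so positions stay aligned with input
shas <- vapply(input, get_top_sha, character(1))
```

Because the helper only ever looks at content[[1]]$sha, the nested commit$tree$sha and parents[[1]]$sha values are never picked up, and entries without a usable value show up as NA instead of being dropped.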
How can you expect a solution if you cannot specify the problem?

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
Clifford Stoll

On Fri, Feb 20, 2015 at 6:13 AM, Aron Lindberg <aron.lindberg at case.edu> wrote:
> Hmm... Chuck's solution may actually be problematic because there are several entries which at the deepest level are called "sha", but that should not be included, such as:
>
> input[[67]]$content[[1]]$commit$tree$sha
>
> and
>
> input[[67]]$content[[1]]$parents[[1]]$sha
>
> it's only the "sha" that fit the following subsetting pattern that should be included:
>
> input[[i]]$content[[1]]$sha[1]
>
> It's getting thornier!
>
> To be fair to Rolf's solution (which probably can be updated to solve the problem), I've posted the complete dput here:
>
> https://gist.githubusercontent.com/aronlindberg/92700c04c88ff112e4f7/raw/0f3cd8468f4dc82267be3cec72d53a7a04f5c449/dput.R
On Fri, 20 Feb 2015, Aron Lindberg wrote:
> Hmm... Chuck's solution may actually be problematic because there are several entries which at the deepest level are called "sha", but that should not be included, such as:
>
> input[[67]]$content[[1]]$commit$tree$sha
>
> and
>
> input[[67]]$content[[1]]$parents[[1]]$sha
>
> it's only the "sha" that fit the following subsetting pattern that should be included:
>
> input[[i]]$content[[1]]$sha[1]
>

This should be straightforward.

Look at what grepl() is doing. And look at what names(unlist(input)) yields.

You can either write a regular expression to handle this (perhaps "content.sha$") or write other grepl() expressions to select (or get rid of) the desired (or unwanted) pattern.

See ?grepl and the page on regular expressions referenced there.

HTH,
Chuck
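As a rough illustration of the suggestion above, here is a minimal sketch on made-up data (the toy input below is hypothetical, not the dataset from the thread). It shows the kind of names that unlist() builds from the nesting, and how anchoring the pattern at the end of the name separates the top-level "sha" from the nested ones. The dot is escaped here because an unescaped "." matches any character.

```r
# Made-up data with the same kind of nesting as described in the thread
input <- list(
  list(content = list(list(
    sha     = "aaa111",
    commit  = list(tree = list(sha = "bbb222")),
    parents = list(list(sha = "ccc333"))
  ))),
  list(content = list())  # an entry with empty content contributes nothing
)

# Flatten once; the names encode the path to each leaf
flat <- unlist(input)
leaf_names <- names(flat)

# Unanchored pattern: every leaf whose path mentions "sha"
all_shas <- unname(flat[grepl("sha", leaf_names)])

# Anchored pattern: only leaves whose path ends in "content.sha"
top_shas <- unname(flat[grepl("content\\.sha$", leaf_names)])
```

Printing leaf_names on the real data first is worthwhile, since the exact names depend on how the list is nested and named, and a "sha" element of length greater than one would get numeric suffixes that the anchored pattern would miss.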
On Feb 20, 2015, at 6:13 AM, Aron Lindberg wrote:
> Hmm... Chuck's solution may actually be problematic because there are several entries which at the deepest level are called "sha", but that should not be included, such as:
>
> input[[67]]$content[[1]]$commit$tree$sha
>
> and
>
> input[[67]]$content[[1]]$parents[[1]]$sha
>
> it's only the "sha" that fit the following subsetting pattern that should be included:
>
> input[[i]]$content[[1]]$sha[1]
>
> It's getting thornier!
>
> To be fair to Rolf's solution (which probably can be updated to solve the problem), I've posted the complete dput here:
>
> https://gist.githubusercontent.com/aronlindberg/92700c04c88ff112e4f7/raw/0f3cd8468f4dc82267be3cec72d53a7a04f5c449/dput.R

I didn't try on the larger example, but this works on the smaller one:

    get_shas <- function(input) {
      # Pull the "content" element out of every top-level entry
      x <- lapply(input, "[[", "content")
      # Take the first element of each "content" list
      y <- lapply(x, "[[", 1)
      # Keep the top-level "sha" where present, otherwise NA
      lapply(y, function(yy) {
        if ("sha" %in% names(yy)) yy[["sha"]] else NA
      })
    }
    sha_lists <- get_shas(input)

It delivers an entry for every element of the input object, which is either the value of "sha" or NA. I think that is not a bad thing because it lets you figure out where the values are coming from.

David Winsemius
Alameda, CA, USA