avi.e.gross at gmail.com
2023-May-26 04:02 UTC
[R] extract parts of a list before symbol
All true Jeff, but why do things the easy way! LOL!

My point was that various data structures, besides the list we started with,
store the names as an attribute. Yes, names(listname) works fine to extract
whatever parts they want. My original idea of using a data.frame was because
it creates names when they are absent.

And you are correct that if the original list was not as shown, with all
items of length 1, converting to a data.frame fails.

From what you say, it is a harder thing to write a function that returns a
"name" for column N given a list. As you note, you get a NULL when there are
no names. You get empty strings when one or more (but not all) have no
names. But it can be done.

The OP initially was looking at a way to get a text version of a variable
they could use, perhaps using regular expressions to parse. Of course that
is not as easy as just looking at the names attribute in one of several
ways. But it may help in a sense to deal with the cases mentioned above.
The problem is that str() does not return anything except to stdout so it
must be captured to do silly things.

> test <- list(a=3,b=5,c=11)
> str(test)
List of 3
 $ a: num 3
 $ b: num 5
 $ c: num 11

> str(test[1])
List of 1
 $ a: num 3

> str(test[2])
List of 1
 $ b: num 5

> str(list(a=1, 2, c=3))
List of 3
 $ a: num 1
 $  : num 2
 $ c: num 3

> str(list(1, 2, 3))
List of 3
 $ : num 1
 $ : num 2
 $ : num 3

> text <- str(list(a=1, 2, c=3)[1])
List of 1
 $ a: num 1

> text <- capture.output(str(list(a=1, 2, c=3)))
> text
[1] "List of 3"   " $ a: num 1" " $  : num 2" " $ c: num 3"

So you could use some imaginative code that extracts what you want. I
repeat, this is not a suggested way nor the best, just something that seems
to work:

> sub("(^[\\$ ]*)(\\w+|)(:.*$)", "\\2", text[2:length(text)])
[1] "a" ""  "c"

Obviously the first line of output needs to be removed as it does not fit
the pattern.

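The capture-and-parse idea above can be wrapped in a small function. This is only a minimal sketch, not a recommended approach: names_from_str is a hypothetical helper name that does not appear in the thread, and it assumes a flat list whose names contain only letters, digits and underscores. The pattern also allows for the padding that str() adds when the names differ in length.

```r
# Hypothetical helper (not from the thread): recover element names by
# parsing the printed output of str(). Assumes a flat list whose names
# consist only of letters, digits and underscores.
names_from_str <- function(x) {
  # str() prints to stdout and returns NULL, so capture the printed lines.
  lines <- capture.output(str(x))
  # Drop the "List of N" header, which does not fit the pattern.
  lines <- lines[-1]
  # Keep the (possibly empty) name between the " $ " prefix and the colon;
  # " *" absorbs the padding str() adds to align names of unequal length.
  sub("^[$ ]*(\\w*) *:.*$", "\\1", lines)
}

x <- list(a = 1, 2, c = 3)
names_from_str(x)   # same character vector that names(x) returns
names(x)
```

For a partly named list both calls give the same result, which is the point made above: the parsing route only reproduces what names() already provides.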
Perhaps in this case a far less complex way is to use summary() rather than
str(), as it does return its result rather than only printing it.

> summary(list(a=1, 2, c=3)) -> text
> text
  Length Class  Mode   
a 1      -none- numeric
  1      -none- numeric
c 1      -none- numeric

This puts the variable name, if any, at the start, but parsing that is not
trivial as it is not plain text.

Bottom line, try not to do things the hard way. Just carefully use names()
...

-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Jeff Newmiller
Sent: Thursday, May 25, 2023 10:32 PM
To: r-help at r-project.org
Subject: Re: [R] extract parts of a list before symbol

What a remarkable set of detours, Avi, all deriving apparently from a few
gaps in your understanding of R.

As Rolf said, "names(test)" is the answer.

a) Lists are vectors. They are not atomic vectors, but they are vectors, so
as.vector(test) is a no-op.

test <- list( a = 1, b = 2, c=3 )
attributes(test)
attributes(as.vector(test))

(Were you thinking of the unlist function? If so, there is no reason to
convert the value of the list to an atomic vector in order to look at the
value of an attribute of that list.)

b) Data frames are lists, with the additional constraint that all elements
have the same length, and that a names attribute and a row.names attribute
are both required. Converting a list to a data frame to get the names is
expensive in CPU cycles and breaks as soon as the list elements have a
variety of lengths.

c) All data in R is stored as vectors. Worrying about whether a data value
is a vector is pointless.

d) All objects can have attributes, including the names attribute. However,
not all objects must have a names attribute... including lists. Omitting a
name for any of the elements of a list in the constructor will lead to
having zero-length character values in the names attribute where the names
were omitted. Omitting all names in the list constructor will cause no names
attribute to be created for that list.

test2 <- list( 1, 2, 3 )
attributes(test2)

e) The names() function returns the value of the names attribute. If that
attribute is missing, it returns NULL. For data frames, the colnames function
is equivalent to the names function (I rarely use the colnames function).
For lists, colnames returns NULL... there are no "columns" in a list,
because there is no constraint on the (lengths of the) contents of a list.

names(test2)

f) The names attribute, if it exists, is just a character vector. It is
never necessary to convert the output of names() to a character vector. If
the names attribute doesn't exist, then it is up to the user to write code
that creates it.

names(test2) <- c( "A", "B", "C" )
attributes(test2)
names(test2)
# or use the argument names in the list function

names(test2) <- 1:3 # integer
names(test2) # character
attributes(test2)$names <- 1:3 # integer
attributes(test2) # character
test2[[ "2" ]] == 2 # TRUE
test2$`2` == 2 # TRUE

On May 25, 2023 6:17:37 PM PDT, avi.e.gross at gmail.com wrote:
>Evan,
>
>List names are less easy than data.frame column names so try this:
>
>> test <- list(a=3,b=5,c=11)
>> colnames(test)
>NULL
>> colnames(as.data.frame(test))
>[1] "a" "b" "c"
>
>But note an entry with no name has one made up for it.
>
>
>> test2 <- list(a=3,b=5, 666, c=11)
>> colnames(data.frame(test2))
>[1] "a"    "b"    "X666" "c"
>
>But that may be overkill as simply converting to a vector if ALL parts are
>of the same type will work too:
>
>> names(as.vector(test))
>[1] "a" "b" "c"
>
>To get one at a time:
>
>> names(as.vector(test))[1]
>[1] "a"
>
>You can do it even more simply by looking at the attributes of your list:
>
>> attributes(test)
>$names
>[1] "a" "b" "c"
>
>> attributes(test)$names
>[1] "a" "b" "c"
>> attributes(test)$names[3]
>[1] "c"
>
>
>-----Original Message-----
>From: R-help <r-help-bounces at r-project.org> On Behalf Of Evan Cooch
>Sent: Thursday, May 25, 2023 1:30 PM
>To: r-help at r-project.org
>Subject: [R] extract parts of a list before symbol
>
>Suppose I have the following list:
>
>test <- list(a=3,b=5,c=11)
>
>I'm trying to figure out how to extract the characters to the left of
>the equal sign (i.e., I want to extract a list of the variable names, a,
>b and c).
>
>I've tried the permutations I know of involving sub - things like
>sub("\\=.*", "", test), but no matter what I try, sub keeps returning
>(3, 5, 11). In other words, even though I'm trying to extract the
>'stuff' before the = sign, I seem to be successful only at grabbing the
>stuff after the equal sign.
>
>Pointers to the obvious fix? Thanks...
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

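Jeff's points d) to f) above, on NULL versus empty names, suggest how a function returning a "name" for every element could be written with names() alone. What follows is a minimal sketch under stated assumptions: element_names and its prefix argument are hypothetical and not from the thread, and the positional labels it invents ("V2" and so on) are an arbitrary choice, not what data.frame() would generate.

```r
# Hypothetical helper (not from the thread): return one label per list
# element, inventing a positional label wherever a name is missing.
element_names <- function(x, prefix = "V") {
  nms <- names(x)
  # names() returns NULL when the list has no names attribute at all.
  if (is.null(nms)) nms <- rep("", length(x))
  # Partly named lists carry "" for the elements that were left unnamed.
  unnamed <- is.na(nms) | nms == ""
  nms[unnamed] <- paste0(prefix, which(unnamed))
  nms
}

element_names(list(a = 3, b = 5, c = 11))  # names already present, kept as is
element_names(list(a = 1, 2, c = 3))       # second element is labelled "V2"
element_names(list(1, 2, 3))               # no names attribute: "V1" "V2" "V3"
```

This avoids both the data.frame conversion and any parsing of printed output, and it works regardless of the lengths of the list elements.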
Many thanks to all. Wasn't even aware of the names function. That does the
trick for present purposes.
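For the original question, a short sketch of why the sub() attempt only ever showed the values: sub() works on a character vector, and coercing the list with as.character() keeps the values but not the names, so there is no "a=" text for the pattern to match. The code uses only the test list from the thread and base R functions.

```r
test <- list(a = 3, b = 5, c = 11)

# sub() coerces a non-character input with as.character(), which keeps
# only the values; the names are not part of the resulting strings.
values_as_text <- as.character(test)

# The pattern finds no "=" in any string, so the values come back unchanged.
after_sub <- sub("=.*", "", test)

# The names are stored in the names attribute and are read with names().
list_names <- names(test)

values_as_text
after_sub
list_names
```

The text "a=3" exists only in the source code of the list() call; once the list is built, the names and the values are stored separately.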