A bit too fast there, Duncan... x[[c(1,2)]] is illegal for the x <- 1:10 in your earlier example.
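For what it's worth, here is a minimal sketch of the distinction, using toy objects invented for illustration (x, lst and res are not from the thread). With an atomic x the call fails, while on a list a length-2 index inside [[ ]] is applied recursively.

```r
# Atomic vector: [[ ]] with a length-2 index signals an error
x <- 1:10
res <- tryCatch(x[[c(1, 2)]], error = function(e) "error")
res

# List: the same index is applied recursively, i.e. lst[[1]][[2]]
lst <- list(list("a", "b"), "c")
identical(lst[[c(1, 2)]], lst[[1]][[2]])
```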
On July 9, 2021 5:16:13 PM PDT, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>On 09/07/2021 6:44 p.m., Bert Gunter wrote:
>> OK, I stand somewhat chastised.
>>
>> But my point still is that what you get when you "extract" depends on
>> how you define "extract." Do note that ?"[" yields a help file titled
>> "Extract or Replace Parts of an object"; and afaics, the term "subset"
>> is not explicitly used as Duncan prefers.
>
>?"[[" gives you the same page, but I agree: this part of the
>documentation isn't written very clearly. The "Introduction to R" manual
>uses the terms I used (see section 2.7, "Index vectors; selecting and
>modifying subsets of a data set"), as does the source code (and the R
>Language Definition manual, though it's not as clear as the Intro).
>
>But the point isn't to chastise you, it's to educate you (and the OP).
>Thinking of [] as subsetting is more helpful than thinking of it as
>extraction. That way the result of x[c(1,2)] makes sense. It's a
>little bit more of a stretch, but the result of x[[c(1,2)]] also makes
>sense when you think of it as extraction.
>
>Duncan Murdoch
>
>> The relevant part of the
>> Help file for "[" says, for recursive objects: "Indexing by [ is
>> similar to atomic vectors and selects a list of the specified
>> element(s)." That a data.frame is a list is explicitly stated, as I
>> noted; that lists are in fact vectors is also explicitly stated (?list
>> says: "Almost all lists in R internally are Generic Vectors") but then
>> one is stuck with: a data.frame is a list and therefore a vector, but
>> is.vector(d3) is FALSE. The explanation is explicit again in
>> ?is.vector ("is.vector returns TRUE if x is a vector of the specified
>> mode having no attributes other than names. It returns FALSE
>> otherwise."). But I would say these issues are sufficiently murky that
>> my warning to be precise is not entirely inappropriate; unfortunately,
>> I may have made them more so. Sigh....
>>
>> Cheers,
>> Bert
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Fri, Jul 9, 2021 at 3:05 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>>
>>> On 09/07/2021 5:51 p.m., Jeff Newmiller wrote:
>>>> "Strictly speaking", Greg is correct, Bert.
>>>>
>>>>
>>>> https://cran.r-project.org/doc/manuals/r-release/R-lang.html#List-objects
>>>>
>>>> Lists in R are vectors. What we colloquially refer to as "vectors"
>>>> are more precisely referred to as "atomic vectors". And without a
>>>> doubt, this "vector" nature of lists is a key underlying concept that
>>>> explains why adding a dim attribute creates a matrix that can hold data
>>>> frames. It is also a stumbling block for programmers from other
>>>> languages that have things like linked lists.
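A small sketch of the dim point, with made-up objects (the names df1, df2 and m are only for illustration). Because a list is a vector, giving it a dim attribute yields a matrix whose cells can each hold a data frame.

```r
df1 <- data.frame(a = 1:2)
df2 <- data.frame(b = c("x", "y"))

# A plain list of four data frames ...
m <- list(df1, df2, df2, df1)
# ... becomes a 2 x 2 matrix once it has a dim attribute (filled column-wise)
dim(m) <- c(2, 2)

is.matrix(m)       # it is a matrix
is.list(m)         # and still a list underneath
class(m[[1, 2]])   # each cell holds a whole data frame
```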
>>>
>>> I would also object to v3 (below) as "extracting" a column from d.
>>> "d[2]" doesn't extract anything, it "subsets" the data frame, so the
>>> result is a data frame, not what you get when you extract something from
>>> a data frame.
>>>
>>> People don't realize that "x <- 1:10; y <- x[[3]]" is perfectly legal.
>>> That extracts the 3rd element (the number 3). The problem is that R has
>>> no way to represent a scalar number, only a vector of numbers, so x[[3]]
>>> gets promoted to a vector containing that number when it is returned and
>>> assigned to y.
>>>
>>> Lists are vectors of R objects, so if x is a list, x[[3]] is something
>>> that can be returned, and it is different from x[3].
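A short sketch of that difference, assuming nothing beyond base R (lst, sub and elt are illustrative names, not objects from the thread):

```r
# Atomic case: the "extracted" element is still a vector, of length 1
x <- 1:10
y <- x[[3]]
length(y)

# List case: [ ] subsets (returns a list), [[ ]] extracts (returns the element)
lst <- list(1:3, "a", TRUE)
sub <- lst[3]
elt <- lst[[3]]
is.list(sub)
is.list(elt)
```

Here sub is a list of length 1 that contains TRUE, while elt is the logical value itself.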
>>>
>>> Duncan Murdoch
>>>
>>>>
>>>> On July 9, 2021 2:36:19 PM PDT, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>>>> "1. a column, when extracted from a data frame, *is* a vector."
>>>>> Strictly speaking, this is false; it depends on exactly what is meant
>>>>> by "extracted." e.g.:
>>>>>
>>>>>> d <- data.frame(col1 = 1:3, col2 = letters[1:3])
>>>>>> v1 <- d[,2] ## a vector
>>>>>> v2 <- d[[2]] ## the same, i.e.
>>>>>> identical(v1,v2)
>>>>> [1] TRUE
>>>>>> v3 <- d[2] ## a data.frame
>>>>>> v1
>>>>> [1] "a" "b" "c" ## a character vector
>>>>>> v3
>>>>>   col2
>>>>> 1    a
>>>>> 2    b
>>>>> 3    c
>>>>>> is.vector(v1)
>>>>> [1] TRUE
>>>>>> is.vector(v3)
>>>>> [1] FALSE
>>>>>> class(v3) ## data.frame
>>>>> [1] "data.frame"
>>>>> ## but
>>>>>> is.list(v3)
>>>>> [1] TRUE
>>>>>
>>>>> which is simply explained in ?data.frame (where else?!) by:
>>>>> "A data frame is a **list** [emphasis added] of variables of the same
>>>>> number of rows with unique row names, given class "data.frame". If no
>>>>> variables are included, the row names determine the number of rows."
>>>>>
>>>>> "2. maybe your question is "is a given function for a vector, or for a
>>>>> data frame/matrix/array?". if so, i think the only way is reading
>>>>> the help information (?foo)."
>>>>>
>>>>> Indeed! Is this not what the Help system is for?! But note also that
>>>>> the S3 class system may somewhat blur the issue: foo() may work
>>>>> appropriately and differently for different (S3) classes of objects. A
>>>>> detailed explanation of this behavior can be found in appropriate
>>>>> resources or (more tersely) via ?UseMethod .
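As a rough illustration of that S3 point, here is a minimal sketch with a hypothetical generic called describe() (it is invented for this example and is not part of base R or any package). UseMethod() picks the method from the class of the argument, so the same call behaves differently for a data frame and for a plain vector.

```r
# Hypothetical generic: dispatches on the class of x
describe <- function(x, ...) UseMethod("describe")

# Fallback for anything without a more specific method
describe.default <- function(x, ...) "something else"

# Method chosen when x has class "data.frame"
describe.data.frame <- function(x, ...) "a data frame"

r1 <- describe(1:3)
r2 <- describe(data.frame(a = 1))
```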
>>>>>
>>>>> "you might find reading ?"[" and ?"[.data.frame" useful"
>>>>>
>>>>> Not just "useful" -- **essential** if you want to work in R, unless
>>>>> one gets this information via any of the numerous online tutorials,
>>>>> courses, or books that are available. The Help system is accurate and
>>>>> authoritative, but terse. I happen to like this mode of documentation,
>>>>> but others may prefer more extended expositions. I stand by this claim
>>>>> even if one chooses to use the "Tidyverse", data.table package, or
>>>>> other alternative frameworks for handling data. Again, others may
>>>>> disagree, but R is structured around these basics, and imo one remains
>>>>> ignorant of them at their peril.
>>>>>
>>>>> Cheers,
>>>>> Bert
>>>>>
>>>>>
>>>>> Bert Gunter
>>>>>
>>>>> "The trouble with having an open mind is that people keep coming along
>>>>> and sticking things into it."
>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>>>
>>>>> On Fri, Jul 9, 2021 at 11:57 AM Greg Minshall <minshall at umich.edu> wrote:
>>>>>>
>>>>>> Kai,
>>>>>>
>>>>>>> one more question, how can I know if the function is for column
>>>>>>> manipulations or for vector?
>>>>>>
>>>>>> i still stumble around R code. but, i'd say the following (and look
>>>>>> forward to being corrected! :):
>>>>>>
>>>>>> 1. a column, when extracted from a data frame, *is* a vector.
>>>>>>
>>>>>> 2. maybe your question is "is a given function for a vector, or for a
>>>>>>    data frame/matrix/array?". if so, i think the only way is reading
>>>>>>    the help information (?foo).
>>>>>>
>>>>>> 3. sometimes, extracting the column as a vector from a data frame-like
>>>>>>    object might be non-intuitive. you might find reading ?"[" and
>>>>>>    ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
>>>>>>    package). also, the str() command can be helpful in understanding
>>>>>>    what is happening. (the lobstr:: package's sxp() function, as well
>>>>>>    as more verbose .Internal(inspect()) can also give you insight.)
>>>>>>
>>>>>>    with the data.table:: package, for example, if "DT" is a data.table
>>>>>>    object, with "x2" as a column, adding or leaving off quotation marks
>>>>>>    for the column name can make all the difference between ending up
>>>>>>    with a vector, or with a (much reduced) data table:
>>>>>> ----
>>>>>>> is.vector(DT[, x2])
>>>>>> [1] TRUE
>>>>>>> str(DT[, x2])
>>>>>> num [1:9] 32 32 32 32 32 32 32 32 32
>>>>>>>
>>>>>>> is.vector(DT[, "x2"])
>>>>>> [1] FALSE
>>>>>>> str(DT[, "x2"])
>>>>>> Classes 'data.table' and 'data.frame': 9 obs. of 1 variable:
>>>>>> $ x2: num 32 32 32 32 32 32 32 32 32
>>>>>> - attr(*, ".internal.selfref")=<externalptr>
>>>>>> ----
>>>>>>
>>>>>>    a second level of indexing may or may not help, mostly depending on
>>>>>>    the use of '[' versus '[['. this can sometimes cause confusion
>>>>>>    when you are learning the language.
>>>>>> ----
>>>>>>> str(DT[, "x2"][1])
>>>>>> Classes 'data.table' and 'data.frame': 1 obs. of 1 variable:
>>>>>> $ x2: num 32
>>>>>> - attr(*, ".internal.selfref")=<externalptr>
>>>>>>> str(DT[, "x2"][[1]])
>>>>>> num [1:9] 32 32 32 32 32 32 32 32 32
>>>>>> ----
>>>>>>
>>>>>>    the tibble:: package (used in, e.g., the dplyr:: package) also
>>>>>>    (always?) returns a single column as a non-vector. again, a
>>>>>>    second indexing with double '[[]]' can produce a vector.
>>>>>> ----
>>>>>>> DP <- tibble(DT)
>>>>>>> is.vector(DP[, "x2"])
>>>>>> [1] FALSE
>>>>>>> is.vector(DP[, "x2"][[1]])
>>>>>> [1] TRUE
>>>>>> ----
>>>>>>
>>>>>> but, note that a list of lists is also a vector:
>>>>>>> is.vector(list(list(1), list(1,2,3)))
>>>>>> [1] TRUE
>>>>>>> str(list(list(1), list(1,2,3)))
>>>>>> List of 2
>>>>>> $ :List of 1
>>>>>> ..$ : num 1
>>>>>> $ :List of 3
>>>>>> ..$ : num 1
>>>>>> ..$ : num 2
>>>>>> ..$ : num 3
>>>>>>
>>>>>> etc.
>>>>>>
>>>>>> hth. good luck learning!
>>>>>>
>>>>>> cheers, Greg
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>
--
Sent from my phone. Please excuse my brevity.