Hervé Pagès
2018-May-16 15:33 UTC
[Rd] Dispatch mechanism seems to alter object before calling method on it
On 05/15/2018 09:13 PM, Michael Lawrence wrote:
> My understanding is that array (or any other structure) does not
> "simply" inherit from vector, because structures are not vectors in
> the strictest sense. Basically, once a vector gains attributes, it is
> a structure, not a vector. The methods package accommodates this by
> defining an "is" relationship between "structure" and "vector" via an
> "explicit coerce", such that any "structure" passed to a "vector"
> method is first passed to as.vector(), which strips attributes. This
> is very much by design.

It seems that the problem is really with matrices and arrays, not
with "structures" in general:

  f <- factor(c("z", "x", "z"), levels=letters)
  m <- matrix(1:12, ncol=3)
  df <- data.frame(f=f)
  x <- structure(1:3, titi="A")

Only the matrix loses its attributes when passed to a "vector"
method:

  setGeneric("foo", function(x) standardGeneric("foo"))
  setMethod("foo", "vector", identity)

  foo(f)   # attributes are preserved
  # [1] z x z
  # Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

  foo(m)   # attributes are stripped
  # [1]  1  2  3  4  5  6  7  8  9 10 11 12

  foo(df)  # attributes are preserved
  #   f
  # 1 z
  # 2 x
  # 3 z

  foo(x)   # attributes are preserved
  # [1] 1 2 3
  # attr(,"titi")
  # [1] "A"

Also, if structures are passed to as.vector() before being passed to
a "vector" method, shouldn't as.vector() and foo() be equivalent on
them? For 'f' and 'x' they're not:

  as.vector(f)
  # [1] "z" "x" "z"

  as.vector(x)
  # [1] 1 2 3

Finally, note that for factors and data frames the "vector" method gets
selected despite the fact that is( , "vector") is FALSE:

  is(f, "vector")
  # [1] FALSE

  is(m, "vector")
  # [1] TRUE

  is(df, "vector")
  # [1] FALSE

  is(x, "vector")
  # [1] TRUE

Couldn't we recognize these problems as real, even if they are by
design? Hopefully we can all agree that:
- the dispatch mechanism should only dispatch, not alter objects;
- is() and selectMethod() should not contradict each other.

Thanks,
H.

>
> Michael
>
>
> On Tue, May 15, 2018 at 5:25 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>> Hi,
>>
>> This was quite unexpected:
>>
>>   setGeneric("foo", function(x) standardGeneric("foo"))
>>
>>   setMethod("foo", "vector", identity)
>>
>>   foo(matrix(1:12, ncol=3))
>>   # [1]  1  2  3  4  5  6  7  8  9 10 11 12
>>
>>   foo(array(1:24, 4:2))
>>   # [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
>>
>> If I define a method for array objects, things work as expected though:
>>
>>   setMethod("foo", "array", identity)
>>
>>   foo(matrix(1:12, ncol=3))
>>   #      [,1] [,2] [,3]
>>   # [1,]    1    5    9
>>   # [2,]    2    6   10
>>   # [3,]    3    7   11
>>   # [4,]    4    8   12
>>
>> So, luckily, I have a workaround.
>>
>> But shouldn't the dispatch mechanism stay away from the business of
>> altering objects before passed to it?
>>
>> Thanks,
>> H.
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
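
The workaround mentioned in the original post (adding an "array" method next to the "vector" one) can be sketched as follows. This is only an illustration under stated assumptions: `keepShape` is a hypothetical generic name that does not appear in the thread, and the methods are plain pass-through functions. Because "matrix" extends "array" directly, dispatch should pick the "array" method for matrices and arrays, so no coercion to a bare vector is involved.

```r
library(methods)

# "keepShape" is a hypothetical generic used only for this illustration.
setGeneric("keepShape", function(x) standardGeneric("keepShape"))

# Fallback method for anything that extends "vector".
setMethod("keepShape", "vector", function(x) x)

# More specific method: "matrix" extends "array" directly, so dispatch
# selects this method for matrices and arrays.
setMethod("keepShape", "array", function(x) x)

m <- matrix(1:12, ncol = 3)
a <- array(1:24, 4:2)

dim(keepShape(m))  # 4 3
dim(keepShape(a))  # 4 3 2
```

With only the "vector" method defined, the thread reports that the same calls return plain vectors without a `dim` attribute; that part may depend on the R version in use and is worth checking locally.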
Michael Lawrence
2018-May-16 17:22 UTC
[Rd] Dispatch mechanism seems to alter object before calling method on it
Factors and data.frames are not structures, because they must have a
class attribute. Just call them "objects". They are higher level than
structures, which in practice just shape data without adding a lot of
semantics. Compare getClass("matrix") and getClass("factor").

I agree that inheritance through explicit coercion is confusing. As
far as I know, there are only 2 places where it is used:
1) Objects with attributes but no class, basically "structure" and its
   subclasses "array" <- "matrix"
2) Classes that extend a reference type ("environment", "name" and
   "externalptr") via hidden delegation (@.xData)

I'm not sure if anyone should be doing #2. For #1, a simple "fix"
would be just to drop inheritance of "structure" from "vector". I
think the intent was to mimic base R behavior, where it will happily
strip (or at least ignore) attributes when passing an array or matrix
to an internal function that expects a vector.

A related problem, which explains why factor and data.frame inherit
from "vector" even though they are objects, is that any S4 object
derived from those needs to be (for pragmatic compatibility reasons)
an integer vector or list, respectively, internally (the virtual
@.Data slot). Separating that from inheritance would probably be
difficult.

Yes, we can consider these to be problems, to some extent stemming
from the behavior and design of R itself, but I'm not sure it's worth
doing anything about them at this point.

Michael
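
One way to look at the "explicit coerce" relationships described in the message above, beyond reading the printed output of getClass(), is to ask extends() for the full extension object. The sketch below assumes that extends(..., fullInfo = TRUE) returns an object of class "SClassExtension" with a logical `simple` slot when an inheritance path exists; `isSimpleExtension` is a hypothetical helper name, not something from the methods package or from the thread.

```r
library(methods)

# Hypothetical helper: reports whether class `from` extends class `to`
# through a "simple" relationship (no coercion function involved).
# Returns NA when no extension object is available.
isSimpleExtension <- function(from, to) {
  ext <- extends(from, to, fullInfo = TRUE)
  if (is(ext, "SClassExtension")) ext@simple else NA
}

# Compare a direct relationship with the ones discussed in the thread.
isSimpleExtension("matrix", "array")
isSimpleExtension("matrix", "vector")
isSimpleExtension("factor", "vector")
```

If the description in the thread holds, the "matrix" to "vector" path should be reported as not simple, which is the case where the object is coerced before the method sees it. No expected output is shown here because it has not been verified across R versions.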
Hervé Pagès
2018-May-16 19:23 UTC
[Rd] Dispatch mechanism seems to alter object before calling method on it
On 05/16/2018 10:22 AM, Michael Lawrence wrote:
> [...]
>
> Yes, we can consider these to be problems, to some extent stemming
> from the behavior and design of R itself, but I'm not sure it's worth
> doing anything about them at this point.

Thanks for the informative discussion. It still doesn't explain why
'm' gets its attributes stripped and 'x' does not though:

  m <- matrix(1:12, ncol=3)
  x <- structure(1:3, titi="A")

  setGeneric("foo", function(x) standardGeneric("foo"))
  setMethod("foo", "vector", identity)

  foo(m)
  # [1]  1  2  3  4  5  6  7  8  9 10 11 12

  foo(x)
  # [1] 1 2 3
  # attr(,"titi")
  # [1] "A"

If I understand correctly, both are "structures", not "objects".

Why aren't these problems worth fixing? More generally speaking, the
erratic behavior of the S4 system with respect to S3 objects has been
a plague since the beginning of the methods package. And many people
have complained about this on many occasions in one way or another.
For the record, here are some of the most notorious problems:

  class(as.numeric(1:4))
  # [1] "numeric"

  class(as(1:4, "numeric"))
  # [1] "integer"

  is.vector(matrix())
  # [1] FALSE

  is(matrix(), "vector")
  # [1] TRUE

  is.list(data.frame())
  # [1] TRUE

  is(data.frame(), "list")
  # [1] FALSE

  extends("data.frame", "list")
  # [1] TRUE

  setClassUnion("vector_OR_factor", c("vector", "factor"))

  is(data.frame(), "vector")
  # [1] FALSE

  is(data.frame(), "factor")
  # [1] FALSE

  is(data.frame(), "vector_OR_factor")
  # [1] TRUE

etc...

Many people stay away from S4 because of these incomprehensible
behaviors.

Finally, note that even pure S3 operations can produce output that
doesn't make sense:

  is.list(data.frame())
  # [1] TRUE

  is.vector(list())
  # [1] TRUE

  is.vector(data.frame())
  # [1] FALSE

(that is: a data frame is a list and a list is a vector, but a data
frame is not a vector!)

Why aren't these problems taken more seriously?

Thanks,
H.
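
The mismatches listed in the message above between base predicates and S4 is() can be collected side by side for a few objects. This is a minimal sketch, not code from the thread: the object list and the variable names (`objs`, `base_flags`, `s4_flags`, `comparison`) are chosen here for illustration, and the S4 column is deliberately left without expected values since it may vary with the R version.

```r
library(methods)

# A few objects of the kinds discussed in the thread.
objs <- list(
  matrix     = matrix(1:4, 2),
  data.frame = data.frame(a = 1:2),
  factor     = factor(c("a", "b")),
  list       = list(1, 2)
)

# Base R predicate: TRUE only for vectors with no attributes other than names.
base_flags <- vapply(objs, is.vector, logical(1))

# S4 view: does the object's class extend the virtual class "vector"?
s4_flags <- vapply(objs, function(o) is(o, "vector"), logical(1))

# Rows where the two columns differ are the inconsistencies in question.
comparison <- data.frame(base_is_vector = base_flags,
                         s4_is_vector   = s4_flags)
print(comparison)
```

The same pattern works for other predicate pairs mentioned in the thread, such as is.list() against is( , "list").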