Hi!

I noticed that there is a (minor) bug either in the command all.equal() or in the "plm" package. I demonstrate this using an example taken from the documentation of plm():

=====================================
R> data("Produc", package="plm")
R> zz <- plm(log(gsp)~log(pcap)+log(pc)+log(emp)+unemp,
+      data=Produc, index=c("state","year"))
R> all.equal(zz,zz)
[1] TRUE
Warning message:
In if (length(target) != length(current)) return(paste("target, current differ in having response: ",  :
  the condition has length > 1 and only the first element will be used
R> all.equal(zz$formula,zz$formula)
[1] TRUE
Warning message:
In if (length(target) != length(current)) return(paste("target, current differ in having response: ",  :
  the condition has length > 1 and only the first element will be used
R> class(zz$formula)
[1] "pFormula" "Formula"  "formula"
=====================================

The last commands show that the warning message comes from comparing the elements "formula", which are of the class "pFormula" (inheriting from "Formula" and "formula"). It would be great if this issue could be fixed in the future.

Thanks a lot,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name
Arne Henningsen wrote:
> Hi!
>
> I noticed that there is a (minor) bug either in the command all.equal()
> or in the "plm" package. I demonstrate this using an example taken
> from the documentation of plm():

I'm not sure this is a bug, but I'd call it at least a design flaw. The problem is that the length.Formula method in the Formula package (which plm depends on) returns a vector of length 2. Now there's nothing in R that requires length() to return a scalar, but all.equal assumes it does, and I'd guess there are lots of other places this assumption is made.

Duncan Murdoch
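To make the mechanism concrete, here is a minimal sketch of what a vector-valued length() method does to a comparison such as `length(target) != length(current)`. The class "toyFormula" and the constructor toy() are hypothetical stand-ins invented for this illustration; they are not the Formula package's implementation.

```r
# Hypothetical toy class (NOT the real Formula package): a list holding the
# variable names on the left- and right-hand side of a model.
toy <- function(lhs, rhs) {
  structure(list(lhs = lhs, rhs = rhs), class = "toyFormula")
}

# A length() method that returns two numbers, one per side, mimicking the
# behaviour described above.
length.toyFormula <- function(x) {
  parts <- unclass(x)
  c(length(parts$lhs), length(parts$rhs))
}

a <- toy("y", c("x", "z", "w"))
b <- toy("y", c("x", "z"))

len_a <- length(a)

# The comparison is element-wise, so it yields a logical vector of length 2,
# which is not a valid single condition for if().
differs <- length(a) != length(b)

# A comparison that always yields a single TRUE or FALSE,
# whatever shape length() returns.
same_length <- identical(length(a), length(b))
```

Using identical() (or wrapping the comparison in any()) is one way for calling code to stay robust when it cannot be sure that length() is scalar.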
> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Arne Henningsen
> Sent: Tuesday, November 10, 2009 2:24 AM
> To: Duncan Murdoch; r-devel at r-project.org; Yves Croissant; Giovanni_Millo at generali.com; Achim Zeileis
> Subject: Re: [Rd] Bug in all.equal() or in the plm package
>
> On Mon, Nov 9, 2009 at 12:24 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> > Arne Henningsen wrote:
> >>
> >> I noticed that there is a (minor) bug either in the command all.equal()
> >> or in the "plm" package. I demonstrate this using an example taken
> >> from the documentation of plm():
> >>
> >
> > I'm not sure this is a bug, but I'd call it at least a design flaw. The
> > problem is that the length.Formula method in the Formula package (which plm
> > depends on) returns a vector of length 2. Now there's nothing in R that
> > requires length() to return a scalar,

No, but outside of R, length is a one-dimensional real number except perhaps in some esoteric mathematics, so I'm puzzled why length in R would be redefined to produce non-scalars.

> > but all.equal assumes it does, and I'd
> > guess there are lots of other places this assumption is made.
>
> Okay, let's call it "design flaw". Given that the "unusual" behaviour
> of length.Formula() causes this problem, I suggest that the
> length.Formula() method should be changed. Maybe to something like
>
> R> a <- as.Formula( y ~ x | z | w )
> # current behaviour:
> R> length(a)
> [1] 1 3
> # suggested behaviour:
> R> length(a)
> [1] 2
> R> length(a[[1]])
> [1] 1
> R> length(a[[2]])
> [1] 3

How about

# Total number of variables in model
R> length(a)
[1] 4
# Predictor variables (on the right hand side) pred(a) or rhs(a)
R> length(pred(a))
[1] 3
# Response variables (on the left hand side) resp(a) or lhs(a)
R> length(resp(a))
[1] 1

so all lengths of a formula's components can be obtained as scalars.
R> length(a)
[1] 3

is what R 2.9.1 produced, and may often be what is expected for the length of a formula, so the above could be

# Total number of variables in model
R> length(total(a))
[1] 4
# Predictor variables (on the right hand side) pred(a) or rhs(a)
R> length(a)
[1] 3
# Response variables (on the left hand side) resp(a) or lhs(a)
R> length(resp(a))
[1] 1

Steve McKinney

> This would be more consistent with the usual behaviour of length, e.g.
> R> b <- list( 1, 1:3 )
> R> length(b)
> [1] 2
> R> length(b[[1]])
> [1] 1
> R> length(b[[2]])
> [1] 3
>
> /Arne
>
> --
> Arne Henningsen
> http://www.arne-henningsen.name
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
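The accessor idea above can be tried out on an ordinary formula. The following is only a rough sketch under stated assumptions: resp(), pred() and total() are the hypothetical names floated in this message, not functions from the Formula package, and the helpers assume a two-sided formula and simply count variable names with all.vars().

```r
# Hypothetical helpers (names taken from the suggestion above; they are
# not part of the Formula package). They assume a two-sided formula.
resp  <- function(f) all.vars(f[[2]])  # variables left of the ~
pred  <- function(f) all.vars(f[[3]])  # variables right of the ~
total <- function(f) all.vars(f)       # every variable in the model

# An ordinary formula object; the "|" separators are just part of the
# right-hand-side expression here.
f <- y ~ x | z | w

n_resp  <- length(resp(f))
n_pred  <- length(pred(f))
n_total <- length(total(f))
```

Each helper returns a character vector, so the default length() applies and every count is a single number.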
Hi,

sorry for replying so late to this. I somehow missed the original thread and was just pointed to it by Yves...

>>> I noticed that there is a (minor) bug either in the command all.equal()
>>> or in the "plm" package. I demonstrate this using an example taken
>>> from the documentation of plm():
>>>
>>
>> I'm not sure this is a bug, but I'd call it at least a design flaw.
>> The problem is that the length.Formula method in the Formula package
>> (which plm depends on) returns a vector of length 2. Now there's
>> nothing in R that requires length() to return a scalar, but all.equal
>> assumes it does, and I'd guess there are lots of other places this
>> assumption is made.

Well, ?length says:

  The default method currently returns an 'integer' of length 1. Since
  this may change in the future and may differ for other methods,
  programmers should not rely on it.

The problem IMO is that the all.equal() method for "formula" gets called by inheritance without ensuring that it works. I think we just need to supply a suitable all.equal() method for "Formula" objects.

>> Okay, let's call it "design flaw". Given that the "unusual" behaviour
>> of length.Formula() causes this problem, I suggest that the
>> length.Formula() method should be changed. Maybe to something like
>
> R> a <- as.Formula( y ~ x | z | w )
> # current behaviour:
> R> length(a)
> [1] 1 3
> # suggested behaviour:
> R> length(a)
> [1] 2

This wouldn't be correct either because this is not a list of length 2. A "Formula" is a "formula" (of length 2 or 3) with two attributes ("lhs" and "rhs"). Thus, currently length() does not reflect the internal structure but rather the conceptual structure (of a formula consisting of a LHS and RHS, both with a certain length).

Unless there are good reasons to do otherwise, I would keep the length() method and just supply a suitable all.equal() method for "Formula" objects.

hth,
Z
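As a rough illustration of the proposed fix, the sketch below supplies a class-specific all.equal() method so that the inherited method, which relies on a scalar length(), is never reached. It is a toy example under stated assumptions: the class "toyFormula" and the constructor toy() are hypothetical, and a real method for "Formula" objects would have to compare the underlying formula and its "lhs" and "rhs" attributes instead of a list.

```r
# Hypothetical toy class (NOT the real Formula package).
toy <- function(lhs, rhs) {
  structure(list(lhs = lhs, rhs = rhs), class = "toyFormula")
}

# Vector-valued length(), kept as it is.
length.toyFormula <- function(x) {
  parts <- unclass(x)
  c(length(parts$lhs), length(parts$rhs))
}

# Class-specific all.equal() method: compare the underlying lists, so the
# vector-valued length() method is never consulted.
all.equal.toyFormula <- function(target, current, ...) {
  if (!inherits(current, "toyFormula")) {
    return("current is not a toyFormula")
  }
  all.equal(unclass(target), unclass(current), ...)
}

a <- toy("y", c("x", "z", "w"))

same      <- all.equal(a, a)
differing <- all.equal(a, toy("y", c("x", "z")))
```

Following the usual all.equal() convention, the method returns TRUE for equal objects and a character description of the differences otherwise, so callers would typically wrap the result in isTRUE().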