thr3ads.net - R devel - [Rd] Bug in all.equal() or in the plm package [Nov 2009]

Arne Henningsen

2009-Nov-09 10:29 UTC

[Rd] Bug in all.equal() or in the plm package

Hi!

I noticed that there is a (minor) bug either in the command all.equal()
or in the "plm" package. I demonstrate this using an example taken
from the documentation of plm():

=====================================
R> data("Produc", package="plm")
R> zz <- plm(log(gsp)~log(pcap)+log(pc)+log(emp)+unemp,
+   data=Produc, index=c("state","year"))
R> all.equal(zz,zz)
[1] TRUE
Warning message:
In if (length(target) != length(current)) return(paste("target, current differ in having response: ",  :
  the condition has length > 1 and only the first element will be used
R> all.equal(zz$formula,zz$formula)
[1] TRUE
Warning message:
In if (length(target) != length(current)) return(paste("target, current differ in having response: ",  :
  the condition has length > 1 and only the first element will be used
R> class(zz$formula)
[1] "pFormula" "Formula"  "formula"
=====================================
The last commands show that the warning message comes from comparing
the elements "formula", which are of the class "pFormula" (inheriting
from "Formula" and "formula"). It would be great if this issue could
be fixed in the future.

Thanks a lot,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name

Duncan Murdoch

2009-Nov-09 11:24 UTC

Arne Henningsen wrote:
> Hi!
>
> I noticed that there is a (minor) bug either in the command all.equal()
> or in the "plm" package. I demonstrate this using an example taken
> from the documentation of plm():
>
I'm not sure this is a bug, but I'd call it at least a design flaw.  The
problem is that the length.Formula method in the Formula package (which 
plm depends on) returns a vector of length 2.  Now there's nothing in R 
that requires length() to return a scalar, but all.equal assumes it 
does, and I'd guess there are lots of other places this assumption is made.
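
The effect described above can be reproduced without plm or Formula. The following is only a sketch under stated assumptions: "toyFormula", new_toy() and same_length() are hypothetical names made up for illustration and are not part of any package. It shows that a length() method may return more than one number, and that comparing lengths with identical() still gives a single TRUE or FALSE.

```r
# Hypothetical toy class standing in for a "Formula"-like object.
# This is not the real Formula package, only an illustration.
new_toy <- function(lhs, rhs) {
  structure(list(lhs = lhs, rhs = rhs), class = "toyFormula")
}

# A length() method that returns two numbers instead of one.
length.toyFormula <- function(x) {
  parts <- unclass(x)
  c(length(parts$lhs), length(parts$rhs))
}

x <- new_toy(list(quote(y)), list(quote(x), quote(z), quote(w)))
y <- new_toy(list(quote(y)), list(quote(x), quote(z)))

len_x <- length(x)
length(len_x)        # 2: the "length" is itself a vector
len_x != length(y)   # a logical vector of length 2, not a valid if() condition

# identical() always yields a single TRUE or FALSE, so it is safe inside if().
same_length <- function(a, b) identical(length(a), length(b))
same_length(x, x)    # TRUE
same_length(x, y)    # FALSE
```

Code that writes if (length(a) != length(b)) silently assumes a scalar; the identical() form does not.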

Duncan Murdoch

Steven McKinney

2009-Nov-10 19:17 UTC

> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
> On Behalf Of Arne Henningsen
> Sent: Tuesday, November 10, 2009 2:24 AM
> To: Duncan Murdoch; r-devel at r-project.org; Yves Croissant;
> Giovanni_Millo at generali.com; Achim Zeileis
> Subject: Re: [Rd] Bug in all.equal() or in the plm package
>
> On Mon, Nov 9, 2009 at 12:24 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> > Arne Henningsen wrote:
> >>
> >> I noticed that there is a (minor) bug either in the command all.equal()
> >> or in the "plm" package. I demonstrate this using an example taken
> >> from the documentation of plm():
> >>
> >
> > I'm not sure this is a bug, but I'd call it at least a design flaw. The
> > problem is that the length.Formula method in the Formula package (which plm
> > depends on) returns a vector of length 2. Now there's nothing in R that
> > requires length() to return a scalar,

No, but outside of R, length is a one dimensional real number
except perhaps in some esoteric mathematics, so I'm puzzled
why length in R would be redefined to produce non-scalars.

> > but all.equal assumes it does, and I'd
> > guess there are lots of other places this assumption is made.
> 
> Okay, let's call it "design flaw". Given that the "unusual" behaviour
> of length.Formula() causes this problem, I suggest that the
> length.Formula() method should be changed. Maybe to something like
> 
> R> a <- as.Formula( y ~ x | z | w )
> # current behaviour:
> R> length(a)
> [1] 1 3
> # suggested behaviour:
> R> length(a)
> [1] 2
> R> length(a[[1]])
> [1] 1
> R> length(a[[2]])
> [1] 3
> 
How about 
# Total number of variables in model
R> length(a)
[1] 4

# Predictor variables (on the right hand side) pred(a) or rhs(a)
R> length(pred(a))
[1] 3

# Response variables (on the left hand side) resp(a) or lhs(a)
R> length(resp(a))
[1] 1

so all lengths of a formula's components can
be obtained as scalars.

R> length(a)
[1] 3
is what R 2.9.1 produced, and may often be what is expected
for the length of a formula, so the above could be

# Total number of variables in model
R> length(total(a))
[1] 4

# Predictor variables (on the right hand side) pred(a) or rhs(a)
R> length(a)
[1] 3

# Response variables (on the left hand side) resp(a) or lhs(a)
R> length(resp(a))
[1] 1
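
A rough sketch of the accessor idea, for illustration only: "toyParts", lhs_of() and rhs_of() are hypothetical names, not functions from the Formula package, and the object is a plain list rather than a real formula. Each component is an ordinary list, so every length() call returns a scalar.

```r
# Hypothetical toy object holding the two sides of y ~ x | z | w.
# Not the real Formula class; the names below are made up for illustration.
toy <- structure(
  list(lhs = list(quote(y)),
       rhs = list(quote(x), quote(z), quote(w))),
  class = "toyParts"
)

# Accessors in the spirit of resp()/lhs() and pred()/rhs() suggested above.
lhs_of <- function(obj) unclass(obj)$lhs
rhs_of <- function(obj) unclass(obj)$rhs

n_resp  <- length(lhs_of(toy))   # 1
n_pred  <- length(rhs_of(toy))   # 3
n_total <- n_resp + n_pred       # 4
```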

Steve McKinney
> This would be more consistent with the usual behaviour of length, e.g.
> R> b <- list( 1, 1:3 )
> R> length(b)
> [1] 2
> R> length(b[[1]])
> [1] 1
> R> length(b[[2]])
> [1] 3
> 
> /Arne

Achim Zeileis

2009-Nov-25 15:21 UTC

Hi,

sorry for replying so late to this. I somehow missed the original thread 
and was just pointed to it by Yves...

>>> I noticed that there is a (minor) bug either in the command all.equal()
>>> or in the "plm" package. I demonstrate this using an example taken
>>> from the documentation of plm():
>>>
>>
>> I'm not sure this is a bug, but I'd call it at least a design flaw.
>> The problem is that the length.Formula method in the Formula package
>> (which plm depends on) returns a vector of length 2. Now there's
>> nothing in R that requires length() to return a scalar, but all.equal
>> assumes it does, and I'd guess there are lots of other places this
>> assumption is made.

Well, ?length says:

      The default method currently returns an 'integer' of length 1.
      Since this may change in the future and may differ for other
      methods, programmers should not rely on it.

The problem IMO is that the all.equal() method for "formula" gets called
by inheritance without ensuring that it works. I think we just need to
supply a suitable all.equal() method for "Formula" objects.

>> Okay, let's call it "design flaw". Given that the "unusual" behaviour
>> of length.Formula() causes this problem, I suggest that the
>> length.Formula() method should be changed. Maybe to something like
>> R> a <- as.Formula( y ~ x | z | w )
>> # current behaviour:
>> R> length(a)
>> [1] 1 3
>> # suggested behaviour:
>> R> length(a)
>> [1] 2

This wouldn't be correct either because this is not a list of length 2. A
"Formula" is a "formula" (of length 2 or 3) with two attributes ("lhs" and
"rhs"). Thus, currently length() does not reflect the internal structure
but rather the conceptual structure (of a formula consisting of a LHS and
RHS, both with a certain length).

Unless there are good reasons to do otherwise, I would keep the length()
method and just supply a suitable all.equal() method for "Formula" objects.
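
What such a method might look like, as a minimal sketch on a toy class: "toyFormula" and new_toy() are hypothetical stand-ins, and this is not the method that the Formula package ships. The point is only that a class-specific all.equal() method takes precedence over an inherited one and never has to call length() on the object itself.

```r
# Hypothetical toy class with a two-element length(), as in the thread.
# Not the real Formula package.
new_toy <- function(lhs, rhs) {
  structure(list(lhs = lhs, rhs = rhs), class = "toyFormula")
}
length.toyFormula <- function(x) {
  parts <- unclass(x)
  c(length(parts$lhs), length(parts$rhs))
}

# Class-specific all.equal() method: compares the two sides directly and
# returns TRUE or a character vector describing the differences.
all.equal.toyFormula <- function(target, current, ...) {
  if (!inherits(current, "toyFormula")) {
    return("current is not a toyFormula")
  }
  t_parts <- unclass(target)
  c_parts <- unclass(current)
  msgs <- character(0)
  if (!identical(t_parts$lhs, c_parts$lhs)) {
    msgs <- c(msgs, "left-hand sides differ")
  }
  if (!identical(t_parts$rhs, c_parts$rhs)) {
    msgs <- c(msgs, "right-hand sides differ")
  }
  if (length(msgs) == 0L) TRUE else msgs
}

a <- new_toy(list(quote(y)), list(quote(x), quote(z), quote(w)))
b <- new_toy(list(quote(y)), list(quote(x), quote(z), quote(w)))
d <- new_toy(list(quote(y)), list(quote(x), quote(z)))

res_same <- all.equal(a, b)   # TRUE
res_diff <- all.equal(a, d)   # "right-hand sides differ"
```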

hth,
Z
