Martin Maechler
2015-Nov-16 16:43 UTC
[R] Why does a custom function called is.numeric.factor break lattice?
>>>>> Bert Gunter <bgunter.4567 at gmail.com>
>>>>>     on Mon, 16 Nov 2015 08:21:09 -0800 writes:

    > Thanks Duncan. You are right; I missed this.

    > Namespaces and full qualification seems the only reliable solution to
    > the general issue though -- right?

Not in this case; full qualification is very rarely needed in package code (even though some "schools" use and propagate it much more than I would recommend), and we are talking about the lattice code, i.e., package code, not user code, here.

I.e., using base::is.numeric() would not help at all: It will still find the bogus is.numeric.factor because that is taken before the internal default method.

Also, I'm almost sure S4 dispatch would suffer from the same feature of S (and hence R) here: You are allowed to define methods for your new classes and they are used "dynamically". (I also don't think that the problem is related to the fact that this a.b.c() case is S3-ambiguous: a() method for "b.c" or a.b() method for "c".)

Unfortunately, this can be misused to define methods for existing ("base") classes in case they are handled by the default method. OTOH, if base/stats/... already *had* a 'factor' method for is.numeric(), be it S3 or S4, no harm would have been done by the bad user-defined is.numeric.factor definition, thanks to the namespace technology.

To get full protection here, we would have to store "the dispatch table for all base classes" (a pretty vague notion) with the package at package build time or install time ("load time" is too late: the bad is.numeric.factor() could already be present at package load time).

I'm not sure this would be easily feasible.... but it may be something to envisage for R 4.0.0 ..

Martin

    > Cheers,
    > Bert

    > Bert Gunter

    > "Data is not information. Information is not knowledge. And knowledge
    > is certainly not wisdom."
    > -- Clifford Stoll

    > On Mon, Nov 16, 2015 at 7:42 AM, Duncan Murdoch
    > <murdoch.duncan at gmail.com> wrote:
    >> On 16/11/2015 10:22 AM, Bert Gunter wrote:
    >>>
    >>> There is no multiple dispatch; just multiple misunderstanding.
    >>>
    >>> The generic function is "is.numeric". Your method for factors is
    >>> "is.numeric.factor".
    >>>
    >>> You need to re-study.
    >>
    >> I think the problem is with S3. "is.numeric.factor" could be a
    >> "numeric.factor" method for the "is" generic, or a "factor" method for the
    >> "is.numeric" generic. Using names with dots is a bad idea. This would
    >> all be simpler and less ambiguous if the class had been named
    >> "numeric_factor" or "numericFactor" or anything without a dot.
    >>
    >> Duncan Murdoch
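The dispatch behaviour Martin describes can be reproduced in a few lines at the top level of an R session. This is only a sketch: the body of is.numeric.factor below is a hypothetical stand-in for the poster's helper (the original function is not shown in the thread), and the variable names are illustrative.

```r
f <- factor(c("1", "2", "3"))

# Before any user definition exists, the internal default method applies.
before <- is.numeric(f)

# Hypothetical user helper (assumed body). Its name has the form
# <generic>.<class>, so S3 dispatch treats it as the "factor" method
# of the internal generic is.numeric().
is.numeric.factor <- function(x) {
  !anyNA(suppressWarnings(as.numeric(levels(x))))
}

# Both calls now reach the user function; qualifying the generic with
# base:: does not change which method is found.
after_plain     <- is.numeric(f)
after_qualified <- base::is.numeric(f)

print(c(before = before, plain = after_plain, qualified = after_qualified))
```

Comparing `before` with the two `after_*` values shows the effect: the answer for a factor changes once the helper is defined, with or without the base:: prefix, which is the change package code such as lattice then sees.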
Bert Gunter
2015-Nov-16 17:27 UTC
[R] Why does a custom function called is.numeric.factor break lattice?
Thanks, Martin. You have clearly stated the issue that concerned me. I am sorry that it cannot be (easily) resolved.

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
   -- Clifford Stoll
sbihorel
2015-Nov-16 17:35 UTC
[R] Why does a custom function called is.numeric.factor break lattice?
Hi,

Thanks everyone for all your insights...

I feel that the discussion is getting way deeper and more technical than it needs to be from the point of view of what I was trying to achieve with my little "is.numeric.factor" function (i.e., checking if an object is a factor and if all levels of this factor can be coerced to numeric values).

I guess that, as Duncan pointed out, using dots in function names is bad practice for functions starting with "is". I'll rename my function, that's it.

-- 
Sebastien Bihorel
Cognigen Corporation
(t) +1 716 633 3463 ext 323
Cognigen Corporation, a wholly owned subsidiary of Simulations Plus, Inc.
David Winsemius
2015-Nov-16 17:59 UTC
[R] Why does a custom function called is.numeric.factor break lattice?
> On Nov 16, 2015, at 9:35 AM, sbihorel <Sebastien.Bihorel at cognigencorp.com> wrote:
>
> Hi,
>
> Thanks everyone for all your insights...
>
> I feel that the discussion is getting way deeper and more technical than it needs to be from the point of view of what I was trying to achieve with my little "is.numeric.factor" function (i.e., checking if an object is a factor and if all levels of this factor can be coerced to numeric values).

You seem to be asking for a compound test: first with is.factor, then to see whether all the levels could be coerced to numeric properly. I would think that you would need something like:

    if( is.factor(varname) ) {
        !sum(is.na(as.numeric(as.character(varname))))
    } else {
        FALSE
    }

?

David.

> I guess that, as Duncan pointed out, using dots in function names is bad practice for functions starting with "is". I'll rename my function, that's it.
>
> --
> Sebastien Bihorel
> Cognigen Corporation
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
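The compound test suggested above could be wrapped in a renamed helper along the following lines. This is a sketch under assumptions, not the poster's final code: the name is_numeric_factor is hypothetical (any dot-free name would do), and the check is applied to the levels rather than to the values, so missing entries in the factor do not count against it and the coercion warning is suppressed.

```r
# Hypothetical helper; the dot-free name keeps it from being picked up
# as an S3 method for is.numeric().
is_numeric_factor <- function(x) {
  if (!is.factor(x)) {
    return(FALSE)
  }
  # Coerce the levels (not the individual values) and silence the warning
  # that as.numeric() emits for strings it cannot convert.
  converted <- suppressWarnings(as.numeric(levels(x)))
  !anyNA(converted)
}

ok      <- is_numeric_factor(factor(c("1", "2.5", "10")))  # all levels numeric
not_ok  <- is_numeric_factor(factor(c("1", "two", "3")))   # "two" is not
not_fac <- is_numeric_factor(c(1, 2, 3))                   # not a factor
with_na <- is_numeric_factor(factor(c("1", NA, "3")))      # NA is not a level
```

Whether a factor containing NA entries should pass is a design choice; testing the values via as.character(), as in the expression quoted above, would reject it instead.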