Roger Koenker wrote:
> Suppose you have a class, say sex, for lack of a better example, and
> you are tempted, in defining the behavior of the call,
>
> is(x,"sex")
>
> to check whether certain basic features are satisfied, not to just trust
> the claim that x is specified to be of class "sex".
Well, it depends. Sometimes (in dispatching methods, e.g.) you don't
want to ask too deeply about what the object REALLY is, but would just
like to (excuse the expression in the context) "get on with it", given a
basic assertion.
That's fundamentally what is() does: it simply checks the class
inheritance structure for class(x).
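A minimal sketch of that behavior, using two hypothetical classes ("demoPerson" and "demoStudent" are made up for illustration): is() answers from the class definitions alone and never looks at slot contents.

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("demoPerson", representation = representation(name = "character"))
setClass("demoStudent",
         representation = representation(school = "character"),
         contains = "demoPerson")

s <- new("demoStudent", name = "Ada", school = "UCL")

r1 <- is(s, "demoStudent")  # TRUE: class(s) is "demoStudent"
r2 <- is(s, "demoPerson")   # TRUE: "demoStudent" extends "demoPerson"
r3 <- is(s, "numeric")      # FALSE: no inheritance relation

# Emptying a slot does not change the answer, because is() only
# consults the inheritance structure for class(s).
s@name <- character(0)
r4 <- is(s, "demoPerson")   # still TRUE
```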
The function defined to poke more deeply into the issue is
validObject():
    validObject(x)
checks as thoroughly as possible whether x is a valid object of its
class. Some checks are built in (slots of the right classes, etc.).
Others, if needed, can be incorporated in the class definition (argument
validity=) or via setValidity(). Validity checking uses inheritance, so
validity methods for contained classes will be applied.
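As a rough illustration of the two ways to supply a validity method, and of the inherited check being applied, here is a sketch with hypothetical classes "baseDemo" and "derivedDemo". The helper check() is also made up; it only turns the error from validObject() into a string.

```r
library(methods)

# Validity attached after the fact with setValidity().
setClass("baseDemo", representation = representation(id = "character"))
setValidity("baseDemo", function(object)
  if (length(object@id) == 1) TRUE else "slot \"id\" must be a single string")

# Validity supplied in the class definition (argument validity=).
setClass("derivedDemo",
         representation = representation(x = "numeric"),
         contains = "baseDemo",
         validity = function(object)
           if (all(is.finite(object@x))) TRUE else "slot \"x\" must be finite")

# Hypothetical helper: report "valid" or the validity error message.
check <- function(object)
  tryCatch({ validObject(object); "valid" },
           error = function(e) conditionMessage(e))

obj <- new("derivedDemo")     # prototype object; id is character(0)
msg1 <- check(obj)            # fails the inherited "baseDemo" check

obj@id <- "a1"
obj@x <- c(1, Inf)
msg2 <- check(obj)            # passes "baseDemo", fails "derivedDemo"

obj@x <- c(1, 2)
msg3 <- check(obj)            # both checks pass
```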
> Without delving into details
> further sanity checking of the structure of the object is sometimes prudent
> to avoid subsequent nonsense. This checking could be built into the is
> method, but the documentation for "new" suggests that one might
> alternatively want to define a method, "initialize" that was used to
> create objects of class "sex" and not let people just create such objects
> willy-nilly. My questions are these:
>
> 1. Am I correct in thinking that initialize is the right way to handle
> this?
Well, about the best there is, for a single attack. See comment to your
point 3.
>
> 2. Are there examples of the initialize strategy, beyond the one given in
> the "new" documentation?
I believe some of the packages in BioConductor use initialize methods.
Others??
>
> 3. Are there efficiency issues that one should be cautious about?
Indeed. The "obvious" strategy is: if the class has a validity method,
direct or inherited, then initialize() should invoke it. The default
initialize method does not (in either R or S-Plus).
Should that be changed? Logically, I would say yes: if the class
designer specified a validity method, it should not be possible to use
new() to create invalid objects. But there is an efficiency penalty.
At the moment, you need to build your own initialize() method. That's
not all bad--in the process you may also make the arguments to
initialize() reasonable names instead of ..., or otherwise get beyond
the notion of just supplying slot names in calls to new().
Here's a simple example (which I'll add to the initialize
documentation). The validity method requires a single string for the
"id" slot.
    setClass("a", representation(x = "numeric", id = "character"),
             validity = function(object)
                 if(length(object@id) == 1) TRUE else
                 "Expected a single string as the \"id\" slot")
and the initialize method calls validObject.
    setMethod("initialize", "a",
              function(.Object, x = numeric(), id = "<>") {
                  .Object@x <- x
                  .Object@id <- id
                  validObject(.Object)
                  .Object
              })
With that definition, you get a check with new():
    R> new("a", x=1:10, id=character())
    Error in validObject(.Object) : Invalid "a" object: Expected a single
    string as the "id" slot
[A couple of details for those interested. The default values in the
initialize() method above are important. Otherwise, simple calls such
as new("a") will fail.
Also, R (but not currently S-Plus) has a function callNextMethod() that
looks good for writing initialize methods. It often is, but there is a
current bug that requires you to supply all the arguments to
callNextMethod in this case, contrary to the documentation. With luck,
the bug will be fixed. Meanwhile, the way to use callNextMethod in
initialize() is like this:
R> setMethod("initialize", "a", function(.Object, ...) {
+ x <- callNextMethod(.Object, ...)
+ validObject(x)
+ x})
]
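To make the point about default values concrete, here is a small sketch with two hypothetical classes, "noDefaultsDemo" and "withDefaultsDemo", that differ only in whether the initialize method supplies defaults for its arguments. The exact error text is not shown because it may vary.

```r
library(methods)

# Hypothetical class whose initialize method has NO default values.
setClass("noDefaultsDemo", representation(x = "numeric", id = "character"))
setMethod("initialize", "noDefaultsDemo", function(.Object, x, id) {
  .Object@x <- x
  .Object@id <- id
  .Object
})

# A bare new() call now fails, because x and id are missing.
res1 <- tryCatch({ new("noDefaultsDemo"); "ok" },
                 error = function(e) "error")

# Same class layout, but with defaults for every argument.
setClass("withDefaultsDemo", representation(x = "numeric", id = "character"))
setMethod("initialize", "withDefaultsDemo",
          function(.Object, x = numeric(), id = "<>") {
            .Object@x <- x
            .Object@id <- id
            .Object
          })

obj <- new("withDefaultsDemo")   # works; slots take the default values
res2 <- obj@id
```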
>
> 4. Is there any way to enforce the Foucaultian imperative that any new
> "sex" object has to pass through the initialize phase?
Well, as someone might have said, it depends what you mean by "new".
The above mechanism pretty much ensures that objects will be valid when
created.
But it doesn't prevent some code from doing:
    x@id <- character(0)
So, one might define methods for "@<-" that included validity checking.
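A related idiom, swapped in here because it does not rely on methods for "@<-" itself, is a replacement function that validates after assigning. The sketch below assumes a hypothetical class "recDemo" and a hypothetical generic "id<-"; it only guards code that goes through id(obj) <- value, not direct use of "@<-".

```r
library(methods)

# Hypothetical class with the same validity rule as class "a" above.
setClass("recDemo", representation(x = "numeric", id = "character"),
         validity = function(object)
           if (length(object@id) == 1) TRUE else
             "Expected a single string as the \"id\" slot")

# Hypothetical replacement generic; the method validates after assigning.
setGeneric("id<-", function(object, value) standardGeneric("id<-"))
setReplaceMethod("id", "recDemo", function(object, value) {
  object@id <- value
  validObject(object)   # signals an error if the new value is invalid
  object
})

r <- new("recDemo", x = c(1, 2, 3), id = "a")
id(r) <- "b"            # valid, so r is updated

# An invalid value is rejected and r keeps its previous id.
bad <- tryCatch({ id(r) <- character(0); "no error" },
                error = function(e) conditionMessage(e))
```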
Sometimes, though, (as in checking that two slots have the same length,
e.g.) it may take some care not to create invalid objects temporarily:
the discipline of always creating the objects through a call to new()
will usually work, but once again with some slight efficiency penalty.
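A sketch of the two-slot case, assuming a hypothetical class "pairDemo" whose validity method requires equal lengths. validObject() is called explicitly so the example does not depend on whether the default initialize method validates.

```r
library(methods)

# Hypothetical class: slots x and y must have the same length.
setClass("pairDemo", representation(x = "numeric", y = "numeric"),
         validity = function(object)
           if (length(object@x) == length(object@y)) TRUE else
             "slots \"x\" and \"y\" must have the same length")

p <- new("pairDemo", x = c(1, 2, 3), y = c(4, 5, 6))

# Changing the slots one at a time passes through an invalid state:
p2 <- p
p2@x <- as.numeric(1:5)          # lengths now differ (5 vs 3)
interim <- tryCatch({ validObject(p2); "valid" },
                    error = function(e) conditionMessage(e))

# Building the replacement in one call to new() avoids that state.
p3 <- new("pairDemo", x = as.numeric(1:5), y = as.numeric(6:10))
final <- tryCatch({ validObject(p3); "valid" },
                  error = function(e) conditionMessage(e))
```

The cost is the one mentioned above: each update builds and validates a whole new object.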
The whole area of valid objects is one that all of us interested folks
should discuss.
It would be nice to have some more "real" examples.
John
>
> Oh, for those simple days of yesteryore when the pecadillos of the
> president caused no harm, and the Dow was over 10,000...
>
> Apologies in advance if this moment of sexistential doubt offends anyone.
>
> url: www.econ.uiuc.edu     Roger Koenker            Dept. of Economics UCL,
> email rkoenker@uiuc.edu    Department of Economics  Drayton House,
> vox: 217-333-4558          University of Illinois   30 Gorden St,
> fax: 217-244-6678          Champaign, IL 61820      London,WC1H 0AX, UK
>                                                     vox: 020-7679-5838
>
> ______________________________________________
> R-devel@stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
--
John M. Chambers jmc@bell-labs.com
Bell Labs, Lucent Technologies office: (908)582-2681
700 Mountain Avenue, Room 2C-282 fax: (908)582-3340
Murray Hill, NJ 07974 web: http://www.cs.bell-labs.com/~jmc