Anthony Rossini
2002-Mar-19 17:43 UTC
[Rd] Question re:S4 classes and design; clashing classes?
Thanks to Duncan's recent work on SJava, I've got a (so far) stable platform to put together the R-Orca interface (dynamic graphics, including brushing, subsetting, etc). While the first version will be looking at installation issues and studying basic user interfaces for providing a few working examples, it will be important for this particular package to look at how one integrates a "toolkit" at the user-level with other R libraries.

The obvious place (to me, at least initially) to look is at how the S4 classes might be used for this.

However, I'm having a heck of a time wrapping my head around the proper design for this.

Here is the general issue:

For "normal data objects" with standard data views (i.e. class of grand tours, etc), I can simply setMethod on the standard data views, i.e.

    setMethod("grandtour", "data.frame", ...)  # standard unstructured grand tour
    setMethod("grandtour", "Surv", ...)        # tour, restricting temporal data,
                                               # handling survival

etc. I'm fine on this, and have an acceptable sense of what I'd like to do here. Note that here, I'm talking about visualizations that currently exist, not ones that might be written for such data. This is in fact the first step beyond the package I'm currently finishing up.

For the third version (beyond the initial package and beyond the initial methods version of the package), I've got a second class of objects, i.e. the data views (possibly customized themselves), which I'd like to think about. In particular, I'd like to be able to extend/subclass "grandtour", for example, to regular ("nice") longitudinal and spatio-temporal data, i.e. classify the methods, not just the data being acted on by the methods.

I think I'm missing something really obvious here, and I suspect it is related to using too many object-oriented systems (I'm more used to Python's, right now, for example; but even with Java, there is a sense of balance between methods and data within a class).
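A minimal sketch of the two per-class registrations described above, under stated assumptions: the generic's argument list, the method bodies, and the `SurvData` class are hypothetical stand-ins (the real `Surv` class comes from the survival package and would need `setOldClass("Surv")` before S4 dispatch on it). R registers data frames under the class name `data.frame`, which is what the sketch dispatches on.

```r
library(methods)

# Hypothetical stand-in for a survival-data class.
setClass("SurvData", representation(time = "numeric", status = "numeric"))

# One generic for the view; the argument name `data` is an assumption.
setGeneric("grandtour", function(data, ...) standardGeneric("grandtour"))

# Standard unstructured grand tour (placeholder body).
setMethod("grandtour", "data.frame", function(data, ...) {
  "unstructured tour"
})

# Tour that treats the time variable specially (placeholder body).
setMethod("grandtour", "SurvData", function(data, ...) {
  "tour with time held fixed"
})

df_result   <- grandtour(data.frame(x = 1:3))
surv_result <- grandtour(new("SurvData", time = c(1, 2), status = c(1, 0)))
```

Here the choice of method depends only on the class of the data argument; the view itself ("grandtour") is a function, not a class.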
Thinking about it more, I guess I'm not clear on the path or approach that S4 classes use for inheritance and for contractual negotiation.

Here's the example that I'm using as a basis for thinking about design:

I've got a grandtour object which has slots for Data, Colors (a vector used for subsetting from brushes -- one can color points in orca or from R, this being the first approach for it), as well as Guidance (at this point, merely a string, but it will be classed for robustness and extensibility at some point).

At the first step, this is for a single sample. Now, I'd like to extend this to problems where one variable is fixed (e.g. TIME, in the longitudinal or time-series problem, or GROUP, in the correlated data setting). So we've got, in a sense, the multiple inheritance problem. I'd like to be able to inherit from both the Orca object as well as the data library object.

In other OO systems, I'd subclass grandtour, probably using multiple inheritance, and modify only the methods which change (or to paraphrase the Java approach, use interfaces for the orca toolkit, and classes for the data libraries). For example, I might want to try a different visualization for survival data, and only want to change the display in some way (not the brushing/colors or guidance methods).

I'd like to be able to "rerun" numerical analyses on potentially interesting subsets, or possibly even predict from current fit models, and have the required objects be part of the orca object.

Orca objects could have associated methods: dumpPipeline, buildPipeline, brush/selectData, getData, setData, addPipe, rmPipe. In particular, the Data methods will need to use S4 data objects.

Do I simply extend the data classes to add slots for visualization and the requisite parameters and subclass from existing R libraries? (i.e. add slots for both parameters and dispatched functions).
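One way the design described above could be written down in S4, as a sketch only. The slot names Data, Colors and Guidance come from the message; the class names `GrandTour`, `Longitudinal` and `LongitudinalTour`, the `timeVar` slot, and the generics `display` and `brush` are hypothetical. It shows multiple inheritance through `contains` and overriding a single method while the rest are inherited.

```r
library(methods)

# Toolkit-side class with the three slots named in the message.
setClass("GrandTour",
         representation(Data = "data.frame",
                        Colors = "character",
                        Guidance = "character"))

# Data-side class (hypothetical): records which variable is held fixed.
setClass("Longitudinal", representation(timeVar = "character"))

# Multiple inheritance: the new class gets the slots of both parents.
setClass("LongitudinalTour", contains = c("GrandTour", "Longitudinal"))

setGeneric("display", function(view, ...) standardGeneric("display"))
setGeneric("brush", function(view, ...) standardGeneric("brush"))

setMethod("display", "GrandTour", function(view, ...) "plain tour")
setMethod("brush", "GrandTour", function(view, ...) length(view@Colors))

# Override only the display; brushing is inherited unchanged.
setMethod("display", "LongitudinalTour", function(view, ...) {
  paste("tour with", view@timeVar, "fixed")
})

lt <- new("LongitudinalTour",
          Data = data.frame(TIME = 1:2, y = c(0.5, 1.5)),
          Colors = c("red", "blue"),
          Guidance = "random",
          timeVar = "TIME")

shown   <- display(lt)  # uses the LongitudinalTour method
brushed <- brush(lt)    # falls back to the GrandTour method
```

Whether this is "faithful" S4 design is exactly the open question in the message; the sketch only shows that the mechanics are available.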
Is there a better way to approach this problem (which should be phrased as: "I've got more than one set of initial classes to build from; what might be the preferred approach to extend"?)

I think I'm missing something from reading John's Green book. If anyone has pointers to pages/sections I need to read much more carefully, I'd be grateful to hear them!

(note: I know a number of approaches for how I can implement it, but I don't have a good enough sense of S4 class design intentions to pick any of them as being "reasonably faithful" to good S4 design).

best,
-tony

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
John Chambers
2002-Mar-20 15:21 UTC
[Rd] Question re:S4 classes and design; clashing classes?
It wasn't entirely clear what your design goals were, but the general question of comparing functional method languages (S, CLOS, Dylan) and OOP languages (Java, etc) is of much interest to me, so here are some general comments. Maybe we can then iterate on your example, with or without r-devel.

The basic difference in approach is easy to state: Functional languages define methods based on the function and a signature (a list matching classes to formal arguments); OOP languages attach all methods to a class definition.

The distinction becomes relevant if a function does "multiple dispatch"; that is, defining methods that depend on more than one argument. (In that sense, S3 methods made no use of the functional nature of the language.)

If, say, f(x, y) only has methods that depend on x, then it's just syntactic sugar, and a matter of organization, that distinguishes the functional form f(x, y) from an OOP-style form, say x$f(y). (That's a slight exaggeration, because the functional language model thinks that "f" should mean something similar regardless of the class of x, where in OOP you could define the method in a totally inconsistent way.)

In that sense, I read your description as essentially OOP. You talk of "Orca objects having methods", but in a functional language the question is how the functions that do the computation depend on the objects that are their arguments.

Nothing wrong with the OOP view, but it means that your question goes more to how, and if, the underlying R code uses functions. If it's just basically an interface to an OOP definition, then it can stay in that form. (And I'd like to discuss in separate mail using the Omegahat OOP package to formalize the R side.)

Functional methods would become relevant if the R software involved multiple objects. Not knowing the innards of how things work, let me make up an example.
If the current view included, say, an object describing a wireframe that was displayed around the points, then one might have useful classes of wireframe objects, and then a function that stepped the display to the next view could have methods that depended on both the current data and the wireframe:

    step(data, wireframe)

(Definitely not asserting this is a relevant example, but perhaps the general distinction is clear.)

John

-- 
John M. Chambers                  jmc@bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-2681
700 Mountain Avenue, Room 2C-282  fax:    (908)582-3340
Murray Hill, NJ  07974            web: http://www.cs.bell-labs.com/~jmc
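The made-up `step(data, wireframe)` example in the reply can be sketched with S4 multiple dispatch as follows. Everything here is an assumption for illustration: the class names, their slots, and the method bodies are hypothetical, and the generic is called `stepView` rather than `step` so that it does not mask `stats::step`.

```r
library(methods)

# Hypothetical data and wireframe classes.
setClass("PointData", representation(n = "numeric"))
setClass("Wireframe", representation("VIRTUAL"))
setClass("BoxFrame",  representation(size = "numeric"),   contains = "Wireframe")
setClass("HullFrame", representation(points = "numeric"), contains = "Wireframe")

# The generic belongs to neither class; it dispatches on both arguments.
setGeneric("stepView",
           function(data, wireframe) standardGeneric("stepView"))

# Default behaviour for any wireframe around point data.
setMethod("stepView", signature("PointData", "Wireframe"),
          function(data, wireframe) "generic step")

# Specialised behaviour for one particular pair of classes.
setMethod("stepView", signature("PointData", "HullFrame"),
          function(data, wireframe) "step and recompute hull")

p <- new("PointData", n = 10)
box_step  <- stepView(p, new("BoxFrame", size = 1))          # inherited method
hull_step <- stepView(p, new("HullFrame", points = c(1, 2))) # exact match
```

The method chosen depends on the classes of both arguments together, which has no direct equivalent in the `x$f(y)` form.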