Hin-Tak Leung
2007-Oct-07 03:56 UTC
[Rd] R 2.6.0 S4 data breakage, R_data_class(), class<-, etc.
Hi,

(Somebody will probably yell at me for not checking 2.6.0rc, for which I can only apologize...)

Our R package (snpMatrix in http://www-gene.cimr.cam.ac.uk/clayton/software/) is broken rather badly in 2.6.0. I have fixed most of it now, so a new release is imminent, but I'd like to mention a few things, mostly to summarize my experience, in the hope that the 'Writing R Extensions' document can be updated to reflect some of this...

1) We created and bundled some data in the past, in the 2.2 to 2.5 time frame (well, 18 months in reality); most of them trigger the warning 'pre-2.4.0 S4 objects detected... consider recreating...'
   a) I could fix all of them with just 'a <- asS4(a)' and save() (they are relatively simple objects that are just missing the S4 object bit flag).
   b) I am surprised that one of them was actually saved from 2.5; our buggy code no doubt, see below.

We never noticed until this week that we called neither SET_S4_OBJECT() in our C code nor asS4() in our R code. Obviously we were mistakenly relying on S4 method dispatch on S3 objects, which was withdrawn in 2.6.0...

2) I am surprised that 'class(a)' can read S4 class names, but 'class(a)<-' does not set the S4 object bit. I suppose the correct way would be to use new(...)? This needs to be written down somewhere... The asymmetry is somewhat surprising, though.

3) We have some C code which branches depending on the S4 class. The R extension doc didn't explain that one needs to use R_data_class(), rather than going through the class attribute directly (classgets() or 'getAttrib(x, R_ClassSymbol)'), to retrieve S4 classes. Furthermore, R_data_class() is not part of the public API, and I only found it by looking at the C code of 'class()' (do_class()). But R_data_class() is part of the exposed binary interface and the methods package certainly uses it; isn't it time to make it part of the public API? In any case, I think a way of retrieving the S4 class in C is needed.

4) The documentation is missing a fair part. Specifically, I need to be able to read and write the S4 class attribute, so R_data_class() needs to be documented and exposed as part of the public API (and included in Rinternals.h). And what is the recommended way of making an S4 object in C? I found that classgets() + SET_S4_OBJECT() seems to work, but I'd like an authoritative answer...

5) I am finding the 'class()<-' + asS4() combination in R, and the classgets() + SET_S4_OBJECT() combination in C, a bit awkward. Is there any reason why class<- or classgets() (or a more 'correct' API for S4, if there is one) cannot automatically set the S4 bit if the name is a known S4 class?

Thanks for reading so far...

Hin-Tak
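The asymmetry in points 1a, 2 and 5 can be sketched at the R level. This is a minimal illustration, not code from snpMatrix: the class name "Demo" is hypothetical, and the result of isS4() after 'class<-' is what the post reports for R 2.6.0, so it is printed rather than asserted and may differ in other versions.

```r
library(methods)

# Hypothetical class, used only for illustration (not from snpMatrix).
setClass("Demo", contains = "numeric")

# Route via new(): the object is created with the S4 bit set.
a <- new("Demo", c(1, 2, 3))
isS4(a)

# Route described in the post: attach the class name to a plain vector.
b <- c(1, 2, 3)
class(b) <- "Demo"
class(b)   # the class name reads back
isS4(b)    # the post reports that class<- does not set the S4 bit

# The repair used for the bundled data: set the S4 bit explicitly.
b <- asS4(b)
isS4(b)
```

Note that asS4() only sets the flag; it does not check that the object actually has the slots the class definition requires.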
John Chambers
2007-Oct-07 14:44 UTC
[Rd] R 2.6.0 S4 data breakage, R_data_class(), class<-, etc.
Most of your problems seem related to assigning an S4 class to an arbitrary object, which is a really bad idea, since it can produce invalid objects.

Objects from S4 classes are created by calling the function new(), and in principle _only_ by calling that function. Objects from one class are coerced to another by calling the function as(). Assigning a class to any old object is a very S3 idea (and not a good idea except in low-level code there, either).

At the C level there are macros for new() (NEW_OBJECT()), although the safest approach when feasible is to allocate the object in R. The general as() computation really needs to be done in R because of its special use of method dispatch; there are macros for the equivalent of the as.<type>() functions.

Perhaps some improvements to the documentation would make this clearer, although Chapter 7 and Appendix A of Programming with Data seem reasonably definite.

Thanks for sharing your notes.

John

Hin-Tak Leung wrote:
> 3) We have some C code which branches depending on the S4 class.
> The R extension doc didn't explain that one needs to use R_data_class(),
> rather than going through the class attribute directly (classgets() or
> 'getAttrib(x, R_ClassSymbol)'), to retrieve S4 classes. Furthermore,
> R_data_class() is not part of the public API, and I only found it by
> looking at the C code of 'class()' (do_class()). But R_data_class()
> is part of the exposed binary interface and the methods package certainly
> uses it; isn't it time to make it part of the public API? In any case, I
> think a way of retrieving the S4 class in C is needed.

Yes, or at the least instructions to handle the case of a NULL class attribute, but a macro would be good.
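The advice above can be sketched in R. This is an illustrative example only, assuming two hypothetical classes "Base" and "Derived" that do not appear in the thread; it shows creation through new(), coercion through as(), the validity checking that a bare class assignment skips, and the NULL class attribute mentioned in the inline reply.

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("Base", representation(id = "character"))
setClass("Derived", contains = "Base", representation(score = "numeric"))

# Create through new(): slots are checked against the class definition.
d <- new("Derived", id = "x1", score = 0.5)
validObject(d)   # raises an error if the object is invalid

# Coerce through as() rather than by reassigning the class.
base_obj <- as(d, "Base")
is(base_obj, "Base")

# new() refuses a slot of the wrong type, so no invalid object is created.
rejected <- tryCatch(new("Base", id = 42),
                     error = function(e) "rejected")

# Objects without an explicit class have a NULL class attribute,
# while class() still reports an implicit class.
x <- 1:3
attr(x, "class")
class(x)
```

The last two lines show why reading the class attribute directly is not enough for code that branches on class: for a plain integer vector the attribute is NULL even though class() returns "integer".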