Hi,

(somebody would probably yell at me for not checking 2.6.0rc, for which I can only apologize...)

Our R package (snpMatrix in http://www-gene.cimr.cam.ac.uk/clayton/software/) is broken rather badly in 2.6.0; I have fixed most of it now so a new release is imminent, but I'd like to mention a few things, mostly to summarize my experience, and hopefully the 'Writing R Extensions' document can be updated to reflect some of this...

1) We created and bundled some data in the past in the 2.2 to 2.5 time frame (well, 18 months in reality); most of them trigger a warning 'pre-2.4.0 S4 objects detected... consider recreating...'

   a) I could fix all of them with just 'a <- asS4(a)' and save() (they are relatively simple objects just missing the S4 object bit flag).

   b) I am surprised one of them was actually saved from 2.5 - our buggy code no doubt, see below. We never noticed we didn't do SET_S4_OBJECT() in our C code nor asS4() in our R code until this week. Obviously we were mistakenly relying on the S4 method dispatch on S3 objects, which was withdrawn in 2.6.0...

2) I am surprised that 'class(a)' can read S4 class names, but 'class(a)<-' does not set the S4 object bit. I suppose the correct way would be to do new(...)? This needs to be written down somewhere... The asymmetry is somewhat surprising though.

3) We have some C code which branches depending on the S4 class. The R extension doc didn't explain that one needs to do R_data_class() rather than classgets() (or 'getAttrib(x, R_ClassSymbol)') to retrieve S4 classes; furthermore, R_data_class() is not part of the public API, and I only found it by looking at the C code of 'class()' (do_class()). But R_data_class() is part of the exposed binary interface and the methods package certainly uses it; isn't it time to make it part of the public API? In any case, I think a way of retrieving the S4 class in C is needed.

4) The documentation is missing a fair part - specifically, I need to be able to read and write the S4 class attribute... so R_data_class() needs to be documented and exposed as part of the public API (and included in the Rinternals.h include), and what is the recommended way of making an S4 object in C? I found that classgets() + SET_S4_OBJECT() seems to work, but I'd like an authoritative answer...

5) I am finding the 'class()<-' + asS4() combo in R and the classgets() + SET_S4_OBJECT() combo in C a bit awkward. Is there any reason why class<- or classgets() (or a more 'correct' API for S4, if there is one) cannot automatically set the S4 bit if the name is a known S4 class?

Thanks for reading so far...

Hin-Tak
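As a rough illustration of the repair described in item 1a, the sketch below sets the S4 bit on an object whose class attribute was assigned by hand, then re-saves it. The object `legacy`, its contents, and the temporary file are made up for this example; only asS4(), isS4(), save() and load() are taken as given.

```r
library(methods)

# Stand-in for the package's class; the real snpMatrix definitions differ.
setClass("snp", contains = "raw")

# Simulate a legacy object: the class attribute is set by hand, not via new().
legacy <- as.raw(c(1, 2, 3))
attr(legacy, "class") <- "snp"

# Whether the S4 bit is already set at this point may depend on the R version,
# so it is only printed here, not relied upon.
print(isS4(legacy))

# The repair from item 1a: set the S4 bit explicitly, then re-save.
fixed <- asS4(legacy)
print(isS4(fixed))

f <- tempfile(fileext = ".RData")
save(fixed, file = f)

# Reload into a separate environment to check that the flag survived.
e <- new.env()
load(f, envir = e)
print(isS4(e$fixed))
```

This only flips the flag; it does not check that the object is a valid instance of the class, which is the concern raised in the replies.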
R 2.6.0 S4 data breakage, R_data_class(), class<-, etc.
5 messages · Hin-Tak Leung, John Chambers, Martin Morgan +1 more
Most of your problems seem related to assigning an S4 class to an arbitrary object--a really bad idea, since it can produce invalid objects.

Objects from S4 classes are created by calling the function new(), and in principle _only_ by calling that function. Objects from one class are coerced to another by calling the function as(). Assigning a class to any old object is a very S3 idea (and not a good idea except in low-level code there, either).

At the C level there are macros for new() (R recommends NEW_OBJECT()), although the safest approach when feasible is to allocate the object in R. The general as() computation really needs to be done in R because of its special use of method dispatch; there are macros for the equivalent of the as.<type>() functions.

Perhaps some improvements to the documentation would make this clearer, although Chapter 7 and Appendix A of Programming with Data seem reasonably definite.

Thanks for sharing your notes.

John
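A minimal sketch of the two routes recommended above, creating with new() and coercing with as(), might look like the following. The class here is a simplified stand-in named "snp" for illustration; it is not the actual snpMatrix class definition.

```r
library(methods)

# Simplified stand-in class extending the basic type "raw".
setClass("snp", contains = "raw")

# Create an instance with new(); the unnamed argument supplies the data part.
a <- new("snp", as.raw(c(1, 2, 3)))

# Coerce an existing raw vector with as(), rather than class(x) <- "snp".
b <- as(as.raw(c(4, 5)), "snp")

# Both routes yield objects that are flagged as S4 instances of the class.
print(isS4(a))
print(isS4(b))
print(is(b, "snp"))
```

Neither call needs a separate asS4() step, which is the practical difference from assigning the class attribute directly.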
Hin-Tak Leung wrote:
Hi,

(somebody would probably yell at me for not checking 2.6.0rc, for which I can only apologize...)

Our R package (snpMatrix in http://www-gene.cimr.cam.ac.uk/clayton/software/) is broken rather badly in 2.6.0; I have fixed most of it now so a new release is imminent, but I'd like to mention a few things, mostly to summarize my experience, and hopefully the 'Writing R Extensions' document can be updated to reflect some of this...

1) We created and bundled some data in the past in the 2.2 to 2.5 time frame (well, 18 months in reality); most of them trigger a warning 'pre-2.4.0 S4 objects detected... consider recreating...'

   a) I could fix all of them with just 'a <- asS4(a)' and save() (they are relatively simple objects just missing the S4 object bit flag).

   b) I am surprised one of them was actually saved from 2.5 - our buggy code no doubt, see below. We never noticed we didn't do SET_S4_OBJECT() in our C code nor asS4() in our R code until this week. Obviously we were mistakenly relying on the S4 method dispatch on S3 objects, which was withdrawn in 2.6.0...

2) I am surprised that 'class(a)' can read S4 class names, but 'class(a)<-' does not set the S4 object bit. I suppose the correct way would be to do new(...)? This needs to be written down somewhere... The asymmetry is somewhat surprising though.

3) We have some C code which branches depending on the S4 class. The R extension doc didn't explain that one needs to do R_data_class() rather than classgets() (or 'getAttrib(x, R_ClassSymbol)') to retrieve S4 classes; furthermore, R_data_class() is not part of the public API, and I only found it by looking at the C code of 'class()' (do_class()). But R_data_class() is part of the exposed binary interface and the methods package certainly uses it; isn't it time to make it part of the public API? In any case, I think a way of retrieving the S4 class in C is needed.
Yes, or at the least instructions to handle the case of a NULL class attribute, but a macro would be good.
4) The documentation is missing a fair part - specifically, I need to be able to read and write the S4 class attribute... so R_data_class() needs to be documented and exposed as part of the public API (and included in the Rinternals.h include), and what is the recommended way of making an S4 object in C? I found that classgets() + SET_S4_OBJECT() seems to work, but I'd like an authoritative answer...

5) I am finding the 'class()<-' + asS4() combo in R and the classgets() + SET_S4_OBJECT() combo in C a bit awkward. Is there any reason why class<- or classgets() (or a more 'correct' API for S4, if there is one) cannot automatically set the S4 bit if the name is a known S4 class?

Thanks for reading so far...

Hin-Tak
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
John Chambers <jmc at r-project.org> writes:
Most of your problems seem related to assigning an S4 class to an arbitrary object--a really bad idea, since it can produce invalid objects. Objects from S4 classes are created by calling the function new(), and in principle _only_ by calling that function. Objects from one class are coerced to another by calling the function as().
But both 'new' and 'as' appear to produce invalid (in a different sense, I guess) objects:
setClass("snp", contains="raw",
+ validity=function(object) {
+ if (length(object) < 1) "too short"
+ else TRUE
+ })
[1] "snp"
new("snp")
An object of class "snp" raw(0)
as(raw(), "snp")
An object of class "snp" raw(0)
new("snp", raw())
Error in validObject(.Object) : invalid class "snp" object: too short

Conversely, I think the S4 implementation implicitly requires that 'new' with a single argument (i.e., class name) return a valid object -- see https://stat.ethz.ch/pipermail/bioc-devel/2007-September/001323.html

Also, coercing a genome's worth of 'raw' SNPs to 'snp' appears to be more memory efficient than creating a new 'snp' (even with an explicit validity check):

x <- raw(1)
tracemem(x)
[1] "<0x1034e28>"
y <- as(x, "snp")
tracemem[0x1034e28 -> 0x7b67e8]: .mergeAttrs setDataPart .Call slot<- @<- asMethod as<- asMethod as
y <- new("snp", x)
tracemem[0x1034e28 -> 0x7bed28]: initialize initialize new
tracemem[0x7bed28 -> 0x8fd968]: .mergeAttrs setDataPart .Call slot<- @<- asMethod as<- initialize initialize new
tracemem[0x8fd968 -> 0x906518]: switch getDataPart .Call slot validObject initialize initialize new
validObject(y <- as(x, "snp"))
tracemem[0x1034e28 -> 0xa1bfc8]: .mergeAttrs setDataPart .Call slot<- @<- asMethod as<- asMethod as validObject
tracemem[0xa1bfc8 -> 0x9e6dd8]: switch getDataPart .Call slot validObject
TRUE

Martin
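The "coerce, then validate explicitly" pattern from the last command can be wrapped in a small helper. This is only a sketch: `as_checked()` is a hypothetical name, not part of the methods package, and it makes no claim about how many copies are made.

```r
library(methods)

# Same class definition as in the session above.
setClass("snp", contains = "raw",
         validity = function(object) {
             if (length(object) < 1) "too short" else TRUE
         })

# Hypothetical helper: coerce with as(), then run the validity method
# explicitly, since the session above shows as() accepting raw(0).
as_checked <- function(x, Class) {
    obj <- as(x, Class)
    validObject(obj)   # signals an error if the object is invalid
    obj
}

# A one-element raw vector passes the check.
ok <- as_checked(as.raw(1), "snp")

# An empty raw vector is rejected; capture the message instead of stopping.
bad <- tryCatch(as_checked(raw(), "snp"),
                error = function(e) conditionMessage(e))
print(bad)
```

The helper returns the coerced object only when validObject() does not signal an error, so invalid input fails at the call site rather than later.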
Martin Morgan Computational Biology Shared Resource Director Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M2 B169 Phone: (208) 667-2793
Martin Morgan wrote:
But both 'new' and 'as' appear to produce invalid (in a different sense, I guess) objects:
setClass("snp", contains="raw",
+ validity=function(object) {
+ if (length(object) < 1) "too short"
+ else TRUE
+ })
Well, you _have_ designed a class with an invalid prototype (as determined by your own validity function). :-)
Bjørn-Helge Mevik
bhs2 at mevik.net (Bjørn-Helge Mevik) writes:
Martin Morgan wrote:
But both 'new' and 'as' appear to produce invalid (in a different sense, I guess) objects:
setClass("snp", contains="raw",
+ validity=function(object) {
+ if (length(object) < 1) "too short"
+ else TRUE
+ })
Well, you _have_ designed a class with an invalid prototype (as determined by your own validity function). :-)
Yeah, it's true I did, but the software let me get away with it. Even with a valid prototype I can as(raw(), "snp"). Martin
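For reference, one way to give the class a prototype that satisfies its own validity function is sketched below. The one-byte prototype value is an arbitrary choice for this example, and the last step repeats the point that an empty raw vector coerced with as() still fails an explicit check.

```r
library(methods)

# Same validity rule as above, plus a one-element prototype (arbitrary value)
# so that new("snp") with no further arguments is valid by the class's own rule.
setClass("snp", contains = "raw",
         prototype = prototype(as.raw(0)),
         validity = function(object) {
             if (length(object) < 1) "too short" else TRUE
         })

p <- new("snp")
validObject(p)        # no error: the prototype has length 1
print(length(p))

# An empty raw vector coerced to "snp" is still rejected once validity
# is checked explicitly; capture the message rather than stopping.
msg <- tryCatch({
    validObject(as(raw(), "snp"))
    "valid"
}, error = function(e) conditionMessage(e))
print(msg)
```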
-- Bjørn-Helge Mevik