Hello, I?m experimenting with ALTREP and was wondering if there is a preferred way to create an ALTREP wrapper vector without using .Internal(wrap_meta(?)), which R CMD check doesn?t like since it uses an .Internal() function. I was trying to create a factor that used an ALTREP integer, but attempting to set the class and levels attributes always ended up duplicating and materializing the integer vector. Using the wrapper avoided this issue. Here is my initial ALTREP integer vector:> fc0 <- factor(c("a", "a", "b")) > > y <- matter::as.matter(as.integer(fc0)) > y <- matter:::as.altrep(y) > > .Internal(inspect(y))@7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0) Here is what I get without a wrapper:> fc1 <- structure(y, class="factor", levels=levels(x)) > .Internal(inspect(fc1))@7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2 ATTRIB: @7fb0ce771868 02 LISTSXP g0c0 [] TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value) @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0) @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor" TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value) @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0) @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a" @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b" Here is what I get with a wrapper:> fc2 <- structure(.Internal(wrap_meta(y, 0, 0)), class="factor", levels=levels(x)) > .Internal(inspect(fc2))@7fb0ce764630 13 INTSXP g0c0 [OBJ,NAM(2),ATT] wrapper [srt=0,no_na=0] @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0) ATTRIB: @7fb0ce764668 02 LISTSXP g0c0 [] TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value) @7fb0c9fcb010 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0) @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor" TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value) @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0) @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a" @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b" Is there a way to do this that doesn?t rely on .Internal() and won?t produce R CMD check warnings? ~~~ Kylie Ariel Bemis Khoury College of Computer Sciences Northeastern University kuwisdelu.github.io<https://kuwisdelu.github.io> [[alternative HTML version deleted]]
Hi Kylie, For your question, I don't think a wrapper can completely solve your problem. The duplication occurs since your variable y has more than 1 reference number( Please see highlighted), so even you have a wrapper, any changes on the value of the wrapper still can trigger the duplication.> .Internal(inspect(y)) > @7fb0ce78c0f0 13 INTSXP g0c0 *[NAM(7)]* matter vector (mode=3, len=3, > mem=0)My guess is that *matter:::as.altrep* function assigned the variable *y* to a local variable so that it increases the reference number. For example: *This would not cause a duplication*> > a=c(1,2,3) > > .Internal(inspect(a)) > @0x000000002384f530 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3 > > attr(a,"dim")=c(1,3) > > .Internal(inspect(a)) > @0x000000002384f530 14 REALSXP g0c3 [NAM(1),ATT] (len=3, tl=0) 1,2,3 > ATTRIB: > @0x0000000023864b58 02 LISTSXP g0c0 [] > TAG: @0x00000000044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim" > (has value) > @0x000000002384cb48 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 1,3 >*This would cause a duplication, even though the function test does nothing.*> > test<-function(x) x1=x > > a=c(1,2,3) > > .Internal(inspect(a)) > @0x000000002384f260 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3 > > test(a) > > .Internal(inspect(a)) > @0x000000002384f260 14 REALSXP g0c3 [NAM(7)] (len=3, tl=0) 1,2,3 > > attr(a,"dim")=c(1,3) > > .Internal(inspect(a)) > @0x000000002384f0d0 14 REALSXP g0c3 [NAM(1),ATT] (len=3, tl=0) 1,2,3 > ATTRIB: > @0x00000000238666c0 02 LISTSXP g0c0 [] > TAG: @0x00000000044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim" > (has value) > @0x000000002384c6e8 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 1,3 >If that is the case and you are 100% sure the reference number should be 1 for your variable *y*, my solution is to call *SET_NAMED *in C++ to reset the reference number. Note that you need to unbind your local variable before you reset the number. To return an unbound SEXP, the C++ function should be placed at the end of your *matter:::as.altrep *function. I don't know if there is any simpler way to do that and I'll be happy to see any opinion. Also, I notice that you are using ALTREP to create a wrapper for your *matter_vec *class. I'm an author of AltWrapper package and the package is able to define an ALTREP in pure R level, it is capable to add an attribute to ALTREP object when creating the object and has a correct reference number. The simplest example would be *CODE* ``` library(AltWrapper) inspectFunc <- function(x) cat("Altrep object\n") lengthFunc <- function(x) return(length(x)) getPtrFunc <- function(x, writeable) return(x) setAltClass(className = "test", classType = "real") setAltMethod(className = "test", inspect = inspectFunc) setAltMethod(className = "test", getLength = lengthFunc) setAltMethod(className = "test", getDataptr = getPtrFunc) A = runif(6) A_alt = makeAltrep(className = "test", x = A, *attributes = list(dim = c(2, 3))*) ``` *RESULT* ```> .Internal(inspect(A_alt))@0x000000002385ac00 14 REALSXP g0c0 [NAM(1),ATT] Altrep object ATTRIB: @0x000000002385a8b8 02 LISTSXP g0c0 [] TAG: @0x00000000044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim" (has value) @0x000000002384d590 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 2,3> A_alt[,1] [,2] [,3] [1,] 0.9430458 0.548670 0.4148741 [2,] 0.9550899 0.251857 0.6077540 ``` I will be happy to talk more about it if you are interested in the package, it is available at https://github.com/Jiefei-Wang/AltWrapper Best, Jiefei On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie <k.bemis at northeastern.edu> wrote:> Hello, > > I?m experimenting with ALTREP and was wondering if there is a preferred > way to create an ALTREP wrapper vector without using > .Internal(wrap_meta(?)), which R CMD check doesn?t like since it uses an > .Internal() function. > > I was trying to create a factor that used an ALTREP integer, but > attempting to set the class and levels attributes always ended up > duplicating and materializing the integer vector. Using the wrapper avoided > this issue. > > Here is my initial ALTREP integer vector: > > > fc0 <- factor(c("a", "a", "b")) > > > > y <- matter::as.matter(as.integer(fc0)) > > y <- matter:::as.altrep(y) > > > > .Internal(inspect(y)) > @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0) > > Here is what I get without a wrapper: > > > fc1 <- structure(y, class="factor", levels=levels(x)) > > .Internal(inspect(fc1)) > @7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2 > ATTRIB: > @7fb0ce771868 02 LISTSXP g0c0 [] > TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has > value) > @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0) > @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] > "factor" > TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" > (has value) > @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0) > @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a" > @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b" > > Here is what I get with a wrapper: > > > fc2 <- structure(.Internal(wrap_meta(y, 0, 0)), class="factor", > levels=levels(x)) > > .Internal(inspect(fc2)) > @7fb0ce764630 13 INTSXP g0c0 [OBJ,NAM(2),ATT] wrapper [srt=0,no_na=0] > @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, > mem=0) > ATTRIB: > @7fb0ce764668 02 LISTSXP g0c0 [] > TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has > value) > @7fb0c9fcb010 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0) > @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] > "factor" > TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" > (has value) > @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0) > @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a" > @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b" > > Is there a way to do this that doesn?t rely on .Internal() and won?t > produce R CMD check warnings? > > ~~~ > Kylie Ariel Bemis > Khoury College of Computer Sciences > Northeastern University > kuwisdelu.github.io<https://kuwisdelu.github.io> > > > > > > > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >[[alternative HTML version deleted]]
Hi Jiefei and Kylie, Great to see people engaging with the ALTREP framework and identifying places we may need more tooling. Comments inline. On Thu, Jul 18, 2019 at 12:22 PM King Jiefei <szwjf08 at gmail.com> wrote:> > If that is the case and you are 100% sure the reference number should be 1 > for your variable *y*, my solution is to call *SET_NAMED *in C++ to reset > the reference number. Note that you need to unbind your local variable > before you reset the number. To return an unbound SEXP, the C++ function > should be placed at the end of your *matter:::as.altrep *function. I don't > know if there is any simpler way to do that and I'll be happy to see any > opinion. >So as far as I know, manually setting the NAMED value on any SEXP the garbage collector is aware of is a direct violation of C-API contract and not something that package code should ever be doing. Its not at all clear to me that you can *ever* be 100% sure that the reference number should be 1 when it is not currently one for an R object that exists at the R-level (as opposed to only in pure C code). Sure, maybe the object is created within the body of your R function instead of being passed in, but what if someone is debugging your function and assigns the value to the global environment using <<- for later inspection; now you have an invalidly low NAMED value, ie you have a segfault coming. I know of no way for you to prevent this or even know it has happened.> On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie <k.bemis at northeastern.edu> > wrote: > > > Hello, > > > > I?m experimenting with ALTREP and was wondering if there is a preferred > > way to create an ALTREP wrapper vector without using > > .Internal(wrap_meta(?)), which R CMD check doesn?t like since it uses an > > .Internal() function. >So there is the .doSortWrap (and its currently inexplicably identical clone .doWrap) function in base, which is an R level function that calls down to .Internal(wrap_meta(...)), which you can use, but it doesn't look general enough for what I think you need (it was written for things that have just been sorted, thus the name). Specifically, its not able to indicate that things are of unknown sortedness as currently written. If matter vectors are guaranteed to be sorted for some reason, though, you can use this. I'll talk to Luke about whether we want to generalize this, it would be easy to have this support the full space of metadata for wrappers and be a general purpose wrapper-maker, but that isn't what it is right now. At the C-level, it looks like we do make R_tryWrap available (it appears in Rinternals.h, and not within a USE_RINTERNALS section),so you can call that from your own C(++) code. This creates a wrapper that has no metadata on it (or rather it has metadata but the metadata indicates that no special info is known about the vector).> > > I was trying to create a factor that used an ALTREP integer, but > > attempting to set the class and levels attributes always ended up > > duplicating and materializing the integer vector. Using the wrapper > avoided > > this issue. > > > > Here is my initial ALTREP integer vector: > > > > > fc0 <- factor(c("a", "a", "b")) > > > > > > y <- matter::as.matter(as.integer(fc0)) > > > y <- matter:::as.altrep(y) > > > > > > .Internal(inspect(y)) > > @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, > mem=0) > > > > Here is what I get without a wrapper: > > > > > fc1 <- structure(y, class="factor", levels=levels(x)) > > > .Internal(inspect(fc1)) > > @7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2 > > ATTRIB: > > @7fb0ce771868 02 LISTSXP g0c0 [] > > TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has > > value) > > @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0) > > @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] > > "factor" > > TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] > "levels" > > (has value) > > @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0) > > @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a" > > @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b" > > > > Here is what I get with a wrapper: > > > > > fc2 <- structure(.Internal(wrap_meta(y, 0, 0)), class="factor", > > levels=levels(x)) > > > .Internal(inspect(fc2)) > > @7fb0ce764630 13 INTSXP g0c0 [OBJ,NAM(2),ATT] wrapper [srt=0,no_na=0] > > @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, > > mem=0) > > ATTRIB: > > @7fb0ce764668 02 LISTSXP g0c0 [] > > TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has > > value) > > @7fb0c9fcb010 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0) > > @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] > > "factor" > > TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] > "levels" > > (has value) > > @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0) > > @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a" > > @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b" > > > > Is there a way to do this that doesn?t rely on .Internal() and won?t > > produce R CMD check warnings? > > > > ~~~ > > Kylie Ariel Bemis > > Khoury College of Computer Sciences > > Northeastern University > > kuwisdelu.github.io<https://kuwisdelu.github.io> > > > > > > > > > > > > > > > > > > > > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-devel at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-devel > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel >[[alternative HTML version deleted]]