Hello,

I would like my C function to be able to manipulate some values stored in an R data frame.

To achieve this, I need the (real) memory address where the R data frame stores its data (hopefully contiguously). Then, from R, I would call the C function and pass this memory address as a parameter.

The question: how can we get the memory address of the R data frame?

Thank you!

L.

Hi Lille,

Is it possible you're looking for tracemem() or inspect()?

    > x <- data.frame(z = 1:10)
    > tracemem(x)
    [1] "<0x55aa743e0bc0>"
    > x[1] <- 2L
    tracemem[0x55aa743e0bc0 -> 0x55aa778f6ad0]:
    tracemem[0x55aa778f6ad0 -> 0x55aa778f6868]: [<-.data.frame [<-
    tracemem[0x55aa778f6868 -> 0x55aa778f5b48]: [<-.data.frame [<-
    > .Internal(inspect(x))
    @55aa743e0bc0 19 VECSXP g0c1 [OBJ,MARK,NAM(7),TR,ATT] (len=1, tl=0)
      @55aa7440d420 13 INTSXP g0c0 [MARK,NAM(7)]  1 : 10 (compact)
    ATTRIB:
      @55aa743f9ea0 02 LISTSXP g0c0 [MARK]
        TAG: @55aa72ac98a0 01 SYMSXP g0c0 [MARK,NAM(7),LCK,gp=0x6000] "names" (has value)
        @55aa743e0fb0 16 STRSXP g0c1 [MARK,NAM(7)] (len=1, tl=0)
          @55aa72be1c70 09 CHARSXP g0c1 [MARK,gp=0x61] [ASCII] [cached] "z"
        TAG: @55aa72ac9d70 01 SYMSXP g0c0 [MARK,NAM(7),LCK,gp=0x4000] "class" (has value)
        @55aa73ca59b8 16 STRSXP g0c1 [MARK,NAM(7)] (len=1, tl=0)
          @55aa72b562b8 09 CHARSXP g0c2 [MARK,gp=0x61,ATT] [ASCII] [cached] "data.frame"
        TAG: @55aa72ac9670 01 SYMSXP g0c0 [MARK,NAM(7),LCK,gp=0x4000] "row.names" (has value)
        @55aa743e1c98 13 INTSXP g0c1 [MARK,NAM(7)] (len=2, tl=0) -2147483648,-10

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Hi Lille,

To my understanding, there's no need to get the actual memory address of the R data frame, since .Call() or .External() can also be used in a "call by reference" way. This would be contrary to standard R behaviour, so if you use that in a package, make sure you indicate this!

There's a detailed explanation of how to deal with R objects in C code in the manual "Writing R Extensions" here:
https://cran.r-project.org/doc/manuals/R-exts.html#Handling-R-objects-in-C

Especially check the section "Named objects and copying", which explains in more detail how to control the standard R behaviour. Also keep in mind that data frames are list-like structures, which are handled differently from atomic vectors.

Hope this helps.
Kind regards
Joris

-- 
Joris Meys
Statistical consultant
Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

Hello Lille,

The raw data of a data.frame (or more precisely of a list, because a data.frame is just a list with class "data.frame") is an array of R-specific data structures (SEXP), so a generic C function will not be able to work with them. As a pre-processing step, you may allocate an array for the pointers to the raw data of the columns yourself (there will hopefully be only a few of them compared to the size of the columns themselves). For this you'll need the functions VECTOR_ELT to access the columns and DATAPTR to get their raw data (and possibly TYPEOF to find out their type). Note that this won't work for a data frame that contains another list. If this memory layout doesn't work for you, then you may need to copy the whole data frame.

If you want to update the data from C, then keep in mind that

1) R vectors have value semantics, and you should not alter the raw data of any vector unless you know that it's not referenced from anywhere else; otherwise you should make a copy, alter that copy instead, and return it as the result of your C function.

2) R has a generational garbage collector, so it *must* know about references between R objects, and so you should use SET_VECTOR_ELT to update the data of a list (some would say that you can update the raw data if you really understand how the GC and R internals work; I would say: just don't).

Best,
Stepan

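The pre-processing step Stepan describes can be sketched in plain C. This is only a model under stated assumptions: the `column` struct, the `col_type` enum and the functions `column_pointers` and `sum_real` are hypothetical names invented for illustration and are not part of R's API. In real package code the columns would be reached with VECTOR_ELT, their types checked with TYPEOF, and their raw data obtained as described in Writing R Extensions. The point of the sketch is that the columns are separately allocated vectors, so the data frame as a whole is not one contiguous block, and a small table of per-column pointers is what a generic C routine can actually consume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the column types a real data frame might hold. */
typedef enum { COL_INT, COL_REAL } col_type;

/* Hypothetical model of one column: a separately allocated, typed vector. */
typedef struct {
    col_type type;    /* plays the role of what TYPEOF reports in R's C API */
    size_t   length;  /* number of rows */
    void    *data;    /* plays the role of the column's raw data pointer */
} column;

/* Pre-processing step: gather the raw data pointers of all columns into one
 * small array. Only the pointer table is allocated; column data is not copied.
 * The caller frees the returned table. Returns NULL if allocation fails. */
void **column_pointers(const column *cols, size_t ncol)
{
    assert(cols != NULL || ncol == 0);
    void **ptrs = malloc((ncol ? ncol : 1) * sizeof *ptrs);
    if (ptrs == NULL)
        return NULL;
    for (size_t j = 0; j < ncol; j++)
        ptrs[j] = cols[j].data;
    return ptrs;
}

/* A generic C routine that knows nothing about data frames: it reads one
 * column of doubles through a raw pointer and returns the sum. */
double sum_real(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}
```

The routine above only reads through the pointers. Writing through them is a different matter, for the two reasons given in the message.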
On 1/9/20 1:03 PM, Ezra Tucker wrote:
> Hi Lille,
>
> Is it possible you're looking for tracemem() or inspect() ?

Please note these functions are only for debugging. They should never be called from programs or packages. One should never try to manipulate pointers from R directly or even hold them (except for what "external pointer" objects allow, as described in Writing R Extensions).

Tomas

On 09. 01. 20 15:41, lille stor wrote:
> I believe this could be done without creating side effects (e.g.
> crash) as we are just talking about changing values.

That is exactly the issue that my last two points warn about. Example:

    a <- mtcars
    .Call("my_innocent_function", a)

Would you expect the mtcars data.frame to be altered after this code is executed? What if some existing code relies on mtcars always containing the same data, which is a perfectly valid assumption given the R specification?

If what you are trying to do is to have a mutable data frame, then this goes against the philosophy of R. You can get mutability with environments and other R types that are intentionally mutable and whose mutability is documented. You can get data.frame mutability with the data.table package, but the tricks it's doing under the hood may bite back. In its source code you can also see how these things can be done, but unless you really need to, I would advise against implementing this yourself.

Best,
Stepan

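The alternative to mutating in place, as recommended earlier in the thread (make a copy, alter the copy, return it), might look like this in plain C. It is a sketch under assumptions, not R package code: `scaled_copy` is a hypothetical name, and in a real .Call() routine the copy would be an R object allocated and protected through R's API as described in Writing R Extensions. The sketch shows only the contract: the input is read-only, so whatever the caller passed in (mtcars in the example above) still holds the same values afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Value-semantics update: return a newly allocated copy of x in which every
 * element has been multiplied by factor. The input array is never written
 * to (it is const). The caller frees the result. Returns NULL if allocation
 * fails. */
double *scaled_copy(const double *x, size_t n, double factor)
{
    assert(x != NULL || n == 0);
    double *out = malloc((n ? n : 1) * sizeof *out);
    if (out == NULL)
        return NULL;
    if (n > 0)
        memcpy(out, x, n * sizeof *out);   /* copy first ...            */
    for (size_t i = 0; i < n; i++)
        out[i] *= factor;                  /* ... then alter the copy   */
    return out;
}
```

The cost is one extra allocation per modified column, which is the price of keeping the behaviour other R code is entitled to rely on.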
On 1/9/20 06:56, Stepan wrote:
> On 09. 01. 20 15:41, lille stor wrote:
>
>> I believe this could be done without creating side effects (e.g.
>> crash) as we are just talking about changing values.

A crash would certainly be an annoying "side effect" ;-)

As Stepan explained, data.frame objects, like most objects in R, should never be modified in-place. If you're looking for a data-frame-like structure with reference semantics where in-place modifications are allowed, please take a look at the data.table package.

H.

-- 
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319