Michael Sannella
2018-Oct-22 19:52 UTC
[Rd] v3 serialization of compact_intseq altrep should write modified data
Experimenting with altrep objects and v3 serialization, I discovered a
possible bug. Calling DATAPTR on a compact_intseq object returns a
pointer to the expanded integer sequence in memory. If you modify
this data, the object values appear to be changed. However, if the
compact_intseq object is then serialized (with version=3), only the
original integer sequence info is written.
For example, suppose I have compiled and loaded the following C code:
    SEXP set_intseq_data(SEXP x)
    {
        void* ptr = DATAPTR(x);
        ((int*)ptr)[3] = 1234;
        return R_NilValue;
    }
I see the following behavior in R 3.5.1:
> x <- 1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> .Call("set_intseq_data", x)
NULL
> x
[1] 1 2 3 1234 5 6 7 8 9 10
> save(x, file="temp.rda", version=3)
> load(file="temp.rda")
> x
[1] 1 2 3 4 5 6 7 8 9 10
>
I would have expected the modified vector data to be serialized to the
file, and be restored when it is loaded.
~~ Michael Sannella
Tierney, Luke
2018-Oct-22 21:48 UTC
[Rd] v3 serialization of compact_intseq altrep should write modified data
Try this C code:
    SEXP set_intseq_data(SEXP x)
    {
        if (MAYBE_SHARED(x))
            error("Oops, not supposed to do this!");
        void* ptr = DATAPTR(x);
        ((int*)ptr)[3] = 1234;
        return R_NilValue;
    }
Lots of things will break if you modify objects that have been marked
as immutable (and hence where MAYBE_SHARED returns TRUE).
For now the implementation of compact sequences marks them as
immutable and so assumes the expanded version will not be changed.
That implementation detail might be changed at some point, but C code
should not make assumptions about it either way.
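To make the mechanism concrete, here is a small stand-alone toy model in
plain C. It is only a sketch and is not R's ALTREP code: the names
toy_intseq, toy_dataptr, toy_elt, toy_serialize, toy_unserialize and
toy_demo are all hypothetical, and the model assumes nothing beyond what
is described in this thread, namely that the data pointer refers to a
lazily expanded copy while serialization writes only the compact
sequence description.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: NOT R's actual ALTREP implementation. */
typedef struct {
    int start;      /* first value of the sequence */
    int length;     /* number of elements */
    int *expanded;  /* lazily allocated full copy, or NULL */
} toy_intseq;

/* Analogue of DATAPTR: materialize the sequence on first use. */
int *toy_dataptr(toy_intseq *s)
{
    if (s->expanded == NULL) {
        s->expanded = malloc((size_t)s->length * sizeof(int));
        assert(s->expanded != NULL);
        for (int i = 0; i < s->length; i++)
            s->expanded[i] = s->start + i;
    }
    return s->expanded;
}

/* Analogue of element access: reads the expanded copy when it exists. */
int toy_elt(const toy_intseq *s, int i)
{
    return s->expanded ? s->expanded[i] : s->start + i;
}

/* Analogue of compact serialization: writes only start and length,
   so any edits made to the expanded copy are not recorded. */
void toy_serialize(const toy_intseq *s, int out[2])
{
    out[0] = s->start;
    out[1] = s->length;
}

/* Rebuild a compact, not yet expanded, sequence from the saved state. */
toy_intseq toy_unserialize(const int in[2])
{
    toy_intseq s = { in[0], in[1], NULL };
    return s;
}

/* Modify element 3 (0-based) through the data pointer, then round-trip. */
void toy_demo(int *before, int *after)
{
    toy_intseq x = { 1, 10, NULL };
    int buf[2];

    toy_dataptr(&x)[3] = 1234;
    *before = toy_elt(&x, 3);            /* 1234: reads the expanded copy */

    toy_serialize(&x, buf);
    toy_intseq y = toy_unserialize(buf);
    *after = toy_elt(&y, 3);             /* 4: recomputed from start */

    free(x.expanded);
}
```

In this model the write through the pointer is visible to later reads of
the same object, but it is lost on a save/load round trip, which matches
the R session shown in the original report.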
Best,
luke
--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: luke-tierney at uiowa.edu
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu