Andrew Piskorski
2014-Apr-21 15:53 UTC
[Rd] read.table() code fails outside of the utils package
One of the great things about R is how readable and re-usable much of its own implementation is. If an R function doesn't do quite what you want but is close, it is usually very easy to read its code and start adapting that as the base for a modified version.

In the 2.x versions of R, that was the case with read.table(). It was easy to experiment with its source code, as it all worked just fine when run at the top level or from inside any other package.

In R 3.1.0, that is no longer true. The read.table() source ONLY works when run from inside the "utils" package. The (only) culprit is this:

    .External(C_readtablehead, file, 1L, comment.char, blank.lines.skip, quote, sep, skipNul)

Older versions of read.table() instead did this, which ran fine from any package; this entry point no longer exists:

    .Internal(readTableHead(file, nlines, comment.char, blank.lines.skip, quote, sep))

The C implementation of readTableHead is in utils.so, but the symbol is marked as local. I tried adding "attribute_visible" to its function definition in "src/library/utils/src/io.c" and recompiling, which DID make the symbol globally visible. With that change, my own C code works just fine when calling readTableHead. But interestingly, R code using .External() like this still fails:

    .External("readtablehead", ..., PACKAGE="utils")
    Error: "readtablehead" not available for .External() for package "utils"

Why is that? Apparently the C symbol being visible isn't enough, but what else is needed for .External() to work? (Clearly there's something here about how R C programming works that I don't understand.)

Finally, since it is generally useful to be able to experiment with and re-use parts of the stock read.table() implementation, I suggest:

1. R add "attribute_visible" or otherwise make readtablehead callable from user C code.
2. R make readtablehead callable from user R code via .External().

What do you think? Note that I'm not asking that the current interface or behavior of readtablehead necessarily be SUPPORTED in any way, just that it be callable for experimental purposes, much as the old .Internal(readTableHead()) was in earlier versions of R.

-- 
Andrew Piskorski <atp at piskorski.com>
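The question of what .External() needs beyond a visible C symbol can be explored from R itself. The sketch below is not from the thread. It assumes (unconfirmed here) that utils registers its native routines and that lookup goes through those registered symbol objects, not through a character-string name. The variable names `routines`, `ext_names` and `sym` are illustrative only.

```r
# Sketch only: inspect what the utils shared library has registered.
routines  <- getDLLRegisteredRoutines("utils")
ext_names <- names(routines$.External)  # routine names registered for .External()

# Is the routine registered, even though the string lookup in the
# message above failed?
print("readtablehead" %in% ext_names)

# The symbol object that read.table() itself refers to as C_readtablehead.
# It lives in the utils namespace and is not exported, hence the ':::'.
sym <- utils:::C_readtablehead
print(class(sym))
```

If the routine shows up as registered while the character-string form is still rejected, that suggests the restriction is on how the routine is looked up and not on the visibility of the C symbol.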
Simon Urbanek
2014-Apr-21 16:43 UTC
[Rd] read.table() code fails outside of the utils package
Andrew,

On Apr 21, 2014, at 11:53 AM, Andrew Piskorski <atp at piskorski.com> wrote:

> One of the great things about R is how readable and re-usable much of
> its own implementation is. If an R function doesn't do quite what you
> want but is close, it is usually very easy to read its code and start
> adapting that as the base for a modified version.
>
> In the 2.x versions of R, that was the case with read.table(). It was
> easy to experiment with its source code, as it all worked just fine
> when run at the top level or from inside any other package.
>
> In R 3.1.0, that is no longer true. The read.table() source ONLY works
> when run from inside the "utils" package. The (only) culprit is this:
>
> .External(C_readtablehead, file, 1L, comment.char, blank.lines.skip, quote, sep, skipNul)
>
> Older versions of read.table() instead did this, which ran fine from
> any package; this entry point no longer exists:
>
> .Internal(readTableHead(file, nlines, comment.char, blank.lines.skip, quote, sep))
>
> The C implementation of readTableHead is in utils.so, but the symbol
> is marked as local.

And that's how it should be - there is no reason why any other code should link to it. Why don't you just use

    .External(utils:::C_readtablehead, ...)

if you need to call it?

Cheers,
Simon

> I tried adding "attribute_visible" to its
> function definition in "src/library/utils/src/io.c" and recompiling,
> which DID make the symbol globally visible. With that change, my own
> C code works just fine when calling readTableHead. But interestingly,
> R code using .External() like this still fails:
>
> .External("readtablehead", ..., PACKAGE="utils")
> Error: "readtablehead" not available for .External() for package "utils"
>
> Why is that? Apparently the C symbol being visible isn't enough, but
> what else is needed for .External() to work?
> (Clearly there's something here about how R C programming works that I
> don't understand.)
>
> Finally, since it is generally useful to be able to experiment with
> and re-use parts of the stock read.table() implementation, I suggest:
>
> 1. R add "attribute_visible" or otherwise make readtablehead callable
> from user C code.
> 2. R make readtablehead callable from user R code via .External().
>
> What do you think? Note that I'm not asking that the current
> interface or behavior of readtablehead necessarily be SUPPORTED in any
> way, just that it be callable for experimental purposes, much as the
> old .Internal(readTableHead()) was in earlier versions of R.
>
> --
> Andrew Piskorski <atp at piskorski.com>
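As a rough illustration of the utils:::C_readtablehead suggestion in this reply, the sketch below calls the routine from top-level R code. It is not from the thread. The argument order is copied from the R 3.1.0 call quoted above (file, nlines, comment.char, blank.lines.skip, quote, sep, skipNul) and may differ in other R versions, since this is an unexported internal. The sample input and the names `con` and `head_lines` are hypothetical.

```r
# Sketch only: C_readtablehead is an unexported internal of utils, so its
# argument list is not a supported interface and may change between versions.
con <- textConnection(c("a,b", "1,2", "3,4"))  # hypothetical sample input

head_lines <- .External(utils:::C_readtablehead,
                        con,     # file: a connection
                        2L,      # nlines: number of lines to read
                        "#",     # comment.char
                        TRUE,    # blank.lines.skip
                        "\"'",   # quote
                        ",",     # sep
                        FALSE)   # skipNul
close(con)

print(head_lines)
```

Because the symbol object is fetched with ':::', the same call should work when pasted into an adapted copy of read.table() living outside the utils namespace, which is the use case described in the original message.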