Duncan Murdoch
2016-Dec-16 18:29 UTC
[Rd] Upgrading a package to which other packages are LinkingTo
On 16/12/2016 12:35 PM, Karl Millar wrote:
> A couple of points:
> - rebuilding dependent packages is needed if there is an ABI change,
> not just an API change. For packages like Rcpp which export inline
> functions or macros that might have changed, this is potentially any
> change to existing functions, but for packages like Matrix, it isn't
> really an issue at all IIUC.

This is why someone else needs to do this, not me. I know the three words that ABI stands for, but not what they mean in practice.

>
> - If we're looking into a way to check if package APIs are
> compatible, then that's something that's relevant for all packages,
> since they all export an R API. I believe that CRAN only tests
> package compatibility with the most recent versions of packages on
> CRAN that import or depend on it. There's no guarantee that a package
> update won't contain API or behaviour changes that break older
> versions of packages, packages not on CRAN or any scripts that use the
> package, and these sorts of breakages do happen semi-regularly.

That's correct.

>
> - AFAICT, the only difference with packages like Rcpp is that you can
> potentially have all of your CRAN packages at the latest version, but
> some of them might have inlined code from an older version of Rcpp
> even after running update.packages(). While that is an issue, in my
> experience that's been a lot less trouble than the general case of
> backwards compatibility.

I think that's an important difference. Package authors can play nicely with each other and keep their sources completely compatible, and package users can still end up with broken libraries that aren't fixed by anything simpler than re-installing everything. We do have (imperfect) processes in place to help with the general compatibility problem, but nothing to help with this one.
Duncan Murdoch

>
> Karl
>
> On Fri, Dec 16, 2016 at 8:19 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
>>
>> On 16 December 2016 at 11:00, Duncan Murdoch wrote:
>> | On 16/12/2016 10:40 AM, Dirk Eddelbuettel wrote:
>> | > On 16 December 2016 at 10:14, Duncan Murdoch wrote:
>> | > | On 16/12/2016 8:37 AM, Dirk Eddelbuettel wrote:
>> | > | >
>> | > | > On 16 December 2016 at 08:20, Duncan Murdoch wrote:
>> | > | > | Perhaps the solution is to recommend that packages which export their
>> | > | > | C-level entry points either guarantee them not to change or offer
>> | > | > | (require?) version checks by user code. So dplyr should start out by
>> | > | > | saying "I'm using Rcpp interface 0.12.8". If Rcpp has a new version
>> | > | > | with a compatible interface, it replies "that's fine". If Rcpp has
>> | > | > | changed its interface, it says "Sorry, I don't support that any more."
>> | > | >
>> | > | > We try. But it's hard, and I'd argue, likely impossible.
>> | > | >
>> | > | > For example I even added a "frozen" package [1] in the sources / unit tests
>> | > | > to test for just this. In practice you just cannot hit every possible access
>> | > | > point of the (rich, in our case) API so the tests pass too often.
>> | > | >
>> | > | > Which is why we relentlessly test against reverse-depends to _at least ensure
>> | > | > buildability_ from our releases.
>> | >
>> | > I meant to also add: "... against a large corpus of other packages."
>> | > The intent is to empirically answer this.
>> | >
>> | > | > As for seamless binary upgrade, I don't think it can work in practice. Ask
>> | > | > Uwe one day why he rebuilds everything every time on Windows. And for what it
>> | > | > is worth, we essentially do the same in Debian.
>> | > | >
>> | > | > Sometimes you just need to rebuild. That may be the price of admission for
>> | > | > using the convenience of rich C++ interfaces.
>> | > | >
>> | > |
>> | > | Okay, so would you say that Kirill's suggestion is not overkill? Every
>> | > | time package B uses LinkingTo: A, R should assume it needs to rebuild B
>> | > | when A is updated?
>> | >
>> | > Based on my experience is a "halting problem" -- i.e. cannot know ex ante.
>> | >
>> | > So "every time" would be overkill to me. Sometimes you know you must
>> | > recompile (but try to be very prudent with public-facing API). Many times
>> | > you do not. It is hard to pin down.
>> | >
>> | > At work we have a bunch of servers with Rcpp and many packages against them
>> | > (installed system-wide for all users). We _very really_ needs rebuild.
>>
>> Edit: "We _very rarely_ need rebuilds" is what was meant there.
>>
>> | So that comes back to my suggestion: you should provide a way for a
>> | dependent package to ask if your API has changed. If you say it hasn't,
>> | the package is fine. If you say it has, the package should abort,
>> | telling the user they need to reinstall it. (Because it's a hard
>> | question to answer, you might get it wrong and say it's fine when it's
>> | not. But that's easy to fix: just make a new release that does require
>>
>> Sure.
>>
>> We have always increased the higher-order version number when that is needed.
>>
>> One problem with your proposal is that the testing code may run after the
>> package load, and in the case where it matters ... that very code may not get
>> reached because the package didn't load.
>>
>> Dirk
>>
>> --
>> http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
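The version handshake proposed in this message ("I'm using Rcpp interface 0.12.8" / "that's fine" / "Sorry, I don't support that any more") could look roughly like the sketch below. Everything in it is an assumption made for illustration: `check_interface()` and its three arguments are hypothetical names, and neither R nor Rcpp provides such a function. It only shows the comparison logic, using base R's `package_version()`.

```r
# Hypothetical helper, not part of R or Rcpp.
# built_against   : upstream version recorded when the dependent package was installed
# installed       : upstream version found at load time
# compatible_from : oldest upstream version whose interface the installed
#                   upstream still declares compatible
check_interface <- function(built_against, installed, compatible_from) {
  built_against   <- package_version(built_against)
  installed       <- package_version(installed)
  compatible_from <- package_version(compatible_from)

  if (built_against < compatible_from) {
    # "Sorry, I don't support that any more."
    stop("built against interface ", format(built_against),
         ", but installed version ", format(installed),
         " only supports builds from ", format(compatible_from),
         " onwards; please reinstall the dependent package",
         call. = FALSE)
  }
  # "That's fine."
  invisible(TRUE)
}

# A build against 0.12.8 is accepted by an upstream that is compatible from 0.12.0.
check_interface("0.12.8", installed = "0.12.9", compatible_from = "0.12.0")

# A build against 0.11.0 is rejected; the error message is captured here.
msg <- tryCatch(
  check_interface("0.11.0", installed = "0.12.9", compatible_from = "0.12.0"),
  error = function(e) conditionMessage(e)
)
```

As the quoted reply points out, a check like this only helps if it actually gets to run: if the stale compiled code prevents the package from loading, the check may never be reached.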
Gábor Csárdi
2016-Dec-16 21:27 UTC
[Rd] Upgrading a package to which other packages are LinkingTo
I think that this problem is actually more general than just ABI versioning. The common definition of ABI refers to compiled code, but with R packages similar problems might happen (and they do happen) without any compiled code.

I think the key issue is the concept of build-time dependencies. While R packages usually do not distinguish between build-time and run-time dependencies, they still do exist, and I think ideally we would need to treat them differently.

AFAIK LinkingTo is the only form of build-time dependency that is completely explicit, so it is relatively easy to handle. The other frequent form of build-time dependency is a function call to the other package that happens at install time. E.g. with reference classes or R6* classes you frequently include code like this in yourpackage:

myclass <- R6::R6Class(...)

and this code is evaluated at install time. So if the R6 package is updated, the installed version of myclass in yourpackage is not affected at all. In fact, if the new version of R6 is not compatible with the myclass object created by the old version, then yourpackage will be broken. (This AFAIK cannot happen with R6, so it is not the best example, but it can happen in other similar cases.)

The key here is that R6 is a build-time dependency of yourpackage, similarly to packages linking to (i.e. LinkingTo) Rcpp.

Another possible type of build-time dependency is if you put objects from another package in yourpackage. E.g.

myfun <- otherpkg::fun

Then a copy of otherpkg::fun will be saved in yourpackage. If you install a new version of otherpkg, yourpackage is unaffected, and if otherpkg::fun uses some (possibly internal) API from otherpkg that has changed in the new version of otherpkg, you might easily end up with a broken yourpackage again.

I think one lesson is to avoid running code at install time. This is not a new thing, AFAIR it is even mentioned in 'Writing R extensions'. Instead of running code at install time, you might consider running it in `.onLoad()`, and then these "problems" go away. But you obviously cannot always avoid it.

Gabor

* I think the R6 package is great, and I am not speaking in any way against it. I just needed an example, and I know R6 much better than reference classes, or other similar packages.
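The difference between copying at install time and copying at load time can be simulated in a plain R session. The sketch below is only an illustration under stated assumptions: `otherpkg` and `yourpackage` are ordinary environments standing in for installed packages, and `on_load()` is a hypothetical stand-in for a real `.onLoad(libname, pkgname)` hook.

```r
# Stand-in for an installed upstream package (an environment, not a real package).
otherpkg <- new.env()
otherpkg$fun <- function() "old behaviour"

# Install-time pattern: the function object is copied once and then stored.
myfun_copied <- otherpkg$fun

# Load-time pattern: the copy is made inside a hook that runs on every load.
yourpackage <- new.env()
on_load <- function() {
  yourpackage$myfun <- otherpkg$fun
}
on_load()

# Simulate upgrading otherpkg without reinstalling yourpackage.
otherpkg$fun <- function() "new behaviour"
on_load()  # a fresh R session would run the load hook again

old_result <- myfun_copied()       # the stale copy made at "install time"
new_result <- yourpackage$myfun()  # the copy refreshed at "load time"
```

The stale copy keeps the definition it was given when it was created, while the load-time copy picks up whatever version of the upstream function is installed when the hook runs.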
Gábor Csárdi
2017-Jan-25 14:04 UTC
[Rd] Upgrading a package to which other packages are LinkingTo
FWIW I wrote a tool that tests which dependencies of a package are build-time dependencies: https://github.com/r-hub/builddeps

It is not very smart, just "brute-force", really. It tries to install the package several times, leaving out one dependency at a time, and if the installation fails, then the missing package is a build-time dependency. (First it tries with the LinkingTo dependencies only, and if that succeeds, then these are the only build-time dependencies.)

It does download all dependent packages, and runs R CMD INSTALL several times, so it is expensive. It is better to run it with binary packages.

It is mostly trivial, except that
1) it needs to edit DESCRIPTION and NAMESPACE to omit a dependency. DESCRIPTION is easy, NAMESPACE somewhat more difficult, because there is a parser for it, but no "writer".
2) the dependencies need to be considered in a topological order, otherwise one gets wrong answers.

I wrote this mainly for R-hub, to know which binary packages need to be rebuilt after a package update, but if you use it and have feedback, please email me or open an issue in the GitHub repo.

Gabor
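The brute-force search described in this message can be sketched as follows. This is not the builddeps implementation: `find_build_deps()` and the `install_ok` callback are hypothetical names, the callback is assumed to attempt an installation with only the given dependencies available and return TRUE on success, and LinkingTo packages are assumed to always count as build-time dependencies. Editing DESCRIPTION and NAMESPACE and ordering the dependencies topologically are left out.

```r
# deps       : character vector of all dependencies of the package
# linking_to : the subset of deps listed in LinkingTo
# install_ok : hypothetical callback; install_ok(available) should try to install
#              the package with only `available` present and return TRUE or FALSE
find_build_deps <- function(deps, linking_to, install_ok) {
  # Step 1: if the LinkingTo packages alone are enough, they are the only
  # build-time dependencies.
  if (install_ok(linking_to)) {
    return(linking_to)
  }
  # Step 2: leave out one remaining dependency at a time; a failed
  # installation marks the omitted package as a build-time dependency.
  others <- setdiff(deps, linking_to)
  needed <- Filter(function(d) !install_ok(setdiff(deps, d)), others)
  c(linking_to, needed)
}

# Fake installer for demonstration: installation succeeds only when both
# Rcpp and R6 are available.
fake_requirements <- c("Rcpp", "R6")
fake_install <- function(available) all(fake_requirements %in% available)

build_deps <- find_build_deps(
  deps       = c("Rcpp", "R6", "jsonlite"),
  linking_to = "Rcpp",
  install_ok = fake_install
)
```

With a real installer each call to `install_ok` is a full installation attempt, which is why the approach is expensive and benefits from binary packages for the dependencies.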