Geoff Lee
2014-Dec-21 00:41 UTC
[Rd] loadNamespace and versionChecking and the otherpackage::otherfun syntax
This is an enquiry not so much about what the code for loadNamespace does, but rather about the intent and design of loadNamespace, and how it interacts with the `::` function, which seems to me to follow a slightly different philosophy. It is not an urgent question - the issue that started me wondering has been resolved another way - but I would like to complete my understanding of this aspect of how R's packaging mechanism is meant to operate.

It's also rather a long query - if it's too long please don't waste your time - just ignore it. To try and make it slightly more digestible it is divided into sections, as follows:

SCENARIO
THE QUERY
MY OWN ATTEMPTS TO UNDERSTAND
AN ASIDE ABOUT loadNamespace AND Depends:
VERSION CHECKS ON otherpackage::otherfun AT LOAD TIME?
VERSION CHECKS ON otherpackage::otherfun AT EXECUTION TIME?
POSSIBLE ANSWER 1 - THIS IS TOO COMPLEX AND HYPOTHETICAL
POSSIBLE ANSWER 2 - THIS IS AN ISSUE FOR THE `::` FUNCTION
POSSIBLE ANSWER 3 - loadNamespace SHOULD BE VERSION AWARE EVEN WHEN `::` IS USED

Many thanks in advance for any insights you are able to offer.

Geoff

SCENARIO
========

The scenario is that `mypackage` uses an `otherpackage` via the `otherpackage::otherfun` syntax, and that the version of `otherpackage` must be (say) (>= 2.0). The constraint on the version of `otherpackage` is specified in the DESCRIPTION file of mypackage using either Imports or Depends.

THE QUERY
=========

The query is:

a) Is it intended that loadNamespace should check the version of otherpackage (if one is specified) when it is `loadNamespace`ing mypackage?

and if so

b) Is it intended that the process of loading mypackage should ensure that the correct version of otherfun cannot be accidentally masked (for example by using .libPaths() to change the library search path between the time when mypackage is loaded, and the time when the component of mypackage that calls otherpackage::otherfun is executed)?
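To make the scenario concrete, here is a minimal sketch of the two pieces involved. Everything in it is illustrative: the DESCRIPTION contents, the version number 0.1.0 and the wrapper name `myfun` are assumptions, and `otherpackage` does not need to exist for the code to run, because the body of `myfun` is not evaluated when it is defined.

```r
# Hypothetical DESCRIPTION for mypackage, written to a temporary file
desc_file <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: mypackage",
  "Version: 0.1.0",
  "Imports: otherpackage (>= 2.0)"
), desc_file)

# read.dcf() parses the file into a character matrix, one column per field
desc <- read.dcf(desc_file, fields = c("Package", "Imports"))
imports_field <- unname(desc[1, "Imports"])
print(imports_field)

# In mypackage's R code the dependency is reached only through `::`,
# so (in this scenario) mypackage's NAMESPACE file contains no
# import() or importFrom() directive that mentions otherpackage.
myfun <- function(x) otherpackage::otherfun(x)
```

The version constraint therefore lives only in the DESCRIPTION field, while the only use of otherpackage is the `::` call inside `myfun`.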
MY OWN ATTEMPTS TO UNDERSTAND
=============================

What I've done so far: I've read the documentation I could find, stepped through loadNamespace using debugonce (several times) with toy packages to gain some insight into what loadNamespace does at the moment (using R 3.1.2 on a Windows 64 bit machine), and read the loadNamespace code a few times (though I can't yet claim to follow all of its neat tricks and complexities).

My understanding thus far is that

a) loadNamespace learns about version constraints on `otherpackage` dependencies when it is processing the DESCRIPTION file, viz

    vI <- pkgInfo$Imports

and

b) defers the checking of these dependencies till later, when it is processing the NAMESPACE file (to create the imports:mypackage namespace/environment/frame which encloses the namespace:mypackage namespace/environment/frame). The checking occurs inside 3 loops, which all use an appropriate entry from vI as the versionCheck argument in a recursive call to loadNamespace, viz.

    for (i in nsInfo$imports) { ...etc... }
    for (imp in nsInfo$importClasses) ...etc...
    for (imp in nsInfo$importMethods) ...etc...

AN ASIDE ABOUT loadNamespace AND Depends:
=========================================

As an aside, it appears that any version dependencies specified in the Depends field are overlooked - I suspect that loadNamespace might be more complete if there were something like

    vD <- pkgInfo$Depends
    vID <- c(vD, vI) #possibly with an unlist thrown in somewhere?

and the versionCheck used vID instead of just vI.

VERSION CHECKS ON otherpackage::otherfun AT LOAD TIME?
======================================================

The problem with this elegant recursive approach to checking the version of depended-upon packages is that (as far as I know thus far) using the implicit loading syntax otherpackage::otherfun does not involve any entry in the mypackage NAMESPACE file, and hence loadNamespace never checks the version specification for otherpackage.
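For concreteness, the kind of load-time check that is skipped for `::` dependencies can be sketched as follows. This is a minimal sketch and not the code inside loadNamespace: `parse_dep` and `check_field` are hypothetical helper names, and the demonstration uses the base packages utils, stats and tools (whose versions track the running R version) in place of otherpackage so that it runs anywhere.

```r
# Split one DESCRIPTION dependency entry, e.g. "otherpackage (>= 2.0)",
# into its name, comparison operator and version (NA when unconstrained).
parse_dep <- function(entry) {
  entry <- trimws(entry)
  name <- sub("[[:space:]]*\\(.*$", "", entry)
  if (!grepl("(", entry, fixed = TRUE)) {
    return(list(name = name, op = NA_character_, version = NA_character_))
  }
  inside <- trimws(sub("^[^(]*\\(([^)]*)\\).*$", "\\1", entry))
  list(
    name = name,
    op = sub("^([<>=!]+).*$", "\\1", inside),
    version = trimws(sub("^[<>=!]+", "", inside))
  )
}

# Check every entry of a dependency field against the installed versions.
# Unconstrained entries pass; packages that cannot be found fail.
check_field <- function(field, lib.loc = NULL) {
  entries <- strsplit(field, ",", fixed = TRUE)[[1]]
  deps <- lapply(entries[nzchar(trimws(entries))], parse_dep)
  ok <- vapply(deps, function(dep) {
    if (is.na(dep$op)) return(TRUE)
    installed <- tryCatch(
      packageVersion(dep$name, lib.loc = lib.loc),
      error = function(e) NULL
    )
    !is.null(installed) &&
      isTRUE(do.call(dep$op, list(installed, package_version(dep$version))))
  }, logical(1))
  names(ok) <- vapply(deps, function(dep) dep$name, character(1))
  ok
}

# utils and tools carry the version of the running R, so the first
# constraint is satisfied and the deliberately impossible last one is not.
result <- check_field("utils (>= 2.0), stats, tools (>= 999.0)")
print(result)
```

Note that a check like this only inspects what is installed at the moment it runs; it says nothing about which copy of the package is found later.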
Hence the first part of my query - is it intended that loadNamespace perform such a check?

My initial thought was that the answer should be yes, loadNamespace should do such a check, and so I wrote a little function which checked all the versions specified in the DESCRIPTION file, at the time the file was initially encountered. (After a bit of debugging) it seemed to do what I wanted, in that if the right version of otherpackage was not available, my amended loadNamespace threw an error.

But then I started to think about how I could test it thoroughly - by which I mean not "does my code do what I think it should do?", but "does it achieve the outcome that motivated that coding in the first place?".

VERSION CHECKS ON otherpackage::otherfun AT EXECUTION TIME?
===========================================================

That led to the second part of my query - should / how could loadNamespace ensure that I actually get the otherfun from the version of otherpackage that has been specified in the mypackage DESCRIPTION file, when the otherpackage::otherfun code is actually executed?

My understanding is that the underlying intent of the namespace mechanism in R packages is to ensure that when mypackage calls otherpackage::otherfun, it is indeed otherpackage::otherfun I get, i.e. I do not get a different function, also called otherfun, that for one reason or another exists in memory and is found as R works its way up the chain of enclosing environments. Usually the concern is about an identically named but different `otherfun` from a `yetanother` package that has been loaded, or one that the user has defined themselves in their global environment.
But in the motivating example for my case, I wanted to ensure I got the otherfun from version >= 2.0 of `otherpackage`, and in particular, that I did **NOT** get `otherpackage(version 1.0)::otherfun`.

I confess I haven't actually tried it, but I think that even with my up-front checking of the package dependencies mentioned in the DESCRIPTION file I could probably get the 'wrong' outcome if I changed my .libPaths() between loading mypackage and executing it.

This isn't *quite* as unlikely as it might seem - in my real world example I had the official CRAN versions of mypackage (actually someone else's package!) and otherpackage installed in my main library, and was using a development library to explore changes to updated versions of mypackage and otherpackage - I would load mypackage while I had the development library in .libPaths(), and then, without thinking of all the implications, remove the development library from .libPaths() while doing some exploratory testing using (the loaded development version of) mypackage.

Anyway, this led to the 2nd part of my query - could / should loadNamespace ensure that at execution time, otherpackage::otherfun actually respects the version constraint specified in mypackage's DESCRIPTION file?

POSSIBLE ANSWER 1 - THIS IS TOO COMPLEX AND HYPOTHETICAL
========================================================

One thought was - this is all getting too complex and hypothetical, there is only so much automatic protection that R / loadNamespace can offer, in which case the answer to query part b) is no.

POSSIBLE ANSWER 2 - THIS IS AN ISSUE FOR THE `::` FUNCTION
==========================================================

Another thought was, this isn't loadNamespace's problem (it is doing what its name advertises - viz loading namespaces), rather it is something that the `::` function should look after. Looking at the code for `::`, it does not seem to have provision for specifying a version constraint for the pkg argument.
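As an illustration of what a version-aware lookup might look like, here is a minimal sketch. `get_exported_min` is a hypothetical helper, not part of R and not a proposal for how `::` should actually be changed; it simply checks the version of whichever namespace ends up loaded at the moment of the call, before fetching the export. The demonstration uses stats::median so that it runs without otherpackage.

```r
# The arguments accepted by the existing `::` function
print(names(formals(args(`::`))))

# Hypothetical helper (not part of R): return an exported object only if
# the namespace that is loaded at call time meets a minimum version.
get_exported_min <- function(pkg, name, min_version) {
  ns <- loadNamespace(pkg)  # returns the namespace if it is already loaded
  loaded <- package_version(as.character(getNamespaceVersion(ns)))
  if (loaded < package_version(min_version)) {
    stop(sprintf("%s %s is loaded, but >= %s is required",
                 pkg, as.character(loaded), min_version),
         call. = FALSE)
  }
  getExportedValue(pkg, name)
}

# Usage: stats is versioned with R itself, so ">= 2.0" is satisfied
med <- get_exported_min("stats", "median", "2.0")
print(med(c(1, 2, 3)))

# A deliberately impossible constraint fails at execution time
outcome <- try(get_exported_min("stats", "median", "999.0"), silent = TRUE)
print(inherits(outcome, "try-error"))
```

Under this sketch the minimum version is stated at the call site rather than in DESCRIPTION, and the check is made on the namespace actually in use when the function is executed.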
If this is the "correct" approach, the answer to part b) is again no, loadNamespace is behaving as designed - but the `::` function should be upgraded so it knows about package versioning. Under this "solution" the mypackage author would specify the otherpackage version in the same segment of code that calls otherfun.

POSSIBLE ANSWER 3 - loadNamespace SHOULD BE VERSION AWARE EVEN WHEN `::` IS USED
================================================================================

My most complicated possible answer was yes - loadNamespace should be "aware" of calls which use the otherpackage::otherfun syntax, and "enforce" any versioning given in the mypackage DESCRIPTION file, both at load time, by checking the version of otherpackage which is available then, and at execution time (by "somehow" storing a reference to the code for the correct package and version of otherpackage::otherfun in the imports:mypackage namespace).

The only "somehow"s I could dream up were messy and hackish - e.g. loadNamespace parses the mypackage code to find `otherpackage::otherfun` calls, loads the otherpackage::otherfun code, and inserts a special mangled reference name into the imports:mypackage namespace environment, AND `::` is changed to look for a mangled-name version of the package before it runs as it currently does.

And that feels so inelegant that I decided I must be approaching this the wrong way, or not understanding it properly, so I decided to stop exploring this myself (very instructional though that has been) and pose this query instead.