Martin Maechler
2016-Jul-20 10:52 UTC
[Rd] package installation fails when symlink of same name exists
>>>>> Jeroen Ooms <jeroenooms at gmail.com>
>>>>>     on Wed, 20 Jul 2016 10:26:19 +0200 writes:

> On Tue, Jul 19, 2016 at 6:46 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
>> R fails to install a package from source over a pre-existing package
>> when the path to that package is a symlink, rather than a directory.
>> ...
>> I don't think anyone's reported this being an issue before

> I ran into this as well a while back:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16725

I've now at least "acknowledged" that bug report,
and have looked into changing the is_subdir() function so it
returns TRUE in the case of a symlink [on those platforms where
Sys.readlink() "works", i.e., supposedly not on Windows; however,
that may be sufficient to close that bug report and also Kevin's
issue, right?]

However, Kevin, in his posting, continues

> I guess my wish here would be that R would check if any file already
> existed at the 'instdir' path, and if it existed and was a symlink, R
> would remove that symlink before install.

Are you sure?
I think ... and from what you mention below ("packrat") it would
rather be important to *keep* the symlink, and install to
wherever the symlink is pointing, no?

> It could happen before creating the directory, e.g. here:

> https://github.com/wch/r-source/blob/62f5acbdbdf36e1fc618510312125d1677d79941/src/library/tools/R/install.R#L277-L281

> One thing that was a bit surprising to me -- R does not remove a
> pre-existing package installation if it exists (when installing from
> source), it merely installs over it, so files / artifacts from a
> previous package installation could be left over after installing a
> new package. It seems this is not a problem in practice since I don't
> think anyone's reported this being an issue before, but for hygiene it
> seems like a pre-existing directory could / should be removed when
> installing a new package. (It appears that R does clear out a
> pre-existing directory when downloading and installing a package
> binary directly from CRAN.)

Well, at least with update.packages() it seems natural to me
that R would not just remove all previous parts there ..

> For motivation: I bumped into this when attempting to implement a
> package caching feature with packrat. A packrat project using a global
> cache will have a (private) R library containing symlinks to R package
> installations in a separate, global library. This allows projects to
> effectively be isolated from one another, while avoiding duplication
> of packages used across multiple projects.

Yes, I found this a nice feature when I heard about packrat.

But then, really R should *not* remove the symlink and create a
regular subdirectory in that library there!

> Unfortunately, some packrat
> users bump into this when attempting to update a package that has
> entered the cache (and so is a symlink in their R library).

> Thanks for your time,
> Kevin
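
[Editorial illustration] The symlink check discussed in this message could be sketched roughly as follows. This is not the code of R's internal is_subdir(); the helper name is_symlinked_dir() and the temporary "lib" and "cache" directories are assumptions made up for the example, and it presumes a platform where file.symlink() succeeds.

```r
# Hypothetical helper, not R's internal is_subdir(): TRUE when `path` is a
# symbolic link that resolves to an existing directory.
is_symlinked_dir <- function(path) {
  target <- Sys.readlink(path)
  # Sys.readlink() returns "" for a non-link and NA on error (e.g. missing path).
  !is.na(target) && nzchar(target) && dir.exists(path)
}

# Mimic a packrat-style layout: <lib>/pkg is a link into a separate cache.
lib   <- tempfile("lib")
cache <- tempfile("cache")
dir.create(lib)
dir.create(file.path(cache, "pkg"), recursive = TRUE)
linked <- file.symlink(file.path(cache, "pkg"), file.path(lib, "pkg"))

link_is_dir  <- is_symlinked_dir(file.path(lib, "pkg"))    # the symlink
plain_is_dir <- is_symlinked_dir(file.path(cache, "pkg"))  # a regular directory
```

Only the linked path should be reported as a symlinked directory; the cache directory itself is an ordinary directory and is left to the usual directory checks.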
Kevin Ushey
2016-Jul-20 17:19 UTC
[Rd] package installation fails when symlink of same name exists
On Wed, Jul 20, 2016 at 3:52 AM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
>>>>>> Jeroen Ooms <jeroenooms at gmail.com>
>>>>>>     on Wed, 20 Jul 2016 10:26:19 +0200 writes:
>
> > On Tue, Jul 19, 2016 at 6:46 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
> >> R fails to install a package from source over a pre-existing package
> >> when the path to that package is a symlink, rather than a directory.
> >> ...
> >> I don't think anyone's reported this being an issue before
>
> > I ran into this as well a while back:
> > https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16725
>
> I've now at least "acknowledged" that bug report,
> and have looked into changing the is_subdir() function so it
> returns TRUE in the case of a symlink [on those platforms where
> Sys.readlink() "works", i.e., supposedly not on Windows; however,
> that may be sufficient to close that bug report and also Kevin's
> issue, right?]
>
> However, Kevin, in his posting, continues
>
> > I guess my wish here would be that R would check if any file already
> > existed at the 'instdir' path, and if it existed and was a symlink, R
> > would remove that symlink before install.
>
> Are you sure?
> I think ... and from what you mention below ("packrat") it would
> rather be important to *keep* the symlink, and install to
> wherever the symlink is pointing, no?

For packrat's case at least, removing the symlink and installing to a
newly-created directory within the library would be fine -- later,
when a user wants to 'save the state' of their library, they would
call 'packrat::snapshot()', and that call would take care of moving
the newly-installed package to the cache and restoring the symlink as
required.

That said, installing within the symlinked directory would definitely
be nice :-) I just thought the request might be out of scope.

> > It could happen before creating the directory, e.g. here:
>
> > https://github.com/wch/r-source/blob/62f5acbdbdf36e1fc618510312125d1677d79941/src/library/tools/R/install.R#L277-L281
>
> > One thing that was a bit surprising to me -- R does not remove a
> > pre-existing package installation if it exists (when installing from
> > source), it merely installs over it, so files / artifacts from a
> > previous package installation could be left over after installing a
> > new package. It seems this is not a problem in practice since I don't
> > think anyone's reported this being an issue before, but for hygiene it
> > seems like a pre-existing directory could / should be removed when
> > installing a new package. (It appears that R does clear out a
> > pre-existing directory when downloading and installing a package
> > binary directly from CRAN.)
>
> Well, at least with update.packages() it seems natural to me
> that R would not just remove all previous parts there ..
>
> > For motivation: I bumped into this when attempting to implement a
> > package caching feature with packrat. A packrat project using a global
> > cache will have a (private) R library containing symlinks to R package
> > installations in a separate, global library. This allows projects to
> > effectively be isolated from one another, while avoiding duplication
> > of packages used across multiple projects.
>
> Yes, I found this a nice feature when I heard about packrat.
>
> But then, really R should *not* remove the symlink and create a
> regular subdirectory in that library there!

I agree this would be ideal, I just thought this request might be out
of scope, since the typical use case for R libraries is a
directory-of-directories, not a directory-of-symlinks-to-directories
(although packrat has had a lot of success with the second scenario!)

Thanks, Martin!

> > Unfortunately, some packrat
> > users bump into this when attempting to update a package that has
> > entered the cache (and so is a symlink in their R library).
>
> > Thanks for your time,
> > Kevin
>
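
[Editorial illustration] The "remove the symlink before install" behaviour described in this message might look something like the sketch below. It is not taken from R's install code or from packrat; the helper name replace_symlink_with_dir() and the temporary directory layout are hypothetical, and the example assumes a platform where file.symlink() works.

```r
# Hypothetical helper: if `instdir` is a symlink, remove the link itself and
# create an empty directory in its place; the link target is left untouched.
replace_symlink_with_dir <- function(instdir) {
  target <- Sys.readlink(instdir)
  if (is.na(target) || !nzchar(target)) return(FALSE)  # not a symlink
  file.remove(instdir)   # removes the link, not the directory it points to
  dir.create(instdir)
}

# Packrat-style layout: <lib>/pkg links to a cached installation.
lib   <- tempfile("lib")
cache <- tempfile("cache")
dir.create(lib)
dir.create(file.path(cache, "pkg"), recursive = TRUE)
writeLines("Package: pkg", file.path(cache, "pkg", "DESCRIPTION"))
file.symlink(file.path(cache, "pkg"), file.path(lib, "pkg"))

replaced <- replace_symlink_with_dir(file.path(lib, "pkg"))
```

After the call, <lib>/pkg is an ordinary empty directory ready to receive a fresh installation, while the cached copy still holds its files. Restoring the link afterwards is the step the message attributes to packrat::snapshot().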
Martin Maechler
2016-Jul-21 16:03 UTC
[Rd] package installation fails when symlink of same name exists
>>>>> Kevin Ushey <kevinushey at gmail.com>
>>>>>     on Wed, 20 Jul 2016 10:19:33 -0700 writes:

> On Wed, Jul 20, 2016 at 3:52 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
>>>>>>> Jeroen Ooms <jeroenooms at gmail.com>
>>>>>>>     on Wed, 20 Jul 2016 10:26:19 +0200 writes:
>>
>> > On Tue, Jul 19, 2016 at 6:46 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
>> >> R fails to install a package from source over a pre-existing package
>> >> when the path to that package is a symlink, rather than a directory.
>> >> ...
>> >> I don't think anyone's reported this being an issue before
>>
>> > I ran into this as well a while back:
>> > https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16725
>>
>> I've now at least "acknowledged" that bug report,
>> and have looked into changing the is_subdir() function so it
>> returns TRUE in the case of a symlink [on those platforms where
>> Sys.readlink() "works", i.e., supposedly not on Windows; however,
>> that may be sufficient to close that bug report and also Kevin's
>> issue, right?]
>>
>> However, Kevin, in his posting, continues
>>
>> > I guess my wish here would be that R would check if any file already
>> > existed at the 'instdir' path, and if it existed and was a symlink, R
>> > would remove that symlink before install.
>>
>> Are you sure?
>> I think ... and from what you mention below ("packrat") it would
>> rather be important to *keep* the symlink, and install to
>> wherever the symlink is pointing, no?

> For packrat's case at least, removing the symlink and installing to a
> newly-created directory within the library would be fine -- later,
> when a user wants to 'save the state' of their library, they would
> call 'packrat::snapshot()', and that call would take care of moving
> the newly-installed package to the cache and restoring the symlink as
> required.

> That said, installing within the symlinked directory would definitely
> be nice :-) I just thought the request might be out of scope.

>> > It could happen before creating the directory, e.g. here:
>>
>> > https://github.com/wch/r-source/blob/62f5acbdbdf36e1fc618510312125d1677d79941/src/library/tools/R/install.R#L277-L281
>>
>> > One thing that was a bit surprising to me -- R does not remove a
>> > pre-existing package installation if it exists (when installing from
>> > source), it merely installs over it, so files / artifacts from a
>> > previous package installation could be left over after installing a
>> > new package. It seems this is not a problem in practice since I don't
>> > think anyone's reported this being an issue before, but for hygiene it
>> > seems like a pre-existing directory could / should be removed when
>> > installing a new package. (It appears that R does clear out a
>> > pre-existing directory when downloading and installing a package
>> > binary directly from CRAN.)
>>
>> Well, at least with update.packages() it seems natural to me
>> that R would not just remove all previous parts there ..
>>
>> > For motivation: I bumped into this when attempting to implement a
>> > package caching feature with packrat. A packrat project using a global
>> > cache will have a (private) R library containing symlinks to R package
>> > installations in a separate, global library. This allows projects to
>> > effectively be isolated from one another, while avoiding duplication
>> > of packages used across multiple projects.
>>
>> Yes, I found this a nice feature when I heard about packrat.
>>
>> But then, really R should *not* remove the symlink and create a
>> regular subdirectory in that library there!

> I agree this would be ideal, I just thought this request might be out
> of scope, since the typical use case for R libraries is a
> directory-of-directories, not a directory-of-symlinks-to-directories
> (although packrat has had a lot of success with the second scenario!)

> Thanks, Martin!

You are welcome.

I have committed a change (svn rev 70955) which no longer "errors out"
on symlinks {{and the same change improves debugging: you can turn off
the "dreaded" q(), and that's done by default if(interactive())}}.

However, that change indeed was mainly to is_subdir(), and indeed
the code later *does* replace the package-name symlink by a newly
created directory <lib>/<package> rather than leaving the symlink,
where I continue to find the latter *the* correct action, but that
would need changes in other places of the code.
[tested (and "minimal") patches are welcome for that other goal ..]

Martin

>> > Unfortunately, some packrat
>> > users bump into this when attempting to update a package that has
>> > entered the cache (and so is a symlink in their R library).
>>
>> > Thanks for your time,
>> > Kevin
>>
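
[Editorial illustration] The alternative favoured in this message, keeping the symlink and writing into the directory it points to, could be approximated as below. This is only a sketch of the idea, not the change committed in svn rev 70955 and not a patch for R's install code; resolve_instdir() is a hypothetical helper, and writing a single DESCRIPTION file stands in for a real installation.

```r
# Hypothetical helper: the directory a package would really be written to if
# <lib>/<pkg> is kept as a symlink rather than replaced.
resolve_instdir <- function(lib, pkg) {
  instdir <- file.path(lib, pkg)
  # normalizePath() resolves symbolic links on Unix-alikes.
  if (file.exists(instdir)) normalizePath(instdir) else instdir
}

# Packrat-style layout: <lib>/pkg links to a cached installation.
lib   <- tempfile("lib")
cache <- tempfile("cache")
dir.create(lib)
dir.create(file.path(cache, "pkg"), recursive = TRUE)
file.symlink(file.path(cache, "pkg"), file.path(lib, "pkg"))

# Stand-in for an installation step: write one file via the resolved path.
target <- resolve_instdir(lib, "pkg")
writeLines("Package: pkg", file.path(target, "DESCRIPTION"))
```

With this approach the file ends up in the cache directory and <lib>/pkg remains a symlink, which matches the outcome described above as the preferred one.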