I could do this...and I have before. This brings up a more fundamental question, though. You're asking me to write code that changes the logic of the installation process (i.e. to write my own package installer). Instead of doing that, I would rather integrate that logic into R itself to improve the baseline installation process. This API change would be additive and would not break legacy code.

Package managers like pip (Python), conda (Python), yum (CentOS), apt (Ubuntu), and apk (Alpine) are all "smart" enough to know (by default) when not to download a package again. By proposing this change, I'm essentially asking that R follow some of the same conventions and best practices that other package managers have adopted over the decades.

I assumed this list is used to discuss proposals like this to the R codebase. If I'm on the wrong list, please let me know.

P.S. If this change happened, it would be interesting to study the effect it has on bandwidth across all CRAN mirrors. A significant drop would turn into actual $$ saved.

Josh Bradley

On Fri, Nov 8, 2019 at 5:00 AM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

> On 08/11/2019 2:06 a.m., Joshua Bradley wrote:
> > Hello,
> >
> > Currently if you install a package twice:
> >
> > install.packages("testit")
> > install.packages("testit")
> >
> > R will build the package from source (depending on what OS you're using)
> > twice by default. This becomes especially burdensome when people are using
> > big packages (i.e. lots of depends) and someone has a script with:
> >
> > install.packages("tidyverse")
> > ...
> > ... later on down the script
> > ...
> > install.packages("dplyr")
> >
> > In this case, "dplyr" is part of the tidyverse and will install twice. As
> > the primary "package manager" for R, it should not install a package twice
> > (by default) when it can be so easily checked. Indeed, many people resort
> > to writing a few lines of code to filter out already-installed packages. An
> > r-help post from 2010 proposed a solution to improving the default
> > behavior, by adding "force=FALSE" as an API addition to install.packages
> > (https://stat.ethz.ch/pipermail/r-help/2010-May/239492.html).
> >
> > Would the R-core devs still consider this proposal?
>
> Whether or not they'd do it, it's easy for you to do it.
>
> install.packages <- function(pkgs, ..., force = FALSE) {
>   if (!force) {
>     pkgs <- Filter(Negate(requireNamespace), pkgs)
>   }
>
>   utils::install.packages(pkgs, ...)
> }
>
> You might want to make this more elaborate, e.g. doing update.packages()
> on the ones that exist. But really, isn't the problem with the script
> you're using, which could have done a simple test before forcing a slow
> install?
>
> Duncan Murdoch
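For readers who want to try the wrapper idea quoted above, here is a minimal sketch of one possible variant. It is not part of R and not what anyone in this thread proposed verbatim: the names install_missing and installer are hypothetical, and the sketch uses find.package() instead of requireNamespace() so that checking for a package does not load it. It only tests whether a package is present, not which version or where it came from.

```r
# Hypothetical helper (not part of R): skip packages that are already installed.
# 'installer' is an illustrative parameter so the logic can be exercised
# without touching the network; by default it is utils::install.packages.
install_missing <- function(pkgs, ..., force = FALSE,
                            installer = utils::install.packages) {
  if (!force) {
    # find.package(quiet = TRUE) returns character(0) for a package that
    # cannot be found, and does not load the package's namespace.
    is_installed <- vapply(
      pkgs,
      function(p) length(find.package(p, quiet = TRUE)) > 0L,
      logical(1)
    )
    pkgs <- pkgs[!is_installed]
  }
  # Only call the installer when something is left to install.
  if (length(pkgs) > 0L) installer(pkgs, ...)
  invisible(pkgs)
}
```

With force = TRUE the helper passes everything through unchanged, which matches the current behaviour of install.packages().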
While developing a package, I often run install.packages() on it many times in a session without updating its version number. How would your proposed change affect this workflow?

Bill Dunlap
TIBCO Software
wdunlap tibco.com
Since we are on this topic, another area for improvement is when install.packages() downloads hundreds of packages only to realize later that many of them fail to install because one of the packages they depend on (directly or indirectly) failed to install.

Cheers,
H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319
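To make the dependency-failure point concrete, here is a small sketch of how an installer could work out which packages are doomed once one package has failed, so they need not be attempted. This is an illustration only, not how install.packages() works today. The function name blocked_by_failure, the deps list format, and the package names A to D are all made up for the example.

```r
# deps: named list; deps[[pkg]] holds the packages that pkg depends on directly.
# Returns every package that depends, directly or indirectly, on a failed one.
blocked_by_failure <- function(deps, failed) {
  blocked <- character(0)
  frontier <- failed
  while (length(frontier) > 0L) {
    # Packages with at least one direct dependency in the current frontier.
    hit <- vapply(deps, function(d) any(d %in% frontier), logical(1))
    # Keep only packages not seen before, so cycles cannot loop forever.
    newly <- setdiff(names(deps)[hit], c(blocked, failed))
    blocked <- c(blocked, newly)
    frontier <- newly
  }
  blocked
}

# Hypothetical dependency graph: C depends on B, B depends on A, D is independent.
deps <- list(
  A = character(0),
  B = "A",
  C = "B",
  D = character(0)
)
blocked_by_failure(deps, failed = "A")
```

In this toy graph a failure of A blocks B and C but leaves D installable.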
I guess you would just use force=TRUE.

H.

On 11/8/19 12:06, William Dunlap via R-devel wrote:
> While developing a package, I often run install.packages() on it many times
> in a session without updating its version number. How would your proposed
> change affect this workflow?
Exactly. With every major commit, one wants to check that the package works.

Also, besides package development, there are other reasons why one would install packages over themselves. For example, rebuilding from source after changing options in Makevars[.win]. The package hasn't been updated, but recompilation is desired.

Avi

On Fri, Nov 8, 2019 at 3:07 PM William Dunlap via R-devel <r-devel at r-project.org> wrote:
> While developing a package, I often run install.packages() on it many times
> in a session without updating its version number. How would your proposed
> change affect this workflow?
Hi Josh,

There are a few issues I can think of with this. The primary one is that CRAN(/Bioconductor) is not the only place one can install packages from. I might have version x.y.z of a package installed that was, at the time, a development version I got from GitHub, or installed locally, etc. Hell, I might have a later devel version but want the CRAN version. Not common, sure, but it will likely happen often enough that install.packages not doing that for me when I tell it to is probably bad.

Currently (though there has been some discussion of changing this) packages do not remember where they were installed from, so R wouldn't know whether the version you have is actually the same one as on the repository you pointed install.packages to. If that were changed and we knew that we were getting the byte-identical package from the actual same source, I think this would be a nice addition, though without it I think it would be right a high, but not high enough, proportion of the time.

> R will build the package from source (depending on what OS you're using)
> twice by default. This becomes especially burdensome when people are using
> big packages (i.e. lots of depends) and someone has a script with:
>
> install.packages("tidyverse")
> ...
> ... later on down the script
> ...
> install.packages("dplyr")

I mean, IMHO and as I think Duncan was alluding to, that's straight up an error by the script author. I think it's a few of them, actually, but it's at least one. An understandable one, sure, but that's still what it is. Scripts (which are meant to be run more than once, generally) usually shouldn't really be calling install.packages in the first place, but if they do, they should certainly not be installing umbrella packages and the packages they bring with them separately.

Even having one vectorized call to install.packages where all the packages are installed would prevent this issue, including in the case where the user doesn't understand the purpose of the tidyverse package. Though the installation would still occur every time the script was run.

The last thing to note is that there are at least 2 packages which provide a function which does this already (install.load and remotes), so people can get this functionality if they need it.

On Fri, Nov 8, 2019 at 11:56 AM Joshua Bradley <jgbradley1 at gmail.com> wrote:
> I assumed this list is used to discuss proposals like this to the R
> codebase. If I'm on the wrong list, please let me know.

This is the right place to discuss things like this. Thanks for starting the conversation.

Best,
~G
On 08/11/2019 2:55 p.m., Joshua Bradley wrote:
> I could do this...and I have before. This brings up a more fundamental
> question though. You're asking me to write code that changes the logic of
> the installation process (i.e. writing my own package installer). Instead
> of doing that, I would rather integrate that logic into R itself to improve
> the baseline installation process. This api proposal change would be
> additive and would not break legacy code.

That's not true. The current behaviour is equivalent to force=TRUE; I believe the proposal was to change the default to force=FALSE.

If you didn't change the default, it wouldn't help your example: the badly written script would run with force=TRUE, and wouldn't benefit at all.

Duncan Murdoch
Suppose update.packages("pkg") installed "pkg" if it were not already installed, in addition to its current behavior of installing "pkg" if "pkg" is installed but a newer version is available. The OP could then use update.packages() all the time instead of install.packages() the first time and update.packages() subsequent times.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
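The "install if missing, update if outdated" behaviour suggested above can be sketched as a small decision function. This is only an illustration of the idea, not an existing R function: plan_install is a hypothetical name, and the two version vectors are assumed inputs. In a real session they might be taken from installed.packages()[, "Version"] and available.packages()[, "Version"], which this sketch deliberately does not call.

```r
# installed, available: named character vectors of version strings,
# with package names as names (hypothetical inputs for illustration).
# Returns, for each requested package, one of "install", "update" or "skip".
plan_install <- function(pkgs, installed, available) {
  vapply(pkgs, function(p) {
    # Not installed at all: needs a fresh install.
    if (!p %in% names(installed)) return("install")
    # Installed, but the repository offers a strictly newer version.
    if (p %in% names(available) &&
        package_version(available[[p]]) > package_version(installed[[p]])) {
      return("update")
    }
    # Installed and up to date (or unknown to the repository).
    "skip"
  }, character(1))
}

# Made-up package names and versions.
installed <- c(pkgA = "1.0.0", pkgB = "2.1.0")
available <- c(pkgA = "1.2.0", pkgB = "2.1.0", pkgC = "0.3.0")
plan_install(c("pkgA", "pkgB", "pkgC"), installed, available)
```

Note that a version comparison alone cannot tell a CRAN build from a development build carrying the same version number, which is the concern raised earlier in the thread.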
Hi Gabe,

Keeping track of where a package was installed from would be a nice feature. However, it wouldn't be as reliable as comparing hashes to decide whether a package needs re-installation or not.

H.
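As a rough illustration of the hash comparison idea, the sketch below uses tools::md5sum() to decide whether a candidate file differs from a previously recorded hash. It assumes, hypothetically, that a hash of the installed package's source file was recorded at install time. The thread does not say that R records this, and needs_reinstall and recorded_md5 are made-up names.

```r
# Hypothetical check: reinstall only when the candidate file's MD5 hash
# differs from the hash recorded at the previous installation.
needs_reinstall <- function(candidate_file, recorded_md5) {
  # tools::md5sum() returns NA for a file that cannot be read.
  current <- unname(tools::md5sum(candidate_file))
  is.na(current) || !identical(current, recorded_md5)
}

# Demo with a temporary file standing in for a package tarball.
f <- tempfile()
writeLines("package contents v1", f)
recorded <- unname(tools::md5sum(f))
needs_reinstall(f, recorded)
```

A hash check of this kind would also distinguish two builds that share a version number, which a version comparison cannot do.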