On Mon, 7 Jan 2019 at 22:09, Gergely Daróczi <daroczig at rapporter.net> wrote:
>
> Dear David, sharing some related (subjective) thoughts below.
>
> You can provide your app as a Docker image, so that the end-user
> simply calls a "docker pull" and then "docker run" -- that can be done
> from a user-friendly script as well.
> Of course, this requires Docker to be installed, but if that's a
> problem, probably better to "ship" the app as a web application and
> share a URL with the user, eg backed by shinyproxy.io

If Docker is a problem, you can also try podman: same usage, compatible
with Dockerfiles and daemon-less, no admin rights required.

https://podman.io/

Iñaki
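For illustration, the "user-friendly script" mentioned above could itself
be written in R. A minimal sketch, assuming Docker (or podman) is on the
PATH and using the hypothetical image name "example/myapp":

    #!/usr/bin/env Rscript
    # Pull and run a Docker image from R; "example/myapp" is a
    # placeholder image name used only for illustration.
    image <- "example/myapp"

    # Pull the image; system2() returns the command's exit status.
    if (system2("docker", c("pull", image)) != 0L)
      stop("docker pull failed; is Docker installed and running?")

    # Run the container, forwarding this script's own arguments.
    system2("docker", c("run", "--rm", image,
                        commandArgs(trailingOnly = TRUE)))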
Belated thanks to all who replied to my initial query. In summary, three
approaches have been mentioned to run R code "in production": 1)
ShinyProxy, mentioned by Tobias, for deploying Shiny applications; 2)
Docker-like solutions, mentioned by Gergely and Iñaki; and 3) solutions
based on Rscript or littler, mentioned by Dirk.

I can't speak to 1) because I don't currently use Shiny. And it seems to
me that Docker-like solutions will still need some "point of entry" for
the R application, which will have to be Rscript or littler.

In my first email, I observed that Rscript expects a single expression or
a single script, which is probably why (in my experience) many data
scientists tend to provide their code in a very limited number of files.
Gergely disagreed, arguing to the contrary that data scientists are
encouraged to provide their application as an R package called by a short
script executed by Rscript (a sketch of such a launcher follows this
message). But this doesn't happen where I work, for several reasons:

- it implies installing your package on the production machine(s),
  including its dependencies, which must be done by hand
- some machine learning platforms will simply not accept code provided
  as an R package
- we have some "big data" use cases for which we need Spark; Spark can
  run R or Python code, but only when it is provided as a single file.
  (On the other hand, Spark can run applications provided as JAR files.)

In summary, I'm convinced R would benefit from something similar to
Java's `Main-Class` manifest header or Python's `__main__` idiom. A new
R CMD command would take a package, install its dependencies, and run
its "main" function. If we had this machinery available, we could even
consider reaching out to Spark (and other tech stacks) developers and
make it easier to develop R applications for those platforms.

A candid comment from Dirk suggested that I should implement this
myself, which I would be happy to do, provided this is the normal
procedure. Or is there a more formal process I should follow?

Kind regards,

David Lindelöf
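For concreteness, the launcher-script pattern described above (all logic
in a package, started by a short Rscript entry point) can be just a few
lines. A minimal sketch, assuming a hypothetical package "myapp" that
exports a main() function:

    #!/usr/bin/env Rscript
    # Thin entry point: all real logic lives in the (hypothetical)
    # 'myapp' package, which is assumed to export main(args).
    myapp::main(commandArgs(trailingOnly = TRUE))

The package itself is then installed, tested and versioned like any
other R package, and the production machine only needs this one small
file as its point of entry.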
On 31/01/2019 9:32 a.m., David Lindelöf wrote:
> [...]
>
> In summary, I'm convinced R would benefit from something similar to
> Java's `Main-Class` manifest header or Python's `__main__` idiom. A
> new R CMD command would take a package, install its dependencies, and
> run its "main" function. If we had this machinery available, we could
> even consider reaching out to Spark (and other tech stacks) developers
> and make it easier to develop R applications for those platforms.
>
> A candid comment from Dirk suggested that I should implement this
> myself, which I would be happy to do, provided this is the normal
> procedure. Or is there a more formal process I should follow?

You can't implement it to run under R CMD, but it should be
straightforward to put this in an R package, to be run by Rscript using
something like

    Rscript -e "yourpackage::run_main('somepackage')"

You can use the installation code from the `remotes` package, so
run_main() could be a pretty simple function.

Duncan Murdoch
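A rough sketch of what run_main() could look like, following Duncan's
outline. The main() convention and the choice of remotes::install_cran()
are assumptions for illustration; remotes also offers install_local()
and install_github() for packages from other sources:

    # Hypothetical helper: install a package (and its dependencies)
    # if it is not already available, then call its exported main().
    run_main <- function(pkg, args = character()) {
      if (!requireNamespace(pkg, quietly = TRUE)) {
        remotes::install_cran(pkg)
      }
      # Assumed convention: the package exports a main(args) function.
      getExportedValue(pkg, "main")(args)
    }

which would then be invoked as in Duncan's example:

    Rscript -e "yourpackage::run_main('somepackage')"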