Dear R-devel,

I am experiencing issues with running GAM models using mclapply: it fails to return any values if the data input becomes large. For example, here the code runs fine with a df of 100 rows, but fails at 1000.

library(mgcv)
library(parallel)

> df <- data.frame(
+   x = 1:100,
+   y = 1:100
+ )
>
> mclapply(1:2, function(i, df) {
+   fit <- gam(y ~ s(x, bs = "cs"), data = df)
+ },
+ df = df,
+ mc.cores = 2L
+ )
[[1]]

Family: gaussian
Link function: identity

Formula:
y ~ s(x, bs = "cs")

Estimated degrees of freedom:
9  total = 10

GCV score: 0

[[2]]

Family: gaussian
Link function: identity

Formula:
y ~ s(x, bs = "cs")

Estimated degrees of freedom:
9  total = 10

GCV score: 0

>
> df <- data.frame(
+   x = 1:1000,
+   y = 1:1000
+ )
>
> mclapply(1:2, function(i, df) {
+   fit <- gam(y ~ s(x, bs = "cs"), data = df)
+ },
+ df = df,
+ mc.cores = 2L
+ )
[[1]]
NULL

[[2]]
NULL

There is no error message returned, and the code runs perfectly fine in lapply.

I am on a MacBook 15 (2016) running macOS 10.14.6 (Mojave) and R version 3.6.2. This bug could not be reproduced on my Ubuntu 19.10 running R 3.6.1.

Kind regards,
Shian Su
----
Shian Su
PhD Student, Ritchie Lab 6W, Epigenetics and Development
Walter & Eliza Hall Institute of Medical Research
1G Royal Parade, Parkville VIC 3052, Australia
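Because the failure described above is silent, it can help to check the result list explicitly instead of relying on an error message. The following is only a minimal sketch: `failed_jobs` is a hypothetical helper (not part of parallel or mgcv) that reports which elements came back as NULL or as "try-error" objects. It is demonstrated on a hand-built list so that it runs anywhere.

```r
# Hypothetical helper: indices of jobs that returned NULL or a try-error.
failed_jobs <- function(res) {
  is_bad <- vapply(
    res,
    function(r) is.null(r) || inherits(r, "try-error"),
    logical(1)
  )
  which(is_bad)
}

# Hand-built stand-in for an mclapply() result: one good value,
# one NULL, and one captured error.
res <- list(1, NULL, try(stop("boom"), silent = TRUE))
failed_jobs(res)

# Applied to the example from the post (assumes mgcv and parallel are loaded):
# res <- mclapply(1:2, function(i, df) gam(y ~ s(x, bs = "cs"), data = df),
#                 df = df, mc.cores = 2L)
# failed_jobs(res)
```

Any indices reported this way could then be re-run with plain lapply, which the post says works correctly.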
Sorry, the code works perfectly fine for me in R even for 1e6 observations (but I was testing with R 4.0.0). Are you using some kind of GUI?

Cheers,
Simon

> On 28/04/2020, at 8:11 PM, Shian Su <su.s at wehi.edu.au> wrote:
>
> I am experiencing issues with running GAM models using mclapply, it fails to return any values if the data input becomes large. For example here the code runs fine with a df of 100 rows, but fails at 1000.
Yes, I am running on RStudio 1.2.5033. I was also running this code without error on Ubuntu in RStudio. Checking again in the terminal, it does indeed work fine even with large data.frames.

Any idea as to what interaction between RStudio and mclapply causes this?

Thanks,
Shian

On 28 Apr 2020, at 7:29 pm, Simon Urbanek <simon.urbanek at R-project.org> wrote:

> Sorry, the code works perfectly fine for me in R even for 1e6 observations (but I was testing with R 4.0.0). Are you using some kind of GUI?
>
> Cheers,
> Simon
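The thread does not establish the cause, but the parallel package documentation discourages using the fork-based functions (which mclapply relies on) in GUI or embedded environments. One possible workaround, sketched here under the assumption that forking inside RStudio is what goes wrong, is to use a socket (PSOCK) cluster, whose workers are fresh R processes rather than forks of the GUI session. The cluster size of 2 simply mirrors `mc.cores = 2L` from the original example.

```r
library(parallel)

df <- data.frame(
  x = 1:1000,
  y = 1:1000
)

# makeCluster() creates a PSOCK cluster by default: each worker is a
# separate R process started from scratch, not a fork of this session.
cl <- makeCluster(2L)

# Workers start with an empty search path, so load mgcv on each of them.
invisible(clusterEvalQ(cl, library(mgcv)))

# Same model as in the original post; df is shipped to the workers
# as an extra argument.
fits <- parLapply(cl, 1:2, function(i, df) {
  gam(y ~ s(x, bs = "cs"), data = df)
}, df = df)

stopCluster(cl)

# Each element should be a fitted "gam" object rather than NULL.
vapply(fits, inherits, logical(1), what = "gam")
```

The trade-off is that the data and packages must be sent to or loaded on each worker explicitly, which forking avoids, so this is slower to start but does not depend on sharing the GUI process.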