A more general way is to change the environment of your formula to
a child of its original environment and add variables like 'weights' or
'subset' to the child environment. Since you change the environment
inside a function call it won't affect the formula outside of the function
call.
E.g.
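# assumes the cluster setup from the quoted message below
# (library(doParallel), makeCluster, registerDoParallel)
fmla <- as.formula("y ~ .")
models <- foreach(d=1:10, .combine=rbind, .errorhandling='remove') %dopar% {
    datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
    # child of the formula's original environment; holds the extra variables
    localEnvir <- new.env(parent=environment(fmla))
    # only the copy of fmla local to this iteration is changed
    environment(fmla) <- localEnvir
    localEnvir$weights <- rep(c(1,2), 50)
    mod <- lm(fmla, data=datdf, weights=weights)
    return(mod$coef)
}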
models
# (Intercept) x
#result.1 -0.16910860 1.0022022
#result.2 0.03326814 0.9968325
#result.3 -0.08177174 1.0022907
#...
environment(fmla)
#<environment: R_GlobalEnv>
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Oct 7, 2016 at 7:44 AM, Bos, Roger <roger.bos at rothschild.com>
wrote:
> All,
>
> I figured out how to get it to work, so I am posting the solution in case
> anyone is interested. I had to use attr to set the weights as an attribute
> of the data object for the linear model. Seems convoluted, but anytime I
> tried to pass a named vector as the weights the foreach loop could not find
> the variable, even if I tried exporting it. If anybody knows of a better
> way please let me know as this does not seem ideal to me, but it works.
>
> library(doParallel)
> cl <- makeCluster(4)
> registerDoParallel(cl)
> fmla <- as.formula("y ~ .")
> models <- foreach(d=1:10, .combine=rbind, .errorhandling='pass') %dopar% {
>     datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
>     attr(datdf, "weights") <- rep(c(1,2), 50)
>     mod <- lm(fmla, data=datdf, weights=attr(data, "weights"))
>     return(mod$coef)
> }
> models
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Bos, Roger
> Sent: Friday, October 07, 2016 9:25 AM
> To: R-help
> Subject: [R] weighted regression inside FOREACH loop
>
> I have a foreach loop that runs regressions in parallel and works fine,
> but when I try to add the weights parameter to the regression the
> coefficients don't get stored in the 'models' variable like they are
> supposed to. Below is my reproducible example:
>
> library(doParallel)
> cl <- makeCluster(4)
> registerDoParallel(cl)
> fmla <- as.formula("y ~ .")
> models <- foreach(d=1:10, .combine=rbind, .errorhandling='remove') %dopar% {
>     datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
>     weights <- rep(c(1,2), 50)
>     mod <- lm(fmla, data=datdf, weights=weights)
>     #mod <- lm(fmla, data=datdf)
>     return(mod$coef)
> }
> models
>
> You can change the commenting on the two 'mod <-' lines to see that the
> non-weighted one works and the weighted regression doesn't work. I tried
> using .export="weights" in the foreach line, but R says that weights is
> already being exported.
>
> Thanks in advance for any suggestions.
>
> ***************************************************************
> This message and any attachments are for the intended recipient's use only.
> This message may contain confidential, proprietary or legally privileged
> information. No right to confidential or privileged treatment of this
> message is waived or lost by an error in transmission.
> If you have received this message in error, please immediately notify the
> sender by e-mail, delete the message, any attachments and all copies from
> your system and destroy any hard copies. You must not, directly or
> indirectly, use, disclose, distribute, print or copy any part of this
> message or any attachments if you are not the intended recipient.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Using the temporary child environment works because model.frame, hence lm,
looks for the variables used in the formula, subset, and weights arguments
first in the data argument and then, if the data argument is not an
environment, in the environment of the formula argument.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Oct 7, 2016 at 8:18 AM, William Dunlap <wdunlap at tibco.com> wrote:
> A more general way is to change the environment of your formula to
> a child of its original environment and add variables like 'weights' or
> 'subset' to the child environment. [...]
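The lookup rule Bill describes (the data argument first, then the environment of the formula) can be checked without foreach. What follows is a minimal sketch in base R, not code from the thread: the names make_data, fit_plain, fit_child and w are made up for illustration, and the formula is written as y ~ x instead of y ~ . to keep the example small.

```r
set.seed(1)
fmla <- y ~ x                      # its environment is wherever this line runs
env_before <- environment(fmla)

# hypothetical helper: same simulated data as in the thread
make_data <- function() {
    data.frame(y = 1:100 + 2 * rnorm(100), x = 1:100 + rnorm(100))
}

# 'w' exists only inside this function call. It is neither a column of
# datdf nor visible from environment(f), so model.frame cannot find it.
fit_plain <- function(f) {
    datdf <- make_data()
    w <- rep(c(1, 2), 50)
    tryCatch(coef(lm(f, data = datdf, weights = w)),
             error = function(e) conditionMessage(e))
}

# Same call, but 'w' is placed in a child of the formula's environment,
# which is the second place model.frame looks.
fit_child <- function(f) {
    datdf <- make_data()
    env <- new.env(parent = environment(f))
    env$w <- rep(c(1, 2), 50)
    environment(f) <- env          # changes only the local copy 'f'
    coef(lm(f, data = datdf, weights = w))
}

res_plain <- fit_plain(fmla)       # an error message (character string)
res_child <- fit_child(fmla)       # intercept and slope of the weighted fit
```

After both calls, environment(fmla) is still the environment it started with, because each function modified only its own copy of the formula.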
Bill,
Thanks for your help. Not that I ever doubted you, but I tried your method on
my actual data and I can confirm it does work. I guess I am still wondering why
using .export in foreach doesn't allow the variable to be found, as that method
would seem to be the most straightforward.
Thanks again for your help!
Roger
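On the .export question, one way to see what may be going on is to mimic the situation in base R. This sketch assumes, without having checked foreach's internals, that the loop body (and anything passed via .export) lives in an evaluation environment that is separate from the formula's environment. The name body_env is a hypothetical stand-in for that environment, and the formula is again y ~ x for brevity.

```r
set.seed(1)
fmla <- y ~ x
# stand-in for the environment in which a loop body is evaluated
body_env <- new.env(parent = environment(fmla))

res <- local({
    datdf <- data.frame(y = 1:100 + 2 * rnorm(100), x = 1:100 + rnorm(100))
    weights <- rep(c(1, 2), 50)    # visible in body_env only
    # model.frame looks in datdf, then in environment(fmla); body_env is
    # not on that path, so the numeric vector above is never seen
    tryCatch(coef(lm(fmla, data = datdf, weights = weights)),
             error = function(e) conditionMessage(e))
}, envir = body_env)

res                                # an error message, not coefficients
```

Because the variable is called weights, the lookup most likely does not stop at "not found" but picks up the stats function of the same name, which lm then rejects. With .errorhandling='remove' every iteration would be dropped silently, which would match the empty result in the original post.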
From: William Dunlap [mailto:wdunlap at tibco.com]
Sent: Friday, October 07, 2016 11:57 AM
To: Bos, Roger
Cc: R-help
Subject: Re: [R] weighted regression inside FOREACH loop
Using the temporary child environment works because model.frame, hence lm, looks
for the variables used in the formula, subset, and weights arguments first in
the data argument and then, if the data argument is not an environment, in the
environment of the formula argument.