thr3ads.net - R help - [R] Convergence in Monte Carlo Simulation [Jun 2020]

Phat Chau

2020-Jun-14 12:57 UTC

[R] Convergence in Monte Carlo Simulation

Thank you Michael. 

I will clarify some more. The function in the first part of the code that I
posted generates the simulated dataset for a cluster randomized trial from the
simstudy package.

I am not quite clear what you mean by placing it outside the loop. The goal
here is to create n = 1000 independent datasets, each with different values for
all of the parameters (randomly drawn from the specified normal distributions,
not shown). In the past I have tried placing the seed at the very top of all my
code, but that leads to a single dataset being created and then repeated
n = 1000 times. Hence, there ends up being no variability in the data (or in
the power estimates from the p-values, given the stated and required power).
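
For reference, a single set.seed() call before the loop does not by itself make the datasets identical, because every later draw continues the same random number stream. Below is a minimal base R sketch of that behaviour. The names make_data() and run_sim() are hypothetical stand-ins for the simstudy-based generator and the simulation loop, and the seed value is arbitrary.

```r
# Hypothetical stand-in for the simstudy data generator: draws 5 normal values.
make_data <- function(n = 5) rnorm(n)

# Hypothetical wrapper: seed once, then generate several datasets in a loop.
run_sim <- function(seed, n_iter = 3) {
  set.seed(seed)  # set once, outside the loop
  lapply(seq_len(n_iter), function(k) make_data())
}

run1 <- run_sim(123)
run2 <- run_sim(123)

# Datasets within one run differ from each other ...
identical(run1[[1]], run1[[2]])
# ... but the whole run is reproducible from the single seed.
identical(run1, run2)
```

If every iteration still yields the same dataset, one thing worth checking is whether the seed is being reset inside the loop, or whether the model is being fitted to an object that was generated once before the loop started.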

Regarding the counter, is it correct in this instance that the loop will
continue until n = 1000 iterations have successfully converged? I am not so
concerned with counting failures.
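
On the counter itself: with i starting at 1 and the condition i < 1000, the loop as posted stops after 999 successful fits, not 1000. The following is a sketch of one way to count successes and failures separately. The function fit_once() is hypothetical and fails at random in place of the real simulate-and-fit step; the failure rate, the variable names, and the seed are all assumptions for illustration.

```r
set.seed(2020)  # once, before the loop

n_target <- 1000

# Hypothetical stand-in for "simulate a dataset and fit the model":
# errors about 10% of the time, otherwise returns a made-up p-value.
fit_once <- function() {
  if (runif(1) < 0.1) stop("false convergence")
  runif(1)
}

pvals <- numeric(n_target)  # preallocate instead of growing with c()
n_success <- 0
n_failure <- 0

while (n_success < n_target) {
  res <- tryCatch(fit_once(),
                  warning = function(w) NULL,
                  error   = function(e) NULL)
  if (is.null(res)) {
    n_failure <- n_failure + 1      # fit did not converge cleanly
  } else {
    n_success <- n_success + 1      # store the result of a successful fit
    pvals[n_success] <- res
  }
}

c(successes = n_success, failures = n_failure)
```

Counting from zero and testing n_success < n_target makes the stopping rule explicit, and the failure count comes for free if it is wanted later.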

Thank you.
Edward

On 2020-06-14, 6:46 AM, "Michael Dewey" <lists at dewey.myzen.co.uk> wrote:

    I am not 100% clear what your code is doing as it gets a bit mangled as 
    you posted in HTML but here are a couple of thoughts.
    
    You need to set the seed outside any loops so it happens once and for all.
    
    I would test after tryCatch and keep a separate count of failures and 
    successes as the failure to converge must be meaningful about the 
    scientific question whatever that is. At the moment your count appears 
    to be in the correct place to count successes.
    
    Michael
    
    On 14/06/2020 02:50, Phat Chau wrote:
    > Hello,
    > 
    > I put together the following code and am curious about its correctness.
    > My first question relates to the Monte Carlo simulations - the goal is to
    > continue to iterate until I get n = 1000 simulations where the model
    > successfully converges. I am wondering if I coded it correctly below with
    > the while loop. Is the idea that the counter increments by one only if
    > "model" does not return a string?
    > 
    > I would also like to know how I can create n = 1000 independent data
    > sets. I think to do this, I would have to set a random number seed via
    > set.seed() before the creation of each dataset. Where would I enter
    > set.seed in the syntax below? Would it be in the function (as indicated
    > in red)?
    > 
    > powercrosssw <- function(nclus, clsize) {
    > 
    >    set.seed()
    > 
    >    cohortsw <- genData(nclus, id = "cluster")
    >    cohortsw <- addColumns(clusterDef, cohortsw)
    >    cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster",
    >                             perName = "period")
    >    cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4,
    >                               lenWaves = 1, startPer = 1, grpName = "Ijt")
    > 
    >    pat <- genCluster(cohortswTm, cLevelVar = "timeID",
    >                      numIndsVar = clsize, level1ID = "id")
    > 
    >    dx <- merge(pat[, .(cluster, period, id)], cohortstep,
    >                by = c("cluster", "period"))
    >    dx <- addColumns(patError, dx)
    > 
    >    setkey(dx, id, cluster, period)
    > 
    >    dx <- addColumns(outDef, dx)
    > 
    >    return(dx)
    > 
    > }
    > 
    > i=1
    > 
    > while (i < 1000) {
    > 
    >    dx <- powercrosssw()
    > 
    >    #Fit multi-level model to simulated dataset
    >    model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx,
    >                           random = ~1|cluster, method = "REML"),
    >                       warning = function(w) { "warning" }
    >    )
    > 
    >    if (! is.character(model5)) {
    > 
    >      coeff <- coef(summary(model5))["factor(Ijt)1", "Value"]
    >      pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"]
    >      error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"]
    >      bresult <- c(bresult, coeff)
    >      presult <- c(presult, pvalue)
    >      eresult <- c(eresult, error)
    > 
    >      i <- i + 1
    >    }
    > }
    > 
    > Thank you so much.
    
    -- 
    Michael
    http://www.dewey.myzen.co.uk/home.html

Michael Dewey

2020-Jun-14 14:16 UTC

Dear Edward

Every time you call your function powercrosssw() it resets the seed so 
you must be calling it multiple times in some way.
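
To illustrate the point: when a function sets a fixed seed on every call, each call restarts the random number stream and returns the same values. A small base R sketch follows; gen_fixed() and gen_free() are hypothetical names, and 123 is an arbitrary seed.

```r
# Seeds inside the function: the stream is reset on every call.
gen_fixed <- function() {
  set.seed(123)
  rnorm(3)
}

# No seed inside the function: relies on a seed set once elsewhere.
gen_free <- function() rnorm(3)

a <- gen_fixed()
b <- gen_fixed()
identical(a, b)    # the two calls return the same values

set.seed(123)
c1 <- gen_free()
c2 <- gen_free()
identical(c1, c2)  # successive calls return different values
```

Note also that set.seed() called with no argument at all is an error, since the seed argument has no default.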

Michael

-- 
Michael
http://www.dewey.myzen.co.uk/home.html

Phat Chau

2020-Jun-15 12:19 UTC

Dear Michael, 

So I shouldn't be setting the seed at all then since it is automatic? Or is
the suggestion here that a new seed is chosen each time?

I think rather than having you guess at the problem (my apologies) I will post
the entire code (with omissions where it does not directly affect the problem
at hand). Sometimes I hesitate to post huge blocks because it can be a bit
daunting, but I realize that in coding even the smallest glitch can throw
everything off.

set.seed(123)  # <<< Placing the seed here leads to no variation at all in
               # my simulations, as noted previously

clusterDef <- defDataAdd(varname = "u_3", dist = "normal", formula = 0,
                         variance = 25.77)
patDef <- defDataAdd(varname = "u_2", dist = "normal", formula = 0,
                     variance = 120.62)
patError <- defDataAdd(varname = "error", dist = "normal", formula = 0,
                       variance = 38.35)

...(Data definition code omitted)

setkey(patTm, id, cluster, period)

#Define outcome y 
outDef <- defDataAdd(varname = "y",
                     formula = "17.87 + 5.0*Ijt - 5.42*I(period == 1) - 5.72*I(period == 2) - 7.03*I(period == 3) - 6.13*I(period == 4) - 9.13*I(period == 5) + u_3 + u_2 + error",
                     dist = "normal")

patTm <- addColumns(outDef, patTm)

powercrosssw <- function(nclus, clsize) {
	
	set.seed()  # <<< not sure if placing it in the function rather than at the
	            # top is appropriate to generate a new and independent dataset
	            # for each of the 1000 iterations

Regarding the convergence issue, it seems that what you are saying is I have it
all set up correctly (i.e. it will iterate until 1000 iterations converge). I do
get this rather peculiar error though in some cases:

Error in lme.formula(y ~ factor(period) + factor(Ijt), data = patTm, random = ~1 |  :
  nlminb problem, convergence error code = 1
  message = false convergence (8)

I am not quite sure what the problem is there.
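
That message comes from nlminb, the optimizer lme() uses by default, and it is raised as an error rather than a warning, so a tryCatch() with only a warning handler will not intercept it. Two things that are sometimes tried are adding an error handler and switching the optimizer with lmeControl(opt = "optim"). The sketch below uses the Orthodont data shipped with nlme, not the simulated trial data, so whether the optimizer switch helps in this particular case is an assumption.

```r
library(nlme)

# Fit a random-intercept model; Orthodont is only a stand-in dataset here.
fit <- tryCatch(
  lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject,
      method = "REML",
      control = lmeControl(opt = "optim")),  # use optim() instead of nlminb()
  warning = function(w) "warning",
  error   = function(e) "error"              # convergence errors land here
)

# Extract the estimate, standard error and p-value only if the fit succeeded.
if (!is.character(fit)) {
  coef(summary(fit))["age", c("Value", "Std.Error", "p-value")]
}
```

With both handlers in place, a failed fit returns a character string in either case, so the existing is.character() test keeps working and the loop moves on to the next simulated dataset.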

Edward


