Thank you Michael. I will clarify some more. The function in the first part of the code that I posted generates the simulated dataset for a cluster randomized trial from the simstudy package. I am not quite clear what you mean by placing it outside the loop. So the goal here is to create n = 1000 independent datasets with different (randomly drawn values from the specified normal distributions not shown) for all of the parameters. What I have tried to do is place the seed at the very top of all my code in the past, but what that does is it leads to the creation of a single dataset that gets repeated over and over n = 1000 times. Hence, there ends up being no variability in the data (and power estimates from the p-values given the stated and required power). Regarding the counter, is it correct in this instance that the loop will continue until n = 1000 iterations have successfully converged? I am not so concerned with counting failures. Thank you. Edward ?On 2020-06-14, 6:46 AM, "Michael Dewey" <lists at dewey.myzen.co.uk> wrote: I am not 100% clear what your code is doing as it gets a bit wangled as you posted in HTML but here are a couple of thoughts. You need to set the seed outside any loops so it happens once and for all. I would test after trycatch and keep a separate count of failures and successes as the failure to converge must be meaningful about the scientific question whatever that is. At the moment your count appears to be in the correct place to count successes. Michael On 14/06/2020 02:50, Phat Chau wrote: > Hello, > > I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations ? the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if ?model? does not return a string? > > I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)? > > powercrosssw <- function(nclus, clsize) { > > set.seed() > > cohortsw <- genData(nclus, id = "cluster") > cohortsw <- addColumns(clusterDef, cohortsw) > cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period") > cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt") > > pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id") > > dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period")) > dx <- addColumns(patError, dx) > > setkey(dx, id, cluster, period) > > dx <- addColumns(outDef, dx) > > return(dx) > > } > > i=1 > > while (i < 1000) { > > dx <- powercrosssw() > > #Fit multi-level model to simulated dataset > model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"), > warning = function(w) { "warning" } > ) > > if (! is.character(model5)) { > > coeff <- coef(summary(model5))["factor(Ijt)1", "Value"] > pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"] > error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"] > bresult <- c(bresult, coeff) > presult <- c(presult, pvalue) > eresult <- c(eresult, error) > > i <- i + 1 > } > } > > Thank you so much. > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > -- Michael http://www.dewey.myzen.co.uk/home.html
Dear Edward Every time you call your function powercrosssw() it resets the seed so you must be calling it multiple times in some way. Michael On 14/06/2020 13:57, Phat Chau wrote:> Thank you Michael. > > I will clarify some more. The function in the first part of the code that I posted generates the simulated dataset for a cluster randomized trial from the simstudy package. > > I am not quite clear what you mean by placing it outside the loop. So the goal here is to create n = 1000 independent datasets with different (randomly drawn values from the specified normal distributions not shown) for all of the parameters. What I have tried to do is place the seed at the very top of all my code in the past, but what that does is it leads to the creation of a single dataset that gets repeated over and over n = 1000 times. Hence, there ends up being no variability in the data (and power estimates from the p-values given the stated and required power). > > Regarding the counter, is it correct in this instance that the loop will continue until n = 1000 iterations have successfully converged? I am not so concerned with counting failures. > > Thank you. > Edward > > ?On 2020-06-14, 6:46 AM, "Michael Dewey" <lists at dewey.myzen.co.uk> wrote: > > I am not 100% clear what your code is doing as it gets a bit wangled as > you posted in HTML but here are a couple of thoughts. > > You need to set the seed outside any loops so it happens once and for all. > > I would test after trycatch and keep a separate count of failures and > successes as the failure to converge must be meaningful about the > scientific question whatever that is. At the moment your count appears > to be in the correct place to count successes. > > Michael > > On 14/06/2020 02:50, Phat Chau wrote: > > Hello, > > > > I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations ? the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if ?model? does not return a string? > > > > I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)? > > > > powercrosssw <- function(nclus, clsize) { > > > > set.seed() > > > > cohortsw <- genData(nclus, id = "cluster") > > cohortsw <- addColumns(clusterDef, cohortsw) > > cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period") > > cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt") > > > > pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id") > > > > dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period")) > > dx <- addColumns(patError, dx) > > > > setkey(dx, id, cluster, period) > > > > dx <- addColumns(outDef, dx) > > > > return(dx) > > > > } > > > > i=1 > > > > while (i < 1000) { > > > > dx <- powercrosssw() > > > > #Fit multi-level model to simulated dataset > > model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"), > > warning = function(w) { "warning" } > > ) > > > > if (! is.character(model5)) { > > > > coeff <- coef(summary(model5))["factor(Ijt)1", "Value"] > > pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"] > > error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"] > > bresult <- c(bresult, coeff) > > presult <- c(presult, pvalue) > > eresult <- c(eresult, error) > > > > i <- i + 1 > > } > > } > > > > Thank you so much. > > > > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Michael > http://www.dewey.myzen.co.uk/home.html > > > >-- Michael http://www.dewey.myzen.co.uk/home.html
Dear Michael, So I shouldn't be setting the seed at all then since it is automatic? Or is the suggestion here that a new seed is chosen each time? I think rather than having you guess at the problem (my apologies) I will post the entire of the code (with omissions where it is not directly impacting the problem at hand). Sometimes I hesitate to post huge blocks because it can be a bit daunting, but I realize in coding that even the smallest glitch can throw everything off. Set.seed(123) <<<Placing the seed here leads to no variation at all in my simulations as noted previously clusterDef <- defDataAdd(varname = "u_3", dist = "normal", formula = 0, variance = 25.77) patDef <- defDataAdd(varname = "u_2", dist = "normal", formula = 0, variance = 120.62) patError <- defDataAdd(varname = "error", dist = "normal", formula = 0, variance = 38.35) ...(Data definition code omitted) setkey(patTm, id, cluster, period) #Define outcome y outDef <- defDataAdd(varname = "y", formula = "17.87 + 5.0*Ijt - 5.42*I(period == 1) - 5.72*I(period == 2) - 7.03*I(period == 3) - 6.13*I(period == 4) - 9.13*I(period == 5) + u_3 + u_2 + error", dist = "normal") patTm <- addColumns(outDef, patTm) powercrosssw <- function(nclus, clsize) { set.seed() < not sure if placing it the function rather than at the top is appropriate to generate a new and independent dataset for each of the 1000 iterations Regarding the convergence issue, it seems that what you are saying is I have it all set up correctly (i.e. it will iterate until 1000 iterations converge). I do get this rather peculiar error though in some cases: Error in lme.formula(y ~ factor(period) + factor(Ijt), data = patTm, random = ~1 | : nlminb problem, convergence error code = 1 message = false convergence (8) I am not quite sure what the problem is there. Edward ?On 2020-06-14, 10:16 AM, "Michael Dewey" <lists at dewey.myzen.co.uk> wrote: Dear Edward Every time you call your function powercrosssw() it resets the seed so you must be calling it multiple times in some way. Michael On 14/06/2020 13:57, Phat Chau wrote: > Thank you Michael. > > I will clarify some more. The function in the first part of the code that I posted generates the simulated dataset for a cluster randomized trial from the simstudy package. > > I am not quite clear what you mean by placing it outside the loop. So the goal here is to create n = 1000 independent datasets with different (randomly drawn values from the specified normal distributions not shown) for all of the parameters. What I have tried to do is place the seed at the very top of all my code in the past, but what that does is it leads to the creation of a single dataset that gets repeated over and over n = 1000 times. Hence, there ends up being no variability in the data (and power estimates from the p-values given the stated and required power). > > Regarding the counter, is it correct in this instance that the loop will continue until n = 1000 iterations have successfully converged? I am not so concerned with counting failures. > > Thank you. > Edward > > On 2020-06-14, 6:46 AM, "Michael Dewey" <lists at dewey.myzen.co.uk> wrote: > > I am not 100% clear what your code is doing as it gets a bit wangled as > you posted in HTML but here are a couple of thoughts. > > You need to set the seed outside any loops so it happens once and for all. > > I would test after trycatch and keep a separate count of failures and > successes as the failure to converge must be meaningful about the > scientific question whatever that is. At the moment your count appears > to be in the correct place to count successes. > > Michael > > On 14/06/2020 02:50, Phat Chau wrote: > > Hello, > > > > I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations ? the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if ?model? does not return a string? > > > > I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)? > > > > powercrosssw <- function(nclus, clsize) { > > > > set.seed() > > > > cohortsw <- genData(nclus, id = "cluster") > > cohortsw <- addColumns(clusterDef, cohortsw) > > cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period") > > cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt") > > > > pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id") > > > > dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period")) > > dx <- addColumns(patError, dx) > > > > setkey(dx, id, cluster, period) > > > > dx <- addColumns(outDef, dx) > > > > return(dx) > > > > } > > > > i=1 > > > > while (i < 1000) { > > > > dx <- powercrosssw() > > > > #Fit multi-level model to simulated dataset > > model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"), > > warning = function(w) { "warning" } > > ) > > > > if (! is.character(model5)) { > > > > coeff <- coef(summary(model5))["factor(Ijt)1", "Value"] > > pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"] > > error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"] > > bresult <- c(bresult, coeff) > > presult <- c(presult, pvalue) > > eresult <- c(eresult, error) > > > > i <- i + 1 > > } > > } > > > > Thank you so much. > > > > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Michael > http://www.dewey.myzen.co.uk/home.html > > > > -- Michael http://www.dewey.myzen.co.uk/home.html