jjh21
2009-Apr-07 13:26 UTC
[R] Simulate binary data for a logistic regression Monte Carlo
Hello, I am trying to simulate binary outcome data for a logistic regression Monte Carlo study. I need to eventually be able to manipulate the structure of the error term to give groups of observations a random effect. Right now I am just doing a very basic set up to make sure I can recover the parameters properly. I am running into trouble with the code below. It works if you take out the object 'epsilon,' but I need that object because it is where I plan to add in the random effects eventually. Any suggestions? Thanks. set.seed(1) alpha <- numeric(100) # Vector to store coefficient estimates X1 <- numeric(100) X2 <- numeric(100) a <- 0 # True parameters B1 <- .5 B2 <- .85 for (i in 1:100){ x1 <- rnorm(1000) x2 <- rnorm(1000) epsilon <- rnorm(1000) LP <- a + B1*x1 + B2*x2 + epsilon # Linear predictor y <- rbinom(length(LP), 1, plogis(LP)) # Binomial link model <- glm(y ~ x1 + x2, family = binomial) alpha[i] <- coef(model)[1] X1[i] <- coef(model)[2] X2[i] <- coef(model)[3] } mean(X1) # Shows a bias downward (should be about .5) mean(X2) # Shows a bias downward (should be about .85) mean(alpha) # Pretty close (should be about 0) -- View this message in context: http://www.nabble.com/Simulate-binary-data-for-a-logistic-regression-Monte-Carlo-tp22928872p22928872.html Sent from the R help mailing list archive at Nabble.com.
Rolf Turner
2009-Apr-07 20:58 UTC
[R] Simulate binary data for a logistic regression Monte Carlo
On 8/04/2009, at 1:26 AM, jjh21 wrote:> > Hello, > > I am trying to simulate binary outcome data for a logistic > regression Monte > Carlo study. I need to eventually be able to manipulate the > structure of the > error term to give groups of observations a random effect. Right > now I am > just doing a very basic set up to make sure I can recover the > parameters > properly. I am running into trouble with the code below. It works > if you > take out the object 'epsilon,' but I need that object because it is > where I > plan to add in the random effects eventually. Any suggestions? Thanks.Well, it's the epsilon term that's doing you in. The model that you are fitting takes no cognizance of this random effect. You can *probably* fit a model which does take cognizance of it using one of the variants of glm() (available in R) which include random effects, but don't ask me how! I don't know if it is possible to work out analytically what the bias should be if the epsilon term is ignored when fitting the model, but it might be. Or an approximation might be achieved. Good luck. cheers, Rolf Turner> set.seed(1) > alpha <- numeric(100) # Vector to store coefficient estimates > X1 <- numeric(100) > X2 <- numeric(100) > > a <- 0 # True parameters > B1 <- .5 > B2 <- .85 > > for (i in 1:100){ > x1 <- rnorm(1000) > x2 <- rnorm(1000) > epsilon <- rnorm(1000) > > LP <- a + B1*x1 + B2*x2 + epsilon # Linear predictor > y <- rbinom(length(LP), 1, plogis(LP)) # Binomial link > > model <- glm(y ~ x1 + x2, family = binomial) > alpha[i] <- coef(model)[1] > X1[i] <- coef(model)[2] > X2[i] <- coef(model)[3] > } > > mean(X1) # Shows a bias downward (should be about .5) > mean(X2) # Shows a bias downward (should be about .85) > mean(alpha) # Pretty close (should be about 0)###################################################################### Attention:\ This e-mail message is privileged and confid...{{dropped:9}}
Possibly Parallel Threads
- R graph into MS Word: which format to use?
- Need help creating spatial correlation for MC simulation
- Appropriate tests for logistic regression with a continuous predictor variable and Bernoulli response variable
- What is the R equivalent of STATA's 'drop' command?
- Difference in Monte Carlo calculation between chisq.test and fisher.test