zhijie zhang
2006-Aug-31 14:12 UTC
[R] what's wrong with my simulation programs on logistic regression
Dear friends,

I'm running a simulation for a logistic regression model, but my program doesn't work properly. Please help me correct it and give me some suggestions.

My program:

data<-matrix(rnorm(400),ncol=8) #sample size is 50
data<-data.frame(data)
names(data)<-c(paste("x",1:8,sep="")) #8 independent variables,x1-x8;
#logistic regression model is logit(y)=x1+x2+x3+x4+x5+x6+x7+x8
data$y<-exp(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8)/(1+(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8))
logist<-glm(y~.,family=binomial(),data=simdata)

Warning messages:
1: algorithm can't converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
2: the probability is 0 or 1 in: glm.fit (x = X, y = Y, weights = weights, start = start, etastart = etastart,

--
With Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
zhijie zhang
2006-Aug-31 14:21 UTC
[R] what's wrong with my simulation programs on logistic regression
I forgot to add my own thoughts. Having thought it over, I suspect the problem is the expression

data$y<-exp(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8)/(1+(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8))

which may not be set up correctly for my model, logit(y)=x1+x2+x3+x4+x5+x6+x7+x8.

Thanks very much!
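
One way to test that suspicion is to look at the values the expression produces. The sketch below is a minimal check, not part of the original post: the seed and the names `dat`, `eta`, `y_orig` and `p` are assumptions added for illustration. It compares the formula as posted with the inverse logit. Because exp(eta) >= 1 + eta for every eta, the posted quotient is negative whenever eta < -1 and at least 1 otherwise, so it can never be a probability strictly between 0 and 1.

```r
set.seed(1)  # arbitrary seed, only so the run is repeatable

# Same design as in the post: 50 rows, 8 standard-normal predictors
dat <- data.frame(matrix(rnorm(400), ncol = 8))
names(dat) <- paste("x", 1:8, sep = "")

eta <- rowSums(dat)                  # linear predictor x1 + ... + x8

y_orig <- exp(eta) / (1 + eta)       # formula as posted (second exp() missing)
p      <- exp(eta) / (1 + exp(eta))  # inverse logit

range(y_orig)  # values fall outside [0, 1]
range(p)       # values stay inside (0, 1)
```

Even with the denominator corrected, `p` is a probability rather than a 0/1 outcome, so it still is not the kind of response a binomial `glm` expects.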
Dimitris Rizopoulos
2006-Aug-31 14:25 UTC
[R] what's wrong with my simulation programs on logistic regression
Written this way, you are not simulating response data. It would also be a good idea to use a bigger sample, e.g.,

dat <- matrix(rnorm(4000), ncol = 8)
dat <- data.frame(dat)
names(dat) <- paste("x", 1:8, sep = "")
dat$y <- rbinom(nrow(dat), 1, plogis(rowSums(dat)))
fit <- glm(y ~ ., family = binomial, data = dat)
fit

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm
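
A simulation like the one suggested in this reply can be sanity-checked by comparing the fitted coefficients with the values used to generate the data (intercept 0, every slope 1). The following is a minimal sketch under stated assumptions: the seed, the sample size `n`, and the use of Wald intervals from `confint.default()` are choices made here for illustration, not something the thread prescribes.

```r
set.seed(2006)  # arbitrary seed, only so the run is repeatable

n <- 500  # same number of rows as rnorm(4000) with ncol = 8
dat <- data.frame(matrix(rnorm(n * 8), ncol = 8))
names(dat) <- paste("x", 1:8, sep = "")

# Linear predictor, computed before y is added to the data frame
eta <- rowSums(dat)

# plogis() is the inverse logit, exp(eta) / (1 + exp(eta))
stopifnot(isTRUE(all.equal(plogis(eta), exp(eta) / (1 + exp(eta)))))

# Bernoulli response drawn with success probability plogis(eta)
dat$y <- rbinom(n, 1, plogis(eta))
table(dat$y)  # only 0s and 1s should appear

fit <- glm(y ~ ., family = binomial, data = dat)

# Estimates with Wald 95% intervals; the true slopes are all 1
round(cbind(estimate = coef(fit), confint.default(fit)), 2)
```

Repeating this with different seeds or larger `n` gives a feel for how much the estimates vary around the true values.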
Prof Brian Ripley
2006-Aug-31 14:55 UTC
[R] what's wrong with my simulation programs on logistic regression
On Thu, 31 Aug 2006, zhijie zhang wrote:

> Dear friends,
> I'm doing a simulation on logistic regression model, but the programs can't
> work well,please help me to correct it and give some suggestions.
> My programs:
> data<-matrix(rnorm(400),ncol=8) #sample size is 50
> data<-data.frame(data)
> names(data)<-c(paste("x",1:8,sep="")) #8 independent variables,x1-x8;
> #logistic regression model is logit(y)=x1+x2+x3+x4+x5+x6+x7+x8

Rather it is logit(p) = ..., and y ~ binomial(1, p)

There is a different sort of 'logistic regression' with

y = exp(eta)/(1+exp(eta)) + epsilon

but you fit that by nls, not glm.

> data$y<-exp(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8)/(1+(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8))

You need exp()/(1+exp()), and the second exp is missing. Once you have p, you can use

data$y <- rbinom(length(p), 1, p)

> logist<-glm(y~.,family=binomial(),data=simdata)
> *Warning messages:*
> 1: algorithm can't converge in: glm.fit(x = X, y = Y, weights = weights,
> start = start, etastart = etastart,
> 2: the probability is 0 or 1 in: glm.fit (x = X, y = Y, weights = weights,
> start = start, etastart = etastart,

You do not have a Bernoulli response: it often helps to look at your simulated data to see if it makes sense (just as you would look at real data, I hope).

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
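
The other sort of 'logistic regression' mentioned in this reply, a logistic curve plus additive error fitted by nls, might look like the sketch below. Everything specific in it is an assumption made for illustration: a single predictor `x` instead of eight, true values 0.5 and 1 for the hypothetical parameters `b0` and `b1`, error standard deviation 0.05, and the starting values passed to `nls()`.

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable

n <- 200
x <- rnorm(n)

# Assumed true parameter values for this illustration: b0 = 0.5, b1 = 1
eta <- 0.5 + 1 * x

# Continuous response: logistic curve plus additive Gaussian error
y <- plogis(eta) + rnorm(n, sd = 0.05)

# Nonlinear least squares fit of the same curve; start values are a guess
fit_nls <- nls(y ~ plogis(b0 + b1 * x), start = list(b0 = 0, b1 = 0.5))

coef(fit_nls)
```

Here `y` is a continuous measurement scattered around the curve, not a 0/1 outcome, which is why least squares rather than a binomial likelihood is the natural fitting criterion.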