zhijie zhang
2006-Aug-31 14:12 UTC
[R] what's wrong with my simulation programs on logistic regression
Dear friends,
I'm doing a simulation on a logistic regression model, but my program
doesn't work. Please help me correct it and give me some suggestions.
My program:
data<-matrix(rnorm(400),ncol=8) #sample size is 50
data<-data.frame(data)
names(data)<-c(paste("x",1:8,sep="")) #8 independent variables,x1-x8;
#logistic regression model is logit(y)=x1+x2+x3+x4+x5+x6+x7+x8
data$y<-exp(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8)/(1+(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8))
logist<-glm(y~.,family=binomial(),data=simdata)
*Warning messages:*
1: algorithm can't converge in: glm.fit(x = X, y = Y, weights = weights,
start = start, etastart = etastart,
2: the probability is 0 or 1 in: glm.fit (x = X, y = Y, weights = weights,
start = start, etastart = etastart,
--
With Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
zhijie zhang
2006-Aug-31 14:21 UTC
[R] what's wrong with my simulation programs on logistic regression
I forgot to add my own thoughts. Having thought it over, I suspect the problem is this expression:
data$y<-exp(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8)/(1+(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8))
which may not be set correctly for my model: logit(y)=x1+x2+x3+x4+x5+x6+x7+x8.
Thanks very much!
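One way to test that suspicion is to compare the posted expression with the inverse logit, exp(eta)/(1 + exp(eta)), which base R provides as plogis(). The following is only a minimal sketch: the seed and the names x, eta, y_posted and p are assumptions made for illustration and are not part of the original program.

```r
set.seed(1)                            # arbitrary seed, for reproducibility only (assumption)
x   <- matrix(rnorm(400), ncol = 8)    # 50 rows, 8 covariates, as in the post
eta <- rowSums(x)                      # linear predictor x1 + ... + x8

y_posted <- exp(eta) / (1 + eta)       # expression as posted: no exp() in the denominator
p        <- exp(eta) / (1 + exp(eta))  # inverse logit

range(y_posted)                        # not confined to [0, 1]
range(p)                               # strictly between 0 and 1
all.equal(p, plogis(eta))              # plogis() computes the same quantity
```

Since exp(eta) >= 1 + eta for every eta, the posted expression exceeds 1 whenever eta > 0 and turns negative when eta < -1, so it cannot serve as a probability.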
Dimitris Rizopoulos
2006-Aug-31 14:25 UTC
[R] what's wrong with my simulation programs on logistic regression
This way you are not simulating response data: your y is a deterministic
function of the covariates, not a random 0/1 outcome. It would also be a
good idea to use a bigger sample, e.g.,
dat <- matrix(rnorm(4000), ncol = 8)
dat <- data.frame(dat)
names(dat) <- paste("x", 1:8, sep = "")
dat$y <- rbinom(nrow(dat), 1, plogis(rowSums(dat)))
fit <- glm(y ~ ., family = binomial, data = dat)
fit
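As a rough check on a simulation of this kind, one can confirm that the response is binary and compare the estimates with the values used to generate the data (intercept 0, all slopes 1). This is only a sketch: the seed is an arbitrary assumption, and confint.default() gives Wald intervals, which is one of several reasonable choices.

```r
set.seed(123)                                        # arbitrary seed (assumption)
dat <- data.frame(matrix(rnorm(4000), ncol = 8))     # 500 rows, 8 covariates
names(dat) <- paste("x", 1:8, sep = "")
dat$y <- rbinom(nrow(dat), 1, plogis(rowSums(dat)))  # Bernoulli draws, logit(p) = x1 + ... + x8

fit <- glm(y ~ ., family = binomial, data = dat)

table(dat$y)                                         # the response takes only the values 0 and 1
round(cbind(estimate = coef(fit), confint.default(fit)), 2)  # estimates with Wald 95% intervals
```

With 500 observations the slope estimates should land reasonably close to 1, though any single run will vary.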
I hope it helps.
Best,
Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm
Prof Brian Ripley
2006-Aug-31 14:55 UTC
[R] what's wrong with my simulation programs on logistic regression
On Thu, 31 Aug 2006, zhijie zhang wrote:

> Dear friends,
> I'm doing a simulation on logistic regression model, but the programs can't
> work well,please help me to correct it and give some suggestions.
> My programs:
> data<-matrix(rnorm(400),ncol=8) #sample size is 50
> data<-data.frame(data)
> names(data)<-c(paste("x",1:8,sep="")) #8 independent variables,x1-x8;
> #logistic regression model is logit(y)=x1+x2+x3+x4+x5+x6+x7+x8

Rather it is logit(p) = ..., and y ~ binomial(1, p)

There is a different sort of 'logistic regression' with

y = exp(eta)/(1+exp(eta)) + epsilon

but you fit that by nls, not glm.

> data$y<-exp(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8)/(1+(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8))

You need exp()/(1+exp()), and the second exp is missing. Once you have p,
you can use

data$y <- rbinom(length(p), 1, p)

> logist<-glm(y~.,family=binomial(),data=simdata)
> *Warning messages:*
> 1: algorithm can't converge in: glm.fit(x = X, y = Y, weights = weights,
> start = start, etastart = etastart,
> 2: the probability is 0 or 1 in: glm.fit (x = X, y = Y, weights = weights,
> start = start, etastart = etastart,

You do not have a Bernoulli response: it often helps to look at your
simulated data to see if it makes sense (just as you would look at real
data, I hope).

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
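
The two models distinguished in this reply can be put side by side in a short sketch. It is illustrative only: the seed, the sample size of 500, the noise standard deviation, the single-covariate nls example, and the names dat, d2, fit_glm and fit_nls are all assumptions chosen for the illustration, not anything from the thread.

```r
set.seed(2006)                              # arbitrary seed (assumption)
n   <- 500                                  # larger than the original 50 (assumption)
dat <- data.frame(matrix(rnorm(n * 8), ncol = 8))
names(dat) <- paste("x", 1:8, sep = "")

eta <- rowSums(dat)                         # x1 + ... + x8
p   <- exp(eta) / (1 + exp(eta))            # exp() in both numerator and denominator

# (a) Bernoulli response: logit(p) = eta, y ~ binomial(1, p), fitted by glm
dat$y <- rbinom(length(p), 1, p)
table(dat$y)                                # look at the data: only 0s and 1s
fit_glm <- glm(y ~ ., family = binomial, data = dat)
coef(fit_glm)

# (b) continuous response: logistic curve plus additive noise, fitted by nls
d2   <- data.frame(x = dat$x1)              # one covariate keeps the example small
d2$y <- plogis(d2$x) + rnorm(n, sd = 0.05)  # noise sd is an arbitrary choice
fit_nls <- nls(y ~ plogis(a + b * x), data = d2,
               start = list(a = 0.1, b = 0.5))
coef(fit_nls)
```

Note that the posted glm call passes data=simdata while the data frame built in the same program is named data; the sketch uses a single name throughout so that the fit sees the simulated response.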