Hi all,

As you can see from below, the result is strange...

I would have imagined that the bb result should be much higher and close to 1.
Any way to improve the fit?

Any other classification methods?

Thank you!

data=data.frame(y=rep(c(0, 1), times=100), x=1:200)
aa=glm(y~x, data=data, family=binomial(link="logit"))

newdata=data.frame(x=6, y=100)
bb=predict(aa, newdata=newdata, type="response")
bb

> bb
        1
0.4929125

On Wed, Feb 29, 2012 at 10:02 AM, Michael <comtech.usa at gmail.com> wrote:
> Hi all,
>
> As you can see from below, the result is strange...

Not really.

> I would have imagined that the bb result should be much higher and close to 1.
> Any way to improve the fit?
>
> Any other classification methods?
>
> Thank you!
>
> data=data.frame(y=rep(c(0, 1), times=100), x=1:200)
> aa=glm(y~x, data=data, family=binomial(link="logit"))
>
> newdata=data.frame(x=6, y=100)
> bb=predict(aa, newdata=newdata, type="response")
> bb
>
> > bb
>         1
> 0.4929125

What did you expect? Your model is completely nonsignificant; there's no way
to predict y from x, and that's what your predicted value tells you.

> summary(aa)

Call:
glm(formula = y ~ x, family = binomial(link = "logit"), data = data)

Deviance Residuals:
   Min      1Q  Median      3Q     Max
-1.190  -1.177   0.000   1.177   1.190

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.030152   0.283924  -0.106    0.915
x            0.000300   0.002450   0.122    0.903

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 277.26  on 199  degrees of freedom
Residual deviance: 277.24  on 198  degrees of freedom
AIC: 281.24

Number of Fisher Scoring iterations: 3

I can only assume that you didn't construct the data frame that you
intended to test.

--
Sarah Goslee
http://www.functionaldiversity.org
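
To make the connection between the coefficient table and the predicted value explicit, here is a minimal sketch that reproduces the prediction by hand from the rounded estimates printed in the summary. The variable names (b0, b1, eta, p) are only illustrative, and because the coefficients are rounded the result agrees with predict() only to about four decimal places.

```r
# Rounded coefficient estimates copied from the summary(aa) output above
b0 <- -0.030152   # (Intercept)
b1 <-  0.000300   # slope for x

# Linear predictor at x = 6, then the inverse logit to get a probability
eta <- b0 + b1 * 6
p   <- plogis(eta)   # same as 1 / (1 + exp(-eta))

print(p)             # close to the 0.4929125 reported by predict()
```

With a slope this close to zero, the prediction stays near 0.5 for every x in 1:200, which is just the overall proportion of ones in the data.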

Michael <comtech.usa <at> gmail.com> writes:
>
> Hi all,
>
> As you can see from below, the result is strange...
>
> I would have imagined that the bb result should be much higher and close to 1.
> Any way to improve the fit?
>
> Any other classification methods?
>
> Thank you!
>
> data=data.frame(y=rep(c(0, 1), times=100), x=1:200)
> aa=glm(y~x, data=data, family=binomial(link="logit"))
>
> newdata=data.frame(x=6, y=100)
> bb=predict(aa, newdata=newdata, type="response")
> bb
>
> > bb
>         1
> 0.4929125
>

I have a feeling you meant to say

data <- data.frame(y=rep(c(0,1), each=100), x=1:200)

instead. Try

with(data,plot(y~x))

for each data set to see what you actually got as opposed to what you
thought you were getting.

You may still have a bit of a problem fitting such an extreme data set:
this is called "complete separation", and it leads to an infinite estimate
of the slope. If you want to pursue this, take a look at the brglm package.
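
The difference between times= and each= can be checked directly. The following is a rough sketch in base R only; the object names (d_times, d_each, fit_times, fit_each) are made up for illustration, and the warnings that glm() issues on the separated data are suppressed here purely to keep the output short.

```r
# times=100 alternates 0,1,0,1,...; each=100 gives 100 zeros followed by 100 ones
d_times <- data.frame(y = rep(c(0, 1), times = 100), x = 1:200)
d_each  <- data.frame(y = rep(c(0, 1), each  = 100), x = 1:200)

print(head(d_times$y))   # alternating values
print(head(d_each$y))    # all zeros at the start

# Alternating data: x carries no information about y
fit_times <- glm(y ~ x, data = d_times, family = binomial(link = "logit"))
p_times <- predict(fit_times, newdata = data.frame(x = 6), type = "response")

# Separated data: y is 0 for x <= 100 and 1 for x > 100 (complete separation).
# glm() warns about fitted probabilities of 0 or 1; suppressed for brevity.
fit_each <- suppressWarnings(
  glm(y ~ x, data = d_each, family = binomial(link = "logit"))
)
p_each <- predict(fit_each, newdata = data.frame(x = c(6, 195)),
                  type = "response")

print(p_times)   # the thread reports 0.4929125 for this case
print(p_each)    # near 0 at x = 6, near 1 at x = 195
```

Note that with each=100 the prediction at x=6 comes out near 0 rather than near 1, since all the ones are at x > 100. The slope estimate from fit_each is large and not meaningful on its own, which is the symptom of complete separation mentioned above.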