Dear R Help, I read through the archives pretty extensively before sending this email, as it seemed there were several threads on using predict with GLM. However, while my issue is similar to previous posts (cannot get it to predict using new data), none of the suggested fixes are working. The important bits of my code: set.seed(644) n0=200 #number of observations W1=rnorm(n0,mean=2,sd=2) #Use rnorm to generate W1 W2=rnorm(n0,mean=3,sd=8) #Use rnorm to generate W1 Aprob=matrix(.2, nrow=n0, ncol=1) #generating the probability of A #generating probability of A dependant on W1 for(i in 1:n0){ if (W1[i]>1.5) {Aprob[i]=0.4} } A=matrix(rbinom(n0, 1, Aprob), nrow=n0, ncol=1) #generating the 0/1 exposure Yprob=1/(1+exp(-(10*A-5*(W1)^2+2*W2))) Y=matrix(rbinom(n0, 1, Yprob), nrow=n0, ncol=1) #generating the 0/1 exposure zero=data.frame(rep(0, n0)) Q=glm(cbind(Y, 1-Y) ~ A + W1 + W2, family='binomial') QA=predict(Q, newdata=as.data.frame(A)) Q0=predict(Q,newdata=(A=zero)) I've tried many variations of the last line (Q0) to get the predicted values when A=0 with no luck. With this code, I get errors that my A=zero is a list even though I made it into a data frame. This is the version of the code (after my reading) that *should* work for predict once I can get it to accept that it is not a list. With other variants of the line that will run but are not syntactically correct, my QA and Q0 are the same. Any guidance would be appreciated! Sherri [[alternative HTML version deleted]]
Your problem is that your newdata data frame only specifies what the value for A is in the predictions. The values for W1 and W2 are unspecified. To predict from fitted models, you need to specify the values you wish to use for *all* the predictor variables, not just the one (I presume) is different from the estimation set. Assuming that is what you want to do, here is a version of your script that you may care to look at fairly closely. I have omitted all the unnecessary bits, (I could find), put the variables in a data frame instead of leaving them strewn around the global environment, and done a couple of predictions just to show it works. _________________________________ set.seed(644) n0 <- 200 # number of observations myData <- data.frame(W1 = rnorm(n0, mean = 2, sd = 2), W2 = rnorm(n0, mean = 3, sd = 8)) myData <- transform(myData, # add A A = rbinom(n0, 1, ifelse(W1 > 1.5, 0.4, 0.2))) myData <- transform(myData, # add Y Y = rbinom(n0, 1, 1/(1+exp(-(10*A-5*(W1)^2+2*W2))))) Q <- glm(Y ~ A + W1 + W2, family = binomial, data = myData) pData <- transform(myData, A = 0) # keep W1 and W2 as they were QA <- predict(Q, newdata = pData) # linear predictors QR <- predict(Q, newdata = pData, type = "response") # probs plot(QA, QR) _________________________________ Bill Venables. Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:Bill.Venables at csiro.au http://www.cmis.csiro.au/bill.venables/ -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sherri Rose Sent: Tuesday, 29 January 2008 1:27 PM To: r-help at r-project.org Subject: [R] Using Predict and GLM Dear R Help, I read through the archives pretty extensively before sending this email, as it seemed there were several threads on using predict with GLM. However, while my issue is similar to previous posts (cannot get it to predict using new data), none of the suggested fixes are working. The important bits of my code: set.seed(644) n0=200 #number of observations W1=rnorm(n0,mean=2,sd=2) #Use rnorm to generate W1 W2=rnorm(n0,mean=3,sd=8) #Use rnorm to generate W1 Aprob=matrix(.2, nrow=n0, ncol=1) #generating the probability of A #generating probability of A dependant on W1 for(i in 1:n0){ if (W1[i]>1.5) {Aprob[i]=0.4} } A=matrix(rbinom(n0, 1, Aprob), nrow=n0, ncol=1) #generating the 0/1 exposure Yprob=1/(1+exp(-(10*A-5*(W1)^2+2*W2))) Y=matrix(rbinom(n0, 1, Yprob), nrow=n0, ncol=1) #generating the 0/1 exposure zero=data.frame(rep(0, n0)) Q=glm(cbind(Y, 1-Y) ~ A + W1 + W2, family='binomial') QA=predict(Q, newdata=as.data.frame(A)) Q0=predict(Q,newdata=(A=zero)) I've tried many variations of the last line (Q0) to get the predicted values when A=0 with no luck. With this code, I get errors that my A=zero is a list even though I made it into a data frame. This is the version of the code (after my reading) that *should* work for predict once I can get it to accept that it is not a list. With other variants of the line that will run but are not syntactically correct, my QA and Q0 are the same. Any guidance would be appreciated! Sherri [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Mon, 28 Jan 2008, Sherri Rose wrote:> Dear R Help, > > I read through the archives pretty extensively before sending this > email, as it seemed there were several threads on using predict with > GLM. However, while my issue is similar to previous posts (cannot > get it to predict using new data), none of the suggested fixes are > working. > > The important bits of my code: > > set.seed(644) > n0=200 #number of observations > W1=rnorm(n0,mean=2,sd=2) #Use rnorm to generate W1 > W2=rnorm(n0,mean=3,sd=8) #Use rnorm to generate W1 > Aprob=matrix(.2, nrow=n0, ncol=1) #generating the probability of A > #generating probability of A dependant on W1 > for(i in 1:n0){ > if (W1[i]>1.5) {Aprob[i]=0.4} > } > A=matrix(rbinom(n0, 1, Aprob), nrow=n0, ncol=1) #generating the 0/1 > exposure > Yprob=1/(1+exp(-(10*A-5*(W1)^2+2*W2))) > Y=matrix(rbinom(n0, 1, Yprob), nrow=n0, ncol=1) #generating the 0/1 > exposure > zero=data.frame(rep(0, n0)) > > > Q=glm(cbind(Y, 1-Y) ~ A + W1 + W2, family='binomial') > QA=predict(Q, newdata=as.data.frame(A)) > Q0=predict(Q,newdata=(A=zero)) > > I've tried many variations of the last line (Q0) to get the predicted > values when A=0 with no luck. With this code, I get errors that my > A=zero is a list even though I made it into a data frame. This is > the version of the code (after my reading) that *should* work for > predict once I can get it to accept that it is not a list.What do you think (A=zero) is? It is the data frame zero, which is a list but does not have variable names 'A', 'W1' and 'W2', and as a side effect assigned 0 to A. I suspect you meant == but then you would have had a logical matrix. (This is one of the consequences of allowing '=' as an assignment operator.)> With other variants of the line that will run but are not > syntactically correct, my QA and Q0 are the same.Code that is not syntactically correct cannot be parsed and so cannot be run. Your intentions are not clear to me. Why are you using one-column matrices? Is A <- rbinom(n0, 1, Aprob) mydat <- data.frame(A, W1, W2) Q <- glm(cbind(Y, 1-Y) ~ A + W1 + W2, family='binomial', data=mydat) predict(Q, subset(mydat, A==0)) what you intended?> > Any guidance would be appreciated! > > Sherri > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595