Hi all I have run into a case where I don't understand why predict.lrm and predict.glm don't yield the same results. My data look like this: set.seed(1) library(Design); ilogit <- function(x) { 1/(1+exp(-x)) } ORDER <- factor(sample(c("mc-sc", "sc-mc"), 403, TRUE)) CONJ <- factor(sample(c("als", "bevor", "nachdem", "weil"), 403, TRUE)) LENGTH_DIFF <- sample(-32:25, 403, TRUE) I fit two models: model.01.lrm <- lrm(ORDER ~ CONJ*LENGTH_DIFF) model.01.glm <- glm(ORDER ~ CONJ*LENGTH_DIFF, family=binomial) Then I wanted to plot for each level of CONJ the predicted probabilities of ORDER="sc-mc" across a large range of LENGTH_DIFF values. Thus I use this (yes, I know about type="response" etc.): # range of LENGTH_DIFF for CONJ="als" x1 <- factor(rep("als", 201)); x2 <- seq(-100, 100, 1) y.als.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.als.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) This works perfectly and the two vectors y.als.lrm and y.als.glm are the same. But then: # range of LENGTH_DIFF for CONJ="bevor" x1 <- factor(rep("bevor", 201)); x2 <- seq(-100, 100, 1) y.bevor.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.bevor.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) Here, y.bevor.lrm is the same as y.als.lrm, but not the same as y.bevor.glm # range of LENGTH_DIFF for CONJ="nachdem" x1 <- factor(rep("nachdem", 201)); x2 <- seq(-100, 100, 1) y.nachdem.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.nachdem.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) Here, y.nachdem.lrm is the same as y.als.lrm, but not the same as y.nachdem.glm # range of LENGTH_DIFF for CONJ="weil" x1 <- factor(rep("weil", 201)); x2 <- seq(-100, 100, 1) y.weil.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.weil.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) Here, y.weil.lrm is the same as y.als.lrm, but not the same as y.weil.glm As a result, in this plot, the left panel is useless, but the right has the four different curves: par(mfrow=c(1,2)) plot(x2, y.als.lrm, pch="a", xlim=c(-100, 100), ylim=c(0, 1), main="with predict.lrm", xlab="Main cl. length - subord. cl. length (in words)", ylab="Predicted probability of 'sc-mc'"); grid() points(x2, y.bevor.lrm, pch="b") points(x2, y.nachdem.lrm, pch="n") points(x2, y.weil.lrm, pch="w") plot(x2, y.als.glm, pch="a", xlim=c(-100, 100), ylim=c(0, 1), main="with predict.glm", xlab="Main cl. length - subord. cl. length (in words)", ylab="Predicted probability of 'sc-mc'"); grid() points(x2, y.bevor.glm, pch="b") points(x2, y.nachdem.glm, pch="n") points(x2, y.weil.glm, pch="w") par(mfrow=c(1,1)) What am I doing wrong with predict.lrm? Thanks in advance for any input you may have, STG -- Stefan Th. Gries ----------------------------------------------- University of California, Santa Barbara http://www.linguistics.ucsb.edu/faculty/stgries
Please follow the posting guide. Please provide the smallest possible self-contained example and note that conveys the problem. Note that the rms package has now replaced the Design package. Details at http://biostat.mc.vanderbilt.edu/Rrms Frank ----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/predict-lrm-vs-predict-glm-with-newdata-tp3163671p3164020.html Sent from the R help mailing list archive at Nabble.com.
Hi all Trying again with this question. (If it's really that stupid a question, I'd be happy to just receive a link or one or two words to Google that I haven't thought of before): I have run into a case where I don't understand why predict.lrm and predict.glm don't yield the same results. # My data look like this: set.seed(1) library(Design); ilogit <- function(x) { 1/(1+exp(-x)) } ORDER <- factor(sample(c("mc-sc", "sc-mc"), 403, TRUE)) CONJ <- factor(sample(c("als", "bevor", "nachdem", "weil"), 403, TRUE)) LENGTH_DIFF <- sample(-32:25, 403, TRUE) # I fit two models: model.01.lrm <- lrm(ORDER ~ CONJ*LENGTH_DIFF) model.01.glm <- glm(ORDER ~ CONJ*LENGTH_DIFF, family=binomial) # Then I wanted to plot for each level of CONJ the predicted probabilities of ORDER="sc-mc" across a large range of LENGTH_DIFF values. Thus I use this (yes, I know about type="response" etc.): # range of LENGTH_DIFF for CONJ="als" x1 <- factor(rep("als", 201)); x2 <- seq(-100, 100, 1) y.als.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.als.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) # This works perfectly and the two vectors y.als.lrm and y.als.glm are the same. But then: # range of LENGTH_DIFF for CONJ="bevor" x1 <- factor(rep("bevor", 201)); x2 <- seq(-100, 100, 1) y.bevor.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.bevor.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) # Here, y.bevor.lrm is the same as y.als.lrm, but *not* the same as y.bevor.glm # range of LENGTH_DIFF for CONJ="nachdem" x1 <- factor(rep("nachdem", 201)); x2 <- seq(-100, 100, 1) y.nachdem.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.nachdem.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) # Here, y.nachdem.lrm is the same as y.als.lrm, but *not* the same as y.nachdem.glm # range of LENGTH_DIFF for CONJ="weil" x1 <- factor(rep("weil", 201)); x2 <- seq(-100, 100, 1) y.weil.lrm <- ilogit(predict(model.01.lrm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) y.weil.glm <- ilogit(predict(model.01.glm, newdata=list(CONJ=x1, LENGTH_DIFF=x2))) # Here, y.weil.lrm is the same as y.als.lrm, but *not* the same as y.weil.glm #As a result, in this plot, the left panel is useless, but the right has the four different curves: par(mfrow=c(1,2)) plot(x2, y.als.lrm, pch="a", xlim=c(-100, 100), ylim=c(0, 1), main="with predict.lrm", xlab="Main cl. length - subord. cl. length (in words)", ylab="Predicted probability of 'sc-mc'"); grid() points(x2, y.bevor.lrm, pch="b") points(x2, y.nachdem.lrm, pch="n") points(x2, y.weil.lrm, pch="w") plot(x2, y.als.glm, pch="a", xlim=c(-100, 100), ylim=c(0, 1), main="with predict.glm", xlab="Main cl. length - subord. cl. length (in words)", ylab="Predicted probability of 'sc-mc'"); grid() points(x2, y.bevor.glm, pch="b") points(x2, y.nachdem.glm, pch="n") points(x2, y.weil.glm, pch="w") par(mfrow=c(1,1)) What am I doing wrong with predict.lrm? Thanks in advance for any input you may have, STG -- Stefan Th. Gries ----------------------------------------------- University of California, Santa Barbara http://www.linguistics.ucsb.edu/faculty/stgries