Dear List,

I am familiar with binary models; however, I am now trying to get predictions from an ordinal model and have a question.

I have a data set made up of 12 categorical predictors. The response variable is classed as 1, 2, 3, 4, 5 or 6, which corresponds to the threat level of the species (on the IUCN rating).

Previously I combined levels 1 and 2 to form "non-threatened" and levels 3-6 to form "threatened", and ran a binary model. I tested its performance with the Brier score (0.198) and the AUC or C value (0.75). I also partitioned the data set into training and test data and used the predict function to get predicted probabilities for the new data. When visualising the results with a cutoff value calculated with Epi, the prediction was correct roughly 75% of the time.

Now I am interested in predicting the threat level of a species not merely as threatened or not, but at the specific IUCN levels. I have used predict.lrm (predict.lrm(model1, type="fitted.ind")) to generate probabilities for each level, and also predict(model1, traist, type="fitted"); see below.

When I call the model, the Brier score is 0.201 and the C value 0.677. However, when I inspect the output and relate it to the corresponding species (for which I know the true IUCN rating), the model performs very badly, getting only 43% correct. Interestingly, I have noticed the probabilities are always highest for levels 1 and 6; on no occasion do levels 2, 3, 4 or 5 have high probabilities.

I am unsure whether this is just because the model cannot discriminate between the various levels due to insufficient data, or whether I am doing something wrong. If it is the latter, I would greatly appreciate any advice or the suggestion of a different method.

Thanks in advance,

Chris

model1 <- lrm(EXTINCTION ~ BS*FR + WO + LIF + REG + ALT + BIO + HAB + PD + SEA, x=TRUE, y=TRUE)

predict.lrm(model1, type="fitted.ind")

   EXTINCTION=1 EXTINCTION=2 EXTINCTION=3 EXTINCTION=4 EXTINCTION=5 EXTINCTION=6
1    0.19748393   0.05895033   0.12811186  0.086140778  0.068137104   0.46117600
2    0.27790178   0.07247496   0.14384976  0.087487677  0.064584865   0.35370096
3    0.24605628   0.06777939   0.13931242  0.087996215  0.066625303   0.39223038
4    0.24605628   0.06777939   0.13931242  0.087996215  0.066625303   0.39223038
5    0.24605628   0.06777939   0.13931242  0.087996215  0.066625303   0.39223038
6    0.24605628   0.06777939   0.13931242  0.087996215  0.066625303   0.39223038
7    0.13928899   0.04558050   0.10636220  0.077770389  0.065500459   0.56549746
8    0.24605628   0.06777939   0.13931242  0.087996215  0.066625303   0.39223038
9    0.24605628   0.06777939   0.13931242  0.087996215  0.066625303   0.39223038
10   0.33865077   0.07915126   0.14744522  0.083923247  0.059387585   0.29144192

predict(model1, traist, type="fitted")

         y>=2       y>=3       y>=4       y>=5       y>=6
1  0.80251607 0.74356575 0.61545388 0.52931311 0.46117600
2  0.72209822 0.64962327 0.50577351 0.41828583 0.35370096
3  0.75394372 0.68616432 0.54685190 0.45885569 0.39223038
4  0.75394372 0.68616432 0.54685190 0.45885569 0.39223038
5  0.75394372 0.68616432 0.54685190 0.45885569 0.39223038
6  0.75394372 0.68616432 0.54685190 0.45885569 0.39223038
7  0.86071101 0.81513051 0.70876831 0.63099792 0.56549746
8  0.75394372 0.68616432 0.54685190 0.45885569 0.39223038
9  0.75394372 0.68616432 0.54685190 0.45885569 0.39223038
10 0.66134923 0.58219797 0.43475276 0.35082951 0.29144192
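(An illustrative aside, not part of the original post: one minimal way to reduce the type="fitted.ind" probabilities to a single predicted category is to pick the most probable level for each species. This sketch assumes model1 is the lrm fit shown above.

probs <- predict(model1, type="fitted.ind")  # one column of probabilities per level
pred  <- max.col(probs)                      # column index of the largest probability
table(pred)                                  # how often each level gets predicted

With probabilities shaped like those above, peaked at levels 1 and 6, max.col can never choose levels 2-5, which is consistent with the low percent-correct figure reported above.)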
You sent a private note about this which I just took the time to answer. Please send only one note, and please post my reply to you to r-help.

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
On Sep 20, 2010, at 15:50, Frank Harrell wrote:

> You sent a private note about this which I just took the time to answer.
> Please send only one note, and please post my reply to you to r-help.

I have been annoyed by this at times as well. However, I have come to suspect that it is actually the mailer (computer program) rather than the sender (human being) that is doing this. It appears that _sometimes_, depending on the phase of the moon and the water levels in Lake Wivenhoe, as well as perhaps other circumstances, a wide reply gets sent as two separate mails: one to the original sender and another to the Cc: list. This has the familiar effect that you follow up in a private mail, only to see the question go public an instant later.

On principle, I rarely do follow-ups in private, so it is not that much of an issue for me.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
Thanks Frank,

I have one small question regarding this. I understand you are very busy, and if you can't answer I would greatly appreciate any thoughts from the list.

> Split-sample validation is not reliable unless you have say 10,000 samples to begin with

I am a little confused. When I ran the model with a binary response I had a Brier score of 0.199 and a C-value of 0.73, and when I looked at the data, 75% of the predictions were "correct". When I ran the model with an ordinal response, the Brier score was 0.201 and the C-value 0.677, which to me, albeit with limited knowledge in the area, suggests the model performs worse, but not by a great deal. However, when I compare the predicted mean values with the "real" data I get only 45% accuracy; it is this discrepancy I don't understand.

I appreciate you need a large dataset to partition it, but surely if the C-value and Brier score are similar then the number predicted "correct" should also be similar?

Thanks

Chris

On 20 Sep 2010, at 14:46, Frank Harrell wrote:

A few comments; sorry I don't have time for any more.

- Combining categories is almost always a bad idea.
- It can be harder to discriminate more categories, but that's only because the task is more difficult.
- Split-sample validation is not reliable unless you have say 10,000 samples to begin with; otherwise, 100 repeats of 10-fold cross-validation, or (faster) 200 or so bootstraps, is better. Be sure to re-do all modeling decisions from scratch for each re-sample; if you did no statistical variable selection this may not be an issue.
- You don't need ".lrm" after predict.
- Not sure why your variable names are all upper case; they are harder to read this way.

Good luck
Frank

Frank E Harrell Jr
Professor and Chairman
School of Medicine
Department of Biostatistics
Vanderbilt University

On Mon, 20 Sep 2010, Chris Mcowen wrote:

> [Chris's original question and output, quoted in full; see the first message above.]
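(An illustrative aside: the resampling validation Frank describes can be run with the validate function in the Design package (the predecessor of rms), which re-fits the model on each resample. A minimal sketch, assuming model1 was fitted with x=TRUE, y=TRUE as above, and using the roughly 200 bootstrap repetitions he suggests:

library(Design)                                # provides lrm() and validate()
val <- validate(model1, method="boot", B=200)  # optimism-corrected indexes
val                                            # includes Dxy and the Brier score B
(val["Dxy", "index.corrected"] + 1) / 2        # corrected C-index, since C = (Dxy + 1)/2

Any variable-selection or other modeling decisions would need to be built into the resampling, as Frank notes; validate alone only re-fits the stated model.)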
That's great, thanks.

I guess it is hard not to use % correct as a performance measure when that is what is commonly used in everyday life.

So when I come to predicting the response for new data (using the estimated mean Y), which I am more comfortable with, I can say:

Species A - 2.12 - therefore this is category 2
Species B - 2.72 - therefore this is category 3

(On a side note, I had no species with a rating of 6, the upper category?)

The problem comes in explaining this to my peers who are not statistically minded. What you are saying is that if I bootstrap and report the C-values and Brier scores etc., this is sufficient to give an indication of confidence?

On 22 Sep 2010, at 12:36, Frank Harrell wrote:

% correct is an improper scoring rule, and a discontinuous one to boot, so it will not always agree with more proper scoring rules. When you have a more difficult task, e.g. discriminating more categories, indexes such as the generalized c-index that utilize all the categories will recognize the difficulty of the task and give a lower value. No cause for alarm.

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
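(An illustrative aside: the estimated mean Y that Chris refers to can be obtained directly from predict for lrm fits. A minimal sketch, assuming the response is coded numerically (1-6) so that a mean is meaningful, and that the installed version of the package supports type="mean" in predict.lrm:

m <- predict(model1, type="mean")  # expected value E(Y) of the ordinal response
round(m)                           # nearest-category reading, e.g. 2.72 -> 3

A mean near 6 would require almost all of the probability mass to sit on the top level, which may be why no species reached the upper category.)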