John Haart
2010-Oct-01 10:12 UTC
[R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
Dear list,

I am relatively new to ordinal models and have been working through the example given by Frank Harrell in the predict.lrm {Design} help. All of it makes sense to me except the responses, i.e. how do I interpret them? I would be extremely grateful if someone could explain the results.

First I establish the data and model:

> y <- factor(sample(1:3, 400, TRUE), 1:3, c('good','better','best'))
> x1 <- runif(400)
> x2 <- runif(400)
> f <- lrm(y ~ rcs(x1,4)*x2, x=TRUE)

Get 0.95 confidence limits for Prob[better or best].
# How do I interpret this on the y scale, i.e. good, better, best?

> L <- predict(f, se.fit=TRUE)   # omitted kint= so use 1st intercept
> plogis(with(L, linear.predictors + 1.96*cbind(-se.fit,se.fit)))
                 se.fit
1  0.6430994 0.8305201
2  0.5812662 0.7919122
3  0.5692593 0.7976906
4  0.5600308 0.7278637
5  0.6845250 0.8819143
6  0.5518848 0.7228657
7  0.5876031 0.7717215
8  0.6291766 0.8354423
9  0.5839353 0.8333790
10 0.5631326 0.8314051

Get Prob(better) and all others.
# Does this mean that for data point 1, y = best, as it has the highest probability?

> predict(f, type="fitted.ind")[1:10,]
      y=good  y=better    y=best
1  0.2517915 0.3469692 0.4012392
2  0.3031733 0.3554471 0.3413796
3  0.3046236 0.3555365 0.3398398
4  0.3514780 0.3546880 0.2938340
5  0.1989827 0.3251784 0.4758390
6  0.3581265 0.3540297 0.2878438
7  0.3130150 0.3559091 0.3310759
8  0.2541324 0.3476007 0.3982669
9  0.2740127 0.3519713 0.3740160
10 0.2839907 0.3535331 0.3624763

Establish a data frame to use as newdata:

> d <- data.frame(x1=c(.1,.5),x2=c(.5,.15))

Predict on newdata: Prob(Y>=j) for the new observations.
# Does this mean that for data point 1, y = better, as it has the higher probability?

> predict(f, d, type="fitted")
  y>=better   y>=best
1 0.6800290 0.3239935
2 0.5846743 0.2409657

Prob(Y=j).
# Again, does this mean that for data point 1, y = better, as it has the highest probability?

> predict(f, d, type="fitted.ind")
     y=good  y=better    y=best
1 0.3199710 0.3560355 0.3239935
2 0.4153257 0.3437086 0.2409657

Predict mean(y) using codes 1, 2, 3.
# How do I interpret this on the y scale, i.e. good, better, best?

> predict(f, d, type='mean', codes=TRUE)
       1        2
2.004022 1.825640

Thanks for any advice, it is greatly appreciated.

John
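A note on how the three outputs for the new data relate to one another. The sketch below uses only base R and the numbers printed by predict(f, d, type="fitted") above; the object names (cum, ind, mean_y) are made up for illustration. It assumes the usual relationships for an ordinal model: Prob(Y=j) is the difference between adjacent values of Prob(Y>=j), and the predicted mean is the probability-weighted average of the codes 1, 2, 3.

```r
# Cumulative probabilities copied from the type="fitted" output in the post
cum <- rbind("1" = c(0.6800290, 0.3239935),
             "2" = c(0.5846743, 0.2409657))
colnames(cum) <- c("y>=better", "y>=best")

# Prob(Y = j) as differences of adjacent cumulative probabilities
ind <- cbind("y=good"   = 1 - cum[, "y>=better"],
             "y=better" = cum[, "y>=better"] - cum[, "y>=best"],
             "y=best"   = cum[, "y>=best"])
print(round(ind, 7))

# Predicted mean of Y on the 1, 2, 3 coding: sum over j of j * Prob(Y = j)
mean_y <- drop(ind %*% 1:3)
print(round(mean_y, 6))
```

Run this way, the individual probabilities and the means agree with the type="fitted.ind" and type='mean' values posted above, which suggests the three outputs are different views of the same fitted distribution rather than three separate predictions.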
Frank Harrell
2010-Oct-01 11:14 UTC
[R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
John,

Don't conclude that one category is the most probable when its probability of being equaled or exceeded is a maximum. The first category would always be the winner if that were the case.

When you say y=best, remember that you are dealing with a probability model. Nothing is forcing you to classify an observation, and unless the category's probability is high, this may be dangerous. You might do well to consider a smoother approach, such as using the generalized ROC area (C-index) or its related rank correlation measure Dxy. There are also odds ratios.

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
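To make the point about forced classification concrete, here is a small base-R sketch using the ten rows of Prob(Y=j) posted in the original question. The names p, top_class and top_prob are illustrative only. Keep in mind that in the help example y is sampled independently of x1 and x2, so none of these probabilities should be expected to be far from 1/3.

```r
# Prob(Y = j) for the first ten observations, copied from the original post
p <- matrix(c(0.2517915, 0.3469692, 0.4012392,
              0.3031733, 0.3554471, 0.3413796,
              0.3046236, 0.3555365, 0.3398398,
              0.3514780, 0.3546880, 0.2938340,
              0.1989827, 0.3251784, 0.4758390,
              0.3581265, 0.3540297, 0.2878438,
              0.3130150, 0.3559091, 0.3310759,
              0.2541324, 0.3476007, 0.3982669,
              0.2740127, 0.3519713, 0.3740160,
              0.2839907, 0.3535331, 0.3624763),
            ncol = 3, byrow = TRUE,
            dimnames = list(1:10, c("good", "better", "best")))

# Most probable category per row, and how probable it actually is
top_class <- colnames(p)[max.col(p, ties.method = "first")]
top_prob  <- apply(p, 1, max)
print(data.frame(top_class, top_prob))

# Cumulative Prob(Y >= j): never increases with j, so its maximum says
# nothing about which single category is most likely
cum <- cbind(">=good" = 1, ">=better" = p[, "better"] + p[, "best"],
             ">=best" = p[, "best"])
print(all(cum[, 1] >= cum[, 2] & cum[, 2] >= cum[, 3]))
```

In these rows the most probable category never reaches a probability of 0.5, so labelling an observation with it would be wrong more often than right under the model's own estimates.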
peterfrancis at me.com
2010-Oct-01 14:23 UTC
[R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
The reason I am trying to assign them is that I have a data set for which I have arrived at the most likely model describing the data, and I now have another data set where I know the factors but not the response. Surely, then, I need to assign the predicted values to a response in order to say something like: based on the model, I believe unknown 1 is good, whereas unknown 2 is very good, etc.?

Maybe I am missing something or using the wrong approach, but I thought the main purpose of using the predict function on new data was to "predict" the response?

Peter

On 1 Oct 2010, at 14:51, Frank Harrell <f.harrell at vanderbilt.edu> wrote:

> Why assign them at all? Is this a "forced choice at gunpoint" problem?
> Remember what probabilities mean.
>
> Frank
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
John Haart
2010-Oct-01 14:36 UTC
[R] Fwd: Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
Frank and list,

The reason I am trying to assign them is that I have a data set for which I have arrived at the most likely model describing the data, and I now have another data set where I know the factors but not the response. Surely, then, I need to assign the predicted values to a response in order to say something like: based on the model, I believe unknown 1 is good, whereas unknown 2 is very good, etc.?

Maybe I am missing something or using the wrong approach, but I thought the main purpose of using the predict function on new data was to "predict" the response?

John

On 1 Oct 2010, at 14:51, Frank Harrell <f.harrell at vanderbilt.edu> wrote:

> Why assign them at all? Is this a "forced choice at gunpoint" problem?
> Remember what probabilities mean.
>
> Frank
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
John Haart
2010-Oct-04 13:03 UTC
[R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
Dear List and Frank,

I have calculated the log-odds for my models, but maybe I am missing something, as I do not understand how this helps for a categorical factor. All the examples I have seen relate to continuous factors, where moving from one number to another shows either an increase or a decrease, not, as in my case, a change of category.

Furthermore, this gives the values for each factor independently of the others; how do I get the log-odds for the entire model? I appreciate that I may seem to be trying to put things in boxes again, but I am not. I am happy to report the log odds of moving from one response level to the next, but I would like it for all the factors together, not independently.

John

Factor                                     Low High Diff. Effect S.E.  Lower Upper
WO  Woody:Non_woody                        1   2    NA     0.28  0.16  -0.04  0.60
 Odds Ratio                                1   2    NA     1.32  NA     0.96  1.82
PD  Abiotic:Biotic                         2   1    NA    -1.21  0.13  -1.47 -0.96
 Odds Ratio                                2   1    NA     0.30  NA     0.23  0.38
ALT All:Low                                3   1    NA     0.47  0.19   0.11  0.84
 Odds Ratio                                3   1    NA     1.60  NA     1.11  2.31
ALT High:Low                               3   2    NA    -0.07  0.14  -0.35  0.21
 Odds Ratio                                3   2    NA     0.93  NA     0.70  1.24
ALT Mid:Low                                3   4    NA     0.39  0.15   0.10  0.67
 Odds Ratio                                3   4    NA     1.48  NA     1.11  1.96
REG Two_plus:One                           1   2    NA    -0.59  0.13  -0.84 -0.34
 Odds Ratio                                1   2    NA     0.55  NA     0.43  0.72
BIO Arctic:Subtropical/Tropical            4   1    NA    -1.02  0.81  -2.61  0.58
 Odds Ratio                                4   1    NA     0.36  NA     0.07  1.78
BIO Boreal:Subtropical/Tropical            4   2    NA    -1.21  0.81  -2.79  0.37
 Odds Ratio                                4   2    NA     0.30  NA     0.06  1.44
BIO Mediterranean:Subtropical/Tropical     4   3    NA    -1.89  0.48  -2.83 -0.95
 Odds Ratio                                4   3    NA     0.15  NA     0.06  0.39
BIO Temperate:Subtropical/Tropical         4   5    NA    -0.09  0.16  -0.41  0.23
 Odds Ratio                                4   5    NA     0.91  NA     0.66  1.26

On 3 Oct 2010, at 15:29, Frank Harrell wrote:

> You still seem to be hung up on making arbitrary classifications. Instead,
> look at tendencies using odds ratios or rank correlation measures. My book
> Regression Modeling Strategies covers this.
>
> Frank
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
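A note on reading the table above. Each "Odds Ratio" row appears to be the exponential of the log-odds row directly above it, for the effect and for both confidence limits. The base-R sketch below checks that on three of the posted contrasts; the object names are illustrative. The last step, combining contrasts for several factors by adding their log odds, is an assumption that holds only if the model has no interactions among those factors, which the post does not state.

```r
# Log-odds effects and confidence limits copied from three rows of the table
log_odds <- rbind(WO  = c(effect =  0.28, lower = -0.04, upper =  0.60),
                  PD  = c(effect = -1.21, lower = -1.47, upper = -0.96),
                  REG = c(effect = -0.59, lower = -0.84, upper = -0.34))

# Odds ratios: exponentiate the log-odds scale
odds_ratio <- exp(log_odds)
print(round(odds_ratio, 2))

# Odds ratios printed in the table, for comparison (agreement is only up to
# rounding, because the posted log odds are themselves rounded to 2 decimals)
posted <- rbind(WO  = c(1.32, 0.96, 1.82),
                PD  = c(0.30, 0.23, 0.38),
                REG = c(0.55, 0.43, 0.72))
print(max(abs(odds_ratio - posted)))

# ASSUMPTION: additive model, no interactions. The combined log odds for
# changing WO and REG together is the sum of the two effects, and the
# combined odds ratio is the product of the two odds ratios.
combined_log_odds <- sum(log_odds[c("WO", "REG"), "effect"])
combined_or       <- exp(combined_log_odds)
print(combined_or)
```

Under the proportional odds model, an odds ratio of 0.55 for REG Two_plus:One would be read as the odds of being in a higher response category (at any cut point) being 0.55 times as large for the first level of the contrast as for the second, with the other factors held fixed. The direction of each contrast should be checked against the Low and High columns.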
Frank Harrell
2010-Oct-04 13:10 UTC
[R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
I may be missing a point, but the proportional odds model easily gives you odds ratios for Y>=j (independent of j by the PO assumption). Other options include examining a rank correlation between the linear predictor and Y, or (if Y is numeric and the spacings between categories are meaningful) you can get the predicted mean Y (see Mean.lrm in the R rms package, a replacement for the Design package).

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
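The statement that the odds ratio for Y>=j does not depend on j can be illustrated with a few lines of base R. This is a toy sketch, not output from any fitted model: the intercepts alpha and the slope beta are made-up values, and the model is written as logit Prob(Y>=j | x) = alpha_j + beta * x with one intercept per cut point and a single shared slope.

```r
# HYPOTHETICAL parameters for a three-level response (good < better < best)
alpha <- c("y>=better" = 0.8, "y>=best" = -0.7)  # one intercept per cut point
beta  <- 0.5                                     # shared log odds ratio for x

# Prob(Y >= j | x) for both cut points at once
p_ge <- function(x) plogis(alpha + beta * x)
odds <- function(p) p / (1 - p)

# Odds ratio for x = 1 versus x = 0, computed separately at each cut point
or_by_cut <- odds(p_ge(1)) / odds(p_ge(0))
print(or_by_cut)
print(exp(beta))
```

Both cut points give the same odds ratio, exp(beta), even though the underlying probabilities differ. That is why a single odds ratio per contrast can summarise the effect of a factor on the whole ordinal response without assigning observations to categories.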