Dear Sir or Madam,

I am a urologist developing a nomogram for bladder cancer. May I ask for your
help with the following issue?

I built a dataset of 317 cases and fitted a binary logistic regression model in
SPSS. I then tried to reconstruct the model in R with
"lrm(RECU~Complications+T.Num+T.Grade+Year+TS)", to validate it internally with
the function "validate( )", and to obtain the ROC curve with the function
"plot.roc( )". The output is shown below. Ultimately I want to obtain the
logistic model and its prediction accuracy. The "Area under the curve" (0.6931)
is not too bad, but the "Dxy" (which I take to be the probability of an accurate
prediction) is too low, and I do not know why. Maybe I have misunderstood the
function "lrm( )" and applied it incorrectly.

Could you please give me some idea of how to resolve this problem? Thanks in
advance for your kind support.

Warm regards,
Ding
---------------------------------------outcomes----------------------------------------------------------------------------
Logistic Regression Model

lrm(formula = RECU ~ Complications + T.Num + T.Grade + Year + TS, x = TRUE, y = TRUE)

                      Model Likelihood     Discrimination    Rank Discrim.
                         Ratio Test            Indexes          Indexes
Obs           317    LR chi2      37.78    R2       0.154    C       0.693
 0            201    d.f.             5    g        0.876    Dxy     0.386
 1            116    Pr(> chi2) <0.0001    gr       2.400    gamma   0.408
max |deriv| 2e-09                          gp       0.183    tau-a   0.180
                                           Brier    0.207

              Coef    S.E.   Wald Z Pr(>|Z|)
Intercept     -2.3566 0.3819 -6.17  <0.0001
Complications  1.6807 0.6005  2.80   0.0051
T.Num          0.6481 0.2503  2.59   0.0096
T.Grade        0.4276 0.1820  2.35   0.0188
Year           0.5759 0.2849  2.02   0.0432
TS             0.6313 0.2750  2.30   0.0217

> validate(f,B=200)
          index.orig training    test optimism index.corrected   n
Dxy           0.3861   0.4081  0.3699   0.0382          0.3479 200
R2            0.1537   0.1716  0.1378   0.0339          0.1198 200
Intercept     0.0000   0.0000 -0.0585   0.0585         -0.0585 200
Slope         1.0000   1.0000  0.8835   0.1165          0.8835 200
Emax          0.0000   0.0000  0.0375   0.0375          0.0375 200
D             0.1160   0.1315  0.1030   0.0285          0.0875 200
U            -0.0063  -0.0063  0.0021  -0.0084          0.0021 200
Q             0.1223   0.1378  0.1010   0.0369          0.0855 200
B             0.2073   0.2035  0.2114  -0.0079          0.2153 200
g             0.8755   0.9415  0.8170   0.1244          0.7511 200
gp            0.1833   0.1920  0.1728   0.0192          0.1641 200

> plot.roc(RECU,l)
Call:
plot.roc.default(x = RECU, predictor = l)
Data: l in 201 controls (response 0) < 116 cases (response 1).
Area under the curve: 0.6931

On Dec 9, 2010, at 8:06 AM, ?? wrote:

> Dear Sir or Madam,
>
> I am a doctor of urology, and I am engaged in developing a nomogram
> of bladder cancer. May I ask for your help on below issue?
>
> I set up a dataset which include 317 cases. I got the Binary
> Logistic Regression model by SPSS. And then I try to reconstruct the
> model "lrm(RECU~Complication+T.Num+T.Grade+Year+TS)" by R-Project, and try
> to internal validate the model through using the function
> "validate( )", and get the ROC through the function
> "plot.roc( )". The outcomes like this: At last I want to get the
> Logistic model, and get the prediction accuracy. Now the "Area under
> the curve" (0.6931) is not too bad,

As a doctor with some experience using logistic regression and looking at the
output of ROC analyses, I would ask why you think an AUC is "not too bad"? Just
flipping a fair coin would give you an expected AUC of 0.5.

> but the "Dxy" (I think it as the prediction accuracy probability) is
> too low.

If I'm not mistaken there is a 1-1 relationship between AUC and Dxy, so I
suspect you have an incorrect scale for judging the merits of one or the other
of those figure of merit numbers. (You may want to purchase Harrell's text
"Regression Modeling Strategies" in which I just checked my memory on this point
and found that I was correct.)

-- 
David.

> And I don't know which reason lead to the outcomes. Maybe I have a
> mistake understanding on the function "lrm( )", and apply it wrong.
>
> Could you please give me some idea on how to resulve this problem?
> Thanks in advance for your kind support.
>
> warmly regards,
> Ding
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
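
The 1-1 relationship between AUC and Dxy mentioned in the reply can be written out explicitly. For a binary outcome, Somers' Dxy is a linear rescaling of the concordance index C (the AUC), so the two figures in the posted output carry the same information on different scales: C runs from 0.5 (chance) to 1, Dxy from 0 (chance) to 1. The worked check below uses only numbers from the output above. Converting the optimism-corrected Dxy back to a corrected C with the same formula is a common convention, shown here as a sketch and not as something validate( ) reports itself.

```latex
% Relationship between Somers' D_xy and the concordance index C (AUC)
\[
  D_{xy} = 2\,(C - 0.5)
  \qquad\Longleftrightarrow\qquad
  C = 0.5 + \frac{D_{xy}}{2}
\]

% Apparent (index.orig) values: the AUC from plot.roc reproduces D_xy from lrm
\[
  2\,(0.6931 - 0.5) = 0.3862 \approx 0.3861
\]

% Optimism-corrected D_xy from validate(f, B = 200), converted to the C scale
\[
  C_{\text{corrected}} = 0.5 + \frac{0.3479}{2} \approx 0.674
\]
```

Read this way, a Dxy of 0.386 is not "lower" than an AUC of 0.693; both describe the same modest discrimination, and the bootstrap correction lowers it slightly further.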