Dear R-help,

I am working on a credit scorecard model. I am using logistic regression to arrive at the regression coefficients.

I want to use the Hosmer-Lemeshow test.

In order to understand the use of the R language, I referred to the following URL:

http://www.stat.sc.edu/~hitchcock/diseaseoutbreakRexample704.txt

The related data 'diseaseoutbreak' is available at the following URL:

http://www.stat.sc.edu/~hitchcock/diseaseoutbreakdata.txt

The R code as given there is:

####
# A function to do the Hosmer-Lemeshow test in R.
# R Function is due to Peter D. M. Macdonald, McMaster University.
#
hosmerlem <-
function (y, yhat, g = 10)
{
cutyhat <- cut(yhat, breaks = quantile(yhat, probs = seq(0,
1, 1/g)), include.lowest = T)
obs <- xtabs(cbind(1 - y, y) ~ cutyhat)
expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
chisq <- sum((obs - expect)^2/expect)
P <- 1 - pchisq(chisq, g - 2)
c("X^2" = chisq, Df = g - 2, "P(>Chi)" = P)
}
#
######

# Doing the Hosmer-Lemeshow test
# (after copying the above function into R):

hosmerlem(disease, fitted(disease.logit))

However, when I ran these commands in R, I got the following error:

Error in model.frame.default(formula = cbind(1 - y, y) ~ cutyhat) :
  invalid type (list) for variable 'cbind(1 - y, y)'

Can anyone please guide me on how to run the Hosmer-Lemeshow test, and also on how to obtain the other usual logistic regression quantities (log-likelihood, AIC, pseudo R-squared, etc.)?

Thanking you all in advance,

Saggak
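The error quoted above says that `y` reached `xtabs()` as a list, which most likely means `disease` was a whole data frame rather than a numeric 0/1 vector. The sketch below is one way to check that, not the original course code: it uses simulated stand-in data (the column names `age` and `sector` and all values are assumptions, not the real diseaseoutbreak file), adds an input check to `hosmerlem`, and then pulls the log-likelihood, AIC and McFadden's pseudo R-squared from the fitted `glm` using base R only.

```r
# Simulated stand-in for the disease outbreak data (hypothetical values).
set.seed(1)
n <- 200
age <- runif(n, 1, 80)
sector <- rbinom(n, 1, 0.4)
disease <- rbinom(n, 1, plogis(-2.5 + 0.03 * age + 1.2 * sector))
dat <- data.frame(disease, age, sector)

disease.logit <- glm(disease ~ age + sector, family = binomial, data = dat)

# Same computation as the posted function, plus a guard that rejects
# data frames and other non-numeric input for y.
hosmerlem <- function(y, yhat, g = 10) {
  stopifnot(is.numeric(y), is.numeric(yhat), length(y) == length(yhat))
  cutyhat <- cut(yhat,
                 breaks = quantile(yhat, probs = seq(0, 1, 1 / g)),
                 include.lowest = TRUE)
  obs <- xtabs(cbind(1 - y, y) ~ cutyhat)
  expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
  chisq <- sum((obs - expect)^2 / expect)
  P <- 1 - pchisq(chisq, g - 2)
  c("X^2" = chisq, Df = g - 2, "P(>Chi)" = P)
}

# Pass the response column, not the data frame that contains it.
hl <- hosmerlem(dat$disease, fitted(disease.logit))
print(hl)

# Other usual summaries for a logistic regression fit.
ll <- logLik(disease.logit)            # log-likelihood
aic <- AIC(disease.logit)              # Akaike information criterion
null.fit <- update(disease.logit, . ~ 1)   # intercept-only model
mcfadden <- 1 - as.numeric(ll) / as.numeric(logLik(null.fit))
print(ll); print(aic); print(mcfadden)
```

With real data, `str(disease)` before the call shows whether the object is a numeric vector or a data frame; if it is a data frame, select the response column with `$`.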
saggak wrote:
> I want to use the Hosmer-Lemeshow test.
> [...]
> However, when I ran these commands in R, I got the following error:
>
> Error in model.frame.default(formula = cbind(1 - y, y) ~ cutyhat) :
>   invalid type (list) for variable 'cbind(1 - y, y)'
>
> Can anyone please guide me on how to run the Hosmer-Lemeshow test, and also
> on how to obtain the other usual logistic regression quantities?

That test is too dependent on cutpoints and does not have adequate power. I recommend replacing it with the test described in

@ARTICLE{hos97com,
  author = {Hosmer, D. W. and Hosmer, T. and {le Cessie}, S. and Lemeshow, S.},
  year = 1997,
  title = {A comparison of goodness-of-fit tests for the logistic regression model},
  journal = {Statistics in Medicine},
  volume = 16,
  pages = {965-980},
  annote = {goodness-of-fit for binary logistic model;difficulty with Hosmer-Lemeshow statistic being dependent on how groups are defined;sum of squares test;cumulative sum test;invalidity of naive test based on deviance;goodness-of-link function;simulation setup}
}

which is implemented in the residuals.lrm function in the Design package.

-- 
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics
Vanderbilt University
Dear Mr Frank,

Thank you for your prompt reply. However, I am not able to understand the contents of your reply (maybe because R is a new venture for me). If it is a book you are referring to, I don't have access to it. How do I get @ARTICLE{hos97com, and how do I run it in R?

Thanking you in advance.

With regards,

Saggak

--- On Tue, 16/9/08, Frank E Harrell Jr <f.harrell@vanderbilt.edu> wrote:

From: Frank E Harrell Jr <f.harrell@vanderbilt.edu>
Subject: Re: [R] Hosmer- Lemeshow test
To: saggak1908@yahoo.co.in
Cc: "R list" <r-help@stat.math.ethz.ch>
Date: Tuesday, 16 September, 2008, 4:38 PM

> That test is too dependent on cutpoints and does not have adequate power.
> I recommend replacing it with [...] which is implemented in the
> residuals.lrm function in the Design package.
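The `@ARTICLE{hos97com, ...}` text in the reply is a BibTeX citation for a journal paper, not something to run; the runnable part is the `residuals.lrm` function. A minimal sketch of calling it follows, assuming the `rms` package (the successor to Design) is installed, and again using simulated stand-in data with hypothetical column names rather than the real diseaseoutbreak file.

```r
library(rms)  # assumption: rms is installed; it superseded the Design package

# Simulated stand-in data (hypothetical values and column names).
set.seed(2)
n <- 300
age <- runif(n, 1, 80)
sector <- rbinom(n, 1, 0.4)
disease <- rbinom(n, 1, plogis(-2.5 + 0.03 * age + 1.2 * sector))
dat <- data.frame(disease, age, sector)

# x = TRUE and y = TRUE store the design matrix and response in the fit,
# which the goodness-of-fit residuals need.
fit <- lrm(disease ~ age + sector, data = dat, x = TRUE, y = TRUE)

# Unweighted sum-of-squares goodness-of-fit test; no grouping into
# deciles, so there are no cutpoints to choose.
gof <- residuals(fit, type = "gof")
print(gof)

# Printing the fit reports the likelihood ratio test and R2, among others.
print(fit)
```

A large p-value in the `P` element of `gof` gives no evidence of lack of fit; it does not by itself show that the model is adequate.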
Possibly Parallel Threads
- Hosmer-Lemeshow 'goodness of fit'
- two kind of Hosmer and Lemeshow’s test
- Hosmer Lemeshow test
- Different goodness of fit tests leads to contradictory conclusions
- Appropriate tests for logistic regression with a continuous predictor variable and Bernoulli response variable