I have a dataset at the hospital level (as opposed to the patient level) that contains the number of patients experiencing an event (call this y) and the number of patients eligible for the event (call this n). I am trying to model logit(y/n) = XBeta. In SAS this can be done in PROC LOGISTIC or PROC GENMOD with a model statement such as: model y/n = <predictors>;

Can this be done using lrm from the Design library (the companion to Hmisc) without restructuring the dataset? The restructuring I have in mind gives each hospital one row with y = 1 and one row with y = 0, then uses the weights argument of lrm to weight these two rows by that hospital's number of 'successes' and 'failures', respectively. I would like to avoid the restructuring, and I understand that weights are not compatible with many of the validation functions (validate, bootcov, etc.).

Cody Hamilton, Ph.D
Institute for Health Care Research and Improvement
Baylor Health Care System
(214) 265-3618
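For reference, here is one way to write out the model behind logit(y/n) = XBeta. It assumes each hospital's count is binomial with independent patients, which the post does not state explicitly. The hospital index i and the event probability p_i are notation introduced here for illustration.

```latex
\[
y_i \sim \mathrm{Binomial}(n_i,\, p_i), \qquad
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = x_i^{\top}\beta ,
\]
\[
\ell(\beta) = \sum_i \bigl[\, y_i \log p_i + (n_i - y_i)\log(1 - p_i) \,\bigr] + \text{const}.
\]
```

Written this way, the logit is applied to the underlying probability p_i rather than to the observed proportion y_i/n_i, so hospitals with y = 0 or y = n cause no difficulty in the likelihood.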
Hamilton, Cody wrote:
> I am trying to model logit(y/n) = XBeta. In SAS this can be
> done in PROC LOGISTIC or GENMOD with a model statement such as: model
> y/n = <predictors>;. Can this be done using lrm [...]

I don't know about lrm, but with glm you can do glm(cbind(y, m) ~ ..., family = binomial), where y is the number of successes and m is the number of failures. So you might try that.

--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.946.8081  Fax: 416.946.3297
Cody Hamilton, Ph.D, wrote:
> I am trying to model logit(y/n) = XBeta. In
> SAS this can be done in PROC LOGISTIC or GENMOD with a model
> statement such as: model y/n = <predictors>;. Can this be done using
> lrm [...]

Why do you need lrm()? Is there something I'm missing? As far as I can tell you can simply do

    glm(cbind(y, n - y) ~ <predictors>, family = binomial, data = <data>)

where <data> has columns named "y", "n", and whatever the predictors are called.

cheers,

Rolf Turner
rolf at math.unb.ca
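A minimal sketch of the glm() suggestion, using only base R. The data frame hosp and its values are made up for illustration; only the column names y and n come from the thread. It also shows the equivalent "proportion plus weights" form, and why the second column of the response must be the number of failures (n - y) rather than n.

```r
# Hypothetical hospital-level data: one row per hospital.
hosp <- data.frame(
  y = c(12, 30, 7, 45, 19, 25),         # patients with the event
  n = c(100, 220, 80, 300, 150, 160),   # patients eligible
  x = c(0.2, 1.1, -0.5, 1.8, 0.4, 0.9)  # made-up predictor
)

# Grouped binomial fit: the response is a two-column matrix of
# (successes, failures), so the second column is n - y, not n.
fit_cbind <- glm(cbind(y, n - y) ~ x, family = binomial, data = hosp)

# Equivalent form: observed proportion as response, n as prior weights.
fit_prop <- glm(y / n ~ x, family = binomial, weights = n, data = hosp)

# Passing n itself as the second column fits a different model,
# because n would then be counted as the number of failures.
fit_wrong <- glm(cbind(y, n) ~ x, family = binomial, data = hosp)

print(coef(fit_cbind))
print(coef(fit_prop))
print(coef(fit_wrong))
```

The first two fits should agree up to numerical tolerance, while the third differs because it models the odds y/n instead of y/(n - y).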
Thank you for the suggestions! I am interested in using lrm because I am not sure that glm will work with other functions from Hmisc (e.g. aregImpute, fit.mult.impute, etc.).

Cody Hamilton, Ph.D
Not sure about your data set, but if you have some kind of (weighted/stratified) sample of hospitals you need to pay special attention. Survey data violate the assumptions of the classical linear models (infinite population, identically distributed errors, etc.) and need to be analyzed differently. In SAS, it is wrong to throw such data into PROC LOGISTIC or PROC REG; PROC SURVEYLOGISTIC or PROC SURVEYREG should be used instead. In R, take a look at the survey package. For details see http://www2.sas.com/proceedings/sugi31/193-31.pdf

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch On Behalf Of Hamilton, Cody
> Sent: Friday, June 16, 2006 1:32 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] modeling logit(y/n) using lrm
>
> I have a dataset at a hospital level (as opposed to the patient level)
> that contains number of patients experiencing events (call this number
> y), and the number of patients eligible for such events (call this
> number n). I am trying to model logit(y/n) = XBeta. [...]
After a little digging, it turns out that fit.mult.impute will allow fitter = glm, so the previous suggestions to model cbind(y, n - y) as the outcome will work fine. Thanks!

Cody Hamilton, Ph.D
Hamilton, Cody wrote:
> After a little digging, it turns out that fit.mult.impute will allow
> fitter = glm, [...]

Also lrm can easily handle your setup using the weights argument.

Frank

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
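A rough sketch of the weighted two-rows-per-hospital layout discussed in this thread. The data frame hosp, its values, and the names long, event, and w are hypothetical. The comparison uses base R glm() so it runs anywhere. The lrm() call runs only if the rms package (the successor to Design) is installed, and it assumes lrm's weights argument behaves as described above. Weighting a 0/1 response by the success and failure counts gives the same log-likelihood as the grouped binomial fit up to a constant, so the coefficient estimates should match.

```r
# Hypothetical hospital-level data: one row per hospital.
hosp <- data.frame(
  id = 1:6,
  y  = c(12, 30, 7, 45, 19, 25),         # patients with the event
  n  = c(100, 220, 80, 300, 150, 160),   # patients eligible
  x  = c(0.2, 1.1, -0.5, 1.8, 0.4, 0.9)  # made-up predictor
)

# Two rows per hospital: event = 1 weighted by y, event = 0 weighted by n - y.
long <- rbind(
  data.frame(id = hosp$id, x = hosp$x, event = 1, w = hosp$y),
  data.frame(id = hosp$id, x = hosp$x, event = 0, w = hosp$n - hosp$y)
)
long <- long[long$w > 0, ]  # drop rows that represent no patients

# Weighted 0/1 fit and grouped fit, both with base R glm().
fit_long    <- glm(event ~ x, family = binomial, weights = w, data = long)
fit_grouped <- glm(cbind(y, n - y) ~ x, family = binomial, data = hosp)

print(cbind(weighted = coef(fit_long), grouped = coef(fit_grouped)))

# Optional: the same weighted layout passed to lrm(), if rms is available.
if (requireNamespace("rms", quietly = TRUE)) {
  fit_lrm <- rms::lrm(event ~ x, data = long, weights = w)
  print(coef(fit_lrm))
}
```

As noted in the original question, weighted fits are not supported by the validation and bootstrap helpers, so this layout is mainly useful when only the fitted model itself is needed.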