Weijia Wang
2013-Jan-31 15:51 UTC
[R] Locate Patients who have multiple high blood pressure readings
On Thu, Jan 31, 2013 at 10:29 AM, Weijia Wang <wwang.nyu@gmail.com> wrote:

> Hi,
>
> I have a new question about subsetting in R.
>
> Say we have this data frame:
>
>      PT_ID Blood_Pressure OBS_TYPE
> 92    1900           90.0      DBP
> 94    1900           90.0      DBP
> 174   2900          140.0      SBP
> 176   2900          130.0      SBP
> 180   3900          120.0      SBP
> 268   3900          150.0      SBP
> 268   3900           90.0      DBP
>
> I need to obtain those with 2+ DBP>=90 or 2+ SBP>=140.
>
> PT_ID=1900, he has 2 DBP>=90, so he will be included.
> PT_ID=2900, he has 1 SBP>=140, so he will NOT be included.
> PT_ID=3900, he has 1 SBP>=140 and 1 DBP>=90, so he will still NOT be included.
>
> So, the condition requires TWO OR MORE values higher than the threshold.
> It could be either SBP or DBP or both of them.
>
> I have tried ddply, but I don't know how to add the condition 2+ inside ddply.
>
> Any help is appreciated!!
>
> Weijia
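To make the replies in this thread easy to run, the posted table can be rebuilt as a data frame. This is only a sketch: the name dd is taken from the reply that assumes it, and the row names are left at R's defaults because the posted table repeats row name 268, which a data frame would not accept.

```r
# Rebuild the example data from the post (row names left at defaults).
dd <- data.frame(
  PT_ID          = c(1900, 1900, 2900, 2900, 3900, 3900, 3900),
  Blood_Pressure = c(90, 90, 140, 130, 120, 150, 90),
  OBS_TYPE       = c("DBP", "DBP", "SBP", "SBP", "SBP", "SBP", "DBP"),
  stringsAsFactors = FALSE
)

# dput() prints a text representation that can be pasted into a post.
dput(dd)
```

Pasting the dput() output into a message lets others recreate exactly the same object, including column types.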
Bert Gunter
2013-Jan-31 17:52 UTC
[R] Locate Patients who have multiple high blood pressure readings
Well, since no one has responded...

Please use ?dput to provide data in your posts.

There are likely zillions of ways to go about this. The following is one way, based on ?duplicated, that I think works, but I make no claims for either elegance or efficiency. Others may do a lot better. But maybe it suffices.

## Untested
## I assume the data is provided in a data frame named dd.

## All PT_IDs with >= 1 high reading in SBP or in DBP
hiS <- with(dd, PT_ID[OBS_TYPE == "SBP" & Blood_Pressure >= 140])
hiD <- with(dd, PT_ID[OBS_TYPE == "DBP" & Blood_Pressure >= 90])

## IDs that appear more than once in either
union(unique(hiS[duplicated(hiS)]), unique(hiD[duplicated(hiD)]))

## You can subset your data frame to match just these, e.g. via %in%, if you like.

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
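A self-contained sketch of the duplicated() approach described above, followed by the %in% subsetting step that the reply mentions but does not show. The data frame is rebuilt from the question; the names ids and flagged are illustrative and do not appear in the thread.

```r
# Example data from the question, under the name dd assumed in the reply.
dd <- data.frame(
  PT_ID          = c(1900, 1900, 2900, 2900, 3900, 3900, 3900),
  Blood_Pressure = c(90, 90, 140, 130, 120, 150, 90),
  OBS_TYPE       = c("DBP", "DBP", "SBP", "SBP", "SBP", "SBP", "DBP"),
  stringsAsFactors = FALSE
)

# One entry per high reading, so a patient with two high readings appears twice.
hiS <- dd$PT_ID[dd$OBS_TYPE == "SBP" & dd$Blood_Pressure >= 140]
hiD <- dd$PT_ID[dd$OBS_TYPE == "DBP" & dd$Blood_Pressure >= 90]

# duplicated() marks second and later occurrences, i.e. IDs with 2+ high readings.
ids <- union(unique(hiS[duplicated(hiS)]), unique(hiD[duplicated(hiD)]))

# Keep every row belonging to a flagged patient.
flagged <- dd[dd$PT_ID %in% ids, ]
flagged
```

Because SBP and DBP are checked separately, a patient with one high reading of each type (PT_ID 3900 here) is not flagged, which matches the rule stated in the question.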
Gabor Grothendieck
2013-Jan-31 19:20 UTC
[R] Locate Patients who have multiple high blood pressure readings
On Thu, Jan 31, 2013 at 10:51 AM, Weijia Wang <wwang.nyu at gmail.com> wrote:

> I have tried ddply, but I don't know how to add the condition 2+ inside ddply.

This can be specified in a reasonably natural fashion using SQL. Here DF is the input data frame:

> library(sqldf)
> sqldf("select
+     PT_ID,
+     sum(Blood_Pressure >= 90 and OBS_TYPE == 'DBP') DBP,
+     sum(Blood_Pressure >= 140 and OBS_TYPE == 'SBP') SBP
+   from DF
+   group by PT_ID
+   having DBP >= 2 or SBP >= 2")
  PT_ID DBP SBP
1  1900   2   0
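For readers without sqldf installed, the same count-per-patient-then-filter logic can be sketched in base R. Note that this swaps in aggregate() for the SQL query and is not what the reply used. The data frame is rebuilt from the question under the name DF, and the names counts and result are illustrative.

```r
# Example data from the question, under the name DF used in the reply.
DF <- data.frame(
  PT_ID          = c(1900, 1900, 2900, 2900, 3900, 3900, 3900),
  Blood_Pressure = c(90, 90, 140, 130, 120, 150, 90),
  OBS_TYPE       = c("DBP", "DBP", "SBP", "SBP", "SBP", "SBP", "DBP"),
  stringsAsFactors = FALSE
)

# Flag each high reading as 0/1; these play the role of the two sum() expressions.
DF$DBP <- as.integer(DF$OBS_TYPE == "DBP" & DF$Blood_Pressure >= 90)
DF$SBP <- as.integer(DF$OBS_TYPE == "SBP" & DF$Blood_Pressure >= 140)

# GROUP BY PT_ID: add up the flags for each patient.
counts <- aggregate(cbind(DBP, SBP) ~ PT_ID, data = DF, FUN = sum)

# HAVING: keep patients with two or more high readings of either type.
result <- counts[counts$DBP >= 2 | counts$SBP >= 2, ]
result
```

Keeping the per-type counts in the result, as the SQL version does, makes it easy to see which threshold put a patient on the list.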
arun
2013-Jan-31 19:38 UTC
[R] Locate Patients who have multiple high blood pressure readings
Hi,

Maybe this helps:

# dd is the data frame
res <- data.frame(Include = with(
  subset(dd, (OBS_TYPE == "SBP" & Blood_Pressure >= 140) |
             (OBS_TYPE == "DBP" & Blood_Pressure >= 90)),
  apply(tapply(Blood_Pressure, list(PT_ID, OBS_TYPE), length) >= 2,
        1, any, na.rm = TRUE)))
res
#      Include
# 1900    TRUE
# 2900   FALSE
# 3900   FALSE

A.K.
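The tapply() step above produces NA for patient/type combinations with no high readings, which is why na.rm is needed. A possible variant, sketched here with table() rather than taken from the reply, yields zero counts instead. The data frame is rebuilt from the question, and the names high, tab and ids are illustrative.

```r
# Example data from the question, under the name dd.
dd <- data.frame(
  PT_ID          = c(1900, 1900, 2900, 2900, 3900, 3900, 3900),
  Blood_Pressure = c(90, 90, 140, 130, 120, 150, 90),
  OBS_TYPE       = c("DBP", "DBP", "SBP", "SBP", "SBP", "SBP", "DBP"),
  stringsAsFactors = FALSE
)

# Keep only the readings at or above the threshold for their type.
high <- subset(dd, (OBS_TYPE == "SBP" & Blood_Pressure >= 140) |
                   (OBS_TYPE == "DBP" & Blood_Pressure >= 90))

# Cross-tabulate patient by reading type; empty cells are 0, not NA.
tab <- table(high$PT_ID, high$OBS_TYPE)

# A patient qualifies if either type has at least two high readings.
ids <- rownames(tab)[apply(tab >= 2, 1, any)]
ids
```

One difference to keep in mind: a patient with no high readings at all is dropped by subset() and so never appears in tab, whereas here every patient happens to have at least one. Also, ids comes back as character because it is taken from the table's row names.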