Hi all, thank you for your patience.

I am dealing with a large dataset detailing patients and medications.

Medications are hard to code, as they are (usually) meaningless unless matched with doses.

I have a dataframe with vectors (Drug1, Drug2, ..., Drug16), and individual patients are represented by rows. The vectors are actually factors, with 100s of possible levels (all the drugs the patient could be on).

All I want to do is produce a vector of logicals (TTTTFFFFTTT......) that I can cbind into the dataframe, that will tell me whether a patient is or is not on a particular, important drug.

This is so I can use the presence or absence of particularly important drugs as categorical covariates in a model.

I've tried grep, to search along the rows, and I can generate a vector of identifiers, but I cannot seem to generate the vector of logicals.

I realise I'm doing something simple wrong.

> names(drugindex)
 [1] "book.MRN" "DRUG1"    "DRUG2"    "DRUG3"    "DRUG4"    "DRUG5"
 [7] "DRUG6"    "DRUG7"    "DRUG8"    "DRUG9"    "DRUG10"   "DRUG11"
[13] "DRUG12"   "DRUG13"   "DRUG14"   "DRUG15"   "DRUG16"

> truvec <- drugindex$book.MRN[as.vector(unlist(apply(drugindex[,2:17], 2,
+     grep, pattern="Lamotrigine")))]
> truvec
[1] 0024633 0008291 0008469 0030599 0027667
37 Levels: 0008291 0008469 0010188 0014217 0014439 0015822 ... 0034262

> head(drugindex)
   book.MRN       DRUG1        DRUG2          DRUG3        DRUG4        DRUG5
4   0008291 Venlafaxine Procyclidine  Flunitrazepam Amisulpiride    Clozapine
31  0008469 Venlafaxine  Mirtazapine        Lithium   Olanzapine   Metoprolol
3   0010188  Flurazepam    Valproate     Olanzapine  Mirtazapine Esomeprazole
13  0014217     Aspirin     Ramipril Zuclopenthixol    Lorazepam  Haloperidol
15  0014439   Zopiclone     Diazepam    Haloperidol  Paracetamol         <NA>
5   0015822  Olanzapine  Venlafaxine        Lithium  Haloperidol   Alprazolam
         DRUG6      DRUG7      DRUG8      DRUG9          DRUG10 DRUG11 DRUG12
4  Lamotrigine Alprazolam    Lithium Alprazolam            <NA>   <NA>   <NA>
31 Lamotrigine   Ramipril Alprazolam   Zolpidem Trifluoperazine   <NA>   <NA>
3  Paracetamol Alprazolam Citalopram       <NA>            <NA>   <NA>   <NA>
13        <NA>       <NA>       <NA>       <NA>            <NA>   <NA>   <NA>
15        <NA>       <NA>       <NA>       <NA>            <NA>   <NA>   <NA>
5         <NA>       <NA>       <NA>       <NA>            <NA>   <NA>   <NA>
   DRUG13 DRUG14 DRUG15 DRUG16
4    <NA>   <NA>   <NA>   <NA>
31   <NA>   <NA>   <NA>   <NA>
3    <NA>   <NA>   <NA>   <NA>
13   <NA>   <NA>   <NA>   <NA>
15   <NA>   <NA>   <NA>   <NA>
5    <NA>   <NA>   <NA>   <NA>

And what I want is a vector of logicals for each drug, saying whether that patient is on it.

Thank you all for your time.

Ross Dunne MRCPsych

--
View this message in context: http://r.789695.n4.nabble.com/Just-another-pattern-matching-indexing-question-tp3276617p3276617.html
Sent from the R help mailing list archive at Nabble.com.
Denis Kazakiewicz
2011-Feb-08 22:11 UTC
[R] Just another pattern matching / indexing question
For the sake of simplicity I've made a small example of your data:

idd d1 d2 d3 d4
1   a  d  b  c
2   a  d  v  h
3   c  b  v  NA
4   q  v  NA f

# Copy the table above, then read it from the clipboard
# (clipboard support depends on the platform)
df <- read.table('clipboard', header = TRUE, na.strings = "NA")
df
# For each row (patient), test whether drug "a" appears in any of the drug columns
logVec <- apply(df[2:5], 1, function(x) "a" %in% x)
df2 <- cbind(df, logVec)
df2

Good luck in your work :)
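The same row-wise `%in%` idea can be sketched against column names like those in the original question. This is only an illustration under assumptions: the small `drugindex` below is a made-up stand-in with three DRUG columns, and `drug_cols`, `drugs`, `on_drug`, `important` and `flags` are hypothetical names that do not appear in the thread.

```r
# Toy stand-in for the poster's `drugindex` (values are illustrative only)
drugindex <- data.frame(
  book.MRN = c("0008291", "0010188", "0014439"),
  DRUG1 = c("Venlafaxine", "Flurazepam", "Zopiclone"),
  DRUG2 = c("Lamotrigine", "Valproate", NA),
  DRUG3 = c("Lithium", "Olanzapine", NA),
  stringsAsFactors = TRUE
)

# Positions of the DRUG* columns, then a character matrix of their values
drug_cols <- grep("^DRUG", names(drugindex))
drugs <- as.matrix(drugindex[drug_cols])   # factor columns become strings

# One logical per patient: TRUE if `drug` appears in any DRUG* column.
# `%in%` never returns NA, so empty (NA) slots simply count as "not on it".
on_drug <- function(drug) apply(drugs, 1, function(x) drug %in% x)

# Add a single indicator column to the data frame
drugindex$Lamotrigine <- on_drug("Lamotrigine")

# Several drugs at once: a logical matrix with one column per drug
important <- c("Lamotrigine", "Lithium", "Olanzapine")
flags <- sapply(important, on_drug)
```

The reason the grep() attempt in the question yields identifiers rather than logicals is that grep() returns the positions of matching elements; grepl() is its counterpart that returns TRUE/FALSE for every element. Exact matching with `%in%` is used here instead so that a drug name that happens to be a substring of another name is not picked up by accident.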