I'm trying to select a subset of a data frame while dropping some factor levels. While the dataset gets smaller, all factor levels remain, and I need to get rid of them. Strangely enough, I am almost certain that the same code on the same data worked OK earlier today, and it is not the first time that I'm not able to replicate earlier results with this command (I know, I might just be going crazy). What am I doing wrong?

I'm working on Windows Server 2003, R 2.1.0 (2005-04-18).

> str(spray)
`data.frame': 370 obs. of 7 variables:
 $ PD        : Factor w/ 8 levels "Botrytis","Downy",..: 2 2 2 2 4 2 2 5 5 5 ...
 $ postSpmtsQ: num 1309 1309 384 384 1044 ...
 $ ante62Q   : num 284 284 218 218 366 ...
 $ ante08Q   : num 331 331 228 228 492 ...
 $ ante29Q   : num 297 297 1067 1067 1034 ...
 $ ante16Q   : num 0 0 0.2 0.2 0 0 0 6.7 0 31.5 ...
 $ Trt       : Factor w/ 41 levels "Acrobat MZ WP",..: 27 5 27 5 36 27 5 24 24 24 ...
> sprayS <- spray[spray$PD == "Spidermites", , drop = TRUE]
> str(sprayS)
`data.frame': 111 obs. of 7 variables:
 $ PD        : Factor w/ 8 levels "Botrytis","Downy",..: 5 5 5 5 5 5 5 5 5 5 ...
 $ postSpmtsQ: num 13395 31588 84254 136 619 ...
 $ ante62Q   : num 1357 21187 21819 218 237 ...
 $ ante08Q   : num 973 21740 25112 228 134 ...
 $ ante29Q   : num 2103 106970 66676 1067 119 ...
 $ ante16Q   : num 6.7 0 31.5 0.2 0 0 0 0 14.3 0 ...
 $ Trt       : Factor w/ 41 levels "Acrobat MZ WP",..: 24 24 24 24 24 24 24 24 24 24 ...
> table(sprayS$Trt)

    Acrobat MZ WP           Agrifos      Apollo 50 SC            CALMAG
                0                 0                13                 0
            DM-31    Dynamec 1.8 EC   Equation Pro DF         Evisect S
                0                13                 0                 0
            Flint         Floramite           Impulse            Karate
                0                15                 0                 0
      Karate zeon            Melody    Meltatox 40 EC               MKP
                0                 0                 0                 0
         Molasses       Nembicidine     Nimrod 250 EC    Nissorun 10 EC
                0                 0                 0                12
           Oberon     Orthene 75 WP       Oscar 20 SC           Pegasus
               15                 0                 9                26
     Polar 50 WSG            Potfos          Proplant             Pyrus
                0                 0                 0                 0
Ridomil MZ 63 5WP   Rovral aqua flo      Score 250 EC      Secure 36 SC
                0                 0                 0                 8
      Sequestrone          Shavit f         Sporekill      Stroby 50 WG
                0                 0                 0                 0
           Switch            Tracer          Trafos K          Vandozeb
                0                 0                 0                 0
          Vitomex
                0

cheers,
Mikkel
Hi, Mikkel;

The problem here, I think, is that spray$PD is NOT treated as character data in R. According to your printout, R stores it as a factor. This happens quite a lot when you use read.table and the data file contains character columns. One remedy I often use is something like

    spray$PD <- as.character(spray$PD)

to make sure that spray$PD is character.

Tae-Hoon Chung
--------------------------------------------------
Tae-Hoon Chung
Post-Doctoral Researcher
Translational Genomics Research Institute (TGen)
445 N. 5th Street (Suite 530)
Phoenix, AZ 85004
1-602-343-8724 (Direct)
1-480-323-9820 (Mobile)
1-602-343-8840 (Fax)
--------------------------------------------------

On 6/1/05 11:08 AM, "Mikkel Grum" <mi2kelgrum at yahoo.com> wrote:
> I'm trying to select a subset of a dataframe while
> dropping some factors. While the dataset gets smaller
> all Factor levels remain and I need to get rid of
> them.
> [...]

______________________________________________
R-help at stat.math.ethz.ch mailing list
stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! R-project.org/posting-guide.html
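The as.character() remedy suggested above can be tried on a small stand-in data set. The sketch below is only an illustration: the four-row spray data frame is made up (the real data are not available, so every value in it is an assumption). It suggests that the conversion removes the levels of PD only, while other factor columns such as Trt keep all of theirs.

```r
# Hypothetical stand-in for the poster's `spray` data (values are invented)
spray <- data.frame(
  PD  = factor(c("Downy", "Spidermites", "Spidermites", "Botrytis")),
  Trt = factor(c("Flint", "Oberon", "Pegasus", "Switch"))
)

# Row subset, as in the original question
sprayS <- spray[spray$PD == "Spidermites", ]

# Convert one column to character: it no longer carries any levels
sprayS$PD <- as.character(sprayS$PD)
is.factor(sprayS$PD)   # FALSE
table(sprayS$PD)       # only "Spidermites" is tabulated

# The other factor column is untouched and still has all four levels
nlevels(sprayS$Trt)    # 4
```

So the conversion may be enough when only PD matters, but it would not by itself shorten the table(sprayS$Trt) output shown in the question.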
The argument drop = TRUE is not meant to do that (see several responses from Peter Dalgaard about this issue, e.g. finzi.psych.upenn.edu/R/Rhelp02a/archive/37333.html).

If you want to drop unused factor levels, try:

    sprayS$PD <- factor(sprayS$PD)

Cheers
Francisco

>From: Mikkel Grum <mi2kelgrum at yahoo.com>
>To: RHelp <r-help at stat.math.ethz.ch>
>Subject: [R] x[x$a=="q",,drop=TRUE]
>Date: Wed, 1 Jun 2005 11:08:46 -0700 (PDT)
>
>I'm trying to select a subset of a dataframe while
>dropping some factors. While the dataset gets smaller
>all Factor levels remain and I need to get rid of
>them.
>[...]
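The factor() call suggested above rebuilds a single column, whereas the table in the question is of sprayS$Trt, so in practice every factor column may need the same treatment. A minimal sketch, assuming a made-up stand-in for spray (its values are invented for illustration), could apply factor() to each factor column in turn. More recent versions of R also provide droplevels(), which did not exist in R 2.1.0; it is shown at the end only for comparison.

```r
# Hypothetical stand-in for the poster's `spray` data (values are invented)
spray <- data.frame(
  PD      = factor(c("Downy", "Spidermites", "Spidermites", "Botrytis")),
  Trt     = factor(c("Flint", "Oberon", "Pegasus", "Switch")),
  ante16Q = c(0, 6.7, 31.5, 0.2)
)

sprayS <- spray[spray$PD == "Spidermites", ]

# Rebuild every factor column so that only levels still in use remain;
# non-factor columns are passed through unchanged.
# Assigning into sprayS[] keeps the object a data frame.
sprayS[] <- lapply(sprayS, function(col) if (is.factor(col)) factor(col) else col)

levels(sprayS$PD)    # "Spidermites"
levels(sprayS$Trt)   # "Oberon" "Pegasus"
table(sprayS$Trt)    # no zero-count treatments left

# In newer R versions the same result is available in one call
sprayS2 <- droplevels(spray[spray$PD == "Spidermites", ])
identical(levels(sprayS2$Trt), levels(sprayS$Trt))
```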
On 6/1/05, Mikkel Grum <mi2kelgrum at yahoo.com> wrote:
> I'm trying to select a subset of a dataframe while
> dropping some factors. While the dataset gets smaller
> all Factor levels remain and I need to get rid of
> them.
> [...]
> > sprayS <- spray[spray$PD == "Spidermites", , drop = TRUE]
> [...]

Your code says to drop dimensions, whereas you want to drop unused factor levels (I think). For example, using the iris data set that ships with R:

    ii <- subset(iris, Species == "setosa")  # subset out setosa only
    ii$Species <- ii$Species[drop = TRUE]    # drop unused factor levels
    levels(ii$Species)                       # check that the unused levels are gone
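The distinction drawn above, between dropping dimensions and dropping levels, can be checked directly on iris. The following is a small sketch rather than a full account of how drop is handled: it only compares what drop = TRUE does when indexing a data frame with what it does when indexing a factor.

```r
# On a data frame, drop decides whether a one-column result
# is simplified from a data frame to a plain vector
class(iris[1:3, "Species", drop = FALSE])   # "data.frame"
class(iris[1:3, "Species", drop = TRUE])    # "factor"

# With many rows and all columns, drop = TRUE changes nothing:
# the result is still a data frame and the factor keeps every level
ii <- iris[iris$Species == "setosa", , drop = TRUE]
is.data.frame(ii)       # TRUE
nlevels(ii$Species)     # 3

# On a factor, drop = TRUE removes the levels that are no longer used
nlevels(ii$Species[drop = TRUE])   # 1
```

This matches the pattern in the question: the row subset succeeds, but the levels of PD and Trt are carried over unchanged until each factor is re-indexed or rebuilt.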