I have a dataset. Initially, it has 25 levels for a certain factor, Description.

However, I then subset it, because I am only interested in 2 of the 25 levels. When I subset it, I get the following. The vector contains only those two levels, yet all 25 levels remain:

> Quadrats.df$Description
 [1] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75
[10] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
[19] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
[28] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
25 Levels: Black Cottonwood Black Cottonwood Enhanced Emergent Emergent 25x75 Floodplain 1 Floodplain 2 Floodplain 3 Hydroseed 25x75 ... Western Red Cedar Enhanced

This seems rather innocuous; however, when I run a by() statement, it returns a list with 25 entries, 23 of which are of course NA. Is there a way to avoid this?
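A smaller, made-up reproduction of the behaviour described above may be easier to follow. The data frame `d`, its columns `site` and `cover`, and all the values are invented for illustration and are not the Quadrats.df data; only base R functions are used.

```r
# Hypothetical data: a factor with 5 levels, two observations per level
d <- data.frame(
  site  = factor(rep(c("A", "B", "C", "D", "E"), each = 2)),
  cover = c(10, 12, 20, 22, 30, 32, 40, 42, 50, 52)
)

# Keep only the rows belonging to two of the five levels
sub <- d[d$site %in% c("B", "D"), ]

# The unused levels are still attached to the factor
levels(sub$site)
nlevels(sub$site)

# by() produces one cell per level, so the three empty levels come back as NA
res <- by(sub, sub$site, function(g) mean(g$cover))
res
```

Here `res` has five entries, three of them NA, which mirrors the 25 entries with 23 NA in the original data.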
Barry Rowlingson
2009-Sep-29 17:52 UTC
[R] vectors levels are carried through to subsets...
On Tue, Sep 29, 2009 at 6:47 PM, chipmaney <chipmaney at hotmail.com> wrote:
>
> I have a dataset. Initially, it has 25 levels for a certain factor,
> Description.
>
> However, I then subset it, because I am only interested in 2 of the 25
> levels. When I subset it, I get the following. The vector contains only
> those two levels, yet all 25 levels remain:
>
> > Quadrats.df$Description
>  [1] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75
> [10] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
> [19] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
> [28] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
> 25 Levels: Black Cottonwood Black Cottonwood Enhanced Emergent Emergent 25x75 Floodplain 1 Floodplain 2 Floodplain 3 Hydroseed 25x75 ... Western Red Cedar Enhanced
>
> This seems rather innocuous; however, when I run a by() statement, it
> returns a list with 25 entries, 23 of which are of course NA. Is there a
> way to avoid this?
>

Just re-factor() it when you select a subset. Also, it's nice if you give us a simple example - all your Emergent this and Hydroseed that doesn't look very clear! Like this:

# make a factor:
> x=factor(sample(letters,10))
> x
 [1] z x f i n b y e p c
Levels: b c e f i n p x y z

# a subset:
> x[1:3]
[1] z x f
Levels: b c e f i n p x y z

# - still has all the levels. So re-"factor()":
> factor(x[1:3])
[1] z x f
Levels: f x z

et voila

Barry
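The same re-factor() idea can be applied to a factor column of a data frame before calling by(). This is only a sketch: the data frame `d` and its columns `site` and `cover` are invented for illustration. The last step uses droplevels(), which as far as I know was added to base R in version 2.12.0, so it is an assumption that your R is at least that recent.

```r
# Hypothetical data: a factor with 5 levels, two observations per level
d <- data.frame(
  site  = factor(rep(c("A", "B", "C", "D", "E"), each = 2)),
  cover = c(10, 12, 20, 22, 30, 32, 40, 42, 50, 52)
)
sub <- d[d$site %in% c("B", "D"), ]

# Re-factor the column so that only the levels actually present remain
sub$site <- factor(sub$site)
levels(sub$site)

# by() now returns one entry per remaining level, with no NA cells
res <- by(sub, sub$site, function(g) mean(g$cover))
res

# Alternative (assumes R >= 2.12.0): droplevels() drops the unused levels
# of every factor column in a data frame in one call
sub2 <- droplevels(d[d$site %in% c("B", "D"), ])
levels(sub2$site)
```

Re-factoring one column at a time is explicit about which factor changes; droplevels() is more convenient when a subset leaves several factor columns with stale levels.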