thr3ads.net - R help - [R] Subseting by more than one factor... [Jun 2003]

Fernando Henrique Ferraz Pereira da Rosa

2003-Jun-19 20:37 UTC

[R] Subseting by more than one factor...

Is it possible in R to subset a dataframe by more than one factor, all at
once?
     For instance, I have the dataframe: 
 >data 
   p1 p2 p3 p4 p5 p6 p7 p8 p9 p10      pred
  1    0  1  0  0  0  0  0  0  0   0 0.5862069
  4    0  0  0  0  0  0  0  0  0   1 0.5862069
  5    0  0  0  0  0  0  1  0  0   0 0.5862069
  6    0  0  0  0  0  0  0  1  0   0 0.5862069
  7    0  0  1  0  0  0  0  0  0   0 0.5862069
  9    0  0  0  0  1  0  0  0  0   0 0.5862069
  20   0  1  1  0  0  0  0  0  0   0 0.5862069
  22   0  1  0  0  1  0  0  0  0   0 0.5862069
  24   0  1  0  0  0  0  1  0  0   0 0.5862069
  25   0  1  0  0  0  0  0  1  0   0 0.5862069
  27   0  1  0  0  0  0  0  0  0   1 0.5862069

  If I want to subset only those points that have p4 = 1, I do:
   > subset(data,p4 == 1)
  And that's fine. Now suppose I want to subset those that not only have p4
= 1, but also p6 = 1.
   I tried subset(data,p4 == 1 && p6 == 1) or subset(data,p4==1 &
p6==1).
But it didn't work.
   Then I found a clumsy way to do it:
    subset(subset(data,p4==1),p6==1)
    Which works. But it soon gets very clumsy as the number of conditions
increases (I end up with a really large number of nested subsets). Is there a
simpler way to do that?


--

Sundar Dorai-Raj

2003-Jun-19 21:05 UTC

[R] Subseting by more than one factor...

Fernando Henrique Ferraz Pereira da Rosa wrote:
>   Is it possible in R to subset a dataframe by more than one factor, all at
> once?
>      For instance, I have the dataframe: 
>  >data 
>    p1 p2 p3 p4 p5 p6 p7 p8 p9 p10      pred
>   1    0  1  0  0  0  0  0  0  0   0 0.5862069
>   4    0  0  0  0  0  0  0  0  0   1 0.5862069
>   5    0  0  0  0  0  0  1  0  0   0 0.5862069
>   6    0  0  0  0  0  0  0  1  0   0 0.5862069
>   7    0  0  1  0  0  0  0  0  0   0 0.5862069
>   9    0  0  0  0  1  0  0  0  0   0 0.5862069
>   20   0  1  1  0  0  0  0  0  0   0 0.5862069
>   22   0  1  0  0  1  0  0  0  0   0 0.5862069
>   24   0  1  0  0  0  0  1  0  0   0 0.5862069
>   25   0  1  0  0  0  0  0  1  0   0 0.5862069
>   27   0  1  0  0  0  0  0  0  0   1 0.5862069
> 
>   If I want to subset only those points that have p4 = 1, I do:
>    > subset(data,p4 == 1)
>   And that's fine. Now suppose I want to subset those that not only have p4
> = 1, but also p6 = 1.
>    I tried subset(data,p4 == 1 && p6 == 1) or subset(data,p4==1 & p6==1).
> But it didn't work.

It didn't? It does for me:

R> subset(z, p4 == 1 & p6 == 1)
  [1] p1   p2   p3   p4   p5   p6   p7   p8   p9   p10  pred
<0 rows> (or 0-length row.names)
R> subset(z, p2 == 1 & p8 == 1)
    p1 p2 p3 p4 p5 p6 p7 p8 p9 p10      pred
10  0  1  0  0  0  0  0  1  0   0 0.5862069
R> subset(z, (p2 == 1 & p3 == 0) | p5 == 1)
    p1 p2 p3 p4 p5 p6 p7 p8 p9 p10      pred
1   0  1  0  0  0  0  0  0  0   0 0.5862069
6   0  0  0  0  1  0  0  0  0   0 0.5862069
8   0  1  0  0  1  0  0  0  0   0 0.5862069
9   0  1  0  0  0  0  1  0  0   0 0.5862069
10  0  1  0  0  0  0  0  1  0   0 0.5862069
11  0  1  0  0  0  0  0  0  0   1 0.5862069
R> version
          _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    7.0
year     2003
month    04
day      16
language R
R>


[snip]

Regards,
Sundar

Douglas Bates

2003-Jun-19 22:54 UTC

[R] Subseting by more than one factor...

Fernando Henrique Ferraz Pereira da Rosa <mentus at gmx.de> writes:
>   Is it possible in R to subset a dataframe by more than one factor, all at
> once?
>      For instance, I have the dataframe: 
>  >data 
>       p1 p2 p3 p4 p5 p6 p7 p8 p9 p10      pred
>   1    0  1  0  0  0  0  0  0  0   0 0.5862069
>   4    0  0  0  0  0  0  0  0  0   1 0.5862069
>   5    0  0  0  0  0  0  1  0  0   0 0.5862069
>   6    0  0  0  0  0  0  0  1  0   0 0.5862069
>   7    0  0  1  0  0  0  0  0  0   0 0.5862069
>   9    0  0  0  0  1  0  0  0  0   0 0.5862069
>   20   0  1  1  0  0  0  0  0  0   0 0.5862069
>   22   0  1  0  0  1  0  0  0  0   0 0.5862069
>   24   0  1  0  0  0  0  1  0  0   0 0.5862069
>   25   0  1  0  0  0  0  0  1  0   0 0.5862069
>   27   0  1  0  0  0  0  0  0  0   1 0.5862069
> 
>   If I want to subset only those points that have p4 = 1, I do:
>    > subset(data,p4 == 1)
>   And that's fine. Now suppose I want to subset those that not only have p4
> = 1, but also p6 = 1.
>    I tried subset(data,p4 == 1 && p6 == 1) or subset(data,p4==1 & p6==1).

As Sundar pointed out, it is the second form that you want.  When
intersecting conditions in subset(), use &, not &&.
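
To illustrate the difference, here is a minimal sketch using two made-up
logical vectors (a and b are hypothetical names, not from the original
post).  subset() needs one TRUE/FALSE per row, which is what & returns;
&& is intended for single values such as an if() condition.  As far as I
know, older versions of R silently used only the first element when &&
was given longer vectors, while recent versions signal an error, so the
sketch only applies && to single elements.

```r
# Two hypothetical logical vectors, one element per row of a data frame
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

# `&` works element-wise and returns one value per row,
# which is what subset() expects as its condition
elementwise <- a & b
print(elementwise)

# `&&` is meant for single values, so it is applied here
# to the first elements only
single <- a[1] && b[1]
print(single)
```

The first result has three elements (TRUE FALSE FALSE), one per row; the
second is a single TRUE.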

The way you pasted the output in your message, the column names
did not align with the columns.  I changed this in the part that I
quoted above.  This shows that you chose the wrong example, I think,
because that intersection is empty.  Try

 subset(data, p2 == 1 & p3 == 1)

instead.
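
For anyone who wants to reproduce this, the following sketch rebuilds
the posted data frame by hand (values and row names copied from the
aligned listing quoted above) and runs the suggested call.  The object
names both, none and several are only illustrative.  It also shows that
further conditions can simply be chained with & instead of nesting
subset() calls.

```r
# Rebuild the data frame from the thread; row names kept as in the post
data <- data.frame(
  p1   = rep(0, 11),
  p2   = c(1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
  p3   = c(0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0),
  p4   = rep(0, 11),
  p5   = c(0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0),
  p6   = rep(0, 11),
  p7   = c(0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0),
  p8   = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0),
  p9   = rep(0, 11),
  p10  = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1),
  pred = rep(0.5862069, 11),
  row.names = c("1", "4", "5", "6", "7", "9",
                "20", "22", "24", "25", "27")
)

# The suggested example: rows where both p2 and p3 equal 1
both <- subset(data, p2 == 1 & p3 == 1)
print(both)

# The original attempt: no row has p4 == 1 and p6 == 1,
# so the result is an empty data frame rather than an error
none <- subset(data, p4 == 1 & p6 == 1)
print(nrow(none))

# More conditions are chained with `&`; no nested subset() calls needed
several <- subset(data, p2 == 1 & p3 == 0 & p5 == 0 & p7 == 0)
print(rownames(several))
```

With this data the first call returns only row 20, the second returns
zero rows, and the third returns rows 1, 25 and 27.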
