thr3ads.net - R help - [R] subsetting matrix according to columns with character index [Aug 2008]

Ralph S.

2008-Aug-13 18:00 UTC

[R] subsetting matrix according to columns with character index

Hi,

I have a long matrix of the following form which I would like to subset
according to the third column:

[x y z]:

a1 c1 1
a1 c1 2
a2 c1 1
a1 c2 1
a1 c2 2
. . .


The first two columns are characters ai and cj.

I would like to keep all the rows where there are two entries for z, 1 and 2.

That is, I want:
a1 c1 1
a1 c1 2
a1 c2 1
a1 c2 2
. . .

I tried to use something like df[by(df,c(df$x,df$y),sum(z)==3),] but that only
gives me one line of data per x y combination.

Is there an easy way of coding to keep all rows for a and c combinations where z
has entries both 1 and 2?

Many thanks,

Ralph


Henrique Dallazuanna

2008-Aug-13 18:11 UTC

[R] subsetting matrix according to columns with character index

Try this:

x
  V1 V2 V3
1 a1 c1  1
2 a1 c1  2
3 a2 c1  1
4 a1 c2  1
5 a1 c2  2


lis <- split(x, list(x$V1, x$V2), drop = TRUE)
do.call(rbind, unname(lis[sapply(lis, function(x)all(1:2 %in% x[,3]))]))
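For anyone who wants to run the two lines above, here is a self-contained sketch. The data frame is rebuilt from the five example rows in the question, with the column names V1, V2, V3 taken from the printout; the names keep and res are only illustrative.

```r
# Rebuild the five example rows (column names as in the printout above).
x <- data.frame(
  V1 = c("a1", "a1", "a2", "a1", "a1"),
  V2 = c("c1", "c1", "c1", "c2", "c2"),
  V3 = c(1, 2, 1, 1, 2),
  stringsAsFactors = FALSE
)

# One data frame per (V1, V2) combination; drop = TRUE discards
# combinations that never occur in the data.
lis <- split(x, list(x$V1, x$V2), drop = TRUE)

# TRUE for groups whose third column contains both 1 and 2.
keep <- sapply(lis, function(g) all(1:2 %in% g[, 3]))

# Stack the surviving groups back into a single data frame.
res <- do.call(rbind, unname(lis[keep]))
print(res)
```

The lone a2/c1 row is dropped because its group contains a 1 but no 2; the a1/c1 and a1/c2 groups are kept whole.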


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

Ralph S.

2008-Aug-13 18:45 UTC

[R] subsetting matrix according to columns with character index

I tried this - I get an empty set:

<0 rows> (or 0-length row.names)

I guess this happens because the z variable takes only one value per row??

What works is:
DFsub<-DF[DF$z == 1 | DF$z == 2,]

but then, I do not eliminate the entries where there is only one entry for z
given an a and c combination.

Any idea what to do?

-Ralph
> Date: Wed, 13 Aug 2008 13:05:25 -0500
> From: markleeds@verizon.net
> Subject: RE: [R] subsetting matrix according to columns with character index
> To: ruffel1@hotmail.com
> 
>   it must be a dataframe so, if it was DF, then, assuming i understand 
> what you want then either of the following should work:
> 
> DFsub<-DF[DF$z == 1 & DF$z == 2,]
> 
> or
> 
> DFsub<-subset(DF, z == 1 & z == 2 )
> 
> 

markleeds at verizon.net

2008-Aug-13 19:06 UTC

[R] subsetting matrix according to columns with character index

I don't think I understood what you were trying to do, at least judging by
Henrique's solution, which I haven't yet cut and pasted in order to
understand it. Did Henrique's solution do what you wanted?


markleeds at verizon.net

2008-Aug-13 19:23 UTC

[R] subsetting matrix according to columns with character index

Sorry Ralph, I meant the OR instead of the AND, so that was my mistake.
The subset function should also work with the OR.
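
To see why the AND version returned zero rows: the condition is evaluated one row at a time, and a single z value cannot equal 1 and 2 at once. A small sketch, assuming a data frame DF with columns x, y, z built from the example rows in the question (and_sub and or_sub are illustrative names):

```r
# Example rows from the question; the column names x, y, z are assumed.
DF <- data.frame(
  x = c("a1", "a1", "a2", "a1", "a1"),
  y = c("c1", "c1", "c1", "c2", "c2"),
  z = c(1, 2, 1, 1, 2)
)

# Row-wise AND: no single value of z is both 1 and 2, so nothing is selected.
and_sub <- DF[DF$z == 1 & DF$z == 2, ]

# Row-wise OR: every row with z equal to 1 or 2 is selected,
# including the lone a2/c1 row.
or_sub <- subset(DF, z == 1 | z == 2)

print(nrow(and_sub))
print(nrow(or_sub))
```

Neither form looks across rows, which is why a grouping step is needed to drop the x/y combinations that have only one of the two z values.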

I think I understand better what you want now. The approach below assumes
that, if there are 2 rows associated with the values in the first 2
columns, then their z values will be 1 and 2. If they are 1,1 or 2,2, then
it won't work. So Henrique's solution could be better and more general.

Assume your dataframe is called DF.

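# Split the whole data frame (not just DF$x) so each piece has rows to count.
# Note that this groups on the y column only.
tempres <- split(DF, DF$y)

# Keep a group only if it has exactly two rows.
onlytwo <- lapply(tempres, function(.df)
    if (nrow(.df) == 2) {
       return(.df) } else {
       return(NULL) }
)

# Drop the groups that were rejected above.
onlytwo <- onlytwo[!sapply(onlytwo, is.null)]

result <- do.call(rbind, onlytwo)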


markleeds at verizon.net

2008-Aug-13 20:14 UTC

[R] subsetting matrix according to columns with character index

Ralph: I looked at Henrique's solution, and he does two things that make
it better than mine.

1) He splits on the first two columns, whereas I split only on the
second. So my split assumes that the "same rows" are next to each other,
which is an unnecessary assumption.

2) He actually checks that 1 and 2 are both present in the third column
of each data frame that split returns. I assumed that, if a data frame
had 2 rows, this would be true automatically.

So, even though mine worked for what you needed, in the spirit of
generality and minimal assumptions, it is better to use Henrique's
solution. Also, make sure you understand it, because you can learn a lot
from it (this is also true of his solutions in general).
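
The two points above (group on both key columns, then test the values within each group) can also be written without split and rbind. The following is a sketch of an alternative using base R's ave(), not the approach posted in this thread. DF and the column names x, y, z are assumed from the question, the a3 rows are made up to show the 1,1 case, and has_both is an illustrative name.

```r
# Example rows plus a made-up a3/c1 group that has two rows, both with z = 1.
DF <- data.frame(
  x = c("a1", "a1", "a2", "a1", "a1", "a3", "a3"),
  y = c("c1", "c1", "c1", "c2", "c2", "c1", "c1"),
  z = c(1, 2, 1, 1, 2, 1, 1)
)

# For every row, flag whether its (x, y) group contains both 1 and 2 in z.
# ave() writes the logical result back into a numeric vector (0 or 1),
# hence the comparison with 1.
has_both <- ave(DF$z, DF$x, DF$y,
                FUN = function(v) all(1:2 %in% v)) == 1

result <- DF[has_both, ]
print(result)
```

Because the test is on the values rather than on the row count, the a3/c1 group is rejected even though it has exactly two rows, and the original row order is preserved.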

On Wed, Aug 13, 2008 at  3:37 PM, Ralph S. wrote:

Yes, this works, very elegant, thank you. I didn't get Henrique's message
in my mailbox immediately for some reason.

-Ralph

