thr3ads.net - R help - [R] subseting a data frame [Mar 2012]


nathalie

2012-Mar-02 15:19 UTC

[R] subseting a data frame

Hi,
My problem is this: I want to subset the data frame df so that each distinct
df$exon value is printed only once, even if it appears several times.

unique(df$exon) shows me the unique exons.
If I try to print only the unique exon lines
with df[unique(df$exon),], this doesn't print only the unique ones :(

Could you help?
Thanks,
Nat




                         exon size  chr     start       end
413077 ChrX_133594175_133594368_HPRT1  193 ChrX 133594175 133594368
413270 ChrX_133594183_133594368_HPRT1  185 ChrX 133594183 133594368
413455 ChrX_133594381_133594565_HPRT1  184 ChrX 133594381 133594565
413639 ChrX_133607389_133607495_HPRT1  106 ChrX 133607389 133607495
413745 ChrX_133607389_133607495_HPRT1  106 ChrX 133607389 133607495
413851 ChrX_133607404_133607495_HPRT1   91 ChrX 133607404 133607495
413942 ChrX_133609211_133609394_HPRT1  183 ChrX 133609211 133609394
414125 ChrX_133609211_133609394_HPRT1  183 ChrX 133609211 133609394
414308 ChrX_133620495_133620560_HPRT1   65 ChrX 133620495 133620560
414373 ChrX_133620495_133620560_HPRT1   65 ChrX 133620495 133620560
414438 ChrX_133620692_133620696_HPRT1    4 ChrX 133620692 133620696
414442 ChrX_133624218_133624235_HPRT1   17 ChrX 133624218 133624235



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE.

Rui Barradas

2012-Mar-02 16:22 UTC

Hello,
> HI,
> this is my problem I want to subset this file df, using only  unique
> df$exon printing the line once even if  df$exon appear several times:
> 
> unique(df$exon) will show me the unique exons
> If I try to print only the unique exon lines
> with df[unique(df$exon),] -this doesn't print only the unique ones :(
> 
Try

inx <- match(unique(df$exon), df$exon)
df[inx, ]
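
To illustrate why this works, here is a minimal sketch on a made-up miniature of the posted data (the short exon names and the six-row data frame are assumptions for illustration only, not the real file). match() gives the position of the first occurrence of each unique value, and those positions are valid row indices.

```r
# Hypothetical miniature of the posted data; exon names are shortened stand-ins.
df <- data.frame(
  exon = c("e1", "e2", "e2", "e3", "e3", "e4"),
  size = c(193, 106, 106, 183, 183, 4),
  stringsAsFactors = FALSE
)

# For each unique exon, match() returns the position of its first occurrence.
inx <- match(unique(df$exon), df$exon)
inx  # 1 2 4 6

# Indexing by those positions keeps exactly one row per exon.
first_rows <- df[inx, ]
first_rows
```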


Hope this helps,

Rui Barradas



R. Michael Weylandt <michael.weylandt@gmail.com>

2012-Mar-02 17:02 UTC

I believe you want the duplicated() function.
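
A minimal sketch of that approach, on a made-up six-row data frame (the exon names and sizes are assumptions for illustration): duplicated() returns a logical vector that is TRUE for every repeat of an earlier value, so negating it selects the first row for each exon.

```r
# Hypothetical miniature of the posted data; exon names are shortened stand-ins.
df <- data.frame(
  exon = c("e1", "e2", "e2", "e3", "e3", "e4"),
  size = c(193, 106, 106, 183, 183, 4),
  stringsAsFactors = FALSE
)

# TRUE marks a value that has already appeared earlier in the vector.
dup <- duplicated(df$exon)
dup  # FALSE FALSE TRUE FALSE TRUE FALSE

# Keep only the rows that are not repeats.
dedup <- df[!dup, ]
dedup
```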

Michael


R. Michael Weylandt <michael.weylandt@gmail.com>

2012-Mar-02 17:56 UTC

Please always cc the list for archival/threading reasons. 

The short answer is that unique() returns the unique elements themselves rather than something you
should subset by, like a set of logical indices or row numbers.

Note that in general unique(x) == x[!duplicated(x)]. I'd imagine there are
cases where this breaks down, but I can't assemble one off the top of my
head.
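
A small sketch of both points, using a made-up four-row data frame (the exon names are assumptions; the row names are borrowed from the posted output). It assumes exon is a character column, in which case the values returned by unique() are looked up among the row names and nothing matches. If exon were a factor, the index would, as far as I know, be taken from its integer codes instead, which is no more useful.

```r
# Hypothetical miniature of the posted data, with character exon values.
x <- c("e1", "e2", "e2", "e3")
df <- data.frame(
  exon = x,
  size = c(193, 106, 106, 183),
  row.names = c("413077", "413639", "413745", "413942"),
  stringsAsFactors = FALSE
)

# unique() returns values, not positions. A character row index is matched
# against the row names, and no row is named "e1", "e2" or "e3".
bad <- df[unique(df$exon), ]
bad  # three rows filled with NA

# A logical index built with duplicated() selects real rows.
good <- df[!duplicated(df$exon), ]
good

# The identity mentioned above holds for this vector.
identical(unique(x), x[!duplicated(x)])  # TRUE
```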

Michael

On Mar 2, 2012, at 12:13 PM, nathalie <nac at sanger.ac.uk> wrote:
> thanks
> why unique doesn't work here??
