thr3ads.net - R help - [R] How to remove some rows from a data.frame [Dec 2007]


affy snp

2007-Dec-23 21:28 UTC

[R] How to remove some rows from a data.frame

Hello list,

I have a data frame M like:

BAC          chr        pos  s1  s2
RP11-80G24     1   77465510  -1   0
RP11-198H14    1   78696291  -1   0
RP11-267M21    1   79681704  -1   0
RP11-89A19     1   80950808  -1   0
RP11-6B16      1   82255496  -1   0
RP11-210E16    1  228801510   0  -1
RP11-155C15    1  230957584   0  -1
RP11-210F8     1  237932418   0  -1
RP11-263L17    2   65724492   0   1
RP11-340F16    2   65879898   0   1
RP11-68A1      2   67718674   0   0
RP11-474G23    2   68318411   0   0
RP11-218N6     2   68454651   0   0
CTD-2003M22    2   68567494   0   0
.....

How can I remove the rows that have 0 in both columns s1 and s2?
Something like M[!M$21=0&!M$s2=0]?

Moreover, I want to get a list which could find a subset of rows which have
the same pattern of data. For example, the first 8 rows in M can be clustered
into 2 groups (represented below in 2 rows) and shown as:

chr      Start        End  # of rows  Pattern
  1   77465510   82255496          5  (-1 0)
  1  228801510  237932418          3  (0 -1)

Can anybody help me out with this? Thank you very much and happy holidays!

Best,
    Allen


Gabor Grothendieck

2007-Dec-23 22:01 UTC


Using:

M <- structure(list(
  BAC = structure(c(13L, 3L, 8L, 14L, 12L, 4L, 2L, 5L, 7L, 9L, 11L, 10L, 6L, 1L),
    .Label = c("CTD-2003M22", "RP11-155C15", "RP11-198H14", "RP11-210E16",
               "RP11-210F8", "RP11-218N6", "RP11-263L17", "RP11-267M21",
               "RP11-340F16", "RP11-474G23", "RP11-68A1", "RP11-6B16",
               "RP11-80G24", "RP11-89A19"),
    class = "factor"),
  chr = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
  pos = c(77465510L, 78696291L, 79681704L, 80950808L, 82255496L,
          228801510L, 230957584L, 237932418L, 65724492L, 65879898L,
          67718674L, 68318411L, 68454651L, 68567494L),
  s1 = c(-1L, -1L, -1L, -1L, -1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  s2 = c(0L, 0L, 0L, 0L, 0L, -1L, -1L, -1L, 1L, 1L, 0L, 0L, 0L, 0L)),
  .Names = c("BAC", "chr", "pos", "s1", "s2"),
  class = "data.frame", row.names = c(NA, -14L))

# try this

subset(M, s1 | s2)  # 0 is treated as FALSE and any other number as TRUE, so rows where both are 0 are dropped

# and for second question:

f <- function(x) with(x,
  c(start = pos[1], end = tail(pos, 1),
     chr = chr[1], nrow = NROW(x), s1 = s1[1], s2 = s2[1])
)
do.call(rbind, by(M, M[4:5], f))
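
The call above groups by the (s1, s2) pattern alone, so rows that share a pattern but sit on different chromosomes, or in separate stretches, would end up in one group. If what is wanted is one summary row per contiguous run, as in the table in the question, a minimal sketch with rle() might look like the following. It assumes M is already sorted by chr and then pos. The helper name summarise_runs and the shortened copy of the data (without the BAC column) are illustrative only and not from the thread.

```r
# Shortened copy of the example data (BAC column omitted for brevity).
M <- data.frame(
  chr = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  pos = c(77465510, 78696291, 79681704, 80950808, 82255496,
          228801510, 230957584, 237932418,
          65724492, 65879898, 67718674, 68318411, 68454651, 68567494),
  s1  = c(-1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  s2  = c(0, 0, 0, 0, 0, -1, -1, -1, 1, 1, 0, 0, 0, 0)
)

# Hypothetical helper: one output row per run of consecutive rows that
# share the same chr, s1 and s2. Assumes M is sorted by chr, then pos.
summarise_runs <- function(M) {
  key   <- paste(M$chr, M$s1, M$s2)   # one label per row
  r     <- rle(key)                   # lengths of consecutive equal labels
  last  <- cumsum(r$lengths)          # index of the last row of each run
  first <- last - r$lengths + 1       # index of the first row of each run
  data.frame(
    chr     = M$chr[first],
    start   = M$pos[first],
    end     = M$pos[last],
    nrow    = r$lengths,
    pattern = paste0("(", M$s1[first], " ", M$s2[first], ")")
  )
}

runs <- summarise_runs(M)

# Drop the all-zero runs, in the spirit of the first question.
runs[runs$pattern != "(0 0)", ]
```

Because the runs are found with rle(), two separate stretches with the same pattern on one chromosome are reported as two rows rather than merged.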

Don MacQueen

2007-Dec-23 22:14 UTC


At 4:28 PM -0500 12/23/07, affy snp wrote:
>Hello list,
>
>how to remove those rows which have 0 for both of columns s1,s2?
>sth like M[!M$21=0&!M$s2=0]?
M[ !(M$s1==0 & M$s2==0) , ]
>
>Moreover, I want to get a list which could find a subset of rows which have
>the same pattern of data. For example, the first 8 rows in M can be
>clustered
>into 2 groups (represented below in 2 rows) and shown as:
>
>chr             Start       End             # of rows     Pattern
>1             77465510   82255496       5              (-1 0)
>1            228801510  237932418     3              (0 -1)
>
>Can anybody help me out of this? Thank you very much and happy holiday!
pat <- paste(M$s1,M$s2)

## to find the first subset:
M[ pat == pat[1] ,]

## to find the second subset:
M[ pat == pat[2], ]

## and so on, for however many unique patterns there are.

## also try
table(pat)

Of course, your example does more than just "find" the subsets. It 
also does some summarizing of them. That's a little more complicated. 
I might start with the summarize() function in the Hmisc package, but 
there are potentially many ways to also do the summarizing.
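
As one base R alternative to Hmisc, the summarizing step could be sketched with aggregate(). This is only an illustration under some assumptions: it groups by chr and the pasted pattern (not by contiguous runs), it uses a shortened copy of the data without the BAC column, and the statistic names start, end and n are made up here.

```r
# Shortened copy of the example data (BAC column omitted for brevity).
M <- data.frame(
  chr = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  pos = c(77465510, 78696291, 79681704, 80950808, 82255496,
          228801510, 230957584, 237932418,
          65724492, 65879898, 67718674, 68318411, 68454651, 68567494),
  s1  = c(-1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  s2  = c(0, 0, 0, 0, 0, -1, -1, -1, 1, 1, 0, 0, 0, 0)
)
M$pat <- paste(M$s1, M$s2)

# One row per (chr, pattern) combination. FUN returns three values, so
# aggregate() stores them as a matrix column named pos.
res <- aggregate(pos ~ chr + pat, data = M,
                 FUN = function(p) c(start = min(p), end = max(p), n = length(p)))

# Flatten the matrix column into ordinary columns pos.start, pos.end, pos.n.
res <- do.call(data.frame, res)
res
```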

-Don

-- 
---------------------------------
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
macq at llnl.gov

Moshe Olshansky

2007-Dec-24 08:54 UTC


To answer your first question, try

M[-which( M$s1 == 0 & M$s2 == 0),]
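
One caveat with the -which() form: if no row matches the condition, which() returns integer(0), and indexing with -integer(0) selects zero rows instead of all of them. A small sketch of the difference, using a made-up data frame D (not from the thread) that contains no all-zero rows:

```r
# Made-up data with no row where both s1 and s2 are 0.
D <- data.frame(s1 = c(-1, 0, 1), s2 = c(0, 1, 1))

zero_both <- D$s1 == 0 & D$s2 == 0     # all FALSE for this data

via_which   <- D[-which(zero_both), ]  # -integer(0) selects no rows at all
via_logical <- D[!zero_both, ]         # logical negation keeps every row

nrow(via_which)
nrow(via_logical)
```

When the data are known to contain at least one matching row, both forms give the same result. The logical form is the safer default when that is not guaranteed.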

For the second question, you must start with a more
precise definition of the grouping criterion.

