thr3ads.net - R help - [R] remove duplicated row according to NA condition [May 2014]

jeff6868

2014-May-28 12:35 UTC

[R] remove duplicated row according to NA condition

Hi everybody,

I have a little problem in my R code which seems to be easy to solve, but I
haven't been able to find the solution by myself so far.

Here's an example of the form of my data:

data <-
data.frame(col1=c("a","a","b","b"),col2=c(1,1,2,2),col3=c(NA,"ST001","ST002",NA))

I would like to remove duplicated rows based on the first two columns
(col1, col2), but in both cases here, the row I would like to remove is the
duplicate that has NA in col3.

Here's the data.frame I would like to obtain:

data2 <-
data.frame(col1=c("a","b"),col2=c(1,2),col3=c("ST001","ST002"))

I've been trying to mix duplicated() with is.na() but it doesn't work
yet.

Can someone tell me the best and easiest way to do this?

Thanks a lot!

--
View this message in context:
http://r.789695.n4.nabble.com/remove-duplicated-row-according-to-NA-condition-tp4691362.html
Sent from the R help mailing list archive at Nabble.com.

K. Elo

2014-May-28 15:43 UTC

Hi!

How about trying this:

data[ data$col1!=data$col2 & !is.na(data$col3), ]

  col1 col2  col3
2    a    1 ST001
3    b    2 ST002

HTH, Kimmo
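
A quick check of the filter above, offered as a sketch rather than a correction. Since col1 holds letters and col2 holds numbers, the comparison data$col1 != data$col2 is TRUE on every row of the example, so the filter appears to reduce to dropping every row whose col3 is NA. That also removes a (col1, col2) pair whose only row has NA in col3. The extra row ("c", 3, NA) and the names kept, same, data_ext and kept_ext are assumptions added for illustration only.

```r
# Example data from the original question (strings kept as character)
data <- data.frame(col1 = c("a", "a", "b", "b"),
                   col2 = c(1, 1, 2, 2),
                   col3 = c(NA, "ST001", "ST002", NA),
                   stringsAsFactors = FALSE)

# Comparing a letter with a number is TRUE on every row of this example
always_true <- all(data$col1 != data$col2)

# So the suggested filter selects the same rows as a plain NA filter
kept <- data[data$col1 != data$col2 & !is.na(data$col3), ]
same <- data[!is.na(data$col3), ]
print(identical(kept, same))

# Hypothetical extra row: a pair whose only record has NA in col3
data_ext <- rbind(data,
                  data.frame(col1 = "c", col2 = 3, col3 = NA_character_,
                             stringsAsFactors = FALSE))
kept_ext <- data_ext[data_ext$col1 != data_ext$col2 & !is.na(data_ext$col3), ]

# Check whether the ("c", 3) pair survived the filter
print("c" %in% kept_ext$col1)
```

Whether losing such a pair is acceptable depends on what the data are supposed to look like, which the original question does not say.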



William Dunlap

2014-May-28 15:43 UTC

It would help if you said what you want done when none, all, or some
of the col1-col2 duplicates have NAs in col3.  E.g., what do you
want the function to do for the following input?
> data2 <- data.frame(col1=c("a","a","a","b","b","c","c","d","d","e"),
+                     col2=c(1,1,1,2,2,3,3,4,4,5),
+                     col3=c("A1",NA,"A3",NA,"B2","C1","C2",NA,NA,NA))
> data2
   col1 col2 col3
1     a    1   A1
2     a    1 <NA>
3     a    1   A3
4     b    2 <NA>
5     b    2   B2
6     c    3   C1
7     c    3   C2
8     d    4 <NA>
9     d    4 <NA>
10    e    5 <NA>

(You may want it to return a data.frame or you may want the function
to stop because the data is not considered legal, but you should
decide what it should do.)

Bill Dunlap
TIBCO Software
wdunlap tibco.com
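
To make the "none, all, or some" cases concrete, here is a minimal sketch that tabulates, for each (col1, col2) pair in the input above, how many rows it has and how many of them have NA in col3. It only classifies the groups and does not decide what to do with them. The names key, n_rows, n_na and summary_tbl are assumptions introduced for illustration.

```r
data2 <- data.frame(col1 = c("a","a","a","b","b","c","c","d","d","e"),
                    col2 = c(1,1,1,2,2,3,3,4,4,5),
                    col3 = c("A1",NA,"A3",NA,"B2","C1","C2",NA,NA,NA),
                    stringsAsFactors = FALSE)

# One label per (col1, col2) pair, e.g. "a 1"
key <- paste(data2$col1, data2$col2)

# Rows per pair, and rows with NA in col3 per pair
n_rows <- tapply(data2$col3, key, length)
n_na   <- tapply(is.na(data2$col3), key, sum)

summary_tbl <- data.frame(group  = names(n_rows),
                          n_rows = as.vector(n_rows),
                          n_na   = as.vector(n_na),
                          stringsAsFactors = FALSE)

# Classify each pair by how many of its col3 values are NA
summary_tbl$case <- ifelse(summary_tbl$n_na == 0, "none",
                    ifelse(summary_tbl$n_na == summary_tbl$n_rows,
                           "all", "some"))
print(summary_tbl)
```

Pairs classified as "all" (here d and e) and pairs with several non-NA rows (here a and c) are the ones the original question leaves undefined.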



arun

2014-May-28 15:51 UTC

Hi,
Maybe this helps:
data1 <- data[with(data, order(col1, col2, 1*is.na(col3))),]
data1[!duplicated(data1[,1:2]),]
A.K.
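
The two lines above work because order() places the non-NA rows first within each (col1, col2) pair, and duplicated() then flags every row after the first one in each pair. Below is a minimal sketch that wraps the same idea in a function and runs it on both the original example and the larger input posted earlier in the thread. The function name drop_na_duplicates and the variable names are hypothetical, and the behaviour shown for the edge cases is simply what this approach happens to do, not necessarily what the original poster wants.

```r
# Hypothetical wrapper around the order() + duplicated() approach
drop_na_duplicates <- function(df) {
  # Sort by the key columns, with non-NA col3 rows first within each pair
  ordered <- df[order(df$col1, df$col2, is.na(df$col3)), ]
  # Keep only the first row of each (col1, col2) pair
  ordered[!duplicated(ordered[, c("col1", "col2")]), ]
}

data <- data.frame(col1 = c("a", "a", "b", "b"),
                   col2 = c(1, 1, 2, 2),
                   col3 = c(NA, "ST001", "ST002", NA),
                   stringsAsFactors = FALSE)
res <- drop_na_duplicates(data)
print(res)

# Larger input with the "none", "all" and "some" cases
data2 <- data.frame(col1 = c("a","a","a","b","b","c","c","d","d","e"),
                    col2 = c(1,1,1,2,2,3,3,4,4,5),
                    col3 = c("A1",NA,"A3",NA,"B2","C1","C2",NA,NA,NA),
                    stringsAsFactors = FALSE)
res2 <- drop_na_duplicates(data2)
print(res2)
```

On the larger input this keeps one row per pair: the first non-NA row when there is one (so A3 and C2 are dropped along with the NA rows), and a single NA row for pairs that have nothing else.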


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
