thr3ads.net - R help - [R] extract rows in dataframe with duplicated column values [Mar 2005]


Tiago R Magalhaes

2005-Mar-18 02:11 UTC

[R] extract rows in dataframe with duplicated column values

Hi

I want to extract all the rows in a data frame that have duplicates 
for a given column.
I would expect this question to come up pretty often but I have 
researched the archives and surprisingly couldn't find anything.
The best I can come up with is:

x <- data.frame(a=c(1,2,2,3,3,3), b=10)
xdup1 <- duplicated(x[,1])
xdup2 <- duplicated(x[,1][nrow(x):1])[nrow(x):1]
xAllDups <- x[(xdup1+xdup2)!=0,]

This seems to work, but it's so convoluted that I'm sure there's a 
better method.
Thanks for any help and enlightenment
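
For readers following along, here is a minimal sketch of what the two duplicated() passes above compute on the example data. The intermediate names fwd and bwd are illustrative and not from the original post, and rev() stands in for the nrow(x):1 indexing.

```r
x <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = 10)

# Forward pass: flags every repeat after the first occurrence of a value.
fwd <- duplicated(x[, 1])
# Backward pass: flags every repeat before the last occurrence of a value.
bwd <- rev(duplicated(rev(x[, 1])))

# A row belongs to a duplicated group if either pass flags it.
xAllDups <- x[fwd | bwd, ]
print(xAllDups)
```

Neither pass alone is enough: the forward pass misses the first member of each group and the backward pass misses the last.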

Liaw, Andy

2005-Mar-18 03:14 UTC


Does this work for you?
> x[table(x[,1]) > 1,]
  a  b
2 2 10
3 2 10
5 3 10
6 3 10

Andy
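
A note on why row 4 is missing from that output: the comparison on table() yields one logical value per distinct value of the column, not one per row, and R recycles that short vector when it is used as a row index. The sketch below only illustrates the recycling; the names keep and recycled are made up for this example.

```r
x <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = 10)

# One logical value per distinct value of a (1, 2, 3), not one per row.
keep <- as.vector(table(x[, 1]) > 1)

# Used as a row index, the short vector is recycled to nrow(x).
recycled <- rep(keep, length.out = nrow(x))
print(which(recycled))
```

The selected positions follow the repeating pattern of keep rather than the values in the column, so the result only looks plausible by accident.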

Liaw, Andy

2005-Mar-18 03:25 UTC


OK, strike one...

Here's my second try:
> cnt <- table(x[,1])
> v <- as.numeric(names(cnt[cnt > 1]))
> v
[1] 2 3
> x[x[,1] %in% v, ]
  a  b
2 2 10
3 2 10
4 3 10
5 3 10
6 3 10

Andy
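
One caveat with the second try: as.numeric(names(cnt[cnt > 1])) assumes the column is numeric, since names() always returns character strings and non-numeric keys would convert to NA. A possible variant, sketched here on a hypothetical character column, takes the repeated values straight from the column so they keep their type.

```r
# Hypothetical data: same shape as the thread's example, character keys.
x <- data.frame(a = c("u", "v", "v", "w", "w", "w"), b = 10,
                stringsAsFactors = FALSE)

# Values that occur more than once, taken directly from the column so
# they keep their original type (no round trip through names()).
v <- unique(x[duplicated(x[, 1]), 1])

# Keep every row whose key is one of the repeated values.
xAllDups <- x[x[, 1] %in% v, ]
print(xAllDups)
```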

Rob J Goedman

2005-Mar-18 03:35 UTC


Tiago,

Assuming the column in x is sorted:

t = which(duplicated(x[, 1]))
x[sort(union(t-1, t)),]

or, if not sorted, order the rows by that column first (the result comes back in sorted order):

xs = x[order(x[, 1]), ]
t = which(duplicated(xs[, 1]))
xs[sort(union(t-1, t)),]

Rob
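
An order-independent alternative to the union(t-1, t) approach, sketched under the assumption that your R version's duplicated() accepts the fromLast argument (it postdates this 2005 thread). The unsorted data below is made up for illustration.

```r
# Hypothetical unsorted data: the value 1 occurs once, 2 twice, 3 three times.
x <- data.frame(a = c(3, 1, 2, 3, 2, 3), b = 10)

# Flag repeats scanning forward, then scanning backward; a row is part of
# a duplicated group if either scan flags it. Row order is preserved.
dup <- duplicated(x[, 1]) | duplicated(x[, 1], fromLast = TRUE)
xAllDups <- x[dup, ]
print(xAllDups)
```

This needs no sorting and leaves the rows in their original positions.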

