Dear R family,

I have a question about how to detect duplicate numeric observations.
Suppose that I have a dataset with two variables:

order value
    1  0.52
    2  0.23
    3  0.43
    4  0.21
    5  0.32
    6  0.32
    7  0.32
    8  0.32
    9  0.32
   10  0.12
   11  0.46
   12  0.09
   13  0.32
   14  0.25

Could you help me indicate where the duplicate observations in a row
(e.g., 0.32) are?

Best,
Moohwan
Try this:

    DF[duplicated(DF$value), ]

On Mon, Jul 5, 2010 at 1:31 PM, Moohwan Kim wrote:
> Could you help me indicate where the duplicate observations in a row
> (e.g., 0.32) are?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
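To see what the one-liner above returns on the posted data, here is a minimal sketch. The data frame is a reconstruction from the original question, and the name DF simply follows the reply; the name dups is introduced only for illustration.

```r
# Assumed reconstruction of the poster's data; DF is the name used in the reply.
DF <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
            0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
)

# duplicated() flags a value only from its second occurrence onward,
# so the first 0.32 (order 5) is not part of the result.
dups <- DF[duplicated(DF$value), ]
print(dups)
```

Note that this keeps rows 6, 7, 8, 9 and 13 but not row 5, which may or may not be what is wanted.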
Hello Moohwan,

Look at ?duplicated, for example:

> x
[1] 1 1 2 2 3 3
> duplicated(x)
[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE

If your end goal is to get rid of the duplicates, take a look at ?unique:

> unique(x)
[1] 1 2 3

Best regards,
Josh

On Mon, Jul 5, 2010 at 9:31 AM, Moohwan Kim wrote:
> Could you help me indicate where the duplicate observations in a row
> (e.g., 0.32) are?

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
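Applied to the two-column data from the question, the same two functions might be used as in the sketch below. The data frame DF and the result names first_only and distinct_values are assumptions made for illustration, not something taken from the thread.

```r
# Assumed reconstruction of the poster's data.
DF <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
            0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
)

# unique() on the column returns each distinct value once.
distinct_values <- unique(DF$value)

# To drop repeated rows while keeping the 'order' column,
# negate duplicated(): the first occurrence of each value is kept.
first_only <- DF[!duplicated(DF$value), ]
print(first_only)
```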
Try this:

> x
   order value
1      1  0.52
2      2  0.23
3      3  0.43
4      4  0.21
5      5  0.32
6      6  0.32
7      7  0.32
8      8  0.32
9      9  0.32
10    10  0.12
11    11  0.46
12    12  0.09
13    13  0.32
14    14  0.25
> # go both ways to capture all duplicates
> which(duplicated(x$value) | duplicated(x$value, fromLast=TRUE))
[1]  5  6  7  8  9 13

On Mon, Jul 5, 2010 at 12:31 PM, Moohwan Kim wrote:
> Could you help me indicate where the duplicate observations in a row
> (e.g., 0.32) are?

--
Jim Holtman
Cincinnati, OH

What is the problem that you are trying to solve?
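If "in a row" in the question means consecutive repeats only (so that position 13 should not count), one possible alternative is rle(), which was not suggested in the thread. The sketch below is written under that assumption; the variable names are illustrative, and rle() compares values exactly, so computed floating-point values may need rounding first.

```r
# Values from the original question.
value <- c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
           0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)

r      <- rle(value)               # lengths of runs of consecutive equal values
ends   <- cumsum(r$lengths)        # last position of each run
starts <- ends - r$lengths + 1     # first position of each run
keep   <- r$lengths > 1            # runs containing at least two elements

# One row per run of consecutive duplicates.
runs <- data.frame(value = r$values[keep],
                   start = starts[keep],
                   end   = ends[keep])
print(runs)
```

On the posted data this reports a single run of 0.32 spanning positions 5 to 9.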
On Mon, 5 Jul 2010, Moohwan Kim wrote:
> Could you help me indicate where the duplicate observations in a row
> (e.g., 0.32) are?

I see you already have replies about duplicated() and unique(), which are very handy for the 'detect' part of your query.

But to list the locations of the duplicated elements, you might also benefit from using split() and Filter() like this:

> Filter( function(x) length(x)>1, split(order, value) )
$`0.32`
[1]  5  6  7  8  9 13

HTH,
Chuck

Charles C. Berry
Dept of Family/Preventive Medicine
UC San Diego
La Jolla, San Diego 92093-0901
http://famprevmed.ucsd.edu/faculty/cberry/
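The call above refers to order and value as free-standing vectors. If they are instead columns of a data frame, a self-contained version might look like the following sketch; the data frame DF and the names groups and dup_groups are assumptions made for illustration.

```r
# Assumed reconstruction of the poster's data.
DF <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
            0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
)

# Group the 'order' entries by their 'value'; each list element is named
# after the value it belongs to.
groups <- split(DF$order, DF$value)

# Keep only the groups with more than one member, i.e. the duplicated values.
dup_groups <- Filter(function(x) length(x) > 1, groups)
print(dup_groups)
```

The result is a named list, so it extends naturally to data with several different duplicated values.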