Polwart Calum (County Durham and Darlington NHS Foundation Trust)
2010-May-25 18:12 UTC
[R] Non-unique Values
I might be missing something really obvious, but is there an easy way to locate all non-unique values in a data frame?

Example:

mydata <- numeric()
mydata$id <- 0:8
mydata$unique <- c(1:5, 1:4)
mydata$result <- c(1:3, 1:3, 1:3)

> mydata
$id
[1] 0 1 2 3 4 5 6 7 8
$unique
[1] 1 2 3 4 5 1 2 3 4
$result
[1] 1 2 3 1 2 3 1 2 3

What I want is to be able to get some form of data output that might look like this:

> nonunique(mydata$unique)
mydata$unique
1  $id 0, 5
2  $id 1, 6
3  $id 2, 7
4  $id 3, 8

That way I could report any non-unique values of 'unique' to my data entry team and tell them the row numbers, so they can check whether the 'unique' value was keyed wrongly or the entry was made twice.

Hoping there is an easy way. If not, I suspect we can do it in the SQL tables; I am just trying not to juggle two languages...

C
The really obvious thing that you missed ;-) was trying:

help(unique)

and looking at 'See Also', which would have led you to

help(duplicated)

HTH
Jannis
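For what it's worth, here is a minimal sketch of how duplicated() could be used to get the kind of report asked for in the question. It assumes the data are held in a real data frame rather than the list built in the original example. nonunique() is only the name used in the question, not an existing R function, and its two-argument form (values plus ids) is an assumption made for this illustration.

```r
# Example data from the original post, held here as a data frame (an assumption;
# the original code builds a list).
mydata <- data.frame(
  id     = 0:8,
  unique = c(1:5, 1:4),
  result = c(1:3, 1:3, 1:3)
)

# nonunique() is a hypothetical helper named after the call in the question.
nonunique <- function(values, ids) {
  # duplicated() alone flags only the second and later occurrences, so it is
  # combined with a scan from the other end to flag every repeated value.
  dup <- duplicated(values) | duplicated(values, fromLast = TRUE)
  # Group the ids of the flagged rows by the value they share.
  split(ids[dup], values[dup])
}

report <- nonunique(mydata$unique, mydata$id)
print(report)
```

With the example data, the result is a list with one element per repeated value (1 to 4), each holding the ids of the rows that share it: 0 and 5 for value 1, 1 and 6 for value 2, and so on. The value 5 occurs only once and is left out.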