Walter Anderson
2015-Mar-31 19:51 UTC
[R] How to obtain a cross tab count of unique values
I have a data frame that shows all of the parks (including duplicates) that are impacted by a project's 'footprint':

    PROJECT PARKNAME
    A       PRK A
    A       PRK B
    A       PRK A
    B       PRK C
    B       PRK A
    C       PRK B
    C       PRK D
    ...

What I need is a cross tabulation that shows me the number of unique parks for each project. If I use the standard table(df$PROJECT), it reports:

    A 3
    B 2
    C 2
    ...

where I need it to ignore duplicates and report:

    A 2
    B 2
    C 2
    ...

Does anyone have any suggestions on how to do this within the R paradigm?

Walter Anderson
Sure: tell R you want unique rows.

> mydf <- data.frame(PROJECT=c("A","A","A","B","B","C","C"),
+     PARKNAME=c("PRK A", "PRK B", "PRK A", "PRK C", "PRK A", "PRK B", "PRK D"),
+     stringsAsFactors=FALSE)
> mydf
  PROJECT PARKNAME
1       A    PRK A
2       A    PRK B
3       A    PRK A
4       B    PRK C
5       B    PRK A
6       C    PRK B
7       C    PRK D
> mydf.unique <- unique(mydf)
> table(mydf.unique$PROJECT)

A B C
2 2 2

Please provide reproducible data yourself in the future.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
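One caveat on the unique-rows approach above: unique() compares whole rows, so it collapses repeated parks only when PROJECT and PARKNAME are the only columns. The sketch below assumes a hypothetical extra column, ACRES, which is not in the original data, to show the effect and one way around it.

```r
# Same rows as the example above, plus a hypothetical ACRES column
# (an assumption for illustration; it is not in the original post).
mydf2 <- data.frame(
  PROJECT  = c("A", "A", "A", "B", "B", "C", "C"),
  PARKNAME = c("PRK A", "PRK B", "PRK A", "PRK C", "PRK A", "PRK B", "PRK D"),
  ACRES    = c(1.0, 2.0, 1.5, 3.0, 0.5, 2.0, 4.0),
  stringsAsFactors = FALSE
)

# The two "A / PRK A" rows differ in ACRES, so no row is dropped here.
n_rows_kept <- nrow(unique(mydf2))

# Restrict to the key columns before de-duplicating, then tabulate.
park_pairs <- unique(mydf2[, c("PROJECT", "PARKNAME")])
counts <- table(park_pairs$PROJECT)
counts
```

If the real data frame has only the two columns shown in the question, unique(mydf) on its own is enough.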
Hello,

Try the following.

    table(unique(df)$PROJECT)

And please note that 'df' is the name of an R function; use a different name for your data frame.

Hope this helps,

Rui Barradas

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
table(unique(df)$PROJECT)
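For comparison, here is a sketch of two other base-R ways to get the same per-project counts without first building a de-duplicated data frame. It is not what any poster in the thread proposed; the data frame name `parks` is chosen here only to avoid masking the df() function.

```r
# Reproducible copy of the data from the question.
parks <- data.frame(
  PROJECT  = c("A", "A", "A", "B", "B", "C", "C"),
  PARKNAME = c("PRK A", "PRK B", "PRK A", "PRK C", "PRK A", "PRK B", "PRK D"),
  stringsAsFactors = FALSE
)

# Option 1: for each project, count the distinct park names.
n_unique <- tapply(parks$PARKNAME, parks$PROJECT,
                   function(x) length(unique(x)))
n_unique

# Option 2: build the full project-by-park table, turn the counts into
# presence/absence, and sum across parks.
presence <- table(parks$PROJECT, parks$PARKNAME) > 0
n_by_presence <- rowSums(presence)
n_by_presence
```

Both options look only at the two named columns, so any other columns in the data frame do not affect the result.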