R users,

I am trying to find the value of a column that is repeated the most for each StandID of a dataframe. I have researched methods online and read the help pages, but have had no success in finding a solution. I have tried using the table function, but it returns counts for the whole dataset and not by StandID. Any help will be appreciated. Thanks in advance.

R version 2.11.1
Windows 7
Dataframe is imported from a text file

StandID PlotNum HerbNum Woody
001     1       1       low
001     2       2       medium
001     3       1       low
001     4       3       low
001     5       1       high
001     6       2       medium
002     1       1       high
002     2       2       high
002     3       2       low
002     4       3       high
002     5       1       high
002     6       2       medium

I would like to get the following from the dataframe:

StandID HerbNum Woody
001     1       low
002     2       high

Thanks,

Randy
Bill.Venables at csiro.au
2010-Aug-25 02:14 UTC
[R] find most repeated item from column in dataframe
Do you expect this to be easy? It may be, but I can't see a particularly graceful way to do it. Here is one possible solution.

> dat
   StandID PlotNum HerbNum  Woody
1      001       1       1    low
2      001       2       2 medium
3      001       3       1    low
4      001       4       3    low
5      001       5       1   high
6      001       6       2 medium
7      002       1       1   high
8      002       2       2   high
9      002       3       2    low
10     002       4       3   high
11     002       5       1   high
12     002       6       2 medium

> getMostCommon <- function(x) {
      tx <- table(x)
      m <- which(tx == max(tx))[1]
      as(names(tx)[m], class(x))
  }
> val <- unclass(by(dat[,-1], dat$StandID, function(x) lapply(x, getMostCommon)))
> (newDat <- cbind(StandID = names(val), as.data.frame(do.call(rbind, val))))
    StandID PlotNum HerbNum Woody
001     001       1       1   low
002     002       1       2  high

This sort of gets you the answer, but it is not quite what it seems: the columns of newDat are lists rather than ordinary vectors. One way to make it more manageable is

> for(j in 2:ncol(newDat)) newDat[[j]] <- unlist(newDat[[j]])
> newDat
    StandID PlotNum HerbNum Woody
001     001       1       1   low
002     002       1       2  high

This is now a data frame with columns (more or less) what they appear to be.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Randy Cass
Sent: Wednesday, 25 August 2010 11:33 AM
To: r-help at r-project.org
Subject: [R] find most repeated item from column in dataframe

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
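The remark in the reply above that the result "is not quite what it seems" comes down to column types: row-binding a list of lists produces a matrix of mode list, so each column of the data frame built from it is itself a list. The following is a minimal sketch of that behaviour. The object val is a small hand-built stand-in for the output of by() (an assumption for illustration, not the poster's actual object), and was_list is a name introduced here only to record the state before flattening.

```r
# Hand-built stand-in for the per-stand results (hypothetical values
# that match the desired output in the question).
val <- list("001" = list(HerbNum = 1, Woody = "low"),
            "002" = list(HerbNum = 2, Woody = "high"))

# rbind() on lists yields a matrix of mode "list", so every column of
# the data frame built from it is a list, not an atomic vector.
newDat <- as.data.frame(do.call(rbind, val))
was_list <- sapply(newDat, is.list)
print(was_list)

# Flatten each list column into an ordinary vector.
for (j in seq_along(newDat)) newDat[[j]] <- unlist(newDat[[j]])
str(newDat)
```

After the loop, HerbNum should be a numeric column and Woody a character column, which is what most downstream functions expect.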
Tena koe Randy

If your dataframe is called randy, then the following seems to work:

aggregate(randy[,-(1:2)], list(randy[,1]),
          function(x) {tt <- table(x); names(tt)[which.max(tt)]})

HTH ....

Peter Alspach
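For anyone wanting to run the aggregate() approach end to end, here is a self-contained sketch. The data frame is rebuilt by hand from the table in the question, with StandID stored as character so the leading zeros survive (an assumption about how the file was read). The helper name most_common and the StandID label on the grouping list are introduced here only for readability. Because the helper returns names(tt), every summarised column comes back as text, so the sketch converts HerbNum back to numeric afterwards.

```r
# Data from the question, rebuilt by hand; StandID kept as character
# (assumption) so that "001" is not read as the number 1.
randy <- data.frame(
  StandID = rep(c("001", "002"), each = 6),
  PlotNum = rep(1:6, times = 2),
  HerbNum = c(1, 2, 1, 3, 1, 2, 1, 2, 2, 3, 1, 2),
  Woody   = c("low", "medium", "low", "low", "high", "medium",
              "high", "high", "low", "high", "high", "medium"),
  stringsAsFactors = FALSE
)

# Most frequent value of x, returned as a string; on ties the first
# entry in table() order wins.
most_common <- function(x) {
  tt <- table(x)
  names(tt)[which.max(tt)]
}

# Drop StandID and PlotNum, then summarise the rest per stand.
res <- aggregate(randy[, -(1:2)], list(StandID = randy[, 1]), most_common)

# The helper returns text, so restore the numeric type of HerbNum.
res$HerbNum <- as.numeric(as.character(res$HerbNum))
print(res)
```

Note that ties are broken silently by table() order, so a stand with two equally common values reports only one of them.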