R users,

I am trying to find the value that is repeated most often in a column for each StandID of a data frame. I have researched methods online and read the help pages, but have had no success in finding a solution. I have tried using the table function, but it returns counts for the whole dataset and not by StandID. Any help will be appreciated. Thanks in advance.

R version 2.11.1
Windows 7
Dataframe is imported from text file

StandID PlotNum HerbNum Woody
001     1       1       low
001     2       2       medium
001     3       1       low
001     4       3       low
001     5       1       high
001     6       2       medium
002     1       1       high
002     2       2       high
002     3       2       low
002     4       3       high
002     5       1       high
002     6       2       medium

I would like to get the following from the dataframe

StandID HerbNum Woody
001     1       low
002     2       high

Thanks,

Randy
Bill.Venables at csiro.au
2010-Aug-25 02:14 UTC
[R] find most repeated item from column in dataframe
Do you expect this to be easy? It may be, but I can't see a particularly graceful way to do it. Here is one possible solution.

> dat
   StandID PlotNum HerbNum  Woody
1      001       1       1    low
2      001       2       2 medium
3      001       3       1    low
4      001       4       3    low
5      001       5       1   high
6      001       6       2 medium
7      002       1       1   high
8      002       2       2   high
9      002       3       2    low
10     002       4       3   high
11     002       5       1   high
12     002       6       2 medium

> getMostCommon <- function(x) {
+   tx <- table(x)                # counts of each distinct value
+   m <- which(tx == max(tx))[1]  # first value reaching the top count
+   as(names(tx)[m], class(x))    # convert back to the class of x
+ }
> val <- unclass(by(dat[,-1], dat$StandID, function(x) lapply(x, getMostCommon)))
> (newDat <- cbind(StandID = names(val), as.data.frame(do.call(rbind, val))))
    StandID PlotNum HerbNum Woody
001     001       1       1   low
002     002       1       2  high

This sort of gets you the answer, but it is not quite what it seems (the columns are lists rather than plain vectors). One way to make it more manageable is

> for(j in 2:ncol(newDat)) newDat[[j]] <- unlist(newDat[[j]])
> newDat
    StandID PlotNum HerbNum Woody
001     001       1       1   low
002     002       1       2  high

This is now a data frame with columns (more or less) what they appear to be.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Randy Cass
Sent: Wednesday, 25 August 2010 11:33 AM
To: r-help at r-project.org
Subject: [R] find most repeated item from column in dataframe

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
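For anyone wanting to run the code in this thread, here is a minimal sketch of recreating the sample data and tabulating one column per stand. The object name dat follows the reply above. Reading StandID as character (via colClasses) is an assumption made here to keep the leading zeros, and the text argument of read.table needs a more recent R than the 2.11.1 mentioned in the question.

```r
# Recreate the sample data from the original post.
# Assumption: StandID is read as character so "001" keeps its leading zeros.
dat <- read.table(text = "
StandID PlotNum HerbNum Woody
001 1 1 low
001 2 2 medium
001 3 1 low
001 4 3 low
001 5 1 high
001 6 2 medium
002 1 1 high
002 2 2 high
002 3 2 low
002 4 3 high
002 5 1 high
002 6 2 medium
", header = TRUE,
   colClasses = c("character", "integer", "integer", "character"))

# table() on a whole column ignores StandID; tapply() applies it
# separately to the rows of each stand and returns one table per stand.
counts <- tapply(dat$Woody, dat$StandID, table)
counts[["001"]]
```

Each element of counts is an ordinary table, so which.max() or max() can then be applied stand by stand.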
Tena koe Randy
If your dataframe is called randy, then the following seems to work:
aggregate(randy[,-(1:2)], list(randy[,1]), function(x) {tt <- table(x);
names(tt)[which.max(tt)]})
HTH ....
Peter Alspach
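A self-contained sketch of the aggregate() approach above, for anyone who wants to try it. The data frame is rebuilt by hand from the original post; the helper name most_common and the StandID label passed to aggregate() are illustrative choices, not part of the reply. Because names() of a table are character strings, the result columns come back as character, so the numeric column is converted explicitly at the end.

```r
# Sample data from the original post, rebuilt by hand (assumption: StandID
# is stored as character to preserve the leading zeros).
randy <- data.frame(
  StandID = rep(c("001", "002"), each = 6),
  PlotNum = rep(1:6, times = 2),
  HerbNum = c(1, 2, 1, 3, 1, 2,  1, 2, 2, 3, 1, 2),
  Woody   = c("low", "medium", "low", "low", "high", "medium",
              "high", "high", "low", "high", "high", "medium"),
  stringsAsFactors = FALSE
)

# Most frequent value of a vector, returned as a character string.
# which.max() picks the first maximum, so ties go to the value that
# sorts first in the table.
most_common <- function(x) {
  tt <- table(x)
  names(tt)[which.max(tt)]
}

# Drop StandID and PlotNum from the data, group by StandID.
res <- aggregate(randy[, -(1:2)], list(StandID = randy[, 1]), most_common)

# Restore the numeric type of HerbNum (as.character guards against factors).
res$HerbNum <- as.numeric(as.character(res$HerbNum))
res
```

On the sample data this should give HerbNum 1 and Woody low for stand 001, and HerbNum 2 and Woody high for stand 002, which is the result asked for in the original post.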