I am working with census data. My columns of interest are...

PercentOld - the percentage of people in each county that are over 65
County - the county in each state
State - the state in the US

There are about 3100 rows, with each row corresponding to a county within a state.

I want to return the top five "PercentOld" by state. But I want the County and the Value.

I tried this...

topN <- function(column, n=5)
{
  column <- sort(column, decreasing=T)
  return(column[1:n])
}
top5PerState <- tapply(data$percentOld, data$STATE, topN)

But this only returns the value for "percentOld" per state, I also want the corresponding County.

I think I'm close, but I just can't get it...

Thanks

cn
Hi

> I am working with census data. My columns of interest are...
>
> PercentOld - the percentage of people in each county that are over 65
> County - the county in each state
> State - the state in the US
>
> There are about 3100 rows, with each row corresponding to a county within a state.
>
> I want to return the top five "PercentOld" by state. But I want the County
> and the Value.
>
> I tried this...
>
> topN <- function(column, n=5)
> {
>   column <- sort(column, decreasing=T)
>   return(column[1:n])
> }
> top5PerState <- tapply(data$percentOld, data$STATE, topN)

Try

aggregate(data$PercentOld, list(data$State, data$County), topN)

Regards
Petr

> But this only returns the value for "percentOld" per state, I also want the
> corresponding County.
>
> I think I'm close, but I just can't get it...
>
> Thanks
>
> cn
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
See the examples labelled "head" in the examples section near the bottom of:

http://sqldf.googlecode.com/svn/trunk/man/sqldf.Rd

These show how to do it using order as well as using SQL via sqldf.

On 8/31/07, Cory Nissen <cnissen at akoyainc.com> wrote:
> I am working with census data. My columns of interest are...
>
> PercentOld - the percentage of people in each county that are over 65
> County - the county in each state
> State - the state in the US
>
> There are about 3100 rows, with each row corresponding to a county within a state.
>
> I want to return the top five "PercentOld" by state. But I want the County and the Value.
>
> I tried this...
>
> topN <- function(column, n=5)
> {
>   column <- sort(column, decreasing=T)
>   return(column[1:n])
> }
> top5PerState <- tapply(data$percentOld, data$STATE, topN)
>
> But this only returns the value for "percentOld" per state, I also want the corresponding County.
>
> I think I'm close, but I just can't get it...
>
> Thanks
>
> cn
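The order-based approach mentioned above can be sketched in base R roughly as follows. This is not the code from the sqldf help page, only a minimal illustration of the idea: sort the rows by state and by descending PercentOld, then keep the first n rows of each state. The data frame name `census`, the helper `topNRows`, and the small sample of rows are assumptions made for the example.

```r
# Illustrative sample only; column names follow the original post.
census <- data.frame(
  State = c(rep("Illinois", 4), rep("Wisconsin", 4)),
  County = c("Adams", "Brown", "Bureau", "Cass",
             "Adams", "Ashland", "Barron", "Bayfield"),
  PercentOld = c(17.554849, 16.826594, 18.196593, 17.139242,
                 19.347022, 17.814436, 16.903067, 17.632781),
  stringsAsFactors = FALSE
)

# Sort by state, then by PercentOld in decreasing order within each state.
sorted <- census[order(census$State, -census$PercentOld), ]

# Hypothetical helper: keep the first n rows of each state's block.
# Because whole rows are kept, County travels along with the value.
topNRows <- function(df, n = 5) {
  pieces <- lapply(split(df, df$State), head, n)
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

top2 <- topNRows(sorted, n = 2)
print(top2)
```

With the real data, `topNRows(sorted)` would use the default n = 5. States with fewer than n counties simply return all of their rows.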
Perhaps you want this?

# Example data: ten counties each for two states
data <- NULL
data$state <- c(rep("Illinois", 10), rep("Wisconsin", 10))
data$county <- c("Adams", "Brown", "Bureau", "Cass", "Champaign",
                 "Christian", "Coles", "De Witt", "Douglas", "Edgar",
                 "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",
                 "Burnett", "Chippewa", "Clark", "Columbia", "Crawford")
data$percentOld <- c(17.554849, 16.826594, 18.196593, 17.139242,
                     8.743823, 17.862746, 13.747967, 16.626302,
                     15.258940, 18.984435, 19.347022, 17.814436,
                     16.903067, 17.632781, 16.659305, 20.337817,
                     14.293354, 17.252820, 15.647179, 16.825596)
data <- data.frame(data, stringsAsFactors=FALSE)

# Rank counties within each state; negating percentOld gives rank 1
# to the highest value
rankWithinState <- unlist(tapply(-data$percentOld, data$state, rank))
names(rankWithinState) <- NULL
data <- data.frame(data, rankWithinState)

# Keep the top five per state, then sort by state and descending percentOld
highCounties <- data[data$rankWithinState <= 5, ]
highCountiesSorted <- highCounties[order(highCounties$state, -highCounties$percentOld), ]

Cory Nissen wrote:
> I am working with census data. My columns of interest are...
>
> PercentOld - the percentage of people in each county that are over 65
> County - the county in each state
> State - the state in the US

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
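One caveat on the rank-based approach above: unlist(tapply(...)) returns the ranks grouped by state in the order of the factor levels, so they only line up with the rows when the data are already sorted that way, as in the example. A possible variant, sketched here with ave(), keeps the ranks in the original row order. The data frame name `census`, the choice of ties.method = "min", and the deliberately unsorted sample rows are assumptions for illustration, not part of the original reply.

```r
# Illustrative sample, deliberately NOT sorted by state.
census <- data.frame(
  state = c("Wisconsin", "Illinois", "Wisconsin",
            "Illinois", "Wisconsin", "Illinois"),
  county = c("Adams", "Adams", "Ashland", "Brown", "Barron", "Bureau"),
  percentOld = c(19.347022, 17.554849, 17.814436,
                 16.826594, 16.903067, 18.196593),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each state and returns the results in the
# original row order, so each rank stays attached to its own row.
# ties.method = "min" is a choice made for this sketch: tied counties
# share the better rank, so a tie at the cutoff can yield more than n rows.
census$rankWithinState <- ave(-census$percentOld, census$state,
                              FUN = function(x) rank(x, ties.method = "min"))

n <- 2  # use 5 for the top five per state
high <- census[census$rankWithinState <= n, ]
high <- high[order(high$state, -high$percentOld), ]
print(high)
```

The filtering and final sort are the same as in the reply above; only the way the within-state rank is computed differs.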