I am working with census data. My columns of interest are...
PercentOld - the percentage of people in each county that are over 65
County - the county in each state
State - the state in the US
There are about 3100 rows, with each row corresponding to a county within a
state.
I want to return the top five "PercentOld" values by state, but I want both
the County and the Value.
I tried this...
topN <- function(column, n=5)
{
  column <- sort(column, decreasing=TRUE)
  return(column[1:n])
}
top5PerState <- tapply(data$percentOld, data$STATE, topN)
But this only returns the values of "percentOld" per state; I also
want the corresponding County.
I think I'm close, but I just can't get it...
Thanks
cn
Hi

> I tried this...
>
> topN <- function(column, n=5)
> {
> column <- sort(column, decreasing=T)
> return(column[1:n])
> }
> top5PerState <- tapply(data$percentOld, data$STATE, topN)

Try

aggregate(data$PercentOld, list(data$State, data$County), topN)

Regards
Petr

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
See the examples labelled head in the examples section near the bottom of:

http://sqldf.googlecode.com/svn/trunk/man/sqldf.Rd

These show how to do it using order as well as using SQL via sqldf.

On 8/31/07, Cory Nissen <cnissen at akoyainc.com> wrote:
> I want to return the top five "PercentOld" by state. But I want the County and the Value.
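For readers without access to that help page, here is a minimal base-R sketch of the order-plus-head idea mentioned in the reply above. It is not copied from the sqldf examples; the data frame `census`, its values, and the choice of n = 2 are made up for illustration, and only the column names follow the original post.

```r
# Hypothetical sample data; column names follow the original post.
census <- data.frame(
  State      = rep(c("Illinois", "Wisconsin"), each = 4),
  County     = c("Adams", "Brown", "Bureau", "Cass",
                 "Adams", "Ashland", "Barron", "Bayfield"),
  PercentOld = c(17.6, 16.8, 18.2, 17.1, 19.3, 17.8, 16.9, 17.6),
  stringsAsFactors = FALSE
)

# Sort by state, then by PercentOld descending within each state.
sorted <- census[order(census$State, -census$PercentOld), ]

# Split into one data frame per state and keep the first n rows of each.
# Whole rows are kept, so County travels along with the value.
n <- 2
topPerState <- do.call(rbind, lapply(split(sorted, sorted$State), head, n))
rownames(topPerState) <- NULL
topPerState
```

With the real data, n would be 5. Because whole rows are subset rather than a single column, the result keeps State, County and PercentOld together.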
Perhaps you want this?
data <- NULL
data$state <- c(rep("Illinois", 10), rep("Wisconsin", 10))
data$county <- c("Adams", "Brown", "Bureau", "Cass", "Champaign",
                 "Christian", "Coles", "De Witt", "Douglas", "Edgar",
                 "Adams", "Ashland", "Barron", "Bayfield", "Buffalo",
                 "Burnett", "Chippewa", "Clark", "Columbia", "Crawford")
data$percentOld <- c(17.554849, 16.826594, 18.196593, 17.139242, 8.743823,
                     17.862746, 13.747967, 16.626302, 15.258940, 18.984435,
                     19.347022, 17.814436, 16.903067, 17.632781, 16.659305,
                     20.337817, 14.293354, 17.252820, 15.647179, 16.825596)
data <- data.frame(data, stringsAsFactors=FALSE)
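# Rank counties within each state (rank 1 = highest percentOld).
# ave() returns the ranks in the original row order, so this also works
# when the rows are not already grouped by state.
rankWithinState <- ave(-data$percentOld, data$state, FUN=rank)
data <- data.frame(data, rankWithinState)
# Keep the top five per state, then sort by state and descending percentOld.
highCounties <- data[data$rankWithinState <= 5, ]
highCountiesSorted <- highCounties[order(highCounties$state, -highCounties$percentOld), ]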
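One thing to watch with the rank-based filter above: rank() defaults to ties.method = "average", so if two counties tie at the cut-off, both get rank 5.5 and both are dropped by the "<= 5" test. The sketch below uses made-up numbers (not census values) to show the effect; the vector `x` and the names `keptAverage` and `keptMin` are only for illustration.

```r
# Made-up values with a tie at the cut-off (not census data).
x <- c(20, 19, 18, 17, 16, 16, 15)

# Default ties.method = "average": the two 16s each get rank 5.5.
rankAverage <- rank(-x)

# ties.method = "min": the two 16s both get rank 5.
rankMin <- rank(-x, ties.method = "min")

# Number of values that survive a "rank <= 5" filter under each method.
keptAverage <- sum(rankAverage <= 5)
keptMin     <- sum(rankMin <= 5)
keptAverage
keptMin
```

Which behaviour is wanted depends on the question being asked: "min" keeps every county tied for fifth place, while ties.method = "first" would return exactly five rows per state.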
Cory Nissen wrote:
> I am working with census data. My columns of interest are...
>
> PercentOld - the percentage of people in each county that are over 65
> County - the county in each state
> State - the state in the US
--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459