Hi there,
Another beginner's question, I'm afraid. Basically I want to select the
maximum of values that correspond to different variables. I have a table
of oil production that looks somewhat like this:
oil <- data.frame( YEAR = c(2011, 2012),
                   TX = c(20000, 30000),
                   CA = c(40000, 25000),
                   AL = c(20000, 21000),
                   ND = c(21000, 60000))
Now I want to find out which state produced the most oil in a given year. I
tried this:
attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)
This works, but it doesn't give me the corresponding variable (i.e. it just
gives me the maximum output, not which state it's from).
So I tried this:
oil[c(oil == max(last_year)),]
and this:
oil[c(last_year == max(last_year)),]
and this:
oil[which.max(last_year),]
and this:
last_year[max(last_year),]
None of them work, but they don't give error messages either; the output is
just "NA". The problem is, in my eyes, that I'm comparing the values of
different variables with each other. If I change the structure of
the data frame (which I can't do with the real data, at least not without
doing it by hand with a huge dataset), it looks like this and works
perfectly:
oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 20000, 40000, 20000, 21000),
  oil_2012 = c(2012, 30000, 25000, 21000, 60000)
  )
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]
Any help is much appreciated.
Thanks, Tim Umbach
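
A note on the NA output: the attempts above appear to return NA because the value of the maximum (or its column position) ends up being used as a row index, and oil has only two rows. The sketch below reproduces one such case under that reading and shows one way to get the state name for a single year. Flattening the row with unlist() is a choice made here, not something from the original post.

```r
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(20000, 30000),
                  CA = c(40000, 25000),
                  AL = c(20000, 21000),
                  ND = c(21000, 60000))

last_year <- oil[oil$YEAR == 2012, ]

# max(last_year) is 60000, so this asks for row number 60000,
# which does not exist; R returns a row of NAs instead of an error.
na_row <- last_year[max(last_year), ]

# Drop YEAR, flatten the single row to a named vector,
# then use the position of the maximum to look up the state name.
production <- unlist(last_year[-1])
best <- names(production)[which.max(production)]
print(best)
```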
Hi,
You may try:
unlist(lapply(seq_len(nrow(oil)),function(i) oil[i,-1][which.max(oil[i,-1])]))
#   CA    ND
#40000 60000
#or
library(reshape2)
datM <- melt(oil,id.var="YEAR")
datM[as.logical(with(datM,ave(value,list(YEAR),FUN= function(x) x%in% max(x)))),]
#  YEAR variable value
#3 2011       CA 40000
#8 2012       ND 60000
A.K.
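
If reshape2 is not available, the same long-format idea can probably be done in base R. The following is a minimal sketch, assuming YEAR is the only non-state column; stack() and ave() are base functions, while the names long and best are made up for this example. Like the melt() version, it keeps every state that ties for the maximum in a year.

```r
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(20000, 30000),
                  CA = c(40000, 25000),
                  AL = c(20000, 21000),
                  ND = c(21000, 60000))

state_cols <- setdiff(names(oil), "YEAR")

# stack() puts all state columns into one 'values' column,
# with the originating column name in 'ind'.
long <- stack(oil[state_cols])
long$YEAR <- rep(oil$YEAR, times = length(state_cols))

# ave() returns, for every row, the maximum within that row's YEAR group;
# rows whose value equals the group maximum are kept.
best <- long[long$values == ave(long$values, long$YEAR, FUN = max), ]
print(best)
```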
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On 17-10-2013, at 18:48, Tim Umbach <tim.umbach at hufw.de> wrote:
> Now I want to find out, which state produced most oil in a given year. I
> tried this:
>
> attach(oil)
> last_year = oil[ c(YEAR == 2012), ]
> max(last_year)
>
For a single year do
year <- which(oil[,"YEAR"]==2011)
oil[year,which.max(oil[year,]),drop=FALSE]
In the help look at base::[.data.frame
Berend
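
The single-year lookup can also be wrapped in a small function. This is only a sketch: top_state is a hypothetical helper name, and it assumes the year column is called YEAR and that each year occurs in exactly one row. Unlike the lookup above, it leaves YEAR out of the comparison, so the result does not depend on production figures being larger than the year number.

```r
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(20000, 30000),
                  CA = c(40000, 25000),
                  AL = c(20000, 21000),
                  ND = c(21000, 60000))

# Hypothetical helper: returns a one-column data frame holding the
# largest production value for 'year', named after the state.
top_state <- function(df, year) {
  state_cols <- setdiff(names(df), "YEAR")
  row <- df[df$YEAR == year, state_cols, drop = FALSE]
  j <- which.max(unlist(row))   # position of the maximum among the states
  row[, j, drop = FALSE]        # drop = FALSE keeps the column name
}

print(top_state(oil, 2011))
print(top_state(oil, 2012))
```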
See ?pmax for getting the max for each year.
do.call('pmax', oil[-1])
Or equivalently:
pmax(oil$TX, oil$CA, oil$AL, oil$ND)
apply and which.max will give you the index:
i <- apply(oil[-1], 1, which.max)
which you can use to extract the state:
names(oil[-1])[i]
Jason
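
Putting the pmax and which.max pieces together, a per-year summary could look like the sketch below. The names states and result are made up for this example, and it assumes YEAR is the first column. which.max returns the first maximum, so a tie within a year would report only the first tied state.

```r
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(20000, 30000),
                  CA = c(40000, 25000),
                  AL = c(20000, 21000),
                  ND = c(21000, 60000))

states <- oil[-1]                    # drop YEAR so it is never compared
i <- apply(states, 1, which.max)     # column index of each row's maximum

result <- data.frame(YEAR  = oil$YEAR,
                     STATE = names(states)[i],
                     MAX   = do.call(pmax, states))
print(result)
```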
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Tim Umbach
Sent: Thursday, October 17, 2013 9:49 AM
To: r-help at r-project.org
Subject: [R] Selecting maximums between different variables