thr3ads.net - R help - [R] by output into data frame [Mar 2012]

David Perlman

2012-Mar-19 21:44 UTC

[R] by output into data frame

I could do this in various hacky ways, but what's the right way?

I have a nice application of the by function, which does what I want.  The
output looks like this:
> auc_stress
lab.samples.stress$subid: 2
  cortisol amylase
1   919.05  6834.8
---------------------------------------------------------------------------------------------------------------------------
lab.samples.stress$subid: 3
   cortisol  amylase
11   728.25 24422.05

etc.

What I want is a data frame roughly like this:

subid  cortisol.auc  amylase.auc
2      919.05        6834.8
3      728.25        24422.05

etc.

What is a nice way to make that happen?



Here is the code and data that I am using, which should run directly if you copy
and paste it:


sanity.check<-read.csv("http://brainimaging.waisman.wisc.edu/~perlman/testdata.csv",
header=TRUE, sep = ",")
lab.samples <- subset(sanity.check,Sample!='before bed' &
Sample!='morning after')
lab.samples$Sample<-factor(lab.samples$Sample)
lab.samples.stress<-subset(lab.samples,challenge=='stress')
lab.samples.control<-subset(lab.samples,challenge=='control')

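# Trapezoid-rule area under the curve, computed for every column of sub_df.
auc_ground <- function(sub_df) {
	print(sub_df)                       # debug: show the rows for this subject
	auc<-sub_df[1,]*0                   # one-row accumulator, all zeros
	timedif<-c(60,10,10,10,10,10,10)    # time between consecutive samples
	for (i in 1:(nrow(sub_df)-1) ) {
		print(c(i,i+1))             # debug: pair of rows being combined
		#print(c(values[i],values[i+1]))
		# area of the trapezoid between sample i and sample i+1
		pair_area<-(sub_df[i,]+sub_df[i+1,])*timedif[i]/2
		auc<-auc+pair_area
	}
	auc
}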

auc_stress<-by(lab.samples.stress[c('cortisol','amylase')],
lab.samples.stress$subid, auc_ground, simplify=T)
auc_control<-by(lab.samples.control[c('cortisol','amylase')],
lab.samples.control$subid, auc_ground, simplify=T)


Thanks for your help!

P.S. sorry if this question has been answered before, it is nearly impossible to
get useful google results on search terms like "by"...  too common
word...


-dave----------------------------------------------------------------------
A neuroscientist is at the video arcade, when someone makes him a $1000 bet
on Pac-Man. He smiles, gets out his screwdriver and takes apart the Pac-Man
game. Everyone says "What are you doing?" The neuroscientist says "Well,
since we all know that Pac-Man is based on electric signals traveling
through these circuits, obviously I can understand it better than the other
guy by going straight to the source!"

Jorge I Velez

2012-Mar-19 22:00 UTC

Hi David,

Thank you for the reproducible example!

Try
> do.call(rbind, auc_stress)
  cortisol  amylase
2   919.05  6834.80
3   728.25 24422.05
4  2106.00 25908.35
6   636.40 12209.75
7  1925.95  4749.25
> do.call(rbind, auc_control)
  cortisol  amylase
2   604.90  2458.00
4   587.65 29954.55
6   493.60 13833.80
7  1211.00  4932.35

HTH,
Jorge.-
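
A minimal sketch of the do.call(rbind, ...) idea on made-up data, which also moves the row names into the subid column asked for in the original post. The toy data frame, its values, and the col_totals helper are assumptions for illustration only; col_totals just stands in for any function that, like auc_ground, returns a one-row data frame.

```r
# Toy stand-in for lab.samples.stress; the values are invented.
toy <- data.frame(
  subid    = c(2, 2, 3, 3),
  cortisol = c(1, 2, 3, 4),
  amylase  = c(10, 20, 30, 40)
)

# Hypothetical per-subject summary returning a one-row data frame.
col_totals <- function(sub_df) as.data.frame(t(colSums(sub_df)))

res <- by(toy[c("cortisol", "amylase")], toy$subid, col_totals)

# Stack the one-row data frames; the by() group labels become row names.
out <- do.call(rbind, res)

# Move the row names into a subid column and rename the value columns.
out <- data.frame(
  subid        = as.numeric(rownames(out)),
  cortisol.auc = out$cortisol,
  amylase.auc  = out$amylase
)
print(out)
```

The subid conversion relies on rbind using the group labels as row names, which is the behaviour visible in the output above (rows labelled 2, 3, 4, 6, 7).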



Peter Meilstrup

2012-Mar-19 22:35 UTC

Thanks for providing a reproducible example.

Using the plyr package you can write your whole computation more compactly:

library(plyr)
library(caTools) #for trapz

auc <- ddply(lab.samples, .(challenge, subid), function(df) {
  # sampling times: first gap of 60, then steps of 10 (same spacing as timedif)
  df$time <- c(0, seq(60, by = 10, length.out = nrow(df) - 1))
  summarize(df,
            cortisol = trapz(time, cortisol),
            amylase = trapz(time, amylase))
})
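
For anyone without plyr or caTools installed, roughly the same computation can be sketched in base R. This is only an illustration under assumptions: trapezoid_auc is a hypothetical helper written here (not part of any package), the toy values are invented, and the time grid simply mirrors the 60, 10, 10, ... spacing of timedif in the original post.

```r
# Hypothetical helper: trapezoid-rule area under y over the points x.
trapezoid_auc <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Invented measurements for two subjects, three samples each.
toy <- data.frame(
  subid    = rep(c(2, 3), each = 3),
  cortisol = c(1, 3, 5, 2, 2, 2),
  amylase  = c(10, 30, 50, 20, 20, 20)
)

# Build one summary row per subject using split() and lapply().
pieces <- lapply(split(toy, toy$subid), function(df) {
  # Sampling times: first gap of 60, then steps of 10.
  time <- c(0, seq(60, by = 10, length.out = nrow(df) - 1))
  data.frame(
    subid    = df$subid[1],
    cortisol = trapezoid_auc(time, df$cortisol),
    amylase  = trapezoid_auc(time, df$amylase)
  )
})

# Stack the rows and drop the row names inherited from split().
auc <- do.call(rbind, pieces)
rownames(auc) <- NULL
print(auc)
```

Because each piece already carries its own subid column, the result is a plain data frame with no need to recover the grouping from row names.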

