Eric Vander Wal
2009-Jun-16 20:44 UTC
[R] Running stats on individual data.frames from the split() function list
Hello, and thanks in advance.

I have a data.frame from which I want to count observations that occur on each day and determine the mean and std.error of said counts. For instance:

    x <- split(my.df, my.df$julian.days)

Although I'm still in my R learning infancy, I am under the impression that x is a list of data.frames subsetting my.df by group (i.e., julian.day), where days 1:366 are x$'1' through x$'366' and my variables are x$'1'$var1, x$'1'$var2, for each data.frame in the list. The data I seek can be supplied by

    mean(sapply(split(x$'1'$var1, x$'1'$var2), length))

and

    std.error(sapply(split(x$'1'$var1, x$'1'$var2), length))

etc.

Is there an efficient means for me to process the entire list x so that I needn't call each data.frame (i.e., day) from list x individually? No doubt there is a more sophisticated way to obtain this information from the outset, but I'm not yet familiar with it.

Thanks again for all the help,
Eric

--
Eric Vander Wal
Ph.D. Candidate
University of Saskatchewan, Department of Biology,
112 Science Place, Saskatoon, SK., S7N 5E2
"Pluralitas non est ponenda sine necessitate"
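The structure described in the question can be checked on a small example. The following is only a sketch: the toy values in my.df are hypothetical stand-ins for the poster's data, and only the column names (julian.days, var1, var2) come from the message.

```r
# Hypothetical toy data standing in for the poster's my.df
my.df <- data.frame(
  julian.days = c(1, 1, 1, 2, 2),
  var1 = c(0.5, 1.2, 0.7, 2.1, 1.9),
  var2 = c("a", "a", "b", "a", "b")
)

# split() returns a named list with one data.frame per day
x <- split(my.df, my.df$julian.days)
names(x)                 # one name per distinct day
is.data.frame(x[["1"]])  # each element is itself a data.frame

# Counts of observations per var2 group within day 1,
# computed the same way as in the question
counts.day1 <- sapply(split(x[["1"]]$var1, x[["1"]]$var2), length)
mean(counts.day1)
```

Note that x[["1"]] and x$'1' refer to the same element; the double-bracket form is easier to use inside a loop or an apply-style call.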
Daniel Malter
2009-Jun-16 21:00 UTC
[R] Running stats on individual data.frames from the split() function list
The simple function you are looking for is tapply. Take a look at the following example:

    day=rep(1:30,each=30)                    ##There are thirty days
                                             ##with thirty obs each

    y=rnorm(length(day),mean=2*day,sd=day)   ##a dep. variable
                                             ##with mean=2*day index no.
                                             ##and sd=day

    tapply(y,day,length)   ##shows no. of obs for each day

    tapply(y,day,mean)     ##shows mean of y for each day
                           ##should be about equal to two times the day index no.

    tapply(y,day,sd)       ##shows sd of y for each day
                           ##should be about equal to the day index no.

Best,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------
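For the original question (processing every data.frame in the split list at once), one possible approach is to pass the per-day computation to sapply. This is a sketch under assumptions, not the method either poster settled on: the toy my.df is hypothetical, and because std.error is not part of base R, a hand-written helper se() is assumed here in its place.

```r
# Hypothetical toy data: 3 days, 6 observations per day,
# with var2 groups of size 3 ("a"), 2 ("b") and 1 ("c") on each day
set.seed(1)
my.df <- data.frame(
  julian.days = rep(1:3, each = 6),
  var1 = rnorm(18),
  var2 = rep(c("a", "a", "a", "b", "b", "c"), times = 3)
)

# Assumed helper: standard error of the mean (stand-in for std.error)
se <- function(v) sd(v) / sqrt(length(v))

# One data.frame per day, as in the question
x <- split(my.df, my.df$julian.days)

# Apply the question's per-day computation to every element of x
per.day <- t(sapply(x, function(d) {
  counts <- sapply(split(d$var1, d$var2), length)  # obs per var2 group
  c(mean = mean(counts), se = se(counts))
}))

per.day  # one row per day, columns "mean" and "se"
```

sapply returns one column per day, so t() is used to get one row per day. With this toy data every day has group counts 3, 2 and 1, so each row has mean 2 and standard error 1/sqrt(3).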