search for: summaryby

Displaying 20 results from an estimated 101 matches for "summaryby".

2010 Mar 17
2
Using nrow with summaryBy
Hello Everyone- I'm calculating summary statistics on a dataset (~4000 records, observations are not uniformly distributed) using summaryBy and trying to add a column with the number of observations to the output as well. What occurs to me is to use nrow(), but this doesn't appear to be working. I'm able to replicate the same results with an example from the summaryBy docs: data(dietox) dietox12 <- subset(dietox, Time == 12)...
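A possible reason nrow() gives nothing useful here: the summary function presumably receives each variable as a plain vector per group, and nrow() of a vector is NULL, whereas length() returns the count. The sketch below shows the idea in base R with aggregate() on made-up data (the data frame `d` and its columns are illustrative, not the poster's dataset); whether summaryBy accepts the same kind of function should be checked against the doBy documentation.

```r
# Illustrative data: three groups of unequal size (an assumption, not the dietox data)
set.seed(1)
d <- data.frame(
  grp = rep(c("a", "b", "c"), times = c(5, 3, 2)),
  y   = rnorm(10)
)

# Return the mean and the group size from one function
res <- aggregate(y ~ grp, data = d,
                 FUN = function(v) c(mean = mean(v), n = length(v)))

# aggregate() stores the two values in a matrix column; flatten it
res <- do.call(data.frame, res)
res  # columns: grp, y.mean, y.n
```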
2011 Jan 17
2
Using summaryBy with weighted data
Dear Soren and R users: I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows: library(doBy) ## make up some data response = rnorm(100) group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20)) weights = runif(100, 0, 1) mydata = data.frame(response,group,weights)...
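If a per-group weighted mean is what is wanted, one way that does not depend on summaryBy is to split the data frame and apply weighted.mean() to each piece. This is only a base-R sketch using the column names from the post; it is not a claim about how summaryBy handles weights.

```r
# Data built as in the post (seed added here for reproducibility)
set.seed(42)
mydata <- data.frame(
  response = rnorm(100),
  group    = rep(1:5, each = 20),
  weights  = runif(100, 0, 1)
)

# Split by group so each piece keeps its own weights, then summarise
wmeans <- sapply(split(mydata, mydata$group),
                 function(d) weighted.mean(d$response, d$weights))
wmeans  # named vector, one weighted mean per group
```

Because each piece is a data frame, the function sees the response and the weights together, which is the part a one-vector-at-a-time summary function cannot do.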
2012 Apr 02
2
summaryBy: transformed variable on RHS of formula?
Hi Folks, I'm trying to cut my data inside the summaryBy function. Perhaps formulas don't work that way? I'd like to avoid adding another column if possible, but if I have to, I have to. Any ideas? Thanks, Allie require(doBy) df = data.frame(a = rnorm(100), b = rnorm(100)) summaryBy(a ~ cut(b,c(-100,-1,1,100)), data=df) # p...
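One way to avoid permanently adding a column is to create the binned variable on the fly with transform(), so the original data frame is left untouched. The sketch below uses base aggregate() rather than summaryBy; the name `b_bin` is made up for illustration.

```r
set.seed(7)
df <- data.frame(a = rnorm(100), b = rnorm(100))

# The binned column exists only in the temporary copy passed as `data`
res <- aggregate(a ~ b_bin,
                 data = transform(df, b_bin = cut(b, c(-100, -1, 1, 100))),
                 FUN  = mean)
res

names(df)  # still just "a" and "b"
```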
2013 Jan 17
3
how to use "..."
Dear users, I'm trying to learn how to use the "...". I have written a function (simplified here) that uses doBy::summaryBy(): # 'dat' is a data.frame from which the aggregation is computed # 'vec_cat' is an integer vector defining which columns of the data.frame should be used on the right side of the formula # 'stat_fun' is the function that will be run to aggregate stat.group <- function(dat...
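The usual pattern for "..." is to accept it in the wrapper's signature and pass it straight through to the function that calls the aggregation function. The sketch below uses base aggregate() instead of summaryBy, and the wrapper name and arguments (`stat_group`, `value_cols`, `by_cols`) are hypothetical, not the poster's function.

```r
# Hypothetical wrapper: extra arguments in ... are forwarded to stat_fun
stat_group <- function(dat, value_cols, by_cols, stat_fun, ...) {
  aggregate(dat[value_cols], by = dat[by_cols], FUN = stat_fun, ...)
}

d <- data.frame(g = c("a", "a", "b", "b"),
                x = c(1, NA, 3, 5))

# na.rm = TRUE travels through ... down to mean()
res <- stat_group(d, "x", "g", mean, na.rm = TRUE)
res
```

Without `na.rm = TRUE` in the call, group "a" would come back as NA, which is a quick way to check that the forwarding works.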
2006 Dec 05
1
summaryBy(): Is it the best option?
Hi, since I have quite large tables and the processing takes quite a while I am curious if I can improve the performance of this aggregation somehow: At the moment I am using summaryBy from the doBy package under R 2.4.0, Win2K. summaryBy(soc_s6aq5 + soc_s6aq7 + soc_s6aq9 + soc_s6aq11 ~ hh + comgroup,soc6a,postfix=c("","","",""),FUN=sum, na.rm=T) The data.frame has 124100 rows and 13 cols. Thanks for any hints! Werner
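For plain group sums, base R's rowsum() is worth trying, since it is written for exactly this case and is often considerably faster than general-purpose aggregation. The sketch below uses a small made-up table; the column names `v1` and `v2` stand in for the poster's variables, and no timing is claimed for the real data.

```r
set.seed(3)
n <- 1000
soc <- data.frame(
  hh       = sample(1:50, n, replace = TRUE),
  comgroup = sample(1:4,  n, replace = TRUE),
  v1       = runif(n),
  v2       = runif(n)
)

# One key per hh/comgroup combination that actually occurs
key <- interaction(soc$hh, soc$comgroup, drop = TRUE)

# Column sums within each key; row names identify the groups
sums <- rowsum(soc[, c("v1", "v2")], group = key, na.rm = TRUE)
head(sums)
```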
2007 Feb 15
1
Problem in summaryBy
The R script below gives values of 1 for all minimum values when I use a custom function in summaryBy. I get the correct values when I use FUN=min directly. Any help is much appreciated. The continuous information provided in this forum is fabulous as are the different R packages available. Rene # Simulated simplified data Subj <- rep(1:4, each=6) Analyte <- rep(c(rep("RBV",3),...
2007 Aug 20
1
Problem with summaryBy: Group sums give me "incorrectly" zero for one variable
Hi, first I want to thank all of you for the quick aid which is provided here on the list during all times. Thanks a lot for that! Then, I have a problem using summaryBy which most probably is a problem of wrong use by me or the like: I use this command: summaryBy(total+total.inf~gr, aE, FUN=sum) where aE is a > str(aE) 'data.frame': 127880 obs. of 16 variables: $ gr : num 2.02e+10 1.13e+10 2.15e+10 4.02e+10 1.04e+10 ... $ total...
2012 May 15
0
Indexing in summaryBy
I'm trying to use a self-written function with the summaryBy function (doBy package). I have lots of data from Monte Carlo experiments comparing different estimators across different (combinations of) parameter values, similar to the following form: colnames(mydata) <- c("X", "b0", "b1", # parameter combination, correspondi...
2009 Sep 04
1
Apparent bug in summaryBy (PR#13941)
Full_Name: Marc Paterno Version: 2.9.2 OS: Mac OS X 10.5.8 Submission from: (NULL) (99.53.212.55) summaryBy() produces incorrect results when given some data frames. Below is a transcript of a session showing the result, in a data frame with 2 observations of 2 variables. ------------------- thomas:999 paterno$ R --vanilla R version 2.9.2 (2009-08-24) Copyright (C) 2009 The R Foundation for Statistical...
2006 Jul 10
1
Counting observations split by a factor when there are NAs in the data
...ientist (linguist) who is trying to learn to use R after being very familiar with SPSS. Please be kind! My concern: I cannot figure out a way to get an accurate count of observations of one column of data split by a factor when there are NAs in the data. I know how to use commands like tapply and summaryBy to obtain other summary statistics I am interested in, such as the following: tapply(RLWTEST, list(STATUS), mean, na.rm=T) summaryBy(RLWTEST~STATUS, data=lh.forgotten, FUN=c(mean, sd, min, max), na.rm=T) However, with tapply I know I cannot use length to get a count where there are NAs. summaryBy...
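A count that ignores NAs can be had by summing a logical vector instead of calling length(). The sketch below uses tiny made-up vectors that borrow the names from the post.

```r
# Illustrative values only; the names follow the post
RLWTEST <- c(10, NA, 12, 14, NA, 9)
STATUS  <- factor(c("x", "x", "x", "y", "y", "y"))

# !is.na() is TRUE for observed values, so the sum is the non-missing count
n_obs <- tapply(RLWTEST, STATUS, function(v) sum(!is.na(v)))
n_obs
```

The same anonymous function can be given a name and used next to mean, sd, min and max in any summary call that accepts a list of functions.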
2010 May 07
4
Any way to apply TWO functions with tapply()?
I need to compute the mean and the standard deviation of a data set and would like to have the results in one table/data frame. I call tapply() two times and do then merge the resulting tables to have them all in one table. Is there any way to tell tapply() to use the functions mean and sd within one function call? Something like tapply(data$response, list(data$targets, data$conditions), c(mean,
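One way to get both statistics in a single pass is to have the function return a named vector and bind the per-group results into a table. This base-R sketch uses one grouping factor and made-up data; for two factors, split() also accepts a list of factors.

```r
d <- data.frame(
  target   = rep(c("t1", "t2"), each = 4),
  response = c(1, 2, 3, 4, 10, 20, 30, 40)
)

# sapply() returns one column per group; t() turns groups into rows
stats <- t(sapply(split(d$response, d$target),
                  function(v) c(mean = mean(v), sd = sd(v))))
stats
```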
2006 Jul 10
0
Counting observations split by a factor when there are NAs in the data
...o use R after being very familiar with > SPSS. Please be kind! > > My concern: > I cannot figure out a way to get an accurate count of > observations of one column of data split by a factor when > there are NAs in the data. > > I know how to use commands like tapply and summaryBy to > obtain other summary statistics I am interested in, such as > the following: > tapply(RLWTEST, list(STATUS), mean, na.rm=T) > summaryBy(RLWTEST~STATUS, data=lh.forgotten, FUN=c(mean, sd, > min, max), > na.rm=T) > > However, with tapply I know I cannot use length to...
2010 Jul 16
2
aggregate(...) with multiple functions
hi all - i'm just wondering what sort of code people write to essentially perform an aggregate call, but with different functions being applied to the various columns. for example, if i have a data frame x and would like to marginalize by a factor f for the rows, but apply mean() to col1 and median() to col2. if i wanted to apply mean() to both columns, i would call: aggregate(x, list(f),
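A split/apply/combine sketch for this, in base R: split the data frame by the factor, build a one-row summary per piece with whatever function suits each column, and bind the rows. The data and the output column names are made up for illustration.

```r
x <- data.frame(col1 = c(1, 2, 3, 4, 5, 6),
                col2 = c(10, 20, 60, 1, 2, 9))
f <- factor(c("a", "a", "a", "b", "b", "b"))

# One summary row per level of f, with a different function per column
res <- do.call(rbind, lapply(split(x, f), function(d) {
  data.frame(col1.mean   = mean(d$col1),
             col2.median = median(d$col2))
}))
res  # row names carry the levels of f
```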
2010 Oct 01
2
function which can apply a function by a grouping variable and also hand over an additional variable, e.g. a weight
Hi, I was wondering if there is an easy way to accomplish the following in R: Often I want to apply a function, e.g. weighted.quantile from the Hmisc package to grouped subsets of a data.frame (grouping variable) but then I also need to hand over the weights which seems not possible with summaryBy or aggregate or the like. Is there a function to do this? Currently I do this with loops but it is very slow. I would be very grateful for any hints. Thanks, Werner
2011 Nov 18
1
counting events by subject with "black out" windows
...L, 1L, 1L, 0L, 1L,1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L)), .Names = c("ID", "Date","event"), class = "data.frame", row.names = c(NA, 14L)) ##     remove non events data2 <- data1[data1$event==1,] library(doBy) ##     create a table of first events step1 <- summaryBy(Date~ID, data = data2, FUN=min) step1$Date30 <- step1$Date.min+30                                     step2 <- merge(data2, step1, by.x="ID", by.y="ID") ##     use an ifelse to essentially remove any events that shouldn't be counted step2$event <- ifelse(as.numeric...
2011 Jul 28
3
Data aggregation question
...count of the number of observations for every level of that 3-way interaction. For example, if factors A, B, and C each have 3 levels (all of which were observed someplace in the dataset), I'd like to know how many times A1, B1, and C1 co-occurred in the dataset. Functions like aggregate and summaryBy do a decent job when I sum a vector of ones of the same length as the original dataset, but I'm getting stuck on the fact that neither will return 0-count combinations of the three variables in question. I understand that this is a desirable outcome (if A1, B1, C2 didn't occur, it shouldn&...
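If the three variables are factors, table() counts every combination of their levels, including those that never occur, and as.data.frame() turns the result into one row per combination. A minimal sketch on made-up data (the levels are set explicitly so that unobserved ones are kept):

```r
d <- data.frame(
  A = factor(c("A1", "A1", "A2"), levels = c("A1", "A2")),
  B = factor(c("B1", "B2", "B1"), levels = c("B1", "B2")),
  C = factor(c("C1", "C1", "C1"), levels = c("C1", "C2"))
)

# Cross-tabulate all three columns; absent combinations get a count of 0
counts <- as.data.frame(table(d), responseName = "n")
counts
```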
2010 Feb 08
1
Follow-up Question: data frames; matching/merging
...res R 2.11.x >> aggregate(V2 ~ V1, DF, min) >   V1 V2 > 1  a  2 > 2  b  9 > 3  c  4 > >> # 3. SQL using sqldf >> library(sqldf) >> sqldf("select V1, min(V2) V2 from DF group by V1") >   V1 V2 > 1  a  2 > 2  b  9 > 3  c  4 > >> # 4. summaryBy in the doBy package >> library(doBy) >> summaryBy(V2 ~., DF, FUN = min, keep.names = TRUE) >   V1 V2 > 1  a  2 > 2  b  9 > 3  c  4 > > On Mon, Feb 8, 2010 at 11:39 AM, Jonathan <jonsleepy at gmail.com> wrote: >> Hi all, >>    I'm feeling a little g...
2007 Oct 30
2
flexible processing
Hello, unfortunately, I don't know a better subject. I would like to be very flexible in how to process my data. Assume the following dataset: par1 <- seq(0,1,length.out = 100) par2 <- seq(1,100) fac1 <- factor(rep(c("group1", "group2"), each = 50)) fac2 <- factor(rep(c("group3", "group4", "group5", "group6"), each =
2007 Aug 31
3
data frame row manipulation
...cannot calculate rowise maxvals evaluation=data.frame(date=c(1,2,3,4,5,6,7,8,9), name=c("Michael","Steve","Bob", "Michael","Steve","Bob","Michael","Steve","Bob"), vol=c(3,5,4,2,4,5,7,6,7)) evaluation # maxval=summaryBy(vol ~ name,data=evaluation,FUN = function(x) { c(ma=max(x)) } ) maxval # over all days per person #function getMaxVal=function(x) { maxval$vol.ma[maxval$name==x] } getMaxVal("Steve") # testing the function for one name is ok #we want to add a column, that shows the daily drinkingvolume...
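If the goal is to attach each person's maximum to every row (an assumption, since the post is cut off here), base R's ave() does this in one step without a separate lookup function. The sketch reuses the data from the post; the column name `maxvol` is made up.

```r
evaluation <- data.frame(
  date = 1:9,
  name = rep(c("Michael", "Steve", "Bob"), times = 3),
  vol  = c(3, 5, 4, 2, 4, 5, 7, 6, 7)
)

# ave() returns a vector as long as the input, filled with the group-wise result
evaluation$maxvol <- ave(evaluation$vol, evaluation$name, FUN = max)
evaluation
```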
2010 Jul 16
2
multivariate graphs, averaging on some vars
Hello I have a table of this kind: function x1 x2 x3 2.232 1 1 1.00 2.242 1 1 1.01 2.732 1 1 1.02 2.770 1 2 1.00 1.932 1 2 1.01 2.132 1 2 1.02 3.222 1.2 1 1 ..... ... .. .. The table represents the values of a function(x1, x2, x3) for each combination x1, x2, x3. I'd like to generate a plot where each point has the coordinates x=x1, y=x2,