similar to: data frames; maybe aggregate?

Displaying 20 results from an estimated 20000 matches similar to: "data frames; maybe aggregate?"

2006 Feb 11
2
aggregate vs tapply; is there a middle ground?
Dear all, I want to do a series of comparisons among 4 categorical variables: a <- aggregate(y, list(var1, var2, var3, var4), sum) This gets me a very nice 2-dimensional data frame with one column per variable, BUT, as help for aggregate says, <<empty subsets are removed>>. I don't see in help(aggregate) how I can change this. In contrast, a <- tapply(y,
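One way to keep the empty combinations, sketched here on dummy data (var1/var2/y stand in for the poster's variables): tapply() over a list of factors returns an array whose empty cells are NA, and as.data.frame.table() converts it back to long form.

    d <- data.frame(var1 = c("a", "a", "b"),
                    var2 = c("x", "y", "x"),
                    y    = c(1, 2, 3))

    ## array with one cell per combination; empty cells become NA
    arr <- with(d, tapply(y, list(var1, var2), sum))
    arr
    ##   x  y
    ## a 1  2
    ## b 3 NA

    ## back to a long data frame, one row per (possibly empty) cell
    as.data.frame.table(arr, responseName = "y")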
2011 Feb 07
2
Using Aggregate for Date
Hi, I am trying to find the min of day for each student in each year. Here is the dataset:
  date              studentid  year
  1/1/05 6:07 AM    236        20082009
  3/27/09 9:45 AM   236        20082009
  4/29/09 8:44 AM   236        20082009
  3/27/09 11:36 AM  310        20082009
  4/1/09 10:43 AM   310        20082009
  10/15/09 8:54 AM  310
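A possible approach (sketch, not from the thread), using the rows visible in the excerpt: parse the timestamps, sort, and keep the earliest row per student and year.

    d <- data.frame(
      date      = c("1/1/05 6:07 AM", "3/27/09 9:45 AM", "4/29/09 8:44 AM",
                    "3/27/09 11:36 AM", "4/1/09 10:43 AM", "10/15/09 8:54 AM"),
      studentid = c(236, 236, 236, 310, 310, 310),
      year      = "20082009",
      stringsAsFactors = FALSE)

    ## parse (assumes month/day/two-digit-year with an AM/PM clock)
    d$date <- as.POSIXct(d$date, format = "%m/%d/%y %I:%M %p", tz = "UTC")

    ## sort by student, year and time, then keep the first row of each group
    d <- d[order(d$studentid, d$year, d$date), ]
    d[!duplicated(d[c("studentid", "year")]), ]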
2010 Oct 07
3
aggregate text column by a few rows
Hi, the R function aggregate can only take summary-stats functions; can I aggregate text columns? For example, for the dataframe below, > a <- rbind(data.frame(id=1, name='Tom', hobby='fishing'),data.frame(id=1, name='Tom', hobby='reading'),data.frame(id=2, name='Mary', hobby='reading'),data.frame(id=3, name='John',
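One possible answer (sketch): aggregate() accepts any function that returns a single value per group, so text columns can be collapsed with paste().

    a <- rbind(data.frame(id = 1, name = "Tom",  hobby = "fishing"),
               data.frame(id = 1, name = "Tom",  hobby = "reading"),
               data.frame(id = 2, name = "Mary", hobby = "reading"))

    ## one row per id/name, hobbies collapsed into a single string
    aggregate(hobby ~ id + name, data = a, FUN = paste, collapse = ", ")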
2011 May 19
2
trouble with summary tables with several variables using aggregate function
Dear all, I am having trouble creating summary tables using the aggregate function, given the following table:
  Var1 Var2 Var3 dummy
  S1   T1   I    1
  S1   T1   I    1
  S1   T1   D    1
  S1   T1   D    1
  S1   T2   I    1
  S1   T2   I    1
  S1   T2   D    1
  S1   T2   D    1
  S2
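A possible approach (sketch, reusing the rows shown above): sum the dummy column over every combination of the factors, either as a data frame with aggregate() or as a contingency table with xtabs().

    d <- data.frame(Var1  = "S1",
                    Var2  = rep(c("T1", "T2"), each = 4),
                    Var3  = rep(c("I", "I", "D", "D"), 2),
                    dummy = 1)

    aggregate(dummy ~ Var1 + Var2 + Var3, data = d, FUN = sum)  # long summary table
    xtabs(dummy ~ Var1 + Var2 + Var3, data = d)                 # cross-tabulated counts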
2010 Oct 12
5
aggregate with cumsum
Hello everybody, the data is: myd <- data.frame(id1=rep(c("a","b","c"),each=3), id2=rep(1:3,3), val=rnorm(9)) I want to get a cumulative sum over each level of id1. Trying aggregate does not work: myd$pcum <- aggregate(myd[,c("val")], list(orig=myd$id1), cumsum) Please suggest a solution. In reality the data frame is huge, so looping with for and subsetting is not a
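One well-known alternative (sketch): ave() applies a function group-wise and returns a vector in the original row order, which is exactly what cumsum() needs here.

    myd <- data.frame(id1 = rep(c("a", "b", "c"), each = 3),
                      id2 = rep(1:3, 3),
                      val = rnorm(9))

    ## cumulative sum restarted within each level of id1
    myd$pcum <- ave(myd$val, myd$id1, FUN = cumsum)
    myd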
2009 Mar 07
4
merge data frames with same column names of different lengths and missing values
Hello, I'm switching over from SAS to R and am having trouble merging data frames. The data frames have several columns with the same name, and each has a different number of rows. Some of the values are missing from cells with the same column names in each data frame. I had hoped that when I merged the dataframes, every column with the same name would be merged, with the value in a complete
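A minimal sketch of the usual idiom (the column names here are made up): merge() with all = TRUE keeps every row from both data frames and fills unmatched cells with NA; same-named non-key columns get .x/.y suffixes.

    df1 <- data.frame(id = 1:3, x = c(10, NA, 30))
    df2 <- data.frame(id = 2:4, x = c(20, 30, 40), y = c("a", "b", "c"))

    ## full outer join on the shared key
    merge(df1, df2, by = "id", all = TRUE)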
2010 Aug 19
4
Aggregate Help
Please let me know if this is or is not the right place to ask these types of questions. Warning: I am new to R by two days. I have a simple dataset, which I have loaded successfully using the following code:
  filepath = "C:\\temp\\pilot\\dataset1.txt"
  Pilot = read.table(filepath, header=TRUE)
Dataset1.txt is delimited and looks like this:
  Date        illness  count
  2006/01/01  derm     17
  2006/01/01
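A possible next step (sketch; the extra rows are invented to make the example runnable): total count per date and illness with aggregate().

    Pilot <- data.frame(Date    = c("2006/01/01", "2006/01/01", "2006/01/02"),
                        illness = c("derm", "resp", "derm"),
                        count   = c(17, 5, 12))

    aggregate(count ~ Date + illness, data = Pilot, FUN = sum)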
2010 Aug 18
2
Different way of aggregating
Hi Usually "aggregate" is used to calculate things such as the sum of all data on the first day, the sum next day, and so on. But how can I calculate the mean of the first hour of all days, the mean of the second hour of all days, and so on. ??? That's Most examples: today at 1am + today at 2am + today at 3am +.... -> sum today tomorrow at 1am + tomorrow at
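One possible approach (sketch with an invented hourly series): group by hour of day instead of by date, using format() to extract the hour.

    times <- seq(as.POSIXct("2010-08-01 00:00", tz = "UTC"),
                 by = "hour", length.out = 72)     # three days of hourly data
    x <- rnorm(72)

    ## mean of each hour of the day, taken across all days
    tapply(x, format(times, "%H"), mean)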
2010 May 14
2
operations between two aggregated data frames?
Hi All, I've come up with a solution for this problem that relies on a for loop, and I was wondering if anybody had any insight into a more elegant method: I have two data frames, each has a column for categorical data and a column for date. What I'd like to do, ideally, is calculate the number of days between all pairs of dates in data frame 1 and data frame 2 (*but only for members
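A loop-free sketch (column names invented): merging the two data frames on the categorical column yields every within-category pair of rows, after which the date difference is a simple subtraction.

    df1 <- data.frame(cat   = c("a", "a", "b"),
                      date1 = as.Date(c("2010-01-01", "2010-01-05", "2010-02-01")))
    df2 <- data.frame(cat   = c("a", "b"),
                      date2 = as.Date(c("2010-01-10", "2010-02-15")))

    pairs <- merge(df1, df2, by = "cat")             # all pairs within each category
    pairs$days <- as.numeric(pairs$date2 - pairs$date1)
    pairs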
2010 Nov 29
4
subset
Hi: I always use subset the same way, but now it is returning 0 rows. What's wrong with the way I am subsetting? library(ggplot2) structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232, 46.3646, 38.0813, 40.0745, 40.4889, 38.6246, 40.2826, 41.6056, 34.5353, 40.0768), second = c(43.3295, 42.4326, 38.8994, 37.0894, 42.3218, 46.1726, 39.1206, 41.2072, 42.4874, 40.2657, 38.7766,
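The excerpt cuts off before the subset() call, so only a generic illustration is possible (the condition below is invented). A frequent cause of zero rows is comparing a character or factor column against a number, so str() on the data frame is worth checking.

    df <- data.frame(first  = c(38.2086, 43.1768, 43.1460),
                     second = c(43.3295, 42.4326, 38.8994))

    str(df)                   # check the column types first
    subset(df, first > 40)    # rows where 'first' exceeds 40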
2009 Aug 26
1
Batch replacement, by factor, of values in a data frame
Dear List, I'm wondering if there is a better/cleaner/more efficient way of replacing 0 values in a variable with the minimum of the non-missing and non-zero values of that same variable, but doing it within the levels of a factor? Consider the dummy example data presented at the end of my message. Within each 'Site' there are some 0 values and possibly some NA's. I can compute
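One possible approach (sketch with invented Site data): wrap the replacement rule in a small function and apply it within each Site via ave().

    d <- data.frame(Site = rep(c("A", "B"), each = 4),
                    x    = c(0, 2, 5, NA, 0, 0, 3, 7))

    replace0 <- function(v) {
      m <- min(v[v > 0], na.rm = TRUE)   # smallest non-missing, non-zero value
      v[!is.na(v) & v == 0] <- m
      v
    }

    d$x <- ave(d$x, d$Site, FUN = replace0)
    d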
2010 Apr 13
2
efficiently picking one row from a data frame per unique key
Hello all, I'm trying to transform data frames by grouping the rows by the values in a particular column, ordered by another column, then picking the first row in each group. I'd like to convert a data frame like this:
  x  y  z
  1 10 20
  1 11 19
  2 12 18
  4 13 17
into one with three rows, like this, where I've discarded one row:
    x  y  z
  1 1 11 19
  2 2 12 18
  4 4 13 17
I've got a
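A common idiom for this (sketch, using the rows shown above): sort by the grouping column and the ordering column, then keep the first row of each group with !duplicated().

    d <- data.frame(x = c(1, 1, 2, 4),
                    y = c(10, 11, 12, 13),
                    z = c(20, 19, 18, 17))

    d <- d[order(d$x, d$z), ]    # order within groups (here by z, ascending)
    d[!duplicated(d$x), ]        # first row per unique x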
2010 Dec 14
1
binding data.frames with sequential names
Hello, I have data frames X1 to X19. I want a simple way to bind them, as the next run(s) will generate many more sequential data frames. I tried the following with i = 19:
  > my.list <- as.list(paste("X",1:i,sep=""))
  > new.data <- do.call("rbind", my.list)
  > new.data
        [,1]
   [1,] "X1"
   [2,] "X2"
   [3,] "X3"
   [4,]
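The list above holds the names "X1".."X19", not the data frames themselves; a possible fix (sketch with three small dummy objects) is to fetch the objects with mget() before rbind-ing.

    X1 <- data.frame(a = 1); X2 <- data.frame(a = 2); X3 <- data.frame(a = 3)

    nms <- paste("X", 1:3, sep = "")
    new.data <- do.call(rbind, mget(nms))   # mget() returns the objects, not their names
    new.data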
2010 Oct 06
3
tapply output
Hello, I am having trouble getting the output from the tapply function formatted so that it can be made into a nice table. Below is my question written in R code. Does anyone have any suggestions? Thank you. Geoff #Input the data; name <- c('Tom', 'Tom', 'Jane', 'Jane', 'Enzo', 'Enzo', 'Mary', 'Mary'); year <- c(2008, 2009,
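A possible direction (sketch; the 'value' column and the pairing of years are invented): with two grouping vectors, tapply() already returns a matrix that prints as a table, and as.data.frame.table() converts it to long form.

    name  <- c("Tom", "Tom", "Jane", "Jane")
    year  <- c(2008, 2009, 2008, 2009)
    value <- c(10, 12, 9, 14)

    tab <- tapply(value, list(name, year), sum)
    tab                                               # rows = name, columns = year
    as.data.frame.table(tab, responseName = "value")  # long format, if preferred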
2008 May 07
6
help with the unique function
Hi, the unique function is easy to understand and use. Beyond that, I also want to get the frequency of repetition of each individual row in a data frame. Let me explain with an example:
  x <- data.frame(a=c(1,2,3,1,2), b=c(2,3,4,2,3), c=c(10,20,30,10,20))
  xu <- unique(x)
We have:
  > x
    a b  c
  1 1 2 10
  2 2 3 20
  3 3 4 30
  4 1 2 10
  5 2 3 20
  > xu
    a b  c
  1 1 2 10
  2 2 3 20
  3 3 4 30
I want to get
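One possible answer (sketch): aggregate a dummy count over all columns of x, which returns each unique row together with its frequency.

    x <- data.frame(a = c(1, 2, 3, 1, 2),
                    b = c(2, 3, 4, 2, 3),
                    c = c(10, 20, 30, 10, 20))

    aggregate(list(freq = rep(1, nrow(x))), by = x, FUN = length)
    ##   a b  c freq
    ## 1 1 2 10    2
    ## 2 2 3 20    2
    ## 3 3 4 30    1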
2010 Sep 10
4
Counting occurances of a letter by a factor
I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame. Ex. > DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L",
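A possible approach (sketch; the column names 'geno' and 'grp' are invented, since the excerpt's data frame is unnamed): split the genotype strings into single letters and tabulate them within each factor level.

    DF <- data.frame(geno = c("CC", "CC", NA, "CG", "GG", "GC"),
                     grp  = c("L", "U", "L", "U", "L", "U"),
                     stringsAsFactors = FALSE)

    ## letter frequencies per group, NA genotypes dropped
    tapply(DF$geno, DF$grp,
           function(s) table(unlist(strsplit(na.omit(s), ""))))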
2010 Mar 09
5
data frame select max group by like function
Hi, I have a data frame with 3 columns: ID, year, and score. How can I select, for each unique ID, the year that has the max score? For example, for the data frame
  ID,   year, score
  tom,  1995, 88
  rick, 1994, 90
  mary, 2000, 97
  tom,  1998, 60
  mary, 1998, 100
I shall have
  ID,   year, score
  tom,  1995, 88
  rick, 1994, 90
  mary, 1998, 100
Thanks, Richard
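A common answer to this kind of question (sketch, using the rows above): keep the rows whose score equals the group-wise maximum computed with ave().

    d <- data.frame(ID    = c("tom", "rick", "mary", "tom", "mary"),
                    year  = c(1995, 1994, 2000, 1998, 1998),
                    score = c(88, 90, 97, 60, 100))

    d[d$score == ave(d$score, d$ID, FUN = max), ]
    ##     ID year score
    ## 1  tom 1995    88
    ## 2 rick 1994    90
    ## 5 mary 1998   100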
2008 Dec 30
3
Componentwise means of a list of matrices?
Dear useRs, I have a list, each entry of which is a matrix of constant dimensions. Is there a good way (i.e., not using a for loop) to apply a mean to each matrix entry *across list entries*? Example: foo <- list(rbind(c(1,2,3),c(4,5,6)),rbind(c(7,8,9),c(10,11,12))) some.sort.of.apply(foo,FUN=mean) I'm looking for a componentwise mean across the two entries of foo, i.e., the
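The usual loop-free answer (sketch): sum the matrices element-wise with Reduce() and divide by the number of list entries.

    foo <- list(rbind(c(1, 2, 3), c(4, 5, 6)),
                rbind(c(7, 8, 9), c(10, 11, 12)))

    Reduce("+", foo) / length(foo)
    ##      [,1] [,2] [,3]
    ## [1,]    4    5    6
    ## [2,]    7    8    9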
2010 Jun 08
1
restructuring "by" output for use with write.table
Hello, vegMeans <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, mean) vegSD <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, sd) write.table(vegMeans, file="A:\\Work_Area\\Steve\\Hydrology_Data\\data\\vegMeans.txt") Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class '"by"' into a
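A possible workaround (sketch; the dummy columns stand in for SoilVegHydro[3:37]): compute the group means with aggregate(), which already returns a data frame that write.table() accepts.

    SoilVegHydro <- data.frame(Physiogomy = c("bog", "bog", "fen"),
                               depth      = c(10, 12, 8),
                               ph         = c(4.5, 4.7, 6.1))

    vegMeans <- aggregate(SoilVegHydro[c("depth", "ph")],
                          by  = list(Physiogomy = SoilVegHydro$Physiogomy),
                          FUN = mean)
    write.table(vegMeans, file = "vegMeans.txt", row.names = FALSE)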
2010 Nov 09
2
Merging data frames one of which is NULL
Hello! I am running a loop. The result of each run of the loop is a data frame. I am merging all the data frames. For example: The data frame from run 1: x<-data.frame(a=1,b=2,c=3) The data frame from run 2: y<-data.frame(a=10,b=20,d=30) What I want to get is: merge(x,y,all.x=T,all.y=T) Then I want to merge it with the output of the 3rd run, etc. Unfortunately, I can't create the
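One way to handle the NULL case (sketch; the loop body below is a stand-in for the real computation): treat an empty accumulator as "just take the new data frame", otherwise merge with all = TRUE.

    combined <- NULL
    for (run in 1:3) {
      res <- data.frame(a = run, b = run * 2)   # stand-in for the loop's real output
      combined <- if (is.null(combined)) res else merge(combined, res, all = TRUE)
    }
    combined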