similar to: data frames; maybe aggregate?

Displaying 20 results from an estimated 20000 matches similar to: "data frames; maybe aggregate?"

2006 Feb 11
2
aggregate vs tapply; is there a middle ground?
Dear all, I want to do a series of comparisons among 4 categorical variables: a <- aggregate(y, list(var1, var2, var3, var4), sum) This gets me a very nice 2-dimensional data frame with one column per variable, BUT, as help for aggregate says, <<empty subsets are removed>>. I don't see in help(aggregate) how I can change this. In contrast, a <- tapply(y,
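One way to keep the empty combinations, sketched here on dummy data (var1/var2/y stand in for the poster's variables): tapply() over a list of factors returns an array whose empty cells are NA, and as.data.frame.table() converts it back to long form.

    d <- data.frame(var1 = c("a", "a", "b"),
                    var2 = c("x", "y", "x"),
                    y    = c(1, 2, 3))

    ## array with one cell per combination; empty cells become NA
    arr <- with(d, tapply(y, list(var1, var2), sum))
    arr
    ##   x  y
    ## a 1  2
    ## b 3 NA

    ## back to a long data frame, one row per (possibly empty) cell
    as.data.frame.table(arr, responseName = "y")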
2011 Feb 07
2
Using Aggregate for Date
Hi, I am trying to find the min of day for each student in each year. Here is the dataset:
  date              studentid  year
  1/1/05 6:07 AM    236        20082009
  3/27/09 9:45 AM   236        20082009
  4/29/09 8:44 AM   236        20082009
  3/27/09 11:36 AM  310        20082009
  4/1/09 10:43 AM   310        20082009
  10/15/09 8:54 AM  310
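A possible approach (sketch, not from the thread), using the rows visible in the excerpt: parse the timestamps, sort, and keep the earliest row per student and year.

    d <- data.frame(
      date      = c("1/1/05 6:07 AM", "3/27/09 9:45 AM", "4/29/09 8:44 AM",
                    "3/27/09 11:36 AM", "4/1/09 10:43 AM", "10/15/09 8:54 AM"),
      studentid = c(236, 236, 236, 310, 310, 310),
      year      = "20082009",
      stringsAsFactors = FALSE)

    ## parse (assumes month/day/two-digit-year with an AM/PM clock)
    d$date <- as.POSIXct(d$date, format = "%m/%d/%y %I:%M %p", tz = "UTC")

    ## sort by student, year and time, then keep the first row of each group
    d <- d[order(d$studentid, d$year, d$date), ]
    d[!duplicated(d[c("studentid", "year")]), ]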
2010 Oct 07
3
aggregate text column by a few rows
Hi, the R function aggregate can only take summary-stats functions; can I aggregate text columns? For example, for the dataframe below, > a <- rbind(data.frame(id=1, name='Tom', hobby='fishing'),data.frame(id=1, name='Tom', hobby='reading'),data.frame(id=2, name='Mary', hobby='reading'),data.frame(id=3, name='John',
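One possible answer (sketch): aggregate() accepts any function that returns a single value per group, so text columns can be collapsed with paste().

    a <- rbind(data.frame(id = 1, name = "Tom",  hobby = "fishing"),
               data.frame(id = 1, name = "Tom",  hobby = "reading"),
               data.frame(id = 2, name = "Mary", hobby = "reading"))

    ## one row per id/name, hobbies collapsed into a single string
    aggregate(hobby ~ id + name, data = a, FUN = paste, collapse = ", ")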
2011 May 19
2
trouble with summary tables with several variables using aggregate function
Dear all, I am having trouble creating summary tables using the aggregate function, given the following table:
  Var1 Var2 Var3 dummy
  S1   T1   I    1
  S1   T1   I    1
  S1   T1   D    1
  S1   T1   D    1
  S1   T2   I    1
  S1   T2   I    1
  S1   T2   D    1
  S1   T2   D    1
  S2
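A possible approach (sketch, reusing the rows shown above): sum the dummy column over every combination of the factors, either as a data frame with aggregate() or as a contingency table with xtabs().

    d <- data.frame(Var1  = "S1",
                    Var2  = rep(c("T1", "T2"), each = 4),
                    Var3  = rep(c("I", "I", "D", "D"), 2),
                    dummy = 1)

    aggregate(dummy ~ Var1 + Var2 + Var3, data = d, FUN = sum)  # long summary table
    xtabs(dummy ~ Var1 + Var2 + Var3, data = d)                 # cross-tabulated counts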
2010 Oct 12
5
aggregate with cumsum
Hello everybody, the data is: myd <- data.frame(id1=rep(c("a","b","c"),each=3), id2=rep(1:3,3), val=rnorm(9)) I want to get a cumulative sum over each level of id1. Trying aggregate does not work: myd$pcum <- aggregate(myd[,c("val")], list(orig=myd$id1), cumsum) Please suggest a solution. In reality the data frame is huge, so looping with for and subsetting is not a
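One well-known alternative (sketch): ave() applies a function group-wise and returns a vector in the original row order, which is exactly what cumsum() needs here.

    myd <- data.frame(id1 = rep(c("a", "b", "c"), each = 3),
                      id2 = rep(1:3, 3),
                      val = rnorm(9))

    ## cumulative sum restarted within each level of id1
    myd$pcum <- ave(myd$val, myd$id1, FUN = cumsum)
    myd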
2009 Mar 07
4
merge data frames with same column names of different lengths and missing values
Hello, I'm switching over from SAS to R and am having trouble merging data frames. The data frames have several columns with the same name, and each has a different number of rows. Some of the values are missing from cells with the same column names in each data frame. I had hoped that when I merged the dataframes, every column with the same name would be merged, with the value in a complete
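A minimal sketch of the usual idiom (the column names here are made up): merge() with all = TRUE keeps every row from both data frames and fills unmatched cells with NA; same-named non-key columns get .x/.y suffixes.

    df1 <- data.frame(id = 1:3, x = c(10, NA, 30))
    df2 <- data.frame(id = 2:4, x = c(20, 30, 40), y = c("a", "b", "c"))

    ## full outer join on the shared key
    merge(df1, df2, by = "id", all = TRUE)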
2010 Aug 19
4
Aggregate Help
Please let me know if this is or is not the right place to ask these types of questions. Warning: I am new to R by two days. I have a simple dataset, which I have loaded successfully using the following code:
  filepath = "C:\\temp\\pilot\\dataset1.txt"
  Pilot = read.table(filepath, header=TRUE)
Dataset1.txt is delimited and looks like this:
  Date        illness  count
  2006/01/01  derm     17
  2006/01/01
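A possible next step (sketch; the extra rows are invented to make the example runnable): total count per date and illness with aggregate().

    Pilot <- data.frame(Date    = c("2006/01/01", "2006/01/01", "2006/01/02"),
                        illness = c("derm", "resp", "derm"),
                        count   = c(17, 5, 12))

    aggregate(count ~ Date + illness, data = Pilot, FUN = sum)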
2010 Aug 18
2
Different way of aggregating
Hi Usually "aggregate" is used to calculate things such as the sum of all data on the first day, the sum next day, and so on. But how can I calculate the mean of the first hour of all days, the mean of the second hour of all days, and so on. ??? That's Most examples: today at 1am + today at 2am + today at 3am +.... -> sum today tomorrow at 1am + tomorrow at
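One possible approach (sketch with an invented hourly series): group by hour of day instead of by date, using format() to extract the hour.

    times <- seq(as.POSIXct("2010-08-01 00:00", tz = "UTC"),
                 by = "hour", length.out = 72)     # three days of hourly data
    x <- rnorm(72)

    ## mean of each hour of the day, taken across all days
    tapply(x, format(times, "%H"), mean)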
2010 May 14
2
operations between two aggregated data frames?
Hi All, I've come up with a solution for this problem that relies on a for loop, and I was wondering if anybody had any insight into a more elegant method: I have two data frames, each has a column for categorical data and a column for date. What I'd like to do, ideally, is calculate the number of days between all pairs of dates in data frame 1 and data frame 2 (*but only for members
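A loop-free sketch (column names invented): merging the two data frames on the categorical column yields every within-category pair of rows, after which the date difference is a simple subtraction.

    df1 <- data.frame(cat   = c("a", "a", "b"),
                      date1 = as.Date(c("2010-01-01", "2010-01-05", "2010-02-01")))
    df2 <- data.frame(cat   = c("a", "b"),
                      date2 = as.Date(c("2010-01-10", "2010-02-15")))

    pairs <- merge(df1, df2, by = "cat")             # all pairs within each category
    pairs$days <- as.numeric(pairs$date2 - pairs$date1)
    pairs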
2010 Nov 29
4
subset
Hi: I always use subset the same way, but now it is returning 0 rows. What's wrong with the way I am subsetting? library(ggplot2) structure(list(first = c(38.2086, 43.1768, 43.146, 41.8044, 42.4232, 46.3646, 38.0813, 40.0745, 40.4889, 38.6246, 40.2826, 41.6056, 34.5353, 40.0768), second = c(43.3295, 42.4326, 38.8994, 37.0894, 42.3218, 46.1726, 39.1206, 41.2072, 42.4874, 40.2657, 38.7766,
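The excerpt cuts off before the subset() call, so only a generic illustration is possible (the condition below is invented). A frequent cause of zero rows is comparing a character or factor column against a number, so str() on the data frame is worth checking.

    df <- data.frame(first  = c(38.2086, 43.1768, 43.1460),
                     second = c(43.3295, 42.4326, 38.8994))

    str(df)                   # check the column types first
    subset(df, first > 40)    # rows where 'first' exceeds 40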
2009 Aug 26
1
Batch replacement, by factor, of values in a data frame
Dear List, I'm wondering if there is a better/cleaner/more efficient way of replacing 0 values in a variable with the minimum of the non-missing and non-zero values of that same variable, but doing it within the levels of a factor? Consider the dummy example data presented at the end of my message. Within each 'Site' there are some 0 values and possibly some NA's. I can compute
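One possible approach (sketch with invented Site data): wrap the replacement rule in a small function and apply it within each Site via ave().

    d <- data.frame(Site = rep(c("A", "B"), each = 4),
                    x    = c(0, 2, 5, NA, 0, 0, 3, 7))

    replace0 <- function(v) {
      m <- min(v[v > 0], na.rm = TRUE)   # smallest non-missing, non-zero value
      v[!is.na(v) & v == 0] <- m
      v
    }

    d$x <- ave(d$x, d$Site, FUN = replace0)
    d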
2010 Apr 13
2
efficiently picking one row from a data frame per unique key
Hello all, I'm trying to transform data frames by grouping the rows by the values in a particular column, ordered by another column, then picking the first row in each group. I'd like to convert a data frame like this:
  x  y  z
  1 10 20
  1 11 19
  2 12 18
  4 13 17
into one with three rows, like this, where I've discarded one row:
    x  y  z
  1 1 11 19
  2 2 12 18
  4 4 13 17
I've got a
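A common idiom for this (sketch, using the rows shown above): sort by the grouping column and the ordering column, then keep the first row of each group with !duplicated().

    d <- data.frame(x = c(1, 1, 2, 4),
                    y = c(10, 11, 12, 13),
                    z = c(20, 19, 18, 17))

    d <- d[order(d$x, d$z), ]    # order within groups (here by z, ascending)
    d[!duplicated(d$x), ]        # first row per unique x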
2010 Dec 14
1
binding data.frames with sequential names
Hello, I have data frames X1 to X19. I want a simple way to bind them, as the next run(s) will generate many more sequential data frames. I tried the following with i = 19:
  > my.list <- as.list(paste("X",1:i,sep=""))
  > new.data <- do.call("rbind", my.list)
  > new.data
        [,1]
   [1,] "X1"
   [2,] "X2"
   [3,] "X3"
   [4,]
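The list above holds the names "X1".."X19", not the data frames themselves; a possible fix (sketch with three small dummy objects) is to fetch the objects with mget() before rbind-ing.

    X1 <- data.frame(a = 1); X2 <- data.frame(a = 2); X3 <- data.frame(a = 3)

    nms <- paste("X", 1:3, sep = "")
    new.data <- do.call(rbind, mget(nms))   # mget() returns the objects, not their names
    new.data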
2010 Oct 06
3
tapply output
Hello, I am having trouble getting the output from the tapply function formatted so that it can be made into a nice table. Below is my question written in R code. Does anyone have any suggestions? Thank you. Geoff #Input the data; name <- c('Tom', 'Tom', 'Jane', 'Jane', 'Enzo', 'Enzo', 'Mary', 'Mary'); year <- c(2008, 2009,
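A possible direction (sketch; the 'value' column and the pairing of years are invented): with two grouping vectors, tapply() already returns a matrix that prints as a table, and as.data.frame.table() converts it to long form.

    name  <- c("Tom", "Tom", "Jane", "Jane")
    year  <- c(2008, 2009, 2008, 2009)
    value <- c(10, 12, 9, 14)

    tab <- tapply(value, list(name, year), sum)
    tab                                               # rows = name, columns = year
    as.data.frame.table(tab, responseName = "value")  # long format, if preferred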
2008 May 07
6
help with the unique function
Hi, the unique function is easy to understand and use. Beyond that, I also want to get the frequency of repetition of each individual row in a data frame. Let me explain with an example:
  x <- data.frame(a=c(1,2,3,1,2), b=c(2,3,4,2,3), c=c(10,20,30,10,20))
  xu <- unique(x)
We have:
  > x
    a b  c
  1 1 2 10
  2 2 3 20
  3 3 4 30
  4 1 2 10
  5 2 3 20
  > xu
    a b  c
  1 1 2 10
  2 2 3 20
  3 3 4 30
I want to get
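One possible answer (sketch): aggregate a dummy count over all columns of x, which returns each unique row together with its frequency.

    x <- data.frame(a = c(1, 2, 3, 1, 2),
                    b = c(2, 3, 4, 2, 3),
                    c = c(10, 20, 30, 10, 20))

    aggregate(list(freq = rep(1, nrow(x))), by = x, FUN = length)
    ##   a b  c freq
    ## 1 1 2 10    2
    ## 2 2 3 20    2
    ## 3 3 4 30    1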
2010 Sep 10
4
Counting occurances of a letter by a factor
I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame. Ex. > DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L",
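A possible approach (sketch; the column names 'geno' and 'grp' are invented, since the excerpt's data frame is unnamed): split the genotype strings into single letters and tabulate them within each factor level.

    DF <- data.frame(geno = c("CC", "CC", NA, "CG", "GG", "GC"),
                     grp  = c("L", "U", "L", "U", "L", "U"),
                     stringsAsFactors = FALSE)

    ## letter frequencies per group, NA genotypes dropped
    tapply(DF$geno, DF$grp,
           function(s) table(unlist(strsplit(na.omit(s), ""))))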
2010 Mar 09
5
data frame select max group by like function
Hi, I have a data frame with 3 columns: ID, year, and score. How can I select, for each unique ID, the year that has the max score? For example, for the data frame
  ID,   year, score
  tom,  1995, 88
  rick, 1994, 90
  mary, 2000, 97
  tom,  1998, 60
  mary, 1998, 100
I shall have
  ID,   year, score
  tom,  1995, 88
  rick, 1994, 90
  mary, 1998, 100
Thanks, Richard
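A common answer to this kind of question (sketch, using the rows above): keep the rows whose score equals the group-wise maximum computed with ave().

    d <- data.frame(ID    = c("tom", "rick", "mary", "tom", "mary"),
                    year  = c(1995, 1994, 2000, 1998, 1998),
                    score = c(88, 90, 97, 60, 100))

    d[d$score == ave(d$score, d$ID, FUN = max), ]
    ##     ID year score
    ## 1  tom 1995    88
    ## 2 rick 1994    90
    ## 5 mary 1998   100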
2008 Dec 30
3
Componentwise means of a list of matrices?
Dear useRs, I have a list, each entry of which is a matrix of constant dimensions. Is there a good way (i.e., not using a for loop) to apply a mean to each matrix entry *across list entries*? Example: foo <- list(rbind(c(1,2,3),c(4,5,6)),rbind(c(7,8,9),c(10,11,12))) some.sort.of.apply(foo,FUN=mean) I'm looking for a componentwise mean across the two entries of foo, i.e., the
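The usual loop-free answer (sketch): sum the matrices element-wise with Reduce() and divide by the number of list entries.

    foo <- list(rbind(c(1, 2, 3), c(4, 5, 6)),
                rbind(c(7, 8, 9), c(10, 11, 12)))

    Reduce("+", foo) / length(foo)
    ##      [,1] [,2] [,3]
    ## [1,]    4    5    6
    ## [2,]    7    8    9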
2010 Jun 08
1
restructuring "by" output for use with write.table
Hello, vegMeans <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, mean) vegSD <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, sd) write.table(vegMeans, file="A:\\Work_Area\\Steve\\Hydrology_Data\\data\\vegMeans.txt") Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : cannot coerce class '"by"' into a
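A possible workaround (sketch; the dummy columns stand in for SoilVegHydro[3:37]): compute the group means with aggregate(), which already returns a data frame that write.table() accepts.

    SoilVegHydro <- data.frame(Physiogomy = c("bog", "bog", "fen"),
                               depth      = c(10, 12, 8),
                               ph         = c(4.5, 4.7, 6.1))

    vegMeans <- aggregate(SoilVegHydro[c("depth", "ph")],
                          by  = list(Physiogomy = SoilVegHydro$Physiogomy),
                          FUN = mean)
    write.table(vegMeans, file = "vegMeans.txt", row.names = FALSE)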
2010 Nov 09
2
Merging data frames one of which is NULL
Hello! I am running a loop. The result of each run of the loop is a data frame. I am merging all the data frames. For example: The data frame from run 1: x<-data.frame(a=1,b=2,c=3) The data frame from run 2: y<-data.frame(a=10,b=20,d=30) What I want to get is: merge(x,y,all.x=T,all.y=T) Then I want to merge it with the output of the 3rd run, etc. Unfortunately, I can't create the
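One way to handle the NULL case (sketch; the loop body below is a stand-in for the real computation): treat an empty accumulator as "just take the new data frame", otherwise merge with all = TRUE.

    combined <- NULL
    for (run in 1:3) {
      res <- data.frame(a = run, b = run * 2)   # stand-in for the loop's real output
      combined <- if (is.null(combined)) res else merge(combined, res, all = TRUE)
    }
    combined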