thr3ads.net - similar to: "aggregate data.frame based on column class"

Displaying 20 results from an estimated 30000 matches similar to: "aggregate data.frame based on column class"

best way to aggregate / rearrange data.frame with different data types

2011 Jul 11

Hi, I have a data.frame that looks like this: Subject <- c(rep(1,4), rep(2,4), rep(3,4)) y <- rnorm(12, 3, 2) gender <- c(rep("w",4), rep("m",4), rep("w",4)) comment <- c(rep("comment A",4), rep("comment B",4), rep("comment C",4)) data <- data.frame(Subject,y,gender,comment) data Subject y gender

function in aggregate applied to specific columns only

2010 Jan 04

I want to use aggregate with the mean function on specific columns gender <- factor(c("m", "m", "f", "f", "m")) student <- c(0001, 0002, 0003, 0003, 0001) score <- c(50, 60, 70, 65, 60) basicSub <- data.frame(student, gender, score) basicSubMean <- aggregate(basicSub, by=list(basicSub$student), FUN=mean, na.rm=TRUE) This

mean-aggregate – but use unique for factor variables

2012 Sep 25

Hi, I have a data.frame which I want to aggregate. There are some grouping variables and some continuous variables for which I would like to have the mean. However there are also some factor-variables in the data-frame that are not grouping variables and I actually would like to aggregate these variables with the unique() function. Is that possible with the standard aggregate-function? If I

aggregate / collapse big data frame efficiently

2012 Dec 25

Hi, I need to aggregate rows of a data.frame by computing the mean for rows with the same factor-level on one factor-variable; here is the sample code: x <- data.frame(rep(letters,2), rnorm(52), rnorm(52), rnorm(52)) aggregate(x, list(x[,1]), mean) Now my problem is, that the actual data-set is much bigger (120 rows and approximately 100.000 columns), and it takes very very long

sum a particular column by group

2010 Feb 05

Dear all, I have a table like this: > eds R.ID Region Gender Agegr Time nvisits 1 1 A F 60--64 1:00 1 2 2 O F 55--59 1:20 1 3 3 O F 55--59 3:45 3 4 4 S M 60--64 1:10 3 5 5 W F 55--59 12:30 1 6

Help with aggregate syntax for a multi-column function please.

2011 Aug 02

Dear R-experts: I am using a function called AUC whose arguments are data, time, id, and dv. data is the name of the dataframe, time is the independent variable column name, id is the subject id and dv is the dependent variable. The function computes area under the curve by trapezoidal rule, for each subject id. I would like to embed this in aggregate to further subset by each

aggregate function / custom column names?

2010 Feb 11

This question is about column names returned by the aggregate function. Consider the following example df <- data.frame( id = c(rep('11',30),rep('22',30),rep('33',30)), value = c(rnorm(30,2,0.5), rnorm(30,3,0.5), rnorm(30,6,0.5)) ) aggregate(df[,c("value"),drop=FALSE], by=list(id=df$id), max) output: id value 1 11 2.693528 2 22 3.868400 3 33

(Newbie) Aggregate for NA values

2006 Feb 24

Folks, Sorry if this question has been answered before or is obvious (or worse, statistically "bad"). I don't understand what was said in one of the search results that seems somewhat related. I use aggregate to get a quick summary of the data. Part of what I am looking for in the summary is, how much influence might the NA's have had, if they were included, and is excluding

Aggregation using list with Hmisc summarize function

2006 Dec 28

Hi All, I'm using the Hmisc summarize function and used list instead of llist to provide the by variables. It generated an error message. Is this a bug, or do I misunderstand how Hmisc works with lists? The program below demonstrates the error message. Thanks, Bob x<-1:8 group <- c(1,1,1,1,2,2,2,2) gender<- c(1,2,1,2,1,2,1,2) mydata<-data.frame(x,group,gender)

question about the aggregate function with respect to order of levels of grouping elements

2007 Dec 16

Hi, I am using aggregate() to add up groups of data according to year and month. It seems that the function aggregate() automatically sorts the levels of factors of the grouping elements, even if the order of the levels of factors is supplied. I am wondering if this is a bug, or if I missed something important. Below is an example that shows what I mean. Does anyone know if this is just the way

aggregate function with 'NA'

2006 Oct 01

Dear r-help reader, I have some problems with the aggregate function. My data frame looks like

> frame
  Day Time V1 V2
1   M    0  3 NA
2   M    0  4 NA
3   M    0  5  2
4   M    1 NA  4
5   M    1 10  6
6   T    0  4 45
7   T    1  4  3
8   T    1  3  2
9   T    1  6  1

I used the aggregate function to obtain the mean in V1 and V2 over the grouping variable Time and Day

Lattice xyplots plots with multiple lines per cell

2010 Aug 13

Hello, I need to plot the means of some outcome for two groups (control vs intervention) over time (discrete) on the same plot, for various subsets such as gender and grade level. What I have been doing is creating all possible subsets first, using the aggregate function to create the means over time, then plotting the means over time (as a simple line plot with both control & intervention

resultant column names from reshape::cast, with a fun.aggregate vector

2008 Jun 17

try this: scores.melt = data.frame(grade = floor(runif(100, 1,10)), variable = 'score', value = rnorm(100)); cast(scores.melt, grade ~ variable, fun.aggregate = c(mean, length)) it has the nice column names of: grade score_mean score_length 1 1 0.08788535 8 2 2 0.16720313 15 3 3 0.41046299 7 4 4 0.13928356 13 ... but

transforming a .csv file column names as per a particular column rows using R code

2012 Oct 14

Hello all, I have a .csv file like below. Tool,Step_Number,Data1,Data2... etc up to 100 columns. A,1,0,1 A,2,3,1 A,3,2,1 . . B,1,3,2 B,2,1,2 B,3,3,2 . . ...... so on up to 50 rows where the column "*Tool*" has distinct steps in second column "*Step_Number*", but both have same entries in Step_Number column. I want the output like below.

aggregate(...) with multiple functions

2010 Jul 16

hi all - i'm just wondering what sort of code people write to essentially perform an aggregate call, but with different functions being applied to the various columns. for example, if i have a data frame x and would like to marginalize by a factor f for the rows, but apply mean() to col1 and median() to col2. if i wanted to apply mean() to both columns, i would call: aggregate(x, list(f),

ttest in R

2009 Mar 07

Dear list, i am a biologist who needs to do some ttest between disease and non disease, sex, genotype and the serum levels of proteins on a large number of individuals. i have been using excel for a long time but it is very tedious and time consuming. i am posting the data below and ask your help in generating a code to get this analysis done in R. thanks gender disease genotype data M N CC

aggregate(), tapply(): Why is the order of the grouping variables not kept?

2013 Mar 11

Dear expeRts, The question is rather simple: Why does aggregate (or similarly tapply()) not keep the order of the grouping variable(s)? Here is an example: x <- data.frame(group = rep(LETTERS[1:2], each=10), year = rep(rep(2001:2005, each=2), 2), value = rep(1:10, each=2)) ## => sorted according to group, then year aggregate(value ~ group + year, data=x,

column selection for aggregate()

2010 Jan 18

Hi everybody! I'm working on R today so I have a lot of questions (you may have noticed that it's the 3rd email today). I'm new on R, so please excuse the "spam"! I have a dataset "ssfa" with many rows and the column names are: > names(ssfa) [1] "SPECSHOR" "BONE" "TO_POS" "MEASUREM" "FACETTE"

aggregate text column by a few rows

2010 Oct 07

Hi, R function aggregate can only take summary stats functions, can I aggregate text columns? For example, for the dataframe below, > a <- rbind(data.frame(id=1, name='Tom', hobby='fishing'),data.frame(id=1, name='Tom', hobby='reading'),data.frame(id=2, name='Mary', hobby='reading'),data.frame(id=3, name='John',

Assigning variable names from one object to another object

2009 May 24

Hello I have 2 datasets say Data1 and Data2 both are of different dimensions. Data1: 120 rows and 6 columns (Varname, Vartype, Labels, Description, ....) The column Varname has 120 rows which has variable names such as id, age, gender,.....so on Data2: 12528 rows and 120 columns The column names in this case are V1, V2, ......... V120 (which are default names in R when we say head=F in read.csv)
