Displaying 20 results from an estimated 1000 matches similar to: "multivariate graphs, averaging on some vars"
2010 Mar 17
2
Using nrow with summaryBy
Hello Everyone-
I'm calculating summary statistics on a dataset (~4000 records,
observations are not uniformly distributed) using summaryBy and trying
to add a column with the number of observations to the output as well.
What occurs to me is to use nrow(), but this doesn't appear to be working.
I'm able to replicate the same results with an example from the
summaryBy docs:
2011 Jan 17
2
Using summaryBy with weighted data
Dear Soren and R users:
I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows:
library(doBy)
## make up some data
response = rnorm(100)
group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20))
weights = runif(100, 0, 1)
mydata = data.frame(response,group,weights)
## run summaryBy without weights:
2012 Apr 25
4
"Conditional" average
Hello, I have a set of data including age, wage and education level,
called age76, wage76 and grade76 respectively. I want to know how I can
calculate the average wage of people aged 15 to 65 (each year separately),
only for those who have an education level of 10, 12 or 16...
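One possible base R approach to the question above is to filter first and then average by age. This is only a sketch: the data frame name `dat` and the simulated values are hypothetical, and only the column names age76, wage76 and grade76 come from the post.

```r
# Hypothetical example data; only the column names come from the question.
set.seed(1)
dat <- data.frame(
  age76   = sample(15:65, 500, replace = TRUE),
  wage76  = round(runif(500, 5, 40), 2),
  grade76 = sample(8:18, 500, replace = TRUE)
)

# Keep ages 15 to 65 and education levels 10, 12 and 16 only.
sel <- subset(dat, age76 >= 15 & age76 <= 65 & grade76 %in% c(10, 12, 16))

# Mean wage for each year of age within the selected rows.
avg_wage <- aggregate(wage76 ~ age76, data = sel, FUN = mean)
head(avg_wage)
```

Adding grade76 to the right-hand side of the formula (wage76 ~ age76 + grade76) would give a separate average per education level as well.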
2012 Apr 02
2
summaryBy: transformed variable on RHS of formula?
Hi Folks,
I'm trying to cut my data inside the summaryBy function. Perhaps
formulas don't work that way? I'd like to avoid adding another column
if possible, but if I have to, I have to. Any ideas?
Thanks,
Allie
require(doBy)
df = data.frame(a = rnorm(100), b = rnorm(100))
summaryBy(a ~ cut(b, c(-100,-1,1,100)), data=df) # preferred solution, but it throws an
2013 Jan 17
3
how to use "..."
Dear users,
I'm trying to learn how to use the "...".
I have written a function (simplified here) that uses doBy::summaryBy():
# 'dat' is a data.frame from which the aggregation is computed
# 'vec_cat' is an integer vector defining which columns of the data.frame
should be used on the right side of the formula
# 'stat_fun' is the function that will be run to
2010 Feb 22
2
how do I calculate means or cov matrix for multivariate groups
Hello,
Having the matrix d
> d
value value2 class
1 1 1 x
2 2 2 x
3 3 3 x
4 4 2 x
5 5 1 y
6 11 3 y
7 12 4 z
8 13 5 z
9 14 6 z
10 15 7 z
I want to calculate the means and cov matrix for groups x,y,z.
I know how to do it the long way.
I tried to use tapply and
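A base R sketch of one way to get per-group means and covariance matrices for the data shown above. It assumes `d` is a data frame (the class column is character, so it cannot be a numeric matrix); the names `groups`, `group_means` and `group_covs` are made up for illustration.

```r
# The data from the question, entered as a data frame.
d <- data.frame(
  value  = c(1, 2, 3, 4, 5, 11, 12, 13, 14, 15),
  value2 = c(1, 2, 3, 2, 1, 3, 4, 5, 6, 7),
  class  = c("x", "x", "x", "x", "y", "y", "z", "z", "z", "z")
)

# Split the numeric columns by class, then summarise each piece.
groups <- split(d[, c("value", "value2")], d$class)
group_means <- lapply(groups, colMeans)  # one mean vector per class
group_covs  <- lapply(groups, cov)       # one 2 x 2 covariance matrix per class

print(group_means)
print(group_covs)
```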
2006 Dec 05
1
summaryBy(): Is it the best option?
Hi,
since I have quite large tables and the processing takes quite a while I am
curious if I can improve the performance of this aggregation somehow: At the
moment I am using summaryBy from the doBy package under R 2.4.0, Win2K.
summaryBy(soc_s6aq5 + soc_s6aq7 + soc_s6aq9 + soc_s6aq11 ~ hh + comgroup,
          soc6a, postfix=c("","","",""), FUN=sum, na.rm=T)
The
2010 May 07
4
Any way to apply TWO functions with tapply()?
I need to compute the mean and the standard deviation of a data set and would
like to have the results in one table/data frame. I call tapply() two times
and then merge the resulting tables to have them all in one table. Is
there any way to tell tapply() to use the functions mean and sd within one
function call? Something like tapply(data$response, list(data$targets,
data$conditions), c(mean,
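As a sketch of how both statistics could come out of a single call, aggregate() can be given a function that returns a named vector. This swaps aggregate() in for tapply(); the simulated values and the factor levels are hypothetical, and only the column names response, targets and conditions follow the question.

```r
# Hypothetical data; the column names mirror those in the question.
set.seed(42)
data <- data.frame(
  response   = rnorm(40),
  targets    = rep(c("t1", "t2"), each = 20),
  conditions = rep(c("c1", "c2"), times = 20)
)

# One call: the summary function returns both statistics as a named vector.
res <- aggregate(response ~ targets + conditions, data = data,
                 FUN = function(x) c(mean = mean(x), sd = sd(x)))

# aggregate() stores the two statistics in a matrix column; flatten it
# into ordinary columns named response.mean and response.sd.
res <- do.call(data.frame, res)
print(res)
```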
2007 Feb 15
1
Problem in summaryBy
The R script below gives values of 1 for all minimum values when I use a
custom function in summaryBy. I get the correct values when I use FUN=min
directly. Any help is much appreciated.
The continuous information provided in this forum is fabulous as are the
different R packages available.
Rene
# Simulated simplified data
Subj <- rep(1:4, each=6)
Analyte <-
2006 Jul 10
1
Counting observations split by a factor when there are NAs in the data
I am a very novice R user, a social scientist (linguist) who is trying
to learn to use R after being very familiar with SPSS. Please be kind!
My concern:
I cannot figure out a way to get an accurate count of observations of
one column of data split by a factor when there are NAs in the data.
I know how to use commands like tapply and summaryBy to obtain other
summary statistics I am interested
2007 Aug 20
1
Problem with summaryBy: Group sums give me "incorrectly" zero for one variable
Hi,
first I want to thank all of you for the quick help
that is always provided here on the list.
Thanks a lot for that!
Then, I have a problem using summaryBy, which is most
probably caused by my using it incorrectly or the like.
I use this command:
summaryBy(total+total.inf~gr, aE, FUN=sum)
where aE is a
> str(aE)
'data.frame': 127880 obs. of 16 variables:
$ gr
2009 Sep 04
1
Apparent bug in summaryBy (PR#13941)
Full_Name: Marc Paterno
Version: 2.9.2
OS: Mac OS X 10.5.8
Submission from: (NULL) (99.53.212.55)
summaryBy() produces incorrect results when given some data frames. Below is a
transcript of a session showing the result, in a data frame with 2 observations
of 2 variables.
-------------------
thomas:999 paterno$ R --vanilla
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for
2010 Jul 16
2
aggregate(...) with multiple functions
hi all - i'm just wondering what sort of code people write to
essentially perform an aggregate call, but with different functions
being applied to the various columns.
for example, if i have a data frame x and would like to marginalize by
a factor f for the rows, but apply mean() to col1 and median() to
col2.
if i wanted to apply mean() to both columns, i would call:
aggregate(x, list(f),
2006 Jun 12
3
Group averages
Hello:
I hope none of you will mind helping a newbie. I'm a student research
assistant working with a large data set in which observations are
categorized according to two factors. I'm trying to calculate the group
mean and variance of a variable (called 'hsgpa' in the example data
presented below) for each observation, excluding that observation. For
example, if there are
2007 Oct 30
2
flexible processing
Hello,
unfortunately, I don't know a better subject. I would like to be very flexible
in how to process my data.
Assume the following dataset:
par1 <- seq(0,1,length.out = 100)
par2 <- seq(1,100)
fac1 <- factor(rep(c("group1", "group2"), each = 50))
fac2 <- factor(rep(c("group3", "group4", "group5", "group6"), each =
2011 Jul 28
3
Data aggregation question
Hi all,
I'm working with a sizable dataset that I'd like to summarize, but I
can't find a tool or function that will do quite what I'd like. Basically,
I'd like to summarize the data by fully crossing three variables and getting
a count of the number of observations for every level of that 3-way
interaction. For example, if factors A, B, and C each have 3 levels (all of
2011 Feb 01
2
Problems with sample means and standard deviations
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110201/fe2362c4/attachment.pl>
2009 Apr 07
4
group by-like statement for 2-row matrix
Hi,
my problem is as follows:
I have a matrix of two rows like this:
2 2 3 4 4 4 5 5 6
1 1 2 1 3 3 2 1 1
Can I apply something like "group by" in sql? What I want to achieve
is the sum of the second row for each unique entry of the first row:
2 -> 2 (=1+1)
3 -> 2
4 -> 7 (=1+3+3)
5 -> 3 (=2+1)
6 -> 1
Thanks!!
Henning
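A minimal base R sketch of the grouping described above, using tapply() with the first row as the grouping key. The matrix name `m` and the result name `sums` are made up for illustration.

```r
# The two-row matrix from the question.
m <- rbind(c(2, 2, 3, 4, 4, 4, 5, 5, 6),
           c(1, 1, 2, 1, 3, 3, 2, 1, 1))

# Sum the second row within each unique value of the first row.
sums <- tapply(m[2, ], m[1, ], sum)
print(sums)
#> 2 3 4 5 6
#> 2 2 7 3 1
```

The result is a named vector whose names are the unique entries of the first row, so sums["4"] gives 7.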
2006 Feb 17
1
Transforming results of the summary function
Hi all,
I have a question about transforming the data from summary function.
Let's say I have a data frame like this:
> x = data.frame(a = c(rep("lev1", 5), rep("lev2", 5)), b = c(rnorm(5)+2, rnorm(5)))
> x
a b
1 lev1 1.5964765
2 lev1 2.2945609
3 lev1 3.5285787
4 lev1 1.4439838
5 lev1 2.2948826
6 lev2 1.7063506
7 lev2 -0.4042742
8 lev2
2009 Dec 08
6
conditionally merging adjacent rows in a data frame
Hi, I have a data frame and want to merge adjacent rows if some condition is
met. There's an obvious solution using a loop but it is prohibitively slow
because my data frame is large. Is there an efficient canonical solution for
that?
> head(d)
rt dur tid mood roi x
55 5523 200 4 subj 9 5
56 5523 52 4 subj 7 31
57 5523 209 4 subj 4 9
58 5523 188 4 subj 4 7