Displaying 20 results from an estimated 1100 matches similar to: "Using nrow with summaryBy"
2011 Jun 24
0
lag and diff with transformBy
I have a question regarding the very useful doBy package, and
specifically, the transformBy() function with the lag() and diff()
functions. It is often useful to lag or difference data within a
panel, i.e., within a by-group. Is the following code a safe use of
transformBy? Is there an alternative?
First, does the initial "orderBy" statement guarantee that the Time
order will be
2011 Jan 17
2
Using summaryBy with weighted data
Dear Soren and R users:
I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows:
library(doBy)
## make up some data
response = rnorm(100)
group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20))
weights = runif(100, 0, 1)
mydata = data.frame(response,group,weights)
## run summaryBy without weights:
2012 Apr 02
2
summaryBy: transformed variable on RHS of formula?
Hi Folks,
I'm trying to cut my data inside the summaryBy function. Perhaps
formulas don't work that way? I'd like to avoid adding another column
if possible, but if I have to, I have to. Any ideas?
Thanks,
Allie
require(doBy)
df = data.frame(a = rnorm(100), b = rnorm(100))
summaryBy(a ~ cut(b,c(-100,-1,1,100)), data=df) # preferred solution, but it throws an
2006 Dec 05
1
summaryBy(): Is it the best option?
Hi,
since I have quite large tables and the processing takes quite a while, I am
curious whether I can improve the performance of this aggregation somehow. At the
moment I am using summaryBy from the doBy package under R 2.4.0, Win2K.
summaryBy(soc_s6aq5 + soc_s6aq7 + soc_s6aq9 + soc_s6aq11 ~ hh + comgroup,
          soc6a, postfix=c("","","",""), FUN=sum, na.rm=TRUE)
The
2007 Feb 15
1
Problem in summaryBy
The R script below gives values of 1 for all minimum values when I use a
custom function in summaryBy. I get the correct values when I use FUN=min
directly. Any help is much appreciated.
The continuous information provided in this forum is fabulous as are the
different R packages available.
Rene
# Simulated simplified data
Subj <- rep(1:4, each=6)
Analyte <-
2009 Sep 04
1
Apparent bug in summaryBy (PR#13941)
Full_Name: Marc Paterno
Version: 2.9.2
OS: Mac OS X 10.5.8
Submission from: (NULL) (99.53.212.55)
summaryBy() produces incorrect results when given some data frames. Below is a
transcript of a session showing the result, in a data frame with 2 observations
of 2 variables.
-------------------
thomas:999 paterno$ R --vanilla
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for
2007 Aug 20
1
Problem with summaryBy: Group sums give me "incorrectly" zero for one variable
Hi,
first I want to thank all of you for the quick help
that is provided here on the list at all times.
Thanks a lot for that!
Then, I have a problem using summaryBy, which is most
probably due to incorrect use on my part or the like:
I use this command:
summaryBy(total+total.inf~gr, aE, FUN=sum)
where aE is a
> str(aE)
'data.frame': 127880 obs. of 16 variables:
$ gr
2013 Jan 17
3
how to use "..."
Dear users,
I'm trying to learn how to use the "...".
I have written a function (simplified here) that uses doBy::summaryBy():
# 'dat' is a data.frame from which the aggregation is computed
# 'vec_cat' is an integer vector defining which columns of the data.frame
# should be used on the right side of the formula
# 'stat_fun' is the function that will be run to
2012 May 15
0
Indexing in summaryBy
I'm trying to use a self-written function with the summaryBy function (doBy
package).
I have lots of data from Monte Carlo experiments comparing different
estimators across different (combinations of) parameter values, similar to
the following form:
colnames(mydata) <- c("X", "b0", "b1", # parameter combination,
                      # corresponding (true) parameter values
2005 Jul 12
4
Calculation of group summaries
I know R has a steep learning curve, but from where I stand the slope
looks like a sheer cliff. I'm pawing through the available docs and
have come across examples which come close to what I want but are
proving difficult for me to modify for my use.
Calculating simple group means is fairly straightforward:
data(PlantGrowth)
attach(PlantGrowth)
stack(mean(unstack(PlantGrowth)))
2010 Feb 10
1
using step() with package geepack
I'm using the package geepack to fit GEE models.
Does anyone know of methods for add1 and drop1 for a 'geeglm' model object, or perhaps a method for extractAIC based on the QIC of Pan 2001? I see there has been some mention of this on R-help a few years ago (RSiteSearch("QIC")).
The package does provide an anova method for its model objects, and update() seems to work:
2004 Oct 08
1
Bug in nlme under version 2.0.0
Dear all,
Under version 2.0.0, I get the error below when calling summary() on a lme-object, whereas it works under version 1.9.1 (well, it did last week, before I upgraded). Any help on this?
Thx in advance
Søren
> library(nlme)
> mf <- formula(Weight~Cu*(Time+I(Time^2)+I(Time^3)))
> lme1 <- lme(mf, data = dietox, random=~1|Pig)
> summary(lme1)
Linear mixed-effects model fit
2010 Feb 08
1
Follow-up Question: data frames; matching/merging
Wow.. thanks for the deluge of responses!
Aggregate seems like the way to go here.
But, suppose that instead of integers in column V2, I actually have
dates (and instead of keeping the minimum integer, I want to keep the
earliest date):
> df =
2011 Nov 18
1
counting events by subject with "black out" windows
I have a large dataset that includes subjects (ID), dates, and events that need to be counted. Not every date includes an event, and I need to count only one event per 30 days, per subject. So in essence, I need to create a 30-day "black out" period during which an event cannot be "counted" for each subject. The reason is that a rule has been set up, whereby a subject can only be
2006 Jan 13
2
Saving data in an R package - how to maintain that a variable is a 'factor' when it is coded as 1, 2, 3...
I have a .txt file obtained by saving a data frame in which the first four columns are factors (but represented as 1,2,3 etc). The first four lines are
"Pig" "Evit" "Cu" "Litter" "Start" "Weight" "Feed" "Time"
"4601" "1" "1" "1" 26.5 26.5 NA 1
"4601" "1"
2007 Aug 31
3
data frame row manipulation
Hello,
struggling with the very basic needs... :( any help appreciated.
# using the package doBy
# who drinks how much beer per day and therefore cannot calculate row-wise
# maxvals
evaluation=data.frame(date=c(1,2,3,4,5,6,7,8,9),
name=c("Michael","Steve","Bob",
"Michael","Steve","Bob","Michael","Steve","Bob"),
2007 Dec 26
1
data.frame - how to calculate the number of rows
Hello,
it seems to be a simple problem, but I couldn't find an answer in the
archive. (I think it must have something to do with a group select, like in
PHP.)
I've the following data.frame:
A B C
1 3 6 5
2 4 4 20
3 5 8 2
I want to get the number of the
2006 Jul 10
1
Counting observations split by a factor when there are NAs in the data
I am a very novice R user, a social scientist (linguist) who is trying
to learn to use R after being very familiar with SPSS. Please be kind!
My concern:
I cannot figure out a way to get an accurate count of observations of
one column of data split by a factor when there are NAs in the data.
I know how to use commands like tapply and summaryBy to obtain other
summary statistics I am interested
2010 May 07
4
Any way to apply TWO functions with tapply()?
I need to compute the mean and the standard deviation of a data set and would
like to have the results in one table/data frame. I call tapply() twice
and then merge the resulting tables to have them all in one table. Is
there any way to tell tapply() to use the functions mean and sd within one
function call? Something like tapply(data$response, list(data$targets,
data$conditions), c(mean,
2007 Aug 27
1
Column naming mystery
Hi,
I hope somebody can help me explain what seems
mysterious to me.
I use this line on a dataframe ae:
summaryBy(total_inflated+total~gr1, data=ae, FUN=sum,
na.rm=T)
and it returns 3 columns as expected, and columns "gr1"
and "total_inflated.sum" are correct, but the
"total.sum" column consists of only zeros, which is not
correct. The same happens when I rename the