thr3ads.net - similar to: "Problem mit summaryBy: Group sums gives me "incorrectly" zero for one variable"

Displaying 20 results from an estimated 700 matches similar to: "Problem mit summaryBy: Group sums gives me "incorrectly" zero for one variable"

Using summaryBy with weighted data

2011 Jan 17

Dear Soren and R users: I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows: library(doBy) ## make up some data response = rnorm(100) group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20)) weights = runif(100, 0, 1) mydata = data.frame(response,group,weights) ## run summaryBy without weights:
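The snippet above is cut off before the weighted call, so the following is only a sketch of one way to get weighted group means. It uses base R (`split()`, `sapply()`, `weighted.mean()`) rather than summaryBy. The data mirrors the made-up example in the post; `set.seed()` and the name `weighted_means` are assumptions added for illustration.

```r
# Made-up data in the same shape as the post (the seed is an added assumption).
set.seed(1)
mydata <- data.frame(
  response = rnorm(100),
  group    = rep(1:5, each = 20),
  weights  = runif(100, 0, 1)
)

# Split the rows by group, then compute a weighted mean within each piece.
weighted_means <- sapply(
  split(mydata, mydata$group),
  function(d) weighted.mean(d$response, d$weights)
)
weighted_means
```

Splitting the whole data frame keeps each response value paired with its weight, which is the part that a function applied to a single column cannot see.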

Using nrow with summaryBy

2010 Mar 17

Hello Everyone- I'm calculating summary statistics on a dataset (~4000 records, observations are not uniformly distributed) using summaryBy and trying to add a column with the number of observations to the output as well. What occurs to me is to use nrow(), but this doesn't appear to be working. I'm able to replicate the same results with an example from the summaryBy docs:

Indexing in summaryBy

2012 May 15

I'm trying to use a self-written function with the summaryBy function (doBy package). I have lots of data from Monte Carlo experiments comparing different estimators across different (combinations of) parameter values, similar to the following form: colnames(mydata) <- c("X", "b0", "b1", # parameter combination, corresponding (true) parameter values

summaryBy: transformed variable on RHS of formula?

2012 Apr 02

Hi Folks, I'm trying to cut my data inside the summaryBy function. Perhaps formulas don't work that way? I'd like to avoid adding another column if possible, but if I have to, I have to. Any ideas? Thanks, Allie require(doBy) df = dataframe(a <- rnorm(100), b <-rnorm(100)) summaryBy(a ~ cut(b,c(-100,-1,1,100)), data=df) # preferred solution, but it throws an
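The snippet is truncated before the error message, so this is only a sketch of the fallback the poster mentions: build the binned variable first, then summarise by it. It uses base R `tapply()` rather than summaryBy, and the names `b_bin` and `bin_means` are hypothetical. Note that the base R constructor is `data.frame()`; `dataframe()` as written in the post is not a base R function.

```r
# Made-up data; the seed is an added assumption for reproducibility.
set.seed(42)
df <- data.frame(a = rnorm(100), b = rnorm(100))

# Bin b with the same break points as in the post.
b_bin <- cut(df$b, c(-100, -1, 1, 100))

# Mean of a within each bin of b.
bin_means <- tapply(df$a, b_bin, mean)
bin_means
```

Because `b_bin` is a free-standing vector, no extra column is added to `df`.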

Problem in summaryBy

2007 Feb 15

The R script below gives values of 1 for all minimum values when I use a custom function in summaryBy. I get the correct values when I use FUN=min directly. Any help is much appreciated. The continuous information provided in this forum is fabulous as are the different R packages available. Rene # Simulated simplified data Subj <- rep(1:4, each=6) Analyte <-

summaryBy(): Is it the best option?

2006 Dec 05

Hi, since I have quite large tables and the processing takes quite a while I am curious if I can improve the performance of this aggregation somehow: At the moment I am using summaryBy from the doBy package under R 2.4.0, Win2K. summaryBy(soc_s6aq5 + soc_s6aq7 + soc_s6aq9 + soc_s6aq11 ~ hh + comgroup,soc6a,postfix=c("","","",""),FUN=sum, na.rm=T) The

Apparent bug in summaryBy (PR#13941)

2009 Sep 04

Full_Name: Marc Paterno Version: 2.9.2 OS: Mac OS X 10.5.8 Submission from: (NULL) (99.53.212.55) summaryBy() produces incorrect results when given some data frames. Below is a transcript of a session showing the result, in a data frame with 2 observations of 2 variables. ------------------- thomas:999 paterno$ R --vanilla R version 2.9.2 (2009-08-24) Copyright (C) 2009 The R Foundation for

Column naming mystery

2007 Aug 27

Hi, I hope somebody could help me explain what seems mysterious to me. I use this line on a data frame ae: summaryBy(total_inflated+total~gr1, data=ae, FUN=sum, na.rm=T) and it returns 3 columns as expected, and columns "gr1" and "total_inflated.sum" are correct, but the "total.sum" column consists of only zeros, which is not correct. The same happens when I rename the

Counting observations split by a factor when there are NAs in the data

2006 Jul 10

I am a very novice R user, a social scientist (linguist) who is trying to learn to use R after being very familiar with SPSS. Please be kind! My concern: I cannot figure out a way to get an accurate count of observations of one column of data split by a factor when there are NAs in the data. I know how to use commands like tapply and summaryBy to obtain other summary statistics I am interested

how to use "..."

2013 Jan 17

Dear users, I'm trying to learn how to use the "...". I have written a function (simplified here) that uses doBy::summaryBy(): # 'dat' is a data.frame from which the aggregation is computed # 'vec_cat' is an integer vector defining which columns of the data.frame should be used on the right side of the formula # 'stat_fun' is the function that will be run to

Counting observations split by a factor when there are NA s in the data

2006 Jul 10

Wouldn't something like table(status) give you what you want? E.g.: R> status <- factor(c("A", "B", "A", NA, "A", "B")) R> table(status) status A B 3 2 Andy From: Jenifer Larson-Hall > > I am a very novice R user, a social scientist (linguist) who > is trying to learn to use R after being very familiar with >
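`table(status)` counts rows per factor level. If the NAs are in the measured column instead of the factor, one base R option is to count the non-missing values within each group. A minimal sketch, where the `score` vector and the name `n_obs` are made up for illustration:

```r
# Hypothetical data: one NA in the measured column, none in the factor.
status <- factor(c("A", "B", "A", "B", "A", "B"))
score  <- c(10, NA, 12, 7, 11, 9)

# Count the non-missing scores within each level of status.
n_obs <- tapply(score, status, function(v) sum(!is.na(v)))
n_obs
```

Here group A has three observed scores and group B has two, because one of B's scores is NA.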

Aggregating a data frame (was: Re: new R-user needs help)

2006 Oct 18

Please use an informative subject for the sake of the archives. Here are several solutions: aggregate(DF[4:8], DF[2], mean) library(doBy) summaryBy(x1 + x2 + x3 + x4 + x5 ~ name, DF, FUN = mean) # if Exp, name and id columns are factors then this can be reduced to library(doBy) summaryBy(. ~ name, DF, FUN = mean) library(reshape) cast(melt(DF, id = 1:3), name ~ variable, fun = mean) On

counting events by subject with "black out" windows

2011 Nov 18

I have a large dataset that includes subjects (ID), dates and events that need to be counted. Not every date includes an event, and I need to count only one event per 30 days, per subject. So in essence, I need to create a 30-day "black out" period during which time an event cannot be "counted" for each subject. The reason is that a rule has been set up, whereby a subject can only be

Follow-up Question: data frames; matching/merging

2010 Feb 08

Wow.. thanks for the deluge of responses! Aggregate seems like the way to go here. But, suppose that instead of integers in column V2, I actually have dates (and instead of keeping the minimum integer, I want to keep the earliest date): > df =

LTO, ifuncs, and lld

2019 Jan 09

It's at this point where I think about filing a full bug report with llvm. Any hints before I do? On Mon, Jan 07, 2019 at 04:00:02PM -0500, Shawn Webb wrote: > It looks like this commit breaks CSU initialization with > statically-compiled applications. > > With a very simple application at [1], compiled with: > cc -g -O0 -flto -static -o pid pid.c > > The application

LTO, ifuncs, and lld

2018 Dec 01

Thanks for providing the patch! I got around to testing it this morning and it appears it fixes compilation, but produces a non-working system. I know that's kinda vague and I'll have more details soon, including sample binaries. I at least wanted to give a status update so you didn't think you were being ignored. Thanks, -- Shawn Webb Cofounder and Security Engineer HardenedBSD

function which can apply a function by a grouping variable and also hand over an additional variable, e.g. a weight

2010 Oct 01

Hi, I was wondering if there is an easy way to accomplish the following in R: Often I want to apply a function, e.g. weighted.quantile from the Hmisc package to grouped subsets of a data.frame (grouping variable) but then I also need to hand over the weights which seems not possible with summaryBy or aggregate or the like. Is there a function to do this? Currently I do this with loops but it

data.frame - how to calculate the number of rows

2007 Dec 26

Hello, it seems to be a simple problem, but I couldn't find an answer in the archive. (I think it must have something to do with the group-select, like in PHP.) I have the following data.frame:

  A B C
1 3 6 5
2 4 4 20
3 5 8 2

I want to get the number of the

LTO, ifuncs, and lld

2018 Nov 29

Hey Peter, Here you go! https://hardenedbsd.org/~shawn/2018-11-28_reproduce-01.tar Thanks, -- Shawn Webb Cofounder and Security Engineer HardenedBSD Tor-ified Signal: +1 443-546-8752 Tor+XMPP+OTR: lattera at is.a.hacker.sx GPG Key ID: 0x6A84658F52456EEE GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89 3D9E 6A84 658F 5245 6EEE On Wed, Nov 28, 2018 at 05:30:57PM -0800, Peter

Data aggregation question

2011 Jul 28

Hi all, I'm working with a sizable dataset that I'd like to summarize, but I can't find a tool or function that will do quite what I'd like. Basically, I'd like to summarize the data by fully crossing three variables and getting a count of the number of observations for every level of that 3-way interaction. For example, if factors A, B, and C each have 3 levels (all of
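The question is truncated, but the task it describes (a count for every cell of a fully crossed three-way classification, including empty cells) can be sketched in base R with `table()` and `as.data.frame()`. The data frame `d`, its levels, and the name `counts` are made up for illustration.

```r
# Hypothetical data: three factors with three levels each, only a few rows observed.
d <- data.frame(
  A = factor(c("a1", "a1", "a2"), levels = c("a1", "a2", "a3")),
  B = factor(c("b1", "b1", "b3"), levels = c("b1", "b2", "b3")),
  C = factor(c("c1", "c1", "c2"), levels = c("c1", "c2", "c3"))
)

# table() crosses all factor levels; as.data.frame() turns the result into one
# row per combination, with the count in Freq (0 for unobserved combinations).
counts <- as.data.frame(table(A = d$A, B = d$B, C = d$C))
head(counts)
```

Because the crossing is driven by the factor levels and not by the observed rows, combinations that never occur still appear with a count of zero.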
