thr3ads.net - similar to: "summaryBy: transformed variable on RHS of formula?"

Displaying 20 results from an estimated 5000 matches similar to: "summaryBy: transformed variable on RHS of formula?"

2010 Mar 17

Using nrow with summaryBy

Hello Everyone- I'm calculating summary statistics on a dataset (~4000 records, observations are not uniformly distributed) using summaryBy and trying to add a column with the number of observations to the output as well. What occurs to me is to use nrow(), but this doesn't appear to be working. I'm able to replicate the same results with an example from the summaryBy docs:

2011 Jan 17

Using summaryBy with weighted data

Dear Soren and R users: I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows: library(doBy) ## make up some data response = rnorm(100) group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20)) weights = runif(100, 0, 1) mydata = data.frame(response,group,weights) ## run summaryBy without weights:

2009 Sep 04

Apparent bug in summaryBy (PR#13941)

Full_Name: Marc Paterno Version: 2.9.2 OS: Mac OS X 10.5.8 Submission from: (NULL) (99.53.212.55) summaryBy() produces incorrect results when given some data frames. Below is a transcript of a session showing the result, in a data frame with 2 observations of 2 variables. ------------------- thomas:999 paterno$ R --vanilla R version 2.9.2 (2009-08-24) Copyright (C) 2009 The R Foundation for

2006 Dec 05

summaryBy(): Is it the best option?

Hi, since I have quite large tables and the processing takes quite a while I am curious if I can improve the performance of this aggregation somehow: At the moment I am using summaryBy from the doBy package under R 2.4.0, Win2K. summaryBy(soc_s6aq5 + soc_s6aq7 + soc_s6aq9 + soc_s6aq11 ~ hh + comgroup,soc6a,postfix=c("","","",""),FUN=sum, na.rm=T) The

2007 Feb 15

Problem in summaryBy

The R script below gives values of 1 for all minimum values when I use a custom function in summaryBy. I get the correct values when I use FUN=min directly. Any help is much appreciated. The continuous information provided in this forum is fabulous as are the different R packages available. Rene # Simulated simplified data Subj <- rep(1:4, each=6) Analyte <-

2012 Nov 07

extract indep vars from formula

Hello, I'm trying to extract the independent variables from a formula. The closest I've been able to come, aside from rolling my own, is the following: > a = y ~ b * x > attr(terms(formula(a)),"variables") The reason I'm doing this is that I'm building a grid of points that I use to construct a 3-d model prediction surface in rgl. If there are more than two

2013 Jan 17

how to use "..."

Dear users, I'm trying to learn how to use the "...". I have written a function (simplified here) that uses doBy::summaryBy(): # 'dat' is a data.frame from which the aggregation is computed # 'vec_cat' is an integer vector defining which columns of the data.frame should be used on the right side of the formula # 'stat_fun' is the function that will be run to

2009 Aug 12

Symbolic references - passing variable names into functions

Hello All, I am trying to write a function which would operate on columns of a dataframe specified in parameters passed to that function. f = function(dataf, col1 = "column1", col2 = "column2") { dataf$col1 = dataf$col2 # just as an example } The above, of course, does not work as intended. In some languages one can force evaluation of a variable, and then

2012 Sep 13

remove all terms with interaction factor in formula

Hi Folks, I'm trying to find a way to remove all terms in a formula that contain a particular interaction. For example, in the formula below, I'd like to remove all terms that contain the b:c interaction. > attributes(terms( ~ a*b*c*d))$term.labels [1] "a" "b" "c" "d" "a:b" "a:c" [7]

2013 Mar 20

summarize dataframe based on multiple cols, not their combinations

Hi folks, I'm trying to figure out how to get summarized data based on multiple columns. However, instead of giving summaries for every combination of categorical columns, I want it for each value of each categorical column regardless of the other columns. I could do this with three different commands, but I'm wondering if there's a more elegant way that I'm missing. Thanks!

2011 Jun 05

reproduction archives

Hello Folks, As some of my old code broke when an updated package changed its interface, I started thinking about reproduction of analyses. It's not good enough to save our code - we have to save the package versions those analyses used as well as the R-core. I saw a couple references to "reproduction archives" around, but nothing specific. Is there any good way to package up

2012 Oct 18

bigmemory for dataframes?

Hi Folks, I've been bumping my head against the 4GB limit for 32-bit R. I can't go to 64-bit R due to package compatibility issues (RODBC - possible but painful, xlsReadWrite - not possible, and others). I have a number of big dataframes whose columns hold all sorts of data types - factor, character, integer, etc. I run and save models that keep copies of the modeled data inside the model

2012 Nov 02

backreferences in gregexpr

Hi Folks, I'm trying to extract just the backreferences from a regex. > temp = "abcd1234abcd1234" > regmatches(temp, gregexpr("(?:abcd)(1234)", temp)) [[1]] [1] "abcd1234" "abcd1234" What I would like is: [1] "1234" "1234" Note: I know I can just match 1234 here, but the actual example is complicated enough that I have to

2010 Nov 17

translate vector of numbers to indices of 0/1 matrix

Hello All, Searched around, haven't found a decent solution. I'd like to translate a vector of numbers to a matrix (or to a list of vectors) such that the vector values would serve as indices of the 1's in an otherwise-zero-filled matrix (or list of vectors). For example: > a = c(1,3,3,4) # perform operation [,1] [,2] [,3] [,4] [1,] 1 0 0 0 [2,] 0 0 1

2011 Jun 22

strange date problem - May 3, 1992 is NA

> is.na(strptime("5/2/1992", format="%m/%d/%Y")) [1] FALSE > is.na(strptime("5/3/1992", format="%m/%d/%Y")) [1] TRUE Any idea what's going on with this? Running strptime against all dates from around 1946, only 5/3/1992 was converted as "NA". Even stranger, it still seems to have a value associated with it (even though is.na thinks

2012 Apr 07

How do Sweave users collaborate with Word users?

Hello All, I'm getting my workflow switched over to Sweave, which is very cool. However, I collaborate with folks (as many of you must as well) who use Word to Track Changes amongst a group while crafting a paper. In the simplest case, there will just be two people (one Sweave user and one Word user) editing a paper. I'm wondering, how do Sweave users go about this? I could convert a

2012 May 21

sweave tables as images?

Hello folks, I've been on a journey trying to figure out how to manage documents that are amenable to sharing and editing, but that contain dynamic content generated by R. I've come to the following solution: I use Sweave to generate labeled png & pdf figures, and I "Insert & Link" those figures as "Pictures" in a Word 2010 doc. Thus, when data or code

2007 Aug 20

Problem with summaryBy: Group sums gives me "incorrectly" zero for one variable

Hi, first I want to thank all of you for the quick aid which is provided here on the list during all times. Thanks a lot for that! Then, I have a problem using summaryBy which most probably is a problem of wrong use by me or the like: I use this command: summaryBy(total+total.inf~gr, aE, FUN=sum) where aE is a > str(aE) 'data.frame': 127880 obs. of 16 variables: $ gr

2012 Mar 07

dot products

Hello, I need to take a dot product of each row of a dataframe and a vector. The number of columns will be dynamic. The way I've been doing it so far is contorted. Is there a better way? dotproduct <- function(dataf, v2) { apply(t(t(as.matrix(dataf)) * v2),1,sum) #contorted! } df = data.frame(a=c(1,2,3),b=c(4,5,6)) vec = c(4,5) dotproduct(df, vec) thanks,

2012 May 15

Indexing in summaryBy

I'm trying to use a self-written function with the summaryBy function (doBy package). I have lots of data from Monte Carlo experiments comparing different estimators across different (combinations of) parameter values, similar to the following form: colnames(mydata) <- c("X", "b0", "b1", # parameter combination, corresponding (true) parameter values