similar to: summaryBy: transformed variable on RHS of formula?

2010 Mar 17
2
Using nrow with summaryBy
Hello Everyone- I'm calculating summary statistics on a dataset (~4000 records, observations are not uniformly distributed) using summaryBy and trying to add a column with the number of observations to the output as well. What occurs to me is to use nrow(), but this doesn't appear to be working. I'm able to replicate the same results with an example from the summaryBy docs:
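A minimal sketch of per-group counts, written in base R rather than doBy so that it runs without extra packages. The data frame and the column names `grp` and `x` are made up for illustration. It assumes the summary function is handed one group's values as a plain vector, for which nrow() returns NULL and length() returns the count.

```r
# Hypothetical data: 'grp' and 'x' are illustrative names, not from the post.
dat <- data.frame(grp = c("a", "a", "b", "b", "b"),
                  x   = c(1, 3, 2, 4, 6))

# Each summary function receives one group's values as a vector,
# so length() yields the number of observations in that group.
means  <- aggregate(x ~ grp, data = dat, FUN = mean)
counts <- aggregate(x ~ grp, data = dat, FUN = length)

# Combine mean and count into one table; the clashing 'x' columns
# are renamed x.mean and x.n by the suffixes argument.
res <- merge(means, counts, by = "grp", suffixes = c(".mean", ".n"))
print(res)
```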
2011 Jan 17
2
Using summaryBy with weighted data
Dear Soren and R users: I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows: library(doBy) ## make up some data response = rnorm(100) group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20)) weights = runif(100, 0, 1) mydata = data.frame(response,group,weights) ## run summaryBy without weights:
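One possible way to get weighted group means without relying on summaryBy, sketched in base R. The data mirror the made-up example in the post; the seed is an addition here, purely for reproducibility. Whether summaryBy itself can take weights is not settled by this sketch.

```r
set.seed(1)  # added for reproducibility; not part of the original post
mydata <- data.frame(response = rnorm(100),
                     group    = rep(1:5, each = 20),
                     weights  = runif(100, 0, 1))

# Split the rows by group, then compute a weighted mean within each piece.
wmeans <- sapply(split(mydata, mydata$group),
                 function(d) weighted.mean(d$response, d$weights))
print(wmeans)
```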
2009 Sep 04
1
Apparent bug in summaryBy (PR#13941)
Full_Name: Marc Paterno Version: 2.9.2 OS: Mac OS X 10.5.8 Submission from: (NULL) (99.53.212.55) summaryBy() produces incorrect results when given some data frames. Below is a transcript of a session showing the result, in a data frame with 2 observations of 2 variables. ------------------- thomas:999 paterno$ R --vanilla R version 2.9.2 (2009-08-24) Copyright (C) 2009 The R Foundation for
2006 Dec 05
1
summaryBy(): Is it the best option?
Hi, since I have quite large tables and the processing takes quite a while I am curious if I can improve the performance of this aggregation somehow: At the moment I am using summaryBy from the doBy package under R 2.4.0, Win2K. summaryBy(soc_s6aq5 + soc_s6aq7 + soc_s6aq9 + soc_s6aq11 ~ hh + comgroup,soc6a,postfix=c("","","",""),FUN=sum, na.rm=T) The
2007 Feb 15
1
Problem in summaryBy
The R script below gives values of 1 for all minimum values when I use a custom function in summaryBy. I get the correct values when I use FUN=min directly. Any help is much appreciated. The continuous information provided in this forum is fabulous as are the different R packages available. Rene # Simulated simplified data Subj <- rep(1:4, each=6) Analyte <-
2012 Nov 07
3
extract indep vars from formula
Hello, I'm trying to extract the independent variables from a formula. The closest I've been able to come, aside from rolling my own, is the following: > a = y ~ b * x > attr(terms(formula(a)),"variables") The reason I'm doing this is that I'm building a grid of points that I use to construct a 3-d model prediction surface in rgl. If there are more than two
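A short sketch of one base R approach to the question above: drop the response from the terms object, then collect the remaining variable names. It uses the formula from the post.

```r
a <- y ~ b * x

# delete.response() removes the left-hand side from the terms object;
# all.vars() then lists the variable names that remain.
indep <- all.vars(delete.response(terms(a)))
print(indep)
```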
2013 Jan 17
3
how to use "..."
Dear users, I'm trying to learn how to use the "...". I have written a function (simplified here) that uses doBy::summaryBy(): # 'dat' is a data.frame from which the aggregation is computed # 'vec_cat' is a integer vector defining which columns of the data.frame should be use on the right side of the formula # 'stat_fun' is the function that will be run to
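A minimal sketch of forwarding "..." to a summary function. The wrapper name `summarise_by` and its arguments are hypothetical, and it uses base R's tapply() in place of doBy::summaryBy() so that it is self-contained.

```r
# Hypothetical wrapper: 'value' and 'by' are column names given as strings.
summarise_by <- function(dat, value, by, stat_fun, ...) {
  # '...' is forwarded unchanged to tapply(), which passes it on to stat_fun.
  tapply(dat[[value]], dat[[by]], stat_fun, ...)
}

dat <- data.frame(g = c("a", "a", "b", "b"),
                  v = c(1, NA, 3, 5))

# na.rm = TRUE travels through '...' and reaches mean().
res <- summarise_by(dat, "v", "g", mean, na.rm = TRUE)
print(res)
```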
2009 Aug 12
2
Symbolic references - passing variable names into functions
Hello All, I am trying to write a function which would operate on columns of a dataframe specified in parameters passed to that function. f = function(dataf, col1 = "column1", col2 = "column2") { dataf$col1 = dataf$col2 # just as an example } The above, of course, does not work as intended. In some languages one can force evaluation of a variable, and then
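A sketch of the usual fix for the function above: `$` takes its argument literally, while `[[` evaluates it, so a column name held in a variable can be used with `[[`. The example data frame is made up.

```r
f <- function(dataf, col1 = "column1", col2 = "column2") {
  # [[ ]] looks up the column whose name is stored in col1 / col2.
  dataf[[col1]] <- dataf[[col2]]
  dataf  # return the modified copy; the caller's data frame is unchanged
}

df  <- data.frame(column1 = 1:3, column2 = 4:6)
out <- f(df)
print(out)
```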
2012 Sep 13
1
remove all terms with interaction factor in formula
Hi Folks, I'm trying to find a way to remove all terms in a formula that contain a particular interaction. For example, in the formula below, I'd like to remove all terms that contain the b:c interaction. > attributes(terms( ~ a*b*c*d))$term.labels [1] "a" "b" "c" "d" "a:b" "a:c" [7]
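One possible approach, sketched with base R only: split each term label on ":" and drop the labels that contain both b and c, then rebuild the formula. The variable names `has_bc` and `reduced` are illustrative.

```r
labs <- attr(terms(~ a * b * c * d), "term.labels")

# A term contains the b:c interaction if both "b" and "c" are among its parts.
has_bc <- vapply(strsplit(labs, ":", fixed = TRUE),
                 function(parts) all(c("b", "c") %in% parts),
                 logical(1))

kept    <- labs[!has_bc]
reduced <- reformulate(kept)  # one-sided formula built from the kept terms
print(reduced)
```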
2013 Mar 20
3
summarize dataframe based on multiple cols, not their combinations
Hi folks, I'm trying to figure out how to get summarized data based on multiple columns. However, instead of giving summaries for every combination of categorical columns, I want it for each value of each categorical column regardless of the other columns. I could do this with three different commands, but I'm wondering if there's a more elegant way that I'm missing. Thanks!
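A small sketch of one way to do this in a single call: loop over the categorical column names and summarise the value column by each in turn. The data frame and the names `f1`, `f2`, and `val` are invented for illustration.

```r
dat <- data.frame(f1  = c("a", "a", "b", "b"),
                  f2  = c("x", "y", "x", "y"),
                  val = c(1, 2, 3, 4))

cats <- c("f1", "f2")

# One summary per categorical column, ignoring the other columns.
res <- lapply(setNames(cats, cats),
              function(col) tapply(dat$val, dat[[col]], mean))
print(res)
```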
2011 Jun 05
3
reproduction archives
Hello Folks, As some of my old code broke when an updated package changed its interface, I started thinking about reproduction of analyses. It's not good enough to save our code - we have to save the package versions those analyses used as well as the R-core. I saw a couple references to "reproduction archives" around, but nothing specific. Is there any good way to package up
2012 Oct 18
3
bigmemory for dataframes?
Hi Folks, I've been bumping my head against the 4GB limit for 32-bit R. I can't go to 64-bit R due to package compatibility issues (RODBC - possible but painful, xlsReadWrite - not possible, and others). I have a number of big dataframes whose columns hold all sorts of data types - factor, character, integer, etc. I run and save models that keep copies of the modeled data inside the model
2012 Nov 02
2
backreferences in gregexpr
Hi Folks, I'm trying to extract just the backreferences from a regex. > temp = "abcd1234abcd1234" > regmatches(temp, gregexpr("(?:abcd)(1234)", temp)) [[1]] [1] "abcd1234" "abcd1234" What I would like is: [1] "1234" "1234" Note: I know I can just match 1234 here, but the actual example is complicated enough that I have to
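A sketch of one way to pull out only the captured group, using the "capture.start" and "capture.length" attributes that gregexpr() attaches when called with perl = TRUE. It uses the string and pattern from the post.

```r
temp <- "abcd1234abcd1234"
m <- gregexpr("(?:abcd)(1234)", temp, perl = TRUE)[[1]]

# With perl = TRUE the match carries one column per capture group;
# (?:abcd) is non-capturing, so column 1 is the (1234) group.
starts <- attr(m, "capture.start")[, 1]
lens   <- attr(m, "capture.length")[, 1]

groups <- substring(temp, starts, starts + lens - 1)
print(groups)
```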
2010 Nov 17
3
translate vector of numbers to indices of 0/1 matrix
Hello All, Searched around, haven't found a decent solution. I'd like to translate a vector of numbers to a matrix (or to a list of vectors) such that the vector values would serve as indices of the 1's in an otherwise-zero-filled matrix (or list of vectors). For example: > a = c(1,3,3,4) # perform operation [,1] [,2] [,3] [,4] [1,] 1 0 0 0 [2,] 0 0 1
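A minimal sketch of the matrix variant using matrix indexing: a two-column index matrix of (row, column) pairs selects exactly the cells to set to 1. It assumes the values in `a` are positive integers.

```r
a <- c(1, 3, 3, 4)

# Start with zeros: one row per element of a, one column per possible value.
m <- matrix(0, nrow = length(a), ncol = max(a))

# cbind(row, col) addresses one cell per row: row i, column a[i].
m[cbind(seq_along(a), a)] <- 1
print(m)
```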
2011 Jun 22
2
strange date problem - May 3, 1992 is NA
> is.na(strptime("5/2/1992", format="%m/%d/%Y")) [1] FALSE > is.na(strptime("5/3/1992", format="%m/%d/%Y")) [1] TRUE Any idea what's going on with this? Running strptime against all dates from around 1946, only 5/3/1992 was converted as "NA". Even stranger, it still seems to have a value associated with it (even though is.na thinks
2012 Apr 07
3
How do Sweave users collaborate with Word users?
Hello All, I'm getting my workflow switched over to Sweave, which is very cool. However, I collaborate with folks (as many of you must as well) who use Word to Track Changes amongst a group while crafting a paper. In the simplest case, there will just be two people (one Sweave user and one Word user) editing a paper. I'm wondering, how do Sweave users go about this? I could convert a
2012 May 21
2
sweave tables as images?
Hello folks, I've been on a journey trying to figure out how to manage documents that are amenable to sharing and editing, but that contain dynamic content generated by R. I've come to the following solution: I use Sweave to generate labeled png & pdf figures, and I "Insert & Link" those figures as "Pictures" in a Word 2010 doc. Thus, when data or code
2007 Aug 20
1
Problem with summaryBy: Group sums gives me "incorrectly" zero for one variable
Hi, first I want to thank all of you for the quick aid which is provided here on the list during all times. Thanks a lot for that! Then, I have a problem using summaryBy which most probably is a problem of wrong use by me or the like: I use this command: summaryBy(total+total.inf~gr, aE, FUN=sum) where aE is a > str(aE) 'data.frame': 127880 obs. of 16 variables: $ gr
2012 Mar 07
2
dot products
Hello, I need to take a dot product of each row of a dataframe and a vector. The number of columns will be dynamic. The way I've been doing it so far is contorted. Is there a better way? dotproduct <- function(dataf, v2) { apply(t(t(as.matrix(dataf)) * v2),1,sum) #contorted! } df = data.frame(a=c(1,2,3),b=c(4,5,6)) vec = c(4,5) dotproduct(df, vec) thanks,
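A sketch of a more direct route: a matrix-vector product computes the dot product of every row with the vector in one step. It assumes all columns of the data frame are numeric and that the vector has one element per column.

```r
df  <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
vec <- c(4, 5)

# as.matrix() turns the data frame into a numeric matrix; %*% multiplies each
# row by vec and sums; drop() reduces the 3x1 result to a plain vector.
dots <- drop(as.matrix(df) %*% vec)
print(dots)
```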
2012 May 15
0
Indexing in summaryBy
I'm trying to use a self-written function with the summaryBy function (doBy package). I have lots of data from Monte Carlo experiments comparing different estimators across different (combinations of) parameter values, similar to the following form: colnames(mydata) <- c("X", "b0", "b1", # parameter combination, corresponding (true) parameter values