thr3ads.net - similar to: "ddply to count frequency of combinations"

Displaying 20 results from an estimated 10000 matches similar to: "ddply to count frequency of combinations"

2012 Sep 06

use of ddply() within function

Dear all, I am encountering problems with the application of ddply within the body of a self-defined function. The script is the following: moncostcarmoto <- function(costtype){ costaux_result <- data.frame() for (purp in PURPcount){for (per in PERcount){ costcarin =

2010 Apr 07

unexpected behaviour with ddply and colwise

Hi, I am confused by the results from: > ddply(aa, names(aa), colwise(sum)) I thought ddply was just calling colwise(sum)() with each column. However, ddply() returns a 13 x 5 result! The general result I expected is similar to that of apply(), or of using colwise(sum)() alone. Shouldn't ddply() produce the same? Thanks in advance for your help, - Stuart Andrews

2011 Aug 24

ddply from plyr package - any alternatives?

Hello everyone, I was asked to repost this, sorry for any inconvenience. I'm looking for a replacement for the ddply function from the plyr package. The function applies a function by category stored in one or more columns. Regular loops or lapply calls slow down greatly because my unique combination count exceeds 9000. Is there any available solution that allows me to apply a function by category?

2012 Apr 05

count() function

I keep expecting R to have something analogous to the =count function in Excel, but I can't find anything. I simply want to count the data for a given category. I've been using the ddply() function in the plyr package to summarize means and st dev of my data, with this code: ddply(NZ_Conifers,.(ElevCat, DataSource, SizeClass), summarise, avgDensity=mean(Density),

2013 Mar 20

summarize dataframe based on multiple cols, not their combinations

Hi folks, I'm trying to figure out how to get summarized data based on multiple columns. However, instead of giving summaries for every combination of categorical columns, I want it for each value of each categorical column regardless of the other columns. I could do this with three different commands, but I'm wondering if there's a more elegant way that I'm missing. Thanks!

2010 Dec 06

[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function

Dear R-Helpers: I am trying to use *ddply* to extract min and max of a particular column in a data.frame. I am using two different forms of the function: ## var_name_to_split is a string -- something like "var1" which is the name of a column in data.frame ddply( df, .(as.name(var_name_to_split)), function(x) c(min(x[ , 3] , max(x[ , 3]))) ## fails with an error - case 1 ddply(

2009 Nov 19

ddply function nesting problems

While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. I've tried several approaches, but none worked, and I need to have the ability to include the "cut", "range", and "fullseq" methods within ddply. (For a bit of that explanation refer to

2011 Sep 16

plot(m, which = 1), where m is a lm linear model. What is 'which' doing?

Sample code from *R CookBook* (awesome book btw) *11.12: Finding the Best Power Transformation (Box-Cox) Procedure* require(MASS) x <- 10:100 eps <- rnorm(length(x), sd = 5) y <- (x + eps)^(-1 / 1.5) m <- lm(y ~ x) # ***************** What does the *which* in this line do??? ******************************************** plot(m, which = 1)

2010 Feb 03

Calculating subsets "on the fly" with ddply

Hi, [I sent this to the plyr mailing list (late) last night, but it seems to be lost in the moderation queue, so here's a shot to the broadeR community] Apologies in advance for being more verbose than necessary, but I'm not even sure how to ask this question in the context of plyr, so ... here goes. As meaningless as this might be to do with the `iris` data, the spirit of it is what

2011 May 11

ddply with mean and max...

I'm trying to use ddply to compute summary statistics for many variables, splitting on the variable site. However, it seems to work fine for mean(), but if I use max() or min() things fall apart. What's going on? test.set<-data.frame(site=1:10,x=.Random.seed[1:100],y=rnorm(100)) means<-ddply(test.set,.(site),mean) means site x y 1 1 -97459496 -0.14826303 2

2010 Jun 01

data frame manipulation ddply

Dear group, Here is my data frame: futures <- structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10", "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11

2012 May 29

a question about "by" and "ddply"

Hi all, I have a data set (df, n=10 for the sake of simplicity here) where I have two continuous variables (age and weight) and I also have a grouping variable (group, with two levels). I want to run correlations for each group separately (kind of similar to "split file" in SPSS). I've been experimenting with different functions, and I was able to do this correctly using ddply

2012 May 05

Correct use of ddply with own function

Hi, I am really confused about how ddply works, so maybe you can help me. I created a function that sorts a vector etc. fn <- function(x){ x1 <- sort(x) x2 <- seq(length(x)) x3 <- x2/max(x2) df <- data.frame(x1,x2,x3) df } Probably this is not the best form of the function, but at least it produces what I want (data to plot a cumulative count curve). This function works on a

2012 Mar 03

Using ddply within a function by argument transfer

An embedded and charset-unspecified text was scrubbed... Name: inte tillgänglig URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120303/a62e41f2/attachment.pl>

2012 Jul 24

Function for ddply

Hello, all. I'm new to R and just beginning to learn to write functions. I know I'm out of my depth posting here, and I'm sure my issue is mundane. But here goes. I'm analyzing the American National Election Study (nes), looking at mean values of a numeric dep_var (environ.therm) across values of a factor (partyid3). I use ddply from plyr and wtd.mean from Hmisc. The nes requires a

2012 Feb 09

Apply pmax to dataframe with different args based on dataframe factor

# I have a dataframe in the following form: track <- c(rep('A', 3), rep('B', 4), rep('C', 4)) value <- c(0.15, 0.25, 0.35, 0.05, 0.99, 0.32, 0.13, 0.80, 0.75, 0.60, 0.44) df <- data.frame(track=factor(track), value=value) #> print(df) #track value #1 A 0.15 #2 A 0.25 #3 A 0.35 #4 B 0.05 #5 B 0.99 #6 B 0.32 #7 B 0.13

2010 Sep 22

speeding up regressions using ddply

Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do it using ddply. The data set looks like this, with "status" as

2011 Jun 18

Vim-R-Plugin issue : Python interface must be enabled to run Vim-R-Plugin

I am trying to get the Vim-R-Plugin<http://www.vim.org/scripts/script.php?script_id=2628> to work with gvim and R on Windows 7. When I open a .R file in VIM, it complains and says "Python interface must be enabled to run Vim-R-Plugin." I have installed pywin32 for python 2.7, and added the following 4 lines to my _vimrc per the instructions

2012 Apr 03

help in ddply

Hi, I have records like this: df = x panel 4 1 93 2 21 3 83 4 75 1 87 2 87 3 78 4 50 1 76 2 86 3 65 4 84 1 40 2 39 3 26 4 I want to create a histogram from it. I want all the mid and count values panel-wise. My code is histoutput = ddply(df,.(df[2]),hist) but I'm not able to get the required result. Please help me; using a for loop takes a lot of time when there are more records. Thanks

2009 Aug 05

using ddply but preserving some of the outside data

I have a bit of a quandary. I'm working with a data set for which I have sampled sites at a variety of dates. I want to use this data, and get a running average of the sampled values for the current and previous date. I originally thought something like ddply would be ideal for this, however, I cannot break up my data by date, and then apply a function that requires information
