thr3ads.net - search: "ddply"

Displaying 20 results from an estimated 515 matches for "ddply".

unexpected behaviour with ddply and colwise

2010 Apr 07

Hi, I am confused by the results from: > ddply(aa, names(aa), colwise(sum)) I thought ddply was just calling colwise(sum)() with each column. However, ddply() returns a 13 x 5 result! The general result I expected is similar to that of apply(), or of using colwise(sum)() alone. Shouldn't ddply() produce the same? Thanks in...

[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function

2010 Dec 06

Dear R-Helpers: I am trying to use *ddply* to extract the min and max of a particular column in a data.frame. I am using two different forms of the function: ## var_name_to_split is a string -- something like "var1" which is the name of a column in data.frame ddply( df, .(as.name(var_name_to_split)), function(x) c(min(x[ , 3] , ma...

ddply to count frequency of combinations

2011 Jun 21

I have a dataframe df with two columns x and y. I want to count the number of times a unique x, y combination occurs. For example x<- c(1,2,3,4,5,1,2,3,4) y<- c(1,2,3,4,5,1,2,4,1) df<-as.data.frame(cbind(x, y)) #what is the correct way to use ddply for this example? ddply(df, c('x','y', summarize, ??) #desired output -- format and order doesn't matter # (x, y) count #-------------------- # (1, 1) 2 # (2, 2) 2 # (3, 3) 1 # (4, 4) 1 # (5, 5) 1 # (2, 3) 1 # (3, 4) 1 # (4, 1) 1

ddply function nesting problems

2009 Nov 19

While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. I've tried several approaches, but none worked, and I need to have the ability to include the "cut", "range", and "fullseq" methods within ddply. (For a bit of that explanation refer t...

ddply with mean and max...

2011 May 11

I'm trying to use ddply to compute summary statistics for many variables, splitting on the variable site. However, it seems to work fine for mean(), but if I use max() or min() things fall apart. What's going on? test.set<-data.frame(site=1:10,x=.Random.seed[1:100],y=rnorm(100)) means<-ddply(test.set,.(site),mean)...

ddply from plyr package - any alternatives?

2011 Aug 24

Hello everyone, I was asked to repost this, sorry for any inconvenience. I'm looking for a replacement for the ddply function from the plyr package. The function allows applying a function by category stored in any column or columns. Regular loops or lapply calls slow down greatly because my unique combination count exceeds 9000. Is there any available solution that allows me to apply a function by category? currently my code lo...

a question about "by" and "ddply"

2012 May 29

...ntinuous variables (age and weight) and I also have a grouping variable (group, with two levels). I want to run correlations for each group separately (kind of similar to "split file" in SPSS). I've been experimenting with different functions, and I was able to do this correctly using the ddply function, but the output is a little bit difficult to read when I do the cor.test to get all the data with p values, df, and Pearson r (see below). I also tried to do it with the by function. Although by shows the data for the two groups separately, it seems like it calculates the same r for both gro...

use of ddply() within function

2012 Sep 06

Dear all, I am encountering problems with the application of ddply within the body of a self-defined function. The script is the following: moncostcarmoto <- function(costtype){ costaux_result <- data.frame() for (purp in PURPcount){for (per in PERcount){ costcarin = paste(c("CS_",co...

Calculating subsets "on the fly" with ddply

2010 Feb 03

...w this really should be done. Essentially, I'd like to compute some summary statistics on grouped subsets of data. So, for iris data, let me try to take the mean of the Petal.Width on subsets of data as grouped by: ("some range" of sepal.length, and species). The "normal" ddply invocation would look like so: R> my <- ddply(iris, .(w=Sepal.Length < 5.5, Species), transform, grmean=mean(Petal.Width)) R> head(my) w Sepal.Length Sepal.Width Petal.Length Petal.Width Species grmean 1 FALSE 5.8 4.0 1.2 0.2 setosa 0.26...

data frame manipulation ddply

2010 Jun 01

...t;, "14.9200", "14.9200", "14.9200", "14.9200" )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame") Here is the line I pass : >PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise, POSITION= sum(QUANTITY))[,c(1,3,2)] And here the result : PosFut <- structure(list(DESCRIPTION = structure(1:3, .Label = c("CORN Jul/10", "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10"), class = &...

speeding up regressions using ddply

2010 Sep 22

Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do this using ddply. The data set looks like this, with "stat...

Function for ddply

2012 Jul 24

...inning to learn to write functions. I know I'm out of my depth posting here, and I'm sure my issue is mundane. But here goes. I'm analyzing the American National Election Study (nes), looking at mean values of a numeric dep_var (environ.therm) across values of a factor (partyid3). I use ddply from plyr and wtd.mean from Hmisc. The nes requires a weight var (wt). I use Rcmdr's plotMeans to obtain a line chart. The following code works: attach(nes) obj1 = ddply(nes, .(partyid3), summarise, var = wtd.mean(environ.therm, wt)) print(obj1) plotMeans(obj1$var, obj1$partyid3, error.bars=...

Correct use of ddply with own function

2012 May 05

Hi, I am really confused about how ddply works, so maybe you can help me. I created a function that sorts a vector etc. fn <- function(x){ x1 <- sort(x) x2 <- seq(length(x)) x3 <- x2/max(x2) df <- data.frame(x1,x2,x3) df } Probably this is not the best form of the function, but at least it produces what I want (data...

Subsetting for the ten highest values by group in a dataframe

2012 Jan 27

...l this occurs within some factor levels. ## I've used plyr here but I'm not married to this approach require(plyr) ## I've created a data.frame with two groups and then an id variable (y) df <- data.frame(x=rnorm(400, mean=20), y=1:400, z=c("A","B")) ## So using ddply I can find the highest value of x df.max1 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[1]) ## Or the 2nd highest value df.max2 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[2]) ## And so on.... but when I try to make a series of numbers like so ## to get the top ten val...

New PLYR issue

2012 Jan 17

...Using R 2.14.1, plyr 1.7.1, R.Studio 0.94.110, Windows XP The plyr mailing list has not provided any help so far. >require(plyr) >c(sample(c(1:100), 50, replace=TRUE))->V1 >c(rep( 1:5, 10))->f1 #variable to group V1 >data.frame(cbind(V1, f1))->DF >str(DF) >ddply(DF$V1, DF$f1, "sd") >ddply(.(DF$V1), .(DF$f1), "sd") >Error in if (empty(.data)) return(.data) : missing value where TRUE/FALSE needed Thanks everyone,

Problem with ddply in the plyr-package: surprising output of a date-column

2011 Apr 25

Hi Together, I have a problem with the plyr package - more precisely with the ddply function - and would be very grateful for any help. I hope the example here is precise enough for someone to identify the problem. Basically, in this step I want to identify observations that are identical in terms of certain identifiers (ID1, ID2, ID3) and just want to save those observations (in...

sum specific rows in a data frame

2010 Apr 14

I have a data frame called "pose":

  DESCRIPTION        QUANITY CLOSING.PRICE
1 WHEAT May/10             1        467.75
2 WHEAT May/10             2        467.75
3 WHEAT May/10             1        467.75
4 WHEAT May/10             1        467.75
5 COTTON NO.2 May/10       1         78.13
6 COTTON NO.2 May/10       3         78.13
7 COTTON NO.2 May/10       1         78.13

error for ttest

2011 Apr 13

Hello all, I have arranged my data as per Dennis's suggestion in this post http://www.mail-archive.com/r-help at r-project.org/msg107156.html. The posted code works fine, but when I try to apply it to my data, I get "> u2 <- ddply(xxm, .(plateid, cytokine), as.data.frame.function(f)) Error in t.test.formula(conc ~ Self_T1D, data = df, na.rm = T) : grouping factor must have exactly 2 levels". Self_T1D has two levels, "N" and "Y". I have used the ddply function to do the mean and sd for the same data...

ddply - how to transform df column "in place"

2011 Aug 23

...success. Given: d<- data.frame(cbind(x=1,y=seq(20100801,20100830,1))) names(d)<-c("first", "daterep") d2<-d # I can convert the daterep column in place the classic way: d$daterep<-as.Date(strptime(d$daterep, format="%Y%m%d")) # How to do it the plyr way? ddply(d2, c("daterep"), function(df){as.Date(df, format="%Y%m%d")}) # returns: Error in as.Date.default(df, format = "%Y%m%d") : # do not know how to convert 'df' to class "Date" Thanks for any hints, ---jean -- View this message in context: http://r....

using ddply but preserving some of the outside data

2009 Aug 05

I have a bit of a quandary. I'm working with a data set for which I have sampled sites at a variety of dates. I want to use this data and get a running average of the sampled values for the current and previous date. I originally thought something like ddply would be ideal for this; however, I cannot break up my data by date and then apply a function that requires information about the previous dates. I had thought to use a for loop and merge, but that doesn't quite seem to be working. So, my questions are twofold 1) Is there a way to use...