Displaying 20 results from an estimated 511 matches for "ddply".

2010 Apr 07
1
unexpected behaviour with ddply and colwise
Hi, I am confused by the results from: > ddply(aa, names(aa), colwise(sum)) I thought ddply was just calling colwise(sum)() with each column. However, ddply() returns a 13 x 5 result! The general result I expected is similar to that of apply(), or to using colwise(sum)() alone. Shouldn't ddply() produce the same? Thanks in...
2010 Dec 06
3
[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function
Dear R-Helpers: I am trying to use *ddply* to extract the min and max of a particular column in a data.frame. I am using two different forms of the function: ## var_name_to_split is a string -- something like "var1" which is the name of a column in data.frame ddply( df, .(as.name(var_name_to_split)), function(x) c(min(x[ , 3] , ma...
2011 Jun 21
4
ddply to count frequency of combinations
I have a dataframe df with two columns x and y. I want to count the number of times a unique x, y combination occurs. For example:
x <- c(1,2,3,4,5,1,2,3,4)
y <- c(1,2,3,4,5,1,2,4,1)
df <- as.data.frame(cbind(x, y))
# what is the correct way to use ddply for this example?
ddply(df, c('x','y'), summarize, ??)
# desired output -- format and order doesn't matter
# (x, y) count
# --------------------
# (1, 1) 2
# (2, 2) 2
# (3, 3) 1
# (4, 4) 1
# (5, 5) 1
# (2, 3) 1
# (3, 4) 1
# (4, 1) 1
2009 Nov 19
1
ddply function nesting problems
While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. I've tried several approaches, but none worked, and I need to have the ability to include the "cut", "range", and "fullseq" methods within ddply. (For a bit of that explanation refer t...
2011 May 11
3
ddply with mean and max...
I'm trying to use ddply to compute summary statistics for many variables, splitting on the variable site. However, it seems to work fine for mean(), but if I use max() or min() things fall apart. What's going on? test.set<-data.frame(site=1:10,x=.Random.seed[1:100],y=rnorm(100)) means<-ddply(test.set,.(site),mean)...
2011 Aug 24
3
ddply from plyr package - any alternatives?
Hello everyone, I was asked to repost this, sorry for any inconvenience. I'm looking for a replacement for the ddply function from the plyr package. The function applies a function by category, where the category is stored in one or more columns. Regular loops or lapply calls slow down greatly because my unique combination count exceeds 9000. Is there any available solution that allows me to apply a function by category? Currently my code lo...
2012 May 29
2
a question about "by" and "ddply"
...ntinuous variables (age and weight) and I also have a grouping variable (group, with two levels). I want to run correlations for each group separately (similar to "split file" in SPSS). I've been experimenting with different functions, and I was able to do this correctly using the ddply function, but the output is a little difficult to read when I run cor.test to get all the data with p values, df, and Pearson r (see below). I also tried to do it with the by function. Although by shows the data for the two groups separately, it seems to calculate the same r for both gro...
2012 Sep 06
1
use of ddply() within function
Dear all, I am encountering problems with the application of ddply within the body of a self-defined function. The script is the following: moncostcarmoto <- function(costtype){ costaux_result <- data.frame() for (purp in PURPcount){for (per in PERcount){ costcarin = paste(c("CS_",co...
2010 Feb 03
1
Calculating subsets "on the fly" with ddply
...w this really should be done. Essentially, I'd like to compute some summary statistics on grouped subsets of data. So, for iris data, let me try to take the mean of the Petal.Width on subsets of data as grouped by: ("some range" of sepal.length, and species). The "normal" ddply invocation would look like so:
R> my <- ddply(iris, .(w=Sepal.Length < 5.5, Species), transform, grmean=mean(Petal.Width))
R> head(my)
      w Sepal.Length Sepal.Width Petal.Length Petal.Width Species grmean
1 FALSE          5.8         4.0          1.2         0.2  setosa 0.26...
2010 Jun 01
1
data frame manipulation ddply
...", "14.9200", "14.9200", "14.9200", "14.9200" )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame") Here is the line I pass: >PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise, POSITION= sum(QUANTITY))[,c(1,3,2)] And here is the result: PosFut <- structure(list(DESCRIPTION = structure(1:3, .Label = c("CORN Jul/10", "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10"), class = "...
2010 Sep 22
2
speeding up regressions using ddply
Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do this using ddply. The data set looks like this, with "stat...
2012 Jul 24
1
Function for ddply
...inning to learn to write functions. I know I'm out of my depth posting here, and I'm sure my issue is mundane. But here goes. I'm analyzing the American National Election Study (nes), looking at mean values of a numeric dep_var (environ.therm) across values of a factor (partyid3). I use ddply from plyr and wtd.mean from Hmisc. The nes requires a weight var (wt). I use Rcmdr's plotMeans to obtain a line chart. The following code works: attach(nes) obj1 = ddply(nes, .(partyid3), summarise, var = wtd.mean(environ.therm, wt)) print(obj1) plotMeans(obj1$var, obj1$partyid3, error.bars=...
2012 May 05
1
Correct use of ddply with own function
Hi, I am really confused about how ddply works, so maybe you can help me. I created a function that sorts a vector etc. fn <- function(x){ x1 <- sort(x) x2 <- seq(length(x)) x3 <- x2/max(x2) df <- data.frame(x1,x2,x3) df } Probably this is not the best form of the function, but at least it produces what I want (data...
2012 Jan 27
3
Subsetting for the ten highest values by group in a dataframe
...l this occurs within some factor levels. ## I've used plyr here but I'm not married to this approach require(plyr) ## I've created a data.frame with two groups and then an id variable (y) df <- data.frame(x=rnorm(400, mean=20), y=1:400, z=c("A","B")) ## So using ddply I can find the highest value of x df.max1 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[1]) ## Or the 2nd highest value df.max2 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[2]) ## And so on.... but when I try to make a series of numbers like so ## to get the top ten val...
2012 Jan 17
1
New PLYR issue
...Using R 2.14.1, plyr 1.7.1, R.Studio 0.94.110, Windows XP. The plyr mailing list has not provided any help so far.
> require(plyr)
> c(sample(c(1:100), 50, replace=TRUE)) -> V1
> c(rep(1:5, 10)) -> f1 # variable to group V1
> data.frame(cbind(V1, f1)) -> DF
> str(DF)
> ddply(DF$V1, DF$f1, "sd")
> ddply(.(DF$V1), .(DF$f1), "sd")
Error in if (empty(.data)) return(.data) : missing value where TRUE/FALSE needed
Thanks everyone,
2011 Apr 25
2
Problem with ddply in the plyr-package: surprising output of a date-column
Hi all, I have a problem with the plyr package - more precisely with the ddply function - and would be very grateful for any help. I hope the example here is precise enough for someone to identify the problem. Basically, in this step I want to identify observations that are identical in terms of certain identifiers (ID1, ID2, ID3) and just want to save those observations (in...
2010 Apr 14
6
sum specific rows in a data frame
I have a data frame called "pose":
  DESCRIPTION        QUANITY CLOSING.PRICE
1 WHEAT May/10             1        467.75
2 WHEAT May/10             2        467.75
3 WHEAT May/10             1        467.75
4 WHEAT May/10             1        467.75
5 COTTON NO.2 May/10       1         78.13
6 COTTON NO.2 May/10       3         78.13
7 COTTON NO.2 May/10       1         78.13
2011 Apr 13
1
error for ttest
Hello all, I have arranged my data as per Dennis's suggestion in this post: http://www.mail-archive.com/r-help at r-project.org/msg107156.html. The posted code works fine, but when I try to apply it to my data, I get "> u2 <- ddply(xxm, .(plateid, cytokine), as.data.frame.function(f)) Error in t.test.formula(conc ~ Self_T1D, data = df, na.rm = T) : grouping factor must have exactly 2 levels". Self_T1D has two levels, "N" and "Y". I have used the ddply function to do the mean and sd for the same data...
2011 Aug 23
3
ddply - how to transform df column "in place"
...success. Given:
d <- data.frame(cbind(x=1, y=seq(20100801, 20100830, 1)))
names(d) <- c("first", "daterep")
d2 <- d
# I can convert the daterep column in place the classic way:
d$daterep <- as.Date(strptime(d$daterep, format="%Y%m%d"))
# How to do it the plyr way?
ddply(d2, c("daterep"), function(df){as.Date(df, format="%Y%m%d")})
# returns: Error in as.Date.default(df, format = "%Y%m%d") :
# do not know how to convert 'df' to class "Date"
Thanks for any hints, ---jean
2009 Aug 05
2
using ddply but preserving some of the outside data
I have a bit of a quandary. I'm working with a data set for which I have sampled sites at a variety of dates. I want to use this data to get a running average of the sampled values for the current and previous date. I originally thought something like ddply would be ideal for this; however, I cannot break up my data by date and then apply a function that requires information about the previous dates. I had thought to use a for loop and merge, but that doesn't quite seem to be working. So, my questions are twofold: 1) Is there a way to use...