Displaying 20 results from an estimated 515 matches for "ddply".
2010 Apr 07
1
unexpected behaviour with ddply and colwise
Hi,
I am confused by results from:
> ddply(aa, names(aa), colwise(sum))
I thought ddply was just calling colwise(sum)() with each column.
However ddply() returns a 13 x 5 result !!
The general result I expected is similar to that of apply() , or
using colwise(sum)() alone. Shouldn't ddply() produce the same ?
Thanks in...
2010 Dec 06
3
[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function
Dear R-Helpers:
I am trying to use *ddply* to extract min and max of a particular
column in a data.frame. I am using two different forms of the function:
## var_name_to_split is a string -- something like "var1" which is the name
of a column in data.frame
ddply( df, .(as.name(var_name_to_split)), function(x) c(min(x[ , 3] , ma...
2011 Jun 21
4
ddply to count frequency of combinations
I have a dataframe df with two columns x and y. I want to count the number
of times a unique x, y combination occurs.
For example
x<- c(1,2,3,4,5,1,2,3,4)
y<- c(1,2,3,4,5,1,2,4,1)
df<-as.data.frame(cbind(x, y))
#what is the correct way to use ddply for this example?
ddply(df, c('x','y'), summarize, ??)
#desired output -- format and order doesn't matter
# (x, y) count
#--------------------
# (1, 1) 2
# (2, 2) 2
# (3, 3) 1
# (4, 4) 1
# (5, 5) 1
# (2, 3) 1
# (3, 4) 1
# (4, 1) 1
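One way the counting question above could be approached is sketched below. It assumes the plyr package is installed; the output column name `count` is an illustrative choice and does not come from the original post.

```r
library(plyr)

# Data from the post above
x <- c(1, 2, 3, 4, 5, 1, 2, 3, 4)
y <- c(1, 2, 3, 4, 5, 1, 2, 4, 1)
df <- data.frame(x, y)

# Split on both columns and count the rows in each (x, y) group.
# `count` is a hypothetical column name chosen for this sketch.
counts <- ddply(df, c("x", "y"), summarise, count = length(x))
print(counts)
```

plyr also provides count(df, c("x", "y")), which returns the same tallies in a column named freq and may be simpler when only frequencies are needed.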
2009 Nov 19
1
ddply function nesting problems
While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. I've tried several approaches, but none worked, and I need to have the ability to include the "cut", "range", and "fullseq" methods within ddply. (For a bit of that explanation refer t...
2011 May 11
3
ddply with mean and max...
I'm trying to use ddply to compute summary statistics for many variables,
splitting on the variable site. However, it seems to work fine for mean() but
if I use max() or min() things fall apart. What's going on?
test.set<-data.frame(site=1:10,x=.Random.seed[1:100],y=rnorm(100))
means<-ddply(test.set,.(site),mean)...
2011 Aug 24
3
ddply from plyr package - any alternatives?
Hello everyone,
I was asked to repost this again, sorry for any inconvenience.
I'm looking for a replacement for the ddply function from the plyr package.
The function applies a function by category stored in any column or columns.
Regular loops or lapply calls slow down greatly because my unique combination
count exceeds 9000. Is there any available solution that allows me to apply
a function by category?
currently my code lo...
2012 May 29
2
a question about "by" and "ddply"
...ntinuous variables (age and weight) and I also have a grouping variable (group, with two levels). I want to run correlations for each group separately (kind of similar to "split file" in SPSS). I've been experimenting with different functions, and I was able to do this correctly using ddply function, but output is a little bit difficult to read when I do the cor.test to get all the data with p values, df, and pearson r (see below). I also tried to do it with by function. Although, with by, it shows the data for two groups separately, it seems like it calculates the same r for both gro...
2012 Sep 06
1
use of ddply() within function
Dear all,
I am encountering problems with the application of ddply within the body of a self-defined function.
The script is the following:
moncostcarmoto <- function(costtype){
costaux_result <- data.frame()
for (purp in PURPcount){for (per in PERcount){
costcarin = paste(c("CS_",co...
2010 Feb 03
1
Calculating subsets "on the fly" with ddply
...w this really should be done.
Essentially, I'd like to compute some summary statistics on grouped
subsets of data. So, for iris data, let me try to take the mean of the
Petal.Width on subsets of data as grouped by:
("some range" of sepal.length, and species).
The "normal" ddply invocation would look like so:
R> my <- ddply(iris, .(w=Sepal.Length < 5.5, Species), transform,
grmean=mean(Petal.Width))
R> head(my)
w Sepal.Length Sepal.Width Petal.Length Petal.Width Species grmean
1 FALSE 5.8 4.0 1.2 0.2 setosa 0.26...
2010 Jun 01
1
data frame manipulation ddply
...", "14.9200", "14.9200", "14.9200", "14.9200"
)), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
"SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
Here is the line I pass :
>PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise, POSITION=
sum(QUANTITY))[,c(1,3,2)]
And here the result :
PosFut <-
structure(list(DESCRIPTION = structure(1:3, .Label = c("CORN Jul/10",
"LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10"), class = "...
2010 Sep 22
2
speeding up regressions using ddply
Hi,
I have a data set that I'd like to run logistic regressions on, using
ddply to speed up the computation of many models with different
combinations of variables. I would like to run regressions on every
unique two-variable combination in a portion of my data set, but I
can't quite figure out how to do using ddply. The data set looks like
this, with "stat...
2012 Jul 24
1
Function for ddply
...inning to learn to write functions. I
know I'm out of my depth posting here, and I'm sure my issue is mundane.
But here goes. I'm analyzing the American National Election Study (nes),
looking at mean values of a numeric dep_var (environ.therm) across values
of a factor (partyid3). I use ddply from plyr and wtd.mean from Hmisc. The
nes requires a weight var (wt). I use Rcmdr's plotMeans to obtain a line
chart. The following code works:
attach(nes)
obj1 = ddply(nes, .(partyid3), summarise,
var = wtd.mean(environ.therm, wt))
print(obj1)
plotMeans(obj1$var, obj1$partyid3, error.bars=...
2012 May 05
1
Correct use of ddply with own function
Hi,
I am really confused about how ddply works, so maybe you can help me.
I created a function that sorts a vector etc.
fn <- function(x){
x1 <- sort(x)
x2 <- seq(length(x))
x3 <- x2/max(x2)
df <- data.frame(x1,x2,x3)
df
}
Probably this is not the best form of the function, but at least it produces what I want (data...
2012 Jan 27
3
Subsetting for the ten highest values by group in a dataframe
...l this occurs within some
factor levels.
## I've used plyr here but I'm not married to this approach
require(plyr)
## I've created a data.frame with two groups and then an id variable (y)
df <- data.frame(x=rnorm(400, mean=20), y=1:400, z=c("A","B"))
## So using ddply I can find the highest value of x
df.max1 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[1])
## Or the 2nd highest value
df.max2 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[2])
## And so on.... but when I try to make a series of numbers like so
## to get the top ten val...
2012 Jan 17
1
New PLYR issue
...Using R 2.14.1, plyr 1.7.1, R.Studio 0.94.110, Windows XP
The plyr mailing list has not provided any help so far.
>require(plyr)
>c(sample(c(1:100), 50, replace=TRUE))->V1
>c(rep( 1:5, 10))->f1 #variable to group V1
>data.frame(cbind(V1, f1))->DF
>str(DF)
>ddply(DF$V1, DF$f1, "sd")
>ddply(.(DF$V1), .(DF$f1), "sd")
Error in if (empty(.data)) return(.data) :
  missing value where TRUE/FALSE needed
Thanks everyone,
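The error above appears to come from passing vectors (DF$V1, DF$f1) where ddply expects a data frame and the names of its grouping columns. A minimal sketch of the usual call shape follows, assuming plyr is installed; the seed and the column name `sd_V1` are illustrative additions, not part of the original post.

```r
library(plyr)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
V1 <- sample(1:100, 50, replace = TRUE)
f1 <- rep(1:5, 10)  # variable to group V1
DF <- data.frame(V1, f1)

# Pass the whole data frame, name the grouping column with .(),
# and compute the standard deviation of V1 within each group.
# `sd_V1` is a hypothetical column name chosen for this sketch.
res <- ddply(DF, .(f1), summarise, sd_V1 = sd(V1))
print(res)
```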
2011 Apr 25
2
Problem with ddply in the plyr-package: surprising output of a date-column
Hi all,
I have a problem with the plyr package - more precisely with the ddply
function - and would be very grateful for any help. I hope the example
here is precise enough for someone to identify the problem. Basically,
in this step I want to identify observations that are identical in
terms of certain identifiers (ID1, ID2, ID3) and just want to save
those observations (in...
2010 Apr 14
6
sum specific rows in a data frame
I have a data frame called "pose":
DESCRIPTION QUANITY CLOSING.PRICE
1 WHEAT May/10 1 467.75
2 WHEAT May/10 2 467.75
3 WHEAT May/10 1 467.75
4 WHEAT May/10 1 467.75
5 COTTON NO.2 May/10 1 78.13
6 COTTON NO.2 May/10 3 78.13
7 COTTON NO.2 May/10 1 78.13
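Assuming the goal is the total quantity per DESCRIPTION (the snippet does not say which rows are meant), a sketch with ddply might look like the following. The data frame is rebuilt from the rows shown above, the posted column name QUANITY is kept as is, and `TOTAL` is a hypothetical output column name.

```r
library(plyr)

# Rows shown in the post; the column name QUANITY is kept as posted
pose <- data.frame(
  DESCRIPTION = c(rep("WHEAT May/10", 4), rep("COTTON NO.2 May/10", 3)),
  QUANITY = c(1, 2, 1, 1, 1, 3, 1),
  CLOSING.PRICE = c(rep(467.75, 4), rep(78.13, 3))
)

# Total quantity per DESCRIPTION; `TOTAL` is a hypothetical column name
totals <- ddply(pose, "DESCRIPTION", summarise, TOTAL = sum(QUANITY))
print(totals)
```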
2011 Apr 13
1
error for ttest
Hello all,
I have arranged my data as per Dennis's suggestion in this post
http://www.mail-archive.com/r-help@r-project.org/msg107156.html.
the posted code works fine but when I try to apply it to my data, i get ">
u2 <- ddply(xxm, .(plateid, cytokine), as.data.frame.function(f))
Error in t.test.formula(conc ~ Self_T1D, data = df, na.rm = T) :
grouping factor must have exactly 2 levels".
Self_T1D has two levels "N" and "Y"
I have used the ddply function to do the mean and sd for the same data...
2011 Aug 23
3
ddply - how to transform df column "in place"
...success.
Given:
d<- data.frame(cbind(x=1,y=seq(20100801,20100830,1)))
names(d)<-c("first", "daterep")
d2<-d
# I can convert the daterep column in place the classic way:
d$daterep<-as.Date(strptime(d$daterep, format="%Y%m%d"))
# How to do it the plyr way?
ddply(d2, c("daterep"), function(df){as.Date(df, format="%Y%m%d")})
# returns: Error in as.Date.default(df, format = "%Y%m%d") :
# do not know how to convert 'df' to class "Date"
Thanks for any hints,
---jean
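Since no grouping is involved here, ddply may not be the right tool; the error arises because the function receives a whole data frame rather than a column. One possible plyr-style alternative is mutate(), sketched below under the assumption that plyr is installed. The data frame is built directly instead of via cbind(), which is a simplification for this sketch.

```r
library(plyr)

d2 <- data.frame(first = 1, daterep = seq(20100801, 20100830, 1))

# No grouping is needed, so mutate() (plyr's analogue of transform())
# is enough; the numeric dates go through as.character() before parsing.
d2 <- mutate(d2, daterep = as.Date(as.character(daterep), format = "%Y%m%d"))
str(d2)
```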
2009 Aug 05
2
using ddply but preserving some of the outside data
I have a bit of a quandary. I'm working with a data set for which I
have sampled sites at a variety of dates. I want to use this data,
and get a running average of the sampled values for the current and
previous date.
I originally thought something like ddply would be ideal for this,
however, I cannot break up my data by date, and then apply a function
that requires information about the previous dates.
I had thought to use a for loop and merge, but that doesn't quite seem
to be working.
So, my questions are twofold
1) Is there a way to use...