Displaying 20 results from an estimated 2000 matches similar to: "a question about "by" and "ddply""
2010 Apr 07
1
unexpected behaviour with ddply and colwise
Hi,
I am confused by results from:
> ddply(aa, names(aa), colwise(sum))
I thought ddply was just calling colwise(sum)() with each column.
However ddply() returns a 13 x 5 result !!
The general result I expected is similar to that of apply() , or
using colwise(sum)() alone. Shouldn't ddply() produce the same ?
Thanks in advance for your help,
- Stuart Andrews
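
A likely explanation, sketched under assumptions: passing names(aa) as the grouping variables makes every distinct row its own group, so ddply() returns one row per unique combination instead of a single row of column sums. The data frame below is hypothetical (the original `aa` is not shown); numcolwise() is the plyr variant of colwise() that keeps only numeric columns.

```r
library(plyr)

# Hypothetical stand-in for `aa`; the post does not show the real data.
aa <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3), y = c(10, 20, 30))

# Column sums over the whole data frame (numeric columns only).
totals <- numcolwise(sum)(aa)

# Column sums within each level of one grouping column.
by_group <- ddply(aa, .(g), numcolwise(sum))
```

Grouping by a single column such as `g` yields one row per group, which is usually what is wanted from ddply() here.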
2010 Dec 06
3
[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function
Dear R-Helpers:
I am trying to use *ddply* to extract min and max of a particular
column in a data.frame. I am using two different forms of the function:
## var_name_to_split is a string -- something like "var1" which is the name
of a column in data.frame
ddply( df, .(as.name(var_name_to_split)), function(x) c(min(x[ , 3] , max(x[
, 3]))) ## fails with an error - case 1
ddply(
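
One possible approach, as a sketch only: ddply() also accepts a character vector of column names as its grouping argument, so a name held in a string can be passed directly without .() or as.name(). The data frame and its column names below are hypothetical.

```r
library(plyr)

# Hypothetical data; the third column holds the values of interest.
df <- data.frame(var1  = c("a", "a", "b", "b"),
                 other = 1:4,
                 value = c(5, 3, 9, 7))

var_name_to_split <- "var1"

# Pass the string itself as the grouping variable.
ranges <- ddply(df, var_name_to_split,
                function(piece) c(min = min(piece[, 3]), max = max(piece[, 3])))
```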
2009 Nov 19
1
ddply function nesting problems
While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. I've tried several approaches, but none worked, and I need to have the ability to include the "cut", "range", and "fullseq" methods within ddply. (For a bit of that explanation refer to
2012 Sep 06
1
use of ddply() within function
Dear all,
I am encountering problems with the application of ddply within the body of a self-defined function.
The script is the following:
moncostcarmoto <- function(costtype){
costaux_result <- data.frame()
for (purp in PURPcount){for (per in PERcount){
costcarin =
2010 Feb 03
1
Calculating subsets "on the fly" with ddply
Hi,
[I sent this to the plyr mailing list (late) last night, but it seems
to be lost in the moderation queue, so here's a shot to the broadeR
community]
Apologies in advance for being more verbose than necessary, but I'm
not even sure how to ask this question in the context of plyr, so ...
here goes.
As meaningless as this might be to do with the `iris` data, the spirit
of it is what
2011 May 11
3
ddply with mean and max...
I'm trying to use ddply to compute summary statistics for many variables,
splitting on the variable site. However, it seems to work fine for mean(), but
if I use max() or min() things fall apart. What's going on?
test.set<-data.frame(site=1:10,x=.Random.seed[1:100],y=rnorm(100))
means<-ddply(test.set,.(site),mean)
means
site x y
1 1 -97459496 -0.14826303
2
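
A probable cause, offered as a sketch: in R versions of that era mean() had a data-frame method that returned one value per column, whereas max() and min() collapse an entire data frame to a single number. Asking for each statistic by column name with summarise avoids relying on either behaviour. The data below replaces .Random.seed with rnorm() purely for illustration.

```r
library(plyr)

set.seed(1)
# Illustrative data with the same shape as test.set in the post.
test.set <- data.frame(site = 1:10, x = rnorm(100), y = rnorm(100))

# One row per site, with the per-site maximum of each column.
maxes <- ddply(test.set, .(site), summarise, x = max(x), y = max(y))
```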
2010 Jun 01
1
data frame manipulation ddply
Dear group,
Here is my data frame:
futures <-
structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
"CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
"LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
"SUGAR NO.11 Jul/10", "SUGAR NO.11
2012 May 05
1
Correct use of ddply with own function
Hi,
I am really confused how ddply work, so maybe you can help me.
I created a function that sorts a vector etc.
fn <- function(x){
x1 <- sort(x)
x2 <- seq(length(x))
x3 <- x2/max(x2)
df <- data.frame(x1,x2,x3)
df
}
Probably this is not the best form of the function, but at least it produces what I want (data to plot a cumulative count curve).
This function works on a
2012 Jul 24
1
Function for ddply
Hello, all. I'm new to R and just beginning to learn to write functions. I
know I'm out of my depth posting here, and I'm sure my issue is mundane.
But here goes. I'm analyzing the American National Election Study (nes),
looking at mean values of a numeric dep_var (environ.therm) across values
of a factor (partyid3). I use ddply from plyr and wtd.mean from Hmisc. The
nes requires a
2011 Jun 21
4
ddply to count frequency of combinations
I have a dataframe df with two columns x and y. I want to count the number
of times a unique x, y combination occurs.
For example
x<- c(1,2,3,4,5,1,2,3,4)
y<- c(1,2,3,4,5,1,2,4,1)
df<-as.data.frame(cbind(x, y))
#what is the correct way to use ddply for this example?
ddply(df, c('x','y'), summarize, ??)
#desired output -- format and order doesn't matter
# (x, y)
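
One way this could be done, as a minimal sketch: group on both columns and count the rows in each group with summarise. The output column name `n` is an arbitrary choice.

```r
library(plyr)

x <- c(1, 2, 3, 4, 5, 1, 2, 3, 4)
y <- c(1, 2, 3, 4, 5, 1, 2, 4, 1)
df <- data.frame(x, y)

# One row per unique (x, y) pair, with the number of occurrences in `n`.
counts <- ddply(df, .(x, y), summarise, n = length(x))
```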
2011 Aug 24
3
ddply from plyr package - any alternatives?
Hello everyone,
I was asked to repost this again, sorry for any inconvenience.
I'm looking for a replacement for the ddply function from the plyr package.
The function applies a function by category stored in one or more columns.
Regular loops or lapplys slow down greatly because my unique combination
count exceeds 9000. Is there any available solution which allows me to apply
a function by category?
2010 Sep 22
2
speeding up regressions using ddply
Hi,
I have a data set that I'd like to run logistic regressions on, using
ddply to speed up the computation of many models with different
combinations of variables. I would like to run regressions on every
unique two-variable combination in a portion of my data set, but I
can't quite figure out how to do this using ddply. The data set looks like
this, with "status" as
2012 Apr 03
1
help in ddply
Hi
I've records like this
df=
x panel
4 1
93 2
21 3
83 4
75 1
87 2
87 3
78 4
50 1
76 2
86 3
65 4
84 1
40 2
39 3
26 4
I want to create a histogram out of it. I want all the mid and count values
for each panel.
My code is
histoutput = ddply(df,.(df[2]),hist)
I'm not able to get the required result.
Please help me.
Using a for loop takes a lot of time if there are more records.
-----
Thanks
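
A sketch of one possible fix: hist() returns a list rather than a data frame, so ddply() cannot stack its results directly. Wrapping it in a small function that returns the mids and counts as a data frame, and grouping by the column name instead of df[2], may give the intended table. The output column names `mid` and `count` are illustrative choices.

```r
library(plyr)

# Data as listed in the post.
df <- data.frame(
  x = c(4, 93, 21, 83, 75, 87, 87, 78, 50, 76, 86, 65, 84, 40, 39, 26),
  panel = rep(1:4, times = 4)
)

# Compute the histogram without plotting and keep mids and counts per panel.
histoutput <- ddply(df, .(panel), function(piece) {
  h <- hist(piece$x, plot = FALSE)
  data.frame(mid = h$mids, count = h$counts)
})
```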
2009 Aug 05
2
using ddply but preserving some of the outside data
I have a bit of a quandary. I'm working with a data set for which I
have sampled sites at a variety of dates. I want to use this data,
and get a running average of the sampled values for the current and
previous date.
I originally thought something like ddply would be ideal for this,
however, I cannot break up my data by date, and then apply a function
that requires information
2011 Apr 25
2
Problem with ddply in the plyr-package: surprising output of a date-column
Hi Together,
I have a problem with the plyr package - more precisely with the ddply
function - and would be very grateful for any help. I hope the example
here is precise enough for someone to identify the problem. Basically,
in this step I want to identify observations that are identical in
terms of certain identifiers (ID1, ID2, ID3) and just want to save
those observations (in this step,
2009 Nov 19
1
Performance of 'by' and 'ddply' on a large data frame
I've only recently started using R. One of the problems I come up
against is that, after having extracted a large dataset (>5M rows) out of
a database, I realize I need another variable. In this case I have a data
frame with dates. I want to find the minimum date for each value of x1
and add that minimum date to my data.frame.
> randomdf <- function(p) {
data.frame(x1=sample(1:10^4, 10^p,
2011 Aug 23
3
ddply - how to transform df column "in place"
Dear R-users,
I am trying to get the plyr syntax right, without much success.
Given:
d<- data.frame(cbind(x=1,y=seq(20100801,20100830,1)))
names(d)<-c("first", "daterep")
d2<-d
# I can convert the daterep column in place the classic way:
d$daterep<-as.Date(strptime(d$daterep, format="%Y%m%d"))
# How to do it the plyr way?
ddply(d2,
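
One possible plyr-style answer, as a sketch: since no grouping is involved, ddply() is not strictly needed, and plyr's mutate() can rewrite the column in a single call. The data frame is rebuilt here so the example is self-contained.

```r
library(plyr)

# Same shape as d2 in the post: a constant column and yyyymmdd numbers.
d2 <- data.frame(first = 1, daterep = seq(20100801, 20100830, 1))

# Convert the numeric yyyymmdd values to Date in place.
d2 <- mutate(d2, daterep = as.Date(as.character(daterep), format = "%Y%m%d"))
```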
2012 Aug 07
2
Error using ddply inside user-defined function
Hi All,
I *think* it's ddply because the function recognizes vr1, etc, in other
parts of the function.
Here's some code:
# create dataset
PROV.PM.FBCTS <- c(0.00 ,0.00, 33205.19, 25994.56, 23351.37, 26959.56
,27632.58, 26076.24, 0.00, 0.00 , 6741.42, 18665.09 ,18129.59 ,21468.39
,21294.60 ,22764.82, 26076.73)
FBCTS.INV.TOT <- c(0 , 0, 958612, 487990, 413344, 573347,
2011 Oct 01
1
error using ddply to generate means
Dear list,
I encounter an error when I try to use ddply to generate means as follows:
fun3<-structure(list(sector = structure(list(gics_sector_name = c("Financials",
"Financials", "Materials", "Materials")), .Names = "gics_sector_name",
row.names = structure(c("UBSN VX Equity",
"LLOY LN Equity", "AI FP Equity",
2012 Feb 15
2
function similar to ddply? + calculations based on previous row
Hi all,
I was wondering if there is a function kind of similar that splits a
dataframe, applies a function to each row and returns in a data frame. I
know ddply but this one isn't useful in this situation.
I have a dataframe with values for each day (rows) for different objects
(columns). I have values for several years. Now, I want to do calculations
on only the data of that year. With the