Displaying 20 results from an estimated 4000 matches similar to: "New PLYR issue"
2011 Aug 24
3
ddply from plyr package - any alternatives?
Hello everyone,
I was asked to repost this again, sorry for any inconvenience.
I'm looking for a replacement for the ddply function from the plyr package.
The function allows applying a function by category stored in any column or columns.
Regular loops or lapply calls slow down greatly because my unique combination
count exceeds 9000. Is there any available solution that would allow me to apply
a function by category?
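A minimal sketch of one common alternative (not from the original thread): data.table's by-group aggregation copes well with thousands of groups. The grp/val column names and the toy data are made up for illustration.
library(data.table)
dat <- data.frame(grp = sample(1:9000, 1e5, replace = TRUE),
                  val = rnorm(1e5))
dt <- as.data.table(dat)
# grouped aggregation; by= handles very large group counts quickly
res <- dt[, .(mean_val = mean(val), n = .N), by = grp]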
2009 Apr 03
3
plyr and table question
Dear all,
I'm puzzled by the following example inspired by a recent question on
R-help,
cc <- textConnection("user_id website time
20 google 0930
21 yahoo 0935
20 facebook 1000
25 facebook 1015
61 google 0940")
d <- read.table(cc, header = TRUE); close(cc)
table(d$user_id) # count the
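The snippet breaks off at the table() call; for comparison, a hedged sketch of the plyr way to get the same per-user count, using the d built in the snippet above:
library(plyr)
visits <- ddply(d, .(user_id), summarise, n_visits = length(website))
visits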
2010 Sep 16
2
parallel computation with plyr 1.2.1
Hi,
I have been trying to use the new .parallel argument with the most recent
version of plyr [1] to speed up some tasks. I can run the example in the NEWS
file [1], and it seems to be working correctly. However, R will only use a
single core when I try to apply this same approach with ddply().
1. http://cran.r-project.org/web/packages/plyr/NEWS
Watching my CPUs I see that in both cases
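The excerpt stops before a resolution. As a hedged sketch rather than the poster's actual setup: ddply's .parallel argument relies on a registered foreach backend, and doParallel (assumed here; the post does not name a backend) is one way to provide it.
library(plyr)
library(doParallel)
cl <- makeCluster(2)          # two worker processes
registerDoParallel(cl)
d <- data.frame(g = rep(1:4, each = 250), x = rnorm(1000))
res <- ddply(d, .(g), summarise, m = mean(x), .parallel = TRUE)
stopCluster(cl)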
2013 Aug 27
1
[plyr] Moving average filter with plyr
Dear all,
I'm stuck with a problem using plyr to process a rather large chunk of data. What I'm trying to do is apply a moving average to all the subparts of the data frame (the example data can be found here https://dl.dropboxusercontent.com/u/2414056/testData.Rdata).
require(plyr)
load("testData.Rdata")
# 5-point moving average using stats::filter
applyfilter <- function(x) {
  return(filter(x, rep(1/5, times = 5)))
}
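The real testData.Rdata is not reproduced here, so the following is only a sketch of applying that filter per subgroup with ddply; the group and value column names are stand-ins for the actual ones.
library(plyr)
applyfilter <- function(x) {
  as.numeric(stats::filter(x, rep(1/5, times = 5)))   # 5-point moving average
}
demo <- data.frame(group = rep(c("a", "b"), each = 50),
                   value = rnorm(100))
smoothed <- ddply(demo, .(group), transform,
                  value_smooth = applyfilter(value))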
2010 Dec 06
3
[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function
Dear R-Helpers:
I am trying to use *ddply* to extract the min and max of a particular
column in a data.frame. I am using two different forms of the function:
## var_name_to_split is a string -- something like "var1" which is the name
of a column in data.frame
ddply(df, .(as.name(var_name_to_split)), function(x) c(min(x[, 3]),
max(x[, 3]))) ## fails with an error - case 1
ddply(
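The second call is cut off, but for case 1 a working route is to skip .(as.name(...)) entirely: ddply also accepts the column name as a plain character string. A hedged sketch on toy data:
library(plyr)
df <- data.frame(var1 = rep(c("a", "b"), each = 5),
                 x    = 1:10,
                 y    = rnorm(10))
var_name_to_split <- "var1"
ddply(df, var_name_to_split, function(x) {
  c(min = min(x[, 3]), max = max(x[, 3]))
})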
2013 Apr 03
5
Can package plyr also calculate the mode?
I am trying to replicate SAS PROC UNIVARIATE in R. I got most of the
stats I needed for a by-group summary of a data frame using:
all1 <- ddply(all, "ACT_NAME", summarise, mean = mean(COUNTS), sd = sd(COUNTS),
              q25 = quantile(COUNTS, .25), median = quantile(COUNTS, .50),
              q75 = quantile(COUNTS, .75),
              q90 = quantile(COUNTS, .90), q95 = quantile(COUNTS, .95),
              q99 = quantile(COUNTS, .99))
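plyr has no built-in statistical mode, but a small helper can be dropped into the same summarise() call. A hedged sketch; the all data frame below is a made-up stand-in with the columns named in the post.
library(plyr)
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]   # most frequent value, first on ties
}
all <- data.frame(ACT_NAME = rep(c("a", "b"), each = 50),
                  COUNTS   = rpois(100, 10))
all1 <- ddply(all, "ACT_NAME", summarise,
              mean = mean(COUNTS),
              mode = stat_mode(COUNTS))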
2011 Apr 25
2
Problem with ddply in the plyr-package: surprising output of a date-column
Hi everyone,
I have a problem with the plyr package - more precisely with the ddply
function - and would be very grateful for any help. I hope the example
here is precise enough for someone to identify the problem. Basically,
in this step I want to identify observations that are identical in
terms of certain identifiers (ID1, ID2, ID3) and just want to save
those observations (in this step,
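The excerpt stops before the surprising output itself. As a hedged guess at the symptom: some plyr versions of that era could return a Date column as plain numbers (days since 1970-01-01); if that is what happened, the class can be restored afterwards. A sketch with toy data using the ID1/ID2/ID3 names from the post:
library(plyr)
d <- data.frame(ID1  = c(1, 1, 2),
                ID2  = c("a", "a", "b"),
                ID3  = c(1, 1, 1),
                date = as.Date(c("2011-04-01", "2011-04-02", "2011-04-03")))
res <- ddply(d, .(ID1, ID2, ID3), summarise, first_date = min(date))
# if first_date comes back as a bare number, convert it back to Date
res$first_date <- as.Date(res$first_date, origin = "1970-01-01")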
2010 Apr 29
1
Using plyr::ddply more (memory) efficiently?
Hi all,
In short:
I'm running ddply on an admittedly somewhat large data.frame (though not
that large). It runs fine until it finishes and gets to the
"collating" part where all subsets of my data.frame have been
summarized and they are being reassembled into the final summary
data.frame (sorry, don't know the correct plyr terminology). During
collation, my R workspace RAM usage goes
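The message is cut off mid-description. One option plyr itself offers for reducing copying is its (experimental) immutable data frame; whether it helps with the collation-stage memory spike here is an assumption, not something the post confirms. A sketch modelled on the idata.frame() help page:
library(plyr)
data(baseball)
# compare a plain data frame with the immutable wrapper
system.time(dlply(baseball, "id", nrow))
system.time(dlply(idata.frame(baseball), "id", nrow))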
2009 Aug 18
1
Plyr and memory allocation issue
Dear R users
I am trying to create some new variables for a 4401 x 30 dataframe using
ddply and transform. The "id" variable I am using is a factor with 1330
levels, e.g.
bb <- function(df) {
  transform(df,
            years   = study.year - min(study.year) + 1,
            periods = length(study.year))
}
test <- ddply(x, .(id), bb)
I haven't copied the data to avoid clogging the
2009 Sep 25
2
summarize-plyr package
Hi, I am using the amazing package 'plyr'. I have one problem. I would
appreciate help fixing the following error. Thanks.
______________________________
> library(plyr)
> data(baseball)
> summarise(baseball,
+ duration = max(year) - min(year),
+ nteams = length(unique(team)))
Error: could not find function "summarise"
> ddply(baseball, "id", summarise,
+
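The ddply call is cut off; presumably it was heading for the same two summaries computed per player id. With a current plyr (where summarise is exported) the full call would look along these lines:
library(plyr)
data(baseball)
ddply(baseball, "id", summarise,
      duration = max(year) - min(year),
      nteams   = length(unique(team)))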
2011 Apr 21
1
Stymied by plyr
Hello, This is my first time trying to use plyr, and I'm getting
nowhere. I have teacher ratings data (1:4), on 10 components, by
external observers and internal observers, in schools in areas. I want
to calculate the percentage of each rating given on each component, by
each type of observer, within each school, within each area. The data
look like this:
unit area ext.obs rating comp
11
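The data excerpt is cut off after the header row, so the sketch below reuses the column names from the description (unit, area, ext.obs, rating, comp) on made-up data: one way to get the percentage of each rating within area / unit / observer type / component.
library(plyr)
ratings <- data.frame(unit    = rep(1:2, each = 20),
                      area    = 1,
                      ext.obs = rep(c(TRUE, FALSE), 20),
                      comp    = sample(1:10, 40, replace = TRUE),
                      rating  = sample(1:4, 40, replace = TRUE))
pct_by_rating <- ddply(ratings, .(area, unit, ext.obs, comp), function(x) {
  tab <- prop.table(table(factor(x$rating, levels = 1:4))) * 100
  data.frame(rating = as.integer(names(tab)), pct = as.numeric(tab))
})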
2011 Oct 12
3
Applying function to only numeric variable (plyr package?)
My data frame consists of character variables, factors, and proportions,
something like
c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y","N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
pct <- function(x) round(100*x, 1)
I want to
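The request is cut off, but if the goal is to apply pct() only to the numeric columns, plyr's numcolwise() wraps a function so it touches numeric columns and nothing else. A hedged sketch continuing the example above:
library(plyr)
c1  <- c("A", "B", "C", "C")
c2  <- factor(c(1, 1, 2, 2), labels = c("Y", "N"))
x   <- c(0.5234, 0.6919, 0.2307, 0.1160)
y   <- c(0.9251, 0.7616, 0.3624, 0.4462)
df  <- data.frame(c1, c2, x, y)
pct <- function(x) round(100 * x, 1)
numcolwise(pct)(df)                      # just the numeric columns, scaled
df[c("x", "y")] <- numcolwise(pct)(df)   # or write them back into df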
2011 Apr 27
3
MASS fitdistr with plyr or data.table?
I am trying to extract the shape and scale parameters of a wind speed
distribution for different sites. I can do this in a clunky way, but
I was hoping to find a way using data.table or plyr. However, when I
try I am met with the following:
library(data.table)
set.seed(144)
weib.dist <- rweibull(10000, shape = 3, scale = 8)
weib.test <- data.table(cbind(1:10, weib.dist))
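The error itself is cut off. For the plyr route, a hedged sketch of fitting a Weibull per group with MASS::fitdistr; the site/speed column names are made up, since the post's real layout is not shown.
library(MASS)
library(plyr)
set.seed(144)
wind <- data.frame(site  = rep(1:10, each = 1000),
                   speed = rweibull(10000, shape = 3, scale = 8))
params <- ddply(wind, .(site), function(x) {
  fit <- fitdistr(x$speed, densfun = "weibull")
  data.frame(shape = fit$estimate[["shape"]],
             scale = fit$estimate[["scale"]])
})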
2011 Aug 10
1
Sequential Naming of ggplot .pngs using plyr
If I have data:
dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20), d = rnorm(20),
                  site = rep(letters[5:8], each = 5))
And want to plot like this:
ctr <- 1
for (i in c('a', 'b', 'c', 'd')) {
  png(file = paste('/tmp/plot_number_', ctr, '.png', sep = ''), height = 8.5,
      width = 11, units = 'in', pointsize = 9, res = 300)
print(ggplot(dat[,names(dat) %in%
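The loop is cut off mid-call. One hedged way to get sequential file names without a hand-maintained ctr is to loop over an index and build the name with sprintf; the ggplot call below is simplified because the original one is truncated.
library(ggplot2)
dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20), d = rnorm(20),
                  site = rep(letters[5:8], each = 5))
vars <- c("a", "b", "c", "d")
for (i in seq_along(vars)) {
  png(file = sprintf("/tmp/plot_number_%d.png", i),
      height = 8.5, width = 11, units = "in", pointsize = 9, res = 300)
  print(ggplot(dat, aes_string(x = "site", y = vars[i])) + geom_boxplot())
  dev.off()
}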
2011 Sep 03
2
problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome
Dear R experts.
I might be missing something obvious. I have been trying to fix this problem
for some weeks. Please help.
#data
ped <- c(rep(1, 4), rep(2, 3), rep(3, 3))
y <- rnorm(10, 8, 2)
# variable set 1
M1a <- sample(c(1, 2, 3), 10, replace = TRUE)
M1b <- sample(c(1, 2, 3), 10, replace = TRUE)
M1aP1 <- sample(c(1, 2, 3), 10, replace = TRUE)
M1bP2 <- sample(c(1, 2, 3), 10, replace = TRUE)
2012 Mar 28
1
Why does this work? plyr within-subset normalization
Working code that normalizes each row's value against the subset's maximum.
Does the invocation of max() somehow instruct R to 'step back' and evaluate
the subset?
Thanks, Zack
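The code being asked about is not quoted in the excerpt, but the "step back" impression matches how ddply works: it splits the data frame by group, evaluates the expression once per piece, and reassembles the pieces, so max() only ever sees the rows of the current subset. A hedged sketch with made-up column names:
library(plyr)
dat <- data.frame(group = rep(c("a", "b"), each = 5),
                  value = c(1:5, 10 * (1:5)))
normalized <- ddply(dat, .(group), transform,
                    norm = value / max(value))   # max() is per group here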
2012 Jul 11
1
do I need plyr, apply or something else?
Dear all,
This is what I'd like to do (I have an implementation using for loops, which I designed before I realised just how slow R is at executing them - this process currently takes days to run).
I have a large dataframe containing corporate bond data, columns are:
BondID
Date (goes back 5 years)
Var1
Var2
Term2Maturity
What I want to do is this:
1) For each bond, at each given date,
2011 Nov 13
1
New PLYR issue
Issue with plyr.
I am now using R 2.14; this data and plyr command line worked with 2.13.
I am also loading the same saved data that worked previously, but now
there is some issue.
> library(plyr)
> UNESCO <- dget('C:/Carbon-GJ/BZE_ecosys.robj')
> df2 <- ddply(df, "UNESCO", summarise, total_ha = sum(Ha))
Error in if (empty(.data)) return(.data) :
missing value where
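The traceback is cut off, but one detail in the snippet stands out: the loaded data are assigned to UNESCO while ddply() is called on df, so ddply may not be seeing the intended data frame at all. A hedged sketch with the names lined up, on a toy stand-in (assuming the real object has columns UNESCO and Ha, as the original call implies):
library(plyr)
UNESCO <- data.frame(UNESCO = c("mangrove", "mangrove", "seagrass"),
                     Ha     = c(10, 20, 5))
df2 <- ddply(UNESCO, "UNESCO", summarise, total_ha = sum(Ha))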
2011 May 17
1
Subsetting depth profiles based on maximum depth by group with plyr
Hello,
Apologies for a similar earlier post. I didn't include enough details in
that one.
I am having a little trouble subsetting some data based on a grouping
variable. I am using an instrument that does depth profiles of a water
column. The instrument records on the way down as well as the way up. So
thanks to an off-list reply I can subset the data so that all data collected
at the
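The description is cut off, but trimming each profile to the rows recorded on the way down is a natural ddply job: keep rows up to the maximum depth within each cast. A hedged sketch with made-up cast/depth column names:
library(plyr)
profiles <- data.frame(cast  = rep(1:2, each = 7),
                       depth = c(0, 5, 10, 15, 10, 5, 0,
                                 0, 8, 16, 24, 16, 8, 0))
downcasts <- ddply(profiles, .(cast), function(x) {
  x[seq_len(which.max(x$depth)), ]   # rows up to and including max depth
})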
2013 Apr 20
7
Reshape or Plyr?
Hi all,
I have relative abundance data from >100 sites. This is from acoustic
monitoring; usually the data cover 2-3 nights, but in some cases they may
be longer, like months or years, for each location.
The data output from my management database is provided by species by
night for each location, so the data frame would look like the one below. What I
need to do is sum the Survey_time by Spec_Code for
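The sentence is cut off, so the exact grouping is a guess; a hedged sketch that sums Survey_time by Spec_Code within each location, on made-up data (the Location column is an assumption):
library(plyr)
bats <- data.frame(Location    = rep(c("site1", "site2"), each = 6),
                   Spec_Code   = rep(c("MYLU", "EPFU", "LANO"), 4),
                   Survey_time = runif(12, 0, 60))
totals <- ddply(bats, .(Location, Spec_Code), summarise,
                total_time = sum(Survey_time))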