similar to: New PLYR issue

Displaying 20 results from an estimated 4000 matches similar to: "New PLYR issue"

2011 Aug 24
3
ddply from plyr package - any alternatives?
Hello everyone, I was asked to repost this, sorry for any inconvenience. I'm looking for a replacement for the ddply function from the plyr package. The function applies a function by category, where the categories are stored in one or more columns. Regular loops or lapply calls slow down greatly because my count of unique combinations exceeds 9000. Is there any available solution that allows me to apply a function by category?
2009 Apr 03
3
plyr and table question
Dear all, I'm puzzled by the following example inspired by a recent question on R-help, cc <- textConnection("user_id website time 20 google 0930 21 yahoo 0935 20 facebook 1000 25 facebook 1015 61 google 0940") d <- read.table(cc, head=T) ; close(cc) table(d$user_id) # count the
2010 Sep 16
2
parallel computation with plyr 1.2.1
Hi, I have been trying to use the new .parallel argument with the most recent version of plyr [1] to speed up some tasks. I can run the example in the NEWS file [1], and it seems to be working correctly. However, R will only use a single core when I try to apply this same approach with ddply(). 1. http://cran.r-project.org/web/packages/plyr/NEWS Watching my CPUs I see that in both cases
2013 Aug 27
1
[plyr] Moving average filter with plyr
Dear all, I'm stuck with a problem using plyr to process a rather large chunk of data. What I'm trying to do is apply a moving average to all the subparts of the dataframe (the example data can be found here https://dl.dropboxusercontent.com/u/2414056/testData.Rdata). require(plyr) load("testData.Rdata") applyfilter<-function(x){ return(filter(x,rep(1/5, times=5))) }
2010 Dec 06
3
[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function
Dear R-Helpers: I am trying to use *ddply* to extract the min and max of a particular column in a data.frame. I am using two different forms of the function: ## var_name_to_split is a string -- something like "var1" which is the name of a column in data.frame ddply( df, .(as.name(var_name_to_split)), function(x) c(min(x[ , 3]), max(x[ , 3]))) ## fails with an error - case 1 ddply(
2013 Apr 03
5
Can package plyr also calculate the mode?
I am trying to replicate the SAS proc univariate in R. I got most of the stats I needed for a by grouping in a data frame using: all1 <- ddply(all,"ACT_NAME", summarise, mean=mean(COUNTS), sd=sd(COUNTS), q25=quantile(COUNTS,.25),median=quantile(COUNTS,.50), q75=quantile(COUNTS,.75), q90=quantile(COUNTS,.90), q95=quantile(COUNTS,.95), q99=quantile(COUNTS,.99) )
2011 Apr 25
2
Problem with ddply in the plyr-package: surprising output of a date-column
Hi all, I have a problem with the plyr package - more precisely with the ddply function - and would be very grateful for any help. I hope the example here is precise enough for someone to identify the problem. Basically, in this step I want to identify observations that are identical in terms of certain identifiers (ID1, ID2, ID3) and just want to save those observations (in this step,
2010 Apr 29
1
Using plyr::ddply more (memory) efficiently?
Hi all, In short: I'm running ddply on an admittedly (somewhat) large data.frame (not that large). It runs fine until it finishes and gets to the "collating" part where all subsets of my data.frame have been summarized and they are being reassembled into the final summary data.frame (sorry, don't know the correct plyr terminology). During collation, my R workspace RAM usage goes
2009 Aug 18
1
Plyr and memory allocation issue
Dear R users, I am trying to create some new variables for a 4401 x 30 dataframe using ddply and transform. The "id" variable I am using is a factor with 1330 levels, e.g. bb <- function(df) {transform(df, years = study.year - min(study.year) + 1, periods = length(study.year) )} test <- ddply(x,.(id),bb) I haven't copied the data to avoid clogging the
2009 Sep 25
2
summarize-plyr package
Hi, I am using the amazing package 'plyr'. I have one problem. I would appreciate help to fix the following error: Thanks. ______________________________ > library(plyr) > data(baseball) > summarise(baseball, + duration = max(year) - min(year), + nteams = length(unique(team))) Error: could not find function "summarise" > ddply(baseball, "id", summarise, +
2011 Apr 21
1
Stymied by plyr
Hello, This is my first time trying to use plyr, and I'm getting nowhere. I have teacher ratings data (1:4), on 10 components, by external observers and internal observers, in schools in areas. I want to calculate the percentage of each rating given on each component, by each type of observer, within each school, within each area. The data look like this: unit area ext.obs rating comp 11
2011 Oct 12
3
Applying function to only numeric variable (plyr package?)
My data frame consists of character variables, factors, and proportions, something like c1 <- c("A", "B", "C", "C") c2 <- factor(c(1, 1, 2, 2), labels = c("Y","N")) x <- c(0.5234, 0.6919, 0.2307, 0.1160) y <- c(0.9251, 0.7616, 0.3624, 0.4462) df <- data.frame(c1, c2, x, y) pct <- function(x) round(100*x, 1) I want to
2011 Apr 27
3
MASS fitdistr with plyr or data.table?
I am trying to extract the shape and scale parameters of a wind speed distribution for different sites. I can do this in a clunky way, but I was hoping to find a way using data.table or plyr. However, when I try I am met with the following: set.seed(144) weib.dist<-rweibull(10000,shape=3,scale=8) weib.test<-data.table(cbind(1:10,weib.dist))
2011 Aug 10
1
Sequential Naming of ggplot .pngs using plyr
If I have data: dat<-data.frame(a=rnorm(20),b=rnorm(20),c=rnorm(20),d=rnorm(20),site=rep(letters[5:8],each=5)) And want to plot like this: ctr<-1 for(i in c('a','b','c','d')){ png(file=paste('/tmp/plot_number_',ctr,'.png',sep=''),height=8.5, width=11,units='in',pointsize=9,res=300) print(ggplot(dat[,names(dat) %in%
2011 Sep 03
2
Problem applying a function to data subsets (by level) - plyr or other alternatives are also welcome
Dear R experts. I might be missing something obvious. I have been trying to fix this problem for some weeks. Please help. #data ped <- c(rep(1, 4), rep(2, 3), rep(3, 3)) y <- rnorm(10, 8, 2) # variable set 1 M1a <- sample (c(1, 2,3), 10, replace= T) M1b <- sample (c(1, 2,3), 10, replace= T) M1aP1 <- sample (c(1, 2,3), 10, replace= T) M1bP2 <- sample (c(1, 2,3), 10, replace= T)
2012 Mar 28
1
Why does this work? plyr within-subset normalization
Working code that normalizes each row's value against the subset's maximum. Does the invocation of max() somehow instruct R to 'step back' and evaluate the subset? Thanks, Zack
2012 Jul 11
1
do I need plyr, apply or something else?
Dear all, This is what I'd like to do (I have an implementation using for loops, which I designed before I realised just how slow R is at executing them - this process currently takes days to run). I have a large dataframe containing corporate bond data; the columns are: BondID Date (goes back 5 years) Var1 Var2 Term2Maturity What I want to do is this: 1) For each bond, at each given date,
2011 Nov 13
1
New PLYR issue
Issue with PLYR. I am now using R 2.14; this data and plyr command line worked with 2.13. I am also loading the same saved data that worked previously, but now there is an issue. > library(plyr) > UNESCO <- dget('C:/Carbon-GJ/BZE_ecosys.robj') > df2 <- ddply(df, "UNESCO", summarise, total_ha = sum(Ha)) *Error in if (empty(.data)) return(.data) : missing value where
2011 May 17
1
Subsetting depth profiles based on maximum depth by group with plyr
Hello, Apologies for a similar earlier post. I didn't include enough details in that one. I am having a little trouble subsetting some data based on a grouping variable. I am using an instrument that does depth profiles of a water column. The instrument records on the way down as well as the way up. So thanks to an off-list reply I can subset the data so that all data collected at the
2013 Apr 20
7
Reshape or Plyr?
Hi all, I have relative abundance data from >100 sites. This is from acoustic monitoring, and usually the data is for 2-3 nights, but in some cases it may be longer, like months or years, for each location. The data output from my management database is provided by species by night for each location, so the data frame would look like the one below. What I need to do is sum the Survey_time by Spec_Code for