search for: daply

Displaying 20 results from an estimated 21 matches for "daply".

2010 Sep 09
1
Strange output daply with empty strata
Dear list, I get some strange results with daply from the plyr package. In the example below, the average age per municipality is calculated for employed and unemployed respondents. If I do this using tapply (see code below) I get the following result:
         no       yes
A        NA  36.94931
B  51.22505  34.24887
C  48.05759  51.00198
If I do this using...
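A minimal sketch of the comparison described above, with made-up data (municipality A has no unemployed respondents, so that stratum is empty):

library(plyr)

d <- data.frame(
  municipality = c("A", "A", "B", "B", "C", "C"),
  employed     = c("yes", "yes", "no", "yes", "no", "yes"),
  age          = c(36, 38, 51, 34, 48, 51)
)

# tapply keeps the empty stratum and reports NA for it
tapply(d$age, list(d$municipality, d$employed), mean)

# the same split with daply; older plyr versions could mishandle the empty stratum
daply(d, .(municipality, employed), function(x) mean(x$age))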
2012 Apr 10
1
plyr: set '.progress' argument to default to "text"
Dear all, Is it possible to globally set the option .progress = "text" for all the apply functions in 'plyr'? For example, the current default is daply(..., .progress = "none"). I would like to set it to daply(..., .progress = "text"), so as to avoid writing the argument every time I call such a function. I looked into ?daply and ?create_progress_bar without much luck. Regards Liviu -- Do you know how to read? http://www.al...
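There is no global option for this in plyr as far as I know; one workaround (a sketch, with hypothetical wrapper names) is to mask the plyr functions you use with thin wrappers whose .progress default is "text":

library(plyr)

# hypothetical wrappers: same arguments as the plyr originals, different default
daply_p <- function(..., .progress = "text") plyr::daply(..., .progress = .progress)
ddply_p <- function(..., .progress = "text") plyr::ddply(..., .progress = .progress)

# usage: daply_p(mydata, .(group), nrow)   # shows a text progress bar by default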
2011 Mar 11
1
dataframe to a timeseries object
I'm wondering which is the most efficient (in time, then memory usage) way to obtain a multivariate time series object from a data frame (the easiest data structure to get data from a database through RODBC). I have a starting point using the timeSeries or xts library (these libraries can handle time zones); below you can find code to test. Parallelizing the merge (cbind) is something I'm thinking about
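A minimal sketch of one common route (data frame to xts), assuming a DATE column that is already POSIXct and one VALUE column per series; the timeSeries class works analogously:

library(xts)

df <- data.frame(
  DATE  = as.POSIXct(c("2011-03-01 10:00:00", "2011-03-01 11:00:00"), tz = "GMT"),
  VALUE = c(1.2, 1.5)
)

# one series; several series built this way can then be combined with cbind()/merge()
ts1 <- xts(df$VALUE, order.by = df$DATE, tzone = "GMT")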
2010 Aug 25
3
frequency, count rows, data for heat map
Hi all, I have read posts on heat map creation but I am one step prior -- here is what I am trying to do; do you have any tips? We are trying to map sequence reads from tumors to viral genomes. Example input file:
111 abc
111 sdf
111 xyz
1079 abc
1079 xyz
1079 xyz
5576 abc
5576 sdf
5576 sdf
How many xyz's are there for 1079 and 111? How many abc's, etc?
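A sketch of the counting step in base R, using made-up column names (id for the read/tumor identifier, genome for the label); the resulting matrix can then be fed to heatmap() or image():

reads <- data.frame(
  id     = c(111, 111, 111, 1079, 1079, 1079, 5576, 5576, 5576),
  genome = c("abc", "sdf", "xyz", "abc", "xyz", "xyz", "abc", "sdf", "sdf")
)

# contingency table: rows = id, columns = genome, cells = how often each pair occurs
counts <- table(reads$id, reads$genome)
counts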
2011 Apr 04
3
How to speed up grouping time series, help please
...)
  else
    # create xx in env
    assign("xx", timeSeries(x$VALUE, x$DATE, format = '%Y-%m-%d %H:%M:%S',
                            zone = 'GMT', units = as.character(x$ID[1])),
           envir = env)
  return(TRUE)
  }
}

# use package plyr, faster than 'by' function
tsDaply <- function(...) {
  library(plyr)
  e1 <- new.env(parent = baseenv())  # create a new env
  res <- daply(X, "ID", buildTimeSeriesFromDataFrame, env = e1)
  return(get("xx", e1))  # return xx from env
}

## replicate 100 times
# Time03 <- replicate(100,
#   system.ti...
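A hedged alternative sketch to the environment trick above: split the data frame by ID (the X/ID/VALUE/DATE names follow the thread's code), build each series directly (xts is used here purely for illustration in place of timeSeries), and merge once at the end:

library(xts)

tsList <- lapply(split(X, X$ID), function(x)
  xts(x$VALUE, order.by = as.POSIXct(x$DATE, tz = "GMT")))

# one multivariate series, one column per ID
merged <- do.call(cbind, tsList)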
2008 Sep 30
0
New package: plyr
...rst letter) and output (second letter):
* llply = from a list to a list
* alply = from an array (or vector, or matrix) to a list
* ldply = from a list to a data.frame
* d_ply = from a data.frame, ignore output
* and so on for llply, laply, ldply, l_ply, alply, aaply, adply, a_ply, dlply, daply, ddply, d_ply
plyr also provides:
* m*ply which works in a similar way to mapply
* r*ply which works in a similar way to replicate
You can find out more at http://had.co.nz/plyr/, including a 20 page introductory guide, http://had.co.nz/plyr/plyr-intro.pdf. Regards, Hadley -- http://had....
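A few concrete instances of this input/output naming scheme (a sketch on the built-in mtcars data, not taken from the announcement itself):

library(plyr)

# data.frame in ...
ddply(mtcars, .(cyl), summarise, mean_mpg = mean(mpg))     # ... data.frame out
daply(mtcars, .(cyl), function(d) mean(d$mpg))             # ... array out
dlply(mtcars, .(cyl), function(d) lm(mpg ~ wt, data = d))  # ... list out

# list (or vector) in, array out
laply(1:3, function(i) i^2)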
2008 Oct 02
0
[solutions] "tapply versus by" in function with more than 1 arguments
...tapply(rownames(dataf), dataf$class, function(r) with(dataf[r, ], cor(V1, V2))) is SIX times slower than sapply
#-----------------------------------------------------
# Solution(s) 4:
install.packages("plyr")
library(plyr)
ddply(dataf, .(class), function(df) data.frame(cor(df[, 1:2])))
daply(dataf, .(class), function(df) cor(df[, 1:2]))
dlply(dataf, .(class), function(df) cor(df[, 1:2]))
# Interesting library, but not a good fit for my example - I need a vector of results
# [only cor(V1,V2) and not cor(V1,V1)], that is, I don't want rectangular objects
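Since the poster wants a plain vector (one cor(V1, V2) per class) rather than a rectangular object, a base-R sketch of that shape of result:

# named numeric vector, one entry per class
cors <- sapply(split(dataf, dataf$class), function(d) cor(d$V1, d$V2))
cors

# or, staying with plyr, return a single number per group:
# daply(dataf, .(class), function(d) cor(d$V1, d$V2))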
2012 Jan 12
1
parallel computation in plyr 1.7
Dear all, I have a question regarding the possibility of parallel computation in plyr version 1.7. The help files of the following functions mention the argument '.parallel': ddply, aaply, llply, daply, adply, dlply, alply, ldply, laply. However, the help files of the following functions do not mention this argument: ?d_ply, ?a_ply, ?l_ply. Is it because parallel computation is not supported for these latter functions? Or is it just because the documentation was not updated for these functions afte...
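For the functions that do document .parallel, usage typically looks like the sketch below (assuming the doParallel backend; any registered foreach backend works the same way):

library(plyr)
library(doParallel)

registerDoParallel(cores = 2)

res <- ddply(mtcars, .(cyl),
             function(d) data.frame(mean_mpg = mean(d$mpg)),
             .parallel = TRUE)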
2009 Apr 15
0
plyr version 0.1.7
...a - hopefully this will make problems easier to track down
Speed-ups
* massive speed ups for splitting large arrays
* fixed typo that was causing a 50% speed penalty for d*ply
* rewritten rbind.fill is considerably (> 4x) faster for many data frames
* colwise about twice as fast
Bug fixes:
* daply: now works when the data frame is split by multiple variables
* aaply: now works with vectors
* ddply: first variable now varies slowest as you'd expect
plyr 0.1.5 (2008-02-23)
---------------------------------------------------
* colwise now accepts a quoted list as its second argument. Th...
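The first bug fix above (splitting by multiple variables) corresponds to calls of the following shape, sketched here on mtcars rather than on the original report:

library(plyr)

# split by two variables; the result is a 2-dimensional array (cyl x am)
daply(mtcars, .(cyl, am), function(d) mean(d$mpg))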
2010 Sep 10
0
plyr: version 1.2
...% less time (or about 20% less time than lapply). Note that as a whole, llply still has much more overhead than lapply.
* round_any now lives in plyr instead of reshape
BUG FIXES
* list_to_array works correctly even when there are missing values in the array. This is particularly important for daply.
-- Assistant Professor / Dobelman Family Junior Chair, Department of Statistics / Rice University, http://had.co.nz/
2011 Nov 01
1
Counting entries to create a new table
Hi, I am an R novice and I am trying to do something that seems like it should be fairly simple, but I can't quite figure it out and I must not be using the right words when I search for answers. I have a dataset with a number of individuals and observations for each day (7 possible codes plus missing data), so it looks something like this:
Individual A, B, C, D
Day 1: 1, 1, 1, 1
Day 2: 1, 3, 4, 2
Day 3:
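A base-R sketch of one way to count the codes, assuming the data sit in a data frame with one column per individual and one row per day (the object and column names are made up):

obs <- data.frame(A = c(1, 1, 2), B = c(1, 3, 2), C = c(1, 4, NA), D = c(1, 2, 2))

# counts of each code (1..7) per individual; NAs are simply not counted here
code_counts <- sapply(obs, function(x) table(factor(x, levels = 1:7)))
code_counts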
2013 Jan 24
2
Question on matrix calculation
Hello again, let's say I have 1 matrix and 1 data frame:
> mat <- matrix(1:15, 5)
> match_df <- data.frame(Seq = 1:5, criteria = sample(letters[1:5], 5, replace = T))
> mat
     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    2    7   12
[3,]    3    8   13
[4,]    4    9   14
[5,]    5   10   15
> match_df
  Seq criteria
1   1        c
2   2        e
3   3        c
4   4        c
5
2011 Mar 22
1
help need on working in subset within a dataframe
Dear R-experts, excuse me for an easy question, but I need help, sorry for that. For days I have been working with a large dataset, where operations are needed within a component of the dataset. Here is my question: I have a big dataset with variables x1, ..., x1000 or so. What I need to do is work on 4 consecutive variables at a time, calculate a statistic, and output it. So far so good. There are more vector
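One sketch of working in blocks of 4 consecutive columns (the helper name, the block width, and the rowMeans statistic are placeholders, not from the thread):

block_stat <- function(df, width = 4, FUN = rowMeans) {
  starts <- seq(1, ncol(df), by = width)
  # one result column per block of `width` consecutive variables
  sapply(starts, function(i) FUN(df[, i:min(i + width - 1, ncol(df))]))
}

# usage: block_stat(mydata)   # mydata: a data frame of x1 ... x1000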
2010 Sep 15
3
aggregate, by, *apply
Dear R gurus, I regularly come across a situation where I would like to apply a function to a subset of data in a dataframe, but I have not found an R function to facilitate exactly what I need. More specifically, I'd like my function to have some context about where the data it's analyzing came from. Here is an example:
### BEGIN ###
func <- function(x){
  m <- median(x$x)
  if(m > 2 &
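A sketch of one way to give the function that context: split on the grouping column with ddply and pass the whole sub-data.frame, which still carries the grouping columns (the column names group and x are placeholders):

library(plyr)

func <- function(d) {
  m <- median(d$x)
  # d still contains the grouping column, so the function knows where the data came from
  data.frame(group = unique(d$group), med = m)
}

# usage: ddply(mydata, .(group), func)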
2010 Aug 24
2
How to remove rows based on frequency of factor and then difference date scores
Hello - A basic question which has nonetheless floored me entirely. I have a dataset which looks like this:
Type ID Date       Value
A    1  16/09/2020 8
A    1  23/09/2010 9
B    3  18/8/2010  7
B    1  13/5/2010  6
There are two Types, which correspond to different individuals in different conditions, and loads of ID labels (1:50)
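A base-R sketch of both steps under the structure shown above (assuming the data are in a data frame d with columns Type, ID, Date, Value and day/month/year dates):

d$Date <- as.Date(d$Date, format = "%d/%m/%Y")

# step 1: keep only Type/ID combinations that occur at least twice
keep <- ave(seq_len(nrow(d)), d$Type, d$ID, FUN = length) >= 2
d2   <- d[keep, ]

# step 2: difference the date scores within each remaining Type/ID group (in days)
diffs <- by(d2, list(d2$Type, d2$ID), function(x) diff(sort(x$Date)))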
2016 Apr 06
0
Memory problem
As Jim has indicated, memory usage problems can require very specific diagnostics and code changes, so generic help is tough to give. However, in most cases I have found the dplyr package to be more memory efficient than plyr, so you could consider that. Also, you can be explicit about only saving the minimum results you want to keep rather than making a list of complete results and extracting
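For comparison, a minimal sketch of the dplyr equivalent of a typical d*ply summary, which the post above suggests is usually more memory efficient than the plyr version on large data:

library(dplyr)

res <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))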
2008 Oct 01
3
"tapply versus by" in function with more than 1 arguments
Hi. I searched the list and didn't find anything similar to this. I simplified my example as below:
# I need to calculate correlation (for example) between 2 columns classified by a third one in a data.frame, like below:
# number of rows
nr = 10
# the third column is to enforce that I need correlation on two variables only
dataf =
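A sketch reconstructing the setup described above (object and column names follow the post; the data themselves are made up) together with the two approaches being compared:

nr    <- 10
dataf <- data.frame(V1 = rnorm(nr), V2 = rnorm(nr),
                    class = sample(c("a", "b"), nr, replace = TRUE))

# by: one correlation per class, returned as a list-like object
by(dataf, dataf$class, function(d) cor(d$V1, d$V2))

# tapply over row names, as in the solutions quoted earlier in these results
tapply(rownames(dataf), dataf$class, function(r) cor(dataf[r, "V1"], dataf[r, "V2"]))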