plyr is a set of tools for a common set of problems: you need to break down a big data structure into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to:

* fit the same model to subsets of a data frame
* quickly calculate summary statistics for each group
* perform group-wise transformations like scaling or standardising
* eliminate for-loops in your code

It's already possible to do this with built-in functions (like split and the apply functions), but plyr makes it all a bit easier with:

* absolutely consistent names, arguments and outputs
* input from and output to data.frames, matrices and lists
* progress bars to keep track of long-running operations
* built-in error recovery, and informative error messages

Considerable effort has been put into making plyr fast and memory efficient, and in most cases it is faster than the built-in functions.

You can find out more at http://had.co.nz/plyr/, including a 20-page introductory guide, http://had.co.nz/plyr/plyr-intro.pdf.

You can ask questions about plyr (and data manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr

plyr 0.1.7 (2008-04-15)
---------------------------------------------------

Ensure that rbind.fill preserves attributes.

plyr 0.1.6 (2008-04-15)
---------------------------------------------------

Improvements:

* all ply functions deal more elegantly with function names: you can supply a vector of function names, and each name is used as a label in the output
* failwith and each now work with function names as well as functions (i.e. "nrow" instead of nrow)
* each now accepts a list of functions or a vector of function names
* l*ply will use list names where present
* if .inform is TRUE, error messages will tell you where in your data the error occurred - hopefully this will make problems easier to track down

Speed-ups:

* massive speed-ups for splitting large arrays
* fixed typo that was causing a 50% speed penalty for d*ply
* rewritten rbind.fill is considerably (> 4x) faster for many data frames
* colwise about twice as fast

Bug fixes:

* daply: now works when the data frame is split by multiple variables
* aaply: now works with vectors
* ddply: first variable now varies slowest as you'd expect

plyr 0.1.5 (2008-02-23)
---------------------------------------------------

* colwise now accepts a quoted list as its second argument. This allows you to specify the names of columns to work on: colwise(mean, .(lat, long))
* d_ply and a_ply now correctly pass ... to the function

--
http://had.co.nz/

_______________________________________________
R-packages mailing list
R-packages at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages