Hi

I have a table called npl containing results of simulations.

It contains about 19000 entries and the structure looks like this:

  NoPlants                  sim run year DensPlants
1        6 lng_cs99_renosterbos   1    4    0.00192
.
.
.

It has 43 different entries for sim; year goes from 1 to 100, and run from 1 to 5.

I would like to calculate the mean of DensPlants for each simulation and each year separately, i.e. the mean over run for every combination of sim and year.

I can use

split(npl, npl$sim)

to split npl into groups, each containing the entries for one parameter set, but where do I go from there?

Rainer

--
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)

Department of Conservation Ecology and Entomology
University of Stellenbosch
Matieland 7602
South Africa

Tel: +27 - (0)72 808 2975 (w)
Fax: +27 - (0)21 808 3304
Cell: +27 - (0)83 9479 042

email: RKrug at sun.ac.za
       Rainer at krugs.de
Have a look at the function aggregate.table in the package gtools (part of the gregmisc bundle).

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
Sorry, that should have been package gdata, not gtools. They're both in the same bundle, though.

David Barron
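For comparison, the same group means can also be computed with the base R function aggregate(), without any extra package. This is only a minimal sketch: it assumes npl is a data frame with the columns shown in the question, and the toy data below (the sim names "simA" and "simB" and all DensPlants values) is invented for illustration.

```r
# Toy stand-in for npl; names and values are made up for illustration only
set.seed(1)
npl <- expand.grid(sim = c("simA", "simB"), run = 1:5, year = 1:3,
                   stringsAsFactors = FALSE)
npl$DensPlants <- runif(nrow(npl))

# Mean of DensPlants over run, for every combination of sim and year
npl_mean <- aggregate(DensPlants ~ sim + year, data = npl, FUN = mean)
head(npl_mean)
```

The result is a data frame with one row per sim/year combination, which is usually the most convenient form for further plotting or modelling.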
> It contains about 19000 entries and the structure looks like this:
>
>   NoPlants                  sim run year DensPlants
> 1        6 lng_cs99_renosterbos   1    4    0.00192
>
> it has 43 different entries for sim and year goes from 1 to 100, and run
> from 1 to 5.
>
> I would like to calculate the mean of DensPlants for each simulation and
> each year seperately, i.e. calculating the mean for all combinations of
> sim and year over run.

You can do this pretty easily with the reshape package:

library(reshape)
dfm <- rename(df, c(DensPlants = value))  # this is the form that reshape wants

# Then try one of these:

cast(dfm, year ~ sim)
cast(dfm, year + sim ~ . )
cast(dfm, year ~ sim, margins=TRUE)

Which one to use depends on what format you want the resulting summaries in.

Hadley
> # Then try one of these:
>
> cast(dfm, year ~ sim)
> cast(dfm, year + sim ~ . )
> cast(dfm, year ~ sim, margins=TRUE)

Oops, that should be:

dfm <- rename(df, c(DensPlants = "value"))

cast(dfm, year ~ sim, mean)
cast(dfm, year + sim ~ . , mean)
cast(dfm, year ~ sim, mean, margins=TRUE)

(Thanks for pointing that out, Gabor!)

Hadley
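As a rough base R counterpart to the year-by-sim layout discussed above, tapply() can build a matrix of means with one row per year and one column per sim. The sketch below also shows how the split() idea from the original question could be carried through by splitting on both keys at once. It assumes the data frame is called npl (as in the question, rather than df) and uses invented toy data; it is not the reshape approach itself.

```r
# Toy stand-in for npl; names and values are made up for illustration only
set.seed(1)
npl <- expand.grid(sim = c("simA", "simB"), run = 1:5, year = 1:3,
                   stringsAsFactors = FALSE)
npl$DensPlants <- runif(nrow(npl))

# Wide layout: rows are years, columns are simulations, cells are means over run
means_wide <- tapply(npl$DensPlants,
                     list(year = npl$year, sim = npl$sim),
                     mean)
means_wide

# Continuing from split(): split on sim AND year, then average each group.
# Group names have the form "<sim>.<year>".
groups <- split(npl$DensPlants, list(npl$sim, npl$year))
means_long <- sapply(groups, mean)
means_long
```

Both results hold the same numbers; the choice between them depends only on whether a matrix or a named vector is easier to work with downstream.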