Federico Calboli
2003-Jul-28 21:47 UTC
[R] data manipulation: getting mean value every 5 rows
Dear All,

I would like to ask you how to accomplish a slightly tricky data manipulation. I have a large dataset, looking something like:

temp line cage number
18 18 1 6678.63
18 18 1 7774.458
18 18 1 7845.902
18 18 1 9483.578
18 18 1 8983.555
18 18 1 9181.052
18 18 1 9458.696
18 18 1 8138.616
18 18 1 7981.994
18 18 1 7556.491
18 18 1 7672.137
18 18 1 6607.776
18 18 1 8383.65
18 18 1 7129.852
18 18 1 8536.667
18 18 2 8287.8
18 18 2 7924.47
18 18 2 7928.474
18 18 2 7363.157
18 18 2 7952.593
.....

I would like to create a dataframe where I get the mean values, 5 rows at a time, of the column "number", while keeping the values in the other columns fixed to the values found in the first of the 5 rows (or whatever, it's the same for the 5 rows), so that the above would be "shrunk" to:

temp line cage number
18 18 1 8153.2246
18 18 1 8463.3698
18 18 1 7666.0164
18 18 2 7891.2988

Any hints?

Regards,
Federico Calboli

========================
Federico C.F. Calboli
Department of Biology
University College London
Room 327
Darwin Building
Gower Street
London
WC1E 6BT

Tel: (+44) 020 7679 4395
Fax (+44) 020 7679 7096
f.calboli at ucl.ac.uk
Spencer Graves
2003-Jul-28 21:59 UTC
[R] data manipulation: getting mean value every 5 rows
Have you considered "aggregate"? It is documented in help(aggregate), via "www.r-project.org" -> search -> "R site search", and in Venables and Ripley, Modern Applied Statistics with S.

Hope this helps.
Spencer Graves
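For readers who want to try the "aggregate" suggestion directly, here is a minimal sketch. It assumes the 20 rows from the original post are typed in by hand, and it uses a helper column named "block" that is purely illustrative (it is not in the original data). It also uses the formula interface of aggregate, which is available in current versions of R.

```r
# Example data copied from the original post (20 rows, 4 columns).
x <- data.frame(
  temp = 18, line = 18,
  cage = rep(c(1, 2), times = c(15, 5)),
  number = c(6678.63, 7774.458, 7845.902, 9483.578, 8983.555,
             9181.052, 9458.696, 8138.616, 7981.994, 7556.491,
             7672.137, 6607.776, 8383.65, 7129.852, 8536.667,
             8287.8, 7924.47, 7928.474, 7363.157, 7952.593)
)

# Hypothetical helper column: 1 for rows 1-5, 2 for rows 6-10, and so on.
# Integer division also copes with a final block of fewer than 5 rows.
x$block <- (seq_len(nrow(x)) - 1) %/% 5 + 1

# Mean of "number" within each block, carrying the other columns along.
res <- aggregate(number ~ temp + line + cage + block, data = x, FUN = mean)
print(res)
```

The four means should match the values requested in the original post, up to printing precision.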
> x <- read.table(file("clipboard"), header=T)
> # add an extra field to define groups of 5 sequential rows
> x[,"code"] <- rep(seq(len=nrow(x)/5), each=5)
> x
   temp line cage   number code
1    18   18    1 6678.630    1
2    18   18    1 7774.458    1
3    18   18    1 7845.902    1
4    18   18    1 9483.578    1
5    18   18    1 8983.555    1
6    18   18    1 9181.052    2
7    18   18    1 9458.696    2
8    18   18    1 8138.616    2
9    18   18    1 7981.994    2
10   18   18    1 7556.491    2
11   18   18    1 7672.137    3
12   18   18    1 6607.776    3
13   18   18    1 8383.650    3
14   18   18    1 7129.852    3
15   18   18    1 8536.667    3
16   18   18    2 8287.800    4
17   18   18    2 7924.470    4
18   18   18    2 7928.474    4
19   18   18    2 7363.157    4
20   18   18    2 7952.593    4
> aggregate(x[,"number",drop=F], x[,c("temp", "line", "cage", "code")], mean)
  temp line cage code   number
1   18   18    1    1 8153.225
2   18   18    1    2 8463.370
3   18   18    1    3 7666.016
4   18   18    2    4 7891.299
> # result has an additional column named "code" -- easily eliminated

Tony Plate
tplate at acm.org
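Two details of the session above may be worth spelling out. The rep(...) grouping assumes the number of rows is an exact multiple of 5, and the comment about eliminating "code" is left as an exercise. The following is only a sketch of one way to handle both, not something proposed in the thread: it uses made-up data (cage 1 has 7 rows, cage 2 has 5) and restarts the block index within each temp/line/cage combination via ave, so a short leftover block stays inside its own cage.

```r
# Hypothetical data: cage 1 has 7 rows, cage 2 has 5, so the row count
# within cage 1 is not a multiple of 5.
x <- data.frame(temp = 18, line = 18,
                cage = rep(c(1, 2), times = c(7, 5)),
                number = c(1:7, 11:15) * 10)

# Block index that restarts within each temp/line/cage combination:
# rows 1-5 of a cage get 1, rows 6-10 get 2, and so on.
x$code <- ave(seq_len(nrow(x)), x$temp, x$line, x$cage,
              FUN = function(i) (seq_along(i) - 1) %/% 5 + 1)

res <- aggregate(x[, "number", drop = FALSE],
                 x[, c("temp", "line", "cage", "code")], mean)

# Sort back into cage/block order, then drop the helper column.
res <- res[order(res$temp, res$line, res$cage, res$code), ]
res$code <- NULL
rownames(res) <- NULL
print(res)
```

With this made-up input, cage 1 yields one full block and one block of two rows, and cage 2 yields a single full block. Whether a short trailing block should be averaged or discarded depends on the analysis; this sketch keeps it.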
Federico Calboli
2003-Jul-28 23:04 UTC
[R] data manipulation: getting mean value every 5 rows
Dear All,

Thanks for the exceptional and speedy help. In particular, thanks to J. R. Lockwood, Sue Paul, Spencer Graves, Dennis J. Murphy and Tony Plate.

Regards,
Federico Calboli