Hi guys,

I hope you can help me with this (probably) simple query:

I have a data frame:

--------------------------

a=c(1,1,1,1,1,1,2,2,2,2,2,2)
b=c(1,1,1,2,3,4,1,1,2,2,3,4)
c=c(400,200,300,100,500,300,200,100,500,400,200,100)

data=data.frame(a=a,b=b,c=c)

--------------------------

And I would like to get the following output:

--------------------------

   b
a    1   2   3   4
1  900 100 500 300
2  300 900 200 100

--------------------------

The values in the output represent the sum of values "c" in data frame "data", for each "a" and "b" combination.

For example, where "a" = 1 and "b" = 1, the output is 400+200+300 = 900.

Please would anyone be able to provide a script to create my desired output?

Many thanks in advance,

Ben Gillespie
Research Postgraduate

School of Geography
University of Leeds
Leeds
LS2 9JT
Hi Ben,

let me suggest some background reading: Peter Dalgaard's or Phil Spector's book will set you up with what you need. You can also read one of the many free, contributed sets of notes kept on CRAN.

I hope that this helps.

Andrew

--
Andrew Robinson
Director (A/g), ACERA
Senior Lecturer in Applied Statistics
Department of Mathematics and Statistics
University of Melbourne, VIC 3010 Australia

On Mon, Feb 4, 2013 at 8:29 PM, Benjamin Gillespie <gybrg@leeds.ac.uk> wrote:
> Hi guys,
>
> I hope you can help me with this (probably) simple query:
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Try this:

a <- c(1,1,1,1,1,1,2,2,2,2,2,2)
b <- c(1,1,1,2,3,4,1,1,2,2,3,4)
c <- c(400,200,300,100,500,300,200,100,500,400,200,100)

DF <- data.frame(a, b, c)

with(DF, tapply(c, list(a, b), sum))

I hope it helps.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
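A small variation on the tapply() call above, offered as a sketch rather than as part of Dimitris's reply: naming the elements of the grouping list carries the labels "a" and "b" through to the margins of the result, which is closer to the layout asked for in the question. The object name sums is only illustrative.

```r
# Rebuild the example data from the original question.
DF <- data.frame(
  a = c(1,1,1,1,1,1,2,2,2,2,2,2),
  b = c(1,1,1,2,3,4,1,1,2,2,3,4),
  c = c(400,200,300,100,500,300,200,100,500,400,200,100)
)

# Naming the list elements (a = a, b = b) labels the two margins of the result.
sums <- with(DF, tapply(c, list(a = a, b = b), sum))
print(sums)

# A single cell can be looked up by its row and column labels (a = 1, b = 1).
sums["1", "1"]
```

The result is an ordinary numeric matrix, so it can be indexed or passed on like any other matrix. Combinations of "a" and "b" that never occur in the data would show up as NA.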
Here are some examples of data aggregation functions in R:

http://www.slideshare.net/djandrija/data-aggregation-in-r
http://www.psychwire.co.uk/2011/04/data-aggregation-in-r-plyr-sqldf-and-data-table/

Andrija
Hello,

First, don't use "data" as the name of a data frame, as it is already the name of an R function.

Here is a way to do what you are looking for:

a=c(1,1,1,1,1,1,2,2,2,2,2,2)
b=c(1,1,1,2,3,4,1,1,2,2,3,4)
c=c(400,200,300,100,500,300,200,100,500,400,200,100)

dat=data.frame(a=a,b=b,c=c)

dat.sum <- aggregate(c ~ a+b, dat, sum)
dat.sum <- reshape(dat.sum, timevar='b', idvar='a', direction='wide')
colnames(dat.sum) <- c('a','b.1','b.2','b.3','b.4')

HTH,
Pascal
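To make the two steps in the aggregate()/reshape() approach above easier to follow, here is a minimal sketch that keeps the long and wide results in separate objects. The name dat.wide is introduced here for illustration only; the reply above overwrites dat.sum instead.

```r
# Example data from the original question, stored as 'dat'.
dat <- data.frame(
  a = c(1,1,1,1,1,1,2,2,2,2,2,2),
  b = c(1,1,1,2,3,4,1,1,2,2,3,4),
  c = c(400,200,300,100,500,300,200,100,500,400,200,100)
)

# Step 1: one row per (a, b) combination, with the sum of c (long format).
dat.sum <- aggregate(c ~ a + b, data = dat, FUN = sum)
print(dat.sum)

# Step 2: spread the levels of b across columns (wide format).
dat.wide <- reshape(dat.sum, timevar = "b", idvar = "a", direction = "wide")

# reshape() builds the new column names from the value column and the
# levels of b, which is why the reply renames them afterwards.
print(names(dat.wide))
print(dat.wide)
```

Unlike tapply() or xtabs(), this route returns a data frame with "a" as a regular column, which may be more convenient if the result is to be merged with other data.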
Hello,

In what follows, I've renamed the data frame 'dat', since 'data' is already an R function.

xtabs(c ~ a + b, data = dat)

Hope this helps,

Rui Barradas
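For completeness, a self-contained sketch of the xtabs() call above, including the construction of 'dat' from the vectors in the original question. The object name tab is an assumption made for this example.

```r
# Example data from the original question, stored as 'dat'.
dat <- data.frame(
  a = c(1,1,1,1,1,1,2,2,2,2,2,2),
  b = c(1,1,1,2,3,4,1,1,2,2,3,4),
  c = c(400,200,300,100,500,300,200,100,500,400,200,100)
)

# The left-hand side of the formula (c) is summed within each
# combination of the right-hand-side variables (a and b).
tab <- xtabs(c ~ a + b, data = dat)
print(tab)

# Cells are addressed by label, here a = 2 and b = 2.
tab["2", "2"]
```

The printed table carries the "a" and "b" margin labels without any extra work. One difference from tapply() worth keeping in mind: xtabs() fills combinations that do not occur in the data with 0 rather than NA.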
Hi,

library(reshape2)
dcast(DF,a~b,value.var="c",sum)
#  a   1   2   3   4
#1 1 900 100 500 300
#2 2 300 900 200 100

A.K.

----- Original Message -----
From: Benjamin Gillespie <gybrg at leeds.ac.uk>
To: "r-help at r-project.org" <r-help at r-project.org>
Sent: Monday, February 4, 2013 4:29 AM
Subject: [R] Script for conditional sums of vectors