Keith S Weintraub
2013-Apr-17 16:54 UTC
[R] Best way to calculate averages of Blocks in an matrix?
Folks,
  I recently was given a simulated data set like the following subset:

sim_sub <- structure(list(V11 = c(0.01, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), V12 = c(0, 0, 0, 0.01, 0.03, 0,
0, 0, 0, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0, 0, 0.04), V13 = c(0,
0, 0, 0.01, 0, 0, 0, 0, 0, 0.01, 0, 0, 0, 0, 0.01, 0, 0, 0, 0,
0.01), V14 = c(0, 0.01, 0.01, 0.01, 0.01, 0, 0, 0, 0, 0.03, 0,
0, 0.01, 0.01, 0.04, 0.01, 0.02, 0, 0.01, 0.03), V15 = c(0, 0.01,
0, 0, 0.01, 0, 0, 0, 0.01, 0.02, 0.01, 0, 0, 0.01, 0, 0, 0, 0.01,
0.01, 0.04), V16 = c(0, 0, 0, 0.03, 0.02, 0.01, 0, 0, 0.02, 0.02,
0, 0.02, 0.02, 0, 0.01, 0.01, 0, 0, 0.03, 0.01), V17 = c(0, 0.01,
0, 0.01, 0, 0, 0, 0.01, 0.05, 0.03, 0, 0.01, 0, 0.02, 0.02, 0,
0, 0.01, 0.02, 0.04), V18 = c(0, 0.01, 0, 0.03, 0.03, 0, 0, 0,
0.02, 0.01, 0, 0.02, 0.01, 0.02, 0.03, 0.02, 0, 0, 0.04, 0.04
), V19 = c(0, 0.01, 0.01, 0.02, 0.07, 0, 0, 0, 0.04, 0.01, 0.02,
0, 0, 0, 0.04, 0, 0, 0, 0, 0.05), V20 = c(0, 0, 0, 0.01, 0.04,
0.01, 0, 0, 0.02, 0.04, 0.01, 0, 0.02, 0, 0.03, 0, 0.02, 0.01,
0.03, 0.03)), .Names = c("V11", "V12", "V13", "V14", "V15", "V16",
"V17", "V18", "V19", "V20"), row.names = c(NA, 20L), class = "data.frame")

> sim_sub
    V11  V12  V13  V14  V15  V16  V17  V18  V19  V20
1  0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2  0.00 0.00 0.00 0.01 0.01 0.00 0.01 0.01 0.01 0.00
3  0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.01 0.00
4  0.01 0.01 0.01 0.01 0.00 0.03 0.01 0.03 0.02 0.01
5  0.00 0.03 0.00 0.01 0.01 0.02 0.00 0.03 0.07 0.04
6  0.01 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.01
7  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
8  0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
9  0.00 0.00 0.00 0.00 0.01 0.02 0.05 0.02 0.04 0.02
10 0.00 0.00 0.01 0.03 0.02 0.02 0.03 0.01 0.01 0.04
11 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.02 0.01
12 0.00 0.01 0.00 0.00 0.00 0.02 0.01 0.02 0.00 0.00
13 0.00 0.00 0.00 0.01 0.00 0.02 0.00 0.01 0.00 0.02
14 0.00 0.01 0.00 0.01 0.01 0.00 0.02 0.02 0.00 0.00
15 0.00 0.00 0.01 0.04 0.00 0.01 0.02 0.03 0.04 0.03
16 0.00 0.00 0.00 0.01 0.00 0.01 0.00 0.02 0.00 0.00
17 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.02
18 0.00 0.00 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.01
19 0.00 0.00 0.00 0.01 0.01 0.03 0.02 0.04 0.00 0.03
20 0.00 0.04 0.01 0.03 0.04 0.01 0.04 0.04 0.05 0.03

Every 5 rows represents one block of simulated data.

What would be the best way to average the blocks?

My way was to reshape sim_sub, average over the columns and then reshape back like so:

> matrix(colSums(matrix(t(sim_sub), byrow = TRUE, ncol = 50)), byrow = TRUE, ncol = 10)/4
       [,1]   [,2]   [,3]   [,4]   [,5]  [,6]   [,7]   [,8]   [,9]  [,10]
[1,] 0.0050 0.0000 0.0000 0.0025 0.0025 0.005 0.0000 0.0050 0.0050 0.0050
[2,] 0.0000 0.0025 0.0000 0.0075 0.0025 0.005 0.0050 0.0075 0.0025 0.0050
[3,] 0.0000 0.0000 0.0000 0.0050 0.0025 0.005 0.0050 0.0025 0.0025 0.0075
[4,] 0.0025 0.0050 0.0025 0.0075 0.0075 0.020 0.0250 0.0275 0.0150 0.0150
[5,] 0.0000 0.0175 0.0075 0.0275 0.0175 0.015 0.0225 0.0275 0.0425 0.0350

How bad is "t(sim_sub)" in the above?

Thanks for your time,
KW
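Read literally, the reshape in the post above averages element-wise across blocks: row r of every block is averaged with row r of every other block, giving one 5-row result. A minimal sketch of that reading, assuming equally sized blocks; the helper name `block_average` and the toy data are made up for illustration and do not come from the thread. Because R stores a matrix column by column, the data can be viewed as a 3-D array without calling t() at all.

```r
# Hypothetical helper (not from the thread): element-wise average across
# equally sized row blocks of a numeric data frame or matrix.
block_average <- function(x, block_size) {
  m <- as.matrix(x)
  stopifnot(nrow(m) %% block_size == 0)
  n_blocks <- nrow(m) %/% block_size
  # Column-major storage: row-within-block varies fastest, then block, then column.
  arr <- array(m, dim = c(block_size, n_blocks, ncol(m)))
  # Average over the block dimension (the second one).
  out <- apply(arr, c(1, 3), mean)
  colnames(out) <- colnames(m)
  out
}

# Made-up stand-in data: 4 blocks of 5 rows, 3 columns.
toy <- as.data.frame(matrix(seq_len(60), ncol = 3))
avg <- block_average(toy, 5)
```

On the posted data the call would be `block_average(sim_sub, 5)`, which should give the same 5 x 10 matrix as the reshape in the post, with column names kept.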
do.call(rbind,lapply(split(sim_sub,((seq_len(nrow(sim_sub))-1)%/% 5)+1),colMeans))
#    V11   V12   V13   V14   V15  V16   V17   V18   V19   V20
#1 0.004 0.008 0.002 0.008 0.004 0.01 0.004 0.014 0.022 0.010
#2 0.002 0.000 0.002 0.006 0.006 0.01 0.018 0.006 0.010 0.014
#3 0.000 0.004 0.002 0.012 0.004 0.01 0.010 0.016 0.012 0.012
#4 0.000 0.008 0.002 0.014 0.012 0.01 0.014 0.020 0.010 0.018
A.K.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
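The split/lapply/colMeans pipeline above computes per-block column means. A possible alternative, offered only as a sketch and not something proposed in the thread, is base R's `rowsum()`, which sums matrix rows by group in one call. The toy data and variable names here are made up for illustration.

```r
# Made-up data: two blocks of three rows.
toy <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                  b = c(10, 20, 30, 40, 50, 60))
block_size <- 3

# Same block labeling idea as in the reply: 1,1,1,2,2,2
grp <- (seq_len(nrow(toy)) - 1) %/% block_size + 1

# Sum rows per block, then divide each block's sums by its row count.
block_means <- rowsum(as.matrix(toy), grp) / as.vector(table(grp))
```

Dividing by the per-group counts rather than a fixed block size keeps the result correct if the last block happens to be short.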
Also,

do.call(rbind,lapply(split(sim_sub,rep(1:(1+nrow(sim_sub)/5),each=5)[seq_len(nrow(sim_sub))]),colMeans))
#    V11   V12   V13   V14   V15  V16   V17   V18   V19   V20
#1 0.004 0.008 0.002 0.008 0.004 0.01 0.004 0.014 0.022 0.010
#2 0.002 0.000 0.002 0.006 0.006 0.01 0.018 0.006 0.010 0.014
#3 0.000 0.004 0.002 0.012 0.004 0.01 0.010 0.016 0.012 0.012
#4 0.000 0.008 0.002 0.014 0.012 0.01 0.014 0.020 0.010 0.018
A.K.
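The two replies above differ only in how the block label for each row is built. A small sketch comparing the two labelings, plus `gl()` as a third option that is not used in the thread; the variable names are chosen here for illustration.

```r
n <- 20          # number of rows, as in the posted subset
block_size <- 5  # rows per block

# Integer division, as in the first reply.
idx_div <- (seq_len(n) - 1) %/% block_size + 1

# rep() followed by truncation to n elements, as in the follow-up.
idx_rep <- rep(1:(1 + n / block_size), each = block_size)[seq_len(n)]

# gl() builds the same labels as a factor (an alternative, not from the thread).
idx_gl <- gl(n %/% block_size, block_size)

all(idx_div == idx_rep)             # the two numeric labelings agree
all(idx_div == as.integer(idx_gl))  # and match the factor codes
```

Any of the three can be passed as the grouping argument to `split()`.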
Rui Barradas
2013-Apr-17 17:09 UTC
[R] Best way to calculate averages of Blocks in an matrix?
Hello,

Try the following.

blocks <- rep(1:(1 + nrow(sim_sub) %/% 5), each = 5)[seq_len(nrow(sim_sub))]
aggregate(sim_sub, list(blocks), FUN = mean)

Hope this helps,

Rui Barradas

On 17-04-2013 18:04, arun wrote:
> do.call(rbind,lapply(split(sim_sub,((seq_len(nrow(sim_sub))-1)%/% 5)+1),colMeans))
> #    V11   V12   V13   V14   V15  V16   V17   V18   V19   V20
> #1 0.004 0.008 0.002 0.008 0.004 0.01 0.004 0.014 0.022 0.010
> #2 0.002 0.000 0.002 0.006 0.006 0.01 0.018 0.006 0.010 0.014
> #3 0.000 0.004 0.002 0.012 0.004 0.01 0.010 0.016 0.012 0.012
> #4 0.000 0.008 0.002 0.014 0.012 0.01 0.014 0.020 0.010 0.018
> A.K.
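One detail of the `aggregate()` approach above: the result is a data frame whose first column holds the group labels (named `Group.1` when the `by` list is unnamed). A minimal sketch on made-up data showing how to name that column and how to drop it when only the means are wanted; the names `toy`, `res` and `means_only` are illustrative.

```r
# Made-up data: two blocks of three rows.
toy <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                  b = c(10, 20, 30, 40, 50, 60))
blocks <- rep(1:2, each = 3)

# Naming the list element names the grouping column in the result.
res <- aggregate(toy, by = list(block = blocks), FUN = mean)

# Drop the grouping column to keep only the per-block means.
means_only <- as.matrix(res[, -1])
```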
David Winsemius
2013-Apr-17 20:05 UTC
[R] Best way to calculate averages of Blocks in an matrix?
On Apr 17, 2013, at 9:54 AM, Keith S Weintraub wrote:

> Every 5 rows represents one block of simulated data.
>
> What would be the best way to average the blocks?

This answers the posed question:

  > tapply( data.matrix(sim_sub), rep( rep(1:4, each=5), each=10) ,mean)
       1      2      3      4
  0.0030 0.0070 0.0106 0.0144

Your code following suggests that you do not want the average values within blocks but within blocks AND ALSO within columns (although how you get 5 rows of 5 blocks from a 20 row input object is unclear to me)

  > data.frame( lapply(sim_sub, function(col) tapply(col, rep(1:4, each=5), mean) ) )
      V11   V12   V13   V14   V15  V16   V17   V18   V19   V20
  1 0.004 0.008 0.002 0.008 0.004 0.01 0.004 0.014 0.022 0.010
  2 0.002 0.000 0.002 0.006 0.006 0.01 0.018 0.006 0.010 0.014
  3 0.000 0.004 0.002 0.012 0.004 0.01 0.010 0.016 0.012 0.012
  4 0.000 0.008 0.002 0.014 0.012 0.01 0.014 0.020 0.010 0.018

From your code I am guessing a typo of 5 for 4?

> My way was to reshape sim_sub, average over the columns and then reshape back like so:
>
>> matrix(colSums(matrix(t(sim_sub), byrow = TRUE, ncol = 50)), byrow = TRUE, ncol = 10)/4
>
> How bad is "t(sim_sub)" in the above?

The whole matrix( matrix( t(.), ... )) approach seems kind of tortured, but to your question, t() is a fairly efficient function.

--
David Winsemius
Alameda, CA, USA
tapply(t(data.matrix(sim_sub)),rep( rep(1:4, each=5), each=10),mean)
#     1      2      3      4
#0.0086 0.0074 0.0082 0.0108

unlist(lapply(split(sim_sub,((seq_len(nrow(sim_sub))-1)%/%5)+1),function(x) mean(unlist(x))))
#     1      2      3      4
#0.0086 0.0074 0.0082 0.0108
A.K.
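The transpose in the reply above matters because R stores matrix values column by column, so a grouping vector built with `each =` only lines up with rows after transposing. A small sketch on made-up data (two blocks of two rows, names chosen for illustration) shows the two correct labelings side by side with the mismatched one.

```r
# Made-up data: two blocks of two rows, two columns.
toy <- data.frame(a = 1:4, b = 5:8)
grp_rows <- rep(1:2, each = 2)  # block label for each row

# Untransposed: values run down each column, so the row labels must be
# repeated once per column (times =).
by_times <- tapply(as.matrix(toy), rep(grp_rows, times = ncol(toy)), mean)

# Transposed: values run along each row, so each row label is stretched
# across that row's columns (each =).
by_each_t <- tapply(t(as.matrix(toy)), rep(grp_rows, each = ncol(toy)), mean)

# Untransposed with each =: labels no longer follow the rows, so the
# groups mix blocks and the means differ.
mismatched <- tapply(as.matrix(toy), rep(grp_rows, each = ncol(toy)), mean)
```

Here `by_times` and `by_each_t` both give 3.5 and 5.5, the true block means, while `mismatched` gives 2.5 and 6.5.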