All,

I am looking at an example in Aliaga's Interactive Statistics. Bag A has the following vouchers.

BagA <- c(-1000,10,10,10,10,10,10,
          10,20,20,20,20,20,20,30,
          30,40,40,50,60)

Bag B has the following vouchers.

BagB <- c(10,20,30,30,40,40,50,50,
          50,50,50,50,60,60,60,60,
          60,60,60,1000)

Two values are selected (from BagA or BagB) without replacement. In Table 1.1 on page 54 of the third edition, she lists all "Possible two values selected" in columns one and two, the "Average of the two selected values" in column three, "BAG A Number of ways of selecting the two values" in column four, and "BAG B Number of ways of selecting the two values" in column five.

Here are the first few rows:

-1000  -1000  -1000   0  0
-1000     10   -495   7  0
-1000     20   -490   6  0
-1000     30   -485   2  0
-1000     40   -480   2  0
-1000     50   -475   1  0
-1000     60   -470   1  0
-1000   1000      0   0  0
   10     10     10  21  0
   10     20     15  42  1
...

She then condenses the data in Table 1.2 on page 55, the first column holding "Average of the two selected values," the second column holding "BAG A Number of ways of selecting the two values," and the third column holding "BAG B Number of ways of selecting the two values."

Here are a few sample rows:

-1000  0  0
 -495  7  0
 -490  6  0
....

Can anyone help show me an efficient way of creating these two tables?

Thanks.

David.

--
View this message in context: http://r.789695.n4.nabble.com/Two-selections-from-Bag-A-tp4641327.html
Sent from the R help mailing list archive at Nabble.com.
On Aug 25, 2012, at 5:37 PM, darnold wrote:

> All, I am looking at an example in Aliaga's Interactive Statistics. Bag A has
> the following vouchers.
>
> BagA <- c(-1000,10,10,10,10,10,10,
>           10,20,20,20,20,20,20,30,
>           30,40,40,50,60)
>
> Bag B has the following vouchers.
>
> BagB <- c(10,20,30,30,40,40,50,50,
>           50,50,50,50,60,60,60,60,
>           60,60,60,1000)
>
> Two values are selected (from BagA or BagB) without replacement. In Table
> 1.1 on page 54 of the third edition, she lists all "Possible two values
> selected" in columns one and two,

?unique
?expand.grid or ?combn

Perhaps splitting names from the tabulation below

> the "Average of the two selected values"

?mean

> in column three and "BAG A Number of ways of selecting the two values" in
> column four, and "BAG B Number of ways of selecting the two values" in
> column five.
>
> Here are the first few rows:
>
> -1000 -1000 -1000 0 0

Why is that combination even listed?

> -1000 10 -495 7 0
> -1000 20 -490 6 0
> -1000 30 -485 2 0
> -1000 40 -480 2 0
> -1000 50 -475 1 0
> -1000 60 -470 1 0
> -1000 1000 0 0 0

What are the rules for listing a combination?

> 10 10 10 21 0
> 10 20 15 42 1

I can get that value if choosing just from BagA, but if the possibilities are for either bag to be selected, then an additional value would arise because 10 and 20 are in BagB.

?table
?apply
?paste

> ...
>
> She then condenses the data in Table 1.2 on page 55, the first column
> holding "Average of the two selected values," the second column holding "BAG
> A Number of ways of selecting the two values," and third column holding "BAG
> B Number of ways of selecting the two values."
>
> Here are a few sample rows:
>
> -1000 0 0
> -495 7 0
> -490 6 0
> ....
>
> Can anyone help show me an efficient way of creating these two tables?

table( apply( combn(BagA,2), 2, function(x) paste( sort(x), sep=".", collapse=".") ) )
table( apply( combn(BagB,2), 2, function(x) paste( sort(x), sep=".", collapse=".") ) )

You should be able to take it from this illustration of how to get the BagA results:

cbind( do.call( rbind,
         sapply( names( table( apply( combn(BagA,2), 2,
                   function(x) paste( sort(x), sep=".", collapse=".") ) ) ),
                 strsplit, split="\\.") ),    # first 2 columns
       table( apply( combn(BagA,2), 2,
                function(x) paste( sort(x), sep=".", collapse=".") ) ) )

         [,1]    [,2] [,3]
-1000.10 "-1000" "10" "7"
-1000.20 "-1000" "20" "6"
-1000.30 "-1000" "30" "2"
-1000.40 "-1000" "40" "2"
-1000.50 "-1000" "50" "1"
-1000.60 "-1000" "60" "1"
10.10    "10"    "10" "21"
10.20    "10"    "20" "42"
10.30    "10"    "30" "14"
10.40    "10"    "40" "14"
10.50    "10"    "50" "7"
10.60    "10"    "60" "7"
20.20    "20"    "20" "15"
20.30    "20"    "30" "12"
20.40    "20"    "40" "12"
20.50    "20"    "50" "6"
20.60    "20"    "60" "6"
30.30    "30"    "30" "1"
30.40    "30"    "40" "4"
30.50    "30"    "50" "2"
30.60    "30"    "60" "2"
40.40    "40"    "40" "1"
40.50    "40"    "50" "2"
40.60    "40"    "60" "2"
50.60    "50"    "60" "1"

--
David.

David Winsemius, MD
Alameda, CA, USA
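A possible variation on the code above, offered as a sketch rather than as the method used in this post: tabulate the sorted pairs with a two-way table() instead of pasting and splitting names, which yields numeric value columns directly. The helper name pair_counts() and the column names X1, X2 and counts are assumptions made for illustration.

```r
BagA <- c(-1000, 10, 10, 10, 10, 10, 10,
          10, 20, 20, 20, 20, 20, 20, 30,
          30, 40, 40, 50, 60)

# pair_counts() is a hypothetical helper, not a function from the thread.
pair_counts <- function(bag) {
  # combn() gives a 2 x choose(n, 2) matrix of draws without replacement;
  # sorting each column makes (20, 10) and (10, 20) the same pair.
  pairs <- apply(combn(bag, 2), 2, sort)
  # Two-way table of smaller value by larger value, flattened to long form.
  out <- as.data.frame(table(X1 = pairs[1, ], X2 = pairs[2, ]),
                       responseName = "counts",
                       stringsAsFactors = FALSE)
  out <- out[out$counts > 0, ]          # drop pairs that never occur
  out$X1 <- as.numeric(out$X1)
  out$X2 <- as.numeric(out$X2)
  out <- out[order(out$X1, out$X2), ]   # numeric rather than alphabetic order
  rownames(out) <- NULL
  out
}

BagAcombs <- pair_counts(BagA)
head(BagAcombs)
```

The same call on BagB would give the Bag B counts. Pairs that cannot be drawn from a bag, such as -1000 with -1000, do not appear at all, so any zero rows in the book's table would have to be added separately.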
Here are the two tables from Aliaga. The first is table 1.1 and the second is table 1.2.

http://r.789695.n4.nabble.com/file/n4641344/table1_1.jpg
http://r.789695.n4.nabble.com/file/n4641344/table1_2.jpg

David Arnold
College of the Redwoods
On Aug 26, 2012, at 8:28 AM, darnold wrote:

> Here are the two tables from Aliaga. The first is table 1.1 and the second
> is table 1.2.
>
> http://r.789695.n4.nabble.com/file/n4641344/table1_1.jpg

My code from earlier today (that you have not included) showed you how to tabulate and construct the BagA entries. I actually did it by making a dataframe from the names of the table, with a counts column taken from the table. In Table 1.1 the combinations from the two bags have been merge()-ed by their value columns.

> merge(BagAcombs, BagBcombs, by=1:2, all=TRUE)
      X1   X2 counts.x counts.y
1  -1000   10        7       NA
2  -1000   20        6       NA
3  -1000   30        2       NA
4  -1000   40        2       NA
5  -1000   50        1       NA
6  -1000   60        1       NA
7     10   10       21       NA
8     10   20       42        1
9     10   30       14        2
10    10   40       14        2
11    10   50        7        6
12    10   60        7        7
13    10 1000       NA        1
.... Rest of output deleted

That object was assigned to "Combs". I made the labels numeric. NA values were set to 0.

> Combs$X1 <- as.numeric(as.character(Combs$X1))
> Combs$X2 <- as.numeric(as.character(Combs$X2))

Calculate an average:

> Combs$Average <- with( Combs, rowMeans(cbind(X1, X2)) )

> http://r.789695.n4.nabble.com/file/n4641344/table1_2.jpg

So the second table is aggregated (summed and sorted) by the distinct values in the average-column of the first. (The 7 A 10 x 50 values are added to the 12 A 20 x 40 values and the single 1 A 30 x 30 to give 20 in the 30 row for A). You should create a factor and aggregate in the usual manner.

> aggregate(Combs[ , 3:5], list(Combs$Average), FUN=sum)
   Group.1 counts.x counts.y Average
1     -495        7        0    -495
2     -490        6        0    -490
3     -485        2        0    -485
4     -480        2        0    -480
5     -475        1        0    -475
6     -470        1        0    -470
7       10       21        0      10
8       15       42        1      15
9       20       29        2      40
10      25       26        4      50
11      30       20        9      90
... rest of output deleted.

The top and bottom rows of both tables appear to me to have no value. They are not really items in the sample space of the problem and their purpose remains a mystery.

(And ... Please do learn to include context.)

>
> David Arnold
> College of the Redwoods

--
David Winsemius, MD
Alameda, CA, USA
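For reference, the tabulate, merge, zero-fill, average and aggregate steps described above can be strung together roughly as follows. This is only a sketch under stated assumptions: pair_counts(), the .A/.B suffixes, Table1.1 and Table1.2 are names made up here, and only the two count columns are aggregated, so the Average column is not summed along with them as it is in the output shown above.

```r
BagA <- c(-1000, 10, 10, 10, 10, 10, 10, 10, 20, 20,
          20, 20, 20, 20, 30, 30, 40, 40, 50, 60)
BagB <- c(10, 20, 30, 30, 40, 40, 50, 50, 50, 50,
          50, 50, 60, 60, 60, 60, 60, 60, 60, 1000)

# Hypothetical helper: a data frame built from the names of the table of
# pasted, sorted pairs plus a counts column, as described in the post.
pair_counts <- function(bag) {
  tab <- table(apply(combn(bag, 2), 2,
                     function(x) paste(sort(x), collapse = ".")))
  parts <- do.call(rbind, strsplit(names(tab), ".", fixed = TRUE))
  data.frame(X1 = as.numeric(parts[, 1]),
             X2 = as.numeric(parts[, 2]),
             counts = as.vector(tab))
}

# Full outer merge on the two value columns; pairs missing from one bag
# come back as NA and are set to 0.
Combs <- merge(pair_counts(BagA), pair_counts(BagB),
               by = c("X1", "X2"), all = TRUE, suffixes = c(".A", ".B"))
Combs[is.na(Combs)] <- 0
Combs$Average <- (Combs$X1 + Combs$X2) / 2

# Same column order as the book's Table 1.1 (without its all-zero rows).
Table1.1 <- Combs[, c("X1", "X2", "Average", "counts.A", "counts.B")]

# Sum the counts over pairs that share an average, as in Table 1.2.
Table1.2 <- aggregate(cbind(counts.A, counts.B) ~ Average,
                      data = Combs, FUN = sum)
Table1.2
```

Each counts column should total choose(20, 2) = 190, which is a quick check that no pair was dropped or double counted.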