All, I am looking at an example in Aliaga's Interactive Statistics. Bag A has
the following vouchers.
BagA <- c(-1000,10,10,10,10,10,10,
10,20,20,20,20,20,20,30,
30,40,40,50,60)
Bag B has the following vouchers.
BagB <- c(10,20,30,30,40,40,50,50,
50,50,50,50,60,60,60,60,
60,60,60,1000)
Two values are selected (from BagA or BagB) without replacement. In Table
1.1 on page 54 of the third edition, she lists all "Possible two values
selected" in columns one and two, the "Average of the two selected values"
in column three and "BAG A Number of ways of selecting the two values" in
column four, and "BAG B Number of ways of selecting the two values" in
column five.
Here are the first few rows:
-1000 -1000 -1000 0 0
-1000 10 -495 7 0
-1000 20 -490 6 0
-1000 30 -485 2 0
-1000 40 -480 2 0
-1000 50 -475 1 0
-1000 60 -470 1 0
-1000 1000 0 0 0
10 10 10 21 0
10 20 15 42 1
...
She then condenses the data in Table 1.2 on page 55, the first column
holding "Average of the two selected values", the second column holding "BAG
A Number of ways of selecting the two values," and third column holding "BAG
B Number of ways of selecting the two values."
Here are a few sample rows:
-1000 0 0
-495 7 0
-490 6 0
....
Can anyone help show me an efficient way of creating these two tables?
Thanks.
David.
--
View this message in context: http://r.789695.n4.nabble.com/Two-selections-from-Bag-A-tp4641327.html
Sent from the R help mailing list archive at Nabble.com.
Two selections from Bag A
4 messages · David Winsemius, David Arnold
On Aug 25, 2012, at 5:37 PM, darnold wrote:
> All, I am looking at an example in Aliaga's Interactive Statistics.
> Bag A has the following vouchers.
> BagA <- c(-1000,10,10,10,10,10,10,
> 10,20,20,20,20,20,20,30,
> 30,40,40,50,60)
> Bag B has the following vouchers.
> BagB <- c(10,20,30,30,40,40,50,50,
> 50,50,50,50,60,60,60,60,
> 60,60,60,1000)
> Two values are selected (from BagA or BagB) without replacement. In
> Table 1.1 on page 54 of the third edition, she lists all "Possible two
> values selected" in columns one and two,
See ?unique, ?expand.grid or ?combn. Perhaps split the names from the tabulation below.
> the "Average of the two selected values"
?mean
> in column three and "BAG A Number of ways of selecting the two values"
> in column four, and "BAG B Number of ways of selecting the two values"
> in column five. Here are the first few rows:
> -1000 -1000 -1000 0 0
Why is that combination even listed?
> -1000 10 -495 7 0
> -1000 20 -490 6 0
> -1000 30 -485 2 0
> -1000 40 -480 2 0
> -1000 50 -475 1 0
> -1000 60 -470 1 0
> -1000 1000 0 0 0
What are the rules for listing a combination?
> 10 10 10 21 0
> 10 20 15 42 1
I can get that value (42) if choosing just from BagA, but if the possibilities are for either bag to be selected, then an additional value would arise because 10 and 20 are also in BagB. ?table ?apply ?paste
> ...
> She then condenses the data in Table 1.2 on page 55, the first column
> holding "Average of the two selected values", the second column holding
> "BAG A Number of ways of selecting the two values," and third column
> holding "BAG B Number of ways of selecting the two values."
> Here are a few sample rows:
> -1000 0 0
> -495 7 0
> -490 6 0
> ....
> Can anyone help show me an efficient way of creating these two tables?
table( apply( combn(BagA,2), 2, function(x) paste( sort(x), sep=".",
collapse=".") ) )
table( apply( combn(BagB,2), 2, function(x) paste( sort(x), sep=".",
collapse=".") ) )
You should be able to take it from this illustration of how to get the
BagA results:
tblA <- table( apply( combn(BagA,2), 2, function(x)
          paste( sort(x), collapse=".") ) )   # counts per sorted pair
cbind( do.call( rbind ,
         sapply( names(tblA), strsplit, split="\\." ) ),  # first 2 columns
       as.vector(tblA) )                                  # third column: counts
[,1] [,2] [,3]
-1000.10 "-1000" "10" "7"
-1000.20 "-1000" "20" "6"
-1000.30 "-1000" "30" "2"
-1000.40 "-1000" "40" "2"
-1000.50 "-1000" "50" "1"
-1000.60 "-1000" "60" "1"
10.10 "10" "10" "21"
10.20 "10" "20" "42"
10.30 "10" "30" "14"
10.40 "10" "40" "14"
10.50 "10" "50" "7"
10.60 "10" "60" "7"
20.20 "20" "20" "15"
20.30 "20" "30" "12"
20.40 "20" "40" "12"
20.50 "20" "50" "6"
20.60 "20" "60" "6"
30.30 "30" "30" "1"
30.40 "30" "40" "4"
30.50 "30" "50" "2"
30.60 "30" "60" "2"
40.40 "40" "40" "1"
40.50 "40" "50" "2"
40.60 "40" "60" "2"
50.60 "50" "60" "1"
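The matrix above is character because the pair labels were pasted into strings and split again. A minimal sketch of one way to keep the values numeric from the start follows; it is a swapped-in variant, not the code used in this thread, and pair_counts() is a hypothetical helper name.

```r
BagA <- c(-1000,10,10,10,10,10,10,
          10,20,20,20,20,20,20,30,
          30,40,40,50,60)

# pair_counts() is a hypothetical helper, not from the thread.
# It counts the ways of drawing each unordered pair of values
# from one bag without replacement.
pair_counts <- function(bag) {
  p  <- combn(sort(bag), 2)        # sorting first guarantees X1 <= X2
  df <- data.frame(X1 = p[1, ], X2 = p[2, ], counts = 1)
  out <- aggregate(counts ~ X1 + X2, data = df, FUN = sum)
  out <- out[order(out$X1, out$X2), ]   # same row order as the matrix above
  rownames(out) <- NULL
  out
}

BagAcombs <- pair_counts(BagA)
head(BagAcombs)
```

Because X1 and X2 stay numeric, no as.numeric(as.character(...)) conversion is needed later.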
> David.
David Winsemius, MD
Alameda, CA, USA
Here are the two tables from Aliaga. The first is Table 1.1 and the second is Table 1.2.
http://r.789695.n4.nabble.com/file/n4641344/table1_1.jpg
http://r.789695.n4.nabble.com/file/n4641344/table1_2.jpg
David Arnold
College of the Redwoods
On Aug 26, 2012, at 8:28 AM, darnold wrote:
> Here are the two tables from Aliaga. The first is Table 1.1 and the second is Table 1.2. http://r.789695.n4.nabble.com/file/n4641344/table1_1.jpg
My code from earlier today (that you have not included) showed you how
to tabulate and construct the BagA entries. I actually did it by
making a data frame from the names of the table plus a counts column
taken from the table itself. In Table 1.1 the combinations from the two
bags have been merge()-ed on their value columns.
> merge(BagAcombs, BagBcombs, by=1:2, all=TRUE)
X1 X2 counts.x counts.y
1 -1000 10 7 NA
2 -1000 20 6 NA
3 -1000 30 2 NA
4 -1000 40 2 NA
5 -1000 50 1 NA
6 -1000 60 1 NA
7 10 10 21 NA
8 10 20 42 1
9 10 30 14 2
10 10 40 14 2
11 10 50 7 6
12 10 60 7 7
13 10 1000 NA 1
.... Rest of output deleted
That object was assigned to "Combs".
I made the labels numeric. NA values were set to 0.
Combs$X1 <- as.numeric(as.character(Combs$X1))
Combs$X2 <- as.numeric(as.character(Combs$X2))
Calculate an average:
> Combs$Average <- with( Combs, rowMeans( cbind(X1, X2) ) )
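The steps just described (merge, numeric labels, NA set to 0, average) can be sketched end to end as follows. The code that built BagAcombs and BagBcombs was not posted, so count_pairs() here is a hypothetical stand-in for it, and it keeps X1 and X2 numeric so the conversion step is not needed.

```r
BagA <- c(-1000,10,10,10,10,10,10,
          10,20,20,20,20,20,20,30,
          30,40,40,50,60)
BagB <- c(10,20,30,30,40,40,50,50,
          50,50,50,50,60,60,60,60,
          60,60,60,1000)

# count_pairs() is a hypothetical stand-in for the unposted code
# that produced BagAcombs and BagBcombs.
count_pairs <- function(bag) {
  p <- combn(sort(bag), 2)                       # X1 <= X2 in every column
  aggregate(counts ~ X1 + X2,
            data = data.frame(X1 = p[1, ], X2 = p[2, ], counts = 1),
            FUN = sum)
}

# Full outer join on the two value columns; counts become counts.x / counts.y
Combs <- merge(count_pairs(BagA), count_pairs(BagB),
               by = c("X1", "X2"), all = TRUE)

Combs[is.na(Combs)] <- 0                         # pair absent from a bag: 0 ways
Combs$Average <- (Combs$X1 + Combs$X2) / 2       # column three of Table 1.1
head(Combs)
```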
So the second table is aggregated (summed and sorted) by the distinct
values in the average column of the first. (For Bag A, the 7 ways of
drawing 10 and 50 are added to the 12 ways of drawing 20 and 40 and the
single way of drawing 30 and 30, giving 20 in the row for an average of
30.) You should create a factor and aggregate in the usual manner.
> aggregate(Combs[ , 3:5], list(Combs$Average), FUN=sum)
Group.1 counts.x counts.y Average
1 -495 7 0 -495
2 -490 6 0 -490
3 -485 2 0 -485
4 -480 2 0 -480
5 -475 1 0 -475
6 -470 1 0 -470
7 10 21 0 10
8 15 42 1 15
9 20 29 2 40
10 25 26 4 50
11 30 20 9 90
... rest of output deleted.
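As a cross-check on the aggregated counts, Table 1.2 can probably also be reached without the merge, because the average of every possible pair comes straight from combn(). This is an alternative sketch, not the approach used above, and the names avgA, avgB and Table12 are illustrative.

```r
BagA <- c(-1000,10,10,10,10,10,10,
          10,20,20,20,20,20,20,30,
          30,40,40,50,60)
BagB <- c(10,20,30,30,40,40,50,50,
          50,50,50,50,60,60,60,60,
          60,60,60,1000)

# Average of every unordered pair drawn without replacement
avgA <- colMeans(combn(BagA, 2))
avgB <- colMeans(combn(BagB, 2))

# Tabulate both bags against a common, sorted set of averages so that
# averages missing from one bag show up as 0 rather than being dropped
avgs <- sort(unique(c(avgA, avgB)))
Table12 <- data.frame(
  Average = avgs,
  BagA    = as.vector(table(factor(avgA, levels = avgs))),
  BagB    = as.vector(table(factor(avgB, levels = avgs)))
)
head(Table12, 11)
```

Unlike the aggregate() call above, this does not carry along a summed Average column, which has no meaning for the table.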
The top and bottom rows of both tables appear to me to have no value.
They are not really items in the sample space or the problem and their
purpose remains a mystery.
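One possible reading, offered only as an assumption since the thread never confirms the book's rule: Table 1.1 may list every unordered pair of the distinct voucher values found in either bag, whether or not the pair can actually be drawn. Under that assumption the zero-count rows appear naturally, as in this sketch (ways() and grid are illustrative names).

```r
BagA <- c(-1000,10,10,10,10,10,10,
          10,20,20,20,20,20,20,30,
          30,40,40,50,60)
BagB <- c(10,20,30,30,40,40,50,50,
          50,50,50,50,60,60,60,60,
          60,60,60,1000)

# ASSUMPTION: rows are all pairs X1 <= X2 of distinct values across both bags
vals <- sort(unique(c(BagA, BagB)))
grid <- expand.grid(X2 = vals, X1 = vals)        # X2 varies fastest
grid <- grid[grid$X1 <= grid$X2, c("X1", "X2")]
rownames(grid) <- NULL

# ways() is a hypothetical helper: number of ways to draw values a and b
# from one bag without replacement
ways <- function(bag, a, b) {
  na <- sum(bag == a)
  nb <- sum(bag == b)
  if (a == b) choose(na, 2) else na * nb
}

grid$Average <- (grid$X1 + grid$X2) / 2
grid$BagA <- mapply(ways, grid$X1, grid$X2, MoreArgs = list(bag = BagA))
grid$BagB <- mapply(ways, grid$X1, grid$X2, MoreArgs = list(bag = BagB))
head(grid, 10)
```

The first ten rows can be compared against the rows quoted from the book in the original post.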
(And ... Please do learn to include context.)
> David Arnold
> College of the Redwoods
David Winsemius, MD
Alameda, CA, USA