$`1`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
1 1 1 0 1 0 0 1 1 0 1 1 1
3 1 0 1 0 0 1 1 0 0 1 0 1
4 1 1 0 0 0 0 1 1 1 1 1 1
7 0 1 0 1 0 0 1 1 0 1 0 1
9 1 1 1 1 0 1 1 0 1 1 1 0
12 1 0 0 0 0 1 1 1 1 1 0 1
13 0 1 1 1 1 0 0 0 1 1 0 1
15 1 0 1 1 0 0 1 0 0 1 0 1
16 1 0 1 0 0 1 1 0 1 0 1 1
19 0 1 0 0 0 0 1 0 0 1 0 1
20 0 1 1 1 0 0 0 1 1 0 0 1
24 1 1 0 1 0 0 1 0 1 1 1 0
26 1 1 1 1 1 1 0 1 0 1 0 1
28 1 0 1 0 1 0 1 1 0 1 1 1
33 1 1 0 1 0 0 0 0 1 1 0 0
38 1 1 1 0 0 0 0 0 1 1 0 0
40 1 0 1 0 0 0 1 0 0 1 1 1
41 1 1 0 0 0 0 0 0 1 1 1 1
43 0 0 1 0 0 0 1 0 1 1 0 1
52 1 1 1 1 0 0 0 1 1 1 0 1
53 1 1 0 0 1 0 0 1 1 1 0 1
56 1 0 1 0 0 1 1 0 1 0 0 0
60 1 1 1 0 1 1 0 1 1 1 0 1
$`2`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
2 0 1 1 1 1 1 1 0 0 1 1 0
5 0 1 0 1 1 1 0 0 0 1 1 1
6 0 0 0 0 1 0 1 0 0 1 1 1
10 1 1 1 1 1 0 1 1 0 1 0 0
11 0 1 0 1 1 0 1 0 1 1 1 1
14 0 0 1 1 1 1 1 1 0 1 1 1
17 0 1 0 0 1 0 0 0 0 0 1 1
18 1 0 0 1 1 1 1 1 0 0 1 1
29 1 1 0 1 0 1 1 1 0 0 1 1
37 1 0 0 1 1 0 1 1 0 1 0 0
42 1 1 0 1 1 1 1 0 0 0 0 0
46 1 1 0 1 0 1 1 0 0 1 0 1
48 0 1 0 0 1 0 1 0 0 1 1 0
50 0 1 0 1 1 1 1 1 0 0 1 0
51 0 0 0 1 1 1 1 0 0 0 1 1
54 0 0 0 1 1 1 1 0 0 1 1 0
58 0 1 0 1 1 1 1 1 1 1 1 0
61 1 0 1 0 1 1 1 1 0 1 0 0
$`3`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
8 0 1 1 0 0 1 0 1 1 1 1 0
21 0 1 0 0 1 1 0 1 0 1 1 0
22 1 1 0 0 0 1 1 1 0 0 1 0
25 0 1 0 0 0 1 0 1 0 1 1 0
27 1 1 0 0 1 1 0 1 1 0 0 0
32 1 1 1 0 1 1 0 1 0 0 1 0
36 1 1 0 0 0 1 0 1 0 0 0 0
44 1 1 1 1 1 1 0 1 0 0 0 0
63 0 1 1 0 1 1 0 0 1 1 1 0
$`4`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
23 0 0 1 1 0 0 0 0 0 1 0 0
34 0 1 1 1 0 0 0 1 0 1 0 0
$`5`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
30 0 0 0 0 1 1 0 0 1 1 0 1
31 0 1 1 0 1 0 0 0 1 0 1 1
35 0 0 1 0 1 1 0 0 1 1 0 1
47 0 0 1 0 1 0 0 0 1 0 0 1
49 1 0 0 0 1 1 0 0 1 1 1 0
55 1 0 1 0 1 0 0 0 0 1 1 0
59 0 0 1 0 1 0 0 0 1 0 1 1
$`6`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
39 0 0 0 0 1 0 1 1 0 0 0 0
62 0 0 0 0 1 0 1 1 0 0 0 1
$`7`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
45 1 1 0 0 0 0 0 0 0 0 1 0
$`8`
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
57 0 0 1 0 0 1 0 1 0 0 1 1
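A per-cluster listing like the one above (one data frame per cluster, with the case ids kept as row names) can be produced with split(). This is only a minimal sketch on simulated data, since the real Arsontest2.csv is not available here; the name by.cluster is introduced for illustration.

```r
# Simulated stand-in for the real data: 63 cases, 12 binary variables
set.seed(42)
FS1 <- data.frame(matrix(sample(c(0, 1), 12 * 63, replace = TRUE),
                         nrow = 63, ncol = 12))

dmat    <- dist(FS1, method = "binary")
cl.test <- hclust(dmat, method = "average")
hcli8   <- cutree(cl.test, k = 8)

# One data frame per cluster; row names keep the original case ids
by.cluster <- split(FS1, hcli8)

# Cases assigned to cluster 2, id by variable
by.cluster[["2"]]
```

Because split() keys on whatever grouping vector it is given, the same call should work unchanged for any other cutree() result.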
-------
David
-----Original Message-----
From: Bob Green [mailto:bgreen at dyson.brisnet.org.au]
Sent: Sunday, November 18, 2012 3:22 PM
To: dcarlson at tamu.edu; r-help at r-project.org
Subject: RE: [R] Examining how cases are similar by cluster, in cluster
analysis
David,
Many thanks, I'm sure this will be helpful. What would also be
helpful is if I can extract each cluster and examine id by variable,
within the respective cluster. I could index the variables for each
cluster and run such an analysis but there must be a more efficient
way of doing this (especially as I experiment with different
clustering methods).
Thanks again,
Bob
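For experimenting with different clustering methods, one option is to wrap the steps in a small function so that only the linkage method and the number of clusters change between runs. The helper cluster_cases() below is hypothetical (it does not appear anywhere in this thread), and the data are simulated, so treat it as a sketch rather than a recommended workflow.

```r
# Hypothetical helper: cluster with a given linkage method and
# return one data frame per cluster (case ids kept as row names)
cluster_cases <- function(data, k, method = "average") {
  tree   <- hclust(dist(data, method = "binary"), method = method)
  groups <- cutree(tree, k = k)
  split(data, groups)
}

# Simulated stand-in for the real data: 63 cases, 12 binary variables
set.seed(42)
FS1 <- data.frame(matrix(sample(c(0, 1), 12 * 63, replace = TRUE),
                         nrow = 63, ncol = 12))

avg  <- cluster_cases(FS1, k = 8)                       # average linkage
comp <- cluster_cases(FS1, k = 8, method = "complete")  # complete linkage

# Cluster sizes under each method
sapply(avg, nrow)
sapply(comp, nrow)
```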
At 06:44 AM 19/11/2012, David L Carlson wrote:
If you just want a summary of the mean for each variable in each
cluster, this will get you there:
set.seed(42)
FS1 <- data.frame(matrix(sample(c(0, 1), 12*63, replace=TRUE), nrow=63, ncol=12))
dmat <- dist(FS1, method="binary")
cl.test <- hclust(dmat, method="average")
plot(cl.test, hang=-1)
hcli8 <- cutree(cl.test, k=8)
tbl <- aggregate(FS1, by=list(Group=hcli8), mean)
print(tbl, digits=4)
  Group     X1     X2     X3     X4     X5     X6     X7     X8     X9
1     1 0.5122 0.6829 0.6829 0.6341 0.5854 0.5854 0.6829 0.6341 0.5366
2     2 0.0000 0.0000 0.0000 1.0000 0.6667 0.6667 0.0000 0.6667 0.0000
3     3 0.9286 0.1429 0.1429 0.1429 0.2857 0.5714 0.7857 0.3571 0.8571
4     4 1.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
5     5 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000
6     6 1.0000 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 1.0000 0.0000
7     7 1.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000
8     8 0.0000 1.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000
     X10    X11   X12
1 0.4146 0.4634 0.561
2 0.6667 0.0000 0.000
3 0.8571 0.6429 0.500
4 1.0000 0.0000 0.000
5 0.0000 1.0000 0.000
6 0.0000 0.0000 1.000
7 0.0000 0.0000 0.000
8 0.0000 0.0000 0.000
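Since the variables are coded 0/1, a cluster mean of exactly 0 or 1 means every case in that cluster has the same value on that variable. Building on the table of means, the sketch below lists those variables for each cluster. The data are simulated and the names means, agree and shared are introduced here for illustration.

```r
# Simulated stand-in for the real data: 63 cases, 12 binary variables
set.seed(42)
FS1 <- data.frame(matrix(sample(c(0, 1), 12 * 63, replace = TRUE),
                         nrow = 63, ncol = 12))
hcli8 <- cutree(hclust(dist(FS1, method = "binary"), method = "average"),
                k = 8)

tbl   <- aggregate(FS1, by = list(Group = hcli8), mean)
means <- as.matrix(tbl[, -1])      # drop the Group column
rownames(means) <- tbl$Group

# TRUE where all cases in the cluster share the same value (all 0 or all 1)
agree <- means == 0 | means == 1

# For each cluster, the names of the variables on which its cases agree
shared <- lapply(rownames(means), function(g) colnames(means)[agree[g, ]])
names(shared) <- rownames(means)
shared
```

A cluster containing a single case agrees with itself on every variable, so such clusters will list all twelve. For larger clusters a looser cut-off (for example, means above 0.8 or below 0.2) may be more informative than exact agreement.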
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Bob Green
Sent: Sunday, November 18, 2012 5:00 AM
To: r-help at r-project.org
Subject: [R] Examining how cases are similar by cluster, in
cluster analysis
Hello,
I used the following code to perform a cluster analysis on a
dataframe consisting of 12 variables (coded as 1,0) and 63
cases.
FS1 <- read.csv("D://Arsontest2.csv",header=T,row.names=1)
str(FS1)
dmat <- dist(FS1, method="binary")
cl.test <- hclust (dist(FS1, method ="binary"), "ave")
plot(cl.test, hang = -1)
Each case has an id and the dendrogram identifies the respective cases
which constitute each cluster. What I am seeking advice on is how to
examine the variables on which the cases are similar, within each
cluster.
sort (hcli8 <- cutree(cl.test, k=8)) identifies that cluster 2 is
comprised of the following cases:
1641 2295 2594 2654 2799 3213 3510 3513 2958 3294
   2    2    2    2    2    2    2    2    2    2
The following code provides means for the variables by cluster. In
relation to cluster 2, it appears the cases should have no clear motive
and be depressed:
round(sapply(x, function(i) colMeans(FS1[i,])),2)
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
depressed 0.00 0.33 0.00  0.0    0  0.6 0.00 0.08
unclear   0.33 1.00 1.00  1.0    0  0.0 0.07 0.12
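The object x used in the sapply() call is not defined in the message. One plausible reading, and only an assumption, is that it is a list of case ids with one element per cluster. Under that assumption, the sketch below builds such a list with split() and reproduces the same kind of table on simulated data; cluster.means is a name introduced for illustration.

```r
# Simulated stand-in for the real data: 63 cases, 12 binary variables
set.seed(42)
FS1 <- data.frame(matrix(sample(c(0, 1), 12 * 63, replace = TRUE),
                         nrow = 63, ncol = 12))
hcli8 <- cutree(hclust(dist(FS1, method = "binary"), method = "average"),
                k = 8)

# Assumed definition of x: case ids (row names), one list element per cluster
x <- split(rownames(FS1), hcli8)

# Rows are variables, columns are clusters
cluster.means <- round(sapply(x, function(i) colMeans(FS1[i, , drop = FALSE])), 2)
cluster.means
```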
I can manually examine this variable by variable and look at how
each of the cases in cluster 2 is similar on the variables. I am
looking for a more efficient and quicker way to do this.
Bob