Is it possible to obtain an agglomeration schedule with R cluster analysis
You didn't show what the tabular summary should look like. However, look at the height and merge components of an hclust object:
hc3 <- hclust(dist(USArrests[1:8, c(1,2,4)]))
data.frame(hc3[2:1])
      height merge.1 merge.2
1   9.297849      -1      -8
2  13.609188      -2      -5
3  23.779193      -4      -6
4  33.865321      -3       2
5  48.229659       1       3
6 104.636227       4       5
7 185.135221      -7       6
The two merge.* columns identify which groups merged at
the corresponding height value. A negative value, -i, refers to the
i'th entry of the 'labels' component (a single observation), and a
positive value, i, refers to the cluster created in the i'th row of the
data.frame. The following function transforms those references into names:
f <- function(hc){
    data.frame(row.names=paste0("Cluster",seq_along(hc$height)),
               height=hc$height,
               components=ifelse(hc$merge<0, hc$labels[abs(hc$merge)], paste0("Cluster",hc$merge)),
               stringsAsFactors=FALSE)
}
as in
f(hc3)
             height components.1 components.2
Cluster1   9.297849      Alabama     Delaware
Cluster2  13.609188       Alaska   California
Cluster3  23.779193     Arkansas     Colorado
Cluster4  33.865321      Arizona     Cluster2
Cluster5  48.229659     Cluster1     Cluster3
Cluster6 104.636227     Cluster4     Cluster5
Cluster7 185.135221  Connecticut     Cluster6
Compare that to the output of str(as.dendrogram(hc3)):
str(as.dendrogram(hc3))
--[dendrogram w/ 2 branches and 8 members at h = 185]
  |--leaf "Connecticut"
  `--[dendrogram w/ 2 branches and 7 members at h = 105]
     |--[dendrogram w/ 2 branches and 3 members at h = 33.9]
     |  |--leaf "Arizona"
     |  `--[dendrogram w/ 2 branches and 2 members at h = 13.6]
     |     |--leaf "Alaska"
     |     `--leaf "California"
     `--[dendrogram w/ 2 branches and 4 members at h = 48.2]
        |--[dendrogram w/ 2 branches and 2 members at h = 9.3]
        |  |--leaf "Alabama"
        |  `--leaf "Delaware"
        `--[dendrogram w/ 2 branches and 2 members at h = 23.8]
           |--leaf "Arkansas"
           `--leaf "Colorado"
Does f() produce the information you need for your display?
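If the display should also show which observations each cluster contains, the merge matrix can be unrolled step by step. The following is only a sketch along the lines of f() above: the name cluster_members is made up for illustration, and it assumes the hclust object has a non-NULL 'labels' component (i.e., the data had row names).

```r
# Sketch: list the members of every cluster formed at each merge step.
# 'cluster_members' is a hypothetical helper, not part of package stats.
# Assumes hc$labels is not NULL (the input data had row names).
cluster_members <- function(hc) {
    n_steps <- nrow(hc$merge)
    members <- vector("list", n_steps)
    for (i in seq_len(n_steps)) {
        # Negative entries are single observations; positive entries
        # point to a cluster built in an earlier row of hc$merge.
        parts <- lapply(hc$merge[i, ], function(j) {
            if (j < 0) hc$labels[-j] else members[[j]]
        })
        members[[i]] <- unlist(parts)
    }
    data.frame(row.names = paste0("Cluster", seq_len(n_steps)),
               height = hc$height,
               size = lengths(members),
               members = vapply(members, paste, character(1), collapse = ", "),
               stringsAsFactors = FALSE)
}

hc3 <- hclust(dist(USArrests[1:8, c(1, 2, 4)]))
cluster_members(hc3)
```

Each row then gives the merge height, the number of observations in the newly formed cluster, and their labels, which may be closer to the stage-by-stage table that SPSS prints.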
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
Of Bob Green
Sent: Saturday, February 23, 2013 12:49 PM
To: Uwe Ligges
Cc: r-help at r-project.org
Subject: Re: [R] Is it possible to obtain an agglomeration schedule with R cluster analysis
Hello Uwe,
Thanks. Re-reading the hclust pages, I found that, with the
'USArrests' example data, the command > plot(hc1) shows the
order in which cases joined. However, I still can't see how to obtain
the height at which each case joined a cluster, or the
height at which clusters merge.
The dendrogram {stats} page provides the following code, which
displays the information that I require. However, what I would like
to obtain is a table of the heights at which each cluster formed.
> hc <- hclust(dist(USArrests), "ave")
> (dend1 <- as.dendrogram(hc)) # "print()" method
> str(dend1) # "str()" method
I also found as.hclust which plots what I want, but I still can't
find a way to produce the actual height values which are being
plotted, for example as a tabular summary.
plot(hc) ; mtext("hclust", side=1)
Any assistance is appreciated,
Bob
At 04:01 AM 24/02/2013, Uwe Ligges wrote:
On 22.02.2013 11:41, Bob Green wrote:
Hello, In SPSS the cluster analysis output includes an agglomeration schedule, which details the stages at which cases are joined. Is it possible to obtain such output when performing cluster analysis in R? If so, I'd appreciate advice regarding how to obtain this information.
If you are talking about hierarchical clustering via hclust(), see ?hclust. It tells you that the relevant information is available inside the object, and you can even see it via the plot method. Uwe Ligges
Any assistance is appreciated, Regards Bob
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.