Hello,

I have been using the "cophenetic" function for objects of class "dendrogram" and I have realised that it gives different results when it is used with objects of class "hclust". For instance, running the first example in the help file of the "cophenetic" function,

d1 <- dist(USArrests)
hc <- hclust(d1, "ave")
d2 <- cophenetic(hc)
cor(d1, d2) # 0.7659

the result given is different to the one obtained using an object of class "dendrogram",

dendro <- as.dendrogram(hc)
d3 <- cophenetic(dendro)
cor(d1, d3) # 0.0151

I think that it would be desirable to obtain the same result with all the "cophenetic" methods, irrespective of the class of the object used. If this is not possible, users could be warned in the help file.

Thanks,
Alberto Fernandez
"cophenetic" function for objects of class "dendrogram"
4 messages · Alberto Fernández Sabater, William Dunlap, Tal Galili
I think the results differ only in the order of the labels. The following function puts the labels in a standard order and then the results are the same:

canonicalize.dist <- function (distObject)
{
  o <- order(labels(distObject))
  as.matrix(distObject)[o, o, drop = FALSE]
}
identical(canonicalize.dist(d2), canonicalize.dist(d3))
[1] TRUE

Bill Dunlap
TIBCO Software
wdunlap tibco.com
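
Building on the reordering idea above, the correlation from the original post can be made comparable by indexing the dendrogram-based distances with the labels of the original distance object before calling cor. This is only a minimal sketch; the name d3_aligned is introduced here for illustration and does not appear in the thread.

```r
# Reproduce the objects from the original post
d1 <- dist(USArrests)
hc <- hclust(d1, "ave")
d2 <- cophenetic(hc)
d3 <- cophenetic(as.dendrogram(hc))

# d3_aligned (illustrative name): d3 reindexed by the labels of d1,
# so that both objects list the pairs of observations in the same order
lab <- labels(d1)
d3_aligned <- as.dist(as.matrix(d3)[lab, lab])

cor(d1, d2)          # reported as 0.7659 in the original post
cor(d1, d3_aligned)  # should now agree with the line above
all.equal(cor(d1, d2), cor(d1, d3_aligned))
```

Indexing by name rather than sorting keeps the result in the order of the input data, which is what cor needs when one of its arguments is the original dist object.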
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Note that cophenetic.default (which works on the output of hclust(dist(X))) uses the row names of X as labels. as.dendrogram.hclust does not retain those row names, so cophenetic.dendrogram cannot use them (so it orders them based on the topology of the dendrogram).

Bill Dunlap
TIBCO Software
wdunlap tibco.com
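
The difference in label order described above can be inspected directly. The sketch below only prints the labels attached to each result for comparison; it makes no claim about the exact order a given R version produces for the dendrogram method.

```r
d1 <- dist(USArrests)
hc <- hclust(d1, "ave")
d2 <- cophenetic(hc)                 # method used for hclust objects
d3 <- cophenetic(as.dendrogram(hc))  # method used for dendrogram objects

# Both results carry the same set of labels ...
setequal(labels(d2), labels(d3))
# ... but the order in which they are stored may differ
identical(labels(d2), labels(d3))

head(labels(d2))           # order of the row names of USArrests
head(labels(d3))           # order used by the dendrogram method
head(hc$labels[hc$order])  # left-to-right leaf order of the tree, for reference
```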
5 days later
Hi Alberto,

Everyone in this thread is correct. I'll just mention that if your goal was to calculate the cophenetic correlation between dendrograms, the function cor_cophenetic from the dendextend package can help with that (as well as other functions such as cor_bakers_gamma, tanglegram and others; see the Bioinformatics paper for more: http://bioinformatics.oxfordjournals.org/content/31/22/3718 ).

As you can see from dendextend:::cor_cophenetic.default, it deals with properly ordering the distance matrices.

With regards,
Tal

Contact me: Tal.Galili at gmail.com | Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
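
As a rough illustration of the suggestion above, the sketch below compares two trees built from the same data with different linkage methods. It assumes the dendextend package is installed and that cor_cophenetic accepts two dendrograms as its first two arguments; the names hc_avg, hc_cmp and res are illustrative only, and the call is skipped when the package is not available.

```r
# Two hierarchical clusterings of the same data (illustrative choice of methods)
hc_avg <- hclust(dist(USArrests), "ave")
hc_cmp <- hclust(dist(USArrests), "complete")

if (requireNamespace("dendextend", quietly = TRUE)) {
  # Cophenetic correlation between the two trees; per the message above,
  # the function takes care of ordering the distance matrices consistently
  res <- dendextend::cor_cophenetic(as.dendrogram(hc_avg),
                                    as.dendrogram(hc_cmp))
  print(res)
} else {
  res <- NULL
  message("dendextend is not installed; skipping the example")
}
```

Note that this compares two dendrograms with each other, which is a different quantity from the correlation between a tree and the original distances computed in the first post.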