
"cophenetic" function for objects of class "dendrogram"

4 messages · Alberto Fernández Sabater, William Dunlap, Tal Galili

#
Hello,


I have been using the "cophenetic" function for objects of class "dendrogram" and I have realised that it gives results different from those obtained with objects of class "hclust". For instance, running the first example in the help file of the "cophenetic" function,


d1 <- dist(USArrests)

hc <- hclust(d1, "ave")

d2 <- cophenetic(hc)

cor(d1, d2)  # 0.7659


the result given is different to the one obtained using an object of class "dendrogram",


dendro <- as.dendrogram(hc)

d3 <- cophenetic(dendro)

cor(d1, d3)  # 0.0151


I think that it would be desirable to obtain the same result with all the "cophenetic" methods, irrespective of the class of the object used. If this is not possible, the help file could warn users about the difference.


Thanks,

Alberto Fernandez
#
I think the results differ only in the order of the labels.  The following
function puts the labels in a standard order and then the results are the same:

  canonicalize.dist <- function (distObject)
  {
      o <- order(labels(distObject))
      as.matrix(distObject)[o, o, drop = FALSE]
  }
  identical(canonicalize.dist(d2), canonicalize.dist(d3))
  [1] TRUE
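
As a further illustration of the same point, here is a minimal sketch that reorders the dendrogram-based distances to follow the label order of d1 before computing the correlation. The helper name reorder_like is hypothetical (it is not part of base R), and the sketch assumes the labels are unique, as they are for USArrests.

```r
d1 <- dist(USArrests)
hc <- hclust(d1, "ave")
d2 <- cophenetic(hc)
d3 <- cophenetic(as.dendrogram(hc))

# Hypothetical helper: reorder the dist object `x` so that its labels
# follow the same order as the labels of `target`.
reorder_like <- function(x, target) {
  lab <- labels(target)
  as.dist(as.matrix(x)[lab, lab, drop = FALSE])
}

d3_aligned <- reorder_like(d3, d1)

# After alignment the two cophenetic results should agree ...
print(all.equal(as.vector(d2), as.vector(d3_aligned)))
# ... and so should the correlations with the original distances.
print(cor(d1, d2))
print(cor(d1, d3_aligned))
```

Indexing by label rather than by position is what makes the comparison meaningful here.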




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Apr 21, 2016 at 2:37 AM, Alberto Fernández Sabater <alberto.fernandez at urv.cat> wrote:

            

  
  
#
Note that cophenetic.default (which works on the output of hclust(dist(X)))
uses the row names of X as labels.  as.dendrogram.hclust does not retain
those row names so cophenetic.dendrogram cannot use them (so it orders them
based on the topology of the dendrogram).
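
The difference in label order can be inspected directly. The following is a small sketch using the USArrests example from the original message; the variable names coph_hc and coph_dend are chosen here for illustration only.

```r
hc <- hclust(dist(USArrests), "ave")
dendro <- as.dendrogram(hc)

coph_hc <- cophenetic(hc)
coph_dend <- cophenetic(dendro)

# Labels from the hclust method follow the row order of USArrests.
print(head(labels(coph_hc)))
print(identical(labels(coph_hc), rownames(USArrests)))

# Labels from the dendrogram method; per the explanation above these are
# expected to follow the order of the leaves in the dendrogram instead.
print(head(labels(coph_dend)))
print(identical(labels(coph_dend), labels(dendro)))

# Both results contain the same set of labels, only arranged differently.
print(setequal(labels(coph_hc), labels(coph_dend)))
```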

Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Apr 21, 2016 at 7:59 AM, William Dunlap <wdunlap at tibco.com> wrote:

            

  
  
5 days later
#
Hi Alberto,
Everyone in this thread is correct.
I'll just mention that if your goal was to calculate the cophenetic
correlation between dendrograms, the function cor_cophenetic from the
dendextend package can help with that (as well as other functions such as
cor_bakers_gamma, tanglegram, and others). See the Bioinformatics paper for
more: http://bioinformatics.oxfordjournals.org/content/31/22/3718

As you can see from dendextend:::cor_cophenetic.default, it takes care of
ordering the distance matrices properly.
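
A minimal usage sketch follows, assuming the dendextend package is installed (the call is skipped otherwise). It compares two dendrograms built from the same distances with different linkage methods; the choice of "average" versus "complete" linkage is only for illustration and is not taken from this thread.

```r
coph_cor <- NA_real_

if (requireNamespace("dendextend", quietly = TRUE)) {
  d <- dist(USArrests)
  dend_ave <- as.dendrogram(hclust(d, "average"))
  dend_com <- as.dendrogram(hclust(d, "complete"))

  # Cophenetic correlation between the two trees.
  coph_cor <- dendextend::cor_cophenetic(dend_ave, dend_com)
  print(coph_cor)
}
```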

With regards,
Tal




----------------Contact Details:-------------------------------------------------------
Contact me: Tal.Galili at gmail.com |
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------


On Thu, Apr 21, 2016 at 7:38 PM, William Dunlap via R-devel <r-devel at r-project.org> wrote: