Skip to content
Prev 170913 / 398506 Next

alpha shape function

Hi,

If the convex hull for *all* the data points is not ideal enough, is
it feasible to break the data into small subsets using clustering
methods such as kmeans() and compute the convex hull for each cluster?
Finally we are able to know the "borders" of all clusters using
chull(); I don't know how difficult it will be to find an exact
solution to your problem in the future computation, but I think there
can be good enough approximations.

For example, you may choose a proper 'k' for the k-means clustering:

##
set.seed(1234)
devAskNewPage(ask = TRUE)
par(pch = 20)
dat = iris[, 1:2]
n = nrow(dat)
for (k in 2:30) {
    ch = integer()
    cl = kmeans(dat, k, 50)$cluster
    plot(dat, main = paste("k =", k))
    for (i in unique(cl)) {
        idx = chull(tmp <- dat[cl == i, ])
        ch = c(ch, as.integer(rownames(tmp[idx, ])))
        polygon(tmp[idx, ], border = NA, col = rgb(0, 0, 0, 0.2))
    }
    plot(dat, main = paste("Polygon shape when k =", k))
    polygon(dat[ch, ], col = rgb(0, 0, 0, 0.2))  # need to be ordered
}
##

One critical problem I have not solved in the above code, I think, is
the ordering of all the border points, so the last whole polygon looks
weird...

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China
On Wed, Feb 18, 2009 at 9:46 PM, roger koenker <rkoenker at uiuc.edu> wrote: