Classification of attribute table

Hi Dan, Dylan, Thierry & the rest of the list

Firstly, thanks for your input so far. Unfortunately I am running out of time, as I need to complete the analysis before IGARSS 09 in Cape Town, so I don't think I will be able to implement your suggestions. In the meantime I have simply selected the segments that are above the 50th percentile and used those for further analysis (segments containing brighter values are considered tree crowns). I am not sure whether classification would improve my tree counting accuracy. My initial results give roughly 70% accuracy when compared to field enumeration, which I can improve upon by tweaking the watershed segmentation.
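
The percentile selection described above can be sketched as follows. This is only an illustration under assumptions: the attribute table is represented as a list of records, and the field name "mean" (per-segment mean brightness) and the sample values are hypothetical, not taken from the actual data.

```python
from statistics import median

def select_bright_segments(segments, field="mean"):
    """Keep segments whose brightness is strictly above the 50th percentile."""
    # The 50th percentile is the median of the chosen attribute.
    threshold = median(s[field] for s in segments)
    return [s for s in segments if s[field] > threshold]

# Hypothetical attribute table: one record per watershed segment.
segments = [
    {"id": 1, "mean": 42.0},
    {"id": 2, "mean": 97.5},
    {"id": 3, "mean": 61.2},
    {"id": 4, "mean": 120.3},
]

crowns = select_bright_segments(segments)
crown_ids = [s["id"] for s in crowns]
print(crown_ids)
```

Segments that fall exactly on the threshold are excluded here; whether they should count as crowns is a choice the text above does not settle.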

To answer your question, Dan: the 1916 cases are fixed, with no additional cases, although the same analysis will be applied in at least 10 other discrete plantation forest compartments. Perhaps the probability of cluster membership, based on the initial clustering, could be used in the rest of the compartments. I am not sure whether that will work, but it could be interesting to test for the paper associated with this work.

Once I have finished the poster for IGARSS I will revisit the classification, as this work is the final chapter of my PhD and I would like to get it published.

Many thanks to the list for all your assistance.
Kind regards,
Wesley
 

Wesley Roberts MSc.
Researcher: Earth Observation (Ecosystems)
Natural Resources and the Environment
CSIR
Tel: +27 (21) 888-2490
Fax: +27 (21) 888-2693

"To know the road ahead, ask those coming back."
- Chinese proverb
Hi Wesley,

So you just want to partition the 1916 cases into three clusters. This
is a clustering problem rather than a discriminant analysis oriented
classification problem. As a result, Dylan Beaudette's suggestion of
using the clara() function is pretty reasonable, but your data set isn't
so large that other (more computationally intensive) algorithms can't be
used (assuming you have a machine with a reasonable amount of memory in
it). Moreover, some of your measures are very highly correlated with one
another (var and stdev for instance), so you can probably reduce the
number of variables used in the clustering.
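
One way to act on the point about correlated measures is to screen the variables pairwise before clustering. The following is a rough sketch, not a recommendation of a particular cutoff: the 0.95 threshold, the column names, and the sample values are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (no constant columns)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_correlated(columns, cutoff=0.95):
    """Keep a variable only if it is not highly correlated with one already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < cutoff for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-segment measures; stdev is the square root of var.
columns = {
    "mean": [10.0, 12.0, 9.0, 15.0, 11.0],
    "var": [4.0, 9.0, 1.0, 16.0, 25.0],
    "stdev": [2.0, 3.0, 1.0, 4.0, 5.0],
}

kept = drop_correlated(columns)
print(kept)
```

The result depends on the order of the columns, since the first variable of a correlated pair is the one retained.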

Are the 1916 cases fixed, or will you want to take new cases and then
assign them to one of the three clusters created using the original
1916? If you will have new cases, model based clustering might make the
most sense, since it gives you a clean way of assigning new cases to the
existing clusters based on the posterior probability of cluster
membership.
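
To make the idea of assigning a new case by posterior probability concrete, here is a minimal sketch assuming a one-variable Gaussian mixture with three components that has already been fitted. The weights, means, and standard deviations below are invented for illustration; in practice they would come from the model fitted to the original 1916 cases, most likely on several variables.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2.0 * pi))

def posterior(x, clusters):
    """Posterior probability of each cluster for a new case x (Bayes' rule)."""
    joint = [c["weight"] * normal_pdf(x, c["mean"], c["sd"]) for c in clusters]
    total = sum(joint)  # would be 0.0 for a case extremely far from every cluster
    return [j / total for j in joint]

# Hypothetical fitted mixture: three clusters on a single brightness measure.
clusters = [
    {"weight": 0.5, "mean": 40.0, "sd": 8.0},
    {"weight": 0.3, "mean": 80.0, "sd": 10.0},
    {"weight": 0.2, "mean": 120.0, "sd": 12.0},
]

probs = posterior(85.0, clusters)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, [round(p, 4) for p in probs])
```

The new case goes to the cluster with the largest posterior, and the probabilities themselves indicate how clear-cut that assignment is.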

Dan
On Mon, 2009-05-11 at 07:44 -0700, Dylan Beaudette wrote:
-- 
Dan Putler
Sauder School of Business
University of British Columbia