Greetings RListers,

I have a data set containing two types of outcomes: success and failure. Associated with each outcome are 12 different measurements. I'm trying to find out, for example, whether some of the 12 measures are associated more with success or with failure, or whether there is any relationship at all between the measures and the outcomes (success or failure).

I don't have (as yet) any experience using clustering techniques (in R or elsewhere) but thought that they might be applied in this situation. In general, are clustering techniques useful in this situation? Is any technique better than another? And more specifically, can I use R to determine clustering characteristics of the 12 measures and then, for each cluster, have R output whether the cluster is more associated with success or failure?

Any help appreciated,
Bill Vedder
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Help with Clustering Techniques in R
3 messages · Bill Vedder, Christian Hennig, Brian Ripley
Dear Bill,

I think that this is essentially a problem of discriminant analysis or logistic regression and not of clustering. That is, you have two groups defined by success and failure, and you want to find out how well your measures distinguish between them.

R should be able to carry out such analyses. I don't have the time now to figure out exactly how. Background and further references are given e.g. in G.J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, Wiley, 1992.

Best,
Christian
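As a rough illustration of the discriminant-analysis route, here is a minimal sketch using `lda()` from the MASS package that ships with R. The data are simulated, and the names `m1` to `m12`, `outcome` and `dat` are hypothetical stand-ins for the real measures; the assumption that only `m1` and `m2` carry signal is made up for the example.

```r
library(MASS)  # provides lda()

# Simulated stand-in for the real data: 200 cases, 12 measures, binary outcome.
# All names (m1..m12, outcome, dat) are hypothetical.
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 12), n, 12, dimnames = list(NULL, paste0("m", 1:12)))
# Assumption for the example: only m1 and m2 differ between the groups
outcome <- factor(ifelse(X[, "m1"] - X[, "m2"] + rnorm(n) > 0,
                         "success", "failure"))
dat <- data.frame(outcome, X)

# Linear discriminant analysis of outcome on all 12 measures
fit <- lda(outcome ~ ., data = dat)
print(fit$scaling)            # coefficients of the linear discriminant

# Resubstitution predictions (optimistic: same data used to fit and to check)
pred <- predict(fit)$class
print(table(observed = dat$outcome, predicted = pred))
```

Measures with coefficients that are large relative to their scale contribute most to separating the two groups. The confusion table here is computed on the training data, so a held-out set or cross-validation would give a more honest error estimate.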
On Mon, 27 Aug 2001, Bill Vedder wrote:
[...]
***********************************************************************
Christian Hennig
University of Hamburg, Faculty of Mathematics - SPST/ZMS
(Schwerpunkt Mathematische Statistik und Stochastische Prozesse,
Zentrum fuer Modellierung und Simulation)
Bundesstrasse 55, D-20146 Hamburg, Germany
Tel: x40/42838 4907, private x40/631 62 79
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de
On Tue, 28 Aug 2001, Christian Hennig wrote:
Dear Bill, I think that this is essentially a problem of discriminant analysis or logistic regression and not of clustering. That is, you have two groups defined by success and failure, and you want to find out how your measures are able to distinguish between them.
In pattern-recognition terminology, this is a supervised and not an unsupervised problem.
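To make the distinction concrete, the sketch below runs an unsupervised method (k-means) on simulated measures and only afterwards cross-tabulates the clusters against the outcome. Everything here is hypothetical: the data are simulated, the names `m1` to `m12` and `outcome` are placeholders, and the choice of two clusters is an assumption.

```r
# Simulated stand-in data; names m1..m12 and outcome are hypothetical.
set.seed(2)
n <- 200
X <- matrix(rnorm(n * 12), n, 12, dimnames = list(NULL, paste0("m", 1:12)))
# Assumption for the example: the outcome depends on m1 only
outcome <- factor(ifelse(X[, "m1"] + rnorm(n) > 0, "success", "failure"))

# Unsupervised step: k-means sees only the measures, never the outcome
km <- kmeans(scale(X), centers = 2, nstart = 20)

# Afterwards, compare the clusters with the known outcome
tab <- table(cluster = km$cluster, outcome = outcome)
print(tab)
print(prop.table(tab, 1))  # share of failure/success within each cluster
```

Because the clustering never uses the outcome, nothing forces the clusters to line up with success and failure. A supervised method uses the labels directly to find the combination of measures that separates the groups.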
R should be able to carry out such analyses. I don't have the time now to figure out exactly how. Background and further references are given e.g. in G.J. McLachlan, Discriminant analysis and statistical pattern recognition, Wiley, 1992.
That's rather old-fashioned (and was in 1992). Probably the best exploratory methods are logistic discrimination and classification trees, methods statisticians used to overlook consistently. R is very well set up for this sort of thing. There are several worked examples in Venables & Ripley (1999) that work in R with minor changes (see the online R complements).
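A minimal sketch of both methods, not taken from Venables & Ripley: logistic discrimination via `glm()` with a binomial family, and a classification tree via `rpart()` from the rpart package that ships with R. The data are simulated and the names `m1` to `m12`, `outcome` and `dat` are hypothetical; the assumption that only `m1` and `m2` matter is made up for the example.

```r
library(rpart)  # provides rpart() for classification trees

# Simulated stand-in data; names m1..m12, outcome and dat are hypothetical.
set.seed(3)
n <- 300
X <- matrix(rnorm(n * 12), n, 12, dimnames = list(NULL, paste0("m", 1:12)))
# Assumption for the example: only m1 and m2 are related to the outcome
outcome <- factor(ifelse(X[, "m1"] - X[, "m2"] + rnorm(n) > 0,
                         "success", "failure"))
dat <- data.frame(outcome, X)

# Logistic discrimination: with a factor response, glm() models the
# probability of the second level, here "success"
logit_fit <- glm(outcome ~ ., data = dat, family = binomial)
print(round(summary(logit_fit)$coefficients, 3))

# Classification tree on the same data
tree_fit <- rpart(outcome ~ ., data = dat, method = "class")
print(tree_fit)

# Resubstitution predictions from the tree (optimistic error estimate)
tree_pred <- predict(tree_fit, type = "class")
print(table(observed = dat$outcome, predicted = tree_pred))
```

In the logistic fit, the sign of each coefficient indicates whether larger values of that measure go with success or with failure. The tree output lists the splits it chose, which gives a quick view of which measures and thresholds separate the groups.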
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595