From: Dan Bolser
On Thu, 4 Nov 2004, Berton Gunter wrote:
Dan:
1) There is no guarantee that PCA will show separate groups;
that is not its purpose, although it is frequently a side effect.
2) If you were to use a classification method of some sort
analysis, neural nets, SVMs, model-based classification, ...), my
understanding is that yes, indeed, severely unbalanced group
would, indeed, affect results. A guess is that Bayesian or
that could explicitly model the prior membership
better. To make it clear why, suppose that there was a 99.9%
"dog" and 0.05% each of the others. Then your datasets would
information on how covariates could distinguish the classes
classifier would be to call everything a "dog" no matter
covariates had.
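
To put rough numbers on the "call everything a dog" argument, here is a
minimal Python sketch. The sample size of 10,000 and the function name
majority_baseline are assumptions made for illustration; only the class
proportions (99.9% "dog", 0.05% each for the other two) come from the
example above.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    counts = Counter(labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(labels)


# Hypothetical sample of 10,000 people: 99.9% "dog", 0.05% "cat", 0.05% "fish".
labels = ["dog"] * 9990 + ["cat"] * 5 + ["fish"] * 5

label, accuracy = majority_baseline(labels)
print(f"always predict {label!r}: accuracy = {accuracy:.3f}")

# The same rule never identifies a single cat or fish owner, so the high
# accuracy says nothing about how well the minority classes are separated.
minority_found = sum(1 for y in labels if y != "dog" and y == label)
print(f"minority cases correctly identified: {minority_found}")
```

A classifier that ignores the covariates entirely already scores this well
on overall accuracy, which is why accuracy alone is a poor yardstick when
the groups are this unbalanced.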
I presume experts will have more and better to say about this.
Sounds interesting. Thanks very much for the input. Just out of
curiosity, given that I can make my data more uniform (less biased),
how could I best generate a 2D plot to encapsulate the clusters (and
inter-cluster relationships)?
Actually I am thinking of a 2D density.
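
One crude way to get a 2D density is to bin the points on a grid and
count them. The sketch below is only an illustration under stated
assumptions: the two Gaussian clusters, their sizes (1000 and 100), the
grid range, and the function name density_grid are all made up here, and
the points stand in for whatever two coordinates (for example the first
two PCA scores) are being plotted.

```python
import random


def density_grid(points, bins=8, lo=-4.0, hi=4.0):
    """Count points per cell of a bins x bins grid covering [lo, hi) on both axes."""
    grid = [[0] * bins for _ in range(bins)]
    width = (hi - lo) / bins
    for x, y in points:
        if lo <= x < hi and lo <= y < hi:
            # min() guards against floating-point rounding at the upper edge.
            col = min(int((x - lo) / width), bins - 1)
            row = min(int((y - lo) / width), bins - 1)
            grid[row][col] += 1
    return grid


random.seed(0)
# Hypothetical 2D coordinates: a large group near (-2, 0), a small one near (2, 0).
points = [(random.gauss(-2, 0.5), random.gauss(0, 0.5)) for _ in range(1000)]
points += [(random.gauss(2, 0.5), random.gauss(0, 0.5)) for _ in range(100)]

grid = density_grid(points)

# Text rendering: darker characters mean more points in the cell.
shades = " .:-=+*#"
peak = max(max(row) for row in grid)
for row in reversed(grid):  # print the highest y values first
    print("".join(shades[c * len(shades) // (peak + 1)] for c in row))
```

Because the shading is scaled to the busiest cell, the small group shows
up only faintly next to the large one, which is the same imbalance issue
in visual form. Scaling each group separately, or plotting log counts,
would be one way around that.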
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
"The business of the statistician is to catalyze the scientific
learning process." - George E. P. Box
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Dan Bolser
Sent: Thursday, November 04, 2004 9:41 AM
To: R mailing list
Subject: [R] highly biased PCA data?
Hello, suppose that I have two or three clear categories for my data,
let's say pet preference across fish, cat, dog. Let's say most people
rate their preference as being mostly one of the categories.
I want to do PCA on the data to see three 'groups' of people, one group
for fish, one for cat and one for dog. I would like to see the odd
person who likes two or all three in the (appropriate) middle of the
other main groups.
Will my data be affected by the fact that I have
owners, 100 cat owners and 10 fish owners? (assuming that each scale of
preference has an equal range).
Cheers,
dan.