
cluster summary score

On 08/08/02 13:23, Huan Huang wrote:
I'm not sure what you mean by reduction, but when I and many
other psychologists face this kind of problem - reducing a set of
variables - we often use factor analysis.  A good program is
factanal in the mva library.  Varimax rotation (the default)
usually picks out a sensible set of factors, although of course
other rotations may be more informative for a given case.  You
can sort the loadings if you want.  (Look at the various options
for loadings() and print().)
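
As a rough illustration of the factanal() call described above, here is a
minimal sketch.  The data are simulated and purely hypothetical (two latent
factors with three indicators each, named a1..a3 and b1..b3); they are not
from the original question.  In current versions of R, factanal() lives in
the stats package, which is loaded by default, so no library() call is
needed.

```r
# Simulated data (hypothetical): two latent factors, three indicators each.
set.seed(1)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
noisy <- function(f) f + rnorm(n, sd = 0.6)
dat <- data.frame(a1 = noisy(f1), a2 = noisy(f1), a3 = noisy(f1),
                  b1 = noisy(f2), b2 = noisy(f2), b3 = noisy(f2))

# Varimax is the default rotation; it is spelled out here for clarity.
fit <- factanal(dat, factors = 2, rotation = "varimax")

# Sort variables by the factor they load on and hide small loadings.
print(loadings(fit), sort = TRUE, cutoff = 0.3)

# An oblique alternative, in case correlated factors suit the data better.
fit_promax <- factanal(dat, factors = 2, rotation = "promax")
print(loadings(fit_promax), sort = TRUE, cutoff = 0.3)
```

The sort and cutoff arguments to print() only change the display; the
loadings stored in the fitted object are unchanged.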

There are no fixed rules for this sort of thing.  Sometimes one
variable winds up in the wrong place by chance.  The strategy I
use is to figure out a sensible grouping of variables before I
use them to predict anything, so that I am not biased by knowing
the results.  So I feel free to move or remove variables that
don't make sense.  Some people may prefer a more rigid approach,
which further reduces the temptation to cheat.

Having found the grouping of variables, you can do three
different things:

1. Define "scores" by simply adding up the (standardized?) scores
   of the variables in each group (with high loadings in the same
   factor, perhaps).

2. Use the factor scores themselves as variables.

3. Use a single representative variable from each group.  This
   seems to be what you were suggesting, but I'm having trouble
   thinking of a situation where this would be better than #1 or
   #2.
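
The three options above might be sketched as follows.  This is only one way
to do it, under stated assumptions: the data are simulated and hypothetical,
each variable is assigned to the factor on which it has its largest absolute
loading, and the "representative" variable is taken to be the one with the
highest loading in its group.  None of these choices comes from the original
question.

```r
# Simulated data (hypothetical): two latent factors, three indicators each.
set.seed(1)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
noisy <- function(f) f + rnorm(n, sd = 0.6)
dat <- data.frame(a1 = noisy(f1), a2 = noisy(f1), a3 = noisy(f1),
                  b1 = noisy(f2), b2 = noisy(f2), b3 = noisy(f2))

# Ask for regression factor scores so that option 2 is available.
fit <- factanal(dat, factors = 2, scores = "regression")
L   <- unclass(loadings(fit))           # plain matrix of loadings

# Assumed grouping rule: largest absolute loading decides the group.
grp <- apply(abs(L), 1, which.max)

# Option 1: add up the standardized variables within each group.
z <- scale(dat)
sum_scores <- sapply(1:2, function(k) rowSums(z[, grp == k, drop = FALSE]))

# Option 2: use the factor scores themselves.
factor_scores <- fit$scores

# Option 3: one representative variable per group (highest loading).
rep_vars <- sapply(1:2, function(k) {
  idx <- which(grp == k)
  names(idx)[which.max(abs(L[idx, k]))]
})
rep_data <- dat[, rep_vars]

# Compare options 1 and 2 for each group.
print(cor(sum_scores, factor_scores))
```

If a variable is moved or removed by hand, as discussed earlier, grp can
simply be edited before the sum scores are computed.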

Whatever you do, you need to figure out how many groups there
are, and prcomp() or princomp() is often helpful here.  (And take
a look at biplot(), a really nice tool for looking at the first
two principal components.)  The factanal() program also reports a
chi-square fit statistic.  So in principle you could use that to
figure out how many factors there are.  However, that method
usually gives more factors than are meaningful, especially when
you have a large data set.
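
A minimal sketch of the checks mentioned above, again on simulated,
hypothetical data: prcomp() to see how much variance each component carries,
biplot() for the first two components, and the chi-square statistic from
factanal() for one and two factors.  The range of factor counts tried here
is an assumption chosen to suit six variables.

```r
# Simulated data (hypothetical): two latent factors, three indicators each.
set.seed(1)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
noisy <- function(f) f + rnorm(n, sd = 0.6)
dat <- data.frame(a1 = noisy(f1), a2 = noisy(f1), a3 = noisy(f1),
                  b1 = noisy(f2), b2 = noisy(f2), b3 = noisy(f2))

# Principal components on standardized variables.
pc <- prcomp(dat, scale. = TRUE)
print(summary(pc))              # proportion of variance per component
screeplot(pc, type = "lines")   # look for the point where the curve flattens
biplot(pc)                      # variables and cases on the first two PCs

# Chi-square fit statistic from factanal() for 1 and 2 factors.
pvals <- numeric(2)
for (k in 1:2) {
  fit <- factanal(dat, factors = k)
  pvals[k] <- unname(fit$PVAL)
  cat(sprintf("factors = %d  chi-square = %.2f  df = %d  p = %.4f\n",
              k, fit$STATISTIC, fit$dof, fit$PVAL))
}
```

A small p-value says that k factors are not sufficient.  As noted above,
with a large data set this test tends to keep asking for more factors than
are meaningful, so it is best read alongside the plots.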
Message-ID: <20020808093705.A1940@cattell.psych.upenn.edu>
In-Reply-To: <001001c23ed6$5fa864b0$a91401a3@stats.ox.ac.uk>; from huang@stats.ox.ac.uk on Thu, Aug 08, 2002 at 01:23:05PM +0100