
Strange Results of summary()

2 messages · Martin Maechler, Hubert Palme

#
Hubert Palme writes:
.....
What's the problem? 
	'(Other)' gives all the levels having (in your case) 0,1,2 observations,
	which sum to 3 observations.
"summary(.)" should give a summary  (think of a factor with 500 levels....)
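
A minimal sketch of the lumping described above, assuming a current R version in which summary() on a factor takes a "maxsum" argument (as far as I know, the data-frame method passes maxsum = 7 by default). The factor is rebuilt from the counts quoted in this thread; the R version under discussion here may have behaved differently in detail.

```r
# Counts per level as reported by table(berufl) in this thread,
# listed in level order.
counts <- c("wissensch.-technisch"           = 3,
            "Leiter Oeff. Dienst/Wirtschaft" = 0,
            "Bureaukraft"                    = 15,
            "Handel"                         = 3,
            "Dienstleistungsgewerbe/Soldat"  = 2,
            "Gaertner/Jaeger"                = 1,
            "Guetererzeugung"                = 9,
            "sonstige"                       = 4)

# Rebuild the factor: each level repeated by its count, plus 43 NAs.
berufl <- factor(c(rep(names(counts), times = counts), rep(NA, 43)),
                 levels = names(counts))

# With at most 7 entries, one slot goes to NA's and one to "(Other)",
# so only the 5 most frequent levels are listed by name.
s <- summary(berufl, maxsum = 7)
print(s)

# "(Other)" collects the three least frequent levels: 0 + 1 + 2 = 3.
print(s[["(Other)"]])

# table() keeps every level, in level order, but drops the NAs by default.
print(table(berufl))
```

The named entries are sorted by decreasing frequency, which is why the summary order differs from the level order that table() uses.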

table() is more detailed (but doesn't report the NA's),
  which is the only thing to criticize here:

  S-plus's  table(..) has an extra argument  "exclude" which
  we should also have in R:

  S> args(table) 
	  function(..., exclude = c(NA, NaN))

  S> table(c(NA,1:5))
   1 2 3 4 5 
   1 1 1 1 1
  S> table(c(NA,1:5), exclude=NULL)
   1 2 3 4 5 NA 
   1 1 1 1 1  1
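
For readers on a current R version: table() there accepts both an "exclude" and a "useNA" argument, so the S-PLUS behaviour shown above can be reproduced. A short sketch, not tied to the R version discussed in this thread:

```r
x <- c(NA, 1:5)

# Default: the NA is silently dropped from the counts.
t_default <- table(x)

# exclude = NULL keeps the NA as its own cell.
t_all <- table(x, exclude = NULL)

# useNA = "ifany" adds an NA cell only when NAs are present.
t_ifany <- table(x, useNA = "ifany")

print(t_default)
print(t_all)
print(t_ifany)
```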

Martin Maechler <maechler@stat.math.ethz.ch>			<><
Seminar fuer Statistik, ETH-Zentrum SOL G1;	Sonneggstr.33
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1086
http://www.stat.math.ethz.ch/~maechler/

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Martin Maechler: 
 > >>                    berufl  
 > >>  Bureaukraft         :15  
 > >>  Guetererzeugung     : 9  
 > >>  sonstige            : 4  
 > >>  Handel              : 3  
 > >>  wissensch.-technisch: 3  
 > >>  (Other)             : 3  
 > >>  NA's                :43  
 > 
 > .....
 > 
 > >> > table(berufl)
 > >>           wissensch.-technisch Leiter Oeff. Dienst/Wirtschaft 
 > >>                              3                              0 
 > >>                    Bureaukraft                         Handel 
 > >>                             15                              3 
 > >>  Dienstleistungsgewerbe/Soldat                Gaertner/Jaeger 
 > >>                              2                              1 
 > >>                Guetererzeugung                       sonstige 
 > >>                              9                              4 
 > 
 > What's the problem? 
 > 	'(Other)' gives all the levels having (in your case) 0,1,2 observations,
 > 	which sum to 3 observations.

Do I understand you correctly that the values with low frequency are
put together in (Other)? This should be explained to a newbie!!

- What criterion decides which values are put into (Other)?
- What kind of order do the values have? Frequency?

This is very confusing! Where can I get information about all this?

 > table() is more detailed (but doesn't report the NA's),
 >   which is the only thing to criticize here:

I agree.

Thanks!

(Hmm... R is a very interesting and powerful tool, but its
philosophy and terminology take a lot of getting used to for someone
familiar with SPSS & Co.)