Dear all,
the default number of significant digits in summary.default is
digits = max(3, getOption("digits") - 3)
which evaluates to 4 on my platform. The point is that if you have,
say, integer data of magnitude greater than 10^3, summary() will
produce heavily rounded results.
A simple example follows:
> x <- c(123456,234567,345678)
> x
[1] 123456 234567 345678
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 123500  179000  234600  234600  290100  345700
> # quite different from
> quantile(x)
      0%      25%      50%      75%     100%
123456.0 179011.5 234567.0 290122.5 345678.0
Is it possible to adapt the number of significant digits to the
magnitude of the data?
The first thing that comes to mind is
digits = nchar(trunc(max(x)))
If it is not possible then I think it would be nice to mention the
issue in the documentation.
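For illustration, the heuristic above can be tried by passing digits explicitly to summary(). This is only a minimal sketch using the data from the example; the variable names default_digits, d and s are made up here, and since the printed layout may differ between R versions, only the numeric values are inspected.

```r
x <- c(123456, 234567, 345678)

# Default quoted above: 4 when getOption("digits") is the usual 7.
default_digits <- max(3, getOption("digits") - 3)

# Proposed heuristic: one significant digit per character of the
# integer part of the largest value.
d <- nchar(trunc(max(x)))   # 6 for this data

# Passing digits explicitly overrides the default.
s <- summary(x, digits = d)

# Strip the class so that only the plain numbers are shown.
as.numeric(s)
```

With six significant digits the minimum, median and maximum come back unrounded for this data.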
Thanks for the attention,
Simone
> R.version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          3.1
year           2006
month          06
day            01
svn rev        38247
language       R
version.string Version 2.3.1 (2006-06-01)
______________________________________________________
Simone Giannerini
Dipartimento di Scienze Statistiche "Paolo Fortunati"
Universita' di Bologna
Via delle belle arti 41 - 40126 Bologna, ITALY
Tel: +39 051 2098262 Fax: +39 051 232153
"Simone" == Simone Giannerini <sgiannerini at gmail.com>
on Thu, 14 Sep 2006 11:14:51 +0200 writes:
Simone> Dear all, the number of significant digits in
Simone> summary default is
Simone> digits = max(3, getOption("digits") - 3)
Simone> on my platform this evaluates to 4. The point is
Simone> that if you have, say, integer data of magnitude
Simone> greater than 10^3 the command summary will produce
Simone> heavily rounded results.
Simone> A simple example follows:
>> x <- c(123456,234567,345678)
>> x
Simone> [1] 123456 234567 345678
>> summary(x)
Simone>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
Simone>  123500  179000  234600  234600  290100  345700
Simone> # quite different from
>> quantile(x)
Simone> 0% 25% 50% 75% 100%
Simone> 123456.0 179011.5 234567.0 290122.5 345678.0
Yes, this is a very, very old topic that has come up frequently on
the R lists.
The reason for this default has been compatibility with S
and in particular Splus-3.4 (1996) which used to be a partial
role model for R in its infancy.
However, I now see that Insightful also must have decided that
the old S setting was not satisfactory and that one can and
should do better.
Simone> Is it possible to adapt the number of significant
Simone> digits to the magnitude of the data? The first
Simone> thing that comes into my mind is
Simone> digits = nchar(trunc(max(x))) #
That's a first step, yes,
but it needs quite a bit of fixing up before it's usable.
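To make the kind of fixup concrete: nchar() counts the minus sign of a negative value as a character, and a missing value propagates through max(). The following is only a sketch of one way around that, not the S-plus implementation and not a proposed patch; the helper name magnitude_digits() and its arguments min_digits and max_digits are hypothetical.

```r
# Hypothetical helper: number of significant digits needed to show the
# integer part of the largest absolute value in x, clamped to a range.
magnitude_digits <- function(x, min_digits = 3L, max_digits = 15L) {
  # Drop NA, NaN, Inf and zeros; work with absolute values.
  x <- abs(x[is.finite(x) & x != 0])
  if (length(x) == 0L) return(as.integer(min_digits))
  # Digits in the integer part of the largest value
  # (zero or negative when all values are below 1 in magnitude).
  n_int <- floor(log10(max(x))) + 1
  as.integer(min(max_digits, max(min_digits, n_int)))
}

x <- c(123456, 234567, 345678)
magnitude_digits(x)                 # 6
magnitude_digits(c(NA, -123456))    # 6: sign and NA are ignored
magnitude_digits(c(0.001, 0.002))   # 3: falls back to min_digits
```

The result could then be passed on as summary(x, digits = magnitude_digits(x)). Whether clamping to at least 3 digits is the right floor is a design choice, taken here from the max(3, ...) in the existing default.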
Since I've now seen the code of summary.default in S-plus 6.2,
I'm not in a good position to propose a code change here ---
unless Insightful ``donates'' their 3 lines of implementation to
R {which I think would be quite fair given the recent flurry of
things they've recently ported into S-plus 8.x}
Simone> If it is not possible then I think it would be nice
Simone> to mention the issue in the documentation.
The issue is mentioned, but perhaps too tersely.
I agree that I'd also want to change this behavior.
It's definitely too late for R 2.4.0: although this may
seem like a small thing to do,
it can have quite a large effect on the output of many R scripts.
Simone> Thanks for the attention,
Simone> Simone
>> R.version
..............
(does not really matter - here for once)
Martin Maechler, ETH Zurich
Since I've now seen the code of summary.default in S-plus 6.2,
I'm not in a good position to propose a code change here ---
unless Insightful ``donates'' their 3 lines of implementation to
R {which I think would be quite fair given the recent flurry of
things they've recently ported into S-plus 8.x}
It's also possible to be a bit smarter in specific cases. See for example
the LaTeX table functions for regression summaries in the Dmisc package[1],
which use the magnitude of the standard errors to determine the number of
digits shown for estimates (so that the number of digits varies for each
row/estimate).
[1] Not on CRAN. See http://www.menne-biomed.de/download/download.html
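A rough sketch of the idea, to show what is meant: each estimate gets as many decimal places as are needed to show two significant digits of its standard error. This is not the Dmisc code, whose actual rule may well differ; the function name round_by_se(), the argument se_sig and the numbers are invented for the example.

```r
# Hypothetical helper: format each estimate with the number of decimal
# places needed to show `se_sig` significant digits of its standard error.
round_by_se <- function(est, se, se_sig = 2L) {
  # Decimal places per row; never negative.
  dec <- pmax(0, se_sig - 1 - floor(log10(se)))
  mapply(function(e, d) formatC(e, format = "f", digits = d), est, dec)
}

est <- c(12.34567, 0.0123456, 1234.567)   # made-up estimates
se  <- c(0.5123,   0.00234,   45.6)       # made-up standard errors
round_by_se(est, se)
# [1] "12.35"  "0.0123" "1235"
```

The result is a character vector, which suits a table but not further computation.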
Karl Ove Hufthammer
E-mail and Jabber: karl at huftis.org