Hi r-devels,
I am a bit puzzled by the behaviour of cat() --- any help is
appreciated...
It appears to me that the elements of sep are just used as separators
_between_ the elements of the objects comprising '...' handed to cat.
If the objects handed to cat hold N elements in total, cat needs N-1
separator strings.
The default separator string is " " (space character).
Hence for
cat(rep("x",3), sep = ".")
two periods are needed to separate the three elements of the input
cat(rep("x",3), sep = ".")
x.x.x
as expected.
For cat(rep("x",3),sep = c(".","\n",".")), the first separator
is a period, the second is a newline, and the third is not needed.
cat(rep("x",3),sep = c(".","\n","."))
x.x
x
as expected. The inserted line feed is expected: it is the
second element of the sep vector, so it should appear between
the second and third elements, as it does. The third element
of sep is not needed, so it is ignored.
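For anyone who wants to check this without relying on where the prompt lands, here is a minimal sketch. It assumes a small helper, cat_to_string(), which is my own invention and not part of base R; it sends cat() output to a temporary file and reads it back as one string. The last call, with a two-element sep on four elements, is my own extra case to show recycling.

```r
# Hypothetical helper (not part of base R): return exactly what cat()
# writes, as a single string, so separators and trailing characters
# can be inspected.
cat_to_string <- function(...) {
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)
  con <- file(tmp, open = "wb")  # binary mode avoids newline translation
  tryCatch(cat(..., file = con), finally = close(con))
  readChar(tmp, nchars = file.size(tmp), useBytes = TRUE)
}

# One separator string, three elements: two periods, none after the last.
one_sep <- cat_to_string(rep("x", 3), sep = ".")
print(one_sep)  # "x.x.x"

# Three separator strings, three elements: only the first two are used.
three_seps <- cat_to_string(rep("x", 3), sep = c(".", "\n", "."))
# Strip one trailing newline, if present, because whether cat() appends
# one in this case is the open question in this thread.
print(sub("\n$", "", three_seps))

# Two separator strings, four elements: sep is recycled.
recycled <- cat_to_string(rep("x", 4), sep = c(".", "-"))
print(recycled)
```

Writing to a file rather than using capture.output() is deliberate: capture.output() splits on newlines and so hides whether a final line feed was written.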
Another example:
again, as expected.
I haven't delved into the source to see where the final line feed
is being generated (as I see the next R prompt on a new line) so
I can't comment on whether anything is appended to the end of the
output string generated by cat(). The documentation says no line
feed is appended unless argument 'fill' is TRUE or numeric.
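A way to settle the question of the final line feed without reading the C source might look like the sketch below. Both cat_to_string() and ends_with_newline() are hypothetical helpers of mine, not base R functions. I have left out an expected value for the last call on purpose, since that is the case under discussion and the answer may depend on the R version.

```r
# Hypothetical helper (not part of base R): return exactly what cat()
# writes, as a single string.
cat_to_string <- function(...) {
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)
  con <- file(tmp, open = "wb")  # binary mode avoids newline translation
  tryCatch(cat(..., file = con), finally = close(con))
  readChar(tmp, nchars = file.size(tmp), useBytes = TRUE)
}

# Hypothetical helper: TRUE if the string ends in a line feed.
ends_with_newline <- function(s) grepl("\n$", s)

# Default fill = FALSE: nothing is appended.
print(ends_with_newline(cat_to_string("x")))               # FALSE

# fill = TRUE: a line feed is appended, as the documentation says.
print(ends_with_newline(cat_to_string("x", fill = TRUE)))  # TRUE

# sep containing "\n": result intentionally not stated here.
print(ends_with_newline(cat_to_string(rep("x", 3), sep = c(".", "\n", "."))))
```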
At least AFAICS, cat() with a vector-valued '...' argument behaves in
contradiction to what I understand from the note in the help for cat(),
which reads
"
Despite its name and earlier documentation, 'sep' is a vector of
terminators rather than separators, being output after every
vector element (including the last). Entries are recycled as
needed.
"
I think you're right that the documentation is incorrect. I'd prefer a
patch to the docs, rather than a change to the behaviour: cat() is so
fundamental that any changes to it would have wide-ranging consequences.
If you want to study the code and draft a documentation patch, I'll
review it and possibly commit it.
How about this:
sep a character vector of strings to insert between each object. If
there are too few elements in sep to separate all the objects,
the elements of sep are recycled. Unused elements of sep are ignored.
then in Details:
Details
cat is useful for producing output in user-defined functions. It
converts its arguments to character vectors, concatenates them to a
single character vector, inserts the given sep= string(s) between each
element and then outputs them.
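To make the proposed wording concrete, here is a rough model of it in plain R. The function cat_model() is hypothetical and only mirrors the text above: it is not how cat() is implemented, it uses as.character() as a simplification (cat() formats numbers differently), and it assumes sep has at least one element.

```r
# Hypothetical model of the proposed documentation, not the real cat():
# returns the string instead of printing it.
cat_model <- function(..., sep = " ") {
  # Convert each argument to character and flatten to one vector.
  elements <- unlist(lapply(list(...), as.character), use.names = FALSE)
  n <- length(elements)
  if (n == 0L) return("")
  if (n == 1L) return(elements)
  # Recycle sep over the n - 1 gaps; surplus entries of sep are unused.
  gaps <- rep_len(sep, n - 1L)
  # Interleave: e1, s1, e2, s2, ..., e(n-1), s(n-1), en.
  paste0(c(rbind(elements[-n], gaps), elements[n]), collapse = "")
}

print(cat_model(rep("x", 3), sep = "."))                 # "x.x.x"
print(cat_model(rep("x", 3), sep = c(".", "\n", ".")))   # "x.x\nx"
print(cat_model(rep("x", 4), sep = c(".", "-")))         # "x.x-x.x"
```

Under this reading nothing is ever written after the last element, which is the point where the proposed text and the current note about terminators disagree.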
Duncan Murdoch
----------------------------------------------------------------------------
reproducible example code:
----------------------------------------------------------------------------
cat(rep("x",3), sep = ".")
x.x.x
## no "." appended!
Things get even worse if "\n" features in the 'sep' vector:
cat(rep("x",3),sep = c(".","\n","."))
x.x
x
## last separator "." gets swallowed; an unintended line feed is inserted
----------------------------------------------------------------------------
code causing this behaviour
----------------------------------------------------------------------------
##### "\n"
I have looked a bit into the source code
(lines 468-630 in builtin.c in src/main)
and found that, because the variable nlsep is set to 1 in line 504, i.e.,
if (strstr(CHAR(STRING_ELT(sepr, i)), "\n")) nlsep = 1; /* ASCII */
the code in lines 622-23, i.e.,
if ((pwidth != INT_MAX) || nlsep)
Rprintf("\n");
is responsible for the newline. Is this really intended?
##### separators, not terminators
Another look shows that, contrary to what is said in the help file,
an element of vector 'sep' is /not/ printed out after each element
of the vector passed as argument '...' to cat(), "including the last"
--- cf. the for-loop over the elements of '...' in lines 596-617
and the print-out of the separator
cat_printsep(sepr, ntot);
in line 600. Once again: Is this intended?
A patch fixing my problem would be easy, though it might break
other, much more important code; would you have any
proposals?
Best,
Peter
-------------------------------------------------------------------
Version:
platform = i386-pc-mingw32
arch = i386
os = mingw32
system = i386, mingw32
status = Under development (unstable)
major = 2
minor = 9.0
year = 2008
month = 10
day = 01
svn rev = 46589
language = R
version.string = R version 2.9.0 Under development (unstable) (2008-10-01 r46589)
Windows XP (build 2600) Service Pack 3
Locale:
LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
Search Path:
.GlobalEnv, package:stats, package:graphics, package:grDevices,
package:utils, package:datasets, package:methods, Autoloads, package:base
Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: smckinney +at+ bccrc +dot+ ca
tel: 604-675-8000 x7561
BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3
Canada