Strange paste, string or package problem?

3 messages · Thomas Allen, Berwin A Turlach, Brian Ripley

#
Hi

I came across this strange bug the other day. I'm not sure how to solve it,
and I wonder if anyone can even replicate it.

Using OS Ubuntu 7.10

Step 1) Make an R package using the package.skeleton() command with
only these two functions:

error <- function(){
  cmd <- paste(" -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " –a ",1,sep="")
  cat(cmd,"\n")
}
noerror <- function(){
  cmd <- paste(" -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,sep="")
  cat(cmd,"\n")
}

Step 2) Start R again. Load the package with library() and run the commands:
error()
noerror()

I get the following output:
-a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 <e2><80><93>a 1
-a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1 -a 1
Now why does that "<e2><80><93>" replace one of the "-" in the first command?

Any ideas?

Cheers

Tom
#
G'day Thomas,

On Tue, 4 Mar 2008 14:40:35 +1300
"Thomas Allen" <hedbag at gmail.com> wrote:

            
With my mailtool, the "-" before the last a looks a bit longer than the
others; it definitely seems to be a different character.  Also, if I cut and
paste this function into an emacs buffer on my Kubuntu machine, I see:

error <- function(){
  cmd <- paste(" -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " -a ",1," -a ",1," -a ",1,
               " \u2013a ",1,sep="")
  cat(cmd,"\n")
}

So somehow you must have entered not a "-" but some other symbol before
that last a.  And I guess what you see is a result of the locale and
the character encoding that you are working in.  But others would know
more about this than I and can probably explain better what is going on.
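
One way to check which symbol actually ended up in a string is to look at
its Unicode code points.  The following is only a sketch: the helper name
non_ascii_points() is made up for illustration, and the en dash is written
as an escape so that the example itself stays pure ASCII.

```r
# Two look-alike characters, written so that this source stays ASCII.
hyphen  <- "-"        # U+002D HYPHEN-MINUS
en_dash <- "\u2013"   # U+2013 EN DASH

utf8ToInt(hyphen)     # 45
utf8ToInt(en_dash)    # 8211

# Hypothetical helper: return the code points of all non-ASCII
# characters found in a single string.
non_ascii_points <- function(x) {
  cp <- utf8ToInt(enc2utf8(x))
  cp[cp > 127L]
}

# A shortened version of the command string from error().
cmd <- paste(" -a ", 1, " \u2013a ", 1, sep = "")
sprintf("U+%04X", non_ascii_points(cmd))   # "U+2013"
```

An empty result from non_ascii_points() would mean the string contains
only ASCII characters.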

Cheers,

	Berwin

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919       
National University of Singapore     
6 Science Drive 2, Blk S16, Level 7          e-mail: statba at nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba
#
On Tue, 4 Mar 2008, Thomas Allen wrote:

            
Because you put it there!  I believe that at the first step you were 
running R in a UTF-8 locale, and at the second in (probably) an 8-bit 
locale.  I can reproduce this by changing from en_GB.utf8 to en_GB on F8, 
for example.
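
To see which kind of locale a session is running in, and roughly what a
non-UTF-8 session makes of the en dash, something like the following can be
tried.  It is only a sketch: the locale names in the comments are examples,
and the iconv() call merely imitates the byte-wise display rather than
reproducing what cat() does inside the package.

```r
# Which character type locale is this session using?
Sys.getlocale("LC_CTYPE")    # e.g. "en_GB.UTF-8" or "en_GB"
l10n_info()[["UTF-8"]]       # TRUE in a UTF-8 locale, FALSE otherwise

# A string containing the en dash, written as an escape.
x <- " \u2013a 1"
Encoding(x)                  # "UTF-8"

# Show the bytes that cannot be represented in ASCII as <xx> hex codes.
# On a glibc-based Linux system this should give " <e2><80><93>a 1".
shown <- iconv(x, from = "UTF-8", to = "ASCII", sub = "byte")
shown
```

Running the first two lines both when building the package and when
loading it should show whether the two sessions really differ in locale.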

<e2><80><93> is the UTF-8 encoding of the Unicode code point U+2013, the en 
dash.  I don't know how you managed to enter that in UTF-8 (I would not have 
expected it to be accidentally possible from the keyboard), but the 
solution is to use hyphen where you intend hyphen.  (Unicode calls this 
'HYPHEN-MINUS' to indicate its dual role -- it also has U+2212 for minus.)
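
The byte sequence can be confirmed, and stray dashes found and replaced,
along the following lines.  This is a sketch under assumptions:
fix_dashes() is a made-up helper name, and src stands in for lines read
from a package source file.

```r
# The en dash really is the three bytes e2 80 93 in UTF-8.
en_dash <- "\u2013"
charToRaw(enc2utf8(en_dash))   # e2 80 93

# Hypothetical helper: replace U+2013 (en dash) and U+2212 (minus sign)
# by the ASCII hyphen-minus.
fix_dashes <- function(x) {
  x <- gsub("\u2013", "-", enc2utf8(x), fixed = TRUE)
  gsub("\u2212", "-", x, fixed = TRUE)
}
fix_dashes(" \u2013a 1")       # " -a 1"

# Stand-in for source lines; in practice these might come from readLines().
src <- c('cmd <- paste(" -a ", 1,',
         '             " \u2013a ", 1, sep = "")')

# Print the lines that contain non-ASCII characters.
tools::showNonASCII(src)
```

For a file on disk, tools::showNonASCIIfile() does the same check given a
file name.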