Message-ID: <57309999.4050302@ivt.baug.ethz.ch>
Date: 2016-05-09T14:07:21Z
From: Kirill Müller
Subject: Regression in match() in R 3.3.0 when matching strings with different character encodings

Hi


I think the following behavior is a regression from R 3.2.5:

> match(iconv(c("\u00f8", "A"), from = "UTF8", to = "latin1"), "\u00f8")
[1]  1 NA
> match(iconv(c("\u00f8"), from = "UTF8", to = "latin1"), "\u00f8")
[1] NA
> match(iconv(c("\u00f8"), from = "UTF8", to = "latin1"), "\u00f8", incomparables = NA)
[1] 1

I'm seeing this in R 3.3.0 on both Windows and Ubuntu 15.10.
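
For anyone who wants to check this on their own installation, here is a minimal standalone sketch of the same calls. The variable names x_utf8 and x_latin1 are only illustrative, and I have left out expected output on purpose, since the result of the length-one match() call is exactly what differs between versions.

```r
# Build the same character in two declared encodings.
# x_utf8 and x_latin1 are illustrative names, not part of any API.
x_utf8   <- "\u00f8"
x_latin1 <- iconv(x_utf8, from = "UTF-8", to = "latin1")

# Show how each string is marked.
print(Encoding(x_utf8))
print(Encoding(x_latin1))

# `==` translates to a common encoding before comparing.
print(x_latin1 == x_utf8)

# The three calls from the report: length-two x, length-one x,
# and length-one x with `incomparables` set explicitly.
print(match(c(x_latin1, "A"), x_utf8))
print(match(x_latin1, x_utf8))
print(match(x_latin1, x_utf8, incomparables = NA))
```

If the last two calls print different results, the installation shows the behavior described above.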

The specific behavior makes me think this is related to the following 
NEWS entry:

match(x, table) is faster (sometimes by an order of magnitude) when x is 
of length one and incomparables is unchanged (PR#16491).


Best regards

Kirill