regexec() bug in R 3.4.0

2 messages · Weeks, Nathan, Martin Maechler

#
Hi,

In R 3.4.0, the "Pattern Matching and Replacement" documentation that describes regexec(), gregexpr(), etc. states that the "text" argument to regexec is a character vector, "or an object which can be coerced by as.character to a character vector":

     regexec(pattern, text, ignore.case = FALSE, perl = FALSE,
             fixed = FALSE, useBytes = FALSE)

     x, text: a character vector where matches are sought, or an object
         which can be coerced by as.character to a character vector.
         Long vectors are supported.

However, in R 3.4.0, this coercion doesn't seem to occur automatically for the text argument of regexec(), whereas it does for gregexpr(), regexpr(), etc.:

============================================================
$ R --vanilla

R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

...
Error in regexec("foo", text) : invalid 'text' argument
attr(,"match.length")
[1] 3
attr(,"useBytes")
[1] TRUE
[1] 1
attr(,"match.length")
[1] 3
attr(,"useBytes")
[1] TRUE
============================================================

Is this a documentation issue, a bug in regexec(), or am I misunderstanding how it's supposed to behave?

Thanks,

--
Nathan Weeks
IT Specialist
USDA-ARS Corn Insects and Crop Genetics Research Unit
Crop Genome Informatics Laboratory
Iowa State University

#
[...........]

I agree this is an inconsistency between documentation and behaviour,
and hence an (easy to work around) bug.
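
One way the workaround could look, as a minimal sketch: coerce the argument explicitly with as.character() before passing it to regexec(). The original transcript elides the commands that were run, so the `text` value below (a factor) is only an assumed example of a non-character input, not the value from the report.

```r
# Hypothetical input: a factor is one kind of object that is not a
# character vector but can be coerced by as.character().
text <- factor("foobar")

# Coerce explicitly, so regexec() always receives a character vector.
chr <- as.character(text)
m <- regexec("foo", chr)

# Extract the matched substrings from the coerced vector.
regmatches(chr, m)
```

The same explicit coercion is harmless for regexpr() and gregexpr(), which already coerce their text argument.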

I propose to fix the code (for consistency) rather than the
documentation and will do so if there's no dissent.

We have become wary and cautious with last-minute changes, so
this won't be in R 3.4.1 (due tomorrow, Friday), but probably
in "R 3.4.1 patched" later, and then in future versions.

Martin Maechler,
ETH Zurich