Skip to content

do_gsub (PR#10540)

3 messages · A.R.Runnalls at kent.ac.uk, Hin-Tak Leung, Andrew Runnalls

#
In character.c within R 2.6.1, the function do_gsub contains the following 
code:

1199        if (STRING_ELT(pat, 0) == NA_STRING) {
1200            PROTECT(ans = allocVector(STRSXP, n));
1201            for(i = 0; i < n; i++)  SET_STRING_ELT(ans, i, NA_STRING);
1202            UNPROTECT(1);
1203            return ans;
1204        }
1205
1206        if (length(pat) < 1 || length(rep) < 1) error(R_MSG_IA);

Line 1206 checks that pat contains at least one element.  However, line 1199 
has already attempted to access the first element.  The check should surely 
come first.

Andrew Runnalls
8 days later
#
I believe STRING_ELT() works even when pat is R_NilValue, so
the first line "does the right thing" even if the first element does not 
exist. I think this is intentional and not a bug.

Other more knowledgeable people please correct me if I am wrong.
A.R.Runnalls at kent.ac.uk wrote:
13 days later
#
Hin-Tak,

Apologies for the delay in responding.
On 2008/01/11 Hin-Tak Leung wrote:
Generally speaking, in the interests of software reliability and
maintainability, it is a good idea for part of a program to 'do the
right thing' itself - in the present case, to check that an array is
non-empty before assuming that it is - rather than relying on what is in
fact rather unusual behaviour of another part of the program.  The case
I am concerned about is not in fact where 'pat' is R_NilValue, but when
it is a vector of length zero.  When asked to create such a vector,
memory.c will in fact allocate some element space in the vector, which
is why the above problem doesn't manifest itself as a crash (in the
no-segfault.R test, for example, which explores this case); however, it
seems unwise to rely on this.  And what would happen if that
(redundantly) allocated space just happened already to contain NA_STRING?

Best regards,
Andrew