Skip to content
Back to formatted view

Raw Message

Message-ID: <CAPRVBczg_Cz2QXnmn7ZRAe=jEFoi_+c_DOkUxf_TxLjipSZHVQ@mail.gmail.com>
Date: 2025-04-23T17:03:12Z
From: Michael Chirico
Subject: R should add an API routine for safe use of memcpy(), memset() for use with 0-length SEXP
In-Reply-To: <CAPRVBcyUWiYWPbNXk5dU4ww+0fXf-2=RP=sJ1PFtE0Mnhg=M3g@mail.gmail.com>

h/t Tim Taylor for pointing out my blindspot :)

We have Memcpy() in API already [1], which wraps a 0-aware R_chk_memcpy() [2].

We don't quite have Memset() in API, though; instead we have Memzero()
[3] for R_chk_memset(s, 0, n) which is 0-aware memset() [4].

[1] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L59-L60
[2] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3575-L3580
[3] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L62-L63
[4] https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3582-L3587

Mike C

On Wed, Apr 23, 2025 at 8:57?AM Michael Chirico
<michaelchirico4 at gmail.com> wrote:
>
> From R 4.5.0 [1], all builds of R discourage use of INTEGER() [and
> friends REAL(), ... and *_RO() equivalents] on length-0 SEXP [2].
> Before R 4.5.0, this was the behavior under --enable-strict-barrier.
>
> That means the following can segfault under strict builds (e.g.
> -fsanitize=alignment and -O0):
>
> SEXP x = PROTECT(Rf_allocVector(INTSXP, 0));
> SEXP y = PROTECT(Rf_allocVector(INTSXP, 0));
> const int *x = INTEGER_RO(x); // invalid!
> int *y = INTEGER(y);
> memcpy(y, x, 0); // alluring, but undefined behavior!
>
> There are a number of CRAN packages that fall victim to this, see e.g.
> this PR and others linked to it [3]. I'm sure there are dozens if not
> hundreds of other equivalent bugs waiting to be discovered that just
> aren't covered by existing tests.
>
> {rlang} took the approach to define r_memcpy() and r_memset() which
> wrap memcpy() and memset(), resp., with an added length-0 check [4]; I
> think R itself should offer these (probably more consistently styled
> as R_Memcpy() and R_Memset()).
>
> (NB there's a possibility I'm still not fully grasping what's going on here :) )
>
> Mike C
>
> [1] related: https://stat.ethz.ch/pipermail/r-devel/2024-June/083456.html
> [2] https://github.com/r-devel/r-svn/blob/2b29e52e1c4e3d26b649cb7ac320b8a3dd13de30/src/main/memory.c#L4146
> [3] https://github.com/r-lib/vctrs/pull/1968
> [4] https://github.com/r-lib/rlang/pull/1797