[Bioc-devel] PROTECT errors in Bioconductor packages
On Fri, 7 Apr 2017, Herv? Pag?s wrote:
On 04/06/2017 03:29 AM, Michael Lawrence wrote:
On Thu, Apr 6, 2017 at 2:59 AM, Martin Morgan <martin.morgan at roswellpark.org> wrote:
On 04/06/2017 05:33 AM, Aaron Lun wrote:
The tool is not perfect, so assess each report carefully.
I get a lot of warnings because the tool seems to consider that
extracting an attribute (with getAttrib(x, ...)) or extracting the
slot of an S4 object (with GET_SLOT(x, ...) or R_do_slot(x, ...))
returns an SEXP that needs protection. I always assumed that it
didn't because my understanding is that the returned SEXP is pointing
to a part of a pre-existing object ('x') and not to a newly created
one. So I decided I could treat it like the SEXP returned by
VECTOR_ELT(), which, AFAIK, doesn't need protection.
So I suspect these warnings are false positives but I'm not 100% sure.
If you are not 100% sure then you should protect :-) There are some cases, in particular related to compact row names on data frames, where getAttrib will allocate. Best, luke
I also get a warning on almost every C++ function I've written, because
I use the following code to handle exceptions:
SEXP output=PROTECT(allocVector(...));
try {
// do something that might raise an exception
} catch (std::exception& e) {
UNPROTECT(1);
throw; // break out of this part of the function
}
UNPROTECT(1);
return output;
Presumably the check doesn't account for transfer of control to the
catch block. I find that R itself is pretty good at complaining about
stack imbalances during execution of tests, examples, etc.
'My' packages
(Rsamtools, DirichletMultinomial) had several false positives (all
associated with use of an attribute of a protected SEXP), one subtle
problem (a symbol from a PROTECT'ed package name space; the symbol could
in theory be an active binding and the value obtained not PROTECTed by
the name space), and a genuine bug
tag = NEW_CHARACTER(n);
for (int j = 0; j < n; ++j)
SET_STRING_ELT(tag, j, NA_STRING);
if ('A' == aux[0]) {
buf_A = R_alloc(2, sizeof(char)); # <<- bug
buf_A[1] = '\0';
}
...
SET_VECTOR_ELT(tags, i, tag); # PROTECT tag, too late!
I assume the bug refers to the un-PROTECT'd NEW_CHARACTER here - the R_alloc call looks okay to me...
yes, tag needs protection. I attributed the bug to R_alloc because I failed to reason that R_alloc (obviously) allocates and hence can trigger a garbage collection. Somehow it reflects my approach to PROTECTion, probably not shared by everyone. I like to PROTECT only when necessary, rather than indiscriminately. Probably this has no practical consequence in terms of performance, making the code a little easier to read at the expense of exposing me to bugs like this.
I guess it's a tradeoff between syntactic complexity and logical complexity. You have to think pretty hard to minimize use of the protect stack.
I prefer to call it logical obscurity ;-) The hard thinking consists in assessing whether or not the code between the line where a new SEXP is allocated (line X) and the line where it's put in a safe place (line Y) can trigger garbage collection. Hard to figure out in general indeed, but not impossible! Problem is that the result of this assessment is valid at a certain point in time but might change in the future, even if your code has not changed. So a dangerous game for virtually zero benefits.
One recommendation might be to UNPROTECT() as soon as the pointer on the top is unneeded, rather than trying to figure out the number to pop just before returning to R.
If you PROTECT() in a loop, you definitely want to do that. Otherwise, does it make a big difference?
One thing that got me is that the order in which C evaluates function
call arguments is undefined. I did a lot of R_setAttrib(x,
install("foo"), allocBar()), thinking that the symbol would be
automatically protected, and allocBar() would not need protection,
since it happened last. Unfortunately, it might be evaluated first.
I got hit by this too long time ago but with defineVar() instead of R_setAttrib(): https://stat.ethz.ch/pipermail/r-devel/2008-January/048040.html H.
Btw, I think my package RGtk2 got the record: 1952 errors. Luckily almost all of them happened inside a few macros and autogenerated code.
Martin
Cheers, Aaron
_______________________________________________ Bioc-devel at r-project.org mailing list https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=su20YzStywMa_pdWblzF0RnK8ATRw-t61lOIOsi0xTU&s=JhBOw1ac5wXfV1BSjFuidxFiBTx43J7iEvZG4G0_0uU&e=
This email message may contain legally privileged and/or...{{dropped:2}}
_______________________________________________ Bioc-devel at r-project.org mailing list https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=su20YzStywMa_pdWblzF0RnK8ATRw-t61lOIOsi0xTU&s=JhBOw1ac5wXfV1BSjFuidxFiBTx43J7iEvZG4G0_0uU&e=
_______________________________________________ Bioc-devel at r-project.org mailing list https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=su20YzStywMa_pdWblzF0RnK8ATRw-t61lOIOsi0xTU&s=JhBOw1ac5wXfV1BSjFuidxFiBTx43J7iEvZG4G0_0uU&e=
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: luke-tierney at uiowa.edu
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu