[Bioc-devel] How to recreate R CMD BUILD environment in interactive session?

12 messages · Martin Morgan, Elizabeth Purdom, Gabriel Becker +1 more

#
Hello,

I am updating an existing package and I am getting an error when my vignette runs (and a similar error in a help-page example), but ONLY under R CMD BUILD. I can't reproduce the error in any session where I could debug it and figure out what is happening. So my question is: how can I recreate the exact environment in which R CMD BUILD runs the vignette, but in an interactive session, so that I can figure out what is going on?

I have tried reproducing the error in other environments:

* Running R --vanilla interactively and trying the code manually
* Running purl on my vignette to get pure R code and running just the R code with R CMD BATCH --vanilla
* Running R --vanilla interactively and compiling the vignette

But all of the above runs the code without any error.

Furthermore, for some reason TravisCI builds the package without problems and hits no errors in R CMD CHECK, so for a long time I assumed the cause was some local file or environment on my machine and forgot about it. But now that I have pushed the changes to Bioconductor, their build hits the same error about the vignette not building.

By the way, I can't gradually back out my changes to see what broke it, because the change was a dramatic one involving redefining my S4 class; a large number of things had to be adapted before any code would work, and there's no way to roll that back step by step.

Most of the posts I have seen about debugging discrepancies with R CMD BUILD (or equivalently R CMD CHECK) seem to be about namespace issues with other packages. Of course I can't be sure, since the error from R CMD BUILD is pretty terse, but the error that is given is my own error (it's a check I wrote as part of the validity check when I updated the class), so it is unclear how it could be a namespace error.

Thanks for any suggestions,
Elizabeth Purdom
#
On 10/23/2017 09:26 AM, Elizabeth Purdom wrote:
Any hint on the specific package and/or error message?

My approach would be to install the package, Stangle / purl the 
vignette, and R -f vignette.R, then trim the vignette to a fast 
reproducible case. But it sounds like you're doing that...

Martin
This email message may contain legally privileged and/or...{{dropped:2}}
#
On 10/23/2017 09:47 AM, Martin Morgan wrote:
A little bit more helpfully, at least on Linux R CMD build evaluates the 
script R_HOME/bin/build, which launches R as

echo 'tools:::.build_packages()' | R_DEFAULT_PACKAGES= LC_COLLATE=C "${R_HOME}/bin/R" --no-restore --slave --args ${args}

It's actually very tedious to figure out how the R process that builds 
the vignette is launched.
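
If relaunching R is inconvenient, the collation part of that command line can be approximated from inside a running session. This is only a rough sketch, assuming LC_COLLATE is the setting that matters; R_DEFAULT_PACKAGES is read when R starts up, so it is not covered here. Sys.getlocale() and Sys.setlocale() are base R.

```r
# Remember the current collation so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# Mimic the LC_COLLATE=C setting from the build command line.
invisible(Sys.setlocale("LC_COLLATE", "C"))
collate_now <- Sys.getlocale("LC_COLLATE")
print(collate_now)

# ... run the code under investigation here ...

# Put the original collation back.
invisible(Sys.setlocale("LC_COLLATE", old_collate))
```

Starting a fresh R process with the environment variables set on the command line remains the closer match to what the build does.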

Martin
#
Yes, that is what I tried but did not get the error from the R code. 

And I apologize, it's the `clusterExperiment` package. My error was so specific to the class created by my package that I didn't think it would be useful, but here is the relevant error message:

Quitting from lines 271-272 (clusterExperimentTutorial.Rmd) 
Error: processing vignette 'clusterExperimentTutorial.Rmd' failed with diagnostics:
invalid class "ClusterExperiment" object: merge_nodeMerge must have 4 columns and column names equal to: 'Node','Contrast','isMerged','mergeClusterId'
Execution halted

I would note that my vignette loads an object that is saved as a data object in my package to speed up compilation. But I experimented, and if I switch it to create the object from scratch instead of loading it, it runs into the same error. There is a `LazyData: false` in my DESCRIPTION file because the data object is of the class defined by my package, and there was some problem loading it without the package loaded.
#
On 10/23/2017 09:59 AM, Elizabeth Purdom wrote:
I can reproduce the error with

clusterExperiment/vignettes master$ R_DEFAULT_PACKAGES= LC_COLLATE=C R -f clusterExperimentTutorial.R

leading to

 > ## ----recallRSEC------------------------------------------------------------
 > rsecFluidigm<-RSEC(rsecFluidigm,isCount=TRUE,combineProportion=0.6,mergeMethod="JC",mergeCutoff=0.05)
Error in validObject(.Object) :
   invalid class "ClusterExperiment" object: merge_nodeMerge must be data.frame with 4 columns and column names equal to: 'Node','Contrast','isMerged','mergeClusterId'
Calls: RSEC ... .local -> new -> initialize -> initialize -> validObject
Execution halted

and then for work inside R


clusterExperiment/vignettes$ R_DEFAULT_PACKAGES= LC_COLLATE=C R
Bioconductor version 3.6 (BiocInstaller 1.27.7), ?biocLite for help
 > source("clusterExperimentTutorial.R", echo=TRUE, max=Inf)

Does that set you down the right path?

Martin
#
Yes, thank you!!! I can get it in the interactive session now.
Elizabeth
#
Dear Martin,

Just for completeness, I figured out the discrepancy and solved my problem. My check tests that the column names contain the expected names, and since I didn't want to require a particular order I used sort -- but only on one side, because I naively assumed the other side would be fixed:

any(sort(colnames(object@merge_nodeMerge)) != c('Contrast','isMerged','mergeClusterId','Node'))

But the different environments are sorting differently!

In my normal interactive R session:
[1] "Contrast"       "isMerged"       "mergeClusterId" "Node"          

In the build version of R however:
Browse[2]> sort(c("Contrast", "isMerged", "mergeClusterId", "Node"))
[1] "Contrast"       "Node"           "isMerged"       "mergeClusterId"
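
The discrepancy can be reproduced without switching locale. A minimal sketch, relying on the documented behaviour that sort(x, method = "radix") orders character vectors as in the C locale; the variable names are illustrative only and not taken from the package:

```r
cols     <- c("Node", "Contrast", "isMerged", "mergeClusterId")
expected <- c("Contrast", "isMerged", "mergeClusterId", "Node")  # hard-coded, pre-sorted by hand

# method = "radix" gives C-locale ordering regardless of the session locale.
c_order <- sort(cols, method = "radix")
print(c_order)

# Comparing against the hand-sorted vector reports a mismatch under C ordering.
fragile_mismatch <- any(c_order != expected)

# Sorting both sides the same way removes the dependence on collation.
robust_match <- identical(sort(cols, method = "radix"),
                          sort(expected, method = "radix"))
print(c(fragile_mismatch = fragile_mismatch, robust_match = robust_match))
```

Sorting both vectors with plain sort() would also agree within any single session, since both sides then use the same collation.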

Thank you very much for your help in getting an interactive session in the build environment!

Elizabeth

  
  
#
FYI, that seems to be a locale issue, as the locale defines the sort order.

The first one is sorting alphabetically in a case-insensitive manner, e.g.
aAbBcC etc.; the second is sorting in a case-sensitive manner, where capital
letters all come before lower-case letters, e.g. ABC...YZabc...

You can see what locale you're in via Sys.getlocale() or see it in your
sessionInfo under the locale heading.
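
As a small illustration of why capitals come first in the second ordering: in the C locale strings are compared by character code, and uppercase ASCII letters have smaller codes than lowercase ones. A sketch using only base R:

```r
# The collation currently in effect for this session.
print(Sys.getlocale("LC_COLLATE"))

# Character codes of the two letters that decide the order of "Node" and "isMerged".
code_N <- utf8ToInt("N")  # uppercase N
code_i <- utf8ToInt("i")  # lowercase i
print(c(N = code_N, i = code_i))

# Under C collation the smaller code sorts first, so "Node" precedes "isMerged".
upper_first <- code_N < code_i
print(upper_first)
```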

Best,
~G

#
Hi Elizabeth,

Thanks for troubleshooting this.

Note that testing with identical()/checkIdentical() is safer than with 
'any(sort(colnames1) != sort(colnames2))'. The latter won't do the
right thing if 'colnames1' and 'colnames2' have different lengths.
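
A small sketch of the length problem, using a zero-length vector as the extreme case (the variable names are illustrative only):

```r
expected <- c("Contrast", "isMerged", "mergeClusterId", "Node")
no_cols  <- character(0)  # e.g. the column names of a data.frame with no columns

# Comparing with a zero-length vector yields logical(0), and any(logical(0)) is FALSE,
# so the element-wise check reports "no mismatch" even though nothing matches.
elementwise_mismatch <- any(sort(no_cols) != sort(expected))

# identical() compares the vectors as a whole, including their lengths.
same_names <- identical(sort(no_cols), sort(expected))

print(c(elementwise_mismatch = elementwise_mismatch, same_names = same_names))
```

Both results are FALSE here, but only the second one means what the check intends: the names are not the expected ones.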

Cheers,
H.
#
On 10/23/2017 04:40 PM, Hervé Pagès wrote:
...but in this case it sounds like you're aiming for a set comparison,

   setequal(colnames1, colnames2)

Martin
#
Hi Hervé,

Thanks for the suggestion. I actually checked first that they were the same length, but identical() would be safer and simplify my code.

Elizabeth
#
On 10/23/2017 01:42 PM, Martin Morgan wrote:
Still not 100% safe though. Will give a false positive in the rare
situation where some columns are duplicated ;-)

 > setequal(c("a", "b", "a"), c("b", "a"))
[1] TRUE
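
One way to close that gap is to rule out duplicates before the set comparison. A minimal sketch; hasExactNames is a hypothetical helper written for this example, not a function from any package:

```r
# Hypothetical helper: TRUE only if `x` holds exactly the names in `expected`,
# in any order, with no duplicates in `x`.
hasExactNames <- function(x, expected) {
  anyDuplicated(x) == 0L && setequal(x, expected)
}

dup_case     <- hasExactNames(c("a", "b", "a"), c("b", "a"))  # duplicated "a"
reordered    <- hasExactNames(c("b", "a"), c("a", "b"))       # same names, other order
missing_name <- hasExactNames(c("a"), c("a", "b"))            # "b" is missing

print(c(dup_case = dup_case, reordered = reordered, missing_name = missing_name))
```

Because it never sorts, this check does not depend on the collation of the session.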

H.