Skip to content

[Bioc-devel] vignette problems

21 messages · Shepherd, Lori, campos, Martin Morgan

#
Please use 'reply all' so that the mailing list remains engaged.

Check out the release schedule

http://bioconductor.org/developers/release-schedule/

in particular

Wednesday April 25

- Deadline for packages passing ??R CMD build?? and ??R CMD check?? 
without errors or warnings.

so you still have time to get your package in order.

Using the same techniques as before, I still see valgrind problems, the 
first being

 > hmm_fitted_poilog = fitHMM(trainRegions, hmm_poilog, 
sizeFactors=sizeFactors, maxIters=10)
==24916== Invalid write of size 4
==24916==    at 0x4BA93FD7: 
TransitionMatrix::updateAuxiliaries(double**, double***, double*, int*, 
int, int**, int, double, int) (TransitionMatrix.cpp:292)
==24916==    by 0x4BA77934: HMM::updateSampleAux(double***, int*, int, 
double**, double**, double**, double***, double*, int*, int*, int*, 
int**, double***, SEXPREC*, SEXPREC*, int, double, int, int, int) 
(HMM.cpp:771)
==24916==    by 0x4BA7896D: HMM::BaumWelch[abi:cxx11](double***, int*, 
int, int, int**, int*, int*, int*, int, int, int**, double***, SEXPREC*, 
SEXPREC*, int, double, double, int, int) (HMM.cpp:1076)
==24916==    by 0x4BA8FEFD: RHMMFit (RWrapper.cpp:1494)
==24916==    by 0x4F2992D: R_doDotCall (dotcode.c:692)
==24916==    by 0x4F339D5: do_dotcall (dotcode.c:1252)
==24916==    by 0x4F81BA6: bcEval (eval.c:6771)
==24916==    by 0x4F6E963: Rf_eval (eval.c:624)
==24916==    by 0x4F71188: R_execClosure (eval.c:1764)
==24916==    by 0x4F70E7C: Rf_applyClosure (eval.c:1692)
==24916==    by 0x4F6F18B: Rf_eval (eval.c:747)
==24916==    by 0x4F74B12: do_set (eval.c:2774)
==24916==  Address 0x2e73a294 is 4 bytes inside a block of size 5 alloc'd
==24916==    at 0x4C2DB8F: malloc (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24916==    by 0x4BA93FA6: 
TransitionMatrix::updateAuxiliaries(double**, double***, double*, int*, 
int, int**, int, double, int) (TransitionMatrix.cpp:289)
==24916==    by 0x4BA77934: HMM::updateSampleAux(double***, int*, int, 
double**, double**, double**, double***, double*, int*, int*, int*, 
int**, double***, SEXPREC*, SEXPREC*, int, double, int, int, int) 
(HMM.cpp:771)
==24916==    by 0x4BA7896D: HMM::BaumWelch[abi:cxx11](double***, int*, 
int, int, int**, int*, int*, int*, int, int, int**, double***, SEXPREC*, 
SEXPREC*, int, double, double, int, int) (HMM.cpp:1076)
==24916==    by 0x4BA8FEFD: RHMMFit (RWrapper.cpp:1494)
==24916==    by 0x4F2992D: R_doDotCall (dotcode.c:692)
==24916==    by 0x4F339D5: do_dotcall (dotcode.c:1252)
==24916==    by 0x4F81BA6: bcEval (eval.c:6771)
==24916==    by 0x4F6E963: Rf_eval (eval.c:624)
==24916==    by 0x4F71188: R_execClosure (eval.c:1764)
==24916==    by 0x4F70E7C: Rf_applyClosure (eval.c:1692)
==24916==    by 0x4F6F18B: Rf_eval (eval.c:747)
==24916==

This seems to be the exact same code as in the problem that you fixed at 
another location. Actually, I would guess that all of these

grep --color -nH -e ")\*ncores+1" *
HMM.cpp:784:    int *myStateBuckets = (int*)malloc(sizeof(int)*ncores+1);
MultivariateGaussian.cpp:295:    int *myDimBuckets = 
(int*)malloc(sizeof(int)*ncores+1);
MultivariateGaussian.cpp:475:    int *myDimBuckets = 
(int*)malloc(sizeof(int)*ncores+1);
TransitionMatrix.cpp:132:        int *myStateBuckets = 
(int*)malloc(sizeof(int)*ncores+1);
TransitionMatrix.cpp:289:    int *myStateBuckets = 
(int*)malloc(sizeof(int)*ncores+1);

are the same problem. Also, usually code that has been copy/pasted like 
this can instead be refactored to  a single function call, so a bug can 
be fixed in one place.

I still see a number of compiler warnings, the first of which is

STAN master$ R CMD INSTALL .
Bioconductor version 3.7 (BiocInstaller 1.29.5), ?biocLite for help
* installing to library 
'/home/mtmorgan/R/x86_64-pc-linux-gnu-library/3.5-Bioc-3.7'
* installing *source* package 'STAN' ...
** libs
g++  -I"/home/mtmorgan/bin/R-3-5-branch/include" -DNDEBUG 
-I/usr/local/include  -D_RDLL_ -fopenmp  -fpic  -g -O0 -Wall -pedantic 
-c HMM.cpp -o HMM.o
HMM.cpp: In member function ?virtual void 
HMM::calcEmissionProbs(double***, double**, int*, int, int**, int*, 
int*, int**, int, int, int*)?:
HMM.cpp:112:15: warning: unused variable ?j? [-Wunused-variable]
          int i,j,t,k;
                ^
It really pays to clean these up; most are harmless, but they obscure 
the more important warnings.

Martin
On 04/03/2018 09:58 AM, campos wrote:
This email message may contain legally privileged and/or...{{dropped:2}}
1 day later
#
Hey Martin,

I pushed new changes since last friday but in 
https://bioconductor.org/checkResults/3.7/bioc-LATEST/STAN/ says that 
the last change date was friday. Any idea what is the problem?

I have tried to fix the problems with memory and all you told me.

Best,

Rafael
On 03.04.2018 17:06, Martin Morgan wrote:
#
In order for changes to be propagated a version bump in the DESCRIPTION file is needed.  Please bump the version in the DESCRIPTION file to 2.7.2.




Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263
#
Thanks!

another question, is there any way to check if the changes that I have 
done pass the checks for all PC like on the webpage? I do not have mac 
or windows so I feel like I have to wait to the next day to check if my 
changes worked out.

Thank you very much,

Rafael
On 05.04.2018 14:29, Shepherd, Lori wrote:

  
  
3 days later
#
Dear devel team,

I am still puzzled with the problems with mac compiling. I am really 
lost and have no idea how to continue or how to be able to check about 
this problems with my linux machine in order to fix it faster. Could you 
please help me with that??

Best,

Rafael
On 05.04.2018 14:29, Shepherd, Lori wrote:

  
  
#
I'll try to provide you with a pull request addressing issues. Martin
On 04/09/2018 08:42 AM, campos wrote:
This email message may contain legally privileged and/or...{{dropped:2}}
#
Hi Martin,

Thank you very much, I am a bit concerned about the option of:


Last?Changed?Date: 2018-04-05?09:37:37?-0400?(Thu,?05?Apr?2018)

I did a change yesterday and push it, why isn't it visible?

Best,

Rafa
On 09.04.2018 16:44, Martin Morgan wrote:

  
  
#
On 04/10/2018 05:27 AM, campos wrote:
Notice that at the top of the build report it says

   This page was generated on 2018-04-09 09:50:05 -0400 (Mon, 09 Apr 2018).

   Snapshot Date: 2018-04-08 16:45:28 -0400 (Sun, 08 Apr 2018)

If you pushed after the snapshot date then your changes are not yet visible.

Martin
This email message may contain legally privileged and/or...{{dropped:2}}
#
Hi Martin,

it seems like mac is ok now. What has changed??

Thank you very much,

Rafael
On 10.04.2018 11:33, Martin Morgan wrote:
#
Hi --

I'm not sure what you mean; the 'devel' builld report does not include 
mac (the build system experienced problems with disk space last night, 
so the build did not complete)

   https://bioconductor.org/checkResults/devel/bioc-LATEST/STAN/

and STAN failed in Release

   https://bioconductor.org/checkResults/release/bioc-LATEST/STAN/


HOWEVER, I spent some time with your package. First, I compiled it with 
compiler flages -Wall -pedantic, which makes the compiler quite 
sensitive. I did this by creating, on my Linux, a file

     $ cat ~/.R/Makevars
     CFLAGS = -g -O0 -Wall -pedantic
     CXXFLAGS = -g -O0 -Wall -pedantic

and then installing the package from the source directory

     STAN master$ rm -f src/*o && R CMD INSTALL .

There were a number of minor issues (unused variables, incorrect printf 
formatting [and use of printf() rather than Rprintf()]) as well as 
somewhat more substantial problems (virtual destructors for polymorphic 
base classes); a few problems remain, of the form

     RWrapper.cpp:950:5: warning: control reaches end of non-void 
function [-Wreturn-type]
      }

where the problem is obvious -- no return value if parallel != 0 -- but 
also innocuous, since it seems from inspection that in fact this 
function is always invoked with parallel == 0

     HMM* createHMM(int parallel, int K, InitialProbability* initProb, 
TransitionMatrix* transMat, EmissionFunction** myEmissions)
     {
         if(parallel == 0)
         {
             return new HMM(K, initProb, transMat, myEmissions);
         }
     }

You should fix these problems, e.g., by removing the parallel argument 
from the function and body. After cleaning up as best I could, I install 
the package again and ran the vignette through valgrind

     STAN master$ R CMD INSTALL .
     ...
     STAN master$ cd vignettes
     STAN/vignettes master$ R CMD Stangle STAN-knitr.Rnw
     STAN/vignettes master$ R CMD Stangle STAN-knitr.Rnw
     STAN/vignettes master$ R -d valgrind -f STAN-knitr.R

leading to an about mismatched new[] / delete[] -- memory allocated by 
'new x[]' must be deleted with 'delete[]', but the package code had 
simply 'delete'.

With these changes, the vignette builds without segfaulting or valgrind 
errors on my machine, and on the Mac builder; I did not check the full 
build and check of the package.

The changes are summarized in the following commits:

     STAN master$ git log --oneline
     a719f42 version bump
     b8e2123 make destructors for polymorhpic base classes virtual
     18ac6ba mismatch new[] / delete[]
     985768b restore EmissionFactory::createEmissionFunctionMixed
     8b749db provide return value for non-void functions
     26ca36e update poor printf statements
     0bb83ab clear 'unused variable' -Wall -pedantic warnings
     a3b7666 remove vignette product STAN-knitr.R

You should pull these down to your local git repository, e.g., for me I have

     STAN master$ git remote -v
     origin	git at git.bioconductor.org:packages/STAN (fetch)
     origin	git at git.bioconductor.org:packages/STAN (push)

So I would

     STAN master$ git pull origin master

to incorporate the changes.

It would be good to port these changes to the RELEASE_3_6 branch; 
remember to bump the version of the RELEASE_3_6 branch to 2.6.1.

Unfortunately, I pushed the changes after tonight's builds started, so 
the effect of the changes will not be reported until Thursday 
mid-morning, Eastern time, if the build system does not have problems.

Martin
On 04/10/2018 02:14 PM, campos wrote:
This email message may contain legally privileged and/or...{{dropped:2}}
#
Hi Martin,

thank you very much for your time and effort!

I pulled the branch to my computer updated the DESCRIPTION file and 
pushed the changes. That should be enough right?

Best,

Rafa

PS: If you ever stop by Cologne please let me know. I owe you a beer or two!
On 10.04.2018 23:44, Martin Morgan wrote:
#
On 04/11/2018 09:58 AM, campos wrote:
actually I seemed to have messed up the git commit, but I fixed it from 
my end; you should git pull and not worry about pushing any further 
commits for the moment.
tasty!

Martin
This email message may contain legally privileged and/or...{{dropped:2}}
#
Hi Martin,

when I run git status I get this:

On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean

shouldn't I see that my version in my machine is some commits behind?

 ?$git remote -v
origin??? git at git.bioconductor.org:packages/STAN (fetch)
origin??? git at git.bioconductor.org:packages/STAN (push)

Thanks a lot,

Rafa
On 11.04.2018 16:43, Martin Morgan wrote:
#
On 04/11/2018 10:47 AM, campos wrote:
maybe your version is current; I have

~/b/git/STAN master$ git log --oneline -n10
06dbdae Merge remote-tracking branch 'upstream/master'
6e070d4 version 2.7.3
b8e2123 make destructors for polymorhpic base classes virtual
18ac6ba mismatch new[] / delete[]
985768b restore EmissionFactory::createEmissionFunctionMixed
8b749db provide return value for non-void functions
26ca36e update poor printf statements
0bb83ab clear 'unused variable' -Wall -pedantic warnings
a3b7666 remove vignette product STAN-knitr.R
1c4f140 upgrade version 2.7.3

if you don't, then I guess `git pull` or more pedantically `git fetch 
origin; git merge origin/master master`.

Martin
This email message may contain legally privileged and/or...{{dropped:2}}
#
Aha! now I see all the changes!

so, now everything should be fine and it should run on mac as well right?

here: 
https://bioconductor.org/checkResults/3.7/bioc-LATEST/STAN/malbec2-checksrc.html

there is only the linux and windows machine, there is no info about the 
mac.

Thousand thanks again Martin, you have been of great help.

Best,

Rafael
On 11.04.2018 16:52, Martin Morgan wrote:
9 days later
#
Hi Martin,

sorry to bother you again... I have been working on the void problem 
warning this whole week with some other people and we couldn't come 
along a proper solution... I read that some warnings might be accepted, 
we have checked the package and it seems to work properly. Would that be 
a possible warning to be ignored?

Thank you very much again,

Rafael
On 11.04.2018 16:52, Martin Morgan wrote:
2 days later
#
Dear Bio-dev,

I have a quick question, for some reason I can reproduce the warnings 
that are stated in 
https://bioconductor.org/checkResults/3.7/bioc-LATEST/STAN/malbec2-checksrc.html 
on my linux machine. The check runs smoothly, any idea why so?

Thank you again,

Rafael
On 11.04.2018 16:52, Martin Morgan wrote:
#
I don't see vignette warnings on the build report.

You should address the simple namespace issue by adding

   importFrom("stats", "dnbinom", "kmeans", "optim", "ppois")

to your NAMESPACE file.

The compiler warnings are from the -Wall flag, which you can set by 
creating a plain text file

~/.R/Makevars

with the line

   CFLAGS = -Wall
   CXXFLAGS = -Wall

Martin
On 04/24/2018 05:49 AM, campos wrote:
This email message may contain legally privileged and/or...{{dropped:2}}
#
Hi Martin,

thanks for your reply. I do not have vignette warnings, the warnings 
that I see are:

  MultivariateGaussianFactory.h:23:9: warning: control reaches end of non-void function [-Wreturn-type]
   BernoulliFactory.h:27:9: warning: control reaches end of non-void function [-Wreturn-type]
   PoissonFactory.h:24:9: warning: control reaches end of non-void function [-Wreturn-type]
   MultinomialFactory.h:24:9: warning: control reaches end of non-void function [-Wreturn-type]
   NegativeBinomialFactory.h:27:9: warning: control reaches end of non-void function [-Wreturn-type]
   PoissonLogNormalFactory.h:27:9: warning: control reaches end of non-void function [-Wreturn-type]
   RWrapper.cpp:742:5: warning: control reaches end of non-void function [-Wreturn-type]
   RWrapper.cpp:950:5: warning: control reaches end of non-void function [-Wreturn-type]
On 24.04.2018 15:39, Martin Morgan wrote:

  
  
#
On 04/24/2018 09:42 AM, campos wrote:
yes it looks like you tackled those with the commit

STAN/src master$ git log MultivariateGaussianFactory.h
commit c43b951db86e4c167356ce8b380f6b8429ae0754
Author: rfael0cm <rfael0cm at gmail.com>
Date:   Tue Apr 24 13:52:13 2018 +0200

     Update DESCRIPTION FILE 2.7.8


which has not built yet (it will be included in tonight's build, for 
Wednesday's build report.

I think the change you made will silence the warning

--- a/src/MultivariateGaussianFactory.h
+++ b/src/MultivariateGaussianFactory.h
@@ -20,6 +20,7 @@ class MultivariateGaussianFactory : public EmissionFactory
              {
                  return new MultivariateGaussian(emissionParams);
              }
+            return NULL;
          }

but I wonder whether this is correct? Will your code really work if 
these functions return NULL? My sense actually is that the 'parallel' 
argument is never used (it's always 0), and that the correct solution is 
to remove it from the signature and to make the return value unconditional

--- a/src/MultivariateGaussianFactory.h
+++ b/src/MultivariateGaussianFactory.h
@@ -10,17 +10,13 @@
  class MultivariateGaussianFactory : public EmissionFactory
  {
      public:
-        EmissionFunction* 
createEmissionFunction(ParamContainerEmissions *emissionParams, int 
parallel)
+        EmissionFunction* 
createEmissionFunction(ParamContainerEmissions *emissionParams)
          {
              if(DEBUG_MEMORY)
              {
                  Rprintf("factory->create():");
              }
-            if(parallel == 0)
-            {
-                return new MultivariateGaussian(emissionParams);
-            }
-            return NULL;
+            return new MultivariateGaussian(emissionParams);
          }
          MultivariateGaussianFactory() {}
          ~MultivariateGaussianFactory() { }

This requires other changes in the code, to make the calling functions 
consistent. I guess this is a 'feature' that was planned but never 
implemented; it's usually better to keep such features out of the public 
version of your software.

Martin
This email message may contain legally privileged and/or...{{dropped:2}}
#
Hi Martin,

yeah, I have been wondering around the code these days looking for the 
answer of that, I am the maintainer and not who developed the code... I 
still couldn't reach the developer and ask about this, for the moment my 
assumption is that whenever parallel == 0 (Which I also think is most of 
the time this, but I found in the poissonLogNormalFactory.h another 
statement in which it could be 1) it will return the right value if not 
the NULL which is expected.

With this I want to get rid of the warnings in Linux and Windows and 
pass the deadline which is tomorrow and have it updated on Bioconductor. 
Once I can contact the developer maybe I can work in nicer code.

So, you think will pass the checks? so far I have not gotten any 
warnings with your method.

Best,

Rafa
On 24.04.2018 16:03, Martin Morgan wrote: