
The case for freezing CRAN

22 messages · R. Michael Weylandt, Terry Therneau, Kevin Coombes +11 more

#
There is a central assertion to this argument that I don't follow:
This is a very strong assertion. What is the evidence for it?

  I write a lot of Sweave/knitr in house as a way of documenting complex analyses, and a 
glm() based logistic regression looks the same yesterday as it will tomorrow.

Terry Therneau
#
On Mar 20, 2014, at 8:19, "Therneau, Terry M., Ph.D." <therneau at mayo.edu> wrote:

If I've understood Jeroen correctly, his point might be alternatively phrased as "won't be reproducED" (i.e., end user difficulties, not software availability).

Michael
#
On 03/20/2014 07:48 AM, Michael Weylandt wrote:
That was my point as well.  Of the 30+ Sweave documents that I've produced I can't think 
of one that will change its output with a new version of R.  My 0/30 estimate is at odds 
with the "nearly all" assertion.  Perhaps I only do dull things?

Terry T.
#
On 3/20/2014 9:00 AM, Therneau, Terry M., Ph.D. wrote:
The only concrete example that comes to mind from my own Sweave reports 
was actually caused by BioConductor and not CRAN. I had a set of 
analyses that used DNAcopy, and the results changed substantially with a 
new release of the package in which they changed the default values to 
the main function call.   As a result, I've taken to writing out more of 
the defaults that I previously just accepted.  There have been a few 
minor issues similar to this one (with changes to parts of the Mclust 
package ??). So my estimates are somewhat higher than 0/30 but are still 
a long way from "almost all".

Kevin
#
No attempt to summarize the thread, but a few highlighted points:

 o Karl's suggestion of versioned / dated access to the repo by adding a
   layer to web access is (as usual) nice.  It works on the 'supply' side. But
   Jeroen's problem is on the demand side.  Even when we know that an
   analysis was done on 20xx-yy-zz, and we reconstruct CRAN that day, it only
   gives us a 'ceiling' estimate of what was on the machine.  In production
   or lab environments, installations get stale.  Maybe packages were already
   a year old?  To me, this is an issue that needs to be addressed on the
   'demand' side of the user. But just writing out version numbers is not
   good enough.

 o Roger correctly notes that R scripts and packages are just one issue.
   Compilers, libraries and the OS matter.  To me, the natural approach these
   days would be to think of something based on Docker or Vagrant or (if you
   must) VirtualBox.  The newer alternatives make snapshotting very cheap
   (eg by using Linux LXC).  That approach reproduces a full environment as
   best as we can while still ignoring the hardware layer (and some readers
   may recall the infamous Pentium bug of two decades ago).

 o Reproducibility will probably remain the responsibility of study
   authors. If an investigator on a mega-grant wants to (or needs to) freeze,
   they do have the tools now.  Letting the needs of a few push extra work
   onto those already overloaded (ie CRAN) and change the workflow of
   everybody is a non-starter.

 o As Terry noted, Jeroen made some strong claims about exactly how flawed
   the existing system is and keeps coming back to the example of 'a JSS
   paper that cannot be re-run'.  I would really like to see empirics on
   this.  Studies of reproducibility appear to be publishable these days, so
   maybe some enterprising grad student wants to run with the idea of
   actually _testing_ this.  We may be above Terry's 0/30 and nearer to
   Kevin's 'low'/30.  But let's bring some data to the debate.

 o Overall, I would tend to think that our CRAN standards of releasing with
   tests, examples, and checks on every build and release already do a much
   better job of keeping things tidy and workable than in most if not all
   other related / similar open source projects. I would of course welcome
   contradictory examples.
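On the 'demand side' point above, one small sketch of what more than a bare version number could look like: embedding the full session state in every report (the output file name is illustrative):

```r
# At the end of every Sweave/knitr report, record exactly what ran:
# R version, platform, and the version of every attached package.
si <- sessionInfo()
print(si)  # appears in the compiled report itself
# Also keep a machine-readable copy next to the .Rnw source
writeLines(capture.output(print(si)), "sessionInfo.txt")
```

This records what actually was on the machine, not just the 'ceiling' estimate that a dated CRAN snapshot gives.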

Dirk
#
On Thu, Mar 20, 2014 at 7:32 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
[snip]
It was a "Flaw" not a "Bug".  At least I remember the Intel people
making a big deal about that distinction.

But I do remember the time well, I was a biostatistics Ph.D. student
at the time and bought one of the flawed pentiums.  My attempts at
getting the chip replaced resulted in a major run around and each
person that I talked to would first try to explain that I really did
not need the fix because the only people likely to be affected were
large corporations and research scientists.  I will admit that I was
not a large corporation, but if a Ph.D. student in biostatistics is
not a research scientist, then I did not know what they defined one
as.  When I pointed this out they would usually then say that it still
would not matter, unless I did a few thousand floating point
operations I was unlikely to encounter one of the problematic
divisions.  I would then point out that some days I did over 10,000
floating point operations before breakfast (I had checked after the
1st person told me this and 10,000 was a low estimate of a lower bound
of one set of simulations) at which point they would admit that I had
a case and then send me to talk to someone else who would start the
process over.



[snip]

#
On Mar 20, 2014, at 12:23 PM, Greg Snow <538280 at gmail.com> wrote:

Further segue:

That (1994) was a watershed moment for Intel as a company, a time during which Intel's future was quite literally at stake. Intel's internal response to that debacle, which fundamentally altered their own perception of just who their customer was (the OEMs like IBM, Compaq and Dell versus the end users like us), took time to be realized, as the impact of increasingly negative PR took hold. It was also a good example of the impact of public perception (a flawed product) versus the realities of how infrequently the flaw would be observed in "typical" computing. "Perception is reality", as some would observe.

Intel ultimately spent somewhere in the neighborhood of $500 million (in 1994 U.S. dollars), as I recall, to implement a large scale Pentium chip replacement infrastructure targeted to end users. The "Intel Inside" marketing campaign was also an outgrowth of that time period.

Regards,

Marc Schwartz
#
On Mar 20, 2014, at 1:02 PM, Marc Schwartz <marc_schwartz at me.com> wrote:

Quick correction, thanks to Peter, on my assertion that the "Intel Inside" campaign arose from the 1994 Pentium issue. It actually started in 1991.

I had a faulty recollection from my long ago reading of Andy Grove's 1996 book, "Only The Paranoid Survive", that the slogan arose from Intel's reaction to the Pentium fiasco. It actually pre-dated that time frame by a few years.

Thanks Peter!

Regards,

Marc
#
Dirk Eddelbuettel <edd at debian.org> writes:
These two tools look very interesting - but I have, even after reading a
few discussions of their differences, no idea which one is better suited
to be used for what has been discussed here: Making it possible to run
the analysis later to reproduce results using the same versions used in
the initial analysis.

Am I right in saying:

- Vagrant uses VMs to emulate the hardware
- Docker does not

wherefore
- Vagrant is slower and requires more space
- Docker is faster and requires less space

Therefore, could one say that Vagrant is more "robust" in the long run?

How do they compare in relation to different platforms? Vagrant seems to
be platform agnostic, I can develop and run on Linux, Mac and Windows -
how does it work with Docker? 

I just followed [1] and set up Docker on OS X - looks promising - it also
uses an underlying VM. So both should be equal with regard to
reproducibility in the long run?

Please note: I see these questions in the light of this discussion of
reproducibility and not with regard to deployment of applications, which
is what the discussions on the web are about.

Any comments, thoughts, remarks?

Rainer


Footnotes: 
[1]  http://docs.docker.io/en/latest/installation/mac/
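For concreteness, a minimal Dockerfile in the spirit of this discussion - the base image, package choice and report name here are illustrative assumptions, not a tested recipe:

```dockerfile
# Pin the OS release, so the compiler and system libraries are frozen too
FROM debian:wheezy
# The R version is whatever that release shipped - itself part of the snapshot
RUN apt-get update && apt-get install -y r-base r-base-dev
# Bake the analysis into the image alongside its environment
COPY analysis/ /analysis/
RUN R -e "install.packages('knitr', repos = 'http://cran.r-project.org')"
CMD ["R", "-e", "knitr::knit('/analysis/report.Rnw')"]
```

Once built, the image freezes R, its packages, and the OS layer in one artifact that can be re-run years later.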
#
..............................................<?}))><........
 ) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons University, Belgium
( ( ( ( (
..............................................................
On 21 Mar 2014, at 10:59, Rainer M Krug <Rainer at krugs.de> wrote:

Yes.
It depends. For instance, if you run R in VirtualBox under Windows, it may run faster depending on the code you run and, say, the LAPACK library used. On Linux, R code typically runs 2-3% slower in the VM than natively, but on a Windows host, most of my R code runs faster in the VM! But yes, you need more RAM.

With Vagrant, you do not need to keep your VM once you don't use it any more. Then, disk space shrinks down to a few kB, corresponding to the Vagrant configuration file. I guess the same is true for Docker?

A big advantage of Vagrant + VirtualBox is that you get very similar virtual hardware, no matter whether your host system is Linux, Windows or Mac OS X. I see this as a good point for better reproducibility.
Maybe, but it depends almost entirely on how VirtualBox will support old VMs in the future!

PhG
#
Gábor Csárdi <csardi.gabor at gmail.com> writes:
I think I am getting lost in these - I looked at Docker, and it looks
promising, but I actually didn't even manage to ssh into the running
container. Is there a howto somewhere on how one can use these in R, for
the purpose discussed in this thread? If not, I really think this would
be needed. It is extremely difficult for me to translate what I want to
do into the deployment / management / development scenarios discussed in
the blogs I have found.

Cheers, 

(a confused)
Rainer

#
On 21 Mar 2014, at 20:21, Gábor Csárdi <csardi.gabor at gmail.com> wrote:

Additional info: you access R in the VM from the host by ssh. You can enable X11 forwarding there, and then you also get GUI stuff. It works like a charm, but there are still some problems on my side when I try to disconnect and reconnect to the same R process. I can solve this with, say, screen. However, if any X11 window is displayed while I disconnect, R crashes immediately on reconnection.
Best,

PhG
#
On 03/22/2014 02:10 PM, Nathaniel Smith wrote:
I second that. However, by default, xpra and GNU Screen are not aware of 
each other. To connect to xpra from within GNU Screen, you usually need 
to set the DISPLAY environment variable manually. I have described a 
solution that automates this, so that GUI applications "just work" from 
within GNU Screen and also survive a disconnect: 
http://krlmlr.github.io/integrating-xpra-with-screen/ .
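The manual version of that workaround, for readers who just want the idea (the display number is arbitrary; the linked post automates the DISPLAY step):

```sh
xpra start :100          # start a persistent X server for GUI windows
screen -S analysis       # inside screen...
export DISPLAY=:100      # ...point X clients at xpra, not the ssh display
R                        # plots from this R now survive a disconnect
# later, from the desktop, reattach the GUI windows:
xpra attach ssh:user@server:100
```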


-Kirill
#
On 22 March 2014 at 13:10, Nathaniel Smith wrote:
| You might find the program 'xpra' useful. It's like screen, but for x11
| programs.

There are also NX server/NX client/FreeNX which keep 'x11 / xdm sessions' and
can resume / reconnect when the client dies.  I find x2go quite useful at work.

Dirk
1 day later
#
Thanks everybody for the input - interesting suggestions and useful
information.

But I am still struggling to use this information. What I got so far:

1) I have decided to try docker [1]
2) Installed docker and boot2docker on a Mac via homebrew and it works
3) I found some Dockerfiles to create an image with R and ssh
4) The dockerfile runs and creates the image
5) I can interactively connect to the image by using bash and R is
running there
6) As I am using emacs / ess, I want to use ssh to do R stuff (other
suggestions welcome)

Problems:
1) I don't manage to connect to the running docker image following [2] -
I even managed to freeze my computer while trying.
2) Even if I could, I understand that the ssh port would be different each
time - not very nice. Is there a way of setting the port?
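On the port question: the mapping can be fixed at run time rather than letting Docker pick one - a sketch, with the image name as a placeholder:

```sh
# Publish the container's sshd (port 22) on a fixed host port
docker run -d -p 2222:22 rkrug/r-ssh
# The port is now stable across restarts of this command:
ssh -p 2222 root@localhost
```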

Questions:

1) Am I right in saying that I have to use ssh to access the running
image, or is there a (faster?) alternative? I mean - I am working
locally and I don't need any encryption.

2) Would Vagrant make the process easier?

And finally:

I think it would be great if this information could be collected in a
wiki page, as I did not find anything about the usage scenario of docker
/ vagrant discussed here - I will certainly see that I blog about my
tries.

Cheers,

Rainer

Kirill Müller <kirill.mueller at ivt.baug.ethz.ch> writes:
Footnotes: 
[1]  https://www.docker.io

[2]  http://docs.docker.io/en/latest/examples/running_ssh_service/
#
Forgot: My Dockerfile is on github:

https://github.com/rkrug/R-docker

Rainer M Krug <Rainer at krugs.de> writes:

2 days later
#
On Thu, 20 Mar 2014, Dirk Eddelbuettel wrote:
At one of my previous jobs we did effectively this (albeit in a lower-tech
fashion). Every project had its own environment, complete with the exact
snapshot of R & packages used, etc. All scripts/code were kept in that
environment in a versioned fashion such that at any point one could go to
any stage of development of that paper/project's analysis and reproduce it
exactly.

It was hugely inefficient in terms of storage, but it solved the problem 
we're discussing here. As you note, with the tools available today it'd be 
trivial to distribute that environment for people to reproduce results.
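Even without a VM layer, the per-project setup described above can be approximated with a project-local package library - a sketch, with illustrative paths:

```r
# One library directory per project; packages installed here are never
# touched by upgrades done for other projects.
lib <- "~/projects/paper-2014/Rlib"
dir.create(lib, recursive = TRUE, showWarnings = FALSE)
.libPaths(c(lib, .libPaths()))
install.packages("survival")  # lands in the project library
# Putting the .libPaths() call in the project's .Rprofile makes every
# session see exactly this snapshot.
```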