
The Quality & Accuracy of R

4 messages · Muenchen, Robert A (Bob), Peter Dalgaard, David Smith

Dear R Developers,

This is my first time subscribing to this list, so let me start out by saying thank you all very much for the incredible contribution you have made to science through your work on R. 

As you all know, many users of commercial stat packages, along with their managers, directors, CIOs, etc., are skeptical of R's quality and accuracy. And as the recent NY Times article demonstrated, the commercial vendors rarely miss an opportunity to stoke those fears. I have read many r-help posts on this subject, so I was aware that R was developed and tested with great care, but until I read the clinical trials document, I was not aware that there were as many steps or that they were as rigorous (use of version management software, etc.). Even as I read the document, the opening paragraphs made me think it was far too narrowly focused to be of general use. Luckily, I kept reading through the CFRs. Modifying that document would take little effort, as I outlined in my original post (below). Putting it within easy reach of every R user is important. By adding it to the documents in R's Help menu, and adding a FAQ entry for it, all R users will have ready access to it.

My second suggestion is adding an option to the R installation that would let every R user run the test suite, clearly showing them that it is being done. I realize this is a superfluous step, since you have already run the test suite against R before releasing it. However, it would provide assurance that users could easily demonstrate to skeptics that very thorough testing is being done. I don't know whether the written messages I suggested below would be best, or whether simply showing the output scrolling by would have the most impact. Perhaps both, as in a message "Testing accuracy of linear regression..." in a message window while the output scrolled by in the console.

Rather than making this part of the installation, an alternative would be to end the installation with a message pointing people to a function like validate.R(), and an equivalent menu selection, as a follow-up step. That would ensure that everyone knows the option exists, and it would let any R user run the tests for skeptics at any time. The easier it is to run the test suite, the better.
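For illustration only, a hypothetical validate.R()-style function might look something like this. No such function ships with R; the name, messages, and set of checks here are invented, with reference values taken from R's own built-in datasets:

```r
# Hypothetical sketch only: nothing like this ships with R, and the
# function name, messages, and chosen checks are illustrative.
validate_r <- function(verbose = TRUE) {
  results <- list()

  # Linear regression on a built-in dataset, checked against
  # hand-verified coefficients.
  if (verbose) message("Testing accuracy of linear regression...")
  fit <- lm(dist ~ speed, data = cars)
  ref <- c("(Intercept)" = -17.579095, speed = 3.932409)
  results$lm <- isTRUE(all.equal(coef(fit), ref, tolerance = 1e-6))

  # Welch two-sample t test with a known p-value.
  if (verbose) message("Testing t test...")
  tt <- t.test(extra ~ group, data = sleep)
  results$t.test <- isTRUE(all.equal(tt$p.value, 0.07939, tolerance = 1e-3))

  if (all(unlist(results))) {
    if (verbose) message("All tests passed.")
  } else {
    warning("FAILED: ",
            paste(names(results)[!unlist(results)], collapse = ", "))
  }
  invisible(results)
}

validate_r()
```

A menu item could simply call such a function, printing each test's name as it runs so the user sees the testing happen.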

The complete set of validation programs you use may be huge and impractical for most people to run. In that case, perhaps just a subset could be compiled, with an emphasis on the common statistical functions that people are most likely to be concerned about.

Asking to add a superfluous step to an installation may seem like a waste of time, and technically it is. But psychologically this testing will have an important impact that will silence many critics. Thanks for taking the time to consider it.

Best regards,
Bob


-----Original Message-----
From: Peter Dalgaard [mailto:p.dalgaard at biostat.ku.dk] 
Sent: Saturday, January 24, 2009 4:53 AM
To: Muenchen, Robert A (Bob)
Cc: R-help at r-project.org
Subject: Re: [R] The Quality & Accuracy of R

Bob,

Your point is well taken, but it also raises a number of issues 
(post-install testing to name one) for which the R-devel list would be 
more suitable. Could we move the discussion there?

	-Peter
Muenchen, Robert A (Bob) wrote:

1 day later

Now that I've asked you in, I probably should at least chip in with a 
couple of brief notes on the issue:

- not everything can be validated, and it's not like the commercial 
companies are validating everything. E.g. nonlinear regression code will 
give different results on different architectures, or even different 
compilers on the same architecture, and may converge on one and not on 
another.

- end-user validation is in principle a good thing, but please notice 
that what we currently do is part of a build from sources, and requires 
that build tools are installed. (E.g., we don't just run things, we also 
compare them to known outputs.) It's not entirely trivial to push these 
techniques to the end user.
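Peter's compare-to-known-outputs step can be illustrated with a small sketch. This is not R's actual check machinery, which runs during a source build via `make check` against saved transcripts; the function here is invented to show the idea:

```r
# Sketch of the known-output idea only: R's real checks run during a
# source build (`make check`) against saved .Rout files, not like this.
run_and_compare <- function(expr, reference_file) {
  observed <- capture.output(eval(expr))
  if (!file.exists(reference_file)) {
    writeLines(observed, reference_file)  # first run records the reference
    return(invisible(TRUE))
  }
  # Later runs must reproduce the stored output exactly.
  identical(observed, readLines(reference_file))
}

ref  <- tempfile(fileext = ".txt")
expr <- quote(print(coef(lm(dist ~ speed, data = cars))))
run_and_compare(expr, ref)  # records the reference
run_and_compare(expr, ref)  # re-run: TRUE when output matches
```

Pushing even this much to end users still leaves the questions Peter raises about where the trusted reference outputs come from.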

- a good reason to want post-install validation is that validity can 
depend on other parts of the system outside developer control (e.g. an 
overzealous BLAS optimization, sacrificing accuracy and/or standards 
compliance for speed, can cause trouble). This is also a reason for not 
making too far-reaching statements about validity.
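That BLAS point is one reason such tests compare numbers with a tolerance rather than bit-for-bit; in R that is the difference between identical() and all.equal():

```r
# Floating-point results can differ in the last bits across compilers
# and BLAS libraries, so validation tests compare with a tolerance.
x <- 0.1 + 0.2
y <- 0.3
identical(x, y)            # FALSE: bit-for-bit equality
isTRUE(all.equal(x, y))    # TRUE: equal within numerical tolerance
sqrt(.Machine$double.eps)  # all.equal's default tolerance, about 1.5e-8
```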

- I'm not too happy about maintaining the same information in multiple 
places. One thing we learned from the FDA document is how easily factual 
errors creep in and how silly we'd look if, say, the location of a key 
server were stated incorrectly, or if we stated that we release one 
patch version when in fact the most recent release had two. This kind of 
authoritative document itself needs a verification process to ensure 
that it is correct.

Best,

-pd

1 day later
Peter Dalgaard wrote:
Now that I've asked you in, I probably should at least chip in with a 
couple of brief notes on the issue:

- not everything can be validated, and it's not like the commercial 
companies are validating everything. E.g. nonlinear regression code will 
give different results on different architectures, or even different 
compilers on the same architecture, and may converge on one and not on 
another.

(Muenchen)==> Good point. The test suites I ran when installing mainframe software were quite simple: just one of each of various statistical methods, and I doubt any of them iterated back then. You would want to choose the things to test carefully to minimize such problems. The process I ran listed differences between my locally computed output and the output the company ran. If all went well, I would just see the differing dates and times scroll by.

- end-user validation is in principle a good thing, but please notice 
that what we currently do is part of a build from sources, and requires 
that build tools are installed. (E.g., we don't just run things, we also 
compare them to known outputs.) It's not entirely trivial to push these 
techniques to the end user.

(Muenchen)==> It sounds like this is quite different from what I expected. The test suites I have seen were just standard code and datasets. They ran in the program I was installing so no extra tools were required. Known outputs did ship with the products for the comparison.

- a good reason to want post-install validation is that validity can 
depend on other parts of the system outside developer control (e.g. an 
overzealous BLAS optimization, sacrificing accuracy and/or standards 
compliance for speed, can cause trouble). This is also a reason for not 
making too far-reaching statements about validity.

(Muenchen)==> Yes, and the number of possible combinations of argument settings is practically infinite. We would want to emphasize that although testing is done, it's impossible for ANY organization to test all conditions.

- I'm not too happy about maintaining the same information in multiple 
places. One thing we learned from the FDA document is how easily factual 
errors creep in and how silly we'd look if, say, the location of a key 
server were stated incorrectly, or if we stated that we release one 
patch version when in fact the most recent release had two. This kind of 
authoritative document itself needs a verification process to ensure 
that it is correct.

(Muenchen)==> Having maintained multiple docs that contained common sections, I can certainly agree that it is hard to keep them synchronized. However, if there can be only one document, should it be focused on a small (albeit important) sliver of statistical use? If, as I suspect, the great majority of R users face this question, would it not make sense to address the bigger problem?

Would it be possible to address both audiences in the same document, by putting the information of general interest before the clinical-specific info? Would a more generic title lose the clinical audience?

Thanks,
Bob

On Sun, Jan 25, 2009 at 4:20 PM, Peter Dalgaard
<p.dalgaard at biostat.ku.dk> wrote:
I wanted to echo Peter's point here. It's the main reason why we
don't claim our distribution of R is validated: *no* software can be
considered validated outside of the environment where it is installed
and used. (We do, however, claim Revolution R is ready for a validation
*process*, a small but significant part of which is coming on-site to
run tests and verify the results.) We've come across a number of
environmental issues (locales, random number generators, shared
libraries, path settings, many others) that may affect the validation
process. My main point here is that R can only be validated in situ,
and the process isn't practical to automate fully. With the right build
tools in place, many of the *tests* can be automated, but that leaves
out validation of how the results are stored, used, and accessed in
practice.
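For illustration, most of the environmental factors listed above can be captured from within R itself; an in-situ validation record might start with something like this (a sketch only; the field names are invented, and a formal process would record much more):

```r
# A sketch: capture the environmental factors that can affect results,
# so a validation record documents the system it actually ran on.
# The list structure and field names here are invented for illustration.
env_record <- list(
  session = sessionInfo(),   # R version, OS, attached packages
  locale  = Sys.getlocale(), # locale settings, which affect sorting etc.
  rng     = RNGkind(),       # random number generators in use
  libs    = .libPaths(),     # library paths packages load from
  machine = Sys.info()[c("sysname", "release", "machine")]
)
names(env_record)
```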
Nonetheless, Bob has an excellent point here -- even short of a
complete validation process, *perception* matters, and can keep the
validation ball from ever getting rolling in the first place. Giving the
user some degree of easily digestible feedback that the installed R has
run and passed a battery of tests could help with that, and is something
we'll look at for the Revolution R distribution.


P.S. For those who subscribe to r-devel but not r-help, some further
discussion of validation for R is here:
http://blog.revolution-computing.com/2009/01/analyzing-clinical-trial-data-with-r.html

--
David M Smith <david at revolution-computing.com>
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)