[Bioc-devel] plotPCA for BiocGenerics

6 messages · Michael Love, Kasper Daniel Hansen, Wolfgang Huber +2 more

#
I'm now leaning toward separating the PCA computation from the plotting
in DESeq2, given that there is non-trivial computation going on, and
leaving plotPCA as a wrapper so that no user code breaks. We had
already moved this way a bit: in the current release, plotPCA has an
argument to return only a data.frame.
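
A minimal sketch of that split, using only base R. The names computePCA, plotPCAScores and plotPCAWrapper are hypothetical and are not the DESeq2 API; the returnData argument only mimics the idea of returning a data.frame instead of plotting.

```r
# Hypothetical helpers for illustration; not the DESeq2 implementation.

# Computation only: returns a data.frame of scores, no plotting.
computePCA <- function(x, group) {
  # x: numeric matrix with features in rows and samples in columns
  pca <- prcomp(t(x))
  percentVar <- pca$sdev^2 / sum(pca$sdev^2)
  d <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], group = group)
  attr(d, "percentVar") <- percentVar[1:2]
  d
}

# Plotting only: consumes the data.frame produced above.
plotPCAScores <- function(d) {
  pv <- round(100 * attr(d, "percentVar"))
  plot(d$PC1, d$PC2, col = as.integer(factor(d$group)), pch = 19,
       xlab = paste0("PC1: ", pv[1], "% variance"),
       ylab = paste0("PC2: ", pv[2], "% variance"))
  invisible(d)
}

# Wrapper that keeps a single entry point for existing user code.
plotPCAWrapper <- function(x, group, returnData = FALSE) {
  d <- computePCA(x, group)
  if (returnData) return(d)
  plotPCAScores(d)
}

set.seed(1)
x <- matrix(rnorm(600), nrow = 100, ncol = 6)
d <- plotPCAWrapper(x, group = rep(c("A", "B"), each = 3), returnData = TRUE)
```

Users who want a different plot can take the returned data.frame to any plotting system, while the wrapper keeps the old call working.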

As far as the proposal of using the plot() function for all plots, I
think for the biologists who are struggling already to get R going,
and to figure out what kinds of plots are possible, plotMA (and
knowing that the help is available at ?plotMA) is just so much simpler
than the alternative (isn't it ?"plot,MA-method" for S4?).

For the benefit of future package developers, I wonder if the core
team wants to add some guidance to the package guidelines on this
topic?

http://bioconductor.org/developers/package-guidelines/


On Fri, Oct 31, 2014 at 9:10 PM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
#
On Nov 1, 2014 1:29 PM, "Michael Love" <michaelisaiahlove at gmail.com> wrote:
Scratch that... I forgot that finding help has to be ugly either way.

  
  
#
I see the argument for separating plotting and computation.

I don't see the argument for changing plotPCA to plot. Base R has things
that work either way; we all know hist(), boxplot(), etc. And for this
specific case there are good arguments that one could envision several
plots on a PCA object.
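
As a small illustration with base R only (the USArrests data set is just a convenient stand-in), a single prcomp object already supports several different plots:

```r
# One PCA object, several reasonable plots (base R / stats only).
pca <- prcomp(USArrests, scale. = TRUE)

plot(pca)            # plot.prcomp draws a scree plot of component variances
biplot(pca)          # biplot.prcomp overlays sample scores and variable loadings
plot(pca$x[, 1:2])   # scatter of the scores on the first two components
```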

But while I see that argument, having a common class which all packages
should use makes package-specific customization (colors, phenodata, etc.)
pretty hard, or at least requires some thinking.

Best,
Kasper

On Sat, Nov 1, 2014 at 2:21 PM, Michael Love <michaelisaiahlove at gmail.com>
wrote:

  
  
#
Just to bring the discussion back to the fact that there is a need to do /something/. A function plotPCA is defined in packages EDASeq, DESeq2, DESeq, affycoretools, Rcade, facopy, CopyNumber450k, netresponse, MAIT, with a real potential for needless user confusion. And BiocGenerics already defines the generics plotMA and plotDispEsts.

The need for BiocGenerics in the first place is a consequence of the S4 / Dylan / Common LISP object system and the fact that our project releases more than one package. We should not confuse that with the other issues that came up in the thread.

To what extent functions that do related things should have the same name seems a matter of taste. Reducing the number of function names that are around, but increasing the number of classes, seems pretty much a zero-sum game to me. <irony> We could have a "compute" generic, for all functions that compute something? Might make things easier for some users. Until some authors start using its argument "what" to say what it should compute if it's not already clear from the class of its argument(s). </irony>

I second Mike's suggestion & Kasper's points.

Best wishes
	Wolfgang
#
whatever, here's a patch.  If BiocGenerics had a GitHub repo I'd just
submit a pull request.  Bounce it or don't, it took 5 minutes of a Sunday
morning for a non-core committer, so it must not be that hard to build a
bikeshed after all.  /s

FWIW I once remarked to Robert Gentleman that it seemed wasteful for there
to be multiple implementations of the same functionality.  He pointed out
that this just lets the most useful/polished/lucky implementation gradually
take over.  So if plot() is preferable to plotPCA(), for example, plotPCA
will go away.  And if not, plotPCA will stay.

Package authors can vote with their feet.


Statistics is the grammar of science.
Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>
On Sun, Nov 2, 2014 at 12:19 AM, Wolfgang Huber <whuber at embl.de> wrote:

            
-------------- next part --------------
Index: DESCRIPTION
===================================================================
--- DESCRIPTION	(revision 96347)
+++ DESCRIPTION	(working copy)
@@ -1,7 +1,7 @@
 Package: BiocGenerics
 Title: S4 generic functions for Bioconductor
 Description: S4 generic functions needed by many Bioconductor packages.
-Version: 0.13.0
+Version: 0.13.1
 Author: The Bioconductor Dev Team
 Maintainer: Bioconductor Package Maintainer <maintainer at bioconductor.org>
 biocViews: Infrastructure
@@ -39,7 +39,7 @@
 	tapply.R
 	unique.R
 	unlist.R
-        unsplit.R
+  unsplit.R
 	relist.R
 	boxplot.R
 	image.R
@@ -53,6 +53,7 @@
 	dge.R
 	normalize.R
 	plotMA.R
+	plotPCA.R
 	normarg-utils.R
 	show-utils.R
 	strand.R
Index: NAMESPACE
===================================================================
--- NAMESPACE	(revision 96347)
+++ NAMESPACE	(working copy)
@@ -193,6 +193,7 @@
     estimateSizeFactors, 
     estimateDispersions,
     plotDispEsts, 
+    plotPCA,
     plotMA
 )
 
Index: R/plotPCA.R
===================================================================
--- R/plotPCA.R	(revision 0)
+++ R/plotPCA.R	(working copy)
@@ -0,0 +1,10 @@
+setGeneric("plotPCA", function(object, ...) {
+  standardGeneric("plotPCA")
+})
+
+setMethod("plotPCA", signature="ANY", 
+  definition = function(object, ...) {
+    msg = sprintf("Error from the generic function 'plotPCA' defined in package 'BiocGenerics': no S4 method definition for argument '%s' of class '%s' was found. Did you perhaps mean calling the function 'plotPCA' from another package, e.g. 'DESeq2'? In that case, please use the syntax 'DESeq2::plotPCA'.",
+                  deparse(substitute(object)), class(object)[1L])
+  stop(msg)
+})
Index: man/plotPCA.Rd
===================================================================
--- man/plotPCA.Rd	(revision 0)
+++ man/plotPCA.Rd	(working copy)
@@ -0,0 +1,55 @@
+\name{plotPCA}
+
+\alias{plotPCA}
+\alias{plotPCA,ANY-method}
+
+\title{PCA-plot: plot principal component scores for high-throughput data}
+
+\description{
+  A generic function which produces a principal components plot for an object containing microarray, RNA-Seq
+  or other data.
+}
+
+\usage{
+plotPCA(object, ...)
+}
+
+\arguments{
+  \item{object}{
+    A data object, typically containing count values from an RNA-Seq experiment or microarray intensity values.
+  }
+  \item{...}{
+    Additional arguments, for use in specific methods.
+  }
+}
+
+\value{
+  Undefined. The function exists for its side effect, producing a plot.
+}
+
+\seealso{
+  \itemize{
+    \item \code{\link[methods]{showMethods}} for displaying a summary of the
+          methods defined for a given generic function.
+
+    \item \code{\link[methods]{selectMethod}} for getting the definition of
+          a specific method.
+
+    \item \code{\link[DESeq2]{plotPCA}} in the \pkg{DESeq2} package
+          for a function with the same name that is not dispatched through this generic function.
+
+    \item \code{\link{BiocGenerics}} for a summary of all the generics defined
+          in the \pkg{BiocGenerics} package.
+  }
+}
+
+\examples{
+showMethods("plotPCA")
+
+suppressWarnings(
+  if(require("DESeq2"))
+    example("plotPCA", package="DESeq2", local=TRUE)
+)
+}
+
+\keyword{methods}
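
For illustration, a package adopting a generic like the one in the patch above might register its own method roughly as follows. This is a sketch under assumptions: the generic is redefined locally so the example runs without BiocGenerics, and the class ToyExpressionSet is hypothetical.

```r
library(methods)

# Local stand-in for the proposed generic, so the example is self-contained.
setGeneric("plotPCA", function(object, ...) standardGeneric("plotPCA"))

# Hypothetical container class: expression matrix plus a grouping factor.
setClass("ToyExpressionSet",
         representation(exprs = "matrix", group = "factor"))

# Package-specific method; extra arguments in ... are passed on to plot().
setMethod("plotPCA", "ToyExpressionSet", function(object, ...) {
  pca <- prcomp(t(object@exprs))
  plot(pca$x[, 1], pca$x[, 2], col = as.integer(object@group),
       xlab = "PC1", ylab = "PC2", ...)
  invisible(pca)
})

set.seed(42)
toy <- new("ToyExpressionSet",
           exprs = matrix(rnorm(400), nrow = 50, ncol = 8),
           group = factor(rep(c("ctrl", "trt"), each = 4)))
res <- plotPCA(toy, pch = 19)
```

With a shared generic, two packages that both define plotPCA methods no longer mask each other when attached together; dispatch picks the method from the class of the argument.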
#
On Sun, Nov 2, 2014 at 12:19 AM, Wolfgang Huber <whuber at embl.de> wrote:

            
I think there are real benefits to having a general "plot" abstraction. For
example, a reporting framework or GUI could use it to render a graphical
representation of an object. That doesn't preclude specific functions for
particular plot variants. It would just be nice to have a default
visualization of an object, in the same way we can call print to produce a
textual representation at the console. They're complementary.
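
A rough sketch of what such a default visualization could look like, by analogy with show/print. The class PCAResult and the function renderAll are hypothetical, standing in for a result class and a reporting framework.

```r
library(methods)

# Hypothetical result class holding PCA scores.
setClass("PCAResult",
         representation(scores = "matrix", percentVar = "numeric"))

# Default textual representation at the console.
setMethod("show", "PCAResult", function(object) {
  cat("PCAResult:", nrow(object@scores), "samples,",
      ncol(object@scores), "components\n")
})

# Default graphical representation; setMethod derives an S4 generic from plot().
setMethod("plot", signature(x = "PCAResult", y = "missing"),
  function(x, y, ...) {
    plot(x@scores[, 1], x@scores[, 2], xlab = "PC1", ylab = "PC2", ...)
    invisible(x)
  })

# A toy "reporting framework": renders any objects without knowing their classes.
renderAll <- function(objects, file = tempfile(fileext = ".pdf")) {
  pdf(file)
  on.exit(dev.off())
  for (obj in objects) plot(obj)
  file
}

pca <- prcomp(USArrests, scale. = TRUE)
res <- new("PCAResult", scores = pca$x,
           percentVar = pca$sdev^2 / sum(pca$sdev^2))
out <- renderAll(list(res, pca, cars))
```

Specialised functions such as plotPCA can still coexist with this; the plot method only supplies a sensible default for callers that know nothing about the class.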