
[Bioc-devel] parallel package generics

On 10/24/12 5:08 PM, "Hervé Pagès" <hpages at fhcrc.org> wrote:

            
We already have the identical problem when we try to use parallel mcmapply
on a BioC List (e.g. GRangesList).

Witness:

The casual user (ehrm, myself at least) expects that since I can 'lapply'
on a BioC GRangesList (or any other List) that I should be able to
mclapply on it.

Sadly the casual user is wrong, and gets an error.

Why?

Because parallel::mclapply(X... calls as.list on X.

Which yields 'Error in as.list.default : no method for coercing this S4
class to a vector'

But, you say, IRanges defines as.list for Lists, as can be demonstrated by
calling as.list(myGRL) on a GRangesList.
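
The mismatch can be reproduced without Bioconductor. The following is only a minimal sketch: `Box` is a hypothetical S4 class standing in for a GRangesList, and the explicit `base::as.list` call stands in for what I assume code inside the parallel namespace ends up calling.

```r
library(methods)

# Hypothetical stand-in for a List-like S4 class (not a Bioconductor class).
setClass("Box", representation(items = "list"))

# Defining an S4 method creates an S4 generic for as.list in the global
# environment; base::as.list itself is left untouched.
setGeneric("as.list")
setMethod("as.list", "Box", function(x, ...) x@items)

b <- new("Box", items = list(1, 2, 3))

# Called from the global environment, the S4 generic is found and dispatches.
res_global <- as.list(b)

# Code that resolves as.list to the base function (assumed here to be what
# happens inside another package's namespace) never sees the S4 method.
res_base <- tryCatch(base::as.list(b), error = function(e) conditionMessage(e))
print(res_base)
```

So the same object coerces fine at the prompt and fails when the base function is reached directly.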

Here I yield the floor to someone who can explain why this is so, for I
have not studied enough how namespaces/packages/symboltables/whatever work
in R.

Anyone?

Regardless, one BAD workaround I found that works is to snarf (tm) the source
for mclapply and evaluate it in the global environment, after prefixing all
parallel internal functions with 'parallel:::'.

After doing this, the modified mclapply works as one might expect.

So, there is at least an issue regarding how method dispatch works across
namespaces.  Again I yield the floor, but, expect that it can be fixed.

BUT, FURTHERMORE, MCLAPPLY SHOULD NOT COERCE X TO LIST ANYWAY

Why?  Because calling `as.list` incurs the overhead of (needlessly!?!)
coercing this nice tight GRangesList into a base::list.

There is NO REASON for it to be coercing X to a list at all.  By my
lights, mclapply only needs `length` and `seq_along` defined on X, which
ARE ALREADY available to a GRangesList from Vector.   Indeed, comment
out the X<-as.list(X) coercion in mclapply and, lo, it still works on a
GRangesList as hoped, and on a 1000-element GRangesList it takes ~18x less
user time to mclapply(myGRL,length)  (it is even shorter just to use
elementLengths, but that is not the point).
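
To illustrate the idea of iterating by index instead of coercing, here is a rough sketch. It is not the patched mclapply itself: `Box` is a hypothetical S4 class that defines only `length` and `[[`, and `lapply_by_index` is a made-up wrapper name. `mc.cores` is left at 1 so the sketch also runs where forking is unavailable.

```r
library(methods)
library(parallel)

# Hypothetical List-like S4 class exposing only length() and [[.
setClass("Box", representation(items = "list"))
setMethod("length", "Box", function(x) length(x@items))
setMethod("[[", "Box", function(x, i, j, ...) x@items[[i]])

# Hypothetical wrapper: hand mclapply a plain integer index vector and
# extract each element with [[, so X itself is never passed to as.list().
lapply_by_index <- function(X, FUN, ..., mc.cores = 1L) {
  mclapply(seq_len(length(X)), function(i) FUN(X[[i]], ...),
           mc.cores = mc.cores)
}

b <- new("Box", items = list(1:3, 1:5))
res <- lapply_by_index(b, length)
print(res)
```

Note that this sketch drops element names, which a real fix would want to carry over.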

In this case the solution appears to be to FIX the upstream package so
that method dispatch works correctly (I expect that length and seq_along
are only visible to my snarfed mclapply and would suffer from similar
error without addressing the package issue).

Indeed, similarly, in my proposed change to parallel::pvec, I found a
simple change that made it work with Vector as well as vector, since
Vector implements `[` and `length`.
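
As a rough illustration of why `[` and `length` are enough (this is NOT my actual change to pvec, just a sketch under that assumption), one can chunk by index and subset each chunk. `pvec_by_index` is a hypothetical name; `splitIndices` and `mclapply` are the existing parallel functions. Recombining with `c` assumes the class also supports `c`.

```r
library(parallel)

# Hypothetical pvec-like helper: split the index range into chunks, apply
# FUN to v[chunk], and concatenate. Only length(v) and v[i] are required
# of v, plus c() on the results.
pvec_by_index <- function(v, FUN, ..., mc.cores = 1L) {
  idx <- splitIndices(length(v), mc.cores)
  parts <- mclapply(idx, function(i) FUN(v[i], ...), mc.cores = mc.cores)
  do.call(c, parts)
}

res <- pvec_by_index(1:10, sqrt)
print(res)
```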

I still think the solution to getting an SGE (et al.) parallel back-end
is to seek to improve the upstream package to make it 'pluggable' for
different parallel backends.

I don't think I'm the right person to represent this to R-devel as
obviously I am not schooled (yet!?!?) in the workings of
S3/S4/signatures/methods/etc.

Herve, I have a hunch that your 'In the mean time' solution is a
workaround that has the potential to invite further confusion.

Anyone, as a perhaps related issue, and as an opportunity to educate me,
can you explain why untrace does NOT completely work on `lapply` (with
BiocGenerics loaded)?  Viz:

trace(lapply)
untrace(lapply)
IRanges(1,2)
IRanges of length 1
trace: lapply(dots, methods:::.class1)
....


--Malcolm
