
[Bioc-devel] Extending biovizBase and ggbio packages

12 messages · Martin Morgan, Jim Hester, Rainer Johannes +1 more

#
Dear all,

I've modified the biovizBase and ggbio packages so that they directly support EnsDb annotations (just like annotations provided by TxDb objects/packages).
Is there a way I could provide these changes? I contacted Tengfei directly last week, but have not received a reply yet.

cheers, jo
#
Sounds very useful. Perhaps you could make a github pull request on
the Bioconductor mirrors of those packages. Then Tengfei or I could
look it over.

Btw, I like the filtering functionality in ensembldb. It would be nice
to have something as rich for TxDb. It would also be great if there
were convenience wrappers like subset() and sort(), for example
transcripts(sort(subset(db, seqname == "chr1"), by = geneSymbol)).

Thanks,
Michael

On Mon, Dec 14, 2015 at 4:12 AM, Rainer Johannes
<Johannes.Rainer at eurac.edu> wrote:
#
Dear Michael,

GitHub pull requests would also be my favorite way to contribute, but unfortunately the Bioconductor-mirror GitHub repos are read-only (thus, as far as I understand, no pull requests are possible), and I didn't find other repos on GitHub.

Regarding the filtering, do you mean implementing subset() and sort() in ensembldb?
My only concern with the approach you describe is that it looks like "fetch from the db and then filter", which might be quite slow. I implemented the filters so that they are applied at query execution time (in fact, they are used to build the SQL query).
That way, plotting gene models in ggbio using EnsDbs is also really fast (since only the small portion that will be plotted is actually fetched from the db).

thanks, jo
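
The difference between "fetch from the db and then filter" and applying the filter at query execution time can be sketched as follows. This is only an illustration in Python with an in-memory SQLite database, not the ensembldb implementation; the table `gene`, its columns, and both function names are hypothetical.

```python
import sqlite3

# Hypothetical toy schema; real annotation databases are far richer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (gene_id TEXT, gene_name TEXT, seq_name TEXT)")
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [
        ("G1", "AAA", "1"),
        ("G2", "BBB", "1"),
        ("G3", "CCC", "2"),
        ("G4", "DDD", "X"),
    ],
)


def fetch_then_filter(conn, seq_name):
    """Load every row into memory, then discard the rows that do not match."""
    rows = conn.execute(
        "SELECT gene_id, gene_name, seq_name FROM gene"
    ).fetchall()
    return [row for row in rows if row[2] == seq_name]


def filter_in_query(conn, seq_name):
    """Put the condition into the WHERE clause so that only matching rows
    are returned by the database."""
    sql = "SELECT gene_id, gene_name, seq_name FROM gene WHERE seq_name = ?"
    return conn.execute(sql, (seq_name,)).fetchall()
```

Both functions return the same rows, but the second one never transfers the non-matching rows out of the database, which is what matters when only a small region is needed for a plot.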
#
On Mon, Dec 14, 2015 at 6:03 AM, Rainer Johannes
<Johannes.Rainer at eurac.edu> wrote:
How about forking the repo and committing the changes, so we can
comment on the commits?

Regarding subset() and sort(): that is what the call would look like,
but that's not how it would be implemented.
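
One way such wrappers could look like "fetch, then filter" while being implemented differently is lazy evaluation: subset() and sort() only record what was requested, and a single SQL query is built when the data is finally fetched. The sketch below is a hypothetical Python illustration of that idea, not the TxDb or ensembldb design; the class `LazyQuery`, the table `tx`, and its columns are made up for the example.

```python
import sqlite3


class LazyQuery:
    """Records filter and sort requests; no SQL runs until fetch() is called.

    Toy example: table and column names are inserted into the SQL text
    without any validation, so they must come from trusted code.
    """

    def __init__(self, conn, table, where=(), order_by=None):
        self.conn = conn
        self.table = table
        self.where = tuple(where)  # (column, value) pairs
        self.order_by = order_by

    def subset(self, **conditions):
        # Return a new object carrying the extra conditions; nothing is fetched.
        return LazyQuery(self.conn, self.table,
                         self.where + tuple(conditions.items()),
                         self.order_by)

    def sort(self, by):
        # Remember the sort column; the ordering is done by the database.
        return LazyQuery(self.conn, self.table, self.where, by)

    def to_sql(self):
        sql = f"SELECT * FROM {self.table}"
        if self.where:
            sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in self.where)
        if self.order_by:
            sql += f" ORDER BY {self.order_by}"
        return sql, [value for _, value in self.where]

    def fetch(self):
        sql, params = self.to_sql()
        return self.conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (tx_id TEXT, seqname TEXT, gene_symbol TEXT)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?)",
    [("T1", "chr1", "ZZZ"), ("T2", "chr2", "MMM"), ("T3", "chr1", "AAA")],
)

# Reads like "subset, then sort, then fetch", but runs as one query.
query = LazyQuery(conn, "tx").subset(seqname="chr1").sort(by="gene_symbol")
sql, params = query.to_sql()
rows = query.fetch()
```

Here the chained calls produce a single SELECT with a WHERE and an ORDER BY clause, so the filtering and sorting still happen inside the database.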
#
The mirror doesn't accept pull requests (the repository owner would have to handle the pull request, and package maintainers are not the owners of the mirror; this will eventually change). Either provide a diff against the mirror and apply it as a patch in svn, or have the package maintainer fork the mirror (http://bioconductor.org/developers/how-to/git-mirrors/) and open the pull request against that fork.
#
Rainer,

Pull requests to the git mirrors will be closed automatically because they
are read only mirrors. However you can still fork the mirror yourself and
commit your changes to your fork to make them easy for Michael and Tengfei
to review.

Jim

On Mon, Dec 14, 2015 at 9:03 AM, Rainer Johannes <Johannes.Rainer at eurac.edu>
wrote:

  
  
#
I'll do that, thanks for all the comments!
_______________________________________________
Bioc-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
#
OK, the modifications are in the repositories:
https://github.com/jotsetung/biovizBase
https://github.com/jotsetung/ggbio

Let me know if I can be of any help.

cheers, jo
#
Great, thanks for this valuable contribution. I made some comments on
the commits. The biggest issue is that I think there is a lot of code
duplication between the EnsDb and TxDb methods. We should try hard to
reduce this.

Michael

On Tue, Dec 15, 2015 at 1:37 AM, Rainer Johannes
<Johannes.Rainer at eurac.edu> wrote:
#
Thanks for the comments! I'll make some changes and push a "cleaner" version once I'm done.
Indeed, I have to reduce the code duplication. I could also use the same or similar code as for TxDb, but I wanted to make as much use of the EnsDb filter system as possible to reduce processing time.

jo
#
OK, I have removed the code duplication and pushed to my biovizBase and ggbio forks.
Also, in the crunch method for EnsDb I'm now avoiding the loops. Thanks Michael!

jo
#
Awesome. That's great (and fast) work. I will add you to the author
list of both packages.

Thanks a lot for your contribution,
Michael

On Wed, Dec 16, 2015 at 6:15 AM, Rainer Johannes
<Johannes.Rainer at eurac.edu> wrote: