Thanks, Michael.
httpuv, to which Hector made crucial contributions, makes it easy to send
data directly between R and the browser, using websockets. I resort to
files, however, because when the data, rendered as JSON, exceeds 500k, the
websocket hangs. I never identified the weak spot. Some Jupyter
developers recently had good luck with binary websocket data exchange. I
am cautious, though, about pushing limits and using the latest websocket
extension, and found the fallback to local files quite adequate for now.
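
A minimal sketch of that size-based fallback, written in Python purely for illustration (it is not httpuv or IGV code). The function name `choose_transport` is hypothetical, and reading "500k" as 500,000 bytes of JSON text is an assumption:

```python
import json
import os
import tempfile

# Assumption: "500k" is read here as 500,000 bytes of JSON text.
WEBSOCKET_LIMIT = 500_000


def choose_transport(data, limit=WEBSOCKET_LIMIT):
    """Return ("websocket", json_text) for small payloads, else ("file", path)."""
    payload = json.dumps(data)
    if len(payload.encode("utf-8")) <= limit:
        return "websocket", payload
    # Too large for the socket: write a temporary file the browser can fetch.
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(payload)
    return "file", path
```

The caller would send the text over the socket in the first case and hand the browser a URL for the file in the second.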
I'll look at ucsc.R.
- Paul
On Mar 9, 2018, at 11:48 AM, Michael Lawrence <lawrence.michael at gene.com> wrote:
Couple of things:
1) Check out epivizr and the surrounding infrastructure (maybe Hector
can chime in). It's able to serve up data directly from R; would be nice if
we could do that with IGV, instead of writing out to files. That would
require it to talk to some standard API, like the old DAS.
2) The rtracklayer API is in rtracklayer/R/browser.R. See ucsc.R for how
that is implemented for UCSC.
On Fri, Mar 9, 2018 at 9:59 AM, Paul Shannon <
pshannon at systemsbiology.org> wrote:
Thanks, Levi. Your comments, and Gabe's, are very helpful, getting me to
consider things I have overlooked.
Support for GenomicRanges is essential, as you and Gabe point out.
In all cases IGV will convert a GRanges object to an appropriate track,
then write it out as a temporary file. igv supports bed, gff, gff3, gtf,
wig, bigWig, bedGraph, bam, vcf, and seg formats, and a variety of
sources: files via HTTP, Google Cloud Storage, GA4GH; recent limited
support has been provided for direct JavaScript data. Maybe someday
AnnotationHub?
GenomicRanges, as I understand them, are very flexible, not subclassed
into types as track formats are. So I propose that in many cases it will
be the user's responsibility to specify the track type, call the appropriate
constructor, and maybe specify column names so that the right scores can be
extracted from the mcols, whose names, so far as I know, are not
standardized.
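
To illustrate why the score column has to be named explicitly, here is a small sketch in Python (for illustration only, not the proposed R API). Ranges are modelled as plain dicts with 1-based inclusive coordinates, as in GRanges, and the helper `to_bedgraph_lines` and the dict layout are hypothetical. bedGraph uses 0-based, half-open coordinates, so the start is shifted by one:

```python
def to_bedgraph_lines(ranges, score_column):
    """Format ranges as bedGraph lines, taking scores from a named metadata column.

    Each range is a dict with "chrom", "start", "end" (1-based, inclusive)
    and "meta", a dict standing in for the mcols of a GRanges object.
    """
    lines = []
    for r in ranges:
        if score_column not in r["meta"]:
            raise KeyError(f"no metadata column named {score_column!r}")
        score = r["meta"][score_column]
        # bedGraph is 0-based, half-open: subtract 1 from the start only.
        lines.append(f'{r["chrom"]}\t{r["start"] - 1}\t{r["end"]}\t{score}')
    return lines
```

Because nothing in the metadata says which column is the score, the caller has to pass `score_column`; a missing column raises an error rather than silently picking another one.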
If the GRanges object is too big - greater than a densely packed
megabase, for instance - igv works best if the track file is indexed and
served up by an index- and CORS-savvy webserver. Thus IGV should
politely fail - or at least issue a warning - when it encounters big tracks.
This "too big" threshold may change over time.
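
One way to keep that threshold adjustable is to make it a parameter with a default, sketched here in Python for illustration only. The name `is_too_big` and the default of 10,000 features are placeholders, not values from this thread or from igv:

```python
import warnings

# Placeholder default; the real "too big" cutoff is undecided and may change.
MAX_FEATURES = 10_000


def is_too_big(n_features, max_features=MAX_FEATURES):
    """Warn and return True when a track has more features than the cutoff."""
    if n_features > max_features:
        warnings.warn(
            f"track has {n_features} features (limit {max_features}); "
            "consider an indexed file served over http instead"
        )
        return True
    return False
```

A stricter variant could raise an error instead of warning, which corresponds to the "politely fail" option.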
Reading through Michael's rtracklayer vignette I came across this:
The rtracklayer package currently interfaces with the UCSC web-based
Other packages may provide drivers for other genome browsers through
Can anyone (maybe Michael himself?) comment on how I can evaluate an
rtracklayer plugin strategy for igv?
On Mar 9, 2018, at 4:15 AM, Levi Waldron <lwaldron.research at gmail.com> wrote:
On Thu, Mar 8, 2018 at 12:29 AM, Paul Shannon <
pshannon at systemsbiology.org> wrote:
Thanks, Gabe.
You make an excellent point: bioc objects get first class support. In
some instances, base R data types deserve that also, and data.frames lead
the list for me, being useful, concise, universally available, expressive.
So perhaps not "data.frames replaced by" but "accompanied by"
appropriate bioc data types?
- Paul
Definitely +1 for supporting GenomicRanges, including what's in
genome() and mcols(). There's a demo of an rtracklayer -> GRanges -> UCSC
genome browser workflow in the rtracklayer vignette that I've made use of.
I wouldn't necessarily say *don't* support data.frame, but I would
certainly encourage Bioc users to import data with rtracklayer instead of
generic read* functions, and to take advantage of the vast AnnotationHub
and OrganismDbi-based annotations which provide GenomicRanges objects.
Thanks and looking forward to it!