[Bioc-devel] coverage,GenomicRanges interpretation of 'weight'
On 02/23/2012 11:22 AM, Cook, Malcolm wrote:
Another way the interface to coverage could/should agree between GenomicRanges and RangedData is in how 'width' is expected to be encoded. For GenomicRanges, "'width' must be NULL or a numeric vector". For RangedData, "'width' must be a non-empty list". Of course this is not crucial, but the disagreement is easy to trip over.
In the early days, the "coverage" method for GenomicRanges also took a non-empty list, but about one year ago I changed this to the "NULL or a numeric vector" interface, which I find more natural: putting the widths for each seqlevel in a numeric vector is more natural than putting them in a list where each element is a numeric vector of length 1, and it is consistent with 'seqlengths(x)'. We should definitely make the same change to the other "coverage" methods that still use the old interface though (RangesList, RangedData, there might be more...). I've put this on our list.
Thanks,
H.
~Malcolm
-----Original Message-----
From: bioc-devel-bounces at r-project.org [mailto:bioc-devel-bounces at r-project.org] On Behalf Of Cook, Malcolm
Sent: Thursday, February 23, 2012 1:14 PM
To: 'bioc-devel at r-project.org'
Subject: [Bioc-devel] coverage,GenomicRanges interpretation of 'weight'
It would be great, I think, if the coverage,GenomicRanges interpretation of
'weight' were similar to that for RangedData:
"For 'RangedData' objects, this can also be a single string naming a
column to be used as the weights."
In the case of coverage,GenomicRanges we would want to weight by any
attribute (e.g. score) in the values DataTable.
Is this reasonable?
I like being able to refer to the weights symbolically, as RangedData allows. It lets
me write, for instance, this utility function:
coverageByStrand <- function(x, ...) {
  ## PURPOSE: compute the coverage of x, split by 'strand'.
  ## RETURNS: a list (one element per strand level) of SimpleRleList
  ##          objects (coverage by chromosome).
  lapply(split(x, strand(x)), coverage, ...)
}
which I can then use as
someStrandedCoverage <-
  coverageByStrand(someStrandedFeaturesAsGenomicRanges,
                   weight = 'theDataTableAttributeHoldingSomeWeightFactor')
Otherwise I would have to test for the presence of a weight attribute and
split it by strand, and use mapply instead, etc....
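For illustration, the alternative described above might look roughly like the sketch below. It is only a sketch under stated assumptions: the helper name coverageByStrandWeighted and its weightCol argument are hypothetical, the metadata is read with the mcols() accessor of current GenomicRanges releases, and the weights are passed to coverage() as a plain numeric vector. Because split() carries the metadata columns along with each strand's ranges, lapply() suffices here and mapply() is not needed.

```r
library(GenomicRanges)

## Hypothetical helper (not part of GenomicRanges): per-strand coverage
## with optional weights looked up in a metadata column of 'x'.
coverageByStrandWeighted <- function(x, weightCol = NULL) {
  ## One GRanges per strand level; metadata columns travel with the ranges.
  parts <- split(x, strand(x))
  lapply(parts, function(gr) {
    if (is.null(weightCol)) {
      ## No weight column requested: plain coverage.
      coverage(gr)
    } else {
      ## Fail early if the named column is absent.
      stopifnot(weightCol %in% colnames(mcols(gr)))
      ## Pass the column as a numeric vector, one weight per range.
      coverage(gr, weight = mcols(gr)[[weightCol]])
    }
  })
}
```

It would be called as coverageByStrandWeighted(someStrandedFeaturesAsGenomicRanges, weightCol = 'theDataTableAttributeHoldingSomeWeightFactor'), which is exactly the bookkeeping a symbolic 'weight' argument would make unnecessary.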
Regardless of the merits, having the interface to coverage be similar
between RangedData and GenomicRanges is arguably desirable.
My workaround is to convert to RangedData for the computation. Definitely
not urgent.
Malcolm Cook
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319