[Bioc-devel] How best to remap S4Vectors::Hits indices?
Hi Pariksheet,
On 05/22/2018 04:57 PM, Pariksheet Nanda wrote:
> Hi folks, I'm working on a package that does some trivial GRanges position classifications; primarily to standardize nomenclature according to the literature in workflows. The API for S4Vectors::Hits() generally doesn't seem amenable to modifying Hits objects, except for the remapHits() feature (which I see under the covers really generates a new Hits object).
Exactly. And that is the case for any object in R that is not a reference object (i.e. that is not an environment, external pointer, or reference class instance). Modifying it always generates a new object. For example replacing a column of a data frame with my_df$foo <- value or a slot of an S4 object with my_object@foo <- value generates a new object. So adding setter methods for Hits objects wouldn't change that. The only reason we don't provide from()/queryHits() or to()/subjectHits() setters is that we've not been able to identify use cases that justify having them so far. For those use cases where the 'from' and 'to' slots both need to be modified (in an atomic way), calling the Hits() constructor to generate a new object does the job.
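To make the copy-on-modify point concrete, here is a minimal base R sketch. The object names (df, df_alias, env, env_alias) are made up for illustration only:

```r
# A data frame is an ordinary (non-reference) object: modifying a second
# binding to it produces a new object and leaves the original untouched.
df <- data.frame(foo = 1:3)
df_alias <- df
df_alias$foo <- 4:6
identical(df$foo, 1:3)    # the original column is unchanged

# An environment is a reference object: both names point at the same
# environment, so a change made through one is visible through the other.
env <- new.env()
env$foo <- 1:3
env_alias <- env
env_alias$foo <- 4:6
identical(env$foo, 4:6)   # the change shows up via the original name
```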
> I was hoping someone could take a quick look at a short function I'm using to subset and reindex Hits in the da_tss() function:
> https://github.com/coregenomics/nascentrna/blob/a2d9d10564c3a88759237b56ec49d0d3e73f6d16/R/classify.R#L70
>
> Yes, to illustrate the problem I'm having, I've directly used the @-style S4 access which is, of course, a terrible thing to do because it defeats the purpose of S4 object validation, which is why I'm e-mailing the list for an alternative. I feel like casting to something like a data.frame, changing the indices, and changing back to Hits would be wasteful and improperly using the Bioconductor framework?
No need to cast the object to a data.frame. That would indeed be
wasteful. Just compute the new 'from' and 'to' vectors then do
'Hits(from, to, nLnode=nLnode(hits), nRnode=nRnode(hits))'
to create the modified object ('hits' being the original object).
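
As a rough sketch of that approach applied to the "compare a subset, report original indices" case: the index vector idx_query, the length n_query_orig, and the toy hits below are all invented for illustration. One assumption to flag: when the 'from' indices are mapped back to positions in the full query, nLnode presumably has to become the length of the full query rather than nLnode(hits), since the remapped indices can exceed the subset's length.

```r
library(S4Vectors)

# Hypothetical setup: the query was subset to 3 of its 10 original elements
# before computing hits against a subject of length 5.
idx_query    <- c(2L, 5L, 9L)   # positions of the subset in the original query
n_query_orig <- 10L             # length of the original (unsubsetted) query

# Toy hits expressed relative to the subset (nLnode = 3).
hits <- Hits(from = c(1L, 1L, 3L), to = c(4L, 2L, 1L),
             nLnode = 3L, nRnode = 5L)

# Translate subset-relative 'from' indices to original positions, then
# build a new, validated Hits object through the constructor.
new_from <- idx_query[from(hits)]
remapped <- Hits(new_from, to(hits),
                 nLnode = n_query_orig,   # assumption: original query length
                 nRnode = nRnode(hits))

from(remapped)
to(remapped)
```

Going through Hits() rather than assigning to slots with @ means the result is checked by the usual validity code.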
Hope this helps,
H.
> Here are the corresponding tests that run the da_tss() function:
> https://github.com/coregenomics/nascentrna/blob/a2d9d10564c3a88759237b56ec49d0d3e73f6d16/tests/testthat/test-classifiers.R
>
> What it comes down to is this: I want to compare a subset of GRanges for hits, but revert to the original GRanges indices when returning the results. Thanks for any advice!
>
> Pariksheet
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319