Good day, readGAlignmentPairs has strandMode but readGAlignments doesn't, which means that single-end strand-specific RNA-seq that generates sequences on the opposite strand to the gene needs a subsequent ifelse statement. The API could be more consistent by providing a strandMode option for readGAlignments and other similar functions in GenomicAlignments. -------------------------------------- Dario Strbenac University of Sydney Camperdown NSW 2050 Australia
[Bioc-devel] readGAlignments Lacks strandMode
4 messages · Dario Strbenac, Hervé Pagès
7 days later
Hi,
No ifelse() statement needed. Just use invertStrand() on your
GAlignments object to invert its strand.
strandMode is specific to paired-end reads and the supported modes
reflect what other software do with paired-end reads (e.g. TopHat,
Rsubread, etc..). These modes don't make sense for single-end reads.
Furthermore note that the mode only affects the way the strand() getter
for GAlignmentPairs objects combines the strands from the first and last
end to return a single value for each pair. It does NOT modify the
strand of each end:
library(GenomicAlignments)
bamfile <- system.file("extdata", "ex1.bam", package="Rsamtools")
## strandMode 1
galp <- readGAlignmentPairs(bamfile, strandMode=1)[c(1:3, 1572)]
as.character(strand(first(galp)))
# [1] "+" "+" "+" "-"
as.character(strand(last(galp)))
# [1] "-" "-" "-" "+"
as.character(strand(galp))
# [1] "+" "+" "+" "-"
## strandMode 2
galp <- readGAlignmentPairs(bamfile, strandMode=2)[c(1:3, 1572)]
as.character(strand(first(galp))) # same as with strandMode 1
# [1] "+" "+" "+" "-"
as.character(strand(last(galp))) # same as with strandMode 2
# [1] "-" "-" "-" "+"
as.character(strand(galp))
# [1] "-" "-" "-" "+"
As you can see, in both case the strand of the first and last end is
what's in the BAM file.
I'm a big fan of consistency but generally speaking I don't think
consistency means that every import function should provide the exact
same interface, whatever the data to import is. That's why we have
different functions for different data in the 1st place.
H.
On 01/05/2017 11:00 PM, Dario Strbenac wrote:
Good day, readGAlignmentPairs has strandMode but readGAlignments doesn't, which means that single-end strand-specific RNA-seq that generates sequences on the opposite strand to the gene needs a subsequent ifelse statement. The API could be more consistent by providing a strandMode option for readGAlignments and other similar functions in GenomicAlignments. -------------------------------------- Dario Strbenac University of Sydney Camperdown NSW 2050 Australia
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Herv? Pag?s Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fredhutch.org Phone: (206) 667-5791 Fax: (206) 667-1319
On 01/13/2017 12:36 PM, Herv? Pag?s wrote:
Hi,
No ifelse() statement needed. Just use invertStrand() on your
GAlignments object to invert its strand.
strandMode is specific to paired-end reads and the supported modes
reflect what other software do with paired-end reads (e.g. TopHat,
Rsubread, etc..). These modes don't make sense for single-end reads.
Furthermore note that the mode only affects the way the strand() getter
for GAlignmentPairs objects combines the strands from the first and last
end to return a single value for each pair. It does NOT modify the
strand of each end:
library(GenomicAlignments)
bamfile <- system.file("extdata", "ex1.bam", package="Rsamtools")
## strandMode 1
galp <- readGAlignmentPairs(bamfile, strandMode=1)[c(1:3, 1572)]
as.character(strand(first(galp)))
# [1] "+" "+" "+" "-"
as.character(strand(last(galp)))
# [1] "-" "-" "-" "+"
as.character(strand(galp))
# [1] "+" "+" "+" "-"
## strandMode 2
galp <- readGAlignmentPairs(bamfile, strandMode=2)[c(1:3, 1572)]
as.character(strand(first(galp))) # same as with strandMode 1
# [1] "+" "+" "+" "-"
as.character(strand(last(galp))) # same as with strandMode 2
I meant . . . . . . . . . . . . . . # same as with strandMode 1 H.
# [1] "-" "-" "-" "+" as.character(strand(galp)) # [1] "-" "-" "-" "+" As you can see, in both case the strand of the first and last end is what's in the BAM file. I'm a big fan of consistency but generally speaking I don't think consistency means that every import function should provide the exact same interface, whatever the data to import is. That's why we have different functions for different data in the 1st place. H. On 01/05/2017 11:00 PM, Dario Strbenac wrote:
Good day, readGAlignmentPairs has strandMode but readGAlignments doesn't, which means that single-end strand-specific RNA-seq that generates sequences on the opposite strand to the gene needs a subsequent ifelse statement. The API could be more consistent by providing a strandMode option for readGAlignments and other similar functions in GenomicAlignments. -------------------------------------- Dario Strbenac University of Sydney Camperdown NSW 2050 Australia
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Herv? Pag?s Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fredhutch.org Phone: (206) 667-5791 Fax: (206) 667-1319
Good day, Now I know about invertStrand, I agree that it's best to keep the strandMode only for paired-end data. Indeed, it's an example at the end of the lengthy documentation of GAlignments. -------------------------------------- Dario Strbenac University of Sydney Camperdown NSW 2050 Australia