[Bioc-devel] error using the DEXSeqDataSet function
On 07/30/2015 09:02 AM, Alejandro Reyes wrote:
Dear Leonard,

Thanks a lot for reporting this. It should be fixed in the version that I just committed to the svn (DEXSeq 1.15.10).

While debugging the DEXSeq code, I noticed that summarizeOverlaps is giving me an error, which I think is a bug in creating the SummarizedExperiment object that is returned. Here is a reproducible example:
> library(GenomicRanges)
> library(GenomicFeatures)
> library(GenomicAlignments)
>
> hse <- makeTxDbFromBiomart( biomart="ensembl",
+     dataset="hsapiens_gene_ensembl" )
Download and preprocess the 'transcripts' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Download and preprocess the 'splicings' data frame ... OK
Download and preprocess the 'genes' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
>
> bamDir <- system.file(
+     "extdata", package="parathyroidSE", mustWork=TRUE )
> fls <- list.files( bamDir, pattern="bam$", full=TRUE )
>
> bamlst <- BamFileList(
+     fls, index=character(),
+     yieldSize=100000, obeyQname=TRUE )
>
> exonicParts <- disjointExons( hse, aggregateGenes=FALSE )
>
> SE <- summarizeOverlaps( exonicParts, bamlst,
+     mode="Union", singleEnd=FALSE,
+     ignore.strand=TRUE, inter.feature=FALSE, fragments=TRUE )
Error in SummarizedExperiment(assays = SimpleList(counts = counts), rowRanges = features, :
  error in evaluating the argument 'assays' in selecting a method for function 'SummarizedExperiment': Error in validObject(.Object) :
  invalid class "SimpleList" object: invalid object for slot "listData" in class "SimpleList": got class "matrix", should be or extend class "list"
With options(error=recover) we end up at:
Enter a frame number, or 0 to exit
1: summarizeOverlaps(exonicParts, bamlst, mode = "Union", singleEnd = FALSE, i
2: summarizeOverlaps(exonicParts, bamlst, mode = "Union", singleEnd = FALSE, i
3: .local(features, reads, mode, algorithm, ignore.strand, ...)
4: .dispatchBamFiles(features, reads, mode, match.arg(algorithm), ignore.stran
5: SummarizedExperiment(assays = SimpleList(counts = counts), rowRanges = feat
6: SimpleList(counts = counts)
7: new("SimpleList", listData = args)
8: initialize(value, ...)
9: initialize(value, ...)
10: validObject(.Object)
Selection: 4
Called from: stop(msg, ": ", errors, domain = NA)
Browse[2]> SimpleList(counts=counts)
Error during wrapup: invalid class "SimpleList" object: invalid object for slot
"listData" in class "SimpleList": got class "matrix", should be or extend class
"list"
Browse[2]> class(counts)
[1] "matrix"
Browse[2]> dim(counts) ## suspicious!
[1] 2 3
Browse[2]> cat(counts[[1]])
sequences 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, X, Y have incompatible seqlengths:
- in 'x': 248956422, 242193529, 198295559, 190214555, 181538259, 170805979,
159345973, 145138636, 138394717, 133797422, 135086622, 133275309, 114364328,
107043718, 101991189, 90338345, 83257441, 80373285, 58617616, 64444167,
46709983, 50818468, 156040895, 57227415
- in 'y': 249250621, 243199373, 198022430, 191154276, 180915260, 171115067,
159138663, 146364022, 141213431, 135534747, 135006516, 133851895, 115169878,
107349540, 102531392, 90354753, 81195210, 78077248, 59128983, 63025520,
48129895, 51304566, 155270560, 59373566
So (a) the genome build of the alignments is different from the genome build of
the annotations; (b) BiocParallel (I think) is capturing the messages rather than
the return values; and (c) SimpleList(matrix(list())) fails.
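
Point (a) can be checked before counting. The following is only a minimal sketch in base R: mismatchedSeqlengths is a hypothetical helper, not a function from any Bioconductor package, and the two vectors hold just the lengths of chromosomes 1 and 2 copied from the message printed above.

```r
# Hypothetical helper (an assumption for illustration, not an existing API):
# given two named vectors of sequence lengths, such as those returned by
# seqlengths(), return the names of the shared sequences whose lengths differ.
mismatchedSeqlengths <- function(x, y) {
    common <- intersect(names(x), names(y))
    differs <- !is.na(x[common]) & !is.na(y[common]) & x[common] != y[common]
    common[differs]
}

# Lengths of chromosomes 1 and 2 as reported "in 'x'" and "in 'y'" above.
x <- c("1" = 248956422L, "2" = 242193529L)
y <- c("1" = 249250621L, "2" = 243199373L)

mismatchedSeqlengths(x, y)
```

With the objects from the example, something like mismatchedSeqlengths(seqlengths(hse), seqlengths(bamlst[[1]])) should list the offending chromosomes, assuming seqlengths() works on a BamFile as it does on a TxDb (not verified here).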
Martin
And the output of sessionInfo():

> sessionInfo()
R Under development (unstable) (2015-07-25 r68744)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 15.04

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] GenomicAlignments_1.5.12   Rsamtools_1.21.14
 [3] Biostrings_2.37.2          XVector_0.9.1
 [5] GenomicFeatures_1.21.13    AnnotationDbi_1.31.17
 [7] DEXSeq_1.15.10             DESeq2_1.9.26
 [9] RcppArmadillo_0.5.200.1.0  Rcpp_0.12.0
[11] SummarizedExperiment_0.3.2 GenomicRanges_1.21.17
[13] GenomeInfoDb_1.5.9         IRanges_2.3.15
[15] S4Vectors_0.7.10           Biobase_2.29.1
[17] BiocGenerics_0.15.3        BiocParallel_1.3.42

loaded via a namespace (and not attached):
 [1] genefilter_1.51.0    statmod_1.4.21       locfit_1.5-9.1
 [4] reshape2_1.4.1       splines_3.3.0        lattice_0.20-33
 [7] colorspace_1.2-6     rtracklayer_1.29.12  survival_2.38-3
[10] XML_3.98-1.3         foreign_0.8-65       DBI_0.3.1
[13] RColorBrewer_1.1-2   lambda.r_1.1.7       plyr_1.8.3
[16] stringr_1.0.0        zlibbioc_1.15.0      munsell_0.4.2
[19] gtable_0.1.2         futile.logger_1.4.1  hwriter_1.3.2
[22] latticeExtra_0.6-26  geneplotter_1.47.0   biomaRt_2.25.1
[25] proto_0.3-10         acepack_1.3-3.3      xtable_1.7-4
[28] scales_0.2.5         Hmisc_3.16-0         annotate_1.47.4
[31] gridExtra_2.0.0      ggplot2_1.0.1        digest_0.6.8
[34] stringi_0.5-5        grid_3.3.0           tools_3.3.0
[37] bitops_1.0-6         magrittr_1.5         RCurl_1.95-4.7
[40] RSQLite_1.0.0        Formula_1.2-1        cluster_2.0.3
[43] futile.options_1.0.0 MASS_7.3-43          rpart_4.1-10
[46] nnet_7.3-10

Best regards,
Alejandro Reyes

On 29.07.2015 20:26, Leonard Goldstein wrote:
Hi all,

I'm having trouble creating a DEXSeqDataSet object (in the devel version of DEXSeq). Running the example included in the manual page results in the same error I get with my own data (see below).

Many thanks for your help.

Leonard
> library(DEXSeq)
> countData <- matrix( rpois(10000, 100), nrow=1000 )
> sampleData <- data.frame(
+     condition=rep( c("untreated", "treated"), each=5 ) )
> design <- formula( ~ sample + exon + condition:exon )
> groupID <- rep(
+     paste0("gene", 1:10),
+     each= 100 )
> featureID <- rep(
+     paste0("exon", 1:10),
+     times= 100 )
> DEXSeqDataSet( countData, sampleData, design,
+     featureID, groupID )
converting counts to integer mode
Error in `$<-.data.frame`(`*tmp*`, "dispersion", value = NA) :
replacement has 1 row, data has 0
In addition: Warning message:
In DESeqDataSet(se, design, ignoreRank = TRUE) :
900 duplicate rownames were renamed by adding numbers
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.6 (Santiago)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] DEXSeq_1.15.9              DESeq2_1.9.26
 [3] RcppArmadillo_0.5.200.1.0  Rcpp_0.12.0
 [5] SummarizedExperiment_0.3.2 GenomicRanges_1.21.17
 [7] GenomeInfoDb_1.5.9         IRanges_2.3.15
 [9] S4Vectors_0.7.10           Biobase_2.29.1
[11] BiocGenerics_0.15.3        BiocParallel_1.3.41

loaded via a namespace (and not attached):
 [1] genefilter_1.51.0     statmod_1.4.21        locfit_1.5-9.1
 [4] reshape2_1.4.1        splines_3.2.1         lattice_0.20-33
 [7] colorspace_1.2-6      survival_2.38-3       XML_3.98-1.3
[10] foreign_0.8-65        DBI_0.3.1             RColorBrewer_1.1-2
[13] lambda.r_1.1.7        plyr_1.8.3            stringr_1.0.0
[16] zlibbioc_1.15.0       Biostrings_2.37.2     munsell_0.4.2
[19] gtable_0.1.2          futile.logger_1.4.1   hwriter_1.3.2
[22] latticeExtra_0.6-26   geneplotter_1.47.0    biomaRt_2.25.1
[25] AnnotationDbi_1.31.17 proto_0.3-10          acepack_1.3-3.3
[28] xtable_1.7-4          scales_0.2.5          Hmisc_3.16-0
[31] annotate_1.47.4       XVector_0.9.1         Rsamtools_1.21.14
[34] gridExtra_2.0.0       ggplot2_1.0.1         digest_0.6.8
[37] stringi_0.5-5         grid_3.2.1            tools_3.2.1
[40] bitops_1.0-6          magrittr_1.5          RCurl_1.95-4.7
[43] RSQLite_1.0.0         Formula_1.2-1         cluster_2.0.3
[46] futile.options_1.0.0  MASS_7.3-43           rpart_4.1-10
[49] compiler_3.2.1        nnet_7.3-10
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793