We are starting to work on an infrastructure for annotating 16S metagenomic sequencing datasets and would like your comments and/or contributions. Below are links to two GitHub repositories: metagenomeFeatures and greengenes13.5MgDb.

The metagenomeFeatures package contains two classes: mgDb, for 16S sequence databases, and metagenomeAnnotation, for annotating a sequence dataset with taxonomic information from a mgDb object. The greengenes13.5MgDb package loads a mgDb object with the greengenes 13.5 database. greengenes 13.5 was used as an example database; we plan on adding packages for other commonly used databases, e.g., RDP and Silva. The metagenomeFeatures package includes two vignettes demonstrating the mgDb and metagenomeAnnotation class methods, using greengenes13.5MgDb as the example database.

We are planning on adding additional methods for the mgDb and metagenomeAnnotation classes. For the mgDb class, we plan to add assignment of query sequences to database sequences using the rRDP classifier and/or sequence alignment methods that are part of the Biostrings package. For the metagenomeAnnotation class, we plan to include the ability to create a phylogenetic tree from a metagenomeAnnotation object.

We would appreciate comments on the package and suggestions for additional features.

Links to the package GitHub repositories:
https://github.com/HCBravoLab/metagenomeFeatures
https://github.com/HCBravoLab/greengenes13.5MgDb

Thanks,
Nate Olson and Hector Corrada Bravo
[Bioc-devel] Request for comment metagenomeFeatures package
6 messages · Nathan Olson, Hector Corrada Bravo, Martin Morgan +1 more
Very interesting development; we have several folks who will take a look. FYI:

%vjcair> R CMD INSTALL greeng*b
Bioconductor version 3.2 (BiocInstaller 1.19.9), ?biocLite for help
Loading required package: digest
Loading required package: tools
Loading required package: utils
Loading required package: codetools
* installing to library ‘/Library/Frameworks/R.framework/Versions/3.2/Resources/library’
* installing *source* package ‘greengenes13.5MgDb’ ...
** R
** preparing package for lazy loading
Warning in .recacheSubclasses(def@className, def, doSubclasses, env) :
  undefined subclass "externalRefMethod" of class "functionORNULL"; definition not updated
** help
No man pages found in package ‘greengenes13.5MgDb’
*** installing help indices
** building package indices
** testing if installed package can be loaded
Bioconductor version 3.2 (BiocInstaller 1.19.9), ?biocLite for help
Loading required package: digest
Loading required package: tools
Loading required package: utils
Loading required package: codetools
Warning in .recacheSubclasses(def@className, def, doSubclasses, env) :
  undefined subclass "externalRefMethod" of class "functionORNULL"; definition not updated
/gg_13_5.fasta.gz: Permission denied

On Tue, Aug 4, 2015 at 9:43 AM, Nathan Olson <nathandavidolson at gmail.com> wrote:
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Thanks Vince, I think we just fixed that: https://github.com/HCBravoLab/greengenes13.5MgDb/issues/1#issuecomment-127649449

Cheers,
Hector

On Tue, Aug 4, 2015 at 10:45 AM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
On 08/04/2015 06:43 AM, Nathan Olson wrote:
The greengenes13.5MgDb package loads a mgDb object with the greengenes 13.5 database.

Does it make sense to use AnnotationHub to manage these resources? Instead of downloading and managing the FASTA and taxonomy files in .onLoad and getGreenGenes13.5Db, .onLoad would be

    hub = AnnotationHub()
    db_seq = hub[["AH12345"]]
    db_taxa_file = hub[["AH12346"]]

with a 'recipe' describing how the corresponding AnnotationHub resources are to be created. This would move download and management to AnnotationHub, and potentially allow use of the AnnotationHub records by people with other interests. If that sounds interesting we can work up a pull request.

Martin
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
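The retrieval step proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: "AH12345" and "AH12346" are the placeholder record IDs from the message, not real AnnotationHub records; the function name get_db_resources and the taxonomy file name are hypothetical; and a plain named list stands in for the hub so the lookup can be tried offline.

```r
# Sketch only: look up the sequence and taxonomy resources by hub ID.
# "AH12345" / "AH12346" are placeholders taken from the thread, not real IDs.
get_db_resources <- function(hub = NULL,
                             seq_id = "AH12345",
                             taxa_id = "AH12346") {
    if (is.null(hub)) {
        # Needs the Bioconductor AnnotationHub package and network access.
        hub <- AnnotationHub::AnnotationHub()
    }
    # `[[` retrieves (and, for a real hub, downloads and caches) one record.
    list(db_seq = hub[[seq_id]], db_taxa_file = hub[[taxa_id]])
}

# Offline stand-in for a hub: a named list supports the same `[[` lookup.
# The taxonomy file name below is made up for the example.
fake_hub <- list(AH12345 = "gg_13_5.fasta.gz",
                 AH12346 = "gg_13_5_taxonomy.txt.gz")
res <- get_db_resources(fake_hub)
```

Keeping the hub as an argument means the lookup can be called explicitly by the user, or from a package hook, without changing the function itself.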
On Tue, Aug 4, 2015 at 3:00 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
does it make sense to use AnnotationHub to manage these resources?
I would think so. At this time, when trying to install the greengenes13.5MgDb package, the process "testing whether the package can be loaded" takes a very long time -- I suspect it is doing some silent downloading. IMHO such activities should be explicitly undertaken by the user.
Instead of downloading and managing the fasta and taxonomy files in .onLoad and getGreenGenes13.5Db, .onLoad would be

    hub = AnnotationHub()
    db_seq = hub[["AH12345"]]
    db_taxa_file = hub[["AH12346"]]
With this setup the first installation of the package could involve a long download, silent by default. It's feasible but quite unusual.
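One way to make the download an explicit, user-initiated step, as suggested here, is sketched below in base R. It is an assumption-laden illustration, not the package's code: fetch_db_file, the example.org URL, and the cache directory name are all hypothetical, and a fake downloader replaces utils::download.file so the example runs without network access.

```r
# Hypothetical helper: fetch a database file only when the user asks for it,
# announce the download, and reuse the cached copy on later calls.
fetch_db_file <- function(url, cache_dir = tempdir(),
                          downloader = utils::download.file) {
    dest <- file.path(cache_dir, basename(url))
    if (!file.exists(dest)) {
        message("Downloading ", url, " to ", dest)
        downloader(url, dest)
    }
    dest
}

# Offline demonstration with a fake downloader that counts its calls.
calls <- 0
fake_download <- function(url, destfile) {
    calls <<- calls + 1
    writeLines(c(">seq1", "ACGT"), destfile)
    invisible(0)
}

cache <- file.path(tempdir(), "mgdb_cache_demo")
dir.create(cache, showWarnings = FALSE)
unlink(file.path(cache, "gg_13_5.fasta.gz"))  # start from an empty cache

demo_url <- "https://example.org/gg_13_5.fasta.gz"  # made-up URL
p1 <- fetch_db_file(demo_url, cache, fake_download)  # triggers the download
p2 <- fetch_db_file(demo_url, cache, fake_download)  # served from the cache
```

Nothing is fetched when the package is installed or loaded; the message makes the first, possibly long, download visible to the user.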
Thanks, Martin. I agree that using AnnotationHub to manage the db resources is a better option than how it is currently set up. A pull request would be much appreciated.

On Tue, Aug 4, 2015 at 3:14 PM Vincent Carey <stvjc at channing.harvard.edu> wrote: