[Bioc-devel] GO.db package data source
Further to this point: comparing against the latest OBO file from geneontology.org, it looks like the current GO.db has just over 1000 GO IDs that are no longer in GO, and the GO OBO file has almost 500 GO IDs that are not in GO.db.
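For anyone who wants to repeat this kind of comparison, here is a minimal sketch in Python. It is not the script behind the numbers above. It assumes the GO IDs have already been exported from GO.db by some other means, that stanzas in the OBO file are separated by blank lines, and that terms flagged `is_obsolete: true` should count as no longer in GO. The function names and the sample IDs and term names are placeholders made up for illustration.

```python
import re

# GO identifiers look like "GO:" followed by seven digits.
GO_ID = re.compile(r"^GO:\d{7}$")


def parse_obo_term_ids(obo_text):
    """Return (current_ids, obsolete_ids) found in the [Term] stanzas.

    Assumes stanzas are separated by blank lines, as in go.obo.
    """
    current, obsolete = set(), set()
    for stanza in re.split(r"\n\s*\n", obo_text):
        lines = [ln.strip() for ln in stanza.strip().splitlines()]
        # Skip the file header, [Typedef] stanzas, and empty chunks.
        if not lines or lines[0] != "[Term]":
            continue
        term_id = None
        is_obsolete = False
        for ln in lines[1:]:
            if ln.startswith("id:"):
                term_id = ln[len("id:"):].strip()
            elif ln.startswith("is_obsolete:"):
                is_obsolete = ln.split(":", 1)[1].strip() == "true"
        if term_id and GO_ID.match(term_id):
            (obsolete if is_obsolete else current).add(term_id)
    return current, obsolete


def compare_ids(package_ids, obo_text):
    """Compare IDs exported from the package with those in the OBO text."""
    current, obsolete = parse_obo_term_ids(obo_text)
    package_ids = set(package_ids)
    return {
        # In the package but not a current term in the OBO file.
        "only_in_package": sorted(package_ids - current),
        # Current in the OBO file but missing from the package.
        "only_in_obo": sorted(current - package_ids),
        # In the package and explicitly marked obsolete in the OBO file.
        "obsolete_in_package": sorted(package_ids & obsolete),
    }


# Placeholder data for illustration only, not real GO content.
SAMPLE_OBO = """format-version: 1.2

[Term]
id: GO:0000001
name: example term one

[Term]
id: GO:0000002
name: obsolete example term
is_obsolete: true

[Term]
id: GO:0000003
name: example term three

[Typedef]
id: part_of
name: part of
"""

SAMPLE_PACKAGE_IDS = ["GO:0000001", "GO:0000002", "GO:0000004"]

result = compare_ids(SAMPLE_PACKAGE_IDS, SAMPLE_OBO)
for key, ids in result.items():
    print(key, len(ids), ids)
```

With a real file, the text read from a downloaded go.obo would replace SAMPLE_OBO and the exported package IDs would replace SAMPLE_PACKAGE_IDS. Whether obsolete terms should count as missing is a judgment call, which is why they are reported separately here.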
On Wed, Apr 1, 2020 at 12:11 PM James W. MacDonald <jmacdon at uw.edu> wrote:
Are we still using the scripts in BioconductorAnnotationPipeline/go/scripts to download GO data and create the GO.db package? If so, that is likely a problem that will only get worse with time. Apparently geneontology.org is no longer generating the SQL dumps that the go scripts rely on, so whatever we download is outdated.

There have been some complaints to the helpdesk about the data (https://github.com/geneontology/helpdesk/issues/4), where they discuss a new RDF-based pipeline, though that may not have ended up being the pipeline they adopted. Apparently they are now using OBO or OWL (http://geneontology.org/docs/download-ontology/) for the downloadable data, so we should consider switching.

I bring this up because the current release of GO.db is apparently missing terms that were added as far back as 2018.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099