Message-ID: <CAOQ5Nye0=D++oQRkCCYSzy0=OcftXerp8Vrx27Bohefr5BCLrQ@mail.gmail.com>
Date: 2016-01-08T13:40:19Z
From: Michael Lawrence
Subject: [Bioc-devel] Problem with seqnames of TwoBitFile from AnnotationHub
In-Reply-To: <7443D1B9-409E-4D2E-AAB4-1FB1F2FA4CFA@eurac.edu>
This is perhaps something that could be handled when populating the
hub, but I'm not sure how rtracklayer could automatically derive the
chromosome names.
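
As a rough illustration only: if one assumes the chromosome name is always
the first whitespace-delimited token of the FASTA header (true for the six
headers quoted below, but not guaranteed for every source), the names could
be trimmed with base R. The vector `hdr` and the result `chr` are
hypothetical names used just for this sketch.

```r
# Two of the header strings reported in the original message
hdr <- c(
  "1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF",
  "10 dna:chromosome chromosome:GRCh38:10:1:133797422:1 REF"
)

# Assumption: the chromosome name is everything before the first
# whitespace character, so drop the first whitespace and all that follows
chr <- sub("\\s.*$", "", hdr)
chr
```

This only cleans the character vector; it does not modify the 2bit file
itself. The trimmed names would still have to be applied to the sequences
before the file is written, which is a step I have not tested against
AH50068.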
On Fri, Jan 8, 2016 at 2:37 AM, Rainer Johannes
<Johannes.Rainer at eurac.edu> wrote:
> dear all,
>
> I just ran into a problem with a TwoBitFile I fetched from AnnotationHub. I was fetching a TwoBitFile with the genomic DNA sequence, as provided by Ensembl:
>
>> library(AnnotationHub)
>> ah <- AnnotationHub()
>> tbf <- ah[["AH50068"]]
>
>> head(seqnames(seqinfo(tbf)))
> [1] "1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF"
> [2] "10 dna:chromosome chromosome:GRCh38:10:1:133797422:1 REF"
> [3] "11 dna:chromosome chromosome:GRCh38:11:1:135086622:1 REF"
> [4] "12 dna:chromosome chromosome:GRCh38:12:1:133275309:1 REF"
> [5] "13 dna:chromosome chromosome:GRCh38:13:1:114364328:1 REF"
> [6] "14 dna:chromosome chromosome:GRCh38:14:1:107043718:1 REF"
>
> It would be nice if the seqnames were really just the chromosome names and not the whole string from the FA file header. Is there a way I could fix the file myself, or is this something that should be fixed in the rtracklayer or AnnotationHub package when the TwoBitFile is created?
>
> thanks, jo