On Mon, Aug 27, 2012 at 10:24 AM, Ryan C. Thompson <rct at thompsonclan.org> wrote:
OK, I put the patch in a GitHub gist, since the list does not seem to accept patches as attachments: https://gist.github.com/3490557
I'm trying to ensure that the list supports this type of attachment. It should accept it this time.
Dan
On 08/27/2012 09:32 AM, Ryan C. Thompson wrote:
It looks like the attachment was scrubbed from my initial message. Here is another attempt to send it.
On Mon 27 Aug 2012 08:50:03 AM PDT, Ryan C. Thompson wrote:
Hi all,

I recently found that rtracklayer's GFF3 file reader was unable to read GFF3 files produced by Cufflinks. I tracked the problem down to the occurrence of equals signs in tag values. For example, the following line was problematic:

C123300344 Cufflinks transcript 1 132 . - . ID=TCONS_00000337;geneID=XLOC_000337;oId=ENSMMUP00000032229;nearest_ref=ENSMMUP00000032229;class_code==;tss_id=TSS337;p_id=P1

The failure is due to the "class_code==" part (the value of the class code is actually an equals sign). The bug occurs because "strsplit" doesn't stop after the first split, but keeps splitting at subsequent occurrences of the separator.

I have modified the reader to handle this case, which as far as I know is perfectly valid GFF3. Instead of strsplit, I use regexpr to find only the *first* occurrence of an equals sign in each tag=value pair, and then I use substr to extract the parts of the pair before and after that equals sign.

The attached file is a patch against "R/gff.R" in the rtracklayer distribution. I developed the patch against version 1.16.1.

Regards,
-Ryan Thompson
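
For readers without access to the patch, here is a minimal sketch of the approach described above. It is not the actual change to "R/gff.R": the helper name split_first_equals and the shortened attribute string are assumptions made for illustration. Only base R functions (strsplit, regexpr, substr) are used.

```r
# Illustrative sketch only; not the code from the rtracklayer patch.
# Shortened attribute column from the Cufflinks line quoted above.
attr_field <- "ID=TCONS_00000337;geneID=XLOC_000337;class_code==;tss_id=TSS337;p_id=P1"

# Break the attribute column into individual tag=value pairs.
pairs <- strsplit(attr_field, ";", fixed = TRUE)[[1]]

# Naive approach: split every pair at every "=".
# For "class_code==" the "=" value cannot survive, because each "="
# is consumed as a separator.
naive_class_code <- strsplit("class_code==", "=", fixed = TRUE)[[1]]

# Hypothetical helper: split each pair at the FIRST "=" only.
split_first_equals <- function(x) {
  # Position of the first "=" in each element; -1 where there is none.
  # as.integer() drops the match attributes that regexpr() attaches.
  pos <- as.integer(regexpr("=", x, fixed = TRUE))
  has_eq <- pos > 0
  # Everything before the first "=" is the tag ...
  tag <- ifelse(has_eq, substr(x, 1, pos - 1), x)
  # ... and everything after it, including any further "=", is the value.
  value <- ifelse(has_eq, substr(x, pos + 1, nchar(x)), NA_character_)
  data.frame(tag = tag, value = value, stringsAsFactors = FALSE)
}

parsed <- split_first_equals(pairs)
print(parsed)
```

With this helper, the class_code row of parsed keeps "=" as its value, while the naive split has no element equal to "=".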
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel