
[Bioc-devel] Biostrings: List with DNAString does not unlist

2 messages · Julian Gehring, Hervé Pagès

#
Hi,

Unlisting does not work on a list with a 'DNAString' as an element; the
resulting object is still a list. Is this behavior intentional? Here is
an example that reproduces the issue in both the latest
R-devel/bioc-devel and R-2.15.3/bioc-stable.

   library(Biostrings)
   d = DNAString("TTGAAAA-CTC-N")
   class(d) ## -> DNAString
   l = list(d)
   class(l) ## -> list
   u = unlist(l)
   class(u) ## -> still a list, should be the same as 'd'

Best wishes
Julian
#
Hi Julian,
On 04/04/2013 03:50 AM, Julian Gehring wrote:
You're putting an S4 object inside an ordinary list, so when you
call unlist() on that list you are actually calling base::unlist(),
which doesn't know how to handle list elements that are S4 objects.
Maybe it should. Conceptually, non-recursive unlisting is equivalent
to 'do.call(c, l)', so it would be expected to work as long as
combining the individual list elements with c() works.
Maybe base::unlist() could be improved to handle this situation, but
that would take someone with enough motivation to bring it up on the
R-devel mailing list.
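
The same behavior can be reproduced without Biostrings. The sketch below
uses a hypothetical toy S4 class, 'ToySeq', and a hand-written c() method
for it; both are assumptions made up for illustration and are not part of
Biostrings or of how DNAString is implemented.

```r
library(methods)

# Hypothetical toy S4 class standing in for DNAString (illustration only).
setClass("ToySeq", representation(seq = "character"))

# Give c() an S4 method so that ToySeq objects can be combined.
setMethod("c", "ToySeq", function(x, ...) {
    parts <- c(x@seq, vapply(list(...), function(e) e@seq, character(1)))
    new("ToySeq", seq = paste(parts, collapse = ""))
})

l <- list(new("ToySeq", seq = "TTGA"), new("ToySeq", seq = "ATTG"))

u1 <- unlist(l)        # base::unlist() does not combine the S4 elements
u2 <- do.call(c, l)    # calls c(l[[1]], l[[2]]), which uses the method above

class(u1)
class(u2)
u2@seq
```

Here do.call(c, l) only works because a c() method exists for the class
of the first list element; without one, it would be no better than
unlist().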

In the meantime you can unlist your ordinary list with:

   > do.call(c, l)
     13-letter "DNAString" instance
   seq: TTGAAAA-CTC-N

Or, even better, use a DNAStringSet object to store a list of DNAString
objects. That's exactly what a DNAStringSet object is:

   > dna <- DNAStringSet(c("TTGAAAA-CTC-N", "", "ATTG"))

   > dna
     A DNAStringSet instance of length 3
       width seq
   [1]    13 TTGAAAA-CTC-N
   [2]     0
   [3]     4 ATTG

   > is(dna, "List")
   [1] TRUE

   > elementType(dna)
   [1] "DNAString"

   > dna[[1]]
     13-letter "DNAString" instance
   seq: TTGAAAA-CTC-N

   > unlist(dna)
     17-letter "DNAString" instance
   seq: TTGAAAA-CTC-NATTG

Using a DNAStringSet should generally be much more efficient than
using an ordinary list, not only for unlisting, but for many other
operations.

HTH,

H.