[Bioc-devel] serializing pairwise alignment objects

Indeed. I did not look that far into the implementation, it just seemed odd
to me that the objects got so inflated. scoreOnly is not really that
helpful if you want to deal with the actual alignments. The only
reasonable application I see for it is if you want to rank a bunch of
sequences by pairwise similarity. This gigantic memory footprint is really
breaking things once you start doing a lot of these pairwise alignment
operations in parallel. mclapply complains about not being able to turn
such large objects into a raw vector, and serializing to disk quickly
fills your hard drive. You also lose a lot of the time gained by parallel
processing just by writing and loading gigabytes of data...
I don't know enough about the internals of the PairwiseAlignments classes,
but it seems that there must be a way to avoid having this huge array as
part of the object. As a quick and dirty fix for now I just replaced the
substitutionArray slot with an empty matrix and all the downstream
operations that I wanted to do still work. It would be great if you could
look into this, Herve.
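
To illustrate the workaround, here is a minimal sketch on a toy S4 class. ToyAlignment and its score slot are hypothetical stand-ins, not Biostrings classes, and the slot is assumed to hold an array; only the slot name substitutionArray is taken from the real objects. The idea is simply to empty the large slot before the object is serialized or handed back from a worker:

```r
library(methods)

# Hypothetical stand-in for an alignment object: a small result plus one
# large array slot, mimicking the layout described above.
setClass("ToyAlignment",
         representation(score = "numeric", substitutionArray = "array"))

aln <- new("ToyAlignment",
           score = 42,
           substitutionArray = array(runif(1e6), dim = c(100, 100, 100)))

# Size in bytes of the serialized object with the large slot in place.
size_full <- length(serialize(aln, NULL))

# Replace the large slot with an empty array of the same class.
aln@substitutionArray <- array(numeric(0), dim = c(0, 0, 0))

# Size in bytes after emptying the slot.
size_slim <- length(serialize(aln, NULL))

validObject(aln)                  # the object is still structurally valid
stopifnot(size_slim < size_full)  # and far smaller to serialize
```

With a real alignment object the same slot assignment would apply, assuming the installed Biostrings version still has that slot and nothing downstream reads it.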
Thanks,
Florian