Memoize and vectorize a custom function

3 messages · Kamil Slowikowski, Martin Morgan, Henrik Bengtsson

#
On 04/26/2012 03:21 PM, Kamil Slowikowski wrote:
Hi Kamil --

Not really an answer to your question, but looking at

   http://bioconductor.org/packages/2.10/bioc/html/Biostrings.html

will tell you to install Biostrings with

   source("http://bioconductor.org/biocLite.R")
   biocLite("Biostrings")

and then

   library(Biostrings)
   dna = DNAStringSet(c("","G","C","CCC","T","","TTCCT","","C","CTC"))
   alf = alphabetFrequency(dna, as.prob=TRUE, baseOnly=TRUE)
   rowSums(alf[,c("G", "C")])

will give you the GC content of each string (NaN for empty strings).

 > rowSums(alf[,c("G", "C")])
  [1]       NaN 1.0000000 1.0000000 1.0000000 0.0000000       NaN 0.4000000
  [8]       NaN 1.0000000 0.6666667
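
For comparison, the same per-string calculation can be sketched in base R. This is only an illustration, not how Biostrings does it: `gc_content` is a hypothetical helper name, and it assumes the input is a character vector containing only the letters A, C, G and T (any other letter would count toward the denominator here).

```r
# Hypothetical base-R helper (not part of Biostrings): GC fraction per string.
gc_content <- function(x) {
  x <- toupper(x)
  # Delete everything that is not G or C, then count what is left.
  gc <- nchar(gsub("[^GC]", "", x))
  # Empty strings give 0/0, which R evaluates to NaN.
  gc / nchar(x)
}

seqs <- c("", "G", "C", "CCC", "T", "", "TTCCT", "", "C", "CTC")
gc_content(seqs)
```

Because `gsub` and `nchar` are already vectorized, no explicit loop is needed; for large inputs the Biostrings route above is still likely to be the better choice.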

This will be fast and scalable; Biostrings and other Bioconductor
(http://bioconductor.org) packages have many useful functions for
working with DNA.

See the Bioconductor mailing list for more help if this is a promising 
direction.

   http://bioconductor.org/help/mailing-list/

Martin

#
On Thu, Apr 26, 2012 at 3:21 PM, Kamil Slowikowski
<kslowikowski at gmail.com> wrote:
About R.cache: all memoization by R.cache is currently done to the
file system.  In other words, it is designed for larger objects (too
large to hold the whole cache in memory) and for more computationally
expensive tasks.
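
For small, cheap results, an in-memory cache may be a closer fit. The following is a minimal base-R sketch of that idea, not R.cache's API: `memoize_chr`, `slow_gc` and `fast_gc` are hypothetical names, and it assumes the wrapped function takes a single string and returns a single number.

```r
# Hypothetical in-memory memoizer (not R.cache): results are kept in an
# environment and last only for the current R session.
memoize_chr <- function(f) {
  cache <- new.env(hash = TRUE, parent = emptyenv())
  function(x) {
    stopifnot(is.character(x), !anyNA(x))
    keys <- paste0("k", x)  # prefix so the empty string is a valid key
    # Evaluate f once per distinct input that is not cached yet.
    for (i in which(!duplicated(keys))) {
      if (!exists(keys[i], envir = cache, inherits = FALSE)) {
        assign(keys[i], f(x[i]), envir = cache)
      }
    }
    # Look up every element, so the wrapper is vectorized over x.
    vapply(keys, function(k) get(k, envir = cache, inherits = FALSE),
           numeric(1), USE.NAMES = FALSE)
  }
}

calls <- 0
slow_gc <- function(s) {
  calls <<- calls + 1  # count real evaluations, for illustration only
  nchar(gsub("[^GC]", "", s)) / nchar(s)
}
fast_gc <- memoize_chr(slow_gc)

fast_gc(c("", "G", "C", "CCC", "T", "", "TTCCT", "", "C", "CTC"))
calls  # 7: one evaluation per distinct string
```

Very long strings would make poor keys here, since R limits the length of variable names; hashing the input first would be one way around that.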

/Henrik