Skip to content

when vectorising does not work: silent function fail?

5 messages · Federico Calboli, jim holtman, Thomas Lumley +1 more

#
Dear All,

I'm using apply to do some genetic association analysis along a chromosome, with 
many thousands markers. For each marker the analysis is the same, so I was 
planning to use apply(chrom, 2, somefunction)

In the specific case I do:

my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc + age 
+ sex + mri))[1,3]})

This is all good and well in theory, but in practice the lrm() model will fail 
for some markers and wreck the whole process. Failure for some markers is no 
problem for me, but *predicting* which markers will fail can be hugely problematic.

I then though of creating some fucntion to catch the error messages that would 
otherwise scr*w things over:

my.lrm = function(x){
pol = NULL
pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri)
if(length(pol) > 0)
rez = anova(pol)[1,3]
if(length(pol)  == 0)
rez = 1
rez}

my.results = apply(chr, 2, my.lrm)

Still no joy, even adding try() in the evaluation and 
options(show.error.messages = F)

I am at loss on how to get the darn function to bail out *silently* if needs be 
so I can just smack a replacement value in --which would also have the benefit 
of keeping the order of the markers.

Any idea will be gratefully acknowledged.

Best,

Federico
#
Have you tried something like this:

my.results = apply(chr, 2, function(x){
    result <- try(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3])
    if (inherits(result, "try-error")) return(NULL)
    result
})

This should catch the error and have NULL in that list element.

On Tue, Nov 10, 2009 at 10:04 AM, Federico Calboli
<f.calboli at imperial.ac.uk> wrote:

  
    
#
On Tue, 10 Nov 2009, jim holtman wrote:

            
It's probably more useful to return a numeric vector of the right size, with tryCatch()

    tryCatch(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3],
             error=function(e) NA)

Also, you probably get less data copying by using a for() or while() loop than by using apply() in this context.

Finally, the overhead of formula parsing and model matrix construction repeated thousands of times probably dominates this computation; if it isn't just a one-off it would probably be worth a lower-level implementation.

      -thomas
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
#
On 10 Nov 2009, at 17:16, tlumley at u.washington.edu wrote:

            
Both suggestions work beautifully, thanks to both!
I'm happy to take this in, I am not sure I can see how it would work  
(the less data copying) though.
The markers are ~ 1.5M, but I don't do this kind of analysis often  
enough to justify any more work than a measly script.

Cheers,

Fede
--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
4 days later
#
Why may there be less data copying with "for" and "while" compared to apply?
Does "lower-level implementation" mean code this outside of R.

Thanks!

Juliet