speeding up 1000s of coxph regression?

3 messages · Xiao-Jun Ma, Thomas Lumley, A.J. Rossini

#
I have a gene expression matrix of n (genes) x p (cases), where n = 8000 and
p = 100. I want to fit each gene as the single covariate in a coxph model,
i.e., fit 8000 univariate models. I do something like this:

res <- apply(data, 1, coxph.func)

which takes about 4 min, not bad. But I need to do a large number of
permutations of the data (permuting the columns), for example 2000, which
would take 5 days. I would like to know if there is a way to speed this up.

Any help appreciated.

Xiao-Jun
#
On Tue, 10 Jun 2003, Xiao-Jun Ma wrote:

            
Calling coxph.fit directly would likely be faster.
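
A minimal sketch of that idea, assuming the installed survival package
exports coxph.fit with the argument names used below (worth checking with
args(coxph.fit) first). The simulated data and the helper name fit_gene are
made up for illustration and are not from the original post:

```r
library(survival)

# Simulated stand-in data (hypothetical): 50 genes x 100 cases
set.seed(1)
n_genes <- 50
n_cases <- 100
expr   <- matrix(rnorm(n_genes * n_cases), nrow = n_genes)
time   <- rexp(n_cases)
status <- rbinom(n_cases, 1, 0.7)

y    <- Surv(time, status)   # build the response once, not once per gene
ctrl <- coxph.control()      # default iteration settings

# Hypothetical helper: fit one gene with coxph.fit, skipping the formula and
# model-frame handling that coxph() repeats on every call.
fit_gene <- function(g) {
  fit <- coxph.fit(x = matrix(g, ncol = 1), y = y, strata = NULL,
                   offset = NULL, init = NULL, control = ctrl,
                   weights = NULL, method = "efron", rownames = NULL)
  c(coef = unname(fit$coefficients), se = sqrt(fit$var[1, 1]))
}

res <- t(apply(expr, 1, fit_gene))   # one row per gene: coef, se
```

The coefficients should agree with those from coxph(Surv(time, status) ~ g)
for the same gene; how much time this saves depends on the data and the
survival version, so it is worth timing on a subset before committing to it.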

Also, you probably don't need to do 2000 permutations on all 8000 genes: a
few hundred permutations is probably enough to decide that most of the
genes aren't interesting.
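
One way to read this is as a staged scheme: run a modest number of
permutations on every gene, then spend further permutations only on the
genes that still look promising. A rough sketch follows. The simulated data,
the Wald z statistic, the number of permutations and the 0.05 cutoff are all
illustrative assumptions, not part of the advice above:

```r
library(survival)

# Simulated stand-in data (hypothetical): 20 genes x 60 cases
set.seed(2)
n_genes <- 20
n_cases <- 60
expr   <- matrix(rnorm(n_genes * n_cases), nrow = n_genes)
time   <- rexp(n_cases)
status <- rbinom(n_cases, 1, 0.7)

# Wald z statistic of the single covariate, one value per gene
z_stats <- function(time, status) {
  apply(expr, 1, function(g) {
    fit <- coxph(Surv(time, status) ~ g)
    unname(coef(fit) / sqrt(vcov(fit)[1, 1]))
  })
}

z_obs <- z_stats(time, status)

# Permuting the columns of expr is equivalent to permuting the
# (time, status) pairs jointly, which avoids copying the matrix.
B <- 100                       # first stage; small here to keep the demo fast
exceed <- numeric(n_genes)
for (b in seq_len(B)) {
  idx    <- sample(n_cases)
  z_perm <- z_stats(time[idx], status[idx])
  exceed <- exceed + (abs(z_perm) >= abs(z_obs))
}
p_perm <- (exceed + 1) / (B + 1)   # add-one permutation p-value per gene

keep <- which(p_perm < 0.05)   # only these genes would get more permutations
```

Genes that are dropped after the first stage cost B fits each instead of
2000, which is where most of the saving would come from.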

If you are going to be doing a lot of this sort of thing it might be worth
looking at the parallel processing facilities in the `snow' package.
There's a description of their use in another gene expression problem in
the new R Newsletter.
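
As a rough illustration of what that could look like, the sketch below
spreads the per-gene fits over two local socket workers with snow's
parRapply. The simulated data, the helper name fit_one and the choice of two
"SOCK" workers are assumptions made for the example:

```r
library(snow)

# Simulated stand-in data (hypothetical): 40 genes x 80 cases
set.seed(3)
n_genes <- 40
n_cases <- 80
expr   <- matrix(rnorm(n_genes * n_cases), nrow = n_genes)
time   <- rexp(n_cases)
status <- rbinom(n_cases, 1, 0.7)

# Hypothetical per-gene function. The survival:: prefixes let it run on
# workers that have survival installed but not attached.
fit_one <- function(g, time, status) {
  fit <- survival::coxph(survival::Surv(time, status) ~ g)
  unname(fit$coefficients)
}

cl <- makeCluster(2, type = "SOCK")   # two local worker processes

# parRapply splits the rows of expr across the workers and applies fit_one
# to each row; the extra arguments are passed along to every call.
res_par <- parRapply(cl, expr, fit_one, time, status)

stopCluster(cl)
```

For the permutation problem it may pay to hand each worker a block of whole
permutations rather than a block of genes, since that keeps communication
per task small; which split is faster depends on the cluster.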


	-thomas
#
Thomas Lumley <tlumley at u.washington.edu> writes:
Actually, they used the RPVM package directly; however, Thomas is
still correct, it probably would be simple to recast using SNOW.

Some hints and details can be found in a tech report by Luke Tierney,
Michael Li, and myself in the UW Biostat tech report series (can't
recall which #, but it's on http://www.bepress.com/uwbiostat/).

best,
-tony