
joining envelope objects for parallel computing. Possible?

Thanks for the ideas I received,

@Rolf:

I have already managed to parallelize the sim_pat_list.

Calculating the summary functions is indeed generally faster,
but in my case it still takes up about 20% (or 2 hours) of the total time (about 10 hours) so far.

However, I must say that most of the time is spent in the
alltypes() function (for bivariate point patterns) and not so much in the envelope() function.


I hope the feedback I received also applies to alltypes().

I will try it,

Jan   




________________________________________
From: Rolf Turner [r.turner at auckland.ac.nz]
Sent: Monday 12 September 2011 12:39
To: Quets Jan
CC: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] joining envelope objects for parallel computing. Possible?
On 12/09/11 21:03, Quets Jan wrote:
See fortune("This is R")

     You could do it, I think, at the expense of writing a bit of additional
     code.  Something along the following lines:

     * For each of your calls to envelope, use the argument savefuns=TRUE.
     * From each of the ENV_OBJ_j, extract the "simfuns" attribute.
     * Change the class of each extracted "simfuns" attribute to "data.frame"
        (from c("fv","data.frame")), discard the "r" columns, and then cbind
        them all together; make the result into a *matrix* rather than a
        data frame.  Call it, say, "M".
     * apply the appropriate function across the rows of M, to obtain "lo"
        and "hi"; e.g.

             LH <- t(apply(M, 1, function(x, m) {
                 x <- sort(x)
                 c(x[m], x[length(x) - m + 1])
             }, m = 5))

        The value "m=5" corresponds to setting "nrank=5" in a "direct" call to
        envelope().  The first column of LH is "lo"; the second is "hi".

     * These vectors, "lo" and "hi", are then the lower and upper bounds
        respectively of the required envelope.

     The foregoing can be wrapped up in a convenient function.
     Exercise for the reader! :-)
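
     One possible way to wrap the steps above into a function is sketched
     below.  This is only a minimal sketch, not a tested spatstat utility:
     the name pool_envelope_bounds() and its arguments are hypothetical,
     and it assumes that every envelope object was computed with
     savefuns=TRUE, that all of them share the same "r" values, and that
     the argument column of the "simfuns" attribute is called "r".

```r
# Hypothetical helper (not part of spatstat): pool the simulated summary
# functions saved in several envelope objects and recompute the bounds.
pool_envelope_bounds <- function(env_list, nrank = 1) {
  # Extract the simulated functions from each envelope as a plain matrix.
  sims <- lapply(env_list, function(e) {
    sf <- attr(e, "simfuns")
    if (is.null(sf)) stop("envelope was not computed with savefuns = TRUE")
    class(sf) <- "data.frame"            # drop the "fv" class
    as.matrix(sf[, names(sf) != "r", drop = FALSE])
  })

  # One column per simulation, one row per value of r.
  M <- do.call(cbind, sims)
  stopifnot(nrank >= 1, nrank <= ncol(M))

  # Pointwise nrank-th smallest and nrank-th largest simulated values.
  LH <- t(apply(M, 1, function(x, m) {
    x <- sort(x)
    c(lo = x[m], hi = x[length(x) - m + 1])
  }, m = nrank))

  # Assumption: all envelopes share the r values of the first one.
  r <- unclass(attr(env_list[[1]], "simfuns"))$r
  data.frame(r = r, lo = unname(LH[, "lo"]), hi = unname(LH[, "hi"]))
}
```

     With envelope objects ENV_OBJ_1, ENV_OBJ_2, ... computed on separate
     workers, a call such as
     pool_envelope_bounds(list(ENV_OBJ_1, ENV_OBJ_2), nrank = 5)
     would return a data frame with columns "r", "lo" and "hi".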

     HTH

         cheers,

             Rolf

P. S.  Not clear to me that this is worth doing; the time consuming part
is the simulation of the patterns, which you've already got in the
"sim_pat_list".  Once you have the patterns, calculating the summary
functions is usually very fast, and hence not worth parallelizing. Unless
you have an extraordinarily complicated setting.

             R.