Parallel linear model
On 22 August 2012 at 23:22, Norm Matloff wrote:
| In rereading your posting now, Dirk, I suddenly realized that there is
| one aspect of this that I'd forgotten about: an ordinary call to
| system.time() does not display all the information returned by that
| function!
|
| That odd statement is of course due to the fact that the print
| method for objects of class proc_time displays only 3 of the 5 numbers.
| If one actually looks at the 5 numbers individually, you can separate
| the time of the parent process from the sum of the child times. That
| separation is apparently what rbenchmark gives you, right?
|
| As I said earlier, the quick-and-dirty way to handle this is to use the
| elapsed time, typically good enough (say, on a dedicated machine). After
| all, if we are trying to develop a fast parallel algorithm, what the
| potential users of the algorithm care about is essentially the elapsed
| time.

That seems fair in most cases.

| But at the other extreme, a very fine timing goal might be to try to
| compute what is called the makespan, which in this case would be the
| maximum of all the child times, rather than the sum of the child times.
| I say "try" because I don't see any systems way to accomplish this,
| short of inserting calls to something like clock_gettime() inside each
| thread.

Maybe you could look at what microbenchmark does [as it covers all the
OS-level dirty work] and see if it generalizes to multiple machines?

Dirk

| Norm
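Both points can be sketched in a few lines (this example is not from the thread; Sys.sleep(1) stands in for real work, and mclapply's forked children require a Unix-alike):

```r
library(parallel)

## The proc_time print method shows only 3 of the 5 numbers;
## unclass() (or summary()) also exposes user.child and sys.child,
## separating the parent's time from the sum of the child times.
tt <- system.time(mclapply(1:8, function(x) Sys.sleep(1), mc.cores = 8))
print(unclass(tt))

## Approximating the makespan: take a wall-clock timestamp inside
## each child and report the maximum of the per-task durations.
child_secs <- mclapply(1:8, function(x) {
  t0 <- Sys.time()
  Sys.sleep(1)  # the actual work would go here
  as.numeric(difftime(Sys.time(), t0, units = "secs"))
}, mc.cores = 8)
max(unlist(child_secs))  # makespan estimate
```

This only approximates the makespan (Sys.time() has coarser resolution than clock_gettime() and includes scheduling jitter), but it avoids dropping into C from each worker.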
| On Wed, Aug 22, 2012 at 07:53:02PM -0500, Dirk Eddelbuettel wrote:
| > The difference between user and elapsed is old hat. Here is a great
| > example (and IIRC first shown here by Simon) with no compute time:
| >
| > R> system.time(mclapply(1:8, function(x) Sys.sleep(1))) ## 2 cores by default
| >    user  system elapsed
| >   0.000   0.012   4.014
| > R> system.time(mclapply(1:8, function(x) Sys.sleep(1), mc.cores=8))
| >    user  system elapsed
| >   0.012   0.020   1.039
| > R>
| >
| > so elapsed time is effectively the one second a Sys.sleep(1) takes, plus
| > overhead, if we allow for all eight (hyperthreaded) cores here. By Brian
| > Ripley's choice a default of two is baked in, so clueless users only get a
| > small gain. "User time" is roughly the actual system load _summed over all
| > processes / threads_.
| >
| > With that, could I ask any of the participants in the thread to re-try with a
| > proper benchmarking package such as rbenchmark or microbenchmark? Either one
| > beats the socks off system.time:
| >
| > R> library(rbenchmark)
| > R> benchmark(mclapply(1:8, function(x) Sys.sleep(1)), mclapply(1:8, function(x) Sys.sleep(1), mc.cores=8), replications=1)
| >                                                    test replications elapsed relative user.self sys.self user.child sys.child
| > 1               mclapply(1:8, function(x) Sys.sleep(1))            1   4.013  3.89612     0.000    0.008      0.000     0.004
| > 2 mclapply(1:8, function(x) Sys.sleep(1), mc.cores = 8)            1   1.030  1.00000     0.004    0.008      0.004     0.000
| > R>
| >
| > and
| >
| > R> library(microbenchmark)
| > R> microbenchmark(mclapply(1:8, function(x) Sys.sleep(1)), mclapply(1:8, function(x) Sys.sleep(1), mc.cores=8), times=1)
| > Unit: seconds
| >                                                    expr     min      lq  median      uq     max
| > 1               mclapply(1:8, function(x) Sys.sleep(1)) 4.01377 4.01377 4.01377 4.01377 4.01377
| > 2 mclapply(1:8, function(x) Sys.sleep(1), mc.cores = 8) 1.03457 1.03457 1.03457 1.03457 1.03457
| > R>
| >
| > (and you normally want to run either with 10 or 100 or ... replications /
| > times).
| >
| > Dirk
| >
| > --
| > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| >
| > _______________________________________________
| > R-sig-hpc mailing list
| > R-sig-hpc at r-project.org
| > https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com