
Cell or PS3 Port

The main core of the Cell (the PPE) uses IBM's version of hyperthreading 
to expose two logical main CPUs to the OS, so code that is "simply" 
multi-threaded should still see an advantage.  In addition, IBM provides 
an SDK that includes workflow management as well as libraries to 
support common linear algebra and other math functions on the 
sub-processors (called SPEs).  It also provides an interface to a 
hardware RNG as well as three software generators (two pseudo-random, 
one quasi-random) that are coded for the SPE.

Each SPE has its own small local memory store and communicates with 
main memory through a DMA queue.  The problem then becomes one of 
breaking up each task into units that are small enough to offload to an 
SPE.  My initial direction will be to set up a rudimentary workflow 
manager.  When an optimized function is encountered, a sufficient number 
of SPE threads will be spawned, and the main thread will wait for all 
results.  As for the optimized functions, I intend to start with the 
ones that already have an analogous implementation in the IBM math 
libraries.
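
To make the spawn-and-wait idea concrete, here is a rough sketch in 
plain C.  It is only an illustration under several assumptions: POSIX 
threads stand in for SPE threads (no Cell SDK calls are used), the 
"optimized function" is a trivial element-wise square, and the names 
offload_square, square_worker, work_unit, CHUNK_BYTES and MAX_WORKERS 
are all hypothetical.  The chunk size is an arbitrary budget meant to 
suggest a buffer that would fit in a local store, not a measured value.

```c
#include <pthread.h>
#include <stddef.h>

/* Assumed per-buffer budget; a real port would size this against the
   SPE local store and its DMA transfer limits. */
#define CHUNK_BYTES (16 * 1024)
#define CHUNK_ELEMS (CHUNK_BYTES / sizeof(double))
#define MAX_WORKERS 6   /* hypothetical number of usable SPEs */

typedef struct {
    const double *in;   /* start of this worker's input slice  */
    double       *out;  /* start of this worker's output slice */
    size_t        n;    /* number of elements in the slice     */
} work_unit;

/* Stand-in kernel.  On a Cell this would be SPE code that pulls each
   chunk into the local store by DMA, computes, and writes it back. */
static void *square_worker(void *arg)
{
    work_unit *w = arg;
    for (size_t start = 0; start < w->n; start += CHUNK_ELEMS) {
        size_t end = start + CHUNK_ELEMS;
        if (end > w->n)
            end = w->n;
        for (size_t i = start; i < end; i++)
            w->out[i] = w->in[i] * w->in[i];
    }
    return NULL;
}

/* Fork-join: split the vector across workers, spawn them, then block
   until every started worker has finished.
   Returns 0 on success, -1 if a thread could not be created. */
int offload_square(const double *in, double *out, size_t n)
{
    pthread_t tid[MAX_WORKERS];
    work_unit unit[MAX_WORKERS];
    size_t per = (n + MAX_WORKERS - 1) / MAX_WORKERS;  /* ceil(n / workers) */
    int started = 0;
    int status = 0;

    for (int k = 0; k < MAX_WORKERS; k++) {
        size_t start = (size_t)k * per;
        if (start >= n)
            break;      /* small inputs need fewer workers */
        unit[k].in  = in + start;
        unit[k].out = out + start;
        unit[k].n   = (start + per <= n) ? per : n - start;
        if (pthread_create(&tid[k], NULL, square_worker, &unit[k]) != 0) {
            status = -1;
            break;
        }
        started++;
    }

    /* The main thread waits here for all results. */
    for (int k = 0; k < started; k++)
        pthread_join(tid[k], NULL);

    return status;
}
```

A caller would simply invoke offload_square(x, y, n) and use y once it 
returns.  The main thread blocks in the join loop, which matches the 
"wait for all results" behaviour described above.  On real hardware the 
inner chunk loop is where the DMA transfers to and from the local store 
would go.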

Some Cell developers have used MPI to allow multiple SPEs working on 
sections of the same task to communicate with each other.  I like this 
approach, since it lays the groundwork for clustering multiple Cell (or 
really any) processors.
Luke Tierney wrote: