

Designers of efficient, programmable, massively parallel computer systems face two major challenges: spreading a workload over huge numbers of processing cores, and providing a memory model that facilitates seamless data access over the entire memory hierarchy.
The Fresh Breeze program execution model (PXM) addresses these challenges by using fixed-size chunks of memory to build trees of chunks that represent arbitrary data structures. Chunks are write-once, so computation proceeds by building new data structures instead of modifying those provided as input. This paper extends previous work, which reported simulations of the dot product algorithm, to matrix multiplication and the fast Fourier transform, two algorithms that stress the PXM in different ways. The paper also includes new material explaining how mappings of problem data structures to trees of chunks are chosen to expose opportunities for efficient fine-grain concurrency and exploitation of data locality. This work further demonstrates the ability of the Fresh Breeze PXM to distribute even relatively small computations over large numbers of processors and achieve high processor utilization.
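To make the idea of write-once chunks organized into trees concrete, the following is a minimal illustrative sketch in Python, not the Fresh Breeze implementation. The names `Chunk`, `build_tree`, and `dot`, as well as the fan-out `CHUNK_SLOTS = 16`, are assumptions introduced here for illustration. The sketch packs a vector into immutable leaf chunks, groups chunks under parent chunks until a single root remains, and computes a dot product by recursive descent, where each pair of subtrees is an independent piece of work that a parallel runtime could hand to a separate core.

```python
from dataclasses import dataclass
from typing import Tuple, Union

CHUNK_SLOTS = 16  # assumed fan-out per chunk (hypothetical value for this sketch)


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after construction
class Chunk:
    slots: Tuple[Union[float, "Chunk"], ...]
    is_leaf: bool

    def __post_init__(self):
        # Enforce the fixed chunk capacity.
        if len(self.slots) > CHUNK_SLOTS:
            raise ValueError("a chunk holds at most CHUNK_SLOTS entries")


def build_tree(values):
    """Pack values into leaf chunks, then group chunks level by level into a root."""
    level = [Chunk(tuple(values[i:i + CHUNK_SLOTS]), True)
             for i in range(0, len(values), CHUNK_SLOTS)]
    if not level:
        return Chunk((), True)
    while len(level) > 1:
        level = [Chunk(tuple(level[i:i + CHUNK_SLOTS]), False)
                 for i in range(0, len(level), CHUNK_SLOTS)]
    return level[0]


def dot(a, b):
    """Dot product of two identically shaped trees of chunks.

    The recursive calls on child pairs do not depend on one another, so a
    parallel runtime could evaluate them concurrently; here they run serially.
    """
    if a.is_leaf:
        return sum(x * y for x, y in zip(a.slots, b.slots))
    return sum(dot(ca, cb) for ca, cb in zip(a.slots, b.slots))
```

For example, `dot(build_tree(xs), build_tree(ys))` for two equal-length lists of floats returns their dot product without modifying either input tree; any result structure would be built from new chunks rather than by updating existing ones.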