In this paper we present a framework for partitioning data parallel computations across a heterogeneous metasystem at runtime. The framework is guided by program and resource information that is made available to the system. It handles three difficult problems: processor selection, task placement, and heterogeneous data domain decomposition. Solving each of these problems helps reduce elapsed time. In particular, processor selection determines the best grain size at which to run the computation, task placement reduces communication cost, and data domain decomposition balances load across processors. We present results indicating that excellent performance is achievable using the framework. The paper extends our earlier work on partitioning data parallel computations across a single-level network of heterogeneous workstations.
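To make the idea of heterogeneous data domain decomposition concrete, the following is a minimal sketch and not the method used in the paper. It assumes that every data item costs the same to process and that a relative speed is known for each selected processor; the function name `decompose` and its parameters are hypothetical. Items are assigned in proportion to speed, so faster processors receive larger shares and all processors finish at roughly the same time.

```python
def decompose(n_items, speeds):
    """Split n_items across processors in proportion to their relative speeds.

    Illustrative assumption: each item has equal cost, so a processor that is
    twice as fast should receive about twice as many items.
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")

    total = sum(speeds)
    # Ideal (fractional) share for each processor.
    ideal = [n_items * s / total for s in speeds]
    # Start from the rounded-down shares.
    counts = [int(x) for x in ideal]
    # Hand out the remaining items to the largest fractional remainders.
    leftover = n_items - sum(counts)
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Three processors, the third twice as fast as the other two.
    print(decompose(100, [1, 1, 2]))
```

A real system would also have to account for communication cost and for per-item costs that vary, which this sketch ignores.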
|Original language|English (US)|
|Number of pages|24|
|Journal|Concurrency: Practice and Experience|
|State|Published - Aug 1995|