RISC JKU

Austrian Grid 2: Distributed Supercomputing in the Grid

Within the second phase of the Austrian Grid initiative, we have designed and implemented an API for grid computing that can be used to develop grid-distributed parallel programs without leaving the language in which the core application is written. Our software framework, called "Topology-Aware API for the Grid" (TAAG), uses information about heterogeneous grid environments to adapt the algorithmic structure of parallel programs to the resources that are actually available. Because it hides low-level grid-related execution details from the application behind an abstract execution model, it removes some of the algorithmic challenges of present-day grid programming.

Motivation

An application cannot execute efficiently on the grid unless it is aware that it runs in a heterogeneous network environment with heterogeneous nodes. Our solution is a topology-aware programming tool that takes into account not only the topology of the available grid resources but also the point-to-point communication structure of parallel programs.

In our approach, each parallel program is assigned a predefined schema that specifies the program's preferred communication patterns in heterogeneous network environments. The execution engine first adapts and maps this schema to the currently available grid resources and then starts the processes on the grid according to this mapping. Our API provides function calls for querying all details of the mapping information, which comprises both the adapted communication structure of the program and the topological information of the allocated grid resources.

Consider an example in which a user intends to execute a tree-like multilevel parallel algorithm on the grid. She specifies in advance that the application should consist of 20 processes organized into a three-level tree. On the lowest level, the leaves belonging to the same parent process should form groups such that each group contains at least 5 processes scheduled to the same local network environment. For this specification, our software framework is able to determine a suitable partition of the processes on the currently available grid resources and to start the processes according to this schedule. The partition is based on heuristics; for example, our framework prefers tree structures in which the groups formed by the leaf processes of the same parent are as large as possible, so that the processes of each such group can be scheduled to a cluster. Furthermore, our API maps at runtime the predefined roles of processes in the specified logical hierarchy (global manager, local manager and workers) to the allocated pool of grid nodes so as to minimize the execution time.
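The partitioning step in this example can be illustrated with a small sketch. It is not the TAAG implementation and does not use the TAAG API: the function name partition_tree, the cluster names and the placement rules (one local manager per group, co-located with its workers; the global manager placed on the largest cluster) are assumptions made purely for illustration. The sketch favours large leaf groups by trying the smallest possible number of groups first.

```python
def partition_tree(total, min_group, capacities):
    """Hypothetical sketch: split `total` processes into a three-level tree.

    One global manager, one local manager per group, the rest are workers.
    `capacities` maps a cluster name to its number of free nodes.
    Returns a list of (cluster, workers_in_group) or None if nothing fits.
    """
    # Largest clusters first, so the biggest groups get the most room.
    clusters = sorted(capacities.items(), key=lambda kv: kv[1], reverse=True)

    # Fewer groups means larger groups, so try small group counts first.
    for groups in range(1, len(clusters) + 1):
        workers = total - 1 - groups  # minus global and local managers
        if workers < groups * min_group:
            break  # adding more groups can only shrink them further

        chosen = clusters[:groups]
        # Room left for workers once each local manager is placed.
        room = [cap - 1 for _, cap in chosen]
        room[0] -= 1  # assumption: global manager sits on the largest cluster

        if any(r < min_group for r in room) or sum(room) < workers:
            continue  # this number of groups does not fit, try one more

        # Start every group at the minimum, then fill the largest clusters.
        sizes = [min_group] * groups
        left = workers - groups * min_group
        for i in range(groups):
            extra = min(left, room[i] - sizes[i])
            sizes[i] += extra
            left -= extra
        return [(name, size) for (name, _), size in zip(chosen, sizes)]

    return None


if __name__ == "__main__":
    # 20 processes, at least 5 workers per group, three invented clusters.
    print(partition_tree(20, 5, {"a": 12, "b": 10, "c": 4}))
```

With the invented capacities above, the sketch yields two groups, 10 workers on cluster "a" and 7 on cluster "b": together with one global manager and two local managers this accounts for all 20 processes. A real scheduler would also have to weigh link latencies and node speeds, which this sketch ignores.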

Downloading the prototype version of the TAAG software framework

For more details, see the draft of the Specification Document (Last Modified: March 15, 2010). There you can also find some examples of how to use our API (these programs have already been tested successfully on the grid architecture of the Austrian Grid):

Example for Multilevel Parallelism
Example for Broadcasts among Groups in a Ring

Researchers

Wolfgang Schreiner (project director)
Karoly Bosa (key researcher)
Friedrich Priewasser (software developer)

Deliverables

Publications


Wolfgang Schreiner
Last modified: Mon Aug 15 18:20:47 CET 2011