# Exercise 1: Linear Systems

Write parallel implementations of the

Gaussian elimination algorithm with partial pivoting

(see the materials handed out in class) for solving a linear equation system Ax=b of dimension n over floating point numbers (Fortran REAL*8, C double):

Use the parallelization features of the Convex Fortran or C compiler located in /usr/convex/bin (set your path appropriately, otherwise the HP compilers in /bin are called!):

1. First write a sequential base version and compile it with -O2. This version should initialize the system with random coefficients and include a final correctness check by printing the result vector x and the differences between Ax and b. The corresponding parallel programs must deliver the same result (up to approximation errors)!
2. Let in one parallel version the compiler generate parallelism automatically; try to restructure the program to improve performance but do not use any pragmas or optimization flags other than -O3.
3. Try in another parallel version to optimize the program as much as possible using compilation flags, pragmas, or `mpa`.
4. Run the programs with input sizes n = 50, 100, 200, ... (until the sequential program runs longer than 2 minutes) and the parallel programs with 1, 2, 4, 6, 8, ...processors (as many as possible).
5. Use for time measurements wall clock times (`time`, see the man pages). The timings must include the initialization of the linear system but not any output.
6. Use the Convex debugger cxdb and the Convex performance analyzer cxpa for debugging and improving your program.
7. The result of this exercise is a paper containing the
1. Documented source code,
2. Output for one test run of each program (same input sizes),
3. Tables with absolute runtimes,
4. Runtime diagrams,
5. Speedup diagrams,
6. Efficiency diagrams,
8. 2 extra points for the program that solves in least time with not more than 8 processors a system with largest n = 50, 100, 200, ... such that the program runs not more than 2 minutes.