Hand in your solution on RPILMS, unless instructions say otherwise. Each team should submit its solution under only 1 student's name. The other student's submission should just name the lead student. (This makes it easier for us to avoid grading it twice.)
If you have problems, then ask for help. The goal is to learn the material.
Homework 2, due Fri 2017-02-03, 9am.
Write a program to multiply two 100x100 matrices. Do it the conventional way (the triple loop), not using anything fancy like Strassen's algorithm. Now, see how much improvement you can get with OpenMP. Measure only the elapsed time for the multiplication, not for the matrix initialization.
Report these execution times, on geoxeon and parallel.
- Without OpenMP enabled. (Don't compile with -fopenmp; comment out the pragmas.)
- With OpenMP, using only 1 thread.
- Using 2, 4, 8, 16, 32, 64 threads.
Write programs to test the effect of the reduction pragma:
- Create an array of 1,000,000,000 floats and fill it with pseudorandom numbers from 0 to 1.
- Do the following tests with 1, 2, 4, 8, 16, and 32 threads.
Programs to write and test:
- Sum it with a simple for loop. This will give a wrong answer with more than 1 thread, but is fast.
- Sum it with the subtotal variable protected with an atomic pragma.
- Sum it with the subtotal variable protected with a critical pragma.
- Sum it with a reduction loop.
Devise a test program to estimate the time to execute a task pragma. You might start from taskfib.cc.
Sometimes parallelizing a program can increase its elapsed time. Try to create such an example, with 2 threads being slower than 1.