Exercises for 14.2 Hello Threading!
Implement the entirety of the scalar_multiply program. Time your code using the gettimeofday() function and 100 million elements. How does the running time vary as you increase the number of threads? What if you increase the number of elements to 1 billion? 2 billion?
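For the timing part of this exercise, one possible approach is to sample gettimeofday() before and after the threaded region and subtract the two samples. The sketch below assumes a POSIX system; elapsed_seconds is a hypothetical helper name chosen for illustration, not something taken from the chapter.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Seconds elapsed between two gettimeofday() samples.
 * elapsed_seconds is a hypothetical helper, not a library function. */
double elapsed_seconds(const struct timeval *start, const struct timeval *end) {
    assert(start != NULL && end != NULL);
    /* tv_usec may be smaller in `end` than in `start`; the signed
     * difference handles that case without special treatment. */
    double secs  = (double)(end->tv_sec - start->tv_sec);
    double usecs = (double)(end->tv_usec - start->tv_usec);
    return secs + usecs / 1e6;
}

/* Typical use around the region being timed:
 *     struct timeval start, end;
 *     gettimeofday(&start, NULL);
 *     ... create and join the threads ...
 *     gettimeofday(&end, NULL);
 *     printf("%g seconds\n", elapsed_seconds(&start, &end));
 */
```

Taking the first sample just before thread creation and the second just after the last join measures only the parallel region, not array allocation or initialization.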
Improve the original scalar_multiply threaded function by placing all of its arguments into a struct and passing that struct to each thread from main. Time the performance of this version of the code. Is there any difference?
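A minimal sketch of the struct-passing idea follows. The struct name t_arg, its field names, and the driver run_scalar_multiply are assumptions made for illustration; the chapter's own code may differ. Each thread gets its own copy of the struct, so nothing is shared through globals.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical argument bundle: one copy per thread. */
struct t_arg {
    int *array;       /* shared array, scaled in place */
    long length;      /* total number of elements */
    int s;            /* scalar multiplier */
    long numthreads;  /* total number of threads */
    long id;          /* logical thread id, 0..numthreads-1 */
};

/* Thread function: scales this thread's contiguous slice of the array. */
void *scalar_multiply(void *args) {
    struct t_arg *a = (struct t_arg *)args;
    long chunk = a->length / a->numthreads;
    long start = a->id * chunk;
    /* The last thread also takes any leftover elements. */
    long end = (a->id == a->numthreads - 1) ? a->length : start + chunk;
    for (long i = start; i < end; i++) {
        a->array[i] *= a->s;
    }
    return NULL;
}

/* Hypothetical driver: returns 0 on success, -1 on failure. */
int run_scalar_multiply(int *array, long length, int s, long numthreads) {
    assert(array != NULL && numthreads > 0);
    pthread_t *tids = malloc((size_t)numthreads * sizeof *tids);
    struct t_arg *targs = malloc((size_t)numthreads * sizeof *targs);
    if (tids == NULL || targs == NULL) {
        free(tids);
        free(targs);
        return -1;
    }
    long created = 0;
    int status = 0;
    for (; created < numthreads; created++) {
        targs[created] = (struct t_arg){array, length, s, numthreads, created};
        if (pthread_create(&tids[created], NULL, scalar_multiply,
                           &targs[created]) != 0) {
            status = -1;  /* stop creating, but still join what started */
            break;
        }
    }
    for (long i = 0; i < created; i++) {
        pthread_join(tids[i], NULL);
    }
    free(tids);
    free(targs);
    return status;
}
```

The argument array must outlive the threads, which is why it is freed only after every pthread_join() has returned.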
Improve the scalar_multiply threaded function by implementing a better load-balancing procedure, that is, the load-balancing procedure described in the note above.
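The note this exercise refers to is not reproduced here, so the following is only a sketch of one common load-balancing scheme, assumed for illustration: when the length does not divide evenly, the first (length % numthreads) threads each take one extra element, so no thread does more than one element of extra work. The function name chunk_bounds is hypothetical.

```c
#include <assert.h>

/* Computes the half-open range [*start, *end) owned by thread `id`.
 * chunk_bounds is a hypothetical helper name. */
void chunk_bounds(long length, long numthreads, long id,
                  long *start, long *end) {
    assert(numthreads > 0 && id >= 0 && id < numthreads);
    long chunk = length / numthreads;  /* base size of every slice */
    long rem   = length % numthreads;  /* elements left over */
    if (id < rem) {
        /* The first `rem` threads each take one extra element. */
        *start = id * (chunk + 1);
        *end   = *start + chunk + 1;
    } else {
        /* Remaining threads start after the `rem` larger slices. */
        *start = rem * (chunk + 1) + (id - rem) * chunk;
        *end   = *start + chunk;
    }
}
```

With 10 elements and 4 threads this yields slices of sizes 3, 3, 2, and 2, instead of 2, 2, 2, and 4 when the last thread absorbs the whole remainder.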
Using what you have learned, try implementing a program that performs matrix-vector multiplication. In matrix-vector multiplication, each row of the matrix is multiplied element by element with the vector and the products are summed, producing one element of the result vector.
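As a starting point, a serial reference version can make the definition concrete and serve as a baseline for checking a threaded version. The sketch below assumes the matrix is stored in row-major order in a flat array; the name mat_vec and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Serial matrix-vector product: out[r] = sum over c of m[r][c] * v[c].
 * The matrix is a flat row-major array of rows * cols doubles.
 * mat_vec is a hypothetical name used for illustration. */
void mat_vec(const double *m, const double *v, double *out,
             int rows, int cols) {
    assert(m != NULL && v != NULL && out != NULL);
    for (int r = 0; r < rows; r++) {
        double sum = 0.0;
        for (int c = 0; c < cols; c++) {
            sum += m[(size_t)r * (size_t)cols + (size_t)c] * v[c];
        }
        out[r] = sum;
    }
}
```

Each output element depends on only one row of the matrix, so rows can be divided among threads without any synchronization on the result vector.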
Exercises for 14.3 Synchronization
Implement a parallel version of Step 2 of the CountSort algorithm. Time its performance.
Try combining Step 1 and Step 2 of the CountSort program into a single program. To do this, you will need to add another cycle of pthread_join() to your program.
Time the total performance of the new CountSort program.
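The "another cycle of pthread_join()" idea can be sketched independently of CountSort itself: run one create/join cycle per phase, so that every thread has finished the first phase before any thread starts the second. Everything below (NTHREADS, phase_arg, phase_one, phase_two, run_phase, two_phase) is a hypothetical stand-in; the two phase functions do trivial work and are not the CountSort steps.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4  /* hypothetical fixed thread count */

struct phase_arg {  /* hypothetical per-thread arguments */
    int *data;      /* filled in phase one, read in phase two */
    int *out;       /* written in phase two */
    int id;
};

/* Stand-in for the first step: each thread fills its own slot. */
void *phase_one(void *p) {
    struct phase_arg *a = (struct phase_arg *)p;
    a->data[a->id] = a->id + 1;
    return NULL;
}

/* Stand-in for the second step: each thread reads a slot written by
 * a different thread in phase one, which is safe only because
 * phase one has been fully joined. */
void *phase_two(void *p) {
    struct phase_arg *a = (struct phase_arg *)p;
    a->out[a->id] = a->data[(a->id + 1) % NTHREADS] * 10;
    return NULL;
}

/* One create/join cycle. Returns 0 on success, -1 on failure. */
int run_phase(void *(*fn)(void *), struct phase_arg *args) {
    assert(fn != NULL && args != NULL);
    pthread_t tids[NTHREADS];
    int created = 0, status = 0;
    for (; created < NTHREADS; created++) {
        if (pthread_create(&tids[created], NULL, fn, &args[created]) != 0) {
            status = -1;
            break;
        }
    }
    for (int i = 0; i < created; i++) {
        pthread_join(tids[i], NULL);  /* wait for the whole phase */
    }
    return status;
}

/* Two cycles back to back: phase two starts only after phase one ends. */
int two_phase(int *data, int *out) {
    struct phase_arg args[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        args[i].data = data;
        args[i].out = out;
        args[i].id = i;
    }
    if (run_phase(phase_one, args) != 0) return -1;
    return run_phase(phase_two, args);
}
```

When timing the combined program, the cost of creating and joining the threads twice is part of the total and is worth measuring separately from the work in each phase.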
Exercises for 14.7 OpenMP
The writeElems() function assumes that the user only inputs a number of threads less than MAX. Is there a way to rewrite this code so that it works regardless of the number of threads?