COMPENG 701, Fall 2007, course syllabus
Hint for question 4 in assignment 2: if you start seeing complex eigenvalues, you are on the wrong track. Recall that the
definition of the condition number given in class applies to symmetric matrices only, so your first step must be to rewrite the operation, i.e., the rotation
on the index matrix, in symmetric form. (Consider whether the error magnitude caused by the perturbation E is any different if, instead of the solution
vector [x;y] of the system A*[x;y]=b+E, in MATLAB notation, you are interested in the “reflected” solution vector [x;-y].)
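A minimal way to write the reflection from the hint in matrix form (the name R is mine, not notation from class):

```latex
R = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\qquad R^{T} = R, \qquad R^{2} = I .
```

If $A\,[x;y] = b+E$, then $(RAR)\,(R\,[x;y]) = R(b+E)$; since $R$ is orthogonal, multiplying a vector by $R$ does not change its 2-norm.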
The week of October 22nd, instead of the regular Thursday meeting, the class will meet on Monday for the introductory SHARCNET seminar;
see the events listing at www.sharcnet.ca.
Attendance is mandatory. The meeting will take place in the AG room.
Please come 10 minutes before the seminar so that access can be set up for you. See you there.
Clarification: You are supposed to:
1. generate two random square matrices of size $n$ (a parameter defined in
your code, just as we did in class for the vectors); the
matrices should be stored either as two-dimensional n x n arrays or as
$n^2$-long vectors of doubles,
2. compute the product of the two matrices in four different ways (although
the result of the multiplication should be the same in each case):
a) using your own version of matrix multiply,
b) using the Level 1 BLAS vector-vector dot product (this requires $n^2$ calls to the dot function),
c) using the Level 2 BLAS matrix-vector product (this requires $n$ calls to the gemv function),
d) using the Level 3 BLAS matrix-matrix multiply (this requires one call to the gemm function),
3. contrast the performance of the four approaches based on the time it
takes to do a), b), c), and d) for a collection of values of n, say $n=100,1000,10000$.
To compile and link against the ACML BLAS Level 2 and 3 functions, you will need to add libg2c.a to the DevC++ linker options, after libacml.a.
Free-pass bonus assignment: improve the performance of BLAS dgemm for square 1024 x 1024 and/or 2048 x 2048 and/or 4096 x 4096 matrices using
1-3 levels of the Strassen recursive formula. Your code should produce a statistically significant speed improvement, say >1%, over the native BLAS for your platform:
if you are running an Intel machine, you should improve over MKL; if you are running AMD, the improvement should be over ACML.
The timing for Strassen must include partitioning the matrices and assembling the result back together. Make sure the improvement is not due to disk swapping.
You will have to demonstrate and explain your implementation, either on your own machine, a machine in the lab, or my desktop (AMD).
The first two people to complete it properly get a free pass in the course and an A+.
The free passes for the bonus assignment have already been claimed!
Eighth assignment: speed-up the MATLAB code presented in the first class.
Ninth assignment: write a C wrapper/MEX function for the MATLAB code improved in assignment 8.
The function should take two arguments: the filename for the data file (e.g., scan2.mat) and nhd_size, and produce
three output arguments: the actual shift-rotation pair ([theta_p, shift_p]), the “best fit” shift-rotation pair
([best_shift, best_theta]), and the minimum distance (norm) found. To speed up the code, you might want
to switch from the SVD-based norm for two matrices to a simple 2-norm for two column vectors
obtained from those matrices. Due
Final exam, due Thursday, December 6 by