COMPENG 701, Fall 2007,
**course syllabus**

First class:

First lecture and assignment; MATLAB source file, data file, data file for MATLAB6 (you will need to change your code to “load scan2_V6”)

Second lecture and assignment; C source file; more on computer architecture may be found here: http://www.karbosguide.com/

**Hint for question 4 in assignment 2:** if you start seeing complex eigenvalues, you are on the wrong track. Recall that the definition of a condition number given in class applies to symmetric matrices only, so your first step must be to rewrite the operation (i.e., rotation) on the index matrix in a symmetric form. (Consider whether the error magnitude caused by the perturbation E is any different if, instead of the solution vector [x;y] of the system A [x;y]=b+E, in MATLAB notation, you are interested in the “reflected” solution vector [x;-y].)

Third lecture material link and assignment

Fourth lecture material link and assignment

Fifth lecture material link and assignment

In the week of October 22nd, instead of the regular meeting on Thursday, the class will meet on Monday for the Introductory SHARCNET seminar; see the events at www.sharcnet.ca. Attendance is mandatory. The meeting will take place in the AG room. Please come 10 minutes before the seminar so that access can be set up for you. See you there.

Sixth lecture (BLAS) material link and assignment

**Clarification:** You are supposed to:

1. generate two random square matrices of size $n$ (a parameter defined in your code in the same way as we did in class for the vectors); the matrices should be generated as two-dimensional $n \times n$ arrays or as $n^2$-long vectors of doubles,

2. compute the product of the two matrices in four different ways (the result of the multiplication should be the same in each case):

a) using your own version of matrix multiply,

b) using the Level 1 BLAS vector-vector dot product (this would require $n^2$ calls to the dot function),

c) using the Level 2 BLAS matrix-vector product (this would require $n$ calls to the matrix-vector function),

d) using the Level 3 BLAS matrix-matrix multiply (this would require one call to the gemm function),

3. contrast the performance of the four approaches based on the time it took to do a), b), c) and d) for a collection of $n$, say $n=100,1000,10000$.
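
As a rough illustration of how a) and b) relate, here is a minimal sketch in plain C. It does not call any BLAS library: `dot_strided` is a hypothetical stand-in that only mimics the usual "length, vector, stride" calling pattern of a Level 1 dot product, and `matmul_naive`, `matmul_dot` and `time_matmul` are illustrative names, not part of the course material. Matrices are assumed to be stored row-major as $n^2$-long vectors of doubles.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a Level 1 dot product:
   returns the sum of x[i*incx] * y[i*incy] for i = 0..n-1. */
double dot_strided(size_t n, const double *x, size_t incx,
                   const double *y, size_t incy)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i * incx] * y[i * incy];
    return s;
}

/* a) Plain triple loop: C = A * B, all n x n, row-major. */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) {
            double s = 0.0;
            for (size_t k = 0; k < n; ++k)
                s += A[i * n + k] * B[k * n + j];
            C[i * n + j] = s;
        }
}

/* b) One dot product per output entry, n*n calls in total.
   Row i of A is contiguous (stride 1); column j of B has stride n. */
void matmul_dot(size_t n, const double *A, const double *B, double *C)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            C[i * n + j] = dot_strided(n, &A[i * n], 1, &B[j], n);
}

/* CPU time in seconds, as reported by clock(), for one call of f. */
double time_matmul(void (*f)(size_t, const double *, const double *, double *),
                   size_t n, const double *A, const double *B, double *C)
{
    clock_t t0 = clock();
    f(n, A, B, C);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling `time_matmul(matmul_naive, n, A, B, C)` and `time_matmul(matmul_dot, n, A, B, C)` for each $n$ gives two of the timings; in the actual assignment the stand-in would be replaced by the library routines, and for small $n$ a single call may be too fast for `clock()` to resolve, so repeating the call in a loop may be needed.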

**To compile and link** ACML BLAS Level 2 and 3 functions, you need to add libg2c.a to the DevC++ linker options, after libacml.a.

Seventh lecture (BLAS) material link

**Free pass bonus assignment:** Improve the performance of BLAS dgemm for square 1024 x 1024 and/or 2048 x 2048 and/or 4096 x 4096 matrices using 1-3 levels of the Strassen recursive formula. Your code should produce a statistically significant speed improvement, say, >1%, on the native BLAS platform: if you are running an Intel machine, you should get the improvement over MKL; if you are running AMD, the improvement should be over ACML.

The timing for Strassen should include partitioning of the matrices and assembling the result back together. Make sure the improvement is not due to disk swapping.

You will have to demonstrate and explain your implementation, either on your own machine, a machine in the lab, or my desktop (AMD).

The first two people to complete it properly get the free pass in the course and an A+.

**The free passes for the bonus assignment have already been claimed!**
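
For readers unfamiliar with the Strassen recursive formula mentioned in the bonus assignment, a single level of it can be sketched in plain C as follows. This is only an illustration under stated assumptions: $n$ is even, matrices are row-major, and `block_mul` is a hypothetical baseline multiply standing in for the role dgemm would play. All function names here are illustrative. The `get_quad` and `put_quad` helpers are the partitioning and assembly steps that the timing is meant to include.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Baseline h x h multiply (row-major); stand-in for a library dgemm. */
void block_mul(size_t h, const double *X, const double *Y, double *Z)
{
    for (size_t i = 0; i < h; ++i)
        for (size_t j = 0; j < h; ++j) {
            double s = 0.0;
            for (size_t k = 0; k < h; ++k)
                s += X[i * h + k] * Y[k * h + j];
            Z[i * h + j] = s;
        }
}

/* Z = X + sign * Y, elementwise on h x h blocks (Z may alias X). */
void block_add(size_t h, const double *X, const double *Y, double sign, double *Z)
{
    for (size_t i = 0; i < h * h; ++i)
        Z[i] = X[i] + sign * Y[i];
}

/* Copy quadrant (qi, qj) of the n x n matrix M into a contiguous block Q. */
void get_quad(size_t n, const double *M, size_t qi, size_t qj, double *Q)
{
    size_t h = n / 2;
    for (size_t i = 0; i < h; ++i)
        for (size_t j = 0; j < h; ++j)
            Q[i * h + j] = M[(qi * h + i) * n + qj * h + j];
}

/* Write a contiguous block Q back into quadrant (qi, qj) of M. */
void put_quad(size_t n, double *M, size_t qi, size_t qj, const double *Q)
{
    size_t h = n / 2;
    for (size_t i = 0; i < h; ++i)
        for (size_t j = 0; j < h; ++j)
            M[(qi * h + i) * n + qj * h + j] = Q[i * h + j];
}

/* One Strassen level: C = A * B with 7 block products instead of 8.
   Returns 0 on success, -1 if the scratch memory cannot be allocated. */
int strassen_one_level(size_t n, const double *A, const double *B, double *C)
{
    assert(n > 0 && n % 2 == 0);
    size_t h = n / 2, sz = h * h;
    /* 8 quadrants + 2 scratch + 7 products + 1 result = 18 blocks. */
    double *buf = malloc(18 * sz * sizeof *buf);
    if (buf == NULL)
        return -1;
    double *A11 = buf, *A12 = buf + sz, *A21 = buf + 2 * sz, *A22 = buf + 3 * sz;
    double *B11 = buf + 4 * sz, *B12 = buf + 5 * sz;
    double *B21 = buf + 6 * sz, *B22 = buf + 7 * sz;
    double *S = buf + 8 * sz, *T = buf + 9 * sz, *R = buf + 17 * sz;
    double *M[7];
    for (size_t k = 0; k < 7; ++k)
        M[k] = buf + (10 + k) * sz;

    /* Partition A and B into quadrants. */
    get_quad(n, A, 0, 0, A11); get_quad(n, A, 0, 1, A12);
    get_quad(n, A, 1, 0, A21); get_quad(n, A, 1, 1, A22);
    get_quad(n, B, 0, 0, B11); get_quad(n, B, 0, 1, B12);
    get_quad(n, B, 1, 0, B21); get_quad(n, B, 1, 1, B22);

    /* M1 = (A11 + A22)(B11 + B22) */
    block_add(h, A11, A22, 1, S); block_add(h, B11, B22, 1, T); block_mul(h, S, T, M[0]);
    /* M2 = (A21 + A22) B11 */
    block_add(h, A21, A22, 1, S); block_mul(h, S, B11, M[1]);
    /* M3 = A11 (B12 - B22) */
    block_add(h, B12, B22, -1, T); block_mul(h, A11, T, M[2]);
    /* M4 = A22 (B21 - B11) */
    block_add(h, B21, B11, -1, T); block_mul(h, A22, T, M[3]);
    /* M5 = (A11 + A12) B22 */
    block_add(h, A11, A12, 1, S); block_mul(h, S, B22, M[4]);
    /* M6 = (A21 - A11)(B11 + B12) */
    block_add(h, A21, A11, -1, S); block_add(h, B11, B12, 1, T); block_mul(h, S, T, M[5]);
    /* M7 = (A12 - A22)(B21 + B22) */
    block_add(h, A12, A22, -1, S); block_add(h, B21, B22, 1, T); block_mul(h, S, T, M[6]);

    /* C11 = M1 + M4 - M5 + M7 */
    block_add(h, M[0], M[3], 1, R); block_add(h, R, M[4], -1, R);
    block_add(h, R, M[6], 1, R);    put_quad(n, C, 0, 0, R);
    /* C12 = M3 + M5 */
    block_add(h, M[2], M[4], 1, R); put_quad(n, C, 0, 1, R);
    /* C21 = M2 + M4 */
    block_add(h, M[1], M[3], 1, R); put_quad(n, C, 1, 0, R);
    /* C22 = M1 - M2 + M3 + M6 */
    block_add(h, M[0], M[1], -1, R); block_add(h, R, M[2], 1, R);
    block_add(h, R, M[5], 1, R);     put_quad(n, C, 1, 1, R);

    free(buf);
    return 0;
}
```

The sketch favors clarity over speed: it copies every quadrant and allocates scratch space on each call, which is exactly the kind of overhead that has to be kept small for a measurable gain over a tuned library routine.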

Eighth lecture material – article on how to speed up MATLAB by Pascal Getreuer

**Eighth assignment:** speed up the MATLAB code presented in the first class.

Ninth lecture material (MEX-files)

**Ninth assignment:** write a C-wrapper/MEX function for the MATLAB code improved in assignment 8.

The function should take two arguments: the filename for the data file (e.g., scan2.mat) and nhd_size, and produce three output arguments: the actual shift-rotation pair ([theta_p, shift_p]), the “best fit” shift-rotation pair ([best_shift, best_theta]), and the minimum distance (norm) found. To speed up the code, you might want to switch from the SVD-based norm for two matrices to a simple 2-norm for two column vectors obtained from those matrices.

Due
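
The suggested switch from a matrix norm to a vector 2-norm can be sketched in C as below. The sketch assumes that "column vectors obtained from those matrices" means stacking all entries of each matrix into one long vector (as `A(:)` does in MATLAB); that reading, and the function name `vec_diff_sqnorm`, are assumptions rather than part of the assignment text. The function returns the squared 2-norm of the difference, since the squared value has the same minimizer and a single square root can be taken once the best candidate is found.

```c
#include <assert.h>
#include <stddef.h>

/* Squared Euclidean (2-)norm of a - b, where a and b hold the entries
   of two equally sized matrices stacked into vectors of length len.
   No SVD is needed: one pass over the data. */
double vec_diff_sqnorm(size_t len, const double *a, const double *b)
{
    assert(a != NULL && b != NULL);
    double s = 0.0;
    for (size_t i = 0; i < len; ++i) {
        double d = a[i] - b[i];
        s += d * d;
    }
    return s;
}
```

Note that this quantity (the Frobenius norm of the matrix difference, squared) is in general not equal to the SVD-based matrix 2-norm; it is an upper bound on it, so the location of the minimum may differ slightly between the two measures.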

Final exam, due Thursday, December 6
by