COMPENG 701, Fall 2007, course syllabus

First class: Thursday, September 13, 2007.

First lecture and assignment; MATLAB source file, data file, data file for MATLAB6 (you will need to change your code to “load scan2_V6”)

Second lecture and assignment; C source file; more on computer architecture may be found here:

Hint for question 4 in assignment 2: if you start seeing complex eigenvalues, you are on the wrong track. Recall that the definition of a condition number given in class applies to symmetric matrices only, so your first step must be to rewrite the operation (i.e., the rotation) on the index matrix in a symmetric form. (Consider whether the error magnitude caused by the perturbation E is any different if, instead of the solution vector [x;y] of the system A [x;y]=b+E, in MATLAB notation, you are interested in the “reflected” solution vector [x;-y].)

Third lecture material link and assignment

Fourth lecture material link and assignment

Fifth lecture material link and assignment

In the week of October 22nd, the class will meet on Monday for the Introductory SHARCNET seminar instead of the regular Thursday meeting (see the events at). Attendance is mandatory. The meeting will take place in the AG room, A.N. Bourns Science Building (ABB) 131-J, 1:30-2:30 p.m. Please come 10 minutes before the seminar so that access can be set up for you. See you there.

Sixth lecture (BLAS) material link and assignment

Clarification: You are supposed to:

1. generate two random square matrices of size $n$ (a parameter defined in your code in the same way as we did in class for the vectors); the matrices should be generated as two-dimensional n x n arrays or as $n^2$-long vectors of doubles,

2. compute the product of the two matrices in four different ways (although the result of the multiplication should be the same):

a) using your own version of matrix multiply,

b) using the Level 1 BLAS vector-vector dot product (this would require $n^2$ calls to the dot function),

c) using the Level 2 BLAS matrix-vector product (this would require calls to the function),

d) using the Level 3 BLAS matrix-matrix multiply (this would require one call to the gemm function),

3. contrast the performance of the four approaches based on the time it took to do a), b), c) and d) for a collection of n, say $n=100,1000,10000$.
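The call structure of variants a) to c) can be sketched without linking any library. In the sketch below, dot_strided and matvec_strided are hand-written stand-ins, not the ACML routines, and all function names are hypothetical. Matrices are assumed to be stored row by row as $n^2$-long vectors of doubles. Variant d) is a single library call and is left out because it needs the BLAS library itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

typedef void (*matmul_fn)(int n, const double *A, const double *B, double *C);

/* Fill an n*n-long vector with pseudo-random doubles in [0, 1]. */
void fill_random(int n, double *M)
{
    assert(n > 0);
    for (int i = 0; i < n * n; i++)
        M[i] = (double)rand() / RAND_MAX;
}

/* Stand-in for a Level 1 dot product: sum of x[i*incx] * y[i*incy]. */
double dot_strided(int n, const double *x, int incx, const double *y, int incy)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i * incx] * y[i * incy];
    return s;
}

/* Stand-in for a Level 2 matrix-vector product: y = A * x. */
void matvec_strided(int n, const double *A, const double *x, int incx,
                    double *y, int incy)
{
    for (int i = 0; i < n; i++)
        y[i * incy] = dot_strided(n, &A[i * n], 1, x, incx);
}

/* a) plain triple loop */
void matmul_naive(int n, const double *A, const double *B, double *C)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int k = 0; k < n; k++)
                s += A[i * n + k] * B[k * n + j];
            C[i * n + j] = s;
        }
}

/* b) one dot product per entry of C: n*n calls (row i of A, column j of B) */
void matmul_by_dot(int n, const double *A, const double *B, double *C)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            C[i * n + j] = dot_strided(n, &A[i * n], 1, &B[j], n);
}

/* c) one matrix-vector product per column of C: n calls in this sketch */
void matmul_by_matvec(int n, const double *A, const double *B, double *C)
{
    for (int j = 0; j < n; j++)
        matvec_strided(n, A, &B[j], n, &C[j], n);
}

/* CPU seconds taken by one call of fn. */
double time_matmul(matmul_fn fn, int n, const double *A, const double *B,
                   double *C)
{
    clock_t t0 = clock();
    fn(n, A, B, C);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

A driver would allocate A, B and C with malloc, call fill_random on A and B, and print time_matmul for each variant and each n. For small n the measured time may be below the resolution of clock(), so repeating the call several times and averaging is probably needed.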

To compile and link the ACML BLAS Level 2 and 3 functions, you need to add libg2c.a to the Dev-C++ linker options, after libacml.a.

Seventh lecture (BLAS) material link

Free pass bonus assignment: Improve the performance of BLAS dgemm for square 1024 x 1024 and/or 2048 x 2048 and/or 4096 x 4096 matrices using 1-3 levels of the Strassen recursive formula. Your code should produce a statistically significant speed improvement, say, >1%, on the native BLAS platform: if you are running an Intel machine, the improvement should be over MKL; if you are running AMD, the improvement should be over ACML. The timing for Strassen should include partitioning the matrices and assembling the result back together. Make sure the improvement is not due to disk swapping. You will have to demonstrate and explain your implementation, either on your own machine, a machine in the lab, or my desktop (AMD). The first two people to complete it properly get a free pass in the course and an A+.

The free passes with the bonus assignment have already been claimed!
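For reference, a single level of the Strassen formula can be sketched as below. This is a minimal illustration under stated assumptions, not a tuned solution: matrices are row-major with even n, the function names are hypothetical, and base_mul is a plain triple loop standing in for the place where the platform dgemm would be called. Partitioning and assembly are part of the routine, so timing it includes both.

```c
#include <assert.h>
#include <stdlib.h>

/* out (h x h, contiguous) = X + sign * Y, where X and Y point at the top-left
   entry of h x h blocks inside row-major matrices with row length ld.
   Passing the same block twice with sign 0.0 simply copies it. */
void block_comb(int h, int ld, const double *X, const double *Y,
                double sign, double *out)
{
    for (int i = 0; i < h; i++)
        for (int j = 0; j < h; j++)
            out[i * h + j] = X[i * ld + j] + sign * Y[i * ld + j];
}

/* Base-case product R = P * Q for contiguous h x h matrices (dgemm stand-in). */
void base_mul(int h, const double *P, const double *Q, double *R)
{
    for (int i = 0; i < h; i++)
        for (int j = 0; j < h; j++) {
            double s = 0.0;
            for (int k = 0; k < h; k++)
                s += P[i * h + k] * Q[k * h + j];
            R[i * h + j] = s;
        }
}

/* One level of Strassen: C = A * B using 7 half-size products instead of 8. */
void strassen_one_level(int n, const double *A, const double *B, double *C)
{
    assert(n > 0 && n % 2 == 0);
    int h = n / 2;
    size_t sz = (size_t)h * h;
    const double *A11 = A, *A12 = A + h, *A21 = A + (size_t)h * n, *A22 = A21 + h;
    const double *B11 = B, *B12 = B + h, *B21 = B + (size_t)h * n, *B22 = B21 + h;

    /* scratch: operand buffers S and T, then the seven products M[0..6] */
    double *buf = malloc(9 * sz * sizeof *buf);
    assert(buf != NULL);
    double *S = buf, *T = buf + sz, *M[7];
    for (int p = 0; p < 7; p++)
        M[p] = buf + (2 + p) * sz;

    block_comb(h, n, A11, A22, 1.0, S);  block_comb(h, n, B11, B22, 1.0, T);
    base_mul(h, S, T, M[0]);             /* M1 = (A11 + A22)(B11 + B22) */
    block_comb(h, n, A21, A22, 1.0, S);  block_comb(h, n, B11, B11, 0.0, T);
    base_mul(h, S, T, M[1]);             /* M2 = (A21 + A22) B11 */
    block_comb(h, n, A11, A11, 0.0, S);  block_comb(h, n, B12, B22, -1.0, T);
    base_mul(h, S, T, M[2]);             /* M3 = A11 (B12 - B22) */
    block_comb(h, n, A22, A22, 0.0, S);  block_comb(h, n, B21, B11, -1.0, T);
    base_mul(h, S, T, M[3]);             /* M4 = A22 (B21 - B11) */
    block_comb(h, n, A11, A12, 1.0, S);  block_comb(h, n, B22, B22, 0.0, T);
    base_mul(h, S, T, M[4]);             /* M5 = (A11 + A12) B22 */
    block_comb(h, n, A21, A11, -1.0, S); block_comb(h, n, B11, B12, 1.0, T);
    base_mul(h, S, T, M[5]);             /* M6 = (A21 - A11)(B11 + B12) */
    block_comb(h, n, A12, A22, -1.0, S); block_comb(h, n, B21, B22, 1.0, T);
    base_mul(h, S, T, M[6]);             /* M7 = (A12 - A22)(B21 + B22) */

    /* assemble the four quadrants of C */
    for (int i = 0; i < h; i++)
        for (int j = 0; j < h; j++) {
            size_t k = (size_t)i * h + j;
            C[i * n + j]           = M[0][k] + M[3][k] - M[4][k] + M[6][k];
            C[i * n + j + h]       = M[2][k] + M[4][k];
            C[(i + h) * n + j]     = M[1][k] + M[3][k];
            C[(i + h) * n + j + h] = M[0][k] - M[1][k] + M[2][k] + M[5][k];
        }
    free(buf);
}
```

Deeper levels would replace base_mul with a recursive call. The extra additions and the scratch memory are the overhead that the saved eighth product has to outweigh.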

Eighth lecture material – article by Pascal Getreuer on how to speed up MATLAB

Eighth assignment: speed up the MATLAB code presented in the first class.

Ninth lecture material (MEX-files)

Ninth assignment: write a C-wrapper/MEX function for the MATLAB code improved in assignment 8.

The function should take two arguments, the filename of the data file (e.g., scan2.mat) and nhd_size, and produce three output arguments: the actual shift-rotation pair ([theta_p, shift_p]), the “best fit” shift-rotation pair ([best_shift, best_theta]), and the minimum distance (norm) found. To speed up the code, you might want to switch from the SVD-based norm for two matrices to a simple 2-norm for two column vectors obtained from those matrices. Due December 2, 2007.


Final exam, due Thursday, December 6 by 15:00

Additional files: acml.h, libacml.a, libg2c.a