COMPENG 701, Fall 2006, course syllabus

First class: Friday, September 15, 2006.

First lecture and assignment; MATLAB source file, data file, data file for MATLAB6 (you will need to change your code to “load scan2_V6”)

Second lecture and assignment; C source file; more on computer architecture may be found here: http://www.karbosguide.com/

Hint for question 4 in assignment 2: if you start seeing complex eigenvalues, you are on the wrong track. Recall that the definition of the condition number given in class applies to symmetric matrices only, so your first step must be rewriting the operation (i.e., the rotation) on the index matrix in a symmetric form. First consider whether the error would be any different if, instead of the solution vector [x;y] (in MATLAB notation) of the system A [x;y]=b+Db, you considered the “flipped” vector [x;-y].

Third lecture material link and assignment

SHARCNET seminar slides

Fourth lecture material link and assignment

Fifth lecture material link and assignment

The deadline for assignment 5 is extended until Tuesday, October 31. An additional reference on C/lcc, lcc-advanced.pdf, has been posted on the optlab server in the BLAS-class directory. In particular, this reference contains a discussion of lists.

Sixth lecture material link and assignment

The second BLAS class material (lecture 7) may be found in the /BLASS-class2 folder on optlab.cas.mcmaster.ca

Eighth lecture (third lecture on BLAS)

Ninth lecture material: the article on how to speed up MATLAB by Pascal Getreuer

The “free” passes with the bonus assignment for the course have already been claimed.

Ninth assignment: speed up the MATLAB code presented in the first class, due November 28, 2006.

Tenth assignment: write a C-wrapper/MEX function for the MATLAB code improved in assignment 9. The function should take two arguments, the filename of the data file (e.g., scan2.mat) and nhd_size, and produce three output arguments: the actual shift-rotation pair ([theta_p, shift_p]), the “best fit” shift-rotation pair ([best_shift, best_theta]), and the minimum distance (norm) found. To speed up the code, you might want to switch from the SVD-based norm of two matrices to a simple 2-norm of two column vectors obtained from those matrices. Due December 2, 2006.

Tenth lecture material (MEX-files)

A link to the OpenMP tutorial, and the same tutorial zipped.

Here is sample C code illustrating OpenMP; you might find it useful (it should also demystify the loop results we saw in class).

 

A sample solution to the last two assignments is posted below: a series of .m files that gave consecutive improvements on MATLAB7/Athlon64 2.4GHz, up to a final speed-up ratio of approximately 4. The speed-up on your machine might be different (e.g., on an Intel PIII with MATLAB7, the .m file with the C-wrapper is roughly two times faster than the original .m file, with no improvement due to “padding” in one of the .m files). The files are sorted slowest to fastest (again, on Athlon64 with MATLAB7):

compeng701_demo_clean.m – the original script,

compeng701_demo_clean_mod.m, sol1.m, sol2.m – modified .m, index rotations are put in a matrix form as two separate functions,

compeng701_demo_clean_mod1.m – further modified, “if” statement is removed by “padding” the index matrix, sol1/sol2 are inlined,

compeng701_demo_clean_mex.m, engine.c – last modification with C-wrapper; note that padding does not pay off that much anymore.
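
The “padding” trick mentioned in the list above can be sketched as follows. This is a one-dimensional illustration under assumptions, not the code from the posted files: the index matrix in the course script is two-dimensional, and all function names here are hypothetical. The idea is that out-of-range indices land in a zero-filled border, so the range test inside the loop is no longer needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Branching version: indices outside [0, n) contribute nothing. */
double sum_checked(const double *a, int n, const int *idx, int m)
{
    double sum = 0.0;
    for (int k = 0; k < m; ++k)
        if (idx[k] >= 0 && idx[k] < n)
            sum += a[idx[k]];
    return sum;
}

/* Copy a into a zero-filled buffer with pad extra entries on each side.
   Relies on all-bits-zero being 0.0, as on IEEE 754 platforms.
   Returns NULL on allocation failure; the caller frees the result. */
double *make_padded(const double *a, int n, int pad)
{
    double *p = calloc((size_t)(n + 2 * pad), sizeof *p);
    if (p != NULL)
        for (int i = 0; i < n; ++i)
            p[pad + i] = a[i];
    return p;
}

/* Branch-free version: valid only if -pad <= idx[k] < n + pad for all k. */
double sum_padded(const double *padded, int pad, const int *idx, int m)
{
    double sum = 0.0;
    for (int k = 0; k < m; ++k)
        sum += padded[pad + idx[k]];
    return sum;
}
```

The padded buffer is built once, outside the search loop; the pad width has to cover the largest index excursion the search can produce.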

Finally, if you note that within the local search loop one needs to recompute the indices only when the search angle changes, while shifts may be accommodated by simple reordering of the array, one can get a fairly impressive speed-up factor of over 20; e.g., on my current machine, a Dual Intel P4 3.2GHz, I get a factor of 25. See the last set of files:

compeng701_demo_clean_mex1.m, engine1.c

 

Final exam, due Sunday, December 10, by 17:00; additional ATLAS files are here.