About the Author

Douglas Eadline, PhD, is both a practitioner and a chronicler of the Linux Cluster HPC revolution. He has worked with parallel computers since 1988 and is a co-author of the original Beowulf How To document. Prior to starting and editing the popular http://clustermonkey.net web site in 2005, he served as Editor-in-chief for ClusterWorld Magazine. He is currently Senior HPC Editor for Linux Magazine and a consultant to the HPC industry. Doug holds a Ph.D. in Analytical Chemistry from Lehigh University and has been building, deploying, and using Linux HPC clusters since 1995.

A blog about making HPC things (kind of) work

Recently, I was building the LAPACK (Linear Algebra PACKage) math libraries with different compilers. Of course, these libraries are somewhat obsolete, given the various optimized versions available as both academic projects and vendor packages.

I actually build the libraries as RPMs and craft my spec files to support different compilers using the RPM --with option. At a minimum, I make sure all the libraries build with the GNU compilers, which means gfortran and gcc. I usually set the optimization level at -O2 for most of these builds. Note also that LAPACK performs tests as part of the install to make sure everything is working correctly.

The LAPACK build takes about fifteen minutes on my Core2 machine. I decided to save some time and build it on a client's new Xeon server. I started the build and went to work on something else. Upon returning to the build window, I found the build process had been running a test program for quite some time. One core was running flat out for about ten minutes, but the program was not finishing. I tried again and got the same result. A lower optimization level of -O1 produced the same stuck test program spinning in place. Next, I turned off all optimizations and all the tests finished!
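A test program that spins forever is easy to miss when you have walked away from the build window. One way to catch it is to give every test a time limit. What follows is only a minimal sketch in Python, not how the LAPACK test suite itself is driven: the function name `run_test`, the status strings, and the timeout values are all my own assumptions, and the "tests" in the demo are stand-ins run by the Python interpreter rather than real LAPACK binaries.

```python
import subprocess
import sys


def run_test(cmd, timeout_s):
    """Run one test command and classify it as 'passed', 'failed', or 'hung'.

    cmd is an argument list (for example a path to a test binary);
    timeout_s is how long we are willing to wait, in seconds.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if the child is still running after timeout_s seconds.
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "passed" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-ins for real test binaries (hypothetical): one finishes
    # right away, the other spins in place like the stuck test above.
    quick = [sys.executable, "-c", "print('ok')"]
    spinner = [sys.executable, "-c", "while True: pass"]
    print("quick test:", run_test(quick, timeout_s=5))
    print("spinning test:", run_test(spinner, timeout_s=1))
```

A sensible limit depends on the machine. A generous multiple of the time a known-good build needs is one reasonable starting point, so a slow but healthy test is not reported as hung.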

There are a few lessons in this exercise. First, compilers are programs and programs can have bugs. Compiling LAPACK is about as basic as you can get, and I am surprised it is not part of the regression tests for this compiler. In any case, it is important to remember that optimization is a tricky business. Compilers are very complicated programs. They must output binary code that runs correctly on many different processors, and even within a single processor model they must account for different steppings (revisions of that model). This requirement is no small task, and it becomes more difficult when the user requests optimization. And who does not want their code optimized?

In addition, the further you push the optimization, the less safe your code may become. The resulting binary may be faster, but it may also fail with an error or silently produce wrong answers, which is why running test cases is critically important. Enabling as much optimization as possible on the first pass is a novice mistake. In some cases, a new compiler may give differing results, which may indicate a programming error. (One way to thoroughly test your code is to compile it with many different compilers and compare results and warning messages.) In other cases, errors may be due to an optimization. Any seasoned HPC programmer knows that with a new compiler, you first build your applications with no optimization and compare results; if all is well, you start adding optimizations. Lather, rinse, repeat.
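The "start with no optimization, compare, then add more" routine can be sketched as a small loop. This is an illustration under stated assumptions, not a finished tool: the names `numbers_in`, `same_results`, and `climb` are hypothetical, the tolerance of 1e-9 is an arbitrary choice (optimized floating-point code can legitimately differ in the last digits), and `build_and_run` is a function you would supply yourself to compile at a given level, run the program, and return its output as text, or None if the build or run failed or hung.

```python
import math
import re

# Matches integers, decimals, and exponent notation such as 2.5e-3.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def numbers_in(text):
    """Extract every numeric token from a program's text output."""
    return [float(tok) for tok in NUMBER.findall(text)]


def same_results(baseline, candidate, rel_tol=1e-9):
    """True if both outputs hold the same numbers to within rel_tol."""
    a, b = numbers_in(baseline), numbers_in(candidate)
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12)
        for x, y in zip(a, b)
    )


def climb(levels, build_and_run, rel_tol=1e-9):
    """Try each optimization level in order; stop at the first bad one.

    The first entry in levels is the unoptimized baseline. Returns
    (last_good_level, report); last_good_level is None when even the
    baseline fails.
    """
    baseline = build_and_run(levels[0])
    if baseline is None:
        return None, [(levels[0], "failed")]
    report = [(levels[0], "baseline")]
    last_good = levels[0]
    for level in levels[1:]:
        out = build_and_run(level)
        if out is None:
            report.append((level, "failed"))
            break
        if not same_results(baseline, out, rel_tol):
            report.append((level, "mismatch"))
            break
        report.append((level, "ok"))
        last_good = level
    return last_good, report


if __name__ == "__main__":
    # Canned outputs standing in for real builds (hypothetical data):
    # -O0 and -O1 agree, and -O2 never produces a result.
    fake = {"-O0": "sum = 1.0000000000", "-O1": "sum = 1.0000000000", "-O2": None}
    print(climb(["-O0", "-O1", "-O2", "-O3"], fake.get))
```

Stopping at the first bad level is deliberate. Once one level misbehaves, results from higher levels tell you little until that problem is understood.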

I am not making excuses for the compiler vendor; this simple problem should have been caught. Still, it is important to remember that compilers are not super programs. They are mortal, just like your programs.