About the Author

Douglas Eadline, PhD, is both a practitioner and a chronicler of the Linux Cluster HPC revolution. He has worked with parallel computers since 1988 and is a co-author of the original Beowulf How To document. Prior to starting and editing the popular http://clustermonkey.net web site in 2005, he served as Editor-in-chief for ClusterWorld Magazine. He is currently Senior HPC Editor for Linux Magazine and a consultant to the HPC industry. Doug holds a Ph.D. in Analytical Chemistry from Lehigh University and has been building, deploying, and using Linux HPC clusters since 1995.

A blog about making HPC things (kind of) work

Recently, I was running the HPL benchmark on a rather small cluster. For those who don't know, HPL stands for High Performance Linpack. Specifically, from the website:

HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.

Most people have no idea what Linpack actually does, nor do they care. The only thing that matters is the number of floating point operations per second (FLOPS) that a particular machine can sustain while running it. Twice a year, computers are ranked on the Top500 list -- based solely on their HPL results.
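For the curious, here is a minimal sketch of how an HPL-style FLOPS number falls out of a run: a fixed operation count for a dense solve of size N, divided by the wall-clock time. The count used is the usual (2/3)N^3 leading term for LU factorization plus a lower-order N^2 term. The exact lower-order coefficient, the function names, and the sample numbers are assumptions for illustration, not taken from any particular run.

```python
def hpl_flop_count(n: int) -> float:
    """Approximate operation count for solving a dense n x n system via LU.

    Leading term is (2/3) n^3; the (3/2) n^2 lower-order term is an
    assumption here and barely matters for large n.
    """
    return (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2


def hpl_gflops(n: int, seconds: float) -> float:
    """Convert a problem size and wall-clock time into GFLOPS."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return hpl_flop_count(n) / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical run: N = 40,000 solved in 250 seconds.
    print(f"{hpl_gflops(40_000, 250.0):.1f} GFLOPS when running the HPL benchmark")
```

Note that the operation count depends only on N, so the reported number is really a statement about how quickly one specific dense solve finished.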

A few facts about the Top500 tend to get lost in the marketing hype. I have decided to list some of them here, in the hope that they help one appreciate the Top500 list for what it is.

  1. The Top500 competition is a great way to record and track progress in high performance computing.
  2. Users must run the HPL benchmark and submit the results to the Top500 site.
  3. Getting a good HPL number can take several days (or more) of "tweaking" the input file and program options.
  4. There are more optimized parallel solvers that often work faster than HPL.
  5. There is no measure of file system performance (I/O rates) in the HPL benchmark.
  6. The HPL benchmark is not a good predictor of how other benchmarks/applications will actually perform.
  7. The fastest machines use tens of thousands of cores and tens of TBytes of memory.
  8. Most HPC users use fewer than 64 cores and do not do any HPL-type computation.
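As a small taste of the "tweaking" mentioned in item 3, one of the first input-file choices is the problem size N. A commonly cited rule of thumb is to pick N so that the N x N matrix of 8-byte doubles fills most of the cluster's memory, then round down to a multiple of the block size. The sketch below assumes an 80% memory fraction and a block size of 192; both values, and the function name, are illustrative assumptions rather than recommendations for any real machine.

```python
import math


def suggest_problem_size(total_mem_bytes: int,
                         mem_fraction: float = 0.8,
                         block_size: int = 192) -> int:
    """Suggest an HPL problem size N (rule-of-thumb sketch).

    The N x N double precision matrix needs 8 * N^2 bytes, so N is chosen
    such that it uses about mem_fraction of total memory, then rounded
    down to a multiple of block_size.
    """
    if not 0.0 < mem_fraction <= 1.0:
        raise ValueError("mem_fraction must be in (0, 1]")
    n = int(math.sqrt(mem_fraction * total_mem_bytes / 8))
    return (n // block_size) * block_size


if __name__ == "__main__":
    # Hypothetical small cluster: 4 nodes with 16 GiB each.
    total = 4 * 16 * 2**30
    print("Suggested N:", suggest_problem_size(total))
```

This only gives a starting point; the days of tuning come from trying different block sizes, process grids, and other options around it.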

Given the above facts, one may wonder why so much emphasis is placed on the Top500 results. From a typical end user's perspective, the list has no practical value whatsoever. The small number of users who run similar codes at that scale may certainly find the results interesting, but to everyone else they have little value.

There is great value in the benchmark results, provided any claims are given proper context. For instance, "The Top500 fastest computers running the HPL benchmark" or "Achieved 200 TFLOPS running the HPL benchmark" both provide proper reference to the benchmark. Much of the value comes from the historical record and the fact that the benchmark is openly available.

Because HPL is a valid yardstick, it can also be used to report price-to-performance and performance-per-watt. Again, such claims should be qualified with "... when running the HPL benchmark."
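The arithmetic behind those two derived metrics is simple enough to sketch. The example below divides system cost and average power by a measured HPL result and attaches the qualifier to every figure it reports. The class name, fields, and all numbers are hypothetical and are not measurements of any real system.

```python
from dataclasses import dataclass


@dataclass
class HplResult:
    """A measured HPL result plus the (hypothetical) cost and power figures."""
    gflops: float           # sustained double precision GFLOPS from HPL
    system_cost_usd: float  # purchase price of the system
    avg_power_watts: float  # average power draw during the run

    def dollars_per_gflops(self) -> float:
        return self.system_cost_usd / self.gflops

    def gflops_per_watt(self) -> float:
        return self.gflops / self.avg_power_watts

    def summary(self) -> str:
        # Every claim carries the benchmark qualifier.
        qualifier = "when running the HPL benchmark (double precision)"
        return (f"{self.gflops:.0f} GFLOPS, "
                f"${self.dollars_per_gflops():.2f}/GFLOPS, "
                f"{self.gflops_per_watt():.2f} GFLOPS/Watt {qualifier}")


if __name__ == "__main__":
    # Hypothetical values chosen only to make the arithmetic easy to follow.
    result = HplResult(gflops=150.0, system_cost_usd=3000.0, avg_power_watts=500.0)
    print(result.summary())
```

Keeping the qualifier in the reporting code itself is one way to make sure the context travels with the number.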

Next time you read about FLOPS, dollars/FLOPS, or FLOPS/Watt, always look for the details. The first things to ask are: What was the benchmark? What was the level of precision? Is it reproducible by everyone? Without this information, headline-grabbing claims are of little value to anyone. By the way, I achieved 200 GFLOPS (double precision) when running the HPL benchmark on a Nexlink Limulus personal cluster.