About the Author

Douglas Eadline, PhD, is both a practitioner and a chronicler of the Linux Cluster HPC revolution. He has worked with parallel computers since 1988 and is a co-author of the original Beowulf How To document. Prior to starting and editing the popular http://clustermonkey.net web site in 2005, he served as Editor-in-chief for ClusterWorld Magazine. He is currently Senior HPC Editor for Linux Magazine and a consultant to the HPC industry. Doug holds a Ph.D. in Analytical Chemistry from Lehigh University and has been building, deploying, and using Linux HPC clusters since 1995.

High Performance Computing

A blog about making HPC things (kind of) work

I plan on benchmarking several low cost quad-core processors in the coming weeks. I'm trying to decide what to use in my Limulus Project upgrade. Currently, it uses Core 2 processors (three dual-core and one quad-core), and while it works quite well, I want to see just how much compute power I can put in a desk-side case. I'll be testing the following:

  • Intel Core2 Quad-core Q6600 running at 2.4GHz (current system)
  • AMD Phenom II X4 quad-core 910e running at 2.6GHz
  • Intel Core i5 Quad-core i5-2400S running at 2.5 GHz

Note that the two new processors (the 910e and the i5-2400S) are 65W parts, which are necessary for the Limulus design.

When I test processors, I tend to use HPC tests. I do not use HPL (High Performance Linpack) because it can require a lot of tuning to get a good number. I prefer to use standard GNU compilers and the NAS Parallel Benchmark Suite. I also like to use Gromacs. This approach may not show the best possible performance for a given platform, but it allows me to compare "apples to apples" as closely as possible. I can also run these tests on a single core, a multi-core CPU, or a cluster.

When multi-core processors first appeared, I wanted to know the answer to a simple question. If a program runs on a single core in X number of seconds, then Y copies should run in the same amount of time, provided Y is less than or equal to the number of cores and there is perfect memory sharing (i.e. there is no memory contention). If it takes the collection of copies longer to run (than a single copy), then the number of "effective cores" is reduced. Surprisingly, I have not found anyone else that runs these types of tests.
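One way to write the idea down, assuming the per-copy ratio that the script further down sums with dc (the symbols $T_1$, $T_i$, and $N_{\mathrm{eff}}$ are introduced here only for illustration):

```latex
N_{\mathrm{eff}} = \sum_{i=1}^{Y} \frac{T_1}{T_i}
```

Here $T_1$ is the run time of a single copy on an otherwise idle machine and $T_i$ is the run time of copy $i$ when $Y$ copies run at once. With no memory contention every $T_i$ equals $T_1$ and $N_{\mathrm{eff}} = Y$; any slowdown of the concurrent copies pulls the sum below $Y$.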

The test is simple enough to run and can be easily scripted. An example script and a link to a complete set of scripts can be found below. To make the test useful, I use the NAS suite compiled for a single core. The NAS suite is a set of eight kernels that represent different aerodynamic application types. Each kernel is self checking and offers a different memory access pattern.

I call the script an "Effective Core" test. As I mentioned, it shows how many cores the application actually "sees." A few months back I was given a dual socket server with two Intel six-core Xeons to test (two Xeon X5670s running at 2.93GHz). There were a total of 12 cores available on a single node. The first thing I did was run my benchmark scripts to see how many effective cores I could see. I ran the tests using 2, 4, 8, and 12 copies of the NAS kernels (test size A). The results are below.

test   2 copies   4 copies   8 copies   12 copies
cg     2.0        3.7        5.7        6.6
bt     2.0        3.4        4.6        4.9
ep     2.0        4.0        8.4        11.7
ft     2.0        3.8        7.0        9.0
lu     2.0        3.9        6.4        6.1
is     2.0        3.9        7.8        11.2
sp     2.0        3.5        5.0        5.4
mg     2.0        3.8        6.3        6.6

Effective Cores for NAS Parallel Kernels

Things seem to go well until eight copies are run at the same time. At this point we find two interesting things. First, not all tests "see" all the cores; one test, bt, sees only 4.6 cores. Second, another test, ep, sees more than eight cores! The reduction in performance can be attributed to memory contention, and the increase is probably due to cache effects (test ep is CPU bound). Looking at the results for 12 copies, we see some of the stark reality for multi-core.

Five tests see seven or fewer "effective cores," which is an efficiency of less than 60%. The worst, bt, only achieves 4.9 effective cores, an efficiency of about 40%. The others that seem to scale well did so throughout all the tests. Again, the winner, ep, is CPU bound, so memory bandwidth will have little effect.
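The efficiency figures quoted here are just effective cores divided by the number of copies. A small sketch of that arithmetic, using the 12-copy column from the table (the `efficiency` helper is a name made up for this example):

```shell
#!/bin/bash
# efficiency EFFECTIVE_CORES COPIES -> prints the percentage to one decimal place
# (hypothetical helper, not part of the original test scripts)
efficiency() {
    awk -v e="$1" -v n="$2" 'BEGIN { printf "%.1f\n", 100 * e / n }'
}

# 12-copy results taken from the table above
while read -r test eff; do
    echo "$test $(efficiency "$eff" 12)%"
done <<'EOF'
cg 6.6
bt 4.9
ep 11.7
ft 9.0
lu 6.1
is 11.2
sp 5.4
mg 6.6
EOF
```

Anything well under 100% on a kernel you care about is a hint that adding copies on the same node buys little.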

In all fairness, this is probably the worst case scenario for this multi-core system (i.e. 12 copies of the same program running in the same way). However, these are real program kernels that are not contrived tests. If I were to run 12 copies on 12 separate servers, then they would all scale to 12 effective cores. Keep this in mind when placing parallel codes on multi-core clusters.

You can download the test scripts that will work on 2, 4, 8, 12, and 16 cores. (Note: If I spent more time, I suppose I could make a single script that would use a command line argument, but I'm both lazy and short of time, plus I don't run these scripts all that often.) The following is an example of the script for the four-way test.

#!/bin/bash
# Effective-core test: time one copy of each NAS kernel, then four copies
# at once, and sum the ratios (single-copy time / concurrent-copy time).
PROGS="cg.A.1 bt.A.1 ep.A.1 ft.A.1 lu.A.1 is.A.1 sp.A.1 mg.A.1"
NPBPATH="../npb/"
echo "4 Way SMP Memory Test" |tee "smp-mem-test-4.out"
echo "`date`" |tee -a "smp-mem-test-4.out"
# if needed, generate single cpu codes; change -c for a different compiler
# just check for last program
if [ ! -e "$NPBPATH/bin/mg.A.1" ];
then
  pushd $NPBPATH
  ./run_suite -n 1 -t A -m dummy -c gnu4 -o
  popd
fi
for TEST in $PROGS
do
  # baseline: a single copy running alone
  $NPBPATH/bin/$TEST >& temp.mem0
  # four copies at once: three in the background, one in the foreground
  $NPBPATH/bin/$TEST >& temp.mem1 &
  $NPBPATH/bin/$TEST >& temp.mem2 &
  $NPBPATH/bin/$TEST >& temp.mem3 &
  $NPBPATH/bin/$TEST >& temp.mem4
  wait
  # pull the reported run time out of each output file
  S=`grep Time temp.mem0 |gawk '{print $5}'`
  C1=`grep Time temp.mem1 |gawk '{print $5}'`
  C2=`grep Time temp.mem2 |gawk '{print $5}'`
  C3=`grep Time temp.mem3 |gawk '{print $5}'`
  C4=`grep Time temp.mem4 |gawk '{print $5}'`
  # effective cores: S/C1 + S/C2 + S/C3 + S/C4, to three decimal places
  SPEEDUP=`echo "3 k $S $C1 / $S $C2 / $S $C3 / $S $C4 /  + + + p" | dc`
  echo "4 Way SMP Program Speed-up for $TEST is $SPEEDUP" |tee -a "smp-mem-test-4.out"
done
/bin/rm temp.mem*
echo "`date`" |tee -a "smp-mem-test-4.out"

The script can be easily modified for other programs. If you want to use the NAS suite, you may find it helpful to download the Beowulf Performance Suite, which includes the run_suite script used above to build the NAS binaries.
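For what it is worth, the single script with a command line argument mentioned above might look something like the sketch below. It is not one of the downloadable scripts; the function names (`effective_cores`, `npb_time`, `run_test`) and the script name in the usage line are invented for this example, and it assumes the program prints a "Time in seconds = X" line the way the NAS kernels do.

```shell
#!/bin/bash
# Sketch of an N-way effective-core test driven by command line arguments.
# Hypothetical usage: ./smp-mem-test.sh 8 ../npb/bin/cg.A.1

# effective_cores SINGLE T1 T2 ... -> prints the sum of SINGLE/Ti to one decimal
effective_cores() {
    local single=$1
    shift
    printf '%s\n' "$@" |
        awk -v s="$single" '{ sum += s / $1 } END { printf "%.1f\n", sum }'
}

# npb_time FILE -> prints the run time found in saved program output
# (assumes a "Time in seconds = X" line, the same field the grep/gawk above reads)
npb_time() {
    grep Time "$1" | awk '{ print $5 }'
}

# run_test N CMD [ARGS...] -> runs CMD once alone, then N copies at once,
# and prints the effective core count
run_test() {
    local n=$1 i tmp times=()
    shift
    tmp=$(mktemp -d)
    "$@" > "$tmp/single" 2>&1
    for ((i = 1; i <= n; i++)); do
        "$@" > "$tmp/copy$i" 2>&1 &
    done
    wait
    for ((i = 1; i <= n; i++)); do
        times+=("$(npb_time "$tmp/copy$i")")
    done
    effective_cores "$(npb_time "$tmp/single")" "${times[@]}"
    rm -rf "$tmp"
}

# only run when a copy count and a program are given on the command line
if [ "$#" -ge 2 ]; then
    run_test "$@"
fi
```

Unlike the four-way script, this version starts all N copies in the background and waits for them, which should give the same measurement while avoiding a separate file per core count.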

A blog about making HPC things (kind of) work

This week I continue my experiments with my new SSD (Solid State Drive). I don't want to give away the whole story, but I may find it hard to go back to spinning disks. Before I get to all the good news, let's talk about the firmware upgrade issue.

If you recall, I was using a 64GB ADATA S599 SSD. ADATA recently released a firmware upgrade and I read it may help performance (it is supposed to address some other issues as well). The upgrade can only be done with a Windows host. I have a laptop with Windows XP, but no desktop systems. Not a problem, I thought, I'll head over to my friend's house and convince him to open his PC and allow me to connect my drive and flash the firmware. I assumed a 20 minute exercise. Three hours later, the firmware was still not upgraded. The short story is you have to bend over backwards and hope the moon is in the right phase to get the firmware to upgrade. It seemed to almost work, but then it failed. The update software tool, from SandForce I am told, is really poor. It gives very little information about what it is doing and about any problems it encounters. I was not a happy camper, but decided to just use the drive as is and not muck about changing the BIOS and registry of my friend's computer to upgrade the firmware on my SSD. This is clearly an unacceptable situation and one that I will investigate before I buy or recommend another drive.

Once I got the drive back in its home system, I decided to partition and format using ext4 so I could use the TRIM feature (TRIM allows the drive to get hints from the OS about deleted blocks of data). I was running Scientific Linux 5.4 (SL5.4) and assumed I had to update my kernel and tools (hdparm and fdisk) to use TRIM. It started getting involved, so I decided to use the just released SL6.0. Before I installed the OS, I consulted Ted Ts'o's blog for some advice on setting up the drive. I also found my friend Jeff Layton's video useful.

I created the partitions (there are only two) and formatted them as ext4 when I installed SL6.0. The install went fast and rebooting was very quick. I did a quick hdparm -t --direct /dev/sda and got 249 MB/sec, which was better than my initial naive installation test (207 MB/sec).

Next it was time for some real testing. I used Bonnie++ v1.01c with a 4GByte file size (system memory size is 2GByte, so there are no cache effects). The following table shows the results for a Soft RAID1 set, a single SATA drive, and the SSD. Processor loads are shown as (X%) next to each result.

Drive                                                             Write (KB/sec)   Re-Write (KB/sec)   Read (KB/sec)   Random Seeks (per sec)
Two Seagate ST3500630AS, 500G, SATA3, 7200 RPM, Soft RAID1, XFS   49710 (6%)       21401 (3%)          57859 (5%)      370.4 (1%)
Seagate ST3250310AS, 250G, SATA3, 7200 RPM, ext3                  74507 (22%)      33087 (7%)          80636 (6%)      176.8 (0%)
ADATA S599, 64G, SATA3, SSD, SandForce 1222, ext4                 251014 (46%)     109060 (19%)        253908 (23%)    1343 (2%)

Impressive. Compared to the single SATA drive, the SSD was 3.1 to 3.4 times faster in the write, re-write, and read tests, and 7.6 times faster in the random seeks, which makes sense because all seeks take the same time on an SSD. I was puzzled by the high processor utilization, however (maybe ext4). Of course, the key is whether the SSD can maintain this performance as it is used (and degrades). TRIM will help here.
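The speed-up figures are simple ratios of the Bonnie++ numbers in the table, SSD over the single SATA drive. A quick way to reproduce them (the `ratio` helper is a name made up for this example):

```shell
#!/bin/bash
# ratio A B -> prints A/B to one decimal place (hypothetical helper)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Bonnie++ figures from the table above: ADATA S599 vs. the single Seagate drive
echo "write:    $(ratio 251014 74507)x"
echo "re-write: $(ratio 109060 33087)x"
echo "read:     $(ratio 253908 80636)x"
echo "seeks:    $(ratio 1343 176.8)x"
```

Swapping in the RAID1 row gives the comparison against the mirrored pair instead.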

Finally, I think the best usage case for my SSD is to place all the system files on the SSD and use the RAID1 partition for the /home directories. SSDs are still a bit too pricey for use with large file systems, but the attraction is there. I still have some lingering questions. How do I know if TRIM is working? Why is the processor utilization so high? And what about the firmware update? I'll have more in the future.

A blog about making HPC things (kind of) work

Last fall, I decided to purchase an SSD (Solid State Disk). The prices have come down and it makes sense to at least try one of these devices. SSD technology is interesting because every time you write to the drive, it degrades. Of course the same can be said about spinning platters, but that is to be expected; after all, they are mechanical devices. Like the name says, the SSD is solid state and, like other solid state devices, the general rule of thumb is, "if it works for the first two weeks, it will probably work for 10+ years." I have found this to be the case with most electronic equipment (provided it is not overheated). This degradation is planned and understood by the manufacturers. SSDs are designed to work well into the future with average use (50+ years).

An interesting aside about SSDs: when they stop working, they essentially become read-only devices. That is, they cannot be scrubbed; they must be physically destroyed to remove the data. Keep this in mind in the future when you toss that old SSD with all your personal information on it.

Getting back to my new device. Even though I bought it several months ago, I have just now started using my new 64GB ADATA S599. Before I bought it, I researched various SSD drives and found that the biggest factor in performance is the controller. The S599 uses a SandForce controller and the reviews showed the ADATA provided great performance for the price. After I bought it, the adapter tray that came with it did not have the right holes to work with my removable drive tray (XClio SS034). I was using my Limulus personal cluster and had no other place to install the drive.

I set the drive aside until after the holidays, and then found a bracket that could hold an SSD and slim DVD drive in a single drive bay. I replaced my standard DVD with this unit and now I had my SSD securely mounted in the chassis.

I then went about installing Scientific Linux on the drive. It went very smoothly (and quickly). When I did a few quick tests with hdparm, I found it more than doubled the direct read speed over the existing spinning rust drives (207 MB/sec vs 98 MB/sec, hdparm -t --direct /dev/sdX). So far so good.

I was making progress, but then things got more complicated. ADATA issued a firmware update and then I read about "erase block size," an updated fdisk, and TRIM support for Linux. The drive was working, but I wanted to make sure it was optimized and working as well as it could under Linux. I'll get to those issues next week.

A blog about making HPC things (kind of) work

I'm in the middle of solving another cluster problem. I won't mention any names or vendors, although at times I think some healthy shaming may be in order. In any case, the problem started out simple enough. I was asked to make a small change to some queues on a Torque/Maui cluster. The change had to do with twenty new Nehalem nodes they had purchased last year.

The nodes were working fine up until this point, or more precisely they were working for what was asked of them. Splitting the nodes into two queues caused the nodes to be used in a different fashion and thus exacerbated a problem that had been lurking in the system.

I'll save the details other than to say it was traced to a faulty GigE switch. Faulty may not be the best word, because I am not sure if the switch is broken or has a firmware issue. It turns out the switch in question consists of two 48 port switches that are "stacked" to look like one switch. The slave switch was recently replaced due to spontaneous reboots. The current problem with the switch is the inability of some nodes to contact other nodes. This issue is repeatable and isolated to a few ports on the switch. The first logical (and easy) thing to do is to update the somewhat dated firmware and see if it helps.

Here is the catch-22. The cluster vendor will only support the cluster with the old firmware because the new firmware has not been "qualified" yet. But, when the vendor is told about the problem, they gladly provide a new switch with new firmware that must be downgraded to the old firmware so that the support contract can remain in place. And, the person that is sent to install the new switch "does not do firmware." The customer who purchased a support contract now has to downgrade the switch so it will "stack" with the existing switch.

The current problem might be solved with a firmware update. The vendor does not seem interested in fixing the "whole" problem other than sending incompatible parts. As I see it, shabby support requires that old firmware be used, which requires the end user to do their own support and troubleshooting, which means they essentially get no support. Make sense? Not to me.

I'm not interested in telling war stories or discussing how to fix cluster issues. There is a higher lesson here. What many vendors forget is clusters are a "system" not a pile of servers, switches, and cables. Many vendors treat them as individual parts and have no clue how to support the "whole system." They proudly boast of selling clusters, but what they are selling are connected islands of hardware each with individual support. It is rare that a vendor takes responsibility for the whole system.

To be fair, clusters can be complicated and custom systems (much like storage networks). There are vendors, usually not the large vendors, who understand this reality. They design, build, and support clusters as complete systems. These integrator-vendors usually take some responsibility for both hardware and software.

What surprises me the most about the above situation is that it is generally the rule and not the exception. I have been involved with clusters since the mid-90s, and today's situation reminds me of the early years when everyone was still trying to figure out how to build these things. At least there was an excuse back then. The only piece of advice I can give is this: when buying a "cluster," ask the vendor for the name and phone number of the person (or group) who is going to help solve software and hardware issues that involve multiple components from multiple vendors. Most vendors hold up a mirror at this point. Keep talking to the ones who don't.

A blog about making HPC things (kind of) work
As mentioned previously, I have been planning to benchmark some new low power AMD and Intel quad-cores. My motivation is the Limulus Project. Even though these processors are low-power desktop devices, they work similarly to their big brother HPC versions. When Intel or AMD make a processor line, "the guts" are basically the same. The marketing department slices and dices some of the features, configuring products to reflect the market in which they will be sold. Servers need features that consumer desktops do not (and vice versa). Some of these options make sense from a technical standpoint, while others are just marketing gymnastics.

The underlying architecture (the guts) is the same, however. Generational advances (like memory architecture) travel through the whole product line. The high end server processor may work 20% faster than the lower end desktop processors, but the cost is usually much more (almost always more than 20%). Thus, by using lower cost desktop parts, a large percentage of the "high end" performance can be had at a low end price. The Limulus project is designed to take advantage of this marketing trend. And, to be clear, the Limulus project is not a replacement for a large server based cluster.

Back to the issue at hand, however. I just finished testing the following desktop processors:

  • Intel Core2 Quad-core Q6600 running at 2.4GHz (Kentsfield)
  • AMD Phenom II X4 quad-core 910e running at 2.6GHz (Deneb)
  • Intel Core i5 Quad-core i5-2400S running at 2.5 GHz (Sandy Bridge)

Note that the 910e and i5-2400S are 65W processors, which are necessary for the Limulus design. Each is in a single socket desktop motherboard. I described my testing script and procedure in a previous post. I ran my effective processor script on all three systems and recorded the results in the following table.

Test   AMD 905e   Intel i5-2400S   Intel Q6600
cg     3.4        3.1              2.0
bt     2.1        2.0              1.6
ep     4.0        4.0              4.0
ft     3.7        3.6              2.7
lu     2.3        2.5              2.1
is     3.9        4.0              3.3
sp     2.4        2.1              1.5
mg     2.3        2.4              1.8

Effective cores for NAS Parallel Kernels

The results show a clear improvement over the Q6600 by both newer processors (due to better memory architecture). Interestingly, the 905e and the i5-2400S show about the same number of effective processors, suggesting that the memory designs (each has two memory channels) are similar. Of course, the server versions of these processors have more memory channels, better performance, and a higher cost. In the future, I'll be running more tests with real applications.

A blog about making HPC things (kind of) work

One of the issues facing the HPC market/community is the lack of good system administrators for clusters. Many believe this issue holds back the market and I have to agree. I don't think, however, that being a Linux cluster administrator is all that much different than any other Linux systems administrator. There are some new and different things to learn, but most of what you need is common knowledge (i.e. there is no secret cluster admin cabal).

In the past I have taught classes on cluster administration and found that those with Linux/Unix experience often have little trouble handling the concepts and ideas. In addition, I am often asked what skills are needed to be a good cluster administrator. What follows is not a complete list for sure, but it is a start. The list begins with general topics and as you proceed toward the end the topics become more cluster specific.

  • Basic Skills - There are some basic skills that any Linux/Unix administrator needs to have mastered. These include the use of a command line text editor (sorry, no mice and menus allowed in the trenches). Bash scripting is essential. Perl is helpful, but bash is what drives a lot of things on a cluster. In addition, you need to understand basic Linux concepts such as booting, mounting file systems, kernel modules and tools, performance monitoring, and SMP concepts. A good understanding of x86 server hardware helps as well.
  • RPM/YUM or Deb packaging - Depending on your distribution, understanding Linux package management is essential. Not only do you need to know how to install packages, but querying package contents, updating, and other package mojo is very important. Being able to build RPMs is nice, but not necessary. I'll have more on this at another time.
  • Compilers - Most cluster experts (and administrators) have a good understanding of compilers and building code. Understanding that the long stream of error messages can be due to missing libraries (and easily fixed) prevents the sense of overwhelm that comes with trying to build that new software package. And, it makes you look like a genius to your users.
  • Networking - Networking is perhaps the toughest area to find good information. In many other market sectors, non-optimal network performance works quite well for just browsing the web or transferring files. Clusters need the fastest networks possible. High end networks have been even more obscure with the use of "user space" or "zero-copy" protocols. The market is focused on either 10 GigE or InfiniBand solutions and most drivers are part of the kernel. Also GigE is still a real alternative in some cases.
  • Cluster Provisioning - When a cluster node boots, it needs to come up in a predictable and manageable fashion. There are many packages out there that provide help with this task. Most cluster tools and provisioning packages use standard Linux/Unix concepts to achieve manageable systems.
  • Schedulers - Resource scheduling has been around ever since people started sharing computers. The basic concept is to allow multiple users the ability to share the cluster. While the issue of resource scheduling can get quite involved, the basic concepts are not too hard to grasp. One should also know that no matter how hard you work to optimize your scheduling system, there will still be complaints.
  • Message Passing Interface (MPI) Libraries and OpenMP - MPI has been around since before clusters hit the big time, and there are numerous books and classes on the topic. MPI is basically a software library that allows processes to exchange data (on the same or different machines). It is supported on all popular (and even unpopular) networks. OpenMP is implemented by the compiler and uses source code directives to create threaded programs for a single SMP server.

There you have it. If you work with HPC clusters, you bump into these issues most of the time. There is ample and freely available documentation (and software) on all of these topics. There are even cluster courses if you can find them. Of course, there are some exclusive cluster issues which deal with parallel computing, but a good grasp of the above creates a solid foundation and is enough to get you on your way to becoming an HPC cluster maven.

A blog about making HPC things (kind of) work

For those that don't know, I have been working on the Limulus Project for quite a while. The goal of the project is to create and maintain an open specification and software stack for a personal workstation cluster. Ideally, a user should be able to build or purchase a small personal workstation cluster using the Limulus reference design, low cost hardware, and open software. The idea started in 2005 when Jeff Layton and I asked, "how much computing power can you buy for $2500?".

Today that question is still open to interpretation. The initial 8-node cluster we built performed quite well and offered an outstanding price-to-performance ratio -- wire racks and all. We managed to get 14.5 GFLOPS running HPL (remember this was 2005 and our budget was $2500). I am in the process of building a new 4-node system using Intel's new Sandy Bridge processors. It will have 16 cores. BTW, I also developed a single case design for this new cluster.

When I started building my cheapskate clusters, there were no multi-core processors and most cluster nodes were dual socket (two single core processors). Today it is almost impossible to buy a single core x86 processor that is not designed for low power applications. It is possible, however, to buy a 16-core (or more) desk-side SMP system (i.e. a single motherboard with 16+ cores). This type of system has the advantage of a single OS image and shared memory programming.

The question I wonder about is how well applications run on such a "core heavy" box. As my previous tests indicate, depending on the workload, you may not always "see" all the cores. In some cases, you may be surprised how little speed-up you achieve on these SMP systems. The culprit, of course, is memory contention.

What about my "personal cluster"? In my new Sandy Bridge system, each of the four nodes will have four cores sharing the local memory. If my applications are not network limited (I use Gigabit Ethernet), then I should be able to get better memory bandwidth on parallel applications than on a typical SMP node. Of course, I'll want to test this assumption. I'll post results as I get them. Stay tuned.

A blog about making HPC things (kind of) work

AMD has recently released their long awaited Bulldozer processor. This processor is a completely new "ground up" design and is a departure from the existing K10 line of processor cores.

Briefly, AMD has introduced a new microarchitecture building block called a "module," which consists of two tightly coupled, "conventional" x86 out-of-order processing engines (cores). Each module has the following independent hardware:

  • (up to) 2048 KB L2 cache per module (shared between the cores in a module)
  • a 16 KB four-way L1 data cache per core and a two-way 64 KB L1 instruction cache per module
  • two dedicated integer cores
  • two symmetrical 128-bit FMAC (fused multiply–add capability) floating-point pipelines per module that can be unified into one large 256-bit-wide unit if one of the integer cores dispatches an AVX instruction
  • two symmetrical x87/MMX/SSE capable FPPs for backward compatibility with SSE2 non-optimized software

Multiple modules share an L3 cache as well as an Advanced Dual-Channel Memory Sub-System (IMC - Integrated Memory Controller).

You can find a more detailed description over at SemiAccurate. There are also plenty of product reviews for the FX family of desktop processors, including the top of the chart FX-8150 (8 cores, 3.6 GHz, 125W, 8MB L2, 8MB Shared L3 Cache). Some of the early benchmarks seem to indicate that the FX does not meet many of the market expectations. There was the hope (expectation) that the FX Family would "bulldoze" the Sandy Bridge processors from Intel. That does not seem to be the case. However, there are some important issues to consider.

The Bulldozer design is the beginning of a new AMD generation, which means they plan on improving and scaling this well into the future. The current K10 design was launched in 2007. In addition, there is much more to understand about today's processors than the traditional popular single threaded benchmarks. Indeed, single program (thread) performance on a multi-core processor provides only one part of the overall performance picture (See Benchmarking A Multi-Core Processor For HPC). As AMD continues to roll out the Bulldozer family, I would expect to see better multi-socket performance than in the past (i.e. good scaling with 2 and 4 sockets per motherboard). Also, once compilers understand the architecture better, performance will improve.

Keep in mind, one important factor with today's multi-core processors is not how fast a single core can run a program, but how well you can scale that speed across multiple sockets. The AMD Bulldozer architecture should deliver in this regard.

A blog about making HPC things (kind of) work

My first exposure to Linux was the Slackware distribution back in the early 1990s. I don't remember exactly how it was installed; all I remember is that there were a bunch of binary and source files on a CD. By today's standard, the number of packages was quite small. I remember that the general advice for upgrading was to save all your data and reinstall the new version. For this reason, I never installed another Slackware distribution, although I was very happy with my first version of Linux!

I then decided to try this "Red Hat" version because they had this management "thing" called Red Hat Package Manager or RPM. It definitely made life easier. I could upgrade packages and even whole distributions without having to reinstall everything. I'm convinced RPM (and dpkg from Debian) accelerated the growth of Linux by helping to manage all the various component software packages.

One of the key features of RPM was the inclusion of dependencies. Often considered a pain, package dependency ensured that everything would work together -- there would be no library incompatibilities like the DLL problems found with Windows. This "feature" actually allowed entire working distributions to be created and maintained. That is, libraries and languages could be built against a consistent environment that then was used to build applications. In the end everything worked even when adding or updating packages. In a way, RPM dependency trees create a "non-portability" between distributions because an RPM from one major version is not guaranteed to work with a newer version. As a user, you could always download the source code and build packages, but you gave up the RPM management aspects that make life easier (more on that in a moment).

As Linux distributions grew in breadth and depth, so did the dependencies between packages. This situation created a real problem for RPM based systems. Users often found that when installing an RPM for a particular application it needed other RPMs (often libraries). Fair enough, start installing the dependencies, but then they need packages that you don't have installed or are out of date. You are now traversing the RPM dependency tree, which is no fun. To solve this problem Yellowdog Updater, Modified (YUM) was created. Don't worry about the name, just say "yum." Using YUM it is possible to install an RPM and all the dependent packages with one command. Of course, you have to tell YUM where to find the "YUM ready" repositories on the web for your particular distribution. Once YUM finds the repositories, it will work out the dependencies and install the needed RPMs from the repository. It all works rather well. By the way, Debian did this first.

RPM and YUM have made distribution management very easy and have reduced headaches. (Although to be fair, some argue that they increase headaches!) In addition, I find making my own cluster RPMs makes managing clusters easier as well.

As an example, consider MPI libraries. Some distributions include these as standard RPMs. There are two reasons I don't use these and build my own RPMs. First, I often want the latest version, which is usually not available. Second, I want to control where the libraries are placed and how they are configured. Most clusters want at least MPICH2 and Open-MPI installed (and often others). These packages need to be integrated into the environment so that they don't interfere with one another.

Because I work with a bunch of different clusters and I like consistency, building my own RPMs lets me install and manage things more easily. In addition, using RPM is a great way to document and remember how a package is installed. I can include configuration (installation and de-installation) scripts directly in the RPM as well as comments and notes about the package. In addition to MPI libraries, I have batch schedulers, monitors, libraries, etc. that are all managed via RPMs. This convenience comes at a price because it takes time to create a good RPM, but once built I can easily update it for new versions of the underlying package. I'll have more about managing various package versions next time.

A blog about making HPC things (kind of) work

As recently reported by bit-tech, a survey has found that many people have experienced data loss from solid state storage. Could this be a problem?

The survey was put together by Kroll Ontrack and included responses from 560 people. Interestingly, 57% said they had experienced data loss when using SSD/flash technology while 75% also considered the recovery of data from SSD/flash to be nearly impossible or complicated when compared to the techniques used to extract data from broken or damaged hard disks. Then in what seems quite contrary, 75% of respondents believed that SSD/flash is a safer, more robust storage technology. Finally, over 90% of respondents said they perceived SSD/flash technology to be reliable. There was no breakdown between SSD's and flash devices. I would assume thumbdrives have more issues than SSD's due to the way they are used.

Where to begin. Okay people, it is 2011; if you lose data, it is your fault. Stuff breaks. There are a multitude of cheap and simple ways to back up TBytes of data. My conclusion: the 57% of the people who took the survey and lost data deserved to lose it. Second, how do 75% of the respondents know how hard it is to recover data from an SSD or flash drive? Have three out of four people you know lost data on an SSD or flash drive and tried to recover it?

Now comes the totally illogical part. Of the same bunch of imbeciles that lost data on an SSD or flash drive, a full 90% perceived the technology to be reliable. That is like saying, "We had a picnic in the middle of the freeway, six of our ten guests got hit by a car, but almost everyone thinks this is a safe activity." Where is my clue stick?

The survey seems absurd until you consider the source: a data recovery firm that promises to get your data back. What a surprise. That is like an ambulance company telling you to have a picnic in the middle of the road; sure it is dangerous, but we are there to help (and get paid) when things go wrong. I'm not arguing SSDs do not break. They do. I had to get my SSD replaced because it would randomly disappear until the system was cold booted. But bogus data loss and failure rates are just Fear, Uncertainty, and Doubt (FUD) used to drive business. If there is anything the Internet has taught me, the more unbelievable a story headline reads, the more it is.