13 October 2011
A blog about making HPC things (kind of) work
AMD has recently released their long-awaited Bulldozer processor. This processor is a completely new "ground up" design and is a departure from the existing K10 line of processor cores. Briefly, AMD has introduced a new microarchitecture building block called a "module", which consists of two tightly coupled, "conventional" x86 out-of-order processing engines (cores). Each module contains the following hardware: up to 2048 KB of L2 cache (shared between the two cores in the module); a 16 KB four-way L1 data cache per core and a 64 KB two-way L1 instruction cache per module; two dedicated integer cores; two symmetrical 128-bit FMAC (fused multiply-accumulate) floating-point pipelines that can be unified into one 256-bit-wide unit when one of the integer cores dispatches an AVX instruction; and two symmetrical x87/MMX/SSE-capable floating-point pipelines for backward compatibility with software that is not SSE2-optimized. Multiple modules share an L3 cache as well as an Advanced Dual-Channel Memory Sub-System built around an Integrated Memory Controller (IMC). You can find a more detailed description over at SemiAccurate.
There are also plenty of product reviews for the FX family of desktop processors, including the top-of-the-line FX-8150 (8 cores, 3.6 GHz, 125 W, 8 MB L2, 8 MB shared L3 cache). Some of the early benchmarks seem to indicate that the FX does not meet many of the market's expectations. There was the hope (expectation) that the FX family would "bulldoze" the Sandy Bridge processors from Intel. That does not seem to be the case. However, there are some important issues to consider. The Bulldozer design is the beginning of a new AMD generation, which means AMD plans to improve and scale it well into the future. The current K10 design, by comparison, was launched in 2007. In addition, there is much more to understand about today's processors than the traditional, popular single-threaded benchmarks.
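To make the module arithmetic concrete, here is a minimal Python sketch that rolls the per-module figures quoted above up to a whole chip. The class and function names (`BulldozerModule`, `chip_totals`) are made up for illustration, and the four-module count for the FX-8150 is an assumption inferred from its 8 cores at two cores per module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BulldozerModule:
    """Per-module figures as quoted in the post (hypothetical helper type)."""
    cores: int = 2                # two integer cores per module
    l2_kb: int = 2048             # "up to" 2048 KB, shared by both cores
    l1d_kb_per_core: int = 16     # four-way L1 data cache, one per core
    l1i_kb: int = 64              # two-way L1 instruction cache, one per module
    fmac_pipes: int = 2           # symmetrical FMAC pipelines per module
    fmac_width_bits: int = 128    # width of each FMAC pipeline


def chip_totals(modules: int, module: BulldozerModule = BulldozerModule()) -> dict:
    """Aggregate per-module resources for a chip built from `modules` modules."""
    return {
        "cores": modules * module.cores,
        "l2_mb": modules * module.l2_kb / 1024,
        "l1d_kb": modules * module.cores * module.l1d_kb_per_core,
        "l1i_kb": modules * module.l1i_kb,
        # Width available when both FMAC pipes in a module are unified for AVX.
        "unified_fp_bits_per_module": module.fmac_pipes * module.fmac_width_bits,
    }


# Assumption: the FX-8150 has 4 modules (8 cores / 2 cores per module).
fx8150 = chip_totals(modules=4)
print(fx8150)
```

Under these assumptions the totals line up with the FX-8150 figures quoted above (8 cores, 8 MB of L2), and the two 128-bit pipelines account for the 256-bit unified unit.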
Indeed, single program (thread) performance on a multi-core processor provides only one part of the overall performance picture (see Benchmarking A Multi-Core Processor For HPC). As AMD continues to roll out the Bulldozer family, I would expect to see better multi-socket performance than in the past (i.e., good scaling with 2 and 4 sockets per motherboard). Also, once compilers understand the architecture better, performance will improve. Keep in mind that one important factor with today's multi-core processors is not how fast a single core can run a program, but how well that speed scales across multiple sockets. The AMD Bulldozer architecture should deliver in this regard.
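One common way to put a number on "how well you can scale" is to compare run times as sockets are added. The sketch below is only an illustration: the function names are hypothetical, and the timings are placeholders, not measurements from any Bulldozer or other system.

```python
def speedup(t_one_socket: float, t_n_sockets: float) -> float:
    """Ratio of single-socket wall-clock time to n-socket wall-clock time."""
    return t_one_socket / t_n_sockets


def parallel_efficiency(t_one_socket: float, t_n_sockets: float, n_sockets: int) -> float:
    """Speedup divided by socket count; 1.0 means perfect linear scaling."""
    return speedup(t_one_socket, t_n_sockets) / n_sockets


# Placeholder wall-clock times in seconds, keyed by socket count.
# These values are invented for illustration and are NOT benchmark results.
timings = {1: 100.0, 2: 55.0, 4: 30.0}

for n, t in timings.items():
    s = speedup(timings[1], t)
    e = parallel_efficiency(timings[1], t, n)
    print(f"{n} socket(s): speedup {s:.2f}, efficiency {e:.0%}")
```

Plugging in your own timings for 1, 2, and 4 sockets shows at a glance whether an application keeps its efficiency as sockets are added, which is the property the paragraph above is concerned with.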