Detailed Platform Analysis in RightMark Memory Analyzer. Part 14: 65nm AMD Athlon 64 X2 Energy Efficient Processors
We have not revisited the AMD K8 microarchitecture since the first article in this series, over three and a half years ago. The reason is simple: all AMD K8 revisions, from later single-core and dual-core processors to the new AM2 platform with DDR2 memory, have had practically the same microarchitectural characteristics. Of course, we couldn't miss the new integrated DDR2 memory controller that replaced the DDR memory controller in dual-core processors (AM2 platform) with Core Revision F. But until now there has been no cause for a detailed analysis of modern AMD K8 processors.

That changed with the results of our recent shootout of Athlon 64 X2 Energy Efficient processors and their "regular" counterparts. In that shootout, the 65nm Athlon 64 X2 4800+ processor was significantly outperformed in most tests by regular counterparts with equal or even lower model numbers. To understand which microarchitectural differences of the energy efficient processors could produce such results, we decided to run a low-level comparison of a "regular" 90nm Athlon 64 X2 5200+ (Windsor core) and a 65nm Athlon 64 X2 4800+ EE (Brisbane core) in RightMark Memory Analyzer.
Testbed configuration
- Processor 1: AMD Athlon 64 X2 5200+ (2.6 GHz, CPUID 40F32h, Windsor rev. F2, 90nm)
- Processor 2: AMD Athlon 64 X2 4800+ EE (2.5 GHz, CPUID 60FB1h, Brisbane rev. G1, 65nm)
- Chipset: NVIDIA nForce 590 SLI
- Motherboard: ASUS CROSSHAIR, BIOS 0702 dated 20.06.2007
- Memory: 2×1 GB Corsair XMS2-6400 DDR2-800, 5-5-5-18
Real Bandwidth of Data Cache/Memory
We'll start with the tests of real L1/L2 D-Cache and RAM bandwidth.

Picture 1. Average real throughput of Data Cache and RAM, Athlon 64 X2 EE
Test results of the Athlon 64 X2 EE are shown in Picture 1. Qualitatively, they are identical to the results of the Athlon 64 X2, except for the L2 Cache size (1024 KB in the Athlon 64 X2 versus 512 KB in the Athlon 64 X2 EE).
Table 1
| Level | Athlon 64 X2, average real bandwidth (bytes/cycle) | Athlon 64 X2 EE, average real bandwidth (bytes/cycle) |
|---|---|---|
| L1, reading, MMX | 15.68 | 15.68 |
| L1, reading, SSE2 | 8.00 | 8.00 |
| L1, writing, MMX | 8.00 | 8.00 |
| L1, writing, SSE2 | 8.00 | 8.00 |
| L2, reading, MMX | 4.10 | 3.14 |
| L2, reading, SSE2 | 4.02 | 3.14 |
| L2, writing, MMX | 3.94 | 3.06 |
| L2, writing, SSE2 | 3.92 | 3.01 |
| RAM*, reading (SSE2) | 3.89 GB/s (32.7%) | 3.15 GB/s (27.5%) |
| RAM, writing (SSE2) | 3.27 GB/s (27.5%) | 2.80 GB/s (24.5%) |
*Values relative to the theoretical bandwidth limit of the memory bus are in parentheses
Quantitative characteristics of real D-Cache/RAM bandwidth are published in Table 1. L1 D-Cache characteristics of both processors are identical in all cases (reading and writing data with MMX and SSE2 registers). Differences between the processors begin to appear in L2 Cache, where the Athlon 64 X2 EE is outperformed: its real bandwidth is lower both for reading (3.14 versus 4.10 bytes/cycle) and for writing (3.06 versus 3.94 bytes/cycle), by roughly 22-23% in both cases. The Athlon 64 X2 EE is also slower at accessing data in memory. Even if we take into account that the real memory bus frequency of the Athlon 64 X2 5200+ is approximately 371 MHz (theoretical bandwidth of 11.87 GB/s), while that of the Athlon 64 X2 4800+ EE is a tad lower at 357 MHz (theoretical bandwidth of 11.42 GB/s), relative memory bandwidth values of the latter are still lower: 27.5% for reading (versus 32.7%) and 24.5% for writing (versus 27.5%), which is roughly 11-16% below the Athlon 64 X2 level.
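The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the dual-channel 64-bit DDR2 configuration is inferred from the quoted theoretical limits rather than stated in the testbed list, and the integer memory divider (CPU clock divided down toward a 400 MHz DDR2-800 target) is our assumption about why the real memory clocks are 371 and 357 MHz.

```python
import math


def memory_clock_mhz(cpu_mhz, target_mem_mhz=400.0):
    """Memory clock obtained from the CPU clock via an integer divider.

    Assumption: the divider is the smallest integer that keeps the
    memory clock at or below the DDR2-800 target of 400 MHz.
    """
    divider = math.ceil(cpu_mhz / target_mem_mhz)
    return cpu_mhz / divider


def theoretical_bandwidth_gbs(mem_clock_mhz, channels=2, bus_bytes=8):
    """Peak DDR2 bandwidth: 2 transfers per clock, 8 bytes per channel.

    Assumption: dual-channel mode (channels=2), decimal GB.
    """
    return mem_clock_mhz * 2 * bus_bytes * channels / 1000.0


def efficiency_pct(measured_gbs, peak_gbs):
    """Measured bandwidth as a percentage of the theoretical peak."""
    return 100.0 * measured_gbs / peak_gbs


# Memory clocks implied by the assumed divider rule
clock_x2 = memory_clock_mhz(2600.0)  # Athlon 64 X2 5200+
clock_ee = memory_clock_mhz(2500.0)  # Athlon 64 X2 4800+ EE

# Peak bandwidth from the clocks quoted in the article (371 and 357 MHz)
peak_x2 = theoretical_bandwidth_gbs(371.0)
peak_ee = theoretical_bandwidth_gbs(357.0)

# Measured average bandwidth from Table 1, GB/s
read_x2 = efficiency_pct(3.89, peak_x2)
write_x2 = efficiency_pct(3.27, peak_x2)
read_ee = efficiency_pct(3.15, peak_ee)
write_ee = efficiency_pct(2.80, peak_ee)

# Relative gap between the two processors, in percent of the X2 value
read_gap = 100.0 * (read_x2 - read_ee) / read_x2
write_gap = 100.0 * (write_x2 - write_ee) / write_x2

print(f"clocks: {clock_x2:.1f} / {clock_ee:.1f} MHz")
print(f"peaks:  {peak_x2:.2f} / {peak_ee:.2f} GB/s")
print(f"read:   {read_x2:.1f}% vs {read_ee:.1f}% (gap {read_gap:.1f}%)")
print(f"write:  {write_x2:.1f}% vs {write_ee:.1f}% (gap {write_gap:.1f}%)")
```

Comparing efficiency rather than raw GB/s removes the effect of the slightly different memory clocks, so the remaining gap reflects the processor itself rather than the memory configuration.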
Maximum Real Memory Bandwidth
As usual, maximum real memory bandwidth is achieved with software prefetch for reading and non-temporal stores for writing. Picture 2 shows the results of these tests for the Athlon 64 X2 EE. Qualitatively, they again match those of the Athlon 64 X2.

Picture 2. Maximum real memory bandwidth, Software Prefetch/Non-Temporal Store, Athlon 64 X2 EE
Quantitative characteristics of these tests are published in Table 2. They are again worse in the EE modification of the processor. Even taking into account its lower theoretical memory bus bandwidth, relative maximum real memory bandwidth of this processor reaches only 51.3% for reading (versus 65.9%) and 47.1% for writing (versus 60.8%). That is approximately 22% lower than the results obtained by the "regular" Athlon 64 X2. The lower maximum real memory bandwidth of the Athlon 64 X2 EE most likely has to do with the lower real bandwidth of its L2 D-Cache, which is also about 23% lower than that of the Athlon 64 X2.
Table 2
| Operation | Athlon 64 X2, maximum real memory bandwidth (GB/s*) | Athlon 64 X2 EE, maximum real memory bandwidth (GB/s*) |
|---|---|---|
| Reading, Software Prefetch | 7.87 (65.9%) | 5.89 (51.3%) |
| Writing, Non-Temporal Store | 7.26 (60.8%) | 5.41 (47.1%) |
*values relative to theoretical bandwidth limit of the memory bus are in parentheses
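The similarity between the memory and L2 deficits can be checked directly from the published figures. The following Python sketch uses only values from Tables 1 and 2; the helper name `relative_drop_pct` and the dictionary layout are hypothetical, and the MMX rows of Table 1 are taken as the L2 reference because those are the values quoted in the text.

```python
def relative_drop_pct(baseline, value):
    """How much lower `value` is than `baseline`, in percent of baseline."""
    return 100.0 * (baseline - value) / baseline


# (Athlon 64 X2, Athlon 64 X2 EE) pairs
# Table 2: maximum real memory bandwidth, % of theoretical limit
max_memory = {"read": (65.9, 51.3), "write": (60.8, 47.1)}
# Table 1: L2 average real bandwidth with MMX registers, bytes/cycle
l2_cache = {"read": (4.10, 3.14), "write": (3.94, 3.06)}

memory_drop = {op: relative_drop_pct(*pair) for op, pair in max_memory.items()}
l2_drop = {op: relative_drop_pct(*pair) for op, pair in l2_cache.items()}

for op in ("read", "write"):
    print(f"{op}: memory {memory_drop[op]:.1f}% lower, "
          f"L2 {l2_drop[op]:.1f}% lower")
```

All four deficits land in the 22-24% range. That agreement is consistent with the suggestion that L2 bandwidth limits the maximum memory bandwidth, although the match alone does not prove it.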