First Look At DDR3
Testbed configuration
- CPU: Intel Core 2 Duo E6600, 2.4 GHz, 4 MB shared L2 cache
- Chipset: Intel P35
- Motherboard: MSI P35 Neo Combo, BIOS V1.0B16 dated 20.04.2007
- DDR2 memory: Corsair DOMINATOR XMS2-9136C5D in DDR2-1066 mode, 5-5-5-15 timings
- DDR3 memory: Corsair XMS3-1066C7 (engineering sample), DDR3-1066, 7-7-7-21 timings
We ran the tests with MSI P35 Neo Combo.
The first test results
Let's proceed from theory to practice. Our testlab got hold of unique pre-production samples of an MSI P35 Neo Combo motherboard based on the new Intel P35 chipset and Corsair XMS3-1066 memory modules (CM3X1024-1066C7 ES). As its name suggests, the MSI P35 Neo Combo is a combo motherboard: it can accommodate both DDR2 and DDR3 memory modules. Note, however, that it can use either DDR2 or DDR3, but not both at once: simultaneous use of DDR2 and DDR3 is impossible, whether in the same channel or in different channels. As there are no official specifications for the new Intel chipsets yet, we cannot say whether this is a fundamental limitation of Intel P35 or just a consequence of this PCB layout. The first option is highly probable, though - Intel chipsets usually do not support such exotic features as mixing memory types.
CM3X1024-1066C7 ES memory modules are engineering samples of DDR3-1066 memory with 7-7-7-21 timings (they match the prospective timing scheme for DDR3 memory modules of this speed group published in Table 1). To compare the speed characteristics of these modules (as representatives of DDR3) with those of the current generation of DDR2 memory, we selected Corsair DOMINATOR XMS2-9136C5D modules from a similar speed group (DDR2-1142). We used them in DDR2-1066 mode with nominal 5-5-5-15 timings.
DDR2 memory modes and timings were set manually in BIOS, and memory voltage was raised to 2.3V. Note that the current BIOS version (V1.0B16 dated 20.04.2007) of the MSI P35 Neo Combo motherboard does not allow DDR3 timings to be configured properly. It still offers to adjust the main parameters (tCL, tRCD, and tRP) from 3 to 6 inclusive, which corresponds to DDR2 timings, not DDR3. The same applies to memory voltage: it can still be set from 1.8V to 2.5V, while the official DDR3 memory voltage is just 1.5V. That's why we configured DDR3 memory "by SPD" at the minimum available 1.8V. We cannot vouch for these settings, though: we know neither how well the BIOS supports the still unconfirmed SPD standard for DDR3, nor whether the memory controller in the Intel P35 chipset is programmed with correct DDR3 timings. We can say only one relevant thing: our combination of DDR3 Corsair XMS3-1066 memory modules and the MSI P35 Neo Combo motherboard really works. So let's analyze test results obtained in the latest RightMark Memory Analyzer 3.72, which includes RightMark Multi-Threaded Memory Test 1.0.
We'll start with the tests of real memory bandwidth for single-core memory access. As usual, we measured real memory bandwidth in four modes: Read, Write, Read SW PF (reading with software prefetch at the optimal prefetch distance, which is 1024 bytes for Intel Core 2 Duo), and Write NT (writing with non-temporal stores). The first two modes help us evaluate the average real memory bandwidth for reading and writing. The last two modes evaluate maximum real memory bandwidth for the same operations.
Diagram 8 with test results of DDR2-1066 and DDR3-1066 in single-thread mode shows that DDR3 is outperformed by equally-clocked DDR2 only insignificantly: the difference amounts to 5-8% and is most noticeable in maximum real memory read bandwidth. In both cases, the real memory bandwidth values are very far from the maximum theoretical bandwidth of DDR2/DDR3-1066 (17.1 GB/s in dual-channel mode). However, this is easily explained by a bottleneck: the 266 MHz FSB (1066 MHz Quad-Pumped bus), whose peak bandwidth is just 8.53 GB/s.
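The peak figures quoted above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming a 64-bit data path for both a memory channel and the FSB and decimal gigabytes (1 GB = 10^9 bytes); the function and variable names are ours and are used for illustration only.

```python
# Peak (theoretical) bandwidth = transfers per second * bytes per transfer * channels.
# Assumptions: 64-bit memory channel, 64-bit FSB data bus, decimal GB/s.

def peak_bandwidth_gbs(transfers_per_sec: float, bus_width_bits: int, channels: int = 1) -> float:
    """Return peak bandwidth in GB/s for a bus with the given transfer rate and width."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_sec * bytes_per_transfer * channels / 1e9

BASE_CLOCK_HZ = 800e6 / 3           # 266.67 MHz base clock

# DDR2-1066 / DDR3-1066: 1066.67 million transfers per second per channel.
mem_rate = 4 * BASE_CLOCK_HZ
mem_single = peak_bandwidth_gbs(mem_rate, 64, channels=1)
mem_dual = peak_bandwidth_gbs(mem_rate, 64, channels=2)

# Quad-pumped FSB: four transfers per base clock.
fsb = peak_bandwidth_gbs(4 * BASE_CLOCK_HZ, 64)

print(f"Memory, single-channel: {mem_single:.2f} GB/s")
print(f"Memory, dual-channel:   {mem_dual:.2f} GB/s")
print(f"FSB:                    {fsb:.2f} GB/s")
```

Under these assumptions a single memory channel and the FSB come out equal, while the dual-channel memory bus is twice as wide as the FSB can serve.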
Dual-core memory access mode (both CPU cores access memory simultaneously, Picture 9) allows us to reach higher memory bandwidth (about 8.0 GB/s, which is closer to the maximum theoretical FSB bandwidth of 8.53 GB/s). In this case DDR3-1066 is generally on a par with DDR2-1066, and its maximum real memory read bandwidth is even 2% higher. Here is our conclusion: as far as real bandwidth is concerned, DDR3 is at least no worse than DDR2 memory of the same frequency on the current generation of Intel platforms, and sometimes it's even faster. That is, the fly-by architecture for addresses and commands and the read/write leveling, both necessary to reach high memory clock rates, justify themselves: they do not make memory performance worse (they may even improve it).
An attentive reader may object to these conclusions, as they are based on memory tests in dual-channel mode only. Indeed, in this case the bottleneck is not in the memory bus (from the two channels of the controller to the memory modules), but in the FSB (from the CPU to the chipset/memory controller). So could we simply fail "to see" the difference between DDR2 and DDR3 for this very reason? As this objection is fair, we decided to check our conclusion by analyzing results in single-channel memory mode. This operating mode is of purely theoretical interest these days, but it equates the peak bandwidth of the FSB and the memory bus, and thus eliminates possible effects of the former on low-level test results. These results are published in Table 2.
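The reasoning behind the single-channel test can be put as a one-line rule: the data path from CPU to memory is no faster than its slowest link. Here is a minimal sketch of that rule, using the peak values quoted in the article (8.53 GB/s for the FSB and for one memory channel); the function name and the returned labels are hypothetical.

```python
# Upper bound on real memory bandwidth: the slower of the FSB and the memory bus.
# Peak values are taken from the article; everything else is illustrative.

FSB_PEAK_GBS = 8.53          # 1066 MHz quad-pumped FSB
MEM_CHANNEL_PEAK_GBS = 8.53  # one DDR2-1066 or DDR3-1066 channel

def bandwidth_ceiling(channels: int) -> tuple[float, str]:
    """Return (ceiling in GB/s, name of the limiting link) for a channel count."""
    mem_peak = MEM_CHANNEL_PEAK_GBS * channels
    ceiling = min(FSB_PEAK_GBS, mem_peak)
    if mem_peak > FSB_PEAK_GBS:
        limiter = "FSB"
    elif mem_peak < FSB_PEAK_GBS:
        limiter = "memory bus"
    else:
        limiter = "FSB and memory bus (equal)"
    return ceiling, limiter

for channels in (1, 2):
    ceiling, limiter = bandwidth_ceiling(channels)
    print(f"{channels} channel(s): ceiling {ceiling:.2f} GB/s, limited by {limiter}")
```

With two channels the FSB is the only limiter, so differences between memory types can be masked; with one channel neither link hides the other.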
Table 2. Real DDR2 and DDR3 bandwidth in the single-channel mode
| Access mode | DDR2-1066, GB/s | DDR3-1066, GB/s |
|---|---|---|
| Read, 1 core | 6.47 | 5.80 |
| Write, 1 core | 2.42 | 2.33 |
| Read SW PF, 1 core | 6.90 | 6.34 |
| Write NT, 1 core | 4.88 | 4.88 |
| Read, 2 cores | 6.83 | 6.89 |
| Write, 2 cores | 2.17 | 2.06 |
| Read SW PF, 2 cores | 6.96 | 7.10 |
| Write NT, 2 cores | 4.83 | 4.84 |
As expected, both single- and dual-core memory bandwidth values in single-channel mode are noticeably lower than those obtained in dual-channel mode. Single-core access also shows a larger, though still insignificant, gap of 4-11% in favor of DDR2. But dual-core access again evens out the results of DDR2 and DDR3, and even allows the latter to outperform equally-clocked DDR2 by 1-2% in read operations. Maximum real memory bandwidth of DDR2-1066 and DDR3-1066 reaches 82-83% of the theoretical maximum in single-channel mode (8.53 GB/s), which is a good result. So test results of DDR2 and DDR3 memory in single-channel mode confirm our conclusions about the speed characteristics of DDR3 memory.
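The percentages in this paragraph follow directly from Table 2. The sketch below recomputes them, assuming the single-channel theoretical maximum of 8.53 GB/s and taking DDR2 as the baseline. The helper names are ours, and a figure computed against the DDR3 value instead would come out slightly different.

```python
# Relative DDR3-vs-DDR2 difference and bus efficiency, recomputed from Table 2.
# Values are (DDR2-1066, DDR3-1066) in GB/s, single-channel mode.

TABLE_2 = {
    "Read, 1 core":        (6.47, 5.80),
    "Write, 1 core":       (2.42, 2.33),
    "Read SW PF, 1 core":  (6.90, 6.34),
    "Write NT, 1 core":    (4.88, 4.88),
    "Read, 2 cores":       (6.83, 6.89),
    "Write, 2 cores":      (2.17, 2.06),
    "Read SW PF, 2 cores": (6.96, 7.10),
    "Write NT, 2 cores":   (4.83, 4.84),
}

SINGLE_CHANNEL_PEAK_GBS = 8.53  # assumed theoretical maximum for one channel

def ddr3_vs_ddr2_percent(ddr2: float, ddr3: float) -> float:
    """Difference of DDR3 relative to DDR2, in percent (negative: DDR3 is slower)."""
    return (ddr3 - ddr2) / ddr2 * 100.0

def efficiency_percent(real_gbs: float) -> float:
    """Share of the theoretical single-channel maximum actually reached, in percent."""
    return real_gbs / SINGLE_CHANNEL_PEAK_GBS * 100.0

for mode, (ddr2, ddr3) in TABLE_2.items():
    diff = ddr3_vs_ddr2_percent(ddr2, ddr3)
    print(f"{mode:<20} DDR2 {ddr2:.2f}  DDR3 {ddr3:.2f}  diff {diff:+.1f}%")

best_ddr2 = max(ddr2 for ddr2, _ in TABLE_2.values())
best_ddr3 = max(ddr3 for _, ddr3 in TABLE_2.values())
print(f"Best DDR2: {efficiency_percent(best_ddr2):.1f}% of peak")
print(f"Best DDR3: {efficiency_percent(best_ddr3):.1f}% of peak")
```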
All that remains is to evaluate access latencies of equally-clocked DDR2 and DDR3 memory modules. From general considerations, we should expect higher values for the latter, given its higher timings (7-7-7 versus 5-5-5 for DDR2). But let's see what the real difference in latencies is. Note that in this case we obtained practically identical results in dual-channel and single-channel modes, so we'll publish results for dual-channel mode only (see Picture 10).
So, DDR3-1066 latencies are naturally higher than those of DDR2-1066. The relative latency growth is 13% for pseudo-random access and 16% for random access. Nevertheless, if we take into account that the nominal difference between 7-7-7-21 and 5-5-5-15 timings is a good 40% (as we have written above, we cannot say anything certain about the real DDR3 timings), the real increase in latencies on upgrading from DDR2 to DDR3 looks more than acceptable.
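To see where the 40% comes from, the timings can be converted from clock cycles to nanoseconds. This is a minimal sketch, assuming both memory types run at the nominal 1066.67 MT/s (a 533.33 MHz memory clock) and that the nominal timings are really in effect, which, as noted, we cannot verify for DDR3. Only CAS latency is converted, and the names are illustrative.

```python
# Convert a timing expressed in memory clock cycles to nanoseconds.
# Assumption: DDR memory performs two transfers per clock, so clock = data rate / 2.

def cycles_to_ns(cycles: int, data_rate_mts: float) -> float:
    """Return the duration of `cycles` memory clocks in nanoseconds."""
    clock_mhz = data_rate_mts / 2
    return cycles * 1000.0 / clock_mhz

DATA_RATE_MTS = 3200 / 3  # nominal 1066.67 MT/s for DDR2-1066 and DDR3-1066

ddr2_cl_ns = cycles_to_ns(5, DATA_RATE_MTS)  # tCL = 5 (DDR2, 5-5-5-15)
ddr3_cl_ns = cycles_to_ns(7, DATA_RATE_MTS)  # tCL = 7 (DDR3, 7-7-7-21)
growth_percent = (ddr3_cl_ns / ddr2_cl_ns - 1) * 100.0

print(f"DDR2-1066 tCL: {ddr2_cl_ns:.3f} ns")
print(f"DDR3-1066 tCL: {ddr3_cl_ns:.3f} ns")
print(f"Nominal growth: {growth_percent:.0f}%")
```

Since the clock is the same for both memory types, the ratio in nanoseconds equals the ratio in cycles; the measured 13-16% growth is smaller because the timings account for only part of the total access latency.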
Conclusions
Results of our first low-level tests of DDR3 engineering samples, compared with equally-clocked DDR2 memory modules under identical conditions, allow us to conclude that memory of the new DDR3 standard (not yet officially adopted) is justified even today. Its speed characteristics are at least no worse than those of DDR2 memory modules, and sometimes they are even better. The relative growth in latency from DDR2 to DDR3 is not high either (13-16%), all other things being equal. And if we take into account that memory technologies generally develop through simultaneous growth of clock rates and reduction of latencies, future generations of DDR3 memory can close this gap, or even beat DDR2 in latencies (DDR2 has practically stopped developing by now).
At the same time, we should note that DDR3 will share the fate of the current generation of high-speed DDR2 memory (DDR2-800 and higher): serious problems with revealing the huge performance potential of this memory type, which stopped being a bottleneck long ago. For example, our Intel Core 2 Duo / Intel P35 platform can reveal the potential of DDR2-1066 or DDR3-1066 only in single-channel mode (the real memory bandwidth in this case reaches 83% of the theoretical maximum), which is of no practical interest. In dual-channel mode, memory bandwidth is seriously restricted by the FSB, whose peak bandwidth is half that of the memory bus. We mentioned such limitations in our articles about system memory (see, for example, Digest 2006). We can only hope that manufacturers of the most important PC components - processors and chipsets - will see the need to modernize their products to reach the high performance standards dictated by... memory technologies.