Testing DDR3 and DDR4 RAM performance on Linux
RAM is one of the essential computer components. It holds the running program, its data and its results. The amount of RAM available and its performance affect how your computer performs in general.
With the launch of Intel Skylake CPUs a new generation of RAM was introduced to the mainstream - DDR4. So let us take a look at modern DDR3 and DDR4 performance.
A few things about RAM
RAM can be a crucial component for computer performance. In the most extreme case, where there is not enough RAM, the OS will start using the HDD to store data needed by the CPU. That greatly lowers performance, as RAM is much faster than a disk. If there is enough RAM, then its performance is dictated by memory bandwidth and latency.
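To check whether a Linux machine is already falling back to swap, the kernel's `/proc/meminfo` file can be read directly. The snippet below is a minimal sketch, not part of the benchmarks in this article; the helper names `parse_meminfo` and `memory_summary` are made up for illustration, and it assumes the usual `MemAvailable`, `SwapTotal` and `SwapFree` fields are present.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most values are in kB; a few counters (e.g. HugePages_Total) have no unit.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Return available RAM and swap currently in use, both in MiB."""
    swap_used_kb = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "available_mib": info.get("MemAvailable", 0) // 1024,
        "swap_used_mib": swap_used_kb // 1024,
    }


if __name__ == "__main__":
    # /proc/meminfo only exists on Linux, so skip quietly elsewhere.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            print(memory_summary(parse_meminfo(f.read())))
```

A steadily growing `swap_used_mib` while `available_mib` stays near zero suggests the system is short on RAM rather than limited by RAM speed.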
Memory bandwidth determines how fast data can be sent to or received from memory. The more the better, especially when a large amount of data must be transferred. CAS latency is the delay between the memory receiving a read command from the memory controller and the requested data becoming available. The shorter the latency the better.
Most modern computers before Skylake use DDR3 memory clocked at 1333 MHz with around CL9, or at 1600 MHz with CL9-11. Skylake by default uses DDR4 at 2133 MHz with CL14-15, or some DDR3L.
As you can see, RAM frequency increases as technology develops, which increases memory bandwidth. Each transfer on a 64-bit memory channel moves 8 bytes, so 1333 MHz (PC3-10600) at 1333 MT/s (megatransfers per second) gives 10.6 GB/s, 1600 MHz (PC3-12800) at 1600 MT/s gives 12.8 GB/s, and 2133 MHz at 2133 MT/s gives 17 GB/s.
CAS latency seems to increase with memory frequency, but that's not actually what's happening. The CL number is a count of clock cycles, and the duration of a clock cycle decreases as RAM frequency increases, so the actual latency stays at the same level or slightly decreases. DDR memory transfers data twice per clock cycle, so the real clock is half the quoted frequency: 1333MHz/CL9 latency is 13.50 ns, 1600MHz/CL11 is 13.75 ns and 2133MHz/CL14 is 13.13 ns.
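The conversion from CL cycles to nanoseconds can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `cas_latency_ns` is made up here, and it assumes the quoted "MHz" figure is really the transfer rate, with the actual clock running at half of it.

```python
def cas_latency_ns(data_rate_mts, cl):
    """Convert a CAS latency given in clock cycles to nanoseconds."""
    clock_mhz = data_rate_mts / 2   # DDR: two transfers per clock cycle
    cycle_ns = 1000 / clock_mhz     # duration of one clock cycle in ns
    return cl * cycle_ns


if __name__ == "__main__":
    for rate, cl in [(1333, 9), (1600, 11), (2133, 14)]:
        print(f"{rate}MHz/CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

The nominal ratings are rounded (1333 is really 1333⅓), so the last digit can differ slightly from vendor data sheets.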
The bandwidth between RAM and the memory controller can also be doubled by using two identical memory DIMMs working in dual channel mode. For this to work, the DIMMs have to be placed in the two matched slots on the motherboard.
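The theoretical bandwidth figures, including the dual channel doubling, follow from the transfer rate and the 64-bit (8 byte) width of a memory channel. Below is a small sketch of that calculation; `peak_bandwidth_mbs` is a name invented for this example, and it assumes ideal scaling across channels, which real workloads rarely reach.

```python
BUS_WIDTH_BYTES = 8  # one DDR3/DDR4 channel is 64 bits wide


def peak_bandwidth_mbs(data_rate_mts, channels=1):
    """Theoretical peak bandwidth in MB/s for a given transfer rate in MT/s."""
    return data_rate_mts * BUS_WIDTH_BYTES * channels


if __name__ == "__main__":
    for rate in (1333, 1600, 2133):
        single = peak_bandwidth_mbs(rate)
        dual = peak_bandwidth_mbs(rate, channels=2)
        print(f"{rate} MT/s: {single} MB/s single, {dual} MB/s dual channel")
```

Because the nominal ratings are rounded down (1333 stands for 1333⅓ MT/s), the results come out a few MB/s below the PC3-10600 style module names.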
DDR4 versus DDR3
The new DDR4 RAM is made with newer technologies and works at a lower voltage (1.2 V) than DDR3 (1.5 V, or 1.35 V for DDR3L), which gives some power savings. DDR4 is also more future-proof, as it allows for easier design of high capacity and high frequency DIMMs.
DDR4 is supported by some top-shelf Broadwell CPUs, but mainstream support showed up alongside Skylake CPUs. To ease the transition to the new RAM type Intel also retained DDR3 support in Skylake, but that support is limited. Officially only DDR3L, which is rare on desktops, is supported. Motherboard vendors, however, list regular DDR3 as supported too. Intel states that using 1.5 V DDR3 with a Skylake CPU can over time damage the memory controller, killing the CPU. Thankfully DDR4 is getting cheaper.
Benchmarks were run on an ASRock B150M Combo-G motherboard with an i5-6600 CPU. This motherboard accepts up to two DDR3 DIMMs or up to two DDR4 DIMMs. Ubuntu 15.10 was used as the OS and an AMD R9-270 was the GPU. The tested RAM configurations are:
|Configuration||Theoretical bandwidth||CAS latency|
|Corsair DDR3 1333MHz/CL9 4GB||10666 MB/s||13.50 ns|
|Kingston DDR3 1866MHz/CL10 4GB||14933 MB/s||10.72 ns|
|Geil DDR4 2133MHz/CL15 4GB||17066 MB/s||14 ns|
|Kingston DDR4 2133MHz/CL14 4GB||17066 MB/s||13.1 ns|
|Corsair DDR3 1333MHz/CL9 2x4GB dual channel||~21332 MB/s||13.50 ns|
|Kingston DDR4 2133MHz/CL14 2x4GB dual channel||~34132 MB/s||13.1 ns|
Some benchmark sites divide frequency by CL latency as a rough performance estimate. By that estimate the best would be 1866/CL10, then about 20% lower 2133/CL14 and 1333/CL9, followed by 2133/CL15. This comparison isn't precise and, as you will see in the benchmarks, not that accurate.
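That rough frequency/CL estimate is easy to reproduce for the tested modules. The sketch below only illustrates the ranking; the `rough_score` helper and the dictionary labels are made up for this example.

```python
# Single-module configurations from the table: (frequency in MHz, CL)
CONFIGS = {
    "DDR3 1333/CL9": (1333, 9),
    "DDR3 1866/CL10": (1866, 10),
    "DDR4 2133/CL15": (2133, 15),
    "DDR4 2133/CL14": (2133, 14),
}


def rough_score(frequency_mhz, cl):
    """Frequency divided by CL; higher is supposedly better."""
    return frequency_mhz / cl


def ranking(configs):
    """Return configuration names ordered from best to worst score."""
    return sorted(configs, key=lambda name: rough_score(*configs[name]), reverse=True)


if __name__ == "__main__":
    best = rough_score(*CONFIGS[ranking(CONFIGS)[0]])
    for name in ranking(CONFIGS):
        score = rough_score(*CONFIGS[name])
        print(f"{name}: {score:.1f} ({score / best:.0%} of best)")
```

Note that this score ignores dual channel mode entirely, which is one reason it does not match the measured results well.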
You can check results on openbenchmarking.org. Below I'll show some normalised results.
In the Stream benchmark, which measures memory bandwidth, the results are as expected and follow the theoretical bandwidth of each configuration:
It's common to use games for RAM benchmarks. Usually better performing RAM gives some extra FPS if the game isn't limited by the GPU. I used Linux native games like Xonotic, Warsow or UrbanTerror.
As you can see, the biggest difference (about 34%) is in Xonotic, and the lowest in OpenArena (likely RAM isn't limiting this game). In Xonotic DDR3/1333MHz scores 114 FPS, while dual channel DDR4 scores 153 FPS and dual channel DDR3 145 FPS. It looks like the key element is memory bandwidth. The games used less than 4GB, but the additional RAM, even when not in dual channel, could play a role too.
APITEST, a low level benchmark that tests OpenGL 4 performance piece by piece, shows the same order in most of its tests. Only a few aren't affected by the RAM used.
The next set of benchmarks covers in-memory operations like file compression or compilation:
BZIP2 has the biggest spread of results - 15% (9 seconds vs 7.83 seconds). In other benchmarks the difference is lower. The order is also the same as in games.
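The spread quoted for a benchmark is simply the gap between the best and worst result relative to the smaller value. A minimal sketch of that calculation follows; `spread_percent` is a name chosen for this example, and measuring against the smaller value is an assumption about how the percentages were derived.

```python
def spread_percent(a, b):
    """Difference between two results as a percentage of the smaller one."""
    low, high = min(a, b), max(a, b)
    return (high - low) / low * 100


if __name__ == "__main__":
    # BZIP2 compression times in seconds (lower is better)
    print(f"BZIP2 spread: {spread_percent(9, 7.83):.1f}%")
```

The same function works for FPS results, where higher is better, because it only compares the two values and does not care which direction counts as a win.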
It's also interesting to check the performance of RAM-heavy server applications like databases, HTTP servers, caches and data stores.
Here the results are vastly different. Nginx performance was steady across all tested RAM configurations. Apache favoured by 5% the 1866MHz/CL10 RAM, which has the lowest latency among the tested sets. Redis results weren't very stable, so it's hard to tell which configuration is better. In the case of PostgreSQL there was some spread of results, but they still look comparable. For some reason dual channel memory got the worst performance, and the difference between best and worst is a large 87%. This may be related to the database configuration, which likely wasn't tuned for the given hardware setup. Actual production performance with a proper configuration could vary a lot from these results.
For desktop use, current DDR4 performs on par with modern DDR3 memory. Even older 1333MHz DIMMs can be top performers if used in dual channel. The latest DDR4 can also give a few more FPS, but it's probably not worth the upgrade if you have one of the latest CPUs.
For server performance I guess proper settings are the key to proper results, but some applications may actually favour latency over bandwidth for performance.
I didn't test integrated graphics performance, which depends quite noticeably on RAM performance, as tested many times on phoronix.com.