Memory Scaling on Ryzen 7 with Team Group's Night Hawk RGB
by Ian Cutress & Gavin Bonshor on September 27, 2017 11:05 AM EST
A large number of column inches have been put towards describing and explaining AMD's new underlying scalable interconnect: the Infinity Fabric. A superset of HyperTransport, this interconnect is designed to enable both the CPUs and GPUs from AMD to communicate quickly, at high bandwidth, low latency, and low power, with the ability to scale out to large systems. One of the results of the implementation of Infinity Fabric on the processor side is that it runs at the frequency of the DRAM in the system, with a secondary potential uplift in performance when using faster memory. The debate among enthusiasts, consumers and the general populace regarding Ryzen's memory performance has been an ever-raging topic since the AGESA 1.0.0.6 BIOS updates were introduced several weeks ago. We dedicated some time to testing the effect of high-performance memory on Ryzen using Team Group's latest Night Hawk RGB memory.
Memory Scaling on Ryzen 7: AMD's Infinity Fabric
Typically overlooked by many when outlining components for a new system, memory can play a key role in system operation. For the last ten years, memory speed has been generally inconsequential to performance for consumers: we tested this for DDR3 for Haswell and DDR4 for Haswell-E, and three major conclusions came out of that testing:
- As long as a user buys something above the bargain basement specification, performance is better than the worst,
- Performance tapers to a point with memory, very quickly hitting large price increases for little gain,
- The only major performance gain that scales comes from integrated gaming
So it is perhaps not surprising to read in forums that the general pervasive commentary is that “memory speed over DDR4-2400 does not matter and is a con by manufacturers”. This has the potential to change with AMD's Infinity Fabric, where the interconnect speed between sets of cores is directly linked with the memory speed. For any workload that transfers data between cores or out to main memory, the speed of the Infinity Fabric can potentially directly influence the performance. Despite the fact that pure speed isn’t always the ‘be all and end all’ of establishing performance gains, it has the potential to provide some gains with this new interconnect design.
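As a rough illustration of that link, the sketch below maps a DDR4 speed rating to the clock the fabric would run at. It assumes, as is commonly reported for first-generation Ryzen, that the fabric clock is tied 1:1 to the memory clock, which is half the DDR4 transfer rate. The function names and the list of speeds are hypothetical and chosen only for illustration.

```python
def memory_clock_mhz(data_rate_mts: float) -> float:
    """DDR memory transfers data twice per clock, so the memory
    clock is half of the advertised transfer rate (MT/s)."""
    return data_rate_mts / 2


def fabric_clock_mhz(data_rate_mts: float) -> float:
    """Assumption: on first-generation Ryzen the Infinity Fabric
    clock follows the memory clock 1:1."""
    return memory_clock_mhz(data_rate_mts)


if __name__ == "__main__":
    # Illustrative speed ratings only, not the exact straps tested.
    for rate in (2133, 2400, 2933, 3200):
        print(f"DDR4-{rate}: fabric clock ~{fabric_clock_mhz(rate):.1f} MHz")
```

Under that assumption, moving from DDR4-2400 to DDR4-3200 would raise the fabric clock from roughly 1200 MHz to 1600 MHz, which is why memory speed may matter more on this platform than on previous ones.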
The Infinity Fabric (hereafter shortened to IF) consists of two fabric planes: the Scalable Control Fabric (SCF) and the Scalable Data Fabric (SDF).
The SCF is all about control: power management, remote management, security, and IO. Essentially, when data has to flow to elements of the processor other than main memory, the SCF is in control.
The SDF is where main memory access comes into play. There's still management here - being able to organize buffers and queues in order of priority assists with latency, and the organization also relies on a speedy implementation. The slide below is aimed more towards the IF implementation in AMD's server products, such as power control on individual memory channels, but is still relevant to accelerating consumer workflows.
AMD's goal with IF was to develop an interconnect that could scale beyond CPUs, groups of CPUs, and GPUs. In the EPYC server product line, IF connects not only cores within the same piece of silicon, but silicon within the same processor and also processor to processor. Two important factors come into the design here: power (usually measured in energy per bit transferred) and bandwidth.
The bandwidth of the IF is designed to match the bandwidth of each channel of main memory, creating a solution that should potentially be unified without resorting to large buffers or delays.
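To put a number on "the bandwidth of each channel", the sketch below computes the theoretical peak bandwidth of a DDR4 channel, assuming the standard 64-bit (8-byte) channel width. These are upper bounds derived from the speed rating, not measured results, and the function name is hypothetical.

```python
BYTES_PER_TRANSFER = 8  # assumption: standard 64-bit DDR4 channel


def peak_bandwidth_gbs(data_rate_mts: float, channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second)
    for a given transfer rate in MT/s and number of channels."""
    return data_rate_mts * BYTES_PER_TRANSFER * channels / 1000


if __name__ == "__main__":
    # Illustrative speed ratings only.
    for rate in (2133, 2400, 3200):
        single = peak_bandwidth_gbs(rate)
        dual = peak_bandwidth_gbs(rate, channels=2)
        print(f"DDR4-{rate}: {single:.1f} GB/s per channel, {dual:.1f} GB/s dual channel")
```

For example, DDR4-3200 works out to 25.6 GB/s per channel on paper, so a fabric designed to match a channel would need to scale its own bandwidth with the memory speed.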
Discussing IF in the server context is a bit beyond the scope of what we are testing in this article, but the point we're trying to get across is that IF was built with a wide scope of products in mind. On the consumer platform, while IF isn't necessarily used to such a large degree as in server, the potential for the speed of IF to affect performance is just as high.
AGESA 1.0.0.6 (aka AGESA 1006) and Memory Support
At the time of the launch of Ryzen, a number of industry sources privately disclosed to us that the platform side of the product line was rushed. There was little time to do full DRAM compatibility lists, even with standard memory kits in the marketplace, and this led to a few issues for early adopters trying to get matching kits that worked well without some tweaking. Within a few weeks this was ironed out, once the memory vendors and motherboard vendors had time to test and adjust their firmware.
Overriding this was a lower than expected level of DRAM frequency support. During the launch, AMD had promised that Ryzen would be compatible with high speed memory; however, reviewers and customers were having issues with higher speed memory kits (3200 MT/s and above). These issues have been addressed via a wave of motherboard BIOS updates built upon an updated version of the AGESA (AMD Generic Encapsulated Software Architecture), specifically up to version 1.0.0.6.
Given that the Ryzen platform itself has matured over the last couple of months, now is the time for a quick test of memory scaling on AMD's Zen architecture to see if performance scales consistently with raw memory frequency, or if any performance gains are achieved at all. For this testing we are using Team Group's latest Night Hawk RGB memory kit at several different memory straps under our shorter CPU and CPU gaming benchmark suites.
- The AMD Zen and Ryzen 7 Review: A Deep Dive on 1800X, 1700X and 1700
- The AMD Ryzen 5 1600X vs Core i5 Review: Twelve Threads vs Four at $250
- The AMD Ryzen 3 1300X and Ryzen 3 1200 CPU Review: Zen on a Budget
- The AMD Ryzen Threadripper 1950X and 1920X: CPUs on Steroids
- Retesting AMD Ryzen Threadripper’s Game Mode: Halving Cores for More Performance
lyssword - Friday, September 29, 2017
Seems these tests are GPU-limited (gtx 980 is about 1060-6gb) thus may not show true gains if you had something like 1080ti, and also not the most demanding cpu-wise except maybe warhammer and ashes
Alexvrb - Sunday, October 1, 2017
Some of the regressions don't make sense. Did you double-check timings at every frequency setting, perhaps also with Ryzen Master software (the newer versions don't require HPET either IIRC)? I've read on a couple of forums where above certain frequencies, the BIOS would bump some timings regardless of what you selected. Not sure if that only affects certain AGESA/BIOS revisions and if it was only certain board manufacturers (bug) or widespread. That could reduce/reverse gains made by increasing frequency, depending on the software.
Still, there is definitely evidence that raising memory frequency enables decent performance scaling, for situations where the IF gets hammered.
ajlueke - Friday, October 6, 2017
As others have mentioned here, it is often extremely useful to employ modern game benchmarks that will report CPU results regardless of GPU bottlenecks. Case in point, I ran a similar test to this back in June utilizing the Gears of War 4 benchmark. I chose it primarily because the benchmark will display CPU (game) and CPU (render) fps regardless of GPU frames generated.
At least in Gears of War 4, the memory scaling on the CPU side was substantial. But to be fair, I was GPU bound in all of these tests, so my observed fps would have been identical every time.
Really curious if my results would be replicated in Gears 4 with the hardware in this article? That would be great to see.
farmergann - Wednesday, October 11, 2017
For gaming, wouldn't it be more illuminating to look at frame-time variance and CPU induced minimums to get a better idea of the true benefit of the faster ram?
JasonMZW20 - Tuesday, November 7, 2017
I'd like to see some tests where lower subtimings were used on say 3066 and 3200, versus higher subtimings at the same speeds (more speeds would be nice, but it'd take too much time). I'd think gaming is more affected by latency, since they're computing and transferring datasets immediately.
I run my Corsair 3200 Vengeance kit (Hynix ICs) at 3066 using 14-15-15-34-54-1T at 1.44v. The higher voltage is to account for tighter subtimings elsewhere, but I've tested just 14-15-15-34-54-1T (auto timings for the rest) in Memtest86 at 1.40v and it threw 0 errors after about 12 hours. Geardown mode disabled.