Windows 7 Application Performance

3dsmax 9

Today's desktop processors are more than fast enough to do professional level 3D rendering at home. To look at performance under 3dsmax we ran the SPECapc 3dsmax 8 benchmark (only the CPU rendering tests) under 3dsmax 9 SP1. The results reported are the rendering composite scores.

3dsmax r9 - SPECapc 3dsmax 8 CPU Test

Cinebench 11.5

Created by the Cinema 4D folks, Cinebench is a popular 3D rendering benchmark that gives us both single and multi-threaded 3D rendering results.

Cinebench 11.5 - Single Threaded

With only a 100MHz clock speed advantage over the 2600K when running in single core turbo mode, the 3820 isn't much faster than the 2600K in our single threaded Cinebench test. The additional L3 cache doesn't have much of an impact here, although I suspect that says more about this particular workload than about the 3820 in general. Let's look at multithreaded performance:

Cinebench 11.5 - Multi-Threaded

The performance gap increases to 5% once we ramp up thread count. The extra performance is mostly due to clock speed here, although you'll see later on that there are some applications that definitely appreciate the larger L3 cache.

7-Zip Benchmark

While Cinebench shows us multithreaded floating point performance, the 7-zip benchmark gives us an indication of multithreaded integer performance:

7-zip Benchmark

The 7-zip benchmark gives us a good example of what the SNB-E platform can offer given the right workload. Here we see an 8.6% performance advantage, despite a much smaller clock speed advantage. The added L3 cache helps out a bit here, although obviously there's a huge gap between the 3820 and its hexa-core brethren.
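
To put the 8.6% figure next to the clock speed gap, a rough calculation helps. The sketch below is illustrative only: the 3.8GHz and 3.9GHz inputs are assumed clocks chosen to match a 100MHz gap, not measured multithreaded frequencies, and the function names are hypothetical. It also assumes performance would scale linearly with clock speed, which is a simplification.

```python
def percent_advantage(new: float, old: float) -> float:
    """Return how much larger `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


def gain_beyond_clock(perf_gain_pct: float, clock_gain_pct: float) -> float:
    """Percent gain left over after dividing out ideal linear clock scaling."""
    return ((1 + perf_gain_pct / 100) / (1 + clock_gain_pct / 100) - 1) * 100


# Assumed clocks in GHz (hypothetical inputs, not measured values).
clock_gain = percent_advantage(3.9, 3.8)

# 8.6 is the 7-zip advantage quoted in the text.
residual = gain_beyond_clock(8.6, clock_gain)

print(f"clock gain: {clock_gain:.1f}%")
print(f"gain not explained by clock: {residual:.1f}%")
```

Under these assumptions, most of the 7-zip advantage is left over after accounting for clock speed, which is consistent with the larger L3 cache contributing.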

PAR2 Benchmark

Par2 is an application used for reconstructing downloaded archives. It can generate parity data from a given archive and later use it to recover the archive.

Chuchusoft took the source code of par2cmdline 0.4 and parallelized it using Intel’s Threading Building Blocks 2.1. The result is a version of par2cmdline that can spawn multiple threads to repair par2 archives. For this test we took a 708MB archive, corrupted nearly 60MB of it, and used the multithreaded par2cmdline to recover it. The scores reported are the repair and recover time in seconds.
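
The idea of recovering data from parity can be shown with a much simpler scheme than the one par2 uses. The sketch below uses a single XOR parity block, which can rebuild only one missing block; the actual PAR2 format relies on Reed-Solomon coding and can repair many damaged blocks. The function names and sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing block from the survivors and the parity block."""
    return xor_blocks(surviving + [parity])


# Three equal-sized data blocks standing in for pieces of an archive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the middle block was corrupted and rebuild it from the rest.
rebuilt = recover_missing([data[0], data[2]], parity)
print(rebuilt == data[1])
```

Because every output byte depends only on the bytes at the same offset, this kind of work splits naturally across threads, which is what makes a parallelized repair tool a reasonable multithreaded test.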

Par2 - Multi-Threaded par2cmdline 0.4

In tests that have more of an IO influence, the difference between the 3820 and the 2600K is negligible. It will take higher clock speeds and more cores to really separate SNB-E from the vanilla SNB systems.

TrueCrypt Benchmark

TrueCrypt is a very popular encryption package that offers full AES-NI support. The application also features a built-in encryption benchmark that we can use to measure CPU performance:

AES-128 Performance - TrueCrypt 7.1 Benchmark

Encryption speed once again scales with core count and clock speed; the additional L3 cache doesn't do much in this benchmark.

x264 HD 3.03 Benchmark

Graysky's x264 HD test uses x264 to encode a 4Mbps 720p MPEG-2 source. The focus here is on quality rather than speed, thus the benchmark uses a 2-pass encode and reports the average frame rate in each pass.

x264 HD Benchmark - 1st pass - v3.03

We see a slight advantage over the 2600K in our x264 HD benchmark; however, video transcoding doesn't benefit all that much from the small gains the 3820 offers. Most client users would be better off with the Quick Sync enabled 2600K, and serious video professionals will want to invest in a six-core 3930K at the minimum.

x264 HD Benchmark - 2nd pass - v3.03

Compile Chromium Test

You guys asked for it and finally I have something I feel is a good software build test. Using Visual Studio 2008 I'm compiling Chromium. It's a pretty huge project that takes over forty minutes to compile from the command line on the Core i3 2100. But the results are repeatable and the compile process will stress all 12 threads at 100% for almost the entire time on a 980X so it works for me.

Build Chromium Project - Visual Studio 2008

Again we see a step function improvement when moving from four to six cores in our compile test, but no change between the 2600K and 3820. If you're building a dev workstation, you're going to want to either save money and grab a 2600K or move to six cores for better performance. It is worth mentioning, however, that if you need eight DIMM slots the 3820 might be a better option than the 2600K, allowing you to outfit your workstation with insane amounts of memory.

Excel Monte Carlo

Microsoft Excel 2007 SP1 - Monte Carlo Simulation

Our Monte Carlo simulation test is CPU bound, yet the 3820 shows only a marginal improvement over the 2600K.

SYSMark 2007 & 2012

Although not the best indication of overall system performance, the SYSMark suites do give us a good idea of lighter workloads than we're used to testing. SYSMark 2007 is a better indication of low thread count performance, although 2012 isn't tremendously better in that regard.

In SYSMark 2007 we see mild gains over the 2600K, while SYSMark 2012 shows a much bigger gap between the 3820 and the 2500K due to the former's support for 8 threads vs. 4.

SYSMark 2007 - Overall

SYSMark 2012 - Overall


  • horangl3e - Thursday, December 29, 2011 - link

    So if a user were to buy two graphics cards, the gtx 580 for example, would it be more beneficial for that user to use the X79 platform instead of the Z68? Currently I believe that the user would have to split between x8 x8 for each graphics card if they used the non X79 platform but with the X79 both cards would be able to use the proper x16 bandwidth for both graphics cards, right?
  • chizow - Thursday, December 29, 2011 - link

    There's not much benefit when using 2xPCIE 2.0 single-GPU cards with PCIE 2.0 x8 slots. They just don't need that much bandwidth.

    With multi-GPU cards however, that PCIE 2.0 x8 starts to choke them a bit and x16 starts showing its benefits.

    Another thing to keep in mind too is that with X79 (and IB?) they support PCIE 3.0, so with PCIE 3.0 cards, PCIE 3.0 x8 is the equivalent of PCIE 2.0 x16 in terms of bandwidth. Should be beneficial for multi-card solutions especially with multi-GPU cards.

    I'm assuming Ivy Bridge will also be PCIE 3.0 compliant, but if not, X79 might be even more appealing to people looking to buy the next-gen GPU offerings from AMD/Nvidia.

    Also, Anand or anyone else, since the PCIE controllers are on the CPU dies now, is it possible for SB to support PCIE 3.0 as well? Or are they just too different?
  • B3an - Thursday, December 29, 2011 - link

    Sandy Bridge could never support PCI-E 3.0 without a pretty major revision to the CPU's. Even then i'm not sure if the motherboards would actually work with it.

    If i was buying a quad core right now i would actually go for this i7 3820 over a 2500 - 2700K. Simply because it's more future proof for graphics cards. While no card needs more than x16 PCI-E 2.0 right now for games they certainly will in the future. Plus with SNB-E you can run two cards at x16 @ PCI-3.0 speed, but with SB it's only x8 @ 2.0 speed which already takes a slight performance hit with current cards. The graphics upgrade path for SNB-E will last for years to come.
  • Tchamber - Thursday, December 29, 2011 - link

    I have a Gulftown i7, and doesn't it support PCIe at 16x16? I didn't know Intel took a step back with SNB and PCIe lanes for multi GPUs. Do I understand this right?
  • SlyNine - Thursday, December 29, 2011 - link

    I believe with Gulftown the PCI-E Controller is not based on the CPU. Thats why you can get 16x16x8 with them.
  • SlyNine - Thursday, December 29, 2011 - link

    By based of course I mean located.
  • DanNeely - Thursday, December 29, 2011 - link

    LGA1366 supports 32 PCIe 2.0 lanes. That is more than LGA 1155's 16, but LGA 1155 isn't the official successor to 1366 (even if SB is fast enough its quads beat 1366 hexes on many benches); it's the LGA 1156 replacement and 1156 also only has 16 2.0 lanes. LGA 2011 replaces LGA 1366 (and LGA 1567 on high end Xeons); and is a major PCIe upgrade.
  • dgingeri - Thursday, December 29, 2011 - link

    Actually, the x58 chipset supports 36 lanes from the northbridge and 6 more from the southbridge:

    This allows three slots using x16/x16/x4 with 6 more for expansion devices or x16/x8/x8 with 10 more for expansion devices. (My P6T is the former while my Rampage III Formula is the latter. Yes, I have 2 lga1366 systems, one a server and one a gaming machine.)

    The 40 lanes from the processor and the 8 lanes from the chipset will be a big boost for using certain devices. I'll be able to run both my video cards at x16 while still using my 10Gbe and LSI RAID cards with x8 slots. :)
  • chizow - Thursday, December 29, 2011 - link

    SNB/P67/Z68 was the successor to the short-lived Lynnfield/Clarkdale P55-based platforms.

    SB-E/X79 is the direct successor to Nehalem/Gulftown/X58.

    Intel has just had the luxury of pushing back the release dates of their parts because no matter what they sell, its faster than AMD and still netting them boatloads of cash.

    So yes, for some time Intel's leading CPUs have been behind on the platform side of things, SB-E settles the balance and represents an upgrade over Nehalem in every aspect.

    This will change again however with IVB, which will be on a smaller process node and probably swing the clockspeed, Turbo, overclocking, and power consumption considerations back in favor of the weaker P67/Z68 platform.

    The main difference however is that IVB will also support PCIE 3.0 so the fewer PCIE lanes will be less of a disadvantage on cards that support PCIE 3.0 when used in multi-GPU configs.
  • Kevin G - Thursday, December 29, 2011 - link

    Ivy Bridge will be supported on current socket 1155 motherboards. It'll bring PCI-E 3.0 but currently motherboards are going to be hit or miss if they support that speed. Intel won't officially support PCI-E 3.0 with the P67/Z68 (and related) chipsets but motherboard manufacturers can take that burden if they choose.

    The main reason to go with socket 2011 isn't a single GPU but rather running multiple GPU's that'll need that bandwidth. For gaming, the performance difference is only a few percentage points. For GPGPU, the difference is greater but if that is the target market, then using a multi-socket 2011 motherboard populated with Xeons is more likely. That'd allow for four PCI-E 16x and two PCI-E 8x lanes all at 3.0 speeds.

    For the majority of consumers, socket 1155 will remain good enough for 2012.
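
The bandwidth comparison raised in the comments, PCIe 3.0 x8 versus PCIe 2.0 x16, can be checked with a little arithmetic. The sketch below is an approximation: it uses the per-lane signaling rates and line encodings of each generation (5 GT/s with 8b/10b for PCIe 2.0, 8 GT/s with 128b/130b for PCIe 3.0) and ignores packet and protocol overhead. The names `PCIE_GENERATIONS` and `link_bandwidth_gb_per_s` are hypothetical.

```python
# Per-lane signaling rate (GT/s) and line-encoding efficiency per generation.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gb_per_s(generation: str, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    rate_gt, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt * efficiency * lanes / 8  # divide by 8 to go from bits to bytes


gen2_x16 = link_bandwidth_gb_per_s("2.0", 16)
gen3_x8 = link_bandwidth_gb_per_s("3.0", 8)

print(f"PCIe 2.0 x16: {gen2_x16:.2f} GB/s")
print(f"PCIe 3.0 x8:  {gen3_x8:.2f} GB/s")
```

By this estimate the two links land within about 2% of each other, so treating PCIe 3.0 x8 as roughly equivalent to PCIe 2.0 x16 is a fair approximation.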
