
Test System And Benchmarks

Nvidia Quadro M6000 Review

Test System

CPU: Intel Core i7-4930K
Cooler: Be Quiet! Dark Rock Pro 3
Motherboard: Asus Rampage IV
Memory: 64GB Corsair Dominator Platinum
Storage: 2 x Samsung 850 Pro (256GB system drive and 256GB app/benchmark drive)
Operating System: Windows 7 Ultimate SP1 (64-bit)
Drivers: Quadro 374.52; Catalyst Pro 8.01.01.1423
Power Consumption Measurement:
Contactless DC measurement on the PCIe slot (riser card)
Contactless DC measurement on the external PCIe power supply
Direct voltage measurement on the power supply
2x HAMEG HMO 3054, 500MHz multi-channel oscilloscope
4x HAMEG HZO50 current clamp adapter (1mA to 30A, 100kHz, DC)
4x HAMEG HZ355 (10:1 probe, 500MHz)
Infrared Measurement: 1x Optris PI450 80Hz High-Res Infrared Camera + PI Connect
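
The power setup listed above pairs a current reading and a voltage reading for each supply path, so a card's total draw is the sum of voltage times current over all of them. The following is only a minimal sketch of that arithmetic, not the lab's actual tooling: the function name, the rail names and the sample values are all hypothetical and chosen purely for illustration.

```python
from statistics import mean

def average_power(rails):
    """Return the mean total power in watts.

    `rails` maps a rail name to a pair of equally long sample lists:
    (voltages in volts, currents in amperes), taken at the same instants.
    """
    lengths = {len(trace) for pair in rails.values() for trace in pair}
    if len(lengths) != 1:
        raise ValueError("all traces must have the same number of samples")
    n = lengths.pop()
    # Instantaneous power at each sample is the sum of V * I over all rails.
    totals = [sum(v[k] * i[k] for v, i in rails.values()) for k in range(n)]
    return mean(totals)

# Hypothetical values for illustration only, not measurements from this review.
example = {
    "slot_12v": ([12.0, 12.0], [2.0, 3.0]),
    "slot_3v3": ([3.3, 3.3], [1.0, 1.0]),
    "aux_12v": ([12.0, 12.0], [10.0, 12.0]),
}
print(average_power(example))
```

Multiplying sample by sample before averaging matters when load fluctuates quickly; multiplying an average voltage by an average current would only approximate the result.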

AutoCAD 2015: 2D And 3D Performance

Autodesk switched over to Direct3D and away from OpenGL across almost all of its product line. That makes our story both simpler and more complex.

On one hand, special optimizations aren't as necessary. Moreover, consumer-oriented graphics cards perform equally well with the conventional GeForce or Catalyst drivers. On the other hand, differentiating between boards becomes more difficult, since 2D performance is similar up and down the line. Each tested card performs equally well (or badly), so our bar charts don't tell any crazy stories.

The differences are more pronounced when we start looking at 3D performance, although it must be said that all of the cards perform well. We're basically seeing that even the lower-end boards are perfectly adequate in this application; you don't always need a massive hammer for driving nails. Every workload in the Cadalyst suite would have run acceptably on each tested card.

Inventor Fusion 2013

We now start to see how 3D performance shakes out in a more taxing metric. Even so, in applications like this you hardly need the power of Nvidia's Quadro M6000.

Wall Element Management

We purposely looked for a piece of software that makes better use of graphics resources. The program we selected is not for the masses; it’s used in the construction industry for wall element management. One side of the coin is optimizing the wall structure from single stones while taking overlap dimensions, connections and custom shapes into consideration. The other is representing complex formations with countless stones in the walls of a floor plan, which poses a real challenge, especially since programs like this still use DirectX 8 or 9. So, we’re really talking about raw performance here; specific driver optimizations are rarely effective.

2D Performance On GDI/GDI+

Today, almost all GUIs are rendered entirely using the Graphics Device Interface. Interestingly, though, in the past most of these API functions were directly supported by GPUs. But with the introduction of unified shader architectures, the approach of directly and separately implementing special 2D functions in hardware was already dying. Windows Vista finished that concept off by introducing a new driver model that effectively sent all functions (with just a few exceptions) to the D3D interface.

This “detour” still had to be optimally built into the drivers, though. Otherwise, we would have quickly found ourselves wondering why overloaded program interfaces were refreshing so slowly, while desktop and 3D content ran smoothly. 

In this context, we refer you to the detailed description of our benchmarks in the five-year-old performance experiment called 2D, Acceleration, and Windows: Aren’t All Graphic Cards Equal? By the way, this situation is hardly different, even today, though graphics processor vendors did increase performance after our article was published.

We chose the following benchmarks to measure direct output to the device, rather than temporary “preparatory work” in a DIB (a device-independent bitmap held in memory) followed by a copy.

Along with CPU-dependent stretching, blitting is the only remaining function that can still be implemented directly rather than via D3D.

We stopped offering our benchmark as a freely available download a while back because the results depend too heavily on host processing; comparisons between systems just don't make sense. Standardizing on one platform is the only way to generate meaningful results moving forward.

SPECviewperf 12

The latest edition of SPECviewperf includes the source code of various professional-class applications, along with corresponding workloads. It's a collection of tests that most of us wouldn't be able to put together on our own, if only because the software licenses are so expensive, to say nothing of the work involved in creating representative sample files. So, this benchmark is a good barometer of where graphics card development stands today and where it is heading.

If you aren't already familiar with SPECviewperf 12, check out Workstation Graphics: 19 Cards Tested in SPECviewperf 12, where we present the individual benchmarks in detail.



Nvidia's Quadro M6000 surfaces as the yardstick by which all other cards are measured in SPECviewperf 12. Its clear dominance in many of the tests is definitely impressive. But let's not forget that AMD's GCN architecture is getting older, too.

Standard benchmarks run the risk of targeted optimization over time, and we see this as a prominent issue for a tool as widely used as SPECviewperf. As the next section shows, we get our proof in SolidWorks 2013 SP1 and the corresponding SPECapc benchmark.

SolidWorks 2013 SP1

While the code snippet from SolidWorks 2013 SP1 contained in SPECviewperf 12 runs on gaming cards, the benchmark sequence in SPECapc is much more extensive. You really need certified workstation graphics cards and their matching drivers. Because of this, we were excited to compare the results from both metrics. Lo and behold, while Nvidia's Quadro M6000 did well in SPECviewperf 12, its SPECapc results tell a completely different story.

We can only assume that Nvidia was under significant time constraints, because we see absolutely no trace of driver optimization. To make matters worse, no SolidWorks certification seems to be available. Driver work of this kind is typically the answer to why workstation-class graphics cards are so expensive: a lot of optimization goes into their software, and that costs a lot of money. It's just not evident yet for such a fresh piece of hardware.

We are sure that Nvidia will address this oversight quickly. AMD is also working hard on enhanced drivers, so we’ll repeat this test later and only then draw more sweeping conclusions.

3ds Max 2013 And Iray

Our render scene is deliberately kept simple, creating an ideal compromise for testing high- and lower-end workstation-class cards. As it turns out, the Quadro M6000's performance is slightly better than the older K6000, though we still see room for optimization.

Octane 2.7

In the case of Octane, the difference is about the same. We're using the latest version for our benchmarks, so data generated for previous stories isn't comparable.

Blender 2.73

Again, this is the latest version of Blender, so don't bother comparing the results of other cards tested previously. We deliberately chose a tile size of 256x256 pixels, which proves optimal for GPU-based rendering.
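
To give a rough feel for what the tile size means in practice, the sketch below counts how many tiles a frame breaks into. The 1920x1080 resolution is an assumption for illustration, since the render resolution isn't stated here, and `tile_count` is a hypothetical helper rather than anything from Blender's API.

```python
import math

def tile_count(width, height, tile=256):
    """Number of square tiles needed to cover a width x height frame."""
    # Partial tiles at the right and bottom edges still count as tiles.
    return math.ceil(width / tile) * math.ceil(height / tile)

print(tile_count(1920, 1080))      # 8 columns x 5 rows = 40 tiles
print(tile_count(1920, 1080, 32))  # 60 columns x 34 rows = 2040 tiles
```

With 256x256 tiles the assumed frame splits into a few dozen large work units, compared to a couple of thousand at 32x32, which fits the observation that large tiles suit GPU-based rendering.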

The delta between cards remains almost unchanged, so we ventured a guess at the ultimate victor based on an emerging trend. As you might imagine, Nvidia's Quadro M6000 comes out the winner.

CUDA: FluidMark

Granted, synthetic metrics are of limited utility, but this one's outcome reflects what we've already seen across the other tests: one card dominates the rest.

OpenCL vs. CUDA: ratGPU

The better implementation of ratGPU's CUDA code path isn't the only reason Nvidia's Quadro M6000 scores so highly.

OpenCL Rendering: LuxMark 2.0

Across all three levels of complexity, the gap between AMD's once-unbeatable FirePro W9100 and the new M6000 is a chasm. The Hawaii-based card used to be the yardstick, but Nvidia's Maxwell architecture now lands at the top of the OpenCL food chain.

OpenCL: Single-Precision (FP32)

On paper, the difference in peak single-precision compute performance between Nvidia's Quadro M6000 and K6000 isn't this large. However, the company did put a lot of work into maximizing the utilization of its SMM units, which could be what we're seeing in practice here. Depending on the workload, M6000's performance is enough to smoke the Kepler-based board, but it can't consistently trump AMD's GCN architecture.

OpenCL: Double-Precision (FP64)

The Quadro M6000 does a lot, but it can’t do everything. Double-precision math is Maxwell's biggest weakness, at least as it's exposed in GM200. You only get 1/32 of the FP32 rate, yielding downright poor performance. Nvidia would rather see you buy one of its pricey Tesla cards if you need big FP64 numbers.
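
To put the 1/32 ratio into numbers, here is a back-of-the-envelope sketch. It assumes the usual peak formula of two floating-point operations per core per clock (one fused multiply-add), 3072 CUDA cores for the full GM200, and a round 1000MHz clock picked purely for easy arithmetic; the clock in particular is an assumption, not a figure from this review, and the function names are hypothetical.

```python
def peak_gflops(cores, clock_mhz, flops_per_core_per_clock=2):
    """Theoretical peak throughput in GFLOPS."""
    return cores * clock_mhz * flops_per_core_per_clock / 1000.0

def fp64_from_fp32(fp32_gflops, ratio=1 / 32):
    """Scale an FP32 peak by the FP64:FP32 rate ratio."""
    return fp32_gflops * ratio

fp32 = peak_gflops(cores=3072, clock_mhz=1000)  # assumed core count and clock
fp64 = fp64_from_fp32(fp32)
print(fp32, fp64)  # 6144.0 192.0
```

Under those assumptions, roughly 6 TFLOPS of single-precision throughput shrinks to under 0.2 TFLOPS in double precision, which is why the FP64 results look so poor.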