Bumpy road to multi-core: Ars reviews the 12-core 2010 Mac Pro

What has twelve cores, twenty-four threads, a brand new GPU, and still doesn't …

Video benchmarks

Final Cut Pro

Not pretty. I made sure to disable pre-render in the app prefs so we're just seeing rendering for processing done during output. Nothing's cached, so the results are consistent. Looking at the CPU usage, it never went over 400%:

There are scenarios where core usage can reach 800%, such as when you're using a multithreaded codec like ProRes 422 for both input and output from Final Cut. I was told that FCP's Picture-in-Picture feature also uses multiple cores well. That's good to know, since video feedback would suffer if it weren't multi-threaded. But even so, it's not saving people time when rendering footage, which is what's really needed here. The Smoothcam interpolation, a brutally slow operation, only used two threads:

People are going to be buying these with the impression that FCP will use more than a third of the processor power.
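As a rough sketch of where the "a third" figure comes from, the snippet below converts a utilization reading into a fraction of total capacity. It assumes the readings are Activity Monitor-style percentages, where one fully busy core counts as 100%; the function name is made up for illustration.

```python
# Assumption: utilization is reported the way Activity Monitor does it,
# with one fully busy core counting as 100%.
def utilization_fraction(observed_percent: float, cores: int) -> float:
    """Return the share of total CPU capacity that is actually in use."""
    return observed_percent / (cores * 100.0)

PHYSICAL_CORES = 12  # from the review: 12-core Mac Pro
LOGICAL_CORES = 24   # from the review: 24 hardware threads
OBSERVED = 400.0     # FCP's ceiling during the render test

print(f"vs. physical cores: {utilization_fraction(OBSERVED, PHYSICAL_CORES):.0%}")
print(f"vs. logical cores:  {utilization_fraction(OBSERVED, LOGICAL_CORES):.0%}")
```

Measured against the twelve physical cores, 400% works out to a third of the machine. Measured against all twenty-four hardware threads, it is closer to a sixth.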

Compressor had an equally weak showing. Even using the Qmaster QuickCluster workaround, which forces each thread to act as a separate worker, the core usage was dismal. I tried a bunch of different codecs (ProRes 422, H.264, DVD MPEG) and all of them made equally poor use of the multiple cores.

That Apple is still the weakest link with Final Cut Pro is discouraging. If ever there was a program that should have been reworked early for Grand Central Dispatch, Final Cut Pro is it. I'll remain optimistic and hope that better multi-core support is in the next Final Cut Pro release. Still, sliding a multi-core rug under a behemoth of a video application is no easy undertaking. Just ask...

After Effects CS5

Another poor result here. For these video render tasks, I made sure to push CPU utilization, not disk bandwidth or compression. There are many 3D layers in this test, with some filters applied to some individual layers. This may seem like I was trying hard to hold AE's hand and help it across the multi-lane street, but this is a very typical After Effects project and workflow.

Still, that wasn't enough to get good results. AE's results are similar to Photoshop's. It obviously still needs better threading for many cores. CPU utilization was pretty gross, but a tiny bit better than Final Cut Pro's:

The project was in SD resolution, and RAM allocation for both configurations was the same, so it's not the RAM that's the problem with the default settings. Changing the multi-core settings in the app preferences made it slower, likely because of RAM: each thread needs at least 750MB, which adds up to 18GB across 24 threads and forces paging to virtual memory. I even tried to see if I could get this project to push the cores more by scaling the comp up to 1080p and then duplicating and offsetting some of the 3D layers so there were more thread opportunities and a lot more computation. No dice:
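The memory arithmetic in the paragraph above can be checked with a few lines. This is only a back-of-the-envelope sketch: the 750MB-per-thread minimum comes from the text, while the function name and the use of decimal units (1GB = 1000MB, which matches the round 18GB figure) are assumptions.

```python
MB_PER_GB = 1000  # assumption: decimal units, matching the round "18GB" figure

def render_memory_gb(threads: int, mb_per_thread: int = 750) -> float:
    """Minimum RAM needed if every render thread gets its own allocation."""
    return threads * mb_per_thread / MB_PER_GB

for threads in (12, 24):
    print(f"{threads} threads -> at least {render_memory_gb(threads)} GB")
```

Any machine with less free RAM than the 24-thread figure would start paging once every hardware thread gets its own allocation, which is consistent with the slowdown described.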

Hopefully, CS6 will bring some much-needed improvements to After Effects' core usage. This is the good thing about having a multi-core system: it's probably only going to get faster as programmers write more multi-threaded code. But Adobe's task is harder than Apple's with Final Cut, because Adobe is writing cross-platform code. This is probably why Adobe and NVIDIA are scratching each other's backs so hard, with many cross-promotional efforts pushing CUDA GPU processing in Adobe products. With CUDA also being ported to x86 CPUs, the benefits will spread to users without NVIDIA GPUs; it just won't be as fast on the CPU as it is on the GPU. Don't expect an OpenCL port of these features: Adobe has their logo on NVIDIA's Quadro tour bus for a reason. Before you scream bloody murder, you should know there are plenty of other high-end video apps that are also CUDA-only.

Nuke

Since I reviewed the 2009 Mac Pro, I've learned how to use the 3D compositor Nuke, and it's the patron saint of multi-core video. Nuke has a few advantages: it doesn't have a ton of single-threaded legacy code holding it back, and its scanline renderer works a lot like a 3D rendering program. The test project for the benchmark is a common test scenario for Nuke: a few 32-bit EXR render passes composited over a live-rendered, UV-mapped, 3D OBJ mesh. The length of the Handbrake benchmark bar visually downplays the Nuke gains, but Nuke is 25% faster on the 12-core Mac Pro. That's not a linear gain, but it's still significant.
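To put "not a linear gain" in numbers, here is a minimal sketch of a scaling-efficiency calculation. It rests on two assumptions that the text above does not state outright: that the comparison machine is an 8-core Mac Pro, and that "25% faster" means 1.25 times the throughput. It also ignores any clock-speed difference between the two machines.

```python
def scaling_efficiency(speedup: float, old_cores: int, new_cores: int) -> float:
    """Measured speedup divided by the ideal (linear) speedup from added cores."""
    ideal = new_cores / old_cores
    return speedup / ideal

# Assumptions: 8-core baseline, and "25% faster" read as 1.25x throughput.
measured = 1.25
print(f"ideal speedup: {12 / 8:.2f}x")
print(f"efficiency:    {scaling_efficiency(measured, 8, 12):.0%}")
```

Under those assumptions, Nuke captures a bit more than four fifths of the ideal 1.5x gain from the extra cores.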

Nuke defaults to using only physical CPU cores, but you can use a TCL command ("set threads 24") or a command-line argument to force it to use all hyper-threaded cores:

I stuck with the default 12 threads for the test; I'll explain why later on.

Handbrake CLI 64-bit

For the Handbrake benchmark, I took a section from my Nightmare Before Christmas Blu-ray and converted it to an iPad-compatible H.264 .m4v file. This was another wince-inducing video conversion. The results were predictable from what I've seen of Handbrake's CPU usage with the 8-core.
