Pedal to the Metal
Editor's note: NVIDIA's Quadro CX card, like other Quadro cards, is designed and built by NVIDIA. The intent is to guarantee consistency, performance and value - even when marketed by third-party partners. This strategy is in contrast to NVIDIA's GeForce line of cards, where NVIDIA's partners build their own cards based around NVIDIA's chipsets. The board that we tested was a sample provided to us directly by NVIDIA. Any Quadro CX-based video card will have similar performance characteristics to the card reviewed here.
NVIDIA has something that could be nothing short of revolutionary for creative users: the Quadro CX chipset. The Quadro CX includes a technology called CUDA, which stands for Compute Unified Device Architecture. Basically, the idea is to offload the heaviest number-crunching from the CPU onto the GPU (graphics processing unit). The GPU itself is made up of 192 individual cores.
NVIDIA markets the chipset as an accelerator for Adobe Premiere Pro/Photoshop/After Effects CS4, TMPGEnc 4.0 XPress and a lot of other apps (check out NVIDIA's CUDA zone at www.nvidia.com/object/cuda_home.html for more details).
Hardware-wise, the card is thick and huge (mainly because it includes a huge fan). It's a PCI Express x16 card that requires an additional 6-pin power connector. There's also an SLI connector - if you're made of money, you could buy two of these bad boys and tie them together to potentially get through even more work.
Under the casing of the card, there is 1.5GB of memory onboard (yes, that's right - more memory than you would find in your basic entry-level desktop computer).
The card includes a DVI port, two DisplayPort connectors (they look like HDMI ports, but they're completely different) and a 3-pin port for attaching stereoscopic glasses. We think two DVI ports would be a lot more useful - we would be surprised if DisplayPort takes off. (Yes, Mac users, we know that Apple has put them in the newest MacBooks and MacBook Pros. We read all about it.)
There's also a daughtercard available that can provide SDI output for After Effects and Premiere Pro, in particular. SDI also provides for 30-bit color, if you need color depths that great.
We tested the Quadro CX in an iBUYPOWER system running Windows Vista Home Premium 64-bit. The machine runs an Intel Core 2 Quad Q9550 at 2.83GHz and includes 8GB of RAM, so it's no slouch in the performance department, by any means.
Testing: Adobe CS4
NVIDIA claims a 4X encode speed advantage for H.264 rendering out of Premiere Pro (well, technically, Adobe Media Encoder), using Elemental Technologies' RapiHD (quoted with an Intel Core 2 Duo running at 2.33GHz). Using the above configuration, however, we found that the Quadro CX provided a speed advantage of about 2X. We wonder if there would be any encoding speed advantage at all if we tested on an 8-core system. However, NVIDIA points out that you can further leverage the power of CUDA if you find yourself needing to do tasks like video upsampling for full HD.
After Effects really shines when it comes to realtime previews of your comps. The handling of bilateral blur effects was remarkable - while one particularly advanced clip previewed at a speed far less than real time, giving us preview speeds around 5fps, turning off CUDA meant our speed plummeted to about 0.33fps. The turbulent noise filter, which can create gorgeous fire and water effects, had more dramatic results - the previews with CUDA were almost always real time (surely smooth enough to pass for real time even when After Effects reported that it was occasionally not in real time), but turning off CUDA made the preview crawl. Cartoon effects were also noticeably faster, particularly once enough frames were rendered for the system to use RAM previews.
We were very impressed with the added features for Photoshop CS4. We loaded in a 720MB sample file and were able to zoom in and out smoothly and seamlessly. Rotating a huge image can be done in real time (every image we could possibly throw at it could be manipulated without a problem at all). It really shines if you use DNG (digital negatives) or RAW files. If you are a serious Photoshop or After Effects user, you'll want this card - no doubt about it.
Testing: TMPGEnc 4.0 XPress
TMPGEnc 4.0 XPress is one of the best video encoders out there. If you've never tried it, the trial is very much worth playing with. We were excited to try out the software with some HD clips we had on hand from the Videomaker Short Video Contest.
We had previously encoded the contest clips on a very old machine: a 2.66GHz P4 with a 533MHz front-side bus. The software could slog through some of the HD clips we fed it on that machine, but not all of them. We set the new test machine loose on those clips, which included two WMV-HD files, an MPEG-2 HD file and two QuickTime HD files.
At first, the render speed blew us away. Between the much-faster hardware we were using for our testing and the new GPU, an H.264 (MPEG-4 AVC) HD 2-pass variable bitrate render was faster by a factor of 6 (e.g., a render that took 3 hours on the previous hardware now took about 30 minutes). The analysis pass was much faster than the encode pass.
Interestingly, we found that the CPU usage stays relatively high during an encode, across all four CPU cores. We found that with CUDA turned on, the GPU performed 92% of the work, but the main CPU still did about 8% of the work.
However, further testing revealed something very interesting. When we compared encodes on the new machine with CUDA turned on and off, the performance advantage from the GPU itself turned out to be fairly small; in fact, on the WMV files we tested with, the encode was a little faster when CUDA was turned off. On the QuickTime clips, having CUDA turned on caused the test to fail.
We think CUDA is a really great idea in theory. However, if you have a quad- or 8-core system of recent vintage and aren't doing much of anything other than Premiere Pro, the speed advantages provided by the card might not make it a worthwhile add-on to your system. The $1,999 retail price and $1,799 street price are both very high. Also consider that this much money could certainly buy some or all of a new quad- or 8-core workstation.
While the hardware is very stable and undeniably cool, it's not for everyone. You have to decide whether the benefit is worth the huge price tag that you'd pay now. (Just wait a few months, though, and it'll probably get cheaper.) If you're working on jobs that have to get done now, particularly those that are heavy in After Effects or Photoshop, it's worth a very hard look.
Premiere Pro Render Times: 1440x1080, 23.976fps, 1-pass VBR CPU: 0:01:07; GPU: 0:00:3
TMPGEnc 4.0 XPress Render Times:
007 (1080 MPEG): 28:10 w/ hardware, 32:35 w/o
069 (720 WMV): 38:53 w/ hardware, 38:21 w/o
070 (720 WMV): 46:01 w/ hardware, 45:45 w/o
072 (1080 QT): would not encode w/ hardware, 1:39:01 w/o
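For readers who want to check the arithmetic behind the TMPGEnc numbers, here is a minimal Python sketch that turns the render times above into speedup ratios. The function and variable names are our own for illustration and are not part of any encoder's tooling. Clip 072 is left out because it would not encode with hardware acceleration.

```python
# Render times copied from the TMPGEnc 4.0 XPress results above, as mm:ss.
# Each entry is (time with hardware acceleration, time without).
# Clip 072 is omitted: it would not encode with hardware acceleration.
RENDER_TIMES = {
    "007 (1080 MPEG)": ("28:10", "32:35"),
    "069 (720 WMV)": ("38:53", "38:21"),
    "070 (720 WMV)": ("46:01", "45:45"),
}


def to_seconds(mmss):
    """Convert an 'mm:ss' string to a whole number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(with_hw, without_hw):
    """Return time-without / time-with; above 1.0 means the GPU helped."""
    return to_seconds(without_hw) / to_seconds(with_hw)


def summarize(times=RENDER_TIMES):
    """Map each clip name to its speedup ratio, rounded to three places."""
    return {
        clip: round(speedup(with_hw, without_hw), 3)
        for clip, (with_hw, without_hw) in times.items()
    }


if __name__ == "__main__":
    for clip, ratio in summarize().items():
        verdict = "faster" if ratio > 1 else "slower"
        print(f"{clip}: {ratio}x ({verdict} with hardware)")
```

By this measure, the 1080 MPEG clip comes out roughly 1.16 times faster with hardware acceleration, while both 720 WMV clips land just under 1.0, which matches our observation that the WMV encodes were a little faster with CUDA turned off.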
- Solid hardware
- Massive Photoshop performance gains
- Extremely expensive
- Performance gains less noticeable for video encoding
Whether the performance gains will be worth it to you depends on what you need to do and how much money you have just lying around.
Charles Fulton is Videomaker's Technical Editor.
2701 San Tomas Expy.
Santa Clara, CA 95050