by Kevin Goldsmith
There has been a lot of confusion around Pixel Bender and GPUs in CS4. Admittedly, some of it was caused by me :). I wanted to do a clarifying post about GPU, Pixel Bender, and multi-core and how apps in CS4 do different things.
One thing I wanted to correct is the assumption that GPU = FASTER. I’ve seen this misconception a lot, and I think it is confusing some people. The chips on graphics cards (GPUs) are extremely efficient processors, capable of doing lots of math in parallel, and they have the benefit of fast local memory with a super fast connection to the GPU itself. This makes them ideal for the kinds of things that Pixel Bender does. However, this super-efficient processor is connected to the main computer processor (the CPU) by a not-so-fast connection, the bus. Moving data on and off of the GPU is expensive relative to doing things on the GPU directly. What this means is that if you want to do something on the CPU with some data, then do something on the GPU, and then use the output of the GPU on the CPU again, it might be more expensive than having just done the whole thing on the CPU in the first place. The overhead of the bus transfers can overwhelm the benefits of the fast GPU computation. The buses are getting faster, and whether things work better in one place or the other varies a lot from machine to machine. There are a ton of other details I’m glossing over. I’m just trying to make a central point here: the GPU is not always faster than the CPU.
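To make the bus-overhead point concrete, here is a back-of-the-envelope sketch in Python. It is only a toy cost model, not a measurement of any real hardware: the function names, the per-pixel costs, the bus bandwidth, and the fixed overhead are all made-up assumptions chosen to show the shape of the trade-off.

```python
# Toy cost model: CPU-only versus a CPU -> GPU -> CPU round trip.
# Every number below is a hypothetical assumption, not a benchmark.

def cpu_time_ns(pixels, cpu_ns_per_pixel):
    """Time to run the filter entirely on the CPU."""
    return pixels * cpu_ns_per_pixel

def gpu_round_trip_ns(pixels, gpu_ns_per_pixel, bytes_per_pixel,
                      bus_bytes_per_ns, fixed_overhead_ns):
    """Time to upload the image, run the filter on the GPU, and read it back."""
    one_way_transfer = pixels * bytes_per_pixel / bus_bytes_per_ns
    compute = pixels * gpu_ns_per_pixel
    return fixed_overhead_ns + 2 * one_way_transfer + compute

def faster_device(pixels, cpu_ns_per_pixel, gpu_ns_per_pixel,
                  bytes_per_pixel=16, bus_bytes_per_ns=2.0,
                  fixed_overhead_ns=100_000):
    """Return 'cpu' or 'gpu', whichever the toy model predicts is faster."""
    cpu = cpu_time_ns(pixels, cpu_ns_per_pixel)
    gpu = gpu_round_trip_ns(pixels, gpu_ns_per_pixel, bytes_per_pixel,
                            bus_bytes_per_ns, fixed_overhead_ns)
    return "cpu" if cpu <= gpu else "gpu"

pixels = 1920 * 1080

# A cheap filter: the two bus transfers cost more than the CPU work itself.
light = faster_device(pixels, cpu_ns_per_pixel=5, gpu_ns_per_pixel=0.1)

# An expensive filter: the GPU's speedup outweighs the transfers.
heavy = faster_device(pixels, cpu_ns_per_pixel=200, gpu_ns_per_pixel=4)

print(light, heavy)
```

With these invented numbers the cheap filter comes out faster on the CPU and the expensive one faster on the GPU. Change the bus bandwidth or the per-pixel costs and the crossover point moves, which is the machine-to-machine variation mentioned above.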
Pixel Bender is designed to run very efficiently on the GPU, but that design also allows it to execute extremely efficiently on a multi-core CPU. In Flash Player 10, Pixel Bender does not run on the GPU. It does run multi-threaded, though, and it executes really fast, especially on multi-core and multi-processor machines (see Tinic’s post for more info). The Flash team really has done an outstanding job with their JITter and their multi-threading, and Pixel Bender runs pretty darn fast on every machine I’ve tried (from a lowly single-core laptop to an 8-core Penryn Mac Pro).
In After Effects CS4, all the OpenGL effects, including the new Cartoon, Turbulent Noise, and Bilateral Blur effects, are written in Pixel Bender and can run on the GPU or the CPU. When don’t they run on the GPU? When you have a non-GPU effect following them in the effects chain on the layer. In those cases, the result has to come back across the bus for the next effect, so it isn’t clear that you would get a performance gain by running on the GPU. Cartoon is the exception: the algorithm is complex enough that AE assumes it is always faster on the GPU. All third-party Pixel Bender filters run multi-threaded on the CPU. This was an architectural decision.
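The After Effects rules above can be written down as a small decision function. This is purely an illustration of the rules as described in this post, not After Effects code: the `Effect` class, the `plan_chain` function, the `gpu_available` flag, and the placeholder effect names other than Cartoon and Turbulent Noise are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Effect:
    name: str
    gpu_capable: bool          # True for the built-in OpenGL effects
    third_party: bool = False  # True for third-party Pixel Bender filters

def plan_chain(effects, gpu_available=True):
    """Return (effect name, 'gpu' or 'cpu') for each effect on a layer."""
    plan = []
    for i, effect in enumerate(effects):
        next_effect = effects[i + 1] if i + 1 < len(effects) else None
        followed_by_non_gpu = (next_effect is not None
                               and not next_effect.gpu_capable)
        if not gpu_available or effect.third_party or not effect.gpu_capable:
            device = "cpu"   # third-party filters always run on the CPU
        elif effect.name == "Cartoon":
            device = "gpu"   # assumed always worth the trip to the GPU
        elif followed_by_non_gpu:
            device = "cpu"   # GPU gain is unclear, so stay on the CPU
        else:
            device = "gpu"
        plan.append((effect.name, device))
    return plan

cpu_only = Effect("HypotheticalCpuEffect", gpu_capable=False)
noise = Effect("Turbulent Noise", gpu_capable=True)
cartoon = Effect("Cartoon", gpu_capable=True)
custom = Effect("HypotheticalThirdPartyFilter", gpu_capable=False,
                third_party=True)

print(plan_chain([noise, cpu_only]))    # noise is followed by a non-GPU effect
print(plan_chain([cartoon, cpu_only]))  # Cartoon stays on the GPU regardless
print(plan_chain([custom, noise]))      # third-party filter runs on the CPU
```

Whether the real scheduler looks only at the next effect or at the whole chain is not something this post specifies; the sketch checks only the immediately following effect.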
In the Photoshop plug-in, Pixel Bender filters always run on the GPU if you have a graphics card that is supported by CS4. Otherwise, the filters run multi-threaded on the CPU. The new canvas rotate-pan-and-zoom and the gigantor image support are all done using the GPU. John Nack has lots of details on his blog. One thing I wanted to correct about Photoshop CS4: it is not using CUDA. Not sure how this rumour got out there, but it isn’t true. Not that we aren’t fans of CUDA; we just aren’t shipping anything that uses it in CS4.
There are other apps in CS4 with GPU support, but I wanted to keep this post to the ones that support Pixel Bender, just to clear up the confusion.