by Kevin Goldsmith

Created

August 19, 2009

I’ve been experimenting with processing audio using Pixel Bender in Flash, and I realized that it was hard to find working code to get started with, so I wrote up this sample. I tried to write a minimal app that does something interesting and could serve as a starting point for someone else. To that end, this AIR sample app loads an MP3 file, and the embedded Pixel Bender kernel lets you change the level of each channel separately.

The MXML code is below:

All the action is in the ProcessAudio function, which pulls samples from the input file and executes a ShaderJob across them. There is something important to note here:

effectShader.data["source"].width = BUFFER_SIZE / 1024;
effectShader.data["source"].height = 512;
effectShader.data["source"].input = shaderBuffer;
effectShader.data["volume"].value = [ leftSlider.value, rightSlider.value ];

var effectJob:ShaderJob = new ShaderJob( effectShader, event.data, BUFFER_SIZE / 1024, 512 );
effectJob.start(true);

I pass the buffer into the ShaderJob as a 2D buffer instead of a buffer with a height of one. This may make less sense logically, but Flash Player splits the data up by rows for multi-threading, so a multi-row buffer should perform faster.
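The sizes in that snippet can be checked with a little arithmetic. The sketch below is in TypeScript rather than ActionScript so that it can be run outside Flash Player. `shaderDimsFor` and the constant names are hypothetical, and it assumes interleaved stereo samples stored as 32-bit floats and a shader input holding four floats per pixel.

```typescript
// Hypothetical helper, not part of the original sample.
const BYTES_PER_FLOAT = 4;  // assumed: each sample value is a 32-bit float
const CHANNELS = 2;         // assumed: stereo, left and right interleaved
const FLOATS_PER_PIXEL = 4; // assumed: 4-channel shader input

interface ShaderDims {
  byteLength: number; // bytes needed in the sample buffer
  width: number;      // width passed to the ShaderJob
  height: number;     // height passed to the ShaderJob
}

function shaderDimsFor(bufferSize: number): ShaderDims {
  // bufferSize counts stereo samples, so bytes = samples * 2 channels * 4 bytes.
  const byteLength = bufferSize * CHANNELS * BYTES_PER_FLOAT;
  // Each pixel consumes 4 floats = 16 bytes.
  const pixels = byteLength / (FLOATS_PER_PIXEL * BYTES_PER_FLOAT);
  // Same layout as the snippet above: BUFFER_SIZE / 1024 columns, 512 rows.
  const width = bufferSize / 1024;
  const height = 512;
  if (width % 1 !== 0 || width * height !== pixels) {
    throw new Error(`bufferSize ${bufferSize} does not fill a ${width} x ${height} image`);
  }
  return { byteLength, width, height };
}
```

For a hypothetical BUFFER_SIZE of 4096, this gives 32768 bytes and a 4 x 512 image, which is 2048 pixels of 16 bytes each.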

Here is the kernel:

One thing to notice here is that rather than using an image2 as input and a pixel2 as output (which, again, may make more sense logically), I keep the buffer layout as it is and process two stereo samples at a time. This should also give you better performance for filters that can work this way.
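To make the layout concrete: with interleaved stereo data, four consecutive floats are left, right, left, right, so one 4-channel pixel holds two stereo samples and the two volume values can be applied as a repeating pair. The following is a CPU sketch of that idea in TypeScript, not the Pixel Bender kernel from the sample. `applyChannelVolumes` is a hypothetical name, and the interleaved layout is an assumption about the buffer.

```typescript
// Hypothetical CPU emulation of a per-channel volume filter that handles
// four floats (two stereo samples) per step, as a 4-channel pixel would.
function applyChannelVolumes(
  samples: Float32Array, // assumed layout: L, R, L, R, ...
  left: number,
  right: number
): Float32Array {
  if (samples.length % 4 !== 0) {
    throw new Error("sample buffer must hold a whole number of 4-float pixels");
  }
  const out = new Float32Array(samples.length);
  // The volume pair repeated once, matching the four floats in one pixel.
  const gain = [left, right, left, right];
  for (let p = 0; p < samples.length; p += 4) {
    for (let c = 0; c < 4; c++) {
      out[p + c] = samples[p + c] * gain[c];
    }
  }
  return out;
}
```

Calling `applyChannelVolumes(new Float32Array([0.5, 0.25, -0.5, 1]), 0.5, 2)` halves the two left values and doubles the two right values, giving `[0.25, 0.5, -0.25, 2]`.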

Here are the files:

For more info, the following references might be helpful:

COMMENTS

  • By Ben - 4:10 PM on August 9, 2010  

    How come the links to the files are broken? I would like to get the source files to learn more.

    Thanks

    • By Kevin Goldsmith - 1:18 AM on August 10, 2010  

      They switched us to new blogging software, and it looks like it broke the file links. I’ll try to get them fixed. Thanks!

  • By Kevin Goldsmith - 5:40 AM on August 10, 2010  

    Links are fixed now.

  • By Nitin - 8:03 PM on August 23, 2010  

    I don’t understand why shaderBuffer.length = BUFFER_SIZE * 2 * 4.
    What is the purpose of multiplying by 2 and 4 if the sound.extract function only takes BUFFER_SIZE samples? This would seem to leave a lot of extra zeros in the shaderBuffer ByteArray, but this doesn’t seem to cause an issue.

    Also, I don’t get how a height of 512 was determined for the shader. The number of rows should be equal to BUFFER_SIZE / (BUFFER_SIZE / 1024) = 1024. I feel like I’m missing something crucial here.

    • By Kevin Goldsmith - 12:19 AM on August 24, 2010  

      This honestly messed me up for a long time, and I’m not 100% sure I have the correct answer for you. That calculation came from one of the referenced posts. Changing it results in errors when queuing the ShaderJob because the buffer size is incorrect. I don’t have a satisfying answer on how the audio buffer returned actually maps to the corresponding buffer size that I’m using in the ShaderJob. I think it has to do with the way that Flash lays out the audio data in the buffer; I don’t understand that nearly as well as I understand the imaging side. I’ll see if I can get a better answer and post a follow-up.

      • By Nitin - 3:00 PM on September 9, 2010  

        I think I may have a clue now. That 2 * 4 could be a conversion from number of samples to number of bytes.

        1 sample = 2 channels * (32-bit float per channel / 8 bits per byte)
        therefore,
        1 sample = 2 * 4 bytes

        ———————

        My second question:

        Assuming each pixel is returning 4 * 32-bit floats = 4 * 4 bytes = 16 bytes

        Our ByteArray has (BUFFER_SIZE * 8) bytes in it. That divided by 16 = (BUFFER_SIZE / 2) pixels
        The image size we give should also equal this number of pixels, I think:
        height * width = BUFFER_SIZE / 1024 * 512 = BUFFER_SIZE * (512 / 1024) = BUFFER_SIZE / 2

        That means our input bytes match our output bytes.