Dynamic sound with the Web Audio API

<audio> vs. Web Audio

Before there was the Web Audio API, HTML5 gave us the <audio> element. It might seem hard to remember now, but prior to the <audio> element, our best option for sound in a browser was a plugin! The <audio> element was, indeed, exciting, but it had a pretty singular focus. It was essentially a video player without the video: good for long audio like music or a podcast, but ill-suited to the demands of gaming. We put up with (or found workarounds for) looping issues, concurrent sound limits, glitches and a total lack of access to the sound data itself.

Fortunately, our patience has paid off. Where the <audio> element may have been lacking, the Web Audio API delivers. It gives us unprecedented control over sound, and it's perfect for everything from gaming to sophisticated sound editing. All this comes with a tidy API that's really fun to use and well supported.

Let's be a little more specific: Web Audio gives you access to the raw waveform data of a sound and lets you manipulate, analyze, distort or otherwise modify it. It is to audio what the canvas API is to pixels. You have deep and mostly unfettered access to the sound data. It's really powerful!

Flight Sounds

Even the earliest versions of Flight Simulator made efforts to recreate the feeling of flight using sound, and one of the most important sounds is the engine, whose pitch changes with the throttle. We knew that, as we reimagined the game for the web, a static engine noise would seem flat, so the dynamic pitch of the engine was an obvious candidate for Web Audio.

Try it out here:

[Interactive demo: Original Audio / Adjusted for Engine Throttle]

Less obvious (but possibly more interesting) was the voice of our flight instructor. In early iterations of Flight Arcade, we played the instructor's voice just as it had been recorded and it sounded like it was coming out of a sound booth! We noticed that we started referring to the voice as the "narrator" instead of the "instructor." Somehow that pristine sound broke the illusion of the game. It didn't seem right to have such perfect audio coming over the noisy sounds of the cockpit. So, in this case, we used Web Audio to apply some simple distortions to the voice instruction and enhance the realism of learning to fly!

There's a sample of the instructor audio at the end of the article. In the sections below, we'll give you a pretty detailed view of how we used the Web Audio API to create these sounds.

Using the API: AudioContext and Audio Sources

The first step in any Web Audio project is to create an AudioContext object. Some browsers have required this API to be prefixed (older versions of Chrome and Safari expose it as webkitAudioContext), so the code looks like this:


// Fall back to the prefixed constructor where the standard name is missing
var AudioContext = window.AudioContext || window.webkitAudioContext;
var audioContext = new AudioContext();

Then you need a sound. You can actually generate sounds from scratch with the Web Audio API, but for our purposes we wanted to load a prerecorded audio source. If you already had an HTML <audio> element, you could use that, but a lot of times you won't. After all, who needs an <audio> element if you've got Web Audio? Most commonly, you'll just download the audio directly into a buffer with an HTTP request:


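// Excerpt from a loader object's method; `url` and `index` are assumed to be
// its arguments and `loader.context` the AudioContext created earlier.
var request = new XMLHttpRequest();
request.open("GET", url, true);
request.responseType = "arraybuffer";

var loader = this;

request.onload = function() {
    // Decode the downloaded bytes asynchronously into an AudioBuffer
    loader.context.decodeAudioData(
        request.response,
        function(buffer) {
            if (!buffer) {
                console.log('error decoding file data: ' + url);
                return;
            }
            loader.bufferList[index] = buffer;
            // Notify the caller once every requested file has been decoded
            if (++loader.loadCount === loader.urlList.length){
                loader.onload(loader.bufferList);
            }
        },
        function(error) {
            console.error('decodeAudioData error', error);
        }
    );
};

request.onerror = function() {
    console.error('request failed: ' + url);
};

// Nothing is downloaded until the request is actually sent
request.send();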
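
In browsers where decodeAudioData returns a promise, the same download-and-decode step can be written more compactly with fetch. The sketch below is an alternative to the loader above, not the code Flight Arcade ships; loadAudioBuffer, its fetchImpl parameter and the file path in the usage comment are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: download a sound file and decode it into an AudioBuffer.
// `context` is an AudioContext. `fetchImpl` defaults to the global fetch and is
// only a parameter so the function can be exercised without a network.
function loadAudioBuffer(context, url, fetchImpl) {
    const doFetch = fetchImpl || fetch;
    return doFetch(url)
        .then(function (response) {
            if (!response.ok) {
                throw new Error('HTTP ' + response.status + ' loading ' + url);
            }
            return response.arrayBuffer();
        })
        .then(function (data) {
            // Assumes the promise-based form of decodeAudioData is supported
            return context.decodeAudioData(data);
        });
}

// Usage (in a browser; the path is a placeholder):
// loadAudioBuffer(audioContext, 'sounds/engine.mp3').then(function (buffer) {
//     // buffer is ready to be attached to a source node
// });
```

The promise form makes it easier to load several files at once with Promise.all, at the cost of not working in older browsers that only support the callback signature.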

Now we have an AudioContext and some audio data. The next step is to get these things working together. For that, we need...

AudioNodes

Just about everything you do with Web Audio happens via some kind of AudioNode and they come in many different flavors: some nodes are used as audio sources, some as audio outputs and some as audio processors or analyzers. You can chain them together to do interesting things.

You might think of the AudioContext as a sort of sound stage. The various instruments, amplifiers and speakers that it contains would all be different types of AudioNodes. Working with the Web Audio API is a lot like plugging all these things together (instruments into, say, effects pedals, the pedals into amplifiers and then speakers, etc.).
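
In code, that plugging together is done with each node's connect() method. As a rough sketch, a small helper can wire a list of nodes one after another; connectInSeries is a hypothetical name, not part of the Web Audio API, and the node names in the usage comment are placeholders.

```javascript
// Hypothetical helper: connect each node to the one that follows it.
// Relies only on AudioNode.connect(destination).
function connectInSeries(nodes) {
    for (let i = 0; i < nodes.length - 1; i++) {
        nodes[i].connect(nodes[i + 1]);
    }
    return nodes[nodes.length - 1]; // the last node in the chain
}

// Usage (in a browser), with a source, an effect and the speakers:
// connectInSeries([sourceNode, effectNode, audioContext.destination]);
```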

Well, in order to do anything interesting with our newly loaded audio data, we first need to encapsulate it in a source AudioNode.


var sourceNode = audioContext.createBufferSource();
sourceNode.buffer = buffer; // the AudioBuffer decoded in the previous step

Playback

That's it. We have a source. But before we can play it, we need to connect it to a destination node. For convenience, the AudioContext exposes a default destination node (usually your headphones or speakers). Once connected, it's just a matter of calling start and stop.

sourceNode.connect(audioContext.destination);
sourceNode.start(0);
sourceNode.stop();

It's worth noting that you can only call start() once on each source node. That means "pause" isn't directly supported. Once a source is stopped, it's expired. Fortunately, source nodes are inexpensive objects, designed to be created easily (the audio data itself, remember, lives in a separate buffer). So, if you want to resume a paused sound, you can simply create a new source node and call start() with an offset into the buffer as its second parameter. AudioContext has an internal clock (currentTime) that you can use to keep track of that position.
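
One way to sketch that pause-and-resume pattern is shown below. createPausablePlayer is a hypothetical wrapper written for this article, not part of the API, and it deliberately ignores looping and the case where the sound plays to its natural end.

```javascript
// Hypothetical pause/resume wrapper around single-use source nodes.
// `context` is an AudioContext and `buffer` a decoded AudioBuffer.
function createPausablePlayer(context, buffer) {
    let source = null;   // the current source node while playing
    let startedAt = 0;   // context.currentTime when playback last started
    let offset = 0;      // seconds into the buffer where playback resumes

    return {
        play: function () {
            if (source) { return; }                 // already playing
            source = context.createBufferSource();  // cheap, single-use node
            source.buffer = buffer;
            source.connect(context.destination);
            startedAt = context.currentTime;
            source.start(0, offset);                // start now, `offset` seconds in
        },
        pause: function () {
            if (!source) { return; }                // nothing is playing
            source.stop();
            offset += context.currentTime - startedAt; // remember the position
            source = null;
        },
        position: function () {
            return source ? offset + context.currentTime - startedAt : offset;
        }
    };
}

// Usage (in a browser):
// var player = createPausablePlayer(audioContext, buffer);
// player.play();  /* ...later... */  player.pause();  player.play();
```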

The Engine Sound

That's it for the basics, but everything we've done so far (simple audio playback) could have been done with the old <audio> element. For Flight Arcade, we needed to do something more dynamic. We wanted the pitch to change with the speed of the engine.

That's actually pretty simple with Web Audio (and would have been nearly impossible without it)! The source node has a playbackRate property which affects the speed of playback. To increase the pitch, we just increase the playback rate:


throttleSlider.onMove = function(val){
    // playbackRate is an AudioParam, so the number is set through .value
    sourceNode.playbackRate.value = val;
};
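
A raw slider value usually needs to be mapped onto a sensible range of playback rates first. Here is a minimal sketch, assuming the throttle is reported as a number from 0 to 1; the function name and the 0.5 to 2.0 range in the usage comment are illustrative choices, not values taken from Flight Arcade.

```javascript
// Hypothetical mapping from a throttle position (0..1) to a playback rate.
function throttleToPlaybackRate(throttle, minRate, maxRate) {
    const t = Math.min(1, Math.max(0, throttle)); // clamp to [0, 1]
    return minRate + t * (maxRate - minRate);     // linear interpolation
}

// Usage (in a browser):
// throttleSlider.onMove = function (val) {
//     sourceNode.playbackRate.value = throttleToPlaybackRate(val, 0.5, 2.0);
// };
```

Clamping keeps an out-of-range slider value from producing a zero or negative rate, which would stall or reverse the engine sound.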

The engine sound also needs to loop. That's also very easy (there's a property for it too):


sourceNode.loop = true;

But there's a catch. Many audio formats (especially compressed audio) store the audio data in fixed-size frames and, more often than not, the audio data itself won't "fill" the final frame. This can leave a tiny gap at the end of the audio file and result in clicks or glitches when those audio files get looped. Standard HTML audio elements don't offer any kind of control over this gap, and it can be a big challenge for web games that rely on looping audio.

Fortunately, gapless audio playback with the Web Audio API is really straightforward. It's just a matter of setting a timestamp for the beginning and end of the looping portion of the audio (note that these values are relative to the audio source itself and not the AudioContext clock).


// Loop points are in seconds, measured from the start of the audio buffer
sourceNode.loopStart = 0.5;
sourceNode.loopEnd = 1.5;
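
Because the loop points only take effect while loop is true and must fall inside the buffer, it can help to set all three properties in one place. The helper below is a sketch under those assumptions; setLoopRegion is a hypothetical name, and it expects the source node to already have its buffer assigned.

```javascript
// Hypothetical helper: loop the region [start, end] (in seconds) of a source.
function setLoopRegion(source, start, end) {
    const duration = source.buffer.duration; // length of the audio in seconds
    if (!(start >= 0 && start < end && end <= duration)) {
        throw new RangeError(
            'loop region must satisfy 0 <= start < end <= ' + duration
        );
    }
    source.loopStart = start;
    source.loopEnd = end;
    source.loop = true; // loopStart/loopEnd are ignored unless loop is true
    return source;
}

// Usage (in a browser):
// setLoopRegion(sourceNode, 0.5, 1.5);
```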

The Instructor's Voice

So far, everything we've done has been with a source node (our audio file) and an output node (the sound destination we set earlier, probably your speakers), but AudioNodes can be used for a lot more, including sound manipulation or analysis. In Flight Arcade, we used two node types (a ConvolverNode and a WaveShaperNode) to make the instructor's voice sound like it's coming through a speaker.

Convolution

From the W3C spec:

Convolution is a mathematical process which can be applied to an audio signal to achieve many interesting high-quality linear effects. Very often, the effect is used to simulate an acoustic space such as a concert hall, cathedral, or outdoor amphitheater. It can also be used for complex filter effects, like a muffled sound coming from inside a closet, sound underwater, sound coming through a telephone, or playing through a vintage speaker cabinet. This technique is very commonly used in major motion picture and music production and is considered to be extremely versatile and of high quality.

Convolution basically combines two sounds: a sound to be processed (the instructor's voice) and a sound called an impulse response. The impulse response is, indeed, a sound file, but it's really only useful for this kind of convolution process. You can think of it as an audio filter of sorts, designed to produce a specific effect when convolved with another sound. The result is typically far more realistic than simple mathematical manipulation of the audio.

To use it, we create a convolver node, load the audio containing the impulse response and then connect the nodes.


// create the convolver
var convolverNode = audioContext.createConvolver();

// assume we've already downloaded the telephone sound into a buffer
convolverNode.buffer = telephoneBuffer;

// connect the nodes
sourceNode.connect(convolverNode);
convolverNode.connect(audioContext.destination);
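
To make the idea of convolution a little more concrete, here is the underlying arithmetic on plain arrays of samples. This is an illustration only: the function is written for this article, it assumes both arrays are non-empty, and a ConvolverNode does conceptually similar work far more efficiently (and, by default, with its own level normalization).

```javascript
// Direct-form convolution: every input sample triggers a scaled copy of the
// impulse response, and the overlapping copies are summed.
function convolve(signal, impulse) {
    const out = new Float32Array(signal.length + impulse.length - 1);
    for (let i = 0; i < signal.length; i++) {
        for (let j = 0; j < impulse.length; j++) {
            out[i + j] += signal[i] * impulse[j];
        }
    }
    return out;
}

// convolve([1, 2, 3], [1, 1]) yields the samples 1, 3, 5, 3
// convolve([1, 2, 3], [0, 1]) yields 0, 1, 2, 3 (the signal delayed by one sample)
```

An impulse response recorded through a telephone or a small speaker works the same way: each sample of the voice is smeared by the recorded response of that device.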

Wave Shaping

To increase the distortion, we also used a WaveShaper node. This type of node lets you apply mathematical distortion to the audio signal to achieve some really dramatic effects. The distortion is defined as a curve function. Those functions can require some complex math. For the example below, we borrowed a good one from our friends at MDN.


// create the waveshaper
var waveShaper = audioContext.createWaveShaper();

// our distortion curve function
function makeDistortionCurve(amount) {
    var k = typeof amount === 'number' ? amount : 50,
        n_samples = 44100,
        curve = new Float32Array(n_samples),
        deg = Math.PI / 180,
        i = 0,
        x;
    for ( ; i < n_samples; ++i ) {
        x = i * 2 / n_samples - 1;
        curve[i] = ( 3 + k ) * x * 20 * deg / 
            (Math.PI + k * Math.abs(x));
    }
    return curve;
}

// connect the nodes: source -> convolver -> waveshaper -> speakers
sourceNode.connect(convolverNode);
convolverNode.connect(waveShaper);
waveShaper.connect(audioContext.destination);

// vary the amount of distortion with the slider
distortionSlider.onMove = function(val){
    waveShaper.curve = makeDistortionCurve(val);
};
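
It may help to see roughly how a curve like this gets used. As described in the Web Audio specification, each input sample between -1 and 1 is mapped to a position along the curve array and the output is linearly interpolated between the two nearest entries. The function below is a simplified sketch of that lookup written for this article (applyCurve is a hypothetical name, and oversampling is ignored); the browser does this internally for every sample.

```javascript
// Simplified sketch of the WaveShaper lookup for a single sample `x`.
function applyCurve(curve, x) {
    const n = curve.length;
    const v = ((n - 1) / 2) * (x + 1);   // map [-1, 1] onto [0, n - 1]
    if (v <= 0) { return curve[0]; }          // at or below the first entry
    if (v >= n - 1) { return curve[n - 1]; }  // at or above the last entry
    const k = Math.floor(v);              // entry just below the position
    const f = v - k;                      // fractional distance to the next entry
    return (1 - f) * curve[k] + f * curve[k + 1];
}

// With the identity curve [-1, 0, 1], samples pass through unchanged:
// applyCurve([-1, 0, 1], 0.5) returns 0.5
```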

Notice the stark difference between the original waveform and the waveform with the WaveShaper applied to it.

[Interactive demo: Original Audio / Audio with Radio Distortion]

The example above shows just how much you can do with the Web Audio API. Not only are we making some pretty dramatic changes to the sound right from the browser, but we're also analyzing the waveform and rendering it into a canvas element! The Web Audio API is incredibly powerful and versatile and, frankly, a lot of fun!
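
For the curious, waveform rendering of this kind is typically done with an AnalyserNode, which can hand back the current time-domain samples as bytes. The sketch below is not the demo's actual code: waveformToPoints is a hypothetical helper, and analyser, canvas and ctx2d in the usage comment are assumed to exist already.

```javascript
// Convert byte samples (0..255, with 128 meaning silence) into canvas points.
function waveformToPoints(samples, width, height) {
    const points = [];
    const step = samples.length > 1 ? width / (samples.length - 1) : 0;
    for (let i = 0; i < samples.length; i++) {
        points.push({ x: i * step, y: (samples[i] / 255) * height });
    }
    return points;
}

// Usage (in a browser), assuming `analyser` came from
// audioContext.createAnalyser() and `ctx2d` is a 2D canvas context:
// var data = new Uint8Array(analyser.fftSize);
// analyser.getByteTimeDomainData(data);
// ctx2d.beginPath();
// waveformToPoints(data, canvas.width, canvas.height).forEach(function (p, i) {
//     if (i === 0) { ctx2d.moveTo(p.x, p.y); } else { ctx2d.lineTo(p.x, p.y); }
// });
// ctx2d.stroke();
```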
