Optimised Stereo II – Near Field Compensation

How far away from you are your loudspeakers – are they within a couple of metres of where you’re sitting? If you’re in a home/project studio then I’m going to assume that they are. They might even be as close as a metre from your listening position!

So what? Why does it matter how close they are? Well, you’ve heard of the proximity effect when you record with a microphone, yes? It’s the bass boost that you get as you move nearer and nearer to the microphone. Well, if you’re sitting near your loudspeakers then this can happen as well. And it’s not just the bass boost, because there’s an associated phase difference that goes with it. This can cause some issues with where we localise the sound at low frequencies.

It was suggested by Michael Gerzon in his 1992 “Metatheory of Auditory Localisation” that this should be corrected. He states:

“It might be argued that the effect on localization at such low frequencies is “unimportant”, but we believe this not to be so insofar as the more things that are made correct the better.”

With that in mind, and in the spirit of experimentation, I’ve made a very simple VST plugin that does this. I haven’t had a chance to listen to it in a decent room (mine is currently untreated, but not for long) so I don’t know how well it works. It’s definitely a subtle effect, going by my preliminary listens.

Basically it’s a case of taking your stereo signal right before the output, splitting it into Sum (L+R) and Difference (L-R) channels and high-pass filtering the difference channel. These are then recombined back into L and R before being output via your loudspeakers.
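To make that signal flow a bit more concrete, here is a minimal offline sketch in Python. It is not the plugin's code. The function names (`high_pass`, `near_field_compensate`), the one-pole filter, the 0.5 scaling on the sum/difference split and the choice of corner frequency (speed of sound divided by 2π times the loudspeaker distance, the usual first-order NFC corner) are all my assumptions for illustration.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate):
    """One-pole (6 dB/octave) high-pass filter, the discrete RC form."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def near_field_compensate(left, right, distance_m=1.0,
                          sample_rate=44100, speed_of_sound=343.0):
    """Split L/R into sum and difference, high-pass the difference, recombine.

    Assumption: the corner frequency is c / (2 * pi * d), which is about
    55 Hz for loudspeakers 1 m away. The plugin may use something else.
    """
    cutoff_hz = speed_of_sound / (2.0 * math.pi * distance_m)

    # Sum and difference channels, scaled by 0.5 so that recombining
    # an unfiltered pair gives back the original L and R.
    total = [0.5 * (l + r) for l, r in zip(left, right)]
    diff = [0.5 * (l - r) for l, r in zip(left, right)]

    # Only the difference channel is filtered.
    diff = high_pass(diff, cutoff_hz, sample_rate)

    out_left = [s + d for s, d in zip(total, diff)]
    out_right = [s - d for s, d in zip(total, diff)]
    return out_left, out_right
```

Because only the difference channel is filtered, anything panned dead centre (identical in L and R) passes through untouched, and the sum of the two outputs always matches the sum of the two inputs. Only the very low frequency part of the left/right difference is reduced.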

I don’t know if Gerzon was right but this sort of near field compensation (NFC) is very common with Ambisonics so I thought it’d be worth giving it a go for stereo. I’ll be very interested to hear what other people think: important or pointless?

I’ll try to have the VST up tomorrow. I just need to write up some brief documentation for it this evening.

Optimised Stereo

Our ears use different methods of hearing at different frequencies to allow us to localise sound. We use phase differences at low frequencies and level/intensity differences at high frequencies (a slight simplification but it’ll do). Traditional amplitude panning actually recreates these low frequency cues if you’re sitting in the sweet spot, i.e. equidistant from both loudspeakers.

But there’s a problem: using the same amplitude across the whole frequency spectrum means that the low frequency cues tell your brain that the image is in one direction and the high frequency cues tell it that it’s in another direction.

Perceived localised image position from Gerzon’s Velocity and Energy vectors for standard amplitude panning

A very smart man (which is underselling him) called Michael Gerzon developed two vectors, known as the Velocity and Energy vectors, that can give you an idea of where a sound will be perceived if you know the loudspeaker gains. The Velocity vector is an indication for low frequencies, below approximately 700 Hz, and the Energy vector for high frequencies. If we consider the prediction for a standard stereo set-up (2 loudspeakers and a listener forming an equilateral triangle), shown to the right, then there is indeed a discrepancy between the directions of the two Gerzon vectors!
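If you want to see the discrepancy for yourself, the directions of the two vectors are easy to compute. The sketch below is only an illustration: the function name `gerzon_angles` and the angle convention (0 degrees straight ahead, positive angles to the left, loudspeakers at +30 and -30 degrees) are my assumptions. The Velocity vector is the sum of the loudspeaker direction vectors weighted by their gains, and the Energy vector is the same sum weighted by the squared gains. Both are normally divided by the total gain (or total energy), but that only changes the length, not the direction, so it is left out here.

```python
import math


def gerzon_angles(gains, speaker_angles_deg):
    """Return (velocity_angle, energy_angle) in degrees.

    gains: one linear gain per loudspeaker.
    speaker_angles_deg: loudspeaker azimuths, 0 = straight ahead,
    positive = to the left (an assumed convention).
    """
    vx = vy = ex = ey = 0.0
    for g, angle_deg in zip(gains, speaker_angles_deg):
        a = math.radians(angle_deg)
        # Velocity vector: direction vectors weighted by gain.
        vx += g * math.cos(a)
        vy += g * math.sin(a)
        # Energy vector: direction vectors weighted by gain squared.
        ex += g * g * math.cos(a)
        ey += g * g * math.sin(a)
    velocity_angle = math.degrees(math.atan2(vy, vx))
    energy_angle = math.degrees(math.atan2(ey, ex))
    return velocity_angle, energy_angle


if __name__ == "__main__":
    speakers = [30.0, -30.0]  # left, right
    # A few left-leaning pan positions (right gain as a fraction of left).
    for right_gain in (1.0, 0.75, 0.5, 0.25, 0.0):
        v, e = gerzon_angles([1.0, right_gain], speakers)
        print(f"gR/gL = {right_gain:4.2f}  velocity {v:6.2f}  energy {e:6.2f}")
```

With equal gains both vectors point straight ahead, and with the sound hard-panned both point at the loudspeaker. Everywhere in between, the Energy vector points further out towards the louder loudspeaker than the Velocity vector does, which is the mismatch between the low and high frequency cues described above.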

How’d you get all the way up there?! (Illusion of Elevation)

“But the car door slammed halfway up the wall!”

This was the reaction one person had after listening to a scene from a radio drama by one of my colleagues from the MA I did a few years ago (listen to it here), which was played using 3rd order Ambisonics in the Sonic Lab in SARC. The thing is, this elevation of some sounds is present even for a 2D representation.

Music via Ambisonics

Late last year I was playing around with some songs I’ve recorded and doing some quick 3rd order Ambisonic mixes. It got me thinking about what I wanted to use Ambisonics for and how best to present my songs using it.

[Just as a side note, I’m talking here about “pop” music mixes, not electro-acoustic music where use of spatial audio is much more widespread.]

For example, do I want to use the full 360 degrees (or the full sphere if we’re doing 3D) for the sounds, or is it better to stick with a frontal sound stage and just use surround for ambience, which is common in 5.1 music mixes?

