Optimised Stereo

Our ears use different methods of hearing at different frequencies to allow us to localise sound. We use phase differences at low frequencies and level/intensity differences at high frequencies (a slight simplification, but it’ll do). Traditional amplitude panning actually recreates these low-frequency cues if you’re sitting in the sweet spot, i.e. equidistant from both loudspeakers.

But there’s a problem: using the same gains across the whole frequency spectrum means that the low-frequency cues tell your brain that the image is in one direction while the high-frequency cues tell it that it’s in another.

Perceived localised image position from Gerzon’s Velocity and Energy vectors for standard amplitude panning

A very smart man (which is underselling him) called Michael Gerzon developed two vectors, known as the Velocity and Energy vectors, that give you an idea of where a sound will be perceived if you know the loudspeaker gains. The Velocity vector predicts localisation at low frequencies (below approximately 700 Hz) and the Energy vector does the same at high frequencies. If we look at the predictions for a standard stereo set-up (two loudspeakers and a listener forming an equilateral triangle), shown in the figure, there is indeed a discrepancy between the directions of the two Gerzon vectors!

The only place where they match is at the loudspeakers (where only one of them is playing) and at the centre (where they are both of equal gain). This creates a problem because we’ve got conflicting cues at different frequency ranges. Ideally, we should have these two lines perfectly matching as the source moves from one loudspeaker to the other.
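To make the two vectors a little more concrete, here is a rough Python sketch of how the predicted angles could be computed for two loudspeakers. It assumes loudspeakers at ±30° (the equilateral triangle), in-phase (non-negative) gains, and a constant-power sine/cosine pan law for the example values. The function names are made up for illustration, and the pan law is my assumption rather than necessarily the one used for the plot.

```python
import math

HALF_ANGLE = math.radians(30.0)  # assumption: loudspeakers at +/-30 degrees


def vector_angle(w_left, w_right, half_angle=HALF_ANGLE):
    """Direction (degrees, positive = towards the left loudspeaker) of the
    weighted sum of the two loudspeaker unit vectors.

    Weights must be non-negative and not both zero. Dividing by the sum of
    the weights would give the vector length, but it does not change the
    direction, so it is left out here.
    """
    x = (w_left + w_right) * math.cos(half_angle)  # front/back component
    y = (w_left - w_right) * math.sin(half_angle)  # left/right component
    return math.degrees(math.atan2(y, x))


def velocity_angle(g_left, g_right):
    # Velocity vector: loudspeaker directions weighted by the gains.
    return vector_angle(g_left, g_right)


def energy_angle(g_left, g_right):
    # Energy vector: loudspeaker directions weighted by the squared gains.
    return vector_angle(g_left ** 2, g_right ** 2)


def constant_power_pan(p):
    """Hypothetical pan law: p = 0 is hard left, 0.5 centre, 1 hard right."""
    theta = p * math.pi / 2.0
    return math.cos(theta), math.sin(theta)


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    g_l, g_r = constant_power_pan(p)
    print(f"p={p:.2f}  velocity={velocity_angle(g_l, g_r):7.2f} deg"
          f"  energy={energy_angle(g_l, g_r):7.2f} deg")
```

Under these assumptions the two angles agree only at hard left, centre and hard right. Everywhere in between, the Energy vector points further towards the louder loudspeaker than the Velocity vector does.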

But what can we do? Well, we’ve panned this using Amplitude panning so we can try using Intensity panning since it’s intensity that the ears respond to at high frequencies. But switching from Amplitude to Intensity panning isn’t enough. The same problem will still exist. What we need to do is change the panning type based on the frequency.

So at low frequencies we’ll use your standard, everyday Amplitude panning and at high frequencies we’ll switch to Intensity panning. Now let’s check what this dual-band approach does to the predicted localisation positions:

Perceived localised image position from Gerzon’s Velocity and Energy vectors for dual-band stereo

They match! So now, theoretically, we should have an image that is less blurred.
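One way to see why the curves line up, as a sketch under an assumption: take Intensity panning to mean that the pan law is applied to the squared gains, so the high-frequency gains are the square roots of the low-frequency ones. That reading is my assumption, as are the function names, the ±30° loudspeaker angle and the sine/cosine pan law. Since the Energy vector weights each loudspeaker by its squared gain, squaring the high-frequency gains gives back the low-frequency gains, and the two predicted directions coincide.

```python
import math

HALF_ANGLE = math.radians(30.0)  # assumption: loudspeakers at +/-30 degrees


def vector_angle(w_left, w_right, half_angle=HALF_ANGLE):
    """Direction in degrees (positive = left) for non-negative weights."""
    x = (w_left + w_right) * math.cos(half_angle)
    y = (w_left - w_right) * math.sin(half_angle)
    return math.degrees(math.atan2(y, x))


def dual_band_gains(g_left, g_right):
    """Return ((lf_left, lf_right), (hf_left, hf_right)).

    Low band: the amplitude-panning gains, unchanged.
    High band: their square roots (the assumed 'Intensity panning').
    Gains must be non-negative.
    """
    low = (g_left, g_right)
    high = (math.sqrt(g_left), math.sqrt(g_right))
    return low, high


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    theta = p * math.pi / 2.0
    g_l, g_r = math.cos(theta), math.sin(theta)  # assumed pan law
    (lf_l, lf_r), (hf_l, hf_r) = dual_band_gains(g_l, g_r)
    low_band = vector_angle(lf_l, lf_r)             # Velocity vector, low band
    high_band = vector_angle(hf_l ** 2, hf_r ** 2)  # Energy vector, high band
    print(f"p={p:.2f}  low band={low_band:7.2f} deg"
          f"  high band={high_band:7.2f} deg")
```

This only shows that the predictions match on paper. It says nothing about the crossover filters or about how the two bands should be levelled against each other.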

But does it work?

I don’t know. I’ve not actually tried it! Sorry. But I’m interested in hearing from people who have tried it and whether it’s worth the trouble. I’m also wondering how much it would change things for headphone listening, since that’s probably the main method of music consumption at present. I do intend to code it up soon and give it a try.

There are a few practical considerations that must be overcome if it’s to be implemented. For example, the filters used for each of the loudspeakers should be phase matched to avoid creating more problems than we’re solving and to maintain mono compatibility. There’s also the issue that, without some careful consideration, an image panned to the centre would have more high-frequency content than one hard panned left or right.
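The level issue can be put into rough numbers with a short Python sketch. It assumes the high-frequency gains are simply the square roots of the low-frequency gains (my reading of Intensity panning, not something established above) and that the low band uses a constant-power pan law. The `normalise` option is one hypothetical fix: scale the high band so its total power equals that of the low band.

```python
import math


def hf_gains(g_left, g_right, normalise=True):
    """High-band gains for the given (non-negative) low-band gains.

    Assumption: high-band gains are the square roots of the low-band gains.
    With normalise=True they are rescaled so the total power in the high
    band equals the total power in the low band. At least one gain must be
    non-zero.
    """
    h_left, h_right = math.sqrt(g_left), math.sqrt(g_right)
    if normalise:
        lf_power = g_left ** 2 + g_right ** 2
        hf_power = h_left ** 2 + h_right ** 2  # equals g_left + g_right
        scale = math.sqrt(lf_power / hf_power)
        h_left *= scale
        h_right *= scale
    return h_left, h_right


def power_db(left, right):
    """Total power of a gain pair in dB relative to unity."""
    return 10.0 * math.log10(left ** 2 + right ** 2)


centre = math.cos(math.pi / 4.0)  # constant-power gains for a centred image
for label, gains in (("hard left", (1.0, 0.0)), ("centre", (centre, centre))):
    raw = power_db(*hf_gains(*gains, normalise=False))
    fixed = power_db(*hf_gains(*gains, normalise=True))
    print(f"{label:9s}  raw high band: {raw:5.2f} dB  normalised: {fixed:5.2f} dB")
```

Under these assumptions the un-normalised high band comes out about 1.5 dB hotter at the centre than when hard panned, and the rescaling removes that difference without changing the ratio between left and right, so the predicted direction stays the same. Whether equal power is the right target perceptually is a separate question.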

I’ve added an optimised stereo panpot to my list of VST plugins that I want to make. I’m not sure when I’ll be able to do it, but I’m quite interested to hear how it sounds and whether it actually helps or not.

