Chris’s dynamic compressor

Update: version 1.2 released. See the bottom of this post.

I’ve written a plugin, designed to be used with the free audio editor Audacity, that makes it easier to listen to classical music, or other music that has a wide range of volumes, at low volumes or in high noise conditions (such as in your car) so that you can still hear the soft parts. (I might start work on a standalone version, maybe.) For details on how to use it, see the bottom of this post.

Lots of interesting (really!) exposition

For a long time now, I’ve listened to classical music less than I might like to. A lot of that is because listening to classical music is hard. You have to turn the volume up to hear the soft parts, and then either turn it down when it gets loud or have it be loud enough to be distracting (or painful). For instance, in the car, there’s a lot of noise from the road, so even at normal volumes it can be difficult to hear some of the soft parts. And some albums and songs are so quiet that many players can’t be turned up loud enough to hear the music, even in quiet conditions!

I’ve sort of put up with this problem for a long time, but it really hit home how inconvenient it is the other day when I was driving with some friends. I changed my mp3 player to some Debussy piano solos (from his Children’s Corner suite, really nice stuff) and tried to play some of it for them. Well, it failed pretty miserably. It was so quiet, even with the volume all the way up, that you couldn’t even hear anything below a mezzo-piano, and the tape adapter was already so noisy (not to mention the harmonic distortion) that enjoying the music was basically impossible. And I realized: if only this music were as loud as normal music, they probably would have really liked the tracks. So the low volume of those songs actually turned two people off of classical music. What a shame!

So, what to do? Well, there’s no simple solution, but there is a pretty good one. Using an audio editing program, it’s possible to rip the music off of a CD and change the sound to be louder. Then you can make an MP3 of it or burn it back to a CD. This solution is kind of labor intensive, but after the first CD it can be largely automated (depending on how much you want to tweak it for each song, and that does make a difference), and anyway it’s the sort of thing many people enjoy doing with their music collections. Well, me, at least.

So one day I loaded up a song into my editor and tried applying a dynamic compression effect to it. Dynamic compression, often simply called “compression”, is a technical term in audio editing that refers to a process that raises the volume of the quiet parts of an audio input up to near the volume of the loud parts. This has the effect of compressing, or reducing, the dynamic range—the difference between the soft and the loud—of the signal. The way it does this is usually to detect how loud the input is at a given point and amplify or reduce the output until it’s near the target volume. Now, it doesn’t adjust its amplification instantly. If the signal drops suddenly (as it might in a recording of speaking, after the end of a sentence), the compressor gradually increases the volume over a period of anywhere from .25 seconds to 5 seconds.

However, most dynamic compressors decrease the volume instantly when the signal rises. Well, worse than “instantly”: they have a lag, decreasing the volume only after the input has started to rise. This is because most compressors are designed to be used on vocals and solo instruments, in realtime. Even in audio editing programs, where things aren’t realtime, the compression effect is usually designed the same way. A realtime compressor doesn’t have the option of decreasing amplification before the signal starts to rise, so some of the signal, usually some tenths of a second, is output at a very high volume while the compressor is detecting the loudness and reducing the amplification.
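To make the lag concrete, here is a rough Python sketch of a realtime-style compressor working on a list of per-block loudness values in dB. It is only an illustration, not code from any real compressor: the function name, the parameters, and the block-by-block model are all assumptions made for the example.

```python
import math

def realtime_compress(levels_db, target_db, release_step_db, attack_step_db=math.inf):
    """Return output levels (dB) for a causal compressor working block by block.

    Hypothetical model: the gain applied to a block was decided from
    earlier blocks only, because a realtime effect cannot see the future.
    """
    gain_db = 0.0
    output_db = []
    for level in levels_db:
        # The gain used here was chosen before this block was heard.
        output_db.append(level + gain_db)
        wanted = target_db - level
        if wanted > gain_db:
            # Input is quiet: raise the gain gradually (slow release).
            gain_db = min(wanted, gain_db + release_step_db)
        else:
            # Input is loud: lower the gain, by default all at once.
            gain_db = max(wanted, gain_db - attack_step_db)
    return output_db

quiet_then_loud = [-30.0, -30.0, -30.0, -6.0, -6.0, -6.0]
print(realtime_compress(quiet_then_loud, target_db=-6.0, release_step_db=6.0))
# prints [-30.0, -24.0, -18.0, 12.0, -6.0, -6.0]
```

The fourth block comes out at +12 dB, 18 dB over the target, because the gain built up during the quiet blocks is still in effect when the loud part arrives. That one block is the loud blast described above.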

As you might imagine, this doesn’t really sound that great. When I applied that effect to my music, it didn’t turn out well at all. It sounded much too harsh, and sudden loudness usually led to clipping. (Clipping is like when someone speaks loudly into a microphone very close to their mouth, and you can’t understand what they’re saying because of the distortion.) The feel of the music was completely lost.

So I let it be for a few years. The other day, I was thinking some more on the problem, and came up with a design for a compressor that would do the kind of job I really wanted it to. The main idea is to anticipate the loud parts in advance, and start reducing the volume smoothly ahead of time so that it doesn’t get loud too quickly. (Another idea is to support extremely long rise and fall times so that the effect is never too intrusive.) I did some of the math (fifth grade math, admittedly, involving finding the intersection of lines; I tried a version with parabolas that didn’t work out) and read up on the programming interface to Nyquist, and within two days I had a nice, working compressor. It does wonders to all sorts of piano solo and orchestral music. It’s possible to listen to things at very low volumes without missing any of the music. It even makes you hear things that you couldn’t otherwise hear even at full volume, because they’re just so quiet. (Things like the sound of the pedals being released on the piano at the end of a soft piece, or the very beginning of Ravel’s Daphnis et Chloé.)
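The look-ahead idea can be sketched in a few lines of Python. This is not the plugin's Nyquist code, just one simple way to get the same kind of behavior under stated assumptions: the input is a list of per-block peak levels in dB, and the envelope is the lowest curve that stays at or above every peak while changing by no more than a fixed number of dB per block. All names here are made up for the illustration.

```python
def lookahead_envelope(peaks_db, pre_peak_step_db, post_peak_step_db):
    """Lowest envelope (dB) above all peaks with limited slope per block."""
    env = list(peaks_db)
    # Forward pass: after a loud block the envelope may only fall gradually,
    # so the gain comes back up slowly.
    for i in range(1, len(env)):
        env[i] = max(env[i], env[i - 1] - post_peak_step_db)
    # Backward pass: before a loud block the envelope starts rising early,
    # so the gain is already coming down when the loud part arrives.
    for i in range(len(env) - 2, -1, -1):
        env[i] = max(env[i], env[i + 1] - pre_peak_step_db)
    return env

def gains_db(peaks_db, target_db, pre_peak_step_db, post_peak_step_db):
    """Gain per block that brings the envelope to the target level."""
    env = lookahead_envelope(peaks_db, pre_peak_step_db, post_peak_step_db)
    return [target_db - e for e in env]

peaks = [-40.0, -40.0, -40.0, -10.0, -40.0, -40.0]
print(lookahead_envelope(peaks, 10.0, 10.0))
# prints [-40.0, -30.0, -20.0, -10.0, -20.0, -30.0]
```

Because the envelope never dips below a peak, applying these gains cannot push any block above the target, and the gain starts dropping two blocks before the loud one instead of after it.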

End of exposition

For you to hear it in action, I’ve provided three 30-second clips of the fourth piece from Debussy’s Children’s Corner suite, The Snow Is Dancing. They use version 1.1 of the compressor, so results will differ slightly, and the settings are named differently. The first has no adjustment, the second has compression applied with both rise and fall speed at the default 1.5 dB/s, and the third has compression applied with fall and rise speed set to a more aggressive 5 dB/s.

It was a lot of fun to program. I hope a lot of people get some good use out of it. I hope it allows me, and possibly you, to enjoy our classical music more often, and to more easily share it with our friends.

My program currently comes in two versions. One is a standalone program, with both a GUI and a console interface. Download version 1.2 here. It requires the .NET framework version 3.5 (or Mono 1.9). If you’re using the GUI version, check out the commandline version for explanations of the parameters.

The other is an Audacity plugin, currently at version 1.2.1. To use it, download and install Audacity. Download the plugin source, and put it in your Audacity plugins directory (which should be under the main install directory on Windows, and under /usr/[local/]share/audacity/plugins on Unix). If your music is on CD, use a CD ripping program to make a .wav file of each track. You can encode the wav file (to mp3, ogg, etc.) after you’ve applied the dynamic compression. Use Audacity to open the sound file. Select the whole thing, then go to the Effects menu, to the bottom at Plugins, and select “Dynamic compressor…”. Adjust the settings to your liking, and click OK. To save your changes, you need to go to File \ Export wav or Export mp3. File \ Save will not do what you want it to.

Changes in 1.2

A completely different compression algorithm is used, based on parabolas instead of lines. It strikes a better balance between fast and slow changes, by varying its rate of change so that the envelope doesn’t leave so much empty space (i.e. low-volume sections) without losing too much transparency. You get a couple more parameters to play with. You can emulate the behavior of the old version by setting the exponent parameters to 1. The Audacity version has much better memory usage (and thus, speed) on larger inputs than the previous version.

Changes in 1.1

  • Bugfix: some areas of a file were amplified way too much, especially when the rise and fall speeds were too high.
  • Renamed existing parameters to more conventional things for compressors.
  • Added a noise floor parameter, which lets you leave alone (apply a constant gain) parts of the audio below a threshold.
  • Added a compression ratio parameter, which lets you not bring everything all the way up to the same level, and also do weird things with ratios outside the normal range.

21 Responses to “Chris’s dynamic compressor”

Michael L Perry says:

Thanks for this fantastic compressor. I’m using it to level the speech in the latest episode of my podcast. The one built into Audacity must have a non-zero attack time, which results in every new segment of speech beginning with a loud blast. I was actually looking for Nyquist samples so that I could write my own leveler. You’ve not only saved me hours of fine-tuning per episode, but days or weeks of research and development.

I record each person on a separate track, so there are long periods of silence on each track while that person is listening. I’m experiencing a strange artifact during those portions where the volume would turn up and I would hear that person breathing. But I think that adjusting the floor will take care of that problem.

Thanks for all your hard work. It’s a real time saver.

David Gessel says:

The program is excellent - but it crashes when I try to enhance long (90 minute) files. The files are 24 bit or 32 bit floating, and I’ve tried 1.3.2 and 1.2.6. It takes a long time for the crash to happen, but it does, perhaps a memory leak or something. Enhancing sections of the same files (1-5 minutes or so) has worked perfectly. I haven’t found the threshold yet…

pdf23ds says:

It’s probably not a leak, just using too much memory. Try changing this line:

(defun get-my-sound (sound)
(snd-avg (if (= use-percep 1) (get-percep-adjusted-sound sound) sound) 2000 1000 op-peak))

Change the numbers to 10000 5000, and if that doesn’t work, to 20000 and 10000.

That makes it so you get a less fine-grained view of the source sound, so if you’re using relatively fast attacks or releases, you might notice that the volume changes slightly too soon or late the larger those numbers get. You can lower the larger number to closer to the smaller, and that will increase the chance of clipping but make the volume changes slightly better timed.
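For anyone wondering what those two numbers do: in Nyquist, snd-avg takes a block size and a step size, and with op-peak it produces one peak value per step. Here is a rough Python equivalent, just to show the trade-off. It is an assumption-laden sketch (in particular, how Nyquist treats the leftover samples at the end of the sound is not modeled), and block_peaks is a made-up name.

```python
def block_peaks(samples, block_size, step_size):
    """One peak (largest absolute sample) per step, over overlapping blocks."""
    peaks = []
    for start in range(0, len(samples) - block_size + 1, step_size):
        block = samples[start:start + block_size]
        peaks.append(max(abs(s) for s in block))
    return peaks

samples = [0, 1, -3, 2, 0, 5, -1, 0]
print(block_peaks(samples, 4, 2))  # prints [3, 5, 5]
print(block_peaks(samples, 4, 4))  # prints [3, 5]
```

Doubling the step size halves the number of values the rest of the plugin has to hold, which is why raising the numbers helps with memory. Keeping the block larger than the step makes neighboring blocks overlap, so a peak near a block edge is seen a little early, which is the safety margin against clipping.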

You might be able to apply it in sequence without introducing clicks by making the edges of your selections be high peaks in the file.

Mike says:

Wow! This is exceedingly convenient. I, for one, appreciate this very very much.

David Gessel says:

Will try it - thanks! Also, is it possible to set the attack and release to faster than 20 dB/sec? I have gun shots in a live sound track over voices, and want to avoid hard clipping.

-David

pdf23ds says:

For those, I would definitely try to manually correct for the individual gunshots. (The Nutcracker?) Audacity has a volume envelope tool that you can use to precisely deal with that sort of thing. If you still want to use the plugin to do that, at the very top of the .ny file there are the definitions for the parameters, which you can edit to change the allowed range.

Dan Jolt says:

Wow, your compressor works really well. I agree with Michael, it’s good enough to use for podcast and (homebrew) radio production.

Hope you don’t mind a feature request:
Is it possible to add a flag, so that the compression would only be performed on a single channel (left/right) based on either the volume of this channel or the combined volume of both?

This would allow mono “talk-over” effects if you record your spoken track on one channel and have music on the other. First you could compress the voice channel and then the music, in such a way that whenever something is spoken on the other channel, the music stays in the background and returns immediately afterwards.

pdf23ds says:

Dan, I think that the option you’re looking for is really more of a separate plugin. The basic idea would be to take a volume envelope of the compressed voice track, invert it, and scale the background track to follow the inverted envelope. That sounds like a five line plugin to me, but I’m not up to writing it anytime soon. If I had to guess, it’d look something like

(mult background-snd (invert (snd-avg voice-snd 2000 1000 op-peak)))

where “invert” isn’t actually a Nyquist function, and you extract the “background-snd” and “voice-snd” sounds from the appropriate places.

You could try putting a request to the audacity-nyquist list, though.
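In case it helps anyone who wants to try, here is the same idea as a small Python sketch rather than Nyquist. Everything in it is hypothetical: the voice envelope is assumed to be a list of block peaks scaled to the range 0 to 1, “invert” is taken to mean 1 minus a scaled envelope, and the depth parameter is invented for the example.

```python
def duck_gains(voice_env, depth=0.8):
    """Turn a voice envelope (0..1 per block) into background gains.

    Loud voice -> low background gain; silence -> full background volume.
    """
    return [1.0 - depth * min(1.0, max(0.0, v)) for v in voice_env]

def apply_block_gains(samples, gains, step_size):
    """Scale each sample by the gain of the block it falls in."""
    out = []
    for i, s in enumerate(samples):
        g = gains[min(i // step_size, len(gains) - 1)]
        out.append(s * g)
    return out

background = [1.0, 1.0, 1.0, 1.0]
print(apply_block_gains(background, [1.0, 0.5], step_size=2))
# prints [1.0, 1.0, 0.5, 0.5]
```

A real version would also want to smooth the gains so the music doesn’t jump between blocks.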

Henry says:

Can someone help out a newbie with some suggested settings for the parameters? I am using this on a conference call recording that has all sorts of people talking with lots of different amplitudes.

pdf23ds says:

You might want to increase the rise and fall speeds to 10 dB/sec, and raise the noise floor to .3 or higher.

Erik says:

Hello, is it really Chopin’s Ballade no. 1 in G minor? The one I have at home does not have that part in it at least. Just curious :-)

Take care.

pdf23ds says:

Oh my. I had Chopin’s ballade in the sound clips at one point, but then I replaced them with Debussy’s The Snow Is Dancing, from the Children’s Corner suite. I guess I forgot to change the text.

Allen McBride says:

Great plug-in! It’s weird no one has posted since last October. Is there a way to make the settings more precise, so I can adjust the floor in increments of .001? Thanks! —Allen

pdf23ds says:

Heh. I think I had some overzealous spam protection there for a while.

I don’t directly control how precisely you can set that parameter. If you just type in the values into the text box I bet it will let you be as precise as you wish, up to the limits of double-precision floating point.

Incidentally, I’m about to release a new version with a new, better algorithm. I have initially written it in C# but could make it work (somewhat more slowly) in Audacity. If you have a standalone tool, would you still want the audacity plug-in? (If I don’t write a new audacity plug-in I’ll leave the old plug-in.)

Allen McBride says:

Thanks! I had played around with some very small values like .002 and convinced myself that they were getting rounded to zero, partly because I couldn’t hear a difference and partly because when I re-opened the effect dialogue after, that field said zero again even though usually the last values used are remembered. But neither of those is conclusive. (And as for values being remembered, there seems to be a difference in other Audacity effects as well between typing them in vs. using the sliders… sometimes it seems like only using the sliders works, but I haven’t tested carefully yet.)

I’ll look forward to your new version. From my perspective, I think an Audacity plug-in would be useful. To use a standalone tool I assume I’d need to export the audacity project to a single file and then re-import, right? Plus I use a Mac… is my sense correct that it’s hard to make C# programs run on Mac?

Thanks again,
Allen

pdf23ds says:

Huh. Well, if you’re seeing odd behavior then you could always change the plugin source so that the range of the noise floor is smaller. That way the slider will be more detailed. It’s the 10th or so line:

;control floor "Floor" real "linear" .02 0.0 1.0

Just lower that 1.0 to .25 or some such. Be warned that any changes to the .ny file will cause Audacity to forget the saved values of all the parameters and go back to the built-in defaults.

The Mono framework should allow programs such as mine to run on a Mac. It requires the 1.9 version, though, which is still prerelease and a quite large download—95 MB or so.

Allen McBride says:

Hi Chris,

I’m still getting lots of mileage from your compressor. Thanks for the advice about changing the slider ranges; that lets me know for sure my settings are what I want them to be. I have another question about the noise floor, though: I’ve finally realized that it doesn’t behave as I had assumed. Sections of audio beneath the noise floor are still raised, just not quite as much as if you don’t set a noise floor. So if someone doesn’t speak for a couple seconds, you get a little hill of background noise in the middle of the break in speech. So I changed “(defun limit-amp (samp) (if (and (> floor 0) (> samp max-gain)) max-gain samp))” to “(defun limit-amp (samp) (if (and (> floor 0) (> samp max-gain)) 1 samp))”. This sort of does more like what I’d expect, though of course if something just barely breaks the noise floor, up it comes. But I barely read Lisp, so I still don’t know how most of the compressor works, and I thought I’d get your reaction to this, and see if you had any other thoughts.

Thanks,
Allen

pdf23ds says:

Yeah, I think what you’re looking for is a “noise gate”. Lots of compressors have them built in, but I don’t have one in mine because it’s not really appropriate for most classical music recordings, which is what I wrote the plugin for. Alas, it seems that most people use it for speech instead. I’m not sure how I’d go about implementing a good noise gate algorithm, because I haven’t ever thought about it. The naive noise gate algorithm is just what your change consists of, and I’m not sure how much you can improve on that. But you might look for another plugin to apply in a separate pass (probably before compression).

The other thing I might do is decrease the compression ratio and attack/falloff speeds a bit. (Normalizing before compressing would help too.) That way the compression isn’t much harsher overall, but noise isn’t boosted as much.

I guess a more general solution would be to generalize the compression ratio parameter into a function with a domain of input dB and range of output dB, which is how SoundForge’s compressor works. It would probably be relatively easy to implement, but Audacity doesn’t offer a good GUI for that, or a way to create one, so you’d have to tweak the file itself every time you changed the function. The function would just replace “adjust-samp”, and look something like this, I guess:

(defun adjust-samp (db)
  (db-to-linear
    (cond
      ((< db -30) 0)
      (t db))))

“cond” is like a switch statement, with “t” being the default clause. The maximum amplitude and noise floor parameters would become superfluous. The sample function there silences everything below -30 dB and applies no change to everything else. But ideally if you graphed the “adjust-samp” function it would be continuous. I don’t think adjust-samp actually works on samples, btw.
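For what a continuous version might look like, here is a sketch in Python instead of Nyquist, since it is easier to try out. It is not how the plugin works: the curve points are invented, and transfer_db is a hypothetical helper that linearly interpolates between (input dB, output dB) pairs, roughly in the spirit of a graph-style compressor.

```python
def transfer_db(db, points):
    """Map input dB to output dB along a piecewise-linear curve.

    points must be (input_db, output_db) pairs with strictly increasing
    input_db. Inputs outside the curve are clamped to the end values.
    """
    if db <= points[0][0]:
        return points[0][1]
    if db >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= db <= x1:
            return y0 + (y1 - y0) * (db - x0) / (x1 - x0)

def db_to_linear(db):
    """Convert decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

# Made-up curve: push very quiet material down, lift the middle.
CURVE = [(-60.0, -90.0), (-40.0, -20.0), (0.0, -3.0)]

print(transfer_db(-50.0, CURVE))  # prints -55.0
print(transfer_db(-20.0, CURVE))  # prints -11.5
```

The gain to apply at a given input level would then be db_to_linear(transfer_db(db, CURVE) - db). Since the curve has no jumps, something that just barely crosses a threshold only changes the gain a little.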

Allen McBride says:

Thanks, Chris! This is helpful. I will play with noise gates, and with the adjust-samp function once I learn enough Nyquist to understand what’s going on. So do you think we’re crazy to use your compressor on speech? The conventional compressors may have some advantages, but the thing I like about yours is that it doesn’t miss any spikes. The Audacity compressor misses spikes (like mic hits or cassette tape artifacts) during compression, but then the spikes keep the normalizer from bringing up the volume, so I end up having to remove the spikes by hand. (Click removal doesn’t really seem to work; I’m not sure why.) Also, what do you mean when you say you don’t think adjust-samp works on samples?

Allen

pdf23ds says:

“So do you think we’re crazy to use your compressor on speech?”

Hey, if it works for you that’s great. I’m not surprised it does better than other sorts of compressors for some things. Nyquist comes with a compressor that’s similar to mine, but Audacity’s repackaging of Nyquist doesn’t include it (or a couple of library functions it uses).

“Click removal doesn’t really seem to work”

Well, clicks, pops, and other artifacts may sound similar but methods used to detect them can vary widely. For a microphone hitting something, I usually do a high-pass filter at around 20-50 Hz. It doesn’t get rid of the sound, but it gets rid of the peak in the waveform. For speech, I’d probably do a highpass to the whole thing. For dealing manually with big peaks, it’s pretty easy to use Audacity’s interactive envelope function to scale it way down.

“what do you mean when you say you don’t think adjust-samp works on samples”

The value passed to adjust-samp isn’t an individual sample, but part of the envelope that gets interpolated over the waveform. adjust-samp gets called once for each thousand samples.
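A tiny Python sketch of what “interpolated over the waveform” means here. It assumes plain linear interpolation between control points, which is an assumption about the behavior rather than something taken from the plugin source, and the function name is made up.

```python
def interpolate_envelope(control_points, step):
    """Stretch one gain value per `step` samples into one gain per sample."""
    out = []
    for a, b in zip(control_points, control_points[1:]):
        for k in range(step):
            # Slide linearly from this control point toward the next one.
            out.append(a + (b - a) * k / step)
    out.append(control_points[-1])
    return out

print(interpolate_envelope([1.0, 2.0, 0.0], step=2))
# prints [1.0, 1.5, 2.0, 1.0, 0.0]
```

With a step of 1000, a function like adjust-samp would only ever see the control points, never the individual samples in between.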
