The science of loudness
My watch has a “Noise” app: it shows dB, for decibels.
My amp has a volume knob, which also shows decibels, although.. negative ones, this time.
And finally, my video editing software has a ton of meters — which are all in decibel or decibel-adjacent units.
How do all these decibels fit together?
Are the decibels from my watch and the decibels from my amp related? And if so, how? I’ve decided to spend twenty minutes of your time answering that question.
I’ve also spent about a month of my time making FAM: the fasterthanlime audio meter, in Rust of course, with egui. If you’re a patron of any tier, you can clone and run it right now, and if you’re not, well you can’t.
What even is sound?
Sound, like wind, is more of a concept than a thing, since it’s the name we’ve given to a specific behavior of particles.
When you strum a guitar, the string vibrates:
…transferring energy to the body of the guitar, which amplifies it and projects it into the air as a pressure wave!
This wave eventually makes its way to the ear, where some processing is already done via its unique mechanical design: after being collected by the outer ear, the wave is ferried through the ear canal to the eardrum, whose vibrations are then amplified by three tiny bones. Then it’s destination: inner ear, where tiny hair cells sway as fluid moves around them.
Eventually, it’s converted by chemicals into electrical signals and interpreted by the brain as sound.
People experiencing hearing loss who use a cochlear implant bypass the mechanical parts of that pipeline, relying instead on microphones and speech processors.
Cochlear implant at the museum Kulturama in Zürich, Switzerland.
Although those implants do not replicate natural hearing, they give the brain enough information to recognize and process human speech and environmental sounds.
For the rest of us, our ears detect tiny changes in pressure.
Under pressure
You have to realize that pressure is constantly being applied to our bodies, on the order of one atmosphere: about one hundred thousand pascals, the pascal being the SI unit for pressure.
But you’ll notice that my watch’s “Noise” app doesn’t use pascals. In fact, no audio equipment I’ve ever looked at uses pascals. Instead, they show sound pressure level, defined as follows:

$$ L_p = 20 \log_{10} \left( \frac{p}{p_0} \right) \, \mathrm{dB} $$

Decibels are a logarithmic unit expressing a ratio — in this case the ratio between $p$, a pressure we measured, and $p_0$, a reference pressure.
In air (because sound can also travel through water and other media), we usually pick 20 micropascals, which is about the quietest sound the human ear can detect.
Instead of having a linear scale that spans 8 orders of magnitude (from micropascals to thousands of pascals), decibels give us a nice human-friendly scale going from 0 to about 194:
| Decibels | Example |
| --- | --- |
| 0 | Faintest sound heard by human ear |
| 30 | Whisper, quiet library |
| 60 | Normal conversation, sewing machine, or typewriter |
| 90 | Lawnmower, shop tools, or truck traffic (90 dB for 8 hours per day is the maximum exposure without protection*) |
| 100 | Chainsaw, pneumatic drill, or snowmobile (2 hours per day is the maximum exposure without protection) |
| 115 | Sandblasting, loud music concert, or automobile horn (15 minutes per day is the maximum exposure without protection) |
| 140 | Gun muzzle blast or jet engine (noise causes pain; even brief exposure injures unprotected ears, and injury may occur even with hearing protectors) |
| 180 | Rocket launching pad |
Source: Merck Manual
Above about 194 dB SPL, we don’t get sound waves, we get shock waves — the pressure amplitude would need to be more than one atmosphere, resulting in negative absolute pressure, which is impossible. Once you reach a vacuum… that’s it. There’s no going any more vacuumy.
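To make the relationship concrete, here’s a quick sketch (in Rust, since that’s what FAM is written in) converting a pressure amplitude in pascals to dB SPL, using the 20 micropascal reference from above:

```rust
/// Convert a pressure in pascals to dB SPL, using the standard
/// 20 µPa reference pressure for sound in air.
fn db_spl(pressure_pa: f64) -> f64 {
    const P_REF: f64 = 20e-6; // 20 micropascals
    20.0 * (pressure_pa / P_REF).log10()
}

fn main() {
    // The reference itself lands at 0 dB SPL, the faintest audible sound:
    println!("{:.1} dB SPL", db_spl(20e-6)); // → 0.0 dB SPL
    // One atmosphere (~101,325 Pa) of pressure amplitude is where
    // sound waves turn into shock waves:
    println!("{:.1} dB SPL", db_spl(101_325.0)); // → 194.1 dB SPL
}
```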
Signal processing
We haven’t yet elucidated what the decibels on my amp mean. I would call those dBFS, for decibels “relative to full scale”: at 0 dBFS, we would get the absolute maximum power the amp can output (which would be damaging to my ears and to my relationship with the neighbors).
The formula is the same, except that we don’t pick a fixed reference like with sound pressure levels. Here, we consider an input signal and an output signal:

$$ G = 20 \log_{10} \left( \frac{A_{\mathrm{out}}}{A_{\mathrm{in}}} \right) \, \mathrm{dB} $$

Say the solid curve here is our input signal, with amplitude $A_{\mathrm{in}}$, and the dotted curve is the signal after it comes out of.. some system, with amplitude $A_{\mathrm{out}}$:

Based on those amplitudes, our system has a gain of $20 \log_{10}(A_{\mathrm{out}} / A_{\mathrm{in}})$ decibels.
Because of the way human hearing works, it makes more sense to design a volume control around decibels: the control is, in fact, logarithmic, but it feels linear to the ear.
If you don’t do that, you end up with a very frustrating volume control where the upper 80% are way too loud, and the value you want is between two ticks on the low end of the slider. I’m sure you’ve seen those before, I know I have.
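A sketch of such a logarithmic mapping; the -60 dB floor here is an arbitrary choice for illustration, not a standard:

```rust
/// Map a slider position in [0, 1] to a linear gain factor, sweeping
/// a -60 dB..0 dB range so equal slider steps sound like equal steps.
/// (The -60 dB floor is an arbitrary choice for this sketch.)
fn slider_to_gain(position: f64) -> f64 {
    const FLOOR_DB: f64 = -60.0;
    if position <= 0.0 {
        return 0.0; // hard mute at the very bottom of the travel
    }
    let db = FLOOR_DB * (1.0 - position); // 1.0 → 0 dB, near 0.0 → -60 dB
    10f64.powf(db / 20.0) // decibels back to a linear amplitude ratio
}

fn main() {
    // Halfway up the slider is -30 dB, a gain of about 0.0316,
    // not 0.5 like a naive linear control would give:
    println!("{:.4}", slider_to_gain(0.5)); // → 0.0316
}
```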
Those dBFS are the ones we’re interested in as broadcast engineers, since we’re dealing with a signal directly.

Of course, most of the time, a signal will eventually be transformed back into a sound wave, and then we’ll have to worry about dB SPL again…
But as long as the signal stays within an audio system, what we have to worry about is exceeding maximum levels. In an analog system, that typically results in distortion, which can be done on purpose, for style. In a digital system, it results in clipping, a much harsher form of distortion.

Which sounds quite awful — you may recognize it from public address systems in parks or maybe trains:

To avoid that, we have to watch our levels. And over the past hundred years, we’ve come up with a bunch of solutions to do that, all of which are flawed in some way.
In the 1930s, the BBC came up with meters that look like this:
A typical British quasi-PPM. Each division between '1' and '7' is exactly four decibels and '6' is the intended maximum level.
Well, that one isn’t from the 1930s, but the basic idea hasn’t changed.
Independently and around the same time, the Germans also developed level meters, putting actual decibels on the scale…
A German PPM from Siemens & Halske
…and giving them the cute little nickname “Lichtzeigerinstrument” (light pointer instruments).
Today we would call them both “quasi-PPMs”, PPM for “peak programme meter”, and quasi because… they don’t actually report peaks accurately.
Type II PPMs, for example, have an integration time of 10ms — any peak shorter than that gets under-reported. This succession of notes, which all have the exact same volume but are getting longer and longer, shows how the quasi-PPM under-reports at first:
I don't own one, so the best I can do is show you a plug-in that simulates it!
Root Mean Square
But quasi-PPMs are still pretty good at showing peaks. A lot better than, say, VU meters (for Volume Unit), which were invented in the US in the 1940s, and which get us a lot closer to “loudness”.
What VU meters measure is similar to the Root Mean Square, which gives us the average level of a signal over a period of time:

$$ x_{\mathrm{RMS}} = \sqrt{ \frac{1}{N} \sum_{n=1}^{N} x_n^2 } $$
A typical VU meter has an integration time of 300ms (vs 10ms for a Type II PPM if you remember), giving us even more severe under-reporting of peaks:
Which is fine!
Here’s another side-by-side example — using an RMS meter this time, which is close enough to an actual VU meter:
A prototype of the fasterthanlime audio meter playing one of my tracks, we see that the sample peak is consistently higher than the root mean square.
A VU meter is not meant to show short peaks; it’s meant to let a radio operator know roughly how loud a song is, so they can adjust the volume and the listeners don’t have to.
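The root mean square that VU meters approximate is simple to compute: square the samples, average, take the square root. A minimal sketch:

```rust
/// Root mean square of a window of samples: square, average, square-root.
fn rms(samples: &[f64]) -> f64 {
    let mean_sq: f64 =
        samples.iter().map(|s| s * s).sum::<f64>() / samples.len() as f64;
    mean_sq.sqrt()
}

fn main() {
    // A full-scale square wave has an RMS of 1.0...
    let square = [1.0, -1.0, 1.0, -1.0];
    // ...but a full-scale sine wave has an RMS of 1/√2 ≈ 0.707:
    // same peak, less time spent near it, so it reads as quieter.
    let sine: Vec<f64> = (0..1000)
        .map(|n| (2.0 * std::f64::consts::PI * n as f64 / 1000.0).sin())
        .collect();
    println!("{:.3} {:.3}", rms(&square), rms(&sine)); // → 1.000 0.707
}
```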
However, I’m happy to report that, since the 1940s, both our understanding of human hearing and technology have improved.
Sample Peak, True Peak
First off, most audio processing is now done in the digital domain. Which is both a blessing and a curse.
Here’s Ableton showing a piece of audio:
If we zoom in, we can see the wave:
We can zoom in some more:
And eventually, Ableton will show us individual audio samples:
Today’s PPMs are not “quasi” anymore. It’s really easy to make a sample peak monitor, because you just look at a window of samples, say, a thousand of them, and keep whichever value is the furthest from zero! That’s your peak!
Your sample peak! Not your true peak!
Because the actual sound wave is reconstructed from a limited number of discrete samples, it’s possible for all samples to be below the maximum desired level, and yet for the reconstructed wave to rise above it!
To actually measure the true peak, one can use a sinc filter to upsample the signal, which fills in additional samples between the original ones — letting us know how high that sound wave truly goes.
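Here’s a sketch that measures both kinds of peak. It uses plain truncated-sinc interpolation for 4x oversampling; a real true-peak meter uses a proper polyphase filter, but the idea is the same. The test signal is a sine whose crests fall exactly between samples:

```rust
use std::f64::consts::PI;

/// Sample peak: the largest absolute sample value in the window.
fn sample_peak(samples: &[f64]) -> f64 {
    samples.iter().fold(0.0f64, |peak, s| peak.max(s.abs()))
}

/// Normalized sinc function.
fn sinc(x: f64) -> f64 {
    if x.abs() < 1e-12 { 1.0 } else { (PI * x).sin() / (PI * x) }
}

/// Rough true-peak estimate: reconstruct the wave at 4x the sample rate
/// by (truncated) sinc interpolation and take the largest value found.
fn true_peak(samples: &[f64]) -> f64 {
    let mut peak = 0.0f64;
    for i in 0..samples.len() * 4 {
        let t = i as f64 / 4.0; // position, in original-sample units
        let v: f64 = samples
            .iter()
            .enumerate()
            .map(|(n, &s)| s * sinc(t - n as f64))
            .sum();
        peak = peak.max(v.abs());
    }
    peak
}

fn main() {
    // A full-scale sine at 1/4 the sample rate, phased so every sample
    // lands at ±0.707 while the crests fall between samples:
    let samples: Vec<f64> =
        (0..64).map(|n| (PI / 2.0 * n as f64 + PI / 4.0).sin()).collect();
    // The sample peak says ~0.707; the true peak is close to 1.0.
    println!("{:.3} vs {:.3}", sample_peak(&samples), true_peak(&samples));
}
```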
The Loudness Wars
That takes care of peaks. What about loudness then? Well, we made progress there too. First, in the wrong direction.
The idea of compression is to get rid of dynamic range by taking anything above a certain level and progressively making it smaller:
A demonstration of compression in DaVinci Resolve 20 using the very text for this article.
We can apply gain to the resulting signal without clipping, making the whole thing louder. That gain is called “make-up” gain, because it makes up for the loud bits that were made quieter by the compression. I can’t believe that just clicked for me now.
During the 2000s, sound engineers started abusing compression to make their albums sound louder and louder, based on the theory that people preferred louder things.
The song "Super Trouper" as shown on the major issues of the album, the 1980 Super Trouper LP, 2001 Jon Astley remaster, 2005 The Complete Studio Recordings box set disc 7, and 2011 Super Trouper Deluxe Edition remaster disc 1.
These “loudness wars” lasted until the mid-2010s, when the music industry finally tackled the problem, by inventing a proper loudness unit: LKFS.
The first interesting thing about LKFS is that it takes into account multiple channels and does a downmix into one value:
Simplified block diagram of multichannel loudness algorithm
Which begs the question: what did the BBC do, with their PPMs?
Well, they didn’t have to worry about stereo until the late 50s, when they started experimenting with it themselves, using two separate AM transmitters.
So, two separate PPMs was one option:

Screenshot of BBC-type Peak programme meter in AB (left/right) mode
…and then they had a different variant that showed the sum and the difference of both channels, which came in two versions, M3:

Screenshot of BBC-type Peak programme meter in M3 (sum/difference) mode
And M6:

Screenshot of BBC-type Peak programme meter in M6 (sum/difference) mode
This is important because if you have two waves of opposite phase, they cancel each other out!
But the first stage of LKFS computation is filtering, to model how humans perceive sound.
The first filter boosts everything above 1000Hz:
Response of stage 1 of the pre-filter used to account for the acoustic effects of the head
And the second is a high-pass filter, which attenuates anything under 100Hz.
Second stage weighting curve
Next, we integrate over some interval $T$ to calculate the power $z_i$ of the filtered signal $y_i$ for each channel $i$:

$$ z_i = \frac{1}{T} \int_{0}^{T} y_i^2 \, dt $$
And finally, this should look familiar: it’s very close to the formula from earlier:

$$ L_K = -0.691 + 10 \log_{10} \left( \sum_i G_i \, z_i \right) $$

But because this time we’re measuring a power, not an amplitude, we use a factor of 10 instead of 20.

The $G_i$ are the weighting coefficients for individual channels, given in table 3 of BS.1770-5 as $1.0$ for the left, right, and centre channels, and $1.41$ for the surround channels.
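As a sketch (skipping the K-weighting filters and the gating that comes later), the formula maps per-channel mean-square powers to a single loudness value:

```rust
/// Loudness from per-channel mean-square powers z_i and channel
/// weights G_i: L = -0.691 + 10·log10(Σ G_i·z_i).
/// (This sketch skips the K-weighting filters and gating.)
fn loudness(powers: &[f64], weights: &[f64]) -> f64 {
    let weighted: f64 = powers.iter().zip(weights).map(|(z, g)| g * z).sum();
    -0.691 + 10.0 * weighted.log10()
}

fn main() {
    // A full-scale sine has a mean-square power of 0.5; in a single
    // channel with weight 1.0 that comes out near -3.7:
    println!("{:.2}", loudness(&[0.5], &[1.0])); // → -3.70
}
```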
Depending on the interval chosen to calculate loudness, we call the result different things:

- M, for momentary loudness (400 milliseconds)
- S, for short-term loudness (3 seconds)

As for I, it’s the integrated loudness: it takes into account an entire piece of media, minus the quiet parts, which are excluded by a standard gating mechanism.
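That gating works in two passes over short blocks: first drop anything below an absolute threshold of -70 LKFS, then drop anything more than 10 LU below the mean of what’s left. A sketch, taking each block’s mean-square power as input (real meters use 400 ms overlapping blocks; edge cases like all-silent input are ignored here):

```rust
/// Loudness of a set of blocks: average their *powers* (not their
/// decibel values!), then convert to LKFS.
fn mean_loudness(powers: &[f64]) -> f64 {
    let mean: f64 = powers.iter().sum::<f64>() / powers.len() as f64;
    -0.691 + 10.0 * mean.log10()
}

/// Integrated loudness with two-stage gating, given the
/// mean-square power of each block.
fn integrated_loudness(block_powers: &[f64]) -> f64 {
    let block_l = |z: f64| -0.691 + 10.0 * z.log10();
    // Stage 1: absolute gate, drop blocks quieter than -70 LKFS.
    let loud: Vec<f64> = block_powers
        .iter().copied().filter(|&z| block_l(z) > -70.0).collect();
    // Stage 2: relative gate, 10 LU below the stage-1 mean.
    let threshold = mean_loudness(&loud) - 10.0;
    let gated: Vec<f64> = loud
        .iter().copied().filter(|&z| block_l(z) > threshold).collect();
    mean_loudness(&gated)
}

fn main() {
    // A handful of loud blocks plus near-silence: the quiet blocks get
    // gated out and don't drag the integrated loudness down.
    let blocks = [0.5, 0.5, 0.5, 1e-9, 1e-9];
    println!("{:.2}", integrated_loudness(&blocks)); // → -3.70
}
```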
This prevents any “cheating” done by audio engineers to make their songs louder than the others, because we finally have one number that is relatively good at predicting how loud something will sound to the human ear.
When mastering for YouTube, we target an integrated loudness level of -14 LUFS.
In the “Stats for nerds” section of a YouTube video (that you can find in the context menu), there is a content loudness section:
On that video of George Michael’s “Careless Whisper”, they left a bit of headroom: their integrated loudness is -16.9 LUFS.
I checked with ffmpeg’s `-af ebur128` filter:
~/Downloads
❯ ffmpeg -i careless-whisper.webm -af ebur128 -f null -
ffmpeg version 7.1 Copyright (c) 2000-2024 the FFmpeg developers
✂️
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.106979 TARGET:-23 LUFS M:-120.7 S:-120.7 I: -70.0 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.206979 TARGET:-23 LUFS M:-120.7 S:-120.7 I: -70.0 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.306979 TARGET:-23 LUFS M:-120.7 S:-120.7 I: -70.0 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.406979 TARGET:-23 LUFS M: -20.6 S:-120.7 I: -20.6 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.506979 TARGET:-23 LUFS M: -20.5 S:-120.7 I: -20.6 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.606979 TARGET:-23 LUFS M: -21.4 S:-120.7 I: -20.8 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.706979 TARGET:-23 LUFS M: -25.0 S:-120.7 I: -21.6 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x6000002640b0] t: 0.806979 TARGET:-23 LUFS M: -33.5 S:-120.7 I: -21.6 LUFS LRA: 0.0 LU
✂️
[Parsed_ebur128_0 @ 0x6000002640b0] t: 300.606979 TARGET:-23 LUFS M: -60.3 S: -60.3 I: -16.9 LUFS LRA: 8.2 LU
[Parsed_ebur128_0 @ 0x6000002640b0] Summary:
Integrated loudness:
I: -16.9 LUFS
Threshold: -27.1 LUFS
Loudness range:
LRA: 8.2 LU
Threshold: -37.1 LUFS
LRA low: -22.0 LUFS
LRA high: -13.8 LUFS
✂️
Because -16.9 is below the target of -14, YouTube does not apply any change to this video.
By contrast, Rihanna’s Umbrella is mastered about 5 dB louder than the target, so YouTube turns the volume down:
Going from 100% to 55% is a change of… hey, we can calculate that!

$$ 20 \log_{10}(0.55) \approx -5.2 \, \mathrm{dB} $$
And that’s what the “stats for nerds” overlay is showing: the content is 5.2 to 5.3 dB above their target loudness, so they’re turning the volume down.
Personally, I find it strange that they show the difference between their loudness target and the content’s integrated loudness in decibels.
If they want to display the delta, they should use LU, loudness units. And honestly, they could just say the target is -14 LUFS and give us the actual loudness of a track in LUFS. This is stats for nerds! Not stats for normies!
Even more recently, YouTube has introduced “DRC” (for dynamic range compression):
As seen here on this Scott the Woz video, which is mastered way below YouTube’s target loudness level, at around -23 LUFS, the target for “broadcast” rather than “streaming”.
YouTube’s user-facing name for it is “Stable Volume”. It’s not supposed to kick in for music, because it ruins music, and you can turn it off in the settings:
A-weighting
LKFS and LUFS (same thing, different name) aren’t the only units that try to take psychoacoustics into account: the Apple Watch’s Noise app also does filtering.
The first experiments regarding “how loud humans think sound is” date back to 1927:
Note: "T.U." stands for telephone units, and "cycle" for Hertz.
A few years later, Fletcher & Munson published this equal-loudness contour graph:

Equal-loudness contours from Fletcher & Munson, 1933

Which takes a minute to figure out. Each line traces a constant perceived loudness. Their test subjects reported that, for example, a 1000 Hz tone blasted at 40 decibels felt as loud as a 100 Hz tone at 62 decibels.
In other words: we are much, much more sensitive to sounds at 1000Hz than to those at 100Hz.
That dip around 3 to 4 kHz is where our hearing is most sensitive: we made our smoke detectors beep at that frequency for maximum alert, and our babies cry at that frequency for similar reasons.
From that graph, an ISO standard was derived, specifying the A-weighting curve, which predates LKFS’s K-weighting by almost 50 years:
Although more basic and somewhat outdated, A-weighting is used in a bunch of places.
French law requires sound level meters like these in every music venue:
As of 2023, the levels to respect are expressed as Level Equivalent (LEQ), or, “average sound energy”, over 15 minutes.
American work safety organizations give different recommendations when it comes to maximum sound exposure:
OSHA:

| Duration per day | Sound level (dBA) |
| --- | --- |
| 8 hours | 90 |
| 4 hours | 95 |
| 2 hours | 100 |
| 1 hour | 105 |
| 30 minutes | 110 |
| 15 minutes | 115 |
NIOSH:

| Duration per day | Sound level (dBA) |
| --- | --- |
| 8 hours | 85 |
| 4 hours | 88 |
| 2 hours | 91 |
| 1 hour | 94 |
| 30 minutes | 97 |
| 15 minutes | 100 |
| 7.5 minutes | 103 |
| 3.75 minutes | 106 |
| 1.88 minutes | 109 |
| 0.94 minutes | 112 |
These tables use A-weighted decibels, dBA; and so does the Apple Watch Noise app.
Now that we know how all these units fit together, we can all be fun at the next party.