Bit Depth, Sample Rates and Buffer Size for Podcasts

Riley Byrne, Owner, Podigy.co

20 October 2017

Today we’re going to walk through a few things that every podcaster has to decide upon before recording. In our Podcast Recording Guide, we outline that recording with 16-bit bit depth, 44 100 Hz sample rate and a buffer size of 128 or 256 is optimal for most podcasters, but we didn’t go into why. Today we’ll explain these numbers and what they mean, and let you decide for yourself what is best for your podcast.

Let’s start with bit depth, which is different from bitrate, which we talk about in our rendering guide. Microphones work by converting the amplitude of our voice into a voltage. To record that audio into our computer, that voltage is converted into a number representative of the voltage. The bit depth of our audio is basically how accurately the computer will record the amplitude of incoming audio. The higher the bit depth, the closer our digital recorder can match the analog input and the less noise is introduced when the digital recorder inevitably rounds off the number in order to record it. A 4-bit recording will be able to give each sample one of 16 possible values, whereas a 16-bit recording will be able to give each sample one of 65 536 values.
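If it helps to see the numbers, here is a minimal Python sketch of the idea. The function names (quantization_levels, quantize) are made up for illustration, and the rounding is a simplified model of what a real converter does, assuming the incoming signal has been scaled to sit between -1.0 and 1.0.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct values a single sample can take."""
    return 2 ** bit_depth


def quantize(value: float, bit_depth: int) -> float:
    """Round a value in [-1.0, 1.0] to the nearest step available at this bit depth.

    Simplified model: a real converter would also clamp to its largest storable value.
    """
    steps = 2 ** (bit_depth - 1)  # steps on each side of zero
    return round(value * steps) / steps


for bits in (4, 16, 24):
    error = abs(quantize(0.3, bits) - 0.3)
    print(f"{bits}-bit: {quantization_levels(bits)} values, rounding error {error:.8f}")
```

The rounding error shrinks as the bit depth goes up, and that rounding error is the quantization noise.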

Think of it this way: imagine you had to book a caterer for a wedding with 100 people. One company offers packages for 10 people, 500 people or 1000 people and nothing in between. None of those packages is representative of what you need, but if you had to choose one, it would probably be the 500-person package, even if that would leave you with a bunch of leftovers, because you don’t want to leave 90 people without food. That’s what a 4-bit recording is like. A 16-bit recording is more like a catering service that offers a per-person rate.

But you’ll see a lot of interfaces advertise these days that they offer 24-bit recording, and surely that must be like a catering company that lasers the name of each guest into the steak or chicken they’ve requested. And while it’s true that 24-bit bit depth offers 16 777 216 values for each sample of audio, it is almost certainly wasted on podcast recordings.

You see, the main advantage of increasing bit depth is that it allows for a lower noise floor. While 24-bit recordings can sound cleaner in controlled environments like recording studios, they’re not going to make any difference for a podcast recorded in an untreated apartment. A higher bit depth just gives you the capacity for lower noise, but doesn’t actually change anything sonically.

To put it another way, for a 16-bit file, you’d have to turn up the volume until it was equivalent to the sound of a food processor before you would notice the noise floor introduced by quantization. For a 24-bit file, you’d have to turn up the volume until it was the equivalent of a jet engine taking off before you’d notice the noise floor introduced by quantization. While this is great, chances are you’re recording in a place that has a noise floor much higher than that, and thus you can’t take advantage of the lower noise floor to begin with. The noise introduced with 16-bit recording is about equivalent to someone dropping pins one at a time on the floor during your podcast. That’s definitely not something the listener is going to pick up after the audio is converted to a lossy mp3, and 16-bit recordings have significantly smaller file sizes than 24-bit recordings.
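For anyone who wants to put rough numbers on that trade-off, here is a small Python sketch. It uses the standard rule of thumb that each bit adds about 6 dB of theoretical range, and it assumes uncompressed mono audio with file headers ignored. The function names are hypothetical, chosen just for this example.

```python
import math


def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical gap between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bit_depth)


def bytes_per_minute(bit_depth: int, sample_rate: int = 44_100, channels: int = 1) -> int:
    """Size of one minute of uncompressed audio, ignoring file headers."""
    return sample_rate * (bit_depth // 8) * channels * 60


for bits in (16, 24):
    print(f"{bits}-bit: ~{dynamic_range_db(bits):.0f} dB of range, "
          f"{bytes_per_minute(bits) / 1_000_000:.1f} MB per minute (mono)")
```

That works out to roughly 96 dB for 16-bit and 144 dB for 24-bit, with the 24-bit file being 50% larger for the same length of audio.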

Sample Rate

When we looked at bit depth, I mentioned that each sample of audio is given an amplitude value. Our sample rate is what defines the length of each sample, with our recommended setting of 44 100 Hz producing 44 100 samples of audio every second. Higher sample rates also allow recordings to pick up higher frequencies. But what does this mean for podcasters, and why do we recommend using 44 100 Hz for recording?

So the first thing you need to know is that the human ear can only hear frequencies up to ~20 000 Hz. Anything above that is wasted on us. However, the Nyquist–Shannon sampling theorem tells us that our sample rate needs to be at least twice the highest frequency we’re trying to record in order to reproduce it accurately. Part of the reason Stuff You Should Know and the WTF podcasts sound a little “off” is because they render their episodes out at 22 050 Hz, which can only reproduce frequencies up to about 11 025 Hz, well below the limit of human hearing, which means they’re actually losing information we’re used to hearing in podcasts. So we need to be rendering at a minimum of 40 000 Hz to capture everything the human ear can hear.
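The arithmetic behind that is simple enough to sketch in a few lines of Python. The function names and the HEARING_LIMIT_HZ constant are assumptions for illustration, with the constant set to the approximate 20 000 Hz figure used above.

```python
HEARING_LIMIT_HZ = 20_000  # approximate upper limit of human hearing


def highest_frequency_hz(sample_rate: int) -> float:
    """Highest frequency a given sample rate can represent (half the sample rate)."""
    return sample_rate / 2


def minimum_sample_rate_hz(highest_frequency: float) -> float:
    """Lowest sample rate that can represent the given frequency."""
    return highest_frequency * 2


for rate in (22_050, 44_100, 48_000):
    top = highest_frequency_hz(rate)
    verdict = "covers" if top >= HEARING_LIMIT_HZ else "falls short of"
    print(f"{rate} Hz captures up to {top:.0f} Hz, which {verdict} the range of human hearing")

print(f"Minimum rate for {HEARING_LIMIT_HZ} Hz: {minimum_sample_rate_hz(HEARING_LIMIT_HZ):.0f} Hz")
```

A 44 100 Hz recording tops out at 22 050 Hz, which leaves a little headroom above what we can hear, while a 22 050 Hz render tops out at 11 025 Hz.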

As to why 44 100 Hz became the norm, Sony decided on that when CDs were first being manufactured. As to why it continues to be widely used in the internet age, that has more to do with compatibility. Just about everything on the internet uses 44 100 Hz for audio, and this ensures that audio generally sounds the same across all platforms. Occasionally you’ll hear audio that sounds slightly pitched down and kind of slow, which usually means a 48 000 Hz file is being played at 44 100 Hz. It’s long and complicated as to why these things happen, but suffice it to say 44 100 Hz is always going to ensure your audio plays well across the net.
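To get a feel for how big that slowdown is, here is a rough Python sketch. It assumes the simplest kind of mismatch, where the player feeds out the file's samples at the wrong rate without resampling them. The function names are hypothetical.

```python
import math


def playback_speed(file_rate: int, playback_rate: int) -> float:
    """Speed factor when samples are played at the wrong rate (1.0 means normal speed)."""
    return playback_rate / file_rate


def pitch_shift_semitones(file_rate: int, playback_rate: int) -> float:
    """Pitch change caused by the speed change, in semitones (negative means lower)."""
    return 12 * math.log2(playback_rate / file_rate)


speed = playback_speed(48_000, 44_100)
shift = pitch_shift_semitones(48_000, 44_100)
print(f"Speed: {speed * 100:.1f}% of normal, pitch: {shift:.2f} semitones")
```

Under that assumption, the audio plays at about 92% of its normal speed and sits roughly a semitone and a half lower.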

Buffer Size

So now that we know what sample rate to use, we have to decide how many samples at a time our computer will write to disk. This is called our buffer or block size. At a sample rate of 44 100 Hz, our computer needs to write 44 100 samples every second, and we need to decide how often it will check for new samples. A lower buffer size will mean less latency (the time between you speaking into your microphone and your computer playing it back) but will increase the strain on your computer. A buffer size of 32 will mean your computer is checking on the incoming audio ~1 378 times every second, whereas a buffer size of 512 will mean it only checks ~86 times a second. When your computer is still writing one buffer after the next buffer fills up, you’ll start to hear clips and clicks and dropouts in your audio.
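Here is a quick Python sketch of that trade-off for a few common buffer sizes. The function names are made up for this example, and the latency figure only counts the time it takes to fill a single buffer. Real-world latency will be higher once the interface, drivers and software add their own delays.

```python
def buffers_per_second(buffer_size: int, sample_rate: int = 44_100) -> float:
    """How many times per second the computer has to deal with a full buffer."""
    return sample_rate / buffer_size


def buffer_latency_ms(buffer_size: int, sample_rate: int = 44_100) -> float:
    """Time it takes to fill one buffer, in milliseconds."""
    return buffer_size / sample_rate * 1000


for size in (32, 128, 256, 512):
    print(f"Buffer {size}: ~{buffers_per_second(size):.0f} checks per second, "
          f"~{buffer_latency_ms(size):.1f} ms per buffer")
```

Doubling the buffer size halves how often your computer has to check in, and doubles the delay each buffer adds.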

Unlike sample rates and bit depths, there is no hard and fast rule for buffer sizes. Each of you will have to determine what is right for you on your own. If you’re a solo podcaster, or your guests are all in studio and you don’t use headphones, you can set the buffer as high as you want to ensure your computer has plenty of time to write each buffer before the next one fills. However, if you’re recording and talking to someone over Skype/Discord/Zencastr/etc., you’ll have to be wary of the effect higher buffer sizes have on latency. The higher your latency, the longer it will take for your audio to get to your guest, and you’ll probably have more false starts and instances of talking over each other. Nothing that can’t be fixed up while editing, but something to keep in mind.

That’s it! If you’re looking to apply these principles, be sure to check out our Podcast Recording Guide, and we will see you next time!
