Have you ever wondered how audio signals get from the microphone to the tape or memory card? It’s a mysterious process, rich with history, drama and intrigue. OK, it’s not really like that, but it is a pretty interesting procedure that’s been refined over the years. Whether you shoot with a compact video camera or an HDSLR, or just record audio on a pocket recorder, the operations are the same. This can get a little geeky at times, but we’ll try to keep the math to a minimum.
Microphones transform acoustic energy into an electrical signal. Some mics are better at this transformation than others, and there are hundreds to choose from. Regardless, audio goes in and electricity comes out. Back in olden times, the audio signal was recorded on analog videotape, much like an old cassette tape (you have seen a cassette tape, haven’t you?). The tape passed across a recording head that rearranged the magnetic particles on it to represent the audio signal. On playback, the tape passed over the head again, but this time the head read the magnetic field on the tape and turned it back into an audio signal. Sounds archaic today, doesn’t it?
That’s because, since the mid-1980s, digital audio has been the reference standard for audio quality – both for professionals and consumers. The reason is simple: with analog recording, you never get back exactly what you put in. On playback, the audio is a little less clear, and noise – in the form of tape hiss – is mixed in with your recording. With digital audio, what you play back is virtually identical to what you recorded. The only limitation is the quality of your equipment. This makes it easy for the recordist to ensure a good product. If it sounds good going into the recorder, it will sound good coming out.
Today, digital audio is everywhere – in our cameras, MP3 players, on the Internet, in theaters, on our phones, and, for the past few years, on television too. But it doesn’t take a recording studio, fancy hardware or software to do digital audio sampling. The process is already baked into all of our production equipment.
The way we transform that electrical signal into digital audio is called Sampling. There are some other terms associated with the procedure, and the ones we need right now are Bit Depth and Sampling Frequency. Here’s how it works. Once the electrical audio signal comes into a digital audio device, it goes through something called an analog-to-digital converter. The converter is set to a specific bit depth – usually 16 or 24 bits – and samples the incoming audio. Each sample is a string of 16 or 24 ones and zeros called a bit word, and it represents the level of the audio signal at that moment in time. It’s a lot like taking a still image with a camera. Although things happened before and after, the picture you end up with represents just one moment in time.
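For the curious, here is a minimal Python sketch of what "turning a level into a bit word" can look like. It assumes the incoming signal has already been scaled to a range of -1.0 to 1.0 and that the word is stored as a signed two's-complement number, which is a common convention but not the only one. The function name `quantize` is made up for this illustration.

```python
def quantize(sample: float, bit_depth: int = 16) -> str:
    """Map an amplitude in [-1.0, 1.0] to a bit word of the given length.

    Assumption: signed two's-complement storage, as used by 16-bit PCM.
    """
    max_code = 2 ** (bit_depth - 1) - 1    # 32767 for 16 bits
    min_code = -(2 ** (bit_depth - 1))     # -32768 for 16 bits
    code = round(sample * max_code)
    code = max(min_code, min(max_code, code))  # clip anything out of range
    # Mask to the word length so negative values show their two's-complement bits
    return format(code & (2 ** bit_depth - 1), f"0{bit_depth}b")


print(quantize(0.0))        # 0000000000000000
print(quantize(1.0))        # 0111111111111111
print(len(quantize(0.5, 24)))  # 24
```

A real converter does this in hardware, tens of thousands of times per second, but the idea is the same: one measurement in, one fixed-length string of ones and zeros out.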
To continue our analogy, Sampling Frequency is more like shooting video, since video is really just a series of still images. We record video at 24, 30 or 60 frames per second and string the frames together to create a moving image. Digital audio sampling works the same way. The difference is the rate. Instead of just a few dozen times a second, digital audio creates samples at a much higher rate. In the video world, it’s usually 48,000 times per second, abbreviated as 48kHz or just 48k. For one second of digital audio, there are 48,000 individual bit words. If those bit words are 16 digits long, there are 768,000 individual ones and zeros. Double that number for a stereo track.
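If you want to check that arithmetic yourself, it is just three numbers multiplied together. This short Python sketch (the function name `bits_per_second` is our own) reproduces the figures above:

```python
def bits_per_second(sample_rate: int, bit_depth: int, channels: int) -> int:
    """Raw data rate of uncompressed audio, in bits per second."""
    return sample_rate * bit_depth * channels


mono = bits_per_second(48_000, 16, 1)
stereo = bits_per_second(48_000, 16, 2)

print(f"{mono:,}")    # 768,000
print(f"{stereo:,}")  # 1,536,000
```

Swap in 24 for the bit depth or 44,100 for the sample rate to see how other common settings compare.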
That sounds like a lot of data, but computers speak in ones and zeros, so it’s easy for them to store that type of information. In fact, one second of stereo audio with these specifications takes up only 192,000 bytes, or about 187.5 kilobytes. Compared to our terabyte storage devices, it’s really pretty small. Once the audio is sampled and digitized, it can go on a hard drive, a memory card and even digital audio tape or videotape. For many of us, our workflow is tapeless today, so as soon as the audio is recorded to some form of media, it’s simple to move it over a network, transport it on a flash drive or shoot it across the world over the Internet.
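You can confirm the size of that one-second file with Python's built-in `wave` module. The sketch below writes one second of 16-bit, 48k stereo silence into memory rather than to disk. The helper name `one_second_wav` is ours, and silence is used only because the file size does not depend on what the audio sounds like.

```python
import io
import wave


def one_second_wav(sample_rate: int = 48_000, channels: int = 2,
                   bytes_per_sample: int = 2) -> tuple[int, int]:
    """Write one second of silence as a WAV file in memory.

    Returns (audio_bytes, total_file_bytes).
    """
    audio_bytes = sample_rate * channels * bytes_per_sample
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(bytes_per_sample)   # 2 bytes = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(audio_bytes))  # all zeros = digital silence
    return audio_bytes, len(buffer.getvalue())


audio, total = one_second_wav()
print(audio)          # 192000
print(audio / 1024)   # 187.5
print(total - audio)  # the few extra bytes are the WAV file header
```

The total comes out slightly larger than the audio alone because a WAV file begins with a short header describing the sample rate, bit depth and channel count.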
Play It Back
Once audio is in digital form, it’s easy to do all sorts of things. You can chop it up, reassemble the parts in a different order, process it and generally have your way with it. But ultimately, digitally sampled audio isn’t very useful unless you play it back. This means turning our ones and zeros back into an audio signal we can actually hear. Every digital audio recorder contains another section called a Digital-To-Analog converter. As bit words stream in, the converter understands what they mean and translates each one back into an electrical signal that can be amplified and played through your speakers or headphones.
In the early days of digital audio, this was done on a one-to-one basis – one sample in, one sample out. But something just wasn’t right. Listeners complained of lifeless sound, distortion and a number of other things, real or perceived. In response, engineers developed the process of over-sampling. Instead of converting each bit word once at the original rate, the converter now ran at a multiple of that rate and calculated extra in-between values to fill the gaps. This made the decoding process more accurate and more musical, and it eliminated errors heard in early digital audio devices. That technique continues today, but it’s often done in software rather than hardware. Computer processing power is cheap and plentiful. So much so that sophisticated converters can be included in something as simple as a $100 pocket audio recorder.
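One rough way to picture over-sampling is a converter that works at a multiple of the original rate and fills in extra values between the recorded samples. The Python sketch below does this with simple straight-line interpolation. That is a deliberate simplification for illustration: real converters use more sophisticated digital filters for this step, and the function name `oversample_linear` is our own invention.

```python
def oversample_linear(samples: list[float], factor: int = 4) -> list[float]:
    """Raise the sample count by `factor` using straight-line interpolation.

    Assumption: a toy stand-in for the interpolation filter in a real
    over-sampling converter.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if not samples:
        return []
    out = []
    # Walk through each pair of neighboring samples and fill the gap
    for current, following in zip(samples, samples[1:]):
        for step in range(factor):
            out.append(current + (following - current) * step / factor)
    out.append(samples[-1])  # keep the final original sample
    return out


print(oversample_linear([0.0, 1.0], factor=4))
# [0.0, 0.25, 0.5, 0.75, 1.0]
```

The original samples are still there, untouched. The new values simply give the converter a smoother staircase to turn back into an electrical signal.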
Knowledge is Power
That wasn’t too painful, was it? Digital audio sampling is a part of everything we do as video creators. Here’s the best part: understanding the process involved in getting audio from our talent to our viewer isn’t something you’ll have to deal with every day. The work has already been done for you. In fact, we all take it for granted every time we work on a project. But understanding the terminology can come in handy when setting up a voice-over session or even doing a field recording. (Or playing a game of Trivial Pursuit – people still do that, right?) Now you know what all the little numbers mean and what’s going on behind the scenes inside your computer, camera or audio recorder. Aren’t you glad we left out the part about the Nyquist Theorem?
Sidebar: Bit Rates
The type of digital audio we’ve discussed in this article is bit-for-bit uncompressed audio – often stored in .WAV or .AIFF files. A 16-bit, 48kHz stereo file in one of these formats has a bit rate of 1.536 megabits per second. This is huge compared to a typical bit rate for MP3 files. At 160 kilobits per second, a typical MP3 is roughly one-tenth the size of the uncompressed file. Compressed audio tracks use various types of perceptual coding to eliminate details, and file size, that we can’t really hear anyway. But don’t confuse the two. Compressed audio of this kind is lossy, like a bad JPEG picture on the Internet.
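To put those two bit rates side by side, here is a small Python sketch. The 160 kilobit figure is the typical MP3 rate mentioned above, the helper name `megabytes_per_minute` is ours, and a megabyte is counted as one million bytes.

```python
UNCOMPRESSED_BPS = 48_000 * 16 * 2   # 16-bit, 48k stereo: 1,536,000 bits/s
MP3_BPS = 160_000                    # typical MP3 rate used in this sidebar


def megabytes_per_minute(bits_per_second: int) -> float:
    """Storage needed for one minute of audio, in megabytes (10^6 bytes)."""
    return bits_per_second * 60 / 8 / 1_000_000


print(UNCOMPRESSED_BPS / MP3_BPS)              # 9.6
print(megabytes_per_minute(UNCOMPRESSED_BPS))  # 11.52
print(megabytes_per_minute(MP3_BPS))           # 1.2
```

In other words, a minute of uncompressed stereo audio needs a bit over 11 megabytes, while the same minute as a 160 kilobit MP3 needs a little over one.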
Contributing Editor Hal Robertson is a digital media producer, photographer and technology consultant.