Long ago, in a studio far, far away, audio engineers labored with fragile plastic ribbons covered
with a mysterious rust-colored substance. Their task: to coax audio recordings onto and off of reels
filled with these ribbons. While there was a great deal of science involved, a generous dose of
black magic was required to achieve a quality recording. Not to mention the time and money required
to archive these delicate recordings for future generations.
Today, things are much easier. Video producers have the ability to record near-perfect
performances all day with just the click of a mouse. Editing is easy and mistakes are just an undo
away. Archiving with optical discs is the norm and the digital media revolution makes it both
possible and practical to keep multiple copies of our work scattered all over the globe.
Unfortunately, with all this ease of use comes confusion about the process of recording and
processing digital audio. In this month’s Sound Advice, we’ll take a closer look at all those ones
and zeros.
History and Math
Back in the late 1920s, a man named Harry Nyquist developed a theorem that describes how to
accurately convert an analog signal into a data stream. While the theoretical framework could easily
fill this space on its own, the basic equation is pretty simple. To recreate our signal correctly,
Nyquist said, we must sample the audio at a rate that is at least twice the highest frequency
(pitch) in the audio. Wait. Sample? Frequency? What does that all mean?
Let’s use the universal audio CD as an example. The digital audio on a compact disc is sampled at
a periodic rate of 44,100 times per second (with 16-bit accuracy: we’ll get to that in a moment). If
we divide the sampling rate by two, we see that our music CDs can theoretically record audio
frequencies up to 22,050 cycles per second (hertz), which is higher than the average range of
human hearing. That’s a theoretical maximum, however, and it would be nice if we had a little more
wiggle room. If it helps, you might think of the sample rate in terms of resolution. More resolution
means more detail, in this case, more detail in the higher frequencies.
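If you like to see the arithmetic spelled out, here is a minimal Python sketch of the divide-by-two rule. The function names are made up for this illustration, and the result is only the theoretical ceiling; real converters fall a little short of it.

```python
def nyquist_limit(sample_rate_hz: float) -> float:
    """Highest frequency (in hertz) a sample rate can capture in theory."""
    return sample_rate_hz / 2


def min_sample_rate(highest_freq_hz: float) -> float:
    """Lowest sample rate (in hertz) needed to capture a given frequency."""
    return 2 * highest_freq_hz


if __name__ == "__main__":
    # 44,100 is the audio CD rate; 48,000 is shown only for comparison.
    for rate in (44_100, 48_000):
        print(f"{rate:,} samples per second -> {nyquist_limit(rate):,.0f} Hz ceiling")
```

Going the other way, `min_sample_rate(20_000)` says a 20,000Hz tone needs at least 40,000 samples per second.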
The number of digital bits we assign to each sample also influences the quality of the recording.
In the digital world, there are only two possible values for any memory location: zero or one. You
can also think of them as Off and On. Each of our 16 bits can have one of those two values. A little
calculation (2^16) shows there are 65,536 possible combinations in our 16-bit word or sample. That’s
just one sample. That’s a lot of ones and zeros! If it helps, you might think about the audio bit
depth in terms of the number of potential colors for a particular sample.
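The same calculation works for any bit depth. The short Python sketch below (the function name is ours, purely for illustration) counts the possible values for a single sample.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct values one sample can take at a given bit depth."""
    # Each bit is either off or on, so every extra bit doubles the count.
    return 2 ** bit_depth


if __name__ == "__main__":
    for bits in (12, 16, 24):
        print(f"{bits}-bit: {quantization_levels(bits):,} possible values")
```

Since each added bit doubles the count, 24-bit audio offers 256 times as many values per sample as 16-bit audio.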
Of course, as soon as CD audio hit the streets, there were those in the audio community who
believed the quality wasn’t high enough and began to look to future standards. Although our audio
CDs are still 16-bit, 44.1kHz, many cutting edge studios record with sampling rates of 192kHz with
24-bit accuracy. 24 bits allow for over 16 million variations in the audio sample. Most of the
desktop audio recording packages will record 48kHz at 24 bits. These numbers will continue to rise
as computer technology advances.
Split Personality
So what digital audio system does Mini DV use? Well, actually, it uses two different systems: a
high-quality two-channel system and a lower-quality 4-channel system. The high-quality version uses
16-bit accuracy and a sampling rate of 48,000 times per second. This is actually better than CD
quality and one of the reasons clever video producers use their camcorders as portable audio
recorders as well. When you record audio with these settings, rest assured the quality of the
recording will only be limited by the audio sources, not the recording medium. It isn’t a
coincidence that DVD PCM audio is also 16-bit, 48kHz.
The secondary audio system is 12-bit with a sampling rate of 32,000 samples per second. This is
not bad, by any means, but it falls short of the 16-bit, 48kHz mode (see the 12-bit by Default
sidebar for an important tip).
So, why did they go to the trouble of designing a completely different, if inferior, audio system?
The 12-bit system has two independent pairs of stereo audio channels: 4 channels in all. OK,
great, but what can you do with it? Unless you’re recording voice-overs in the field, not much. This
feature is a holdover from the analog video days. Those producing video on 3/4-inch tape, back in
ancient times, often recorded natural audio on one channel and the voice-over on another channel.
During the design phase of the DV format, it seemed like a good idea to allow two stereo pairs of
tracks for this purpose. Of course today, most camcorder owners in the Americas don’t even know
about the option, let alone how to use it.
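One way to see the trade-off is to multiply out the numbers for each mode. The Python sketch below is simple arithmetic on the figures quoted above, treating every sample as a plain run of bits; it is not a description of how the DV format actually lays audio onto tape, and the class and variable names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioMode:
    """One audio mode, described by the figures quoted in the article."""
    name: str
    bit_depth: int
    sample_rate_hz: int
    channels: int

    def bits_per_second(self) -> int:
        """Raw audio bits per second across all channels."""
        return self.bit_depth * self.sample_rate_hz * self.channels


# Hypothetical labels; the numbers come straight from the text.
TWO_CHANNEL = AudioMode("16-bit, 48kHz, 2 channels", 16, 48_000, 2)
FOUR_CHANNEL = AudioMode("12-bit, 32kHz, 4 channels", 12, 32_000, 4)

if __name__ == "__main__":
    for mode in (TWO_CHANNEL, FOUR_CHANNEL):
        print(f"{mode.name}: {mode.bits_per_second():,} bits per second")
```

By this rough count, both modes come out to the same raw total, which suggests the four-channel mode gets its extra tracks by giving up bits and samples on each one.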
Squeeze Me
When producing a DV project, it’s best to use the 16-bit, 48kHz standard for audio, since it’s
higher quality and is a natural for DVD distribution. Many editing applications ask you to choose
the audio format for your project when you start the program. Most of the time, you can just select
an NTSC DV template and the audio settings will be automatic.
While most professional editing applications allow you to mix and match audio formats, including
MP3s and WAV files of various rates, it’s best to leave the audio conversion to your dedicated audio
software. We’ve seen bugs around this issue in the past: incorrect sample rate conversion will turn
your distinguished narration into Alvin and the Chipmunks. I’ve personally seen two of the major
applications choke and die during real-time previews of multi-source audio.
If you have either of these increasingly rare problems, using a dedicated audio utility to
properly convert your audio to the correct format before bringing it into your video editor will
solve the problem.
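To see where the chipmunks come from, consider what happens when audio recorded at one sample rate is played back as if it were another, with no conversion at all. The Python sketch below works out the speed-up and the resulting pitch shift. It assumes the simplest failure (the samples are untouched and only the playback rate is wrong), and the function names are ours.

```python
import math


def playback_speed_factor(recorded_rate_hz: float, playback_rate_hz: float) -> float:
    """How much faster (>1) or slower (<1) mislabeled audio plays."""
    return playback_rate_hz / recorded_rate_hz


def pitch_shift_semitones(speed_factor: float) -> float:
    """Pitch change in semitones; 12 semitones is one octave (a doubling)."""
    return 12 * math.log2(speed_factor)


if __name__ == "__main__":
    # Example: 32kHz narration dropped unconverted into a 48kHz project.
    factor = playback_speed_factor(32_000, 48_000)
    print(f"Speed factor: {factor:.2f}x")
    print(f"Pitch shift: {pitch_shift_semitones(factor):+.2f} semitones")
    print(f"A 10-second clip plays in {10 / factor:.2f} seconds")
```

In that example the narration runs one and a half times too fast and about seven semitones too high. A proper sample rate conversion calculates new samples so that both the length and the pitch stay put.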
I know we’ve done a little more math than usual in this month’s column, but the goal is to give
you a better understanding of the digital audio process. Armed with this knowledge, you’re on your
way to higher quality audio and a streamlined production process.
Author Hal Robertson has been producing stereo soundtracks for over 24 years. He owns a
consulting company that specializes in media production.
[Sidebar: Legendary Standards]
When we think of worldwide standards such as that for the audio CD, it’s easy to imagine a
consortium of geeks in lab coats hashing out the finer mathematical details. In reality, many
decisions are rather arbitrary. For example, one proposed CD audio standard was 14-bit, 36kHz
digital audio on an 11.5-cm disc that can hold 60 minutes of music. Another
proposed standard was 16-bit, 44.1kHz audio on a 12-cm disc that can hold 74 minutes of music. Legend has
it that a higher-up at Sony (or was it Philips?) demanded that Beethoven’s 9th fit on a single disc,
which required a 74-minute disc (never mind that performances rarely hit exactly 74 minutes). A
further, and even more fun, legend is that executives at Philips and Sony settled the matter with a
surfboard match.
[Sidebar: Math Bytes]
Bit depth and sample rate can be used to calculate the storage size of an audio file. For one second
of CD-quality audio:
(44,100 samples per second * 16 bits) * 2 channels = 1,411,200 bits
or, since there are eight bits in a byte:
(1,411,200 bits / 8) = 176,400 bytes
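The same sum is easy to turn into a small Python helper for any length of recording. This is only a sketch: the function name is invented, and it counts raw, uncompressed audio with no allowance for file headers or disc overhead.

```python
def pcm_size_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: float) -> float:
    """Storage needed for raw, uncompressed audio, in bytes."""
    total_bits = sample_rate_hz * bit_depth * channels * seconds
    return total_bits / 8  # eight bits in a byte


if __name__ == "__main__":
    one_second = pcm_size_bytes(44_100, 16, 2, 1)
    full_disc = pcm_size_bytes(44_100, 16, 2, 74 * 60)
    print(f"One second of CD audio: {one_second:,.0f} bytes")
    print(f"74 minutes of CD audio: {full_disc:,.0f} bytes")
```

For a 74-minute disc the total works out to 783,216,000 bytes of raw audio.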
[Sidebar: 12-bit by Default]
One of our biggest complaints to camcorder manufacturers is that almost all camcorders come from the
factory with their default audio set to the lower-quality 12-bit, 32kHz audio mode. If you are not
explicitly doing audio dubs over your audio tracks (or don’t even know what that is), you should put
this magazine down right now and get your camcorder out. Turn it on, go to the record settings menu
and find the audio settings. Then change the "12-bit" setting to "16-bit" or,
alternately, on some cameras, change the "32kHz" setting to "48kHz." Will you
notice a difference in quality? Probably not, but we’ll all feel better.