Think back to high school science for a moment. Remember how we were taught that sound waves were invisible and could only be heard and felt?
The humble audio level meter offers a simple way for video producers to visualize audio volume as they record and post their projects. But what if I told you there are ways to look deeper into the content of your soundtrack? Would you be interested? This month, we'll explore a couple of different techniques for better understanding what's hiding inside your audio soundtracks.
Level meters are an important tool for the video producer. During the shoot, they help identify correct sound levels, ensuring a clean, punchy soundtrack. During post, level meters keep things in check as you add audio elements to your production — too strong and the audio will distort, too weak and your viewers will lunge for the remote control. As useful as level meters are, they only tell a small part of the story — namely, volume.
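For the curious, here is a rough sketch of the arithmetic behind a level meter. It assumes mono audio already loaded as floating-point samples between -1.0 and 1.0, and the function name and test tone are made up for illustration. Real meters add smoothing and peak-hold behavior that this leaves out.

```python
import math

def peak_and_rms_dbfs(samples):
    """Return (peak_dBFS, rms_dBFS) for float samples in the range -1.0..1.0."""
    if not samples:
        raise ValueError("need at least one sample")

    def to_db(value):
        # 0 dBFS is full scale; pure silence is reported as negative infinity.
        return 20 * math.log10(value) if value > 0 else float("-inf")

    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak), to_db(rms)

# Hypothetical test signal: one second of a 1 kHz tone at half of full scale.
rate = 48000
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
peak_db, rms_db = peak_and_rms_dbfs(tone)
print(f"peak {peak_db:.1f} dBFS, RMS {rms_db:.1f} dBFS")
```

The peak figure shows how close you are to distortion, while the RMS figure tracks perceived loudness more closely. That is why the two numbers differ even on a steady tone.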
Your audio editor of choice most likely operates in a mode called 'waveform view'. Many NLEs have this feature as well, but you'll probably have to find and activate it. The waveform view offers a clear picture of your recorded sound from beginning to end. Tall spikes indicate loud passages, while softer sections appear visibly shorter. Using the zoom feature, it's easy to magnify various areas of the waveform for further analysis. Zooming is perfect for finding edit points in dialog and music. Learning to interpret what you see is the key to professional audio editing.
Figure 1 is a small segment of voice-over dialog for a movie trailer. What you see is the word "starring." Closer examination shows the vowels and consonants of the word. The burst of sound at the beginning is the 'S' sound, followed by an intense spike for the 'T' sound. As the 'A' sound melds into the 'R' and then the 'ING', you'll see a combination of waves superimposed on each other. While it would be difficult to guess the words and letters without hearing them, you can easily see the difference in the sounds. This makes it easy to identify small clicks, pops and breath sounds, simplifying your editing duties.
Figure 2 is a sample of music. The goal is to extract a four-measure section to loop under a DVD menu. Waveform view makes it easy to see the beats and musical changes in the piece. After a brief listen, you find the perfect section. After setting rough cue points, zoom in tight to find the actual downbeat of the starting measure. If you zoom in close enough, you'll see exactly where the beat begins. Notice where the waveform crosses the center line, moving from negative to positive territory. This point is called the zero crossing point and is important in editing your perfect loop. If you do not make your edit at this intersection, you will likely hear a small click or pop at the beginning of the loop. Most popular audio editing programs automate this process or at least provide a way to easily identify the zero crossing nearest your edit point. I realize we're talking about fractions of a frame here but, if it's easy to get it right the first time, why do it any other way? After you've set the starting point, move to the end of the loop, find the matching zero crossing (again moving from negative to positive) and move your cue point accordingly. With a little practice, this makes it easy to identify and edit the perfect loop or cut for your video project.
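If you're wondering what your editor is doing when it snaps to a zero crossing, the idea can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the test waveform and the rough cue positions are all hypothetical, and the audio is assumed to be a mono list of floating-point samples.

```python
import math

def nearest_rising_zero_crossing(samples, index):
    """Return the sample position nearest `index` where the waveform moves
    from negative to non-negative, or None if there is no such point."""
    crossings = [
        i for i in range(1, len(samples))
        if samples[i - 1] < 0 <= samples[i]
    ]
    if not crossings:
        return None
    return min(crossings, key=lambda i: abs(i - index))

# Hypothetical waveform: a sine wave that repeats every 80 samples, offset by
# half a sample so the crossings fall between samples, as they do in real audio.
wave = [math.sin(2 * math.pi * (n + 0.5) / 80) for n in range(400)]

# Rough cue points set by ear, then snapped to the nearest rising crossing.
loop_start = nearest_rising_zero_crossing(wave, 100)
loop_end = nearest_rising_zero_crossing(wave, 310)
print(loop_start, loop_end)
```

Snapping both ends to crossings that travel in the same direction means the end of the loop flows into the start without a sudden jump in level. That jump is what you hear as a click.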
The Whole Spectrum
Another useful option is spectrum or frequency analysis. You're probably familiar with the histogram view in your photo editor. Spectrum analysis is the same idea for audio: where a histogram shows how much of an image falls at each brightness level, a spectrum display shows the intensity of specific frequencies in your soundtrack. Let's say your voice-over artist sent an audio file of their performance. You've edited the best cuts, but it sounds too boomy. You could apply a graphic equalizer and fool with the controls until the sound improves, but there's a better way. Using spectrum analysis on a troublesome section, you can actually see the buildup of sound in the exact frequency range. This gives you the information you need to apply the proper amount of equalization or filtering to correct the problem.
Spectrum analysis can be applied in several different ways. The simplest is to highlight a small segment of your audio and have the program scan the content. This produces an accurate snapshot of the audio content for that particular piece. Another method is real-time analysis. Here, you play the audio while the software shows the peaks and valleys in real time. Watching your audio change over time is often the best way to assess the content and plan for adjustments. Some software allows you to load a WAV file for analysis; the program scans the entire file and displays frequency content for the entire piece.
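To make the "snapshot" method a little more concrete, here is a minimal sketch of scanning a highlighted segment for its strongest frequency. It uses a plain discrete Fourier transform so it needs nothing beyond Python's standard library. Commercial analyzers use a much faster FFT and smarter averaging, so treat this as a picture of the principle and not of how any particular product works. The function name and the "boomy" test signal are invented for the example.

```python
import cmath
import math

def magnitude_spectrum(samples, rate):
    """Return (frequency_hz, magnitude) pairs from 0 Hz to half the sample rate.
    Plain DFT: slow on long segments, but free of dependencies."""
    n = len(samples)
    # A Hann window tapers the segment's edges, which reduces the smearing
    # caused by cutting a short piece out of a longer recording.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, s in enumerate(samples)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            w * cmath.exp(-2j * math.pi * k * i / n)
            for i, w in enumerate(windowed)
        )
        spectrum.append((k * rate / n, abs(total) / n))
    return spectrum

# Hypothetical "boomy" segment: a strong 200 Hz component plus a weaker
# 2 kHz component, 400 samples at an 8 kHz sample rate.
rate = 8000
segment = [
    0.8 * math.sin(2 * math.pi * 200 * i / rate)
    + 0.2 * math.sin(2 * math.pi * 2000 * i / rate)
    for i in range(400)
]
peak_freq, peak_mag = max(magnitude_spectrum(segment, rate), key=lambda p: p[1])
print(f"strongest component near {peak_freq:.0f} Hz")
```

Once you know where the buildup sits, you know which band to pull down with your equalizer. The resolution depends on segment length: 400 samples at 8 kHz gives one reading every 20 Hz.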
The frequency analysis function in Adobe Audition offers another unique feature — pitch detection. With this option, you can actually see whether individual notes of a vocal performance are sharp or flat, and by how much. Using this information, you can apply a pitch shift to the offending note or notes, correcting the performance. Keep in mind, this only works on solo instruments and voices, but it's the type of repair that can save your project.
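How a given program decides that a note is sharp or flat is its own business, but the underlying arithmetic can be sketched. The example below assumes you already have a measured frequency in hertz and that the music uses standard equal-tempered tuning with A4 at 440 Hz. It does not detect pitch from audio, and the function name is hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note_name, cents_off) for a frequency.
    Positive cents means sharp, negative means flat."""
    # Position on the MIDI note scale, where A4 is note 69 and each
    # semitone is one step.
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    nearest = round(midi)
    cents = 100 * (midi - nearest)  # 100 cents per semitone
    name = NOTE_NAMES[nearest % 12] + str(nearest // 12 - 1)
    return name, cents

# Hypothetical reading: a singer aiming for A4 lands at 450 Hz.
name, cents = nearest_note(450.0)
print(f"{name} is off by {cents:+.0f} cents")
```

The size of the error in cents tells you how much pitch shift to apply in the opposite direction.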
Scratching the Surface
This is one of those topics that require some homework on your part. Reading about visualizing sound is very different from doing it. Experience is the key. You won't want or need all these techniques all the time but, by understanding the principles, you'll know when to use them properly. Dig into your audio software, try some of the examples and you'll soon see sound in a brand new light.
Contributing Editor Hal Robertson is a digital media producer and technology consultant.
[Sidebar: Everyone’s Different]
Adobe Audition, Apple's Soundtrack Pro, Digidesign's Pro Tools and Sony Sound Forge have advanced, integrated audio analysis tools. But what if your editor doesn't have this option, or you do your editing and mixing with your NLE? Googling for “audio spectrum analysis” will reveal several options, but the king has to be Elemental Audio's InspectorXL. This amazing plugin sports every audio display you could imagine, and a couple you couldn't. Learning to interpret these displays will take time, but this level of information puts your audio productions on the leading edge.