Saving Time in Transcribing

For the past four years, I’ve shot and sold video to companies – news outlets like MSNBC and Yahoo! News, websites like CurrentTV – and clients ranging from the New York Restaurant Association to Chinese factories while I lived in Shenzhen. All of these videos included interviews and transcription, which can make postproduction tedious. Through constant revision of my techniques, I’ve come up with an efficient and even elegant way to edit video. Here are the key tips I’ve found indispensable for scripting and editing interview-laden video.

Record Keeping: The Tedious Task

Camera operators are good at shooting. Producers are good at writing and reporting. Editors are good at editing. But these lines are becoming increasingly blurred, and integrating these jobs can be an organizational nightmare. A sophisticated method of organizing the postproduction process is essential to producing video efficiently in the YouTube age. The trouble is, the rise of freelancers who produce news packages has met the economics of supply and demand, and typical rates for these videos have fallen.

Compare the average day rate of $500 for a network TV field cameraperson to the $250 starting rate CurrentTV pays for a fully edited package. Similar companies will rarely pay more than $750 for a 5-minute video. Indeed, well under $1000 is apparently a fair price to pay for content that involves many days of labor, equipment costs, writing, narration and, of course, all postproduction. Still, you can do handsomely if you work fast. Create content efficiently and move on to the next project, or spend weeks on a single story and watch your equivalent hourly wage shrink to nothing.
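To put numbers on that last point, here is a minimal Python sketch of the arithmetic. The $750 fee comes from the figures above; the hour counts are purely illustrative assumptions, not measurements from any real project.

```python
def hourly_rate(fee_dollars, hours_worked):
    """Return the effective hourly wage for a flat-fee project."""
    return fee_dollars / hours_worked


# $750 is the package price quoted above; the hour counts are
# hypothetical (roughly two days, one week and two weeks of work).
for hours in (16, 40, 80):
    print(f"{hours} hours of work -> ${hourly_rate(750, hours):.2f} per hour")
```

The fee is fixed, so every extra hour spent hunting for footage comes straight out of the hourly figure.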

All of my projects include multiple interviews that typically require tedious transcription and logging of all the tape. If I’m not organized, a small project can start eating up days of my time through re-watching clips, locating lost footage and writing a script that is inefficient to piece together.

That’s why I have an adaptable, efficient system to convert a stack of videotapes into manageable data. The following is a step-by-step process for organizing and transcribing footage using Adobe Premiere, Dragon NaturallySpeaking and MS Word. You can tweak this for other types of video projects, platforms and NLE software.

Step 1: Logging Tape

Logging videotape is tedious and time-consuming, but trying to edit badly organized video is even more frustrating. So keep your eyes on the prize: well-organized video clips that let you breeze through the editing process.

Naming video clips has become a science for me. My tapes always have sequential numbers and the “clip name” is the project name followed by a number: project-number.avi. Say my project is a segment about fire ants. I would start with “fireants-001.avi.” As I log each successive clip, Premiere automatically increments the clip number. I ignore all the other fields – description, scene, log note – as they have proven to be time wasters.

I use the keyboard to do all the logging: I is for in point and O is for out point. I also change the keyboard shortcut to assign the letter L to do the same as clicking on the Log Clip button. Hitting L tells Premiere to add that clip to the log and rename the next clip by incrementing the number. By not touching the mouse and letting Premiere automatically name my clips, it takes me just over an hour to get through logging each 1-hour tape.

Of course, the Automatic Tape Capture tool is amazing for batch-capturing a tape with no supervision. The program uses only the date-time stamps in the video to determine the in and out points. This works wonderfully for B-roll tapes and shorter interviews. But if an interview runs a half hour or longer, I always break it into individual sound-bite clips.

The sequential numbering across all tapes is crucial. So do not restart the numbering of clips for each tape. When you log tape 2, just continue with the next clip number, and so on.
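Premiere does this numbering for you as you log, but the scheme itself is easy to sketch. The following Python snippet is only an illustration of continuous numbering across tapes; the function name, the three-digit padding and the clip counts per tape are assumptions made for the example.

```python
def clip_names(project, clips_per_tape, ext="avi", width=3):
    """Yield (tape_number, clip_name) pairs.

    The clip counter keeps running from one tape to the next,
    so no two clips in the project ever share a number.
    """
    number = 1
    for tape, count in enumerate(clips_per_tape, start=1):
        for _ in range(count):
            yield tape, f"{project}-{number:0{width}d}.{ext}"
            number += 1


# Hypothetical project: tape 1 holds three clips, tape 2 holds two.
names = list(clip_names("fireants", [3, 2]))
for tape, name in names:
    print(f"tape {tape}: {name}")
```

Because the counter never resets, the first clip on tape 2 picks up where tape 1 left off, which is what keeps every clip number unique within a project.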

Step 2: Batch Capture

Create a new project folder and capture all the clips there. I like to set the handles to 40 frames so each clip keeps a small buffer at either end, though this is not relevant if you do the auto tape capture.

By keeping video clips organized in specific folders based on the names of the projects, I can easily copy them to my editor’s external hard drive. When the editor assembles a rough cut in Premiere, she will simply email me the project file instead of an exported WMV or MP4 video. This way, I can view and tweak the timeline directly, rather than annotating the video. This collaboration doesn’t require transferring gigabytes, since we both have the same media files.

Assemble-edit all of your captured SOT (sound on tape) video clips into the timeline sequentially. These clips should contain any discernible audio, like an interview or a walk-and-talk, that you’d want transcribed. Because your interview clips will be clustered by person, it should be easy to select the first clip, Shift-select the last clip and drag them all to the timeline. Then export the entire timeline as an audio (WAV) file.

Step 3: Time Stretch

Open this WAV file in sound editing software, such as the free program Audacity, or Sound Forge. Use the Time Stretch feature to expand the audio clip to 150% of its original length. For example, a 50-minute recording will now run 75 minutes. This feature lengthens the audio while maintaining its pitch, so the resulting speech sounds a bit slow but normal enough to understand. The purpose of this step is to create an audio file that you can listen to and simultaneously repeat aloud into the transcription program, without rushing and without errors. (Normal speech runs about 150 words per minute, and the uninitiated will find it nearly impossible to listen and repeat accurately at that pace in real time. Try it by listening to a radio talk show and saying exactly what you hear for a few minutes, adding punctuation like “period” and “new paragraph,” and you’ll see how taxing it is.)
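The arithmetic behind the stretch can be sketched in a few lines of Python. This only computes durations and speaking rates; it does not process audio, and the function names are made up for the example.

```python
def stretched_minutes(original_minutes, stretch_percent=150):
    """Length of the recording after time-stretching."""
    return original_minutes * stretch_percent / 100


def effective_wpm(speech_wpm, stretch_percent=150):
    """Words per minute you actually hear after the stretch."""
    return speech_wpm * 100 / stretch_percent


print(stretched_minutes(50))  # 75.0 minutes, matching the example above
print(effective_wpm(150))     # 100.0 words per minute
```

In other words, a 150% stretch turns 150-words-per-minute speech into roughly 100 words per minute, a pace that is much easier to repeat aloud.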

Step 4: Dictate and Transcribe

Open up Dragon NaturallySpeaking, the best voice dictation program on the market. You’ll need to train it for a few minutes before transcribing interviews. The trick to accurate dictation is clear enunciation. Speak articulately and in full sentences, without saying “um” or “uh.” I can comfortably sustain about 150 words per minute with 99.99% accuracy. On any decent computer, the conversion from audio to text is almost immediate.

Put on your headset, which should have an earphone and a microphone mouthpiece. Open a new document in Dragon Pad or MS Word, though I find Dragon’s word processing program slightly more reliable than MS Word. Begin playing the WAV file in Sound Forge. As you hear the interviewee’s voice, simultaneously speak exactly what you hear into the mic. Insert punctuation (say “comma” and “period”) and say “new paragraph” whenever you hear a new clip (when the corresponding video goes from fireants-034.avi to fireants-035.avi). An audible click or hiccup will mark each new clip, or the audio will simply cut someone off mid-sentence.

This transcription technique takes some getting used to, and it requires mental stamina to consistently and accurately say what you hear, word for word, with no interruptions or mistakes. I find I cannot sustain this for much more than 15 to 20 minutes without a quick break or a drink of water. I also find it faster and more accurate to speak in a monotone, without the kind of intonation you’d use when asking a question. Just speak exactly what you hear. Doing this all day can give you a sore throat.

The upside, of course, is that you can transcribe a one-hour interview in about an hour and a half with nearly perfect accuracy. Compare this to the usual five or six hours to tediously type out what you hear, constantly re-listening to the audio and chugging along at a frustratingly slow rate.

Step 5: Clip Numbers

The main idea in this step is to align your transcript pieces with your video clips. If you do this correctly, you’ll never have to search for an interview clip.

Copy and paste the entire transcript into MS Word and apply the Numbered List option to the text. These numbers will ultimately correspond exactly with the names of your captured video clips, but you’ll need to do a bit of housekeeping first. Go into Premiere to double-check the clips for a particular interviewee and adjust the numbers from interview to interview.

For example, say you have a tape with a 35-minute interview with Tom Smith, then a few minutes of B-roll footage of Tom in his office, then a 20-minute interview with Jane Atwood. Your captured clips could be numbered like crimewatch-001.avi through crimewatch-041.avi for the first interview, then 42 through 46 for the B-roll footage, then 47 through 65 for the second interview. You’ll need to go to the Set Numbering Value menu in Word and change the starting value for the second batch of interview clips to 47. This will leave a necessary gap between the transcript of the last interview clip for Tom, 41, and the first one for Jane, 47.
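Word’s numbered list handles this for you, but the mapping itself can be sketched in Python. This is a hedged illustration only: the function name and the placeholder transcript text are assumptions, and it presumes exactly one transcript paragraph per interview clip, as the dictation step above is meant to produce.

```python
def label_paragraphs(paragraphs, clip_ranges, project):
    """Pair each transcript paragraph with its clip filename.

    clip_ranges lists the (first, last) clip numbers of each interview,
    so B-roll clips that fall between interviews are skipped.
    """
    numbers = [n for first, last in clip_ranges for n in range(first, last + 1)]
    if len(paragraphs) != len(numbers):
        raise ValueError(
            f"{len(paragraphs)} paragraphs but {len(numbers)} interview clips"
        )
    return [(f"{project}-{n:03d}.avi", text) for n, text in zip(numbers, paragraphs)]


# Placeholder transcript: 41 sound bites from Tom, then 19 from Jane.
transcript = [f"sound bite {i}" for i in range(1, 61)]
labeled = label_paragraphs(transcript, [(1, 41), (47, 65)], "crimewatch")

print(labeled[40][0])  # last clip of the first interview
print(labeled[41][0])  # first clip of the second interview
```

The length check is there to flag a missed or extra “new paragraph” during dictation: if the counts disagree, the transcript and the clips have drifted out of step somewhere.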

Step 6: Write Your Script

You probably have your own way of writing a script. Your editor is probably particular about how he or she wants the script annotated to show which clips (visual and interview) go where and how to locate them. Fortunately, we have divided our transcript into sound bites that correspond with exact video clip filenames. So when I write my script, I copy and paste the exact quotes from this transcript and include the number at the end of the sound bite. This tells the editor which clip to drop into that part of the timeline.

Of course, you’ll still need to specify B-roll, but for typical news segments or even documentaries with clear chapters, determining which visuals go with which narration is pretty straightforward.

Step 7: Edit With Ease

Editing from this script with the transcript and captured files is less tedious than traditional editing with time code. If you have an editor, you’ll copy all the video files, transcript, script and narration audio to his or her external hard drive and receive the project files via email.

If you are editing, just fire up your favorite editing program, and start dropping clips into the timeline. You’ll know immediately which interview clips are where. For example, if the script says:
SOT (Adrian Manning-12): “as the number of trees decrease over time, there’s going to be more competition for that resource, for nesting.”

You’d locate video clip number 12, and trim it to that sound bite. Now the editor can focus more time on ‘craft editing’ – the part of video editing that is creative and stylish – rather than what I call ‘blueprint’ or ‘Tetris’ editing, where the editor spends most of his time hunting down clips and dropping them into the timeline to create the rough cut.
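If your scripts follow that SOT convention consistently, the lookup can even be automated. Here is a minimal Python sketch, assuming every sound-bite line has the form SOT (Speaker-number): “quote”. The project name “myproject,” the function name and the three-digit filename padding are hypothetical choices for the example.

```python
import re

# Matches lines such as: SOT (Adrian Manning-12): "quote text"
SOT_PATTERN = re.compile(
    r"^SOT \((?P<speaker>[^)]+)-(?P<clip>\d+)\):\s*(?P<quote>.+)$"
)


def parse_sot(line, project):
    """Return speaker, clip filename and quote for a SOT line, else None."""
    match = SOT_PATTERN.match(line.strip())
    if match is None:
        return None
    clip_number = int(match.group("clip"))
    return {
        "speaker": match.group("speaker"),
        "clip_file": f"{project}-{clip_number:03d}.avi",
        "quote": match.group("quote").strip('“”"'),
    }


line = 'SOT (Adrian Manning-12): "as the number of trees decrease over time"'
bite = parse_sot(line, "myproject")
print(bite["speaker"], "->", bite["clip_file"])
```

Lines that do not match the pattern, such as narration or B-roll notes, return None, so a whole script can be run through the function line by line.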

Indeed, when I provided these files to an editor I hadn’t worked with previously, he was able to turn around a rough cut by mid-afternoon and commented on how incredibly easy it was to understand and edit using this organization of bites, clips and transcripts.

Final Advice

This procedure eliminates the tedium of locating quotes in a log that only points you toward a video clip. Both the writer and the editor accomplish more work in less time, becoming more efficient pieces of the puzzle. If you are both the writer and the editor, even better.

This procedure is not necessarily the absolute best or most efficient, but it’s the best I’ve discovered thus far. There are many paths you can take to convert a stack of videotapes into a final cut. I know there are techniques and software features that I have yet to read about on a blog or in a magazine. Some techniques are traditions, relics from the days of linear editing, while new gadgets may condense a particular series of steps into one mouse click. Either way, I remain completely open to new organizational techniques, as should you, as long as they shorten the time it takes to edit.

Jeff Novich is a freelance videographer and broadcast journalist in New York City.
