The easy, free way is to capture a short sound clip from just before or after the word you want to remove. Then drop the volume in the timeline for just that word, and trim the captured clip to the same length so it covers the gap.
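
If it helps to see the idea outside of an editor, here is a rough sketch of the same trick done on raw audio samples. It assumes the track is just a list of numbers, and `patch_word` and its parameters are names I made up for illustration, not anything from a real editing package. It copies background sound from elsewhere in the same track, cuts it to the length of the word, and drops it in place of the word.

```python
def patch_word(samples, start, end, tone_start):
    """Return a copy of samples with [start, end) replaced by background sound.

    Hypothetical helper: the replacement is copied from the same track,
    beginning at tone_start, and cut to exactly the length of the word.
    """
    length = end - start
    if length <= 0:
        raise ValueError("end must be greater than start")
    if tone_start < 0:
        raise ValueError("tone_start must not be negative")

    # Capture a clip the same length as the word being removed.
    tone = samples[tone_start:tone_start + length]
    if len(tone) < length:
        raise ValueError("not enough background sound at tone_start")

    # The captured clip must not include any part of the word itself.
    if tone_start < end and tone_start + length > start:
        raise ValueError("background clip overlaps the word being removed")

    patched = list(samples)      # leave the original track untouched
    patched[start:end] = tone    # same length in, same length out
    return patched


# Samples 4..6 are the loud "word"; samples 1..3 are quiet background.
track = [1, 2, 3, 4, 90, 95, 92, 5, 6]
print(patch_word(track, 4, 7, 1))
```

The track keeps its original length, so nothing after the word shifts out of sync with the video. This only swaps samples with a hard cut; an editor's timeline gives you finer control over the edges.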

If your software won’t capture just sound, you can capture a cut of video instead and turn the video layer off so only the sound portion is used. With a little practice you can get very fast at this, and in most cases you'll get really good results.

The only downside is having to do it manually, one word at a time. I don’t know of anything that will do it accurately as a batch.

