[vox-tech] noise removal with sox?

Shawn P. Neugebauer vox-tech@lists.lugod.org
Wed, 31 Jul 2002 17:59:02 -0700


On Wednesday 31 July 2002 12:03 am, Ryan wrote:
> On Tuesday 30 July 2002 10:42 pm, Shawn P. Neugebauer wrote:
> > I've not used sox before, but it looks cool--lots of basic DSP stuff.
> >
> > Here are a few suggestions:
> > - if fan noise is within the same frequency band as the voice, it will be
> >   difficult to remove only the noise using sox.  you would need more
> >   sophisticated tools.
>
> It's all over the place since it's from 9 computer fans (3 computers...),
> however it's not very noticeable except in the pauses between words.

I assume by "all over the place" you are hypothesizing that the fan noise has 
frequency components in the same band as the voice?  Since you are able to 
analyze periods of silence, you should be able to tell.  The frequency 
content information you supply below gives a clue:  there's more power in
the low-frequency bins.  However, there's not really enough frequency
resolution in the data to draw a strong conclusion.
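
To make the resolution point concrete, here is a rough Python sketch (my own
illustration, not something sox or audacity produces). The bin width of a
spectrum is the sampling rate divided by the number of samples analyzed, so a
longer stretch of fan-only audio gives finer bins. The function names and the
synthetic 200 Hz "hum" are assumptions made up for the example; you would feed
in samples from one of your pauses instead.

```python
import cmath
import math


def bin_width_hz(sample_rate, n_samples):
    """Spacing between adjacent frequency bins of an n_samples-point DFT."""
    return sample_rate / n_samples


def power_spectrum(samples):
    """Naive O(N^2) DFT power for bins 0..N/2 (fine for short segments)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


# Synthetic stand-in for a fan-only pause: a 200 Hz tone sampled at 8000 Hz.
sample_rate = 8000
n = 400  # 50 ms of audio, which gives 20 Hz wide bins
hum = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(n)]

spec = power_spectrum(hum)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
peak_hz = peak_bin * bin_width_hz(sample_rate, n)
print("bin width (Hz):", bin_width_hz(sample_rate, n))
print("strongest bin (Hz):", peak_hz)
```

With only a handful of bins across 0-4 kHz, a low-frequency hump and
broadband noise can look alike; more samples per analysis narrows the bins.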

> > - if you can't tell just by listening, you can get an idea of where, in
> >   frequency, the noise is located by using a spectrum analysis tool.
> >   for example, xmms has a spectrum analyzer.  there are others.
> >   if you can find one that is used for analysis, rather than for pretty
> >   pictures of music, you'll get more information from it.  a spectrogram
> >   (frequency, vertically, vs. time) will be particularly useful for
> >   visually identifying constant- or periodically-varying
> >   noise/interference.
> > - hopefully, the noise is either very low in frequency (few hundred Hz)
> >   or relatively high in frequency (>> 4 kHz).
>
> no such luck :( since it's only noticeable between words, perhaps there is a
> way to filter out low volume portions of the file.
>
> The silence effect looked helpful, but I could not figure out how to use it
> :(

The silence effect trims silence at the start or end of the file; that's not
what you want.

> > - you didn't mention the format of the speech data.  you may need to
> >   use sox to convert a coded format to, e.g., 16-bit linear before trying
> >   any of these ideas.
>
> It's 8-bit, 8000 Hz, meant for use with vgetty as my greeting message

In your case, that means you should care about the frequency range 0-4 kHz
(half the 8000 Hz sampling rate).

> Exported from audacity.....
>
[data removed]

Since you only care about the non-voice periods, I would focus your attention 
there.  You're already using audacity--I would use it to edit those periods.  
Record some silence (or use sox to heavily filter the fan-only noise).  Open 
both tracks.  Copy appropriately-sized periods of "silence" into the 
non-voice periods of your speech track.
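
If you would rather not do the copy-and-paste by hand, the same idea can be
roughed out as a simple gate. This is a hypothetical Python sketch, not a
feature of audacity or sox: it assumes the samples are already signed values
centered on zero, and the frame length and threshold are made-up numbers you
would have to tune by ear. Frames whose RMS level falls below the threshold
are replaced with digital silence.

```python
import math


def gate_quiet_frames(samples, frame_len, threshold):
    """Zero out every frame whose RMS level is below threshold."""
    out = list(samples)
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms < threshold:
            out[start:start + len(frame)] = [0.0] * len(frame)
    return out


# Synthetic example at 8000 Hz: 20 ms of quiet "fan" followed by 20 ms of
# loud "speech". Both signals are stand-ins invented for this sketch.
sample_rate = 8000
quiet = [2 * math.sin(2 * math.pi * 120 * i / sample_rate) for i in range(160)]
loud = [100 * math.sin(2 * math.pi * 400 * i / sample_rate) for i in range(160)]

gated = gate_quiet_frames(quiet + loud, frame_len=80, threshold=10)
print("quiet part silenced:", all(x == 0.0 for x in gated[:160]))
print("loud part untouched:", gated[160:] == loud)
```

A hard gate like this can clip the soft beginnings and ends of words, which
is one reason editing the pauses by hand may still sound better.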

A second option would be to try sox.  It looks like your fan noise has more 
power in the low frequencies (it's hard to tell without higher frequency 
resolution and/or looking at frequency content over time).  This implies that 
you *might* be able to band-pass filter your speech.  You could try the
filter options in sox with a few combinations of center frequency and 
bandwidth (center should be about 1800 Hz; vary the bandwidth over, say,
2000-3000 Hz).  This would attack the fan noise over the entire track, not
just during the periods of silence, but it might not do a good enough job 
during those periods.
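
To get a feel for what such a band-pass does before experimenting, here is a
Python sketch. It is not sox's filter: I am assuming a generic second-order
band-pass (the constant-peak-gain form from the well-known Audio EQ Cookbook)
and approximating Q as center divided by bandwidth. The function names are
hypothetical. It compares how much a 100 Hz tone and an 1800 Hz tone survive
a filter centered at 1800 Hz with a 2500 Hz bandwidth.

```python
import math


def bandpass_coeffs(center_hz, bandwidth_hz, sample_rate):
    """Second-order band-pass coefficients, normalized so a[0] == 1."""
    w0 = 2 * math.pi * center_hz / sample_rate
    q = center_hz / bandwidth_hz  # assumption: Q approximated as f0 / BW
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a


def apply_biquad(samples, b, a):
    """Run samples through the filter (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def steady_state_gain(freq_hz, center_hz, bandwidth_hz, sample_rate=8000):
    """Output/input RMS ratio for a pure tone, skipping the start-up transient."""
    n = 8000
    tone = [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    b, a = bandpass_coeffs(center_hz, bandwidth_hz, sample_rate)
    filtered = apply_biquad(tone, b, a)
    return rms(filtered[n // 2:]) / rms(tone[n // 2:])


for freq in (100, 1800):
    print(freq, "Hz gain:", round(steady_state_gain(freq, 1800, 2500), 3))
```

A single second-order section rolls off gently, so low-frequency fan noise is
reduced rather than removed; that fits the caveat that filtering alone might
not be enough in the pauses.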

I do not think sox has a magic solution to your problem.  Noise removal is a 
common problem in industry, but speech processing is far from "canned."

shawn.