[vox-tech] Marking Audio file based on Freq.
Bill Broadley
bill at cse.ucdavis.edu
Wed Mar 12 23:30:07 PDT 2008
>I think I'm on the right track now. What I figured out is that I can use
>whatever application I want to find the start times for the region of
>interest and easily write those as a label track for Audacity (see the .aup
>file; it's XML-ish)
Cool. Not sure what good Audacity would be doing at that point; I'm pretty
sure it would take just a line or two to start playing audio from the .wav
file at a detection point, or even at a cursor.
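
A minimal sketch of that "line or two", assuming an uncompressed PCM .wav
file and Python's standard wave module. The helper name read_from is made up
for illustration, and actual playback would need an audio library that is
not shown here; this only pulls out the bytes starting at a detection point.

```python
import wave

def read_from(path, start_seconds, duration_seconds):
    """Return raw PCM bytes starting at start_seconds (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        # Clamp so a detection time past the end yields empty data, not an error.
        start = min(int(start_seconds * rate), wav.getnframes())
        wav.setpos(start)  # seeking is cheap because the data is uncompressed
        return wav.readframes(int(duration_seconds * rate))
```

The returned bytes could then be handed to whatever playback or analysis
code is in use.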
>So python is up on the top of the list now. I also poked around and took a
>look at the new VAMP Plugin system which is directed towards analysis and
>could be integrated into Audacity or Sonic Visualiser (
>http://www.sonicvisualiser.org/ ). The other application I've tinkered with
>is Praat ( http://www.fon.hum.uva.nl/praat/ ) which is for speech recognition
>but that might be more hassle than it's worth.
Interesting, I hadn't known about those.
> This linked file has:
> A sound sample
Was the sample recorded in stereo? It's harder to work with if it's fake
stereo. I was hoping for a mono .wav file or similar so I could suck it into
a program easily. Uncompressed data also makes it much easier to seek around
to the interesting parts.
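
A rough sketch of checking for fake stereo and getting down to mono,
assuming a 16-bit PCM .wav file. The name load_mono is hypothetical, and
"fake" here just means the left and right channels are sample-for-sample
identical, which is only one possible definition.

```python
import array
import sys
import wave

def load_mono(path):
    """Return (rate, mono_samples, was_fake_stereo) for a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        data = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # WAV samples are little-endian
    if channels == 1:
        return rate, list(data), False
    if channels != 2:
        raise ValueError("this sketch only handles mono or stereo")
    left, right = data[0::2], data[1::2]  # stereo samples are interleaved L, R
    fake = left == right
    mono = [(l + r) // 2 for l, r in zip(left, right)]
    return rate, mono, fake
```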
> An audacity project with label track
> A spectrogram screenshot
Normally I'd display frequency on a log-based graph. I doubt you are
really seeing over 50 dB at 0 Hz. Hell, I don't think MP3s even encode
anything below 20 Hz or so. I wouldn't usually use a lossy compression
designed to fool the psychoacoustic characteristics of the human ear for
research-related data, but in this case I don't think it would make much
difference. Not to mention there may well be ultrasonics involved, though
those are of course not necessary for simple recognition.
Here's a log-based graph of the same data:
http://cse.ucdavis.edu/~bill/out.png
> A spectrogram text dump (looks like 10-60Hz is the region of interest)
> http://ftp.dfg.ca.gov/Public/RAP/Projects/GGOW/GGOWspectrum.zip
Looks like 42 Hz to 900 Hz on the log-based graph. Hard to say if 20 Hz
is real, background noise, microphone limitations, or the result of the
MP3 encoder.
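
One possible way to turn a band like that into start times, sketched in
plain Python. It uses the Goertzel recurrence to measure power at each DFT
bin inside the band, which is a technique swapped in for illustration, not
something used in this thread. The function names, the 0.25 s frame length,
and the 0.5 threshold are all assumptions that would need tuning against
real recordings.

```python
import math

def goertzel_power(frame, rate, freq):
    """Squared DFT magnitude of frame at freq Hz (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in frame:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def band_fraction(frame, rate, lo_hz=42.0, hi_hz=900.0):
    """Approximate fraction of the frame's energy between lo_hz and hi_hz."""
    n = len(frame)
    energy = sum(x * x for x in frame)
    if energy == 0:
        return 0.0
    step = rate / n  # spacing of DFT bins in Hz
    first = math.ceil(lo_hz / step)
    last = min(math.floor(hi_hz / step), (n - 1) // 2)  # stay below Nyquist
    in_band = sum(goertzel_power(frame, rate, k * step)
                  for k in range(first, last + 1))
    # Parseval: all N bins sum to n * energy; the factor of 2 accounts for
    # the mirrored negative-frequency bins of a real signal.
    return 2.0 * in_band / (n * energy)

def detection_starts(samples, rate, frame_seconds=0.25, threshold=0.5):
    """Start times (seconds) of runs of frames dominated by the band."""
    size = int(frame_seconds * rate)
    starts, active = [], False
    for offset in range(0, len(samples) - size + 1, size):
        hit = band_fraction(samples[offset:offset + size], rate) > threshold
        if hit and not active:
            starts.append(offset / rate)
        active = hit
    return starts
```

This is slow in pure Python and does no windowing, so it is only meant to
show the idea; the start times it returns are the kind of thing that could
be written out as labels.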
> Thanks for the help, I guarantee at least one lugod talk later this year (after I finish my thesis) in exchange.
Cool, I'd like to see more science-based talks. I'm pretty sure with a .wav
file I could put together a pretty, colorful, side-scrolling spectrogram like
http://www.onlamp.com/python/2001/01/31/graphics/num_py_2.gif with a few lines
of code.
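
As a rough idea of what those few lines might look like, here is a sketch
that computes the spectrogram data (not the picture) in plain Python. The
function names, the frame size, and the dB floor are assumptions made for
illustration; in practice numpy.fft.rfft would replace the hand-rolled FFT,
and a plotting library would handle the colors and the log frequency axis.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def to_db(magnitude, floor_db):
    """Convert a magnitude to dB, clamping silence to floor_db."""
    if magnitude <= 0.0:
        return floor_db
    return max(20.0 * math.log10(magnitude), floor_db)

def spectrogram_db(samples, size=1024, hop=512, floor_db=-120.0):
    """One row of dB values per Hann-windowed frame, for bins 0..size/2."""
    if size < 2 or size & (size - 1):
        raise ValueError("size must be a power of two")
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / size)
              for i in range(size)]
    rows = []
    for start in range(0, len(samples) - size + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + size], window)]
        spectrum = fft(frame)[: size // 2 + 1]  # keep non-negative frequencies
        rows.append([to_db(abs(c), floor_db) for c in spectrum])
    return rows
```

Bin k of each row corresponds to k * rate / size Hz, so plotting the rows
against a log-scaled frequency axis gives the kind of graph described above.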