Constructing several wav files from one source wav - python

I have one source .wav file in which I have recorded 5 tones separated by silence.
I need to make 5 different .wav files, each containing only one of these tones (without any silence).
I am using scipy.
I was trying to do something similar to this post: constructing a wav file and writing it to disk using scipy
but it does not work for me.
Can you please advise me how to do it?
Thanks in advance

You need to perform silence detection to identify where each tone starts and ends. You can use a measure of magnitude, such as the root mean square (RMS), computed over short frames of the signal.
Here's how to compute an RMS using scipy
Depending on your noise levels, this may be hard to do programmatically. You may want to consider simply using an audio editor, such as Audacity.
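As a rough sketch of the RMS approach (not a definitive implementation): split the signal into short frames, compute the RMS of each frame, and treat runs of frames above a threshold as one tone. The function names, the 20 ms frame length, the threshold of 10% of the loudest frame, and the output file pattern are all assumptions made for illustration; a noisy recording would need different values.

```python
import numpy as np
from scipy.io import wavfile


def frame_rms(samples, frame_len):
    """Return the RMS of each consecutive, non-overlapping frame."""
    n_frames = len(samples) // frame_len
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=1))


def find_segments(samples, rate, frame_ms=20, rel_threshold=0.1):
    """Return (start, end) sample indices of the non-silent regions.

    frame_ms and rel_threshold are illustrative defaults, not tuned values.
    """
    frame_len = max(1, int(rate * frame_ms / 1000))
    rms = frame_rms(samples, frame_len)
    if rms.size == 0:
        return []
    loud = rms > rel_threshold * rms.max()
    # Pad with False so every loud run has a rising and a falling edge.
    padded = np.concatenate(([False], loud, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return [(s * frame_len, e * frame_len) for s, e in zip(starts, ends)]


def split_wav(path, out_pattern="tone_{}.wav"):
    """Write one WAV file per detected non-silent segment."""
    rate, data = wavfile.read(path)
    # Detect on a mono mix, but write the original channels.
    mono = data if data.ndim == 1 else data.mean(axis=1)
    segments = find_segments(mono, rate)
    for i, (start, end) in enumerate(segments, 1):
        wavfile.write(out_pattern.format(i), rate, data[start:end])
    return segments
```

Calling split_wav("source.wav") (a hypothetical file name) would then write tone_1.wav, tone_2.wav, and so on, one file per detected tone. Segment boundaries are only accurate to one frame, so a shorter frame gives tighter cuts at the cost of a noisier RMS estimate.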

Related

Is there a function which converts an audio file into an array of Hz?

I'm working on a project and trying to compare one audio file with another.
I want to know the frequency (in Hz) of each signal in the audio and then check whether it is an accurate note or not.
For this I have tried the librosa library, but with it you need to create a mel spectrogram and then work on the spectrogram in order to find notes...
Is there a simpler way? I would prefer one without a CNN.
Thanks
*Maybe it's not as complicated to convert the spectrogram as I thought... waiting for your help.

Spectrogram image to Audio

I want to write a python script which takes the input as the image of the spectrogram and generates the audio from it. Is there a way to convert the image of spectrogram into corresponding audio ?
I believe that there must be a way to reverse engineer the image of spectrogram to generate the audio. Can someone please help me with the same?
By a strange coincidence, I also needed to do this to recover audio for which only the spectrograms were available, but could find no tools to do this so wrote it myself in C. It's not simple and the results are, as user14325 rightly points out, very noisy compared to the originals, partly due to the low time resolution of most spectrograms but mostly because the phase information for each data point is lost and has to be invented.
However, if you are interested, you will find a brief description at
https://wikidelia.net/wiki/Spectrograms#Inverse_spectrograms
and you can find the code by following the "even hairier custom software" link and checking the files named "run.*" (the rest of the code there is for log-freq-axis forward spectrograms)

How to extract audio after particular sound?

Let's say I have a few very long audio files (e.g., radio recordings). I need to extract the 5 seconds after a particular sound (e.g., an ad start sound) from each file. Each file may contain 3-5 such sounds, so I should get (3-5) x (number of source files) result files.
I found librosa and scipy python libraries, but not sure if they can help. What should I start with?
You could start by calculating the correlation of the signal with your particular sound. Not sure if librosa offers this. I'd start with scipy.signal.correlate or scipy.signal.convolve.
Not sure what your background is. Start here if you need some theory.
Basically the correlation will be high if the audio matches your particular signal or is very similar to it. After identifying these positions you can select an area around them.
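A minimal sketch of that idea, assuming both the long recording and the particular sound are already loaded as 1-D NumPy arrays at the same sample rate. The function names and the 0.7 relative threshold are assumptions for illustration. Note that this uses plain (unnormalized) correlation, so very loud passages can also score high; a real recording may need normalization or a stricter threshold.

```python
import numpy as np
from scipy import signal


def find_occurrences(audio, template, rel_threshold=0.7):
    """Return start indices (in samples) where `template` appears in `audio`."""
    audio = np.asarray(audio, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    # mode="valid" gives one score per alignment with full overlap,
    # so index i in `corr` means the template starts at audio[i].
    corr = signal.correlate(audio, template, mode="valid", method="fft")
    # Keep peaks close to the strongest match, at least one template apart.
    peaks, _ = signal.find_peaks(
        corr, height=rel_threshold * corr.max(), distance=len(template)
    )
    return peaks


def clips_after(audio, template, rate, seconds=5.0):
    """Return the `seconds` of audio that follow each detected occurrence."""
    clip_len = int(seconds * rate)
    clips = []
    for start in find_occurrences(audio, template):
        begin = start + len(template)
        clips.append(audio[begin:begin + clip_len])
    return clips
```

Each returned clip could then be written out with scipy.io.wavfile.write. One limitation of this sketch: find_peaks does not report a peak at the very first or last sample, so an occurrence exactly at the start of the file would be missed.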

Generate volume curve from mp3

I'm trying to build something in python that can analyze an uploaded mp3 and generate the necessary data to build a waveform graphic. Everything I've found is much more complex than I need. Ultimately, I'm trying to build something like you'd see on SoundCloud.
I've been looking into numpy and FFTs, but it all seems more complicated than I need. What's the best approach to this? I'll build the actual graphic using canvas, so don't worry about that part of it, I just need the data to plot.
An MP3 file is an encoded version of a waveform. Before you can work with the waveform, you must first decode the MP3 data into a PCM waveform. Once you have PCM data, each sample represents the waveform's amplitude at that point in time. If we assume an MP3 decoder outputs signed 16-bit values, your amplitudes will range from -32768 to +32767. If you normalize the samples by dividing each by 32768, the waveform samples will then range between +/- 1.0.
The issue really is one of MP3 decoding to PCM. As far as I know, there is no native python decoder. You can, however, use LAME, called from python as a subprocess or, with a bit more work, interface the LAME library directly to Python with something like SWIG. Not a trivial task.
Plotting this data then becomes an exercise for the reader.
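For the data reduction step, here is a small sketch that assumes the MP3 has already been decoded to signed 16-bit PCM held in a NumPy array (the decoding itself is not shown). The function name and the default of 1000 points are assumptions; the idea is simply to keep the peak absolute amplitude of each bucket of samples, which gives a SoundCloud-style outline.

```python
import numpy as np


def volume_curve(pcm, n_points=1000):
    """Reduce signed 16-bit PCM samples to `n_points` peak levels in [0, 1]."""
    samples = np.asarray(pcm, dtype=np.float64) / 32768.0
    if samples.ndim == 2:            # (frames, channels): mix down to mono
        samples = samples.mean(axis=1)
    if samples.size == 0:
        return []
    n_points = min(n_points, len(samples))
    # Split into n_points nearly equal buckets and keep each bucket's peak.
    buckets = np.array_split(np.abs(samples), n_points)
    return [float(b.max()) for b in buckets]
```

The result is a plain list of floats, so it can be serialized to JSON and handed to the canvas code directly. Using the mean or RMS of each bucket instead of the peak would give a smoother, quieter-looking curve.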
I suggest using Pygame if you don't want to deal with the inner workings of the mp3 file format.
Pygame is a multimedia library which can open common audio file formats - including .mp3 and .ogg - as "Sound" objects. If you have Numpy installed underneath, you can browse the uncompressed (that is, already decoded) sound using the pygame.sndarray.array call, which returns a numpy array object with the sound samples.
I've found a little trick - be sure to call pygame.mixer.init with the same parameters (for frequency, bit sample size and number of channels) as your .mp3 file has, or the call to sndarray.array may raise an Exception.
Check the documentation at http://www.pygame.org/docs/

Extracting an amplitude list from *.wav file for use in Python

I'm having a little bit of programming and conversion trouble. I'm designing an AI to recognize notes played by instruments and need to extract the raw sound data from a wave file. My objective is to perform an FFT operation over chunks of time in the file for use by the AI. For this I need an amplitude list of the audio file, but I can't seem to find a conversion technique that will work. The files start as MP3s and then I convert them to wav files, but I always end up with a compressed file that spits out gibberish when I try to read it. Does anyone know how I might convert the wav file to something that would be compatible with Python's wave module or even something that would directly convert the data into an amplitude list?
The default Python wave module isn't very thorough. You might try the one included in scipy as an alternative.
Check out: Reading *.wav files in Python
If you're going to do any numerical heavy lifting with the audio, scipy might be your best option anyway.
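A short sketch of the scipy route, assuming the file is an uncompressed PCM WAV. The function names, the chunk size of 4096 samples, and the Hann window are assumptions for illustration, not something the question specifies.

```python
import numpy as np
from scipy.io import wavfile


def read_amplitudes(path):
    """Return (sample_rate, mono samples as float64) from a PCM WAV file."""
    rate, data = wavfile.read(path)
    if data.ndim == 2:               # (frames, channels): mix down to mono
        data = data.mean(axis=1)
    return rate, data.astype(np.float64)


def chunk_spectra(samples, rate, chunk_size=4096):
    """Yield (frequencies, magnitudes) for consecutive chunks of the signal."""
    freqs = np.fft.rfftfreq(chunk_size, d=1.0 / rate)
    window = np.hanning(chunk_size)  # reduces spectral leakage at chunk edges
    for start in range(0, len(samples) - chunk_size + 1, chunk_size):
        chunk = samples[start:start + chunk_size] * window
        yield freqs, np.abs(np.fft.rfft(chunk))
```

For each chunk, freqs[np.argmax(magnitudes)] gives the strongest frequency, which is a crude first guess at the note being played. A trailing partial chunk shorter than chunk_size is dropped by this sketch.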
SoX's .dat files are plain text, so Python can read them easily. You can use SoX to turn mp3s or wavs or whatever into .dat files that are simply a text list of "time - Left amp - Right amp"
The command is simply
sox soundfile.mp3 soundfile.dat
http://sox.sourceforge.net/
SoX is a command-line tool - I run it in Terminal on my Mac, but any shell on a system where SoX is installed should work.
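Reading the resulting file back in Python could look roughly like this. It is a sketch that assumes the usual layout of SoX's text output: header lines starting with ";" followed by whitespace-separated columns of time and one amplitude per channel. The function name is made up for illustration.

```python
import numpy as np


def read_sox_dat(source):
    """Return (times, amplitudes) parsed from SoX .dat text output."""
    # Lines starting with ";" are treated as header comments and skipped.
    table = np.loadtxt(source, comments=";", ndmin=2)
    times = table[:, 0]
    amplitudes = table[:, 1:]        # one column per channel
    return times, amplitudes
```

For example, times, amps = read_sox_dat("soundfile.dat") would give the left channel as amps[:, 0]. Text files get large quickly for long recordings, so this is best suited to short clips.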
Hope that helps!
You might want to look at Pure Data too, it's got some nice FFT transforms built into an intuitive graphical programming language.
