I am working on a project where I need to extract the Mel-Frequency Cepstral Coefficients (MFCC) from audio signals. The first step in this process is to read the audio file into Python.
The audio files I have are stored in a .sph format. I am unable to find a method to read these files directly into Python. I would like to have the sampling rate, and a NumPy array with the data, similar to how wav read works.
Since the audio files I will be dealing with are large in size, I would prefer not to convert to .wav format for reading. Could you please suggest a possible method to do so?
I was against converting to a .wav file as I assumed it would take a lot of time. That is not the case. So, converting using SoX suited my needs.
The following batch script, when run in a Windows folder, converts every .sph file in that folder to a .wav file.
cd %~dp0
for %%a in (*.sph) do sox "%%~a" "%%~na.wav"
pause
After this, the following command can be used to read the file.
import scipy.io.wavfile as wav
(rate,sig) = wav.read("file.wav")
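For anyone not on Windows, the batch loop above could be sketched in Python roughly as follows. This assumes the `sox` executable is on your PATH; `sox_command` and `convert_folder` are hypothetical helper names, not part of SoX or any library.

```python
import subprocess
from pathlib import Path


def sox_command(sph_path):
    """Build the SoX command that converts one .sph file to a .wav next to it."""
    sph_path = Path(sph_path)
    return ["sox", str(sph_path), str(sph_path.with_suffix(".wav"))]


def convert_folder(folder="."):
    """Convert every .sph file in `folder` (assumes `sox` is installed)."""
    for sph in sorted(Path(folder).glob("*.sph")):
        subprocess.run(sox_command(sph), check=True)


# Usage (not run here): convert_folder("path/to/audio")
```

Each converted file can then be read with `scipy.io.wavfile.read` as shown above.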
Based on ben's answer, I was able to read a .sph file with librosa, as it can read everything that audioread and ffmpeg can read.
import os
import librosa
import librosa.display # You need this in librosa to be able to plot
import matplotlib.pyplot as plt
clip_dir = os.path.join("..","babel","LDC2016S10.sph")
audio,sr = librosa.load(clip_dir,sr=16000) # audio is a numpy array, resampled to 16 kHz
fig, ax = plt.subplots(figsize=(15,8))
# waveplot was removed in librosa 0.10; waveshow is its replacement
librosa.display.waveshow(audio, sr=sr, ax=ax)
ax.set(title="LDC2016S10.sph waveform")
plt.show()
You can read sph files via audioread with ffmpeg codecs.
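If installing ffmpeg is not an option, an uncompressed PCM .sph file can also be parsed by hand, since a NIST SPHERE file is a plain-text header followed by raw samples. The sketch below is a minimal reader under that assumption. `read_sphere_pcm` is a hypothetical helper name, and files whose `sample_coding` is anything other than plain `pcm` (for example shorten-compressed or u-law data) are rejected rather than decoded.

```python
import numpy as np


def read_sphere_pcm(path):
    """Return (sample_rate, samples) for an uncompressed PCM NIST SPHERE file."""
    with open(path, "rb") as f:
        if f.readline().strip() != b"NIST_1A":
            raise ValueError("not a NIST SPHERE file")
        header_size = int(f.readline().strip())  # usually 1024
        f.seek(0)
        header = f.read(header_size).decode("ascii", errors="replace")
        raw = f.read()  # everything after the header is sample data

    # Header lines look like: "sample_rate -i 16000"
    fields = {}
    for line in header.splitlines()[2:]:
        if line.strip() == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) == 3:
            name, kind, value = parts
            fields[name] = int(value) if kind == "-i" else value.strip()

    if fields.get("sample_coding", "pcm") != "pcm":
        raise ValueError("only uncompressed PCM is handled in this sketch")

    width = fields.get("sample_n_bytes", 2)
    # "01" means little-endian, "10" means big-endian
    order = "<" if fields.get("sample_byte_format", "01").startswith("01") else ">"
    samples = np.frombuffer(raw, dtype=f"{order}i{width}")

    channels = fields.get("channel_count", 1)
    if channels > 1:
        samples = samples[: len(samples) // channels * channels].reshape(-1, channels)
    return fields["sample_rate"], samples
```

The return value mirrors `scipy.io.wavfile.read`: the sampling rate first, then a NumPy array of samples.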
I'm working on a project where I'm trying to compare one audio file with another.
I want to know the frequency (in Hz) of each signal in the audio and then check whether it is an accurate note or not.
For this I have tried the librosa library, but with it you need to create a mel spectrogram and then work on the spectrogram in order to find the notes...
Is there a simpler way? I would prefer one without a CNN.
Thanks
*Maybe it's not as complicated to convert the spectrogram as I thought... waiting for your help.
How do I open .NPY files in python so that I can read them?
I've been trying to run some code I've found, but it outputs .NPY files, so I can't tell if it's working.
*.npy files are binary files to store numpy arrays. They are created with
import numpy as np
data = np.random.normal(0, 1, 100)
np.save('data.npy', data)
And read in like
import numpy as np
data = np.load('data.npy')
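To check whether the code that produced the file is working, it usually helps to look at the array's shape and dtype before printing it. A small round-trip sketch (the temporary file name is just an example):

```python
import os
import tempfile

import numpy as np

# Save a small example array, then load it back and inspect it
path = os.path.join(tempfile.mkdtemp(), "example_data.npy")
np.save(path, np.arange(6, dtype=np.float64).reshape(2, 3))

data = np.load(path)
print(type(data))   # the loaded object is a regular numpy.ndarray
print(data.shape)   # dimensions of the stored array
print(data.dtype)   # element type of the stored array
print(data)         # the values themselves
```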
Late reply but I think NPYViewer is a tool that can help you, as it allows you to quickly visualize the contents of .npy files without having to write code. It also has options to visualize 2D .npy arrays as grayscale images as well as 3D point clouds.
Reference: https://github.com/csmailis/NPYViewer
I am trying to implement the package:
https://pyradiomics.readthedocs.io/en/latest/usage.html
It looks super simple, but they expect .nrrd files.
My files are .nii.gz. How do I solve this?
Also, has anyone tried to apply PyRadiomics to TCIA data? If so, can I see your GitHub repository or Jupyter Notebook?
Thanks a lot.
You could first load the NII file into a NumPy array with nibabel and then write it out as NRRD with the nrrd module (the pynrrd package):
import numpy as np
import nibabel as nib
import nrrd
# Load NII
example_filename = "image.nii.gz"
image = nib.load(example_filename)
# Turn into numpy array
array = np.array(image.dataobj)
# Save NRRD
# Note: this writes the voxel data only; spacing and orientation
# from the NIfTI header are not carried over
nrrd_path_to = "image.nrrd"
nrrd.write(nrrd_path_to, array)
Although the examples are in .nrrd, PyRadiomics uses SimpleITK for image operations. This allows PyRadiomics to support a whole range of image formats, including .nii.gz. You don't have to convert them.
The DWIConverter converts diffusion-weighted MR images in DICOM series into nrrd format for analysis in Slicer. It parses the DICOM header to extract the necessary information about measurement frame, diffusion weighting directions, b-values, etc., and writes out a nrrd image. For non-diffusion weighted DICOM images, it loads in an entire DICOM series and writes out a single volume in a .nhdr/.raw pair.
So converting your data to the nrrd format with this tool is a possibility, provided it is available as a DICOM series. You can also look at SlicerDMRI, which is a similar module.
I have one source .wav file in which I have recorded 5 tones separated by silence.
I have to make 5 different .wav files, each containing only one of these tones (without any silence).
I am using scipy.
I was trying to do something similar to this post: constructing a wav file and writing it to disk using scipy
but it does not work for me.
Can you please advise me how to do it?
Thanks in advance
You need to perform silence detection to identify where each tone starts and ends. You can use a measurement of magnitude, such as the root mean square (RMS).
The RMS of a block of samples is the square root of the mean of their squared values, which NumPy or SciPy can compute directly.
Depending on your noise levels, this may be hard to do programmatically. You may want to consider simply using an audio editor, such as Audacity.
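As a rough illustration of the RMS approach, the sketch below cuts the signal into short frames, computes the RMS of each frame, and reports the sample ranges where the RMS stays above a fraction of the loudest frame. `split_on_silence` is a hypothetical helper name, and the 20 ms frame length and 5% threshold are assumptions that would need tuning for a real recording.

```python
import numpy as np


def split_on_silence(sig, rate, frame_ms=20, threshold=0.05):
    """Return (start, end) sample indices of the non-silent stretches."""
    sig = np.asarray(sig, dtype=np.float64)
    if sig.ndim > 1:                      # mix stereo down to mono
        sig = sig.mean(axis=1)
    frame = max(1, int(rate * frame_ms / 1000))
    n_frames = len(sig) // frame
    if n_frames == 0:
        return []
    frames = sig[: n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))

    # A frame counts as "tone" if its RMS exceeds a fraction of the peak RMS
    active = rms > threshold * rms.max()

    # Locate the frame indices where activity switches on or off
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]
    return [(int(s) * frame, int(e) * frame) for s, e in zip(starts, ends)]


# Usage (not run here), writing one file per tone with scipy:
#   from scipy.io import wavfile
#   rate, sig = wavfile.read("source.wav")
#   for i, (a, b) in enumerate(split_on_silence(sig, rate)):
#       wavfile.write(f"tone_{i + 1}.wav", rate, sig[a:b])
```

With a noisy recording, very short detections may need to be discarded or merged before writing the files.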
I'm having a little bit of programming and conversion trouble. I'm designing an AI to recognize notes played by instruments and need to extract the raw sound data from a wave file. My objective is to perform an FFT operation over chunks of time in the file for use by the AI. For this I need an amplitude list of the audio file, but I can't seem to find a conversion technique that will work. The files start as MP3s and then I convert them to wav files, but I always end up with a compressed file that spits out gibberish when I try to read it. Does anyone know how I might convert the wav file to something that would be compatible with Python's wave module, or even something that would directly convert the data into an amplitude list?
The default Python wave module isn't very thorough. You might try the one included in scipy as an alternative.
Check out: Reading *.wav files in Python
If you're going to do any numerical heavy lifting with the audio, scipy might be your best option anyway.
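Once the samples are in a NumPy array (for instance from `scipy.io.wavfile.read`), the FFT over chunks described in the question could look roughly like this. It is only a sketch: `dominant_frequencies` is a hypothetical helper name, and the chunk size of 4096 samples and the Hann window are assumptions, not requirements.

```python
import numpy as np


def dominant_frequencies(sig, rate, chunk_size=4096):
    """Return the strongest frequency (in Hz) of each consecutive chunk."""
    sig = np.asarray(sig, dtype=np.float64)
    if sig.ndim > 1:                      # mix stereo down to mono
        sig = sig.mean(axis=1)
    window = np.hanning(chunk_size)       # reduces spectral leakage
    freqs = np.fft.rfftfreq(chunk_size, d=1.0 / rate)
    peaks = []
    for start in range(0, len(sig) - chunk_size + 1, chunk_size):
        chunk = sig[start:start + chunk_size] * window
        spectrum = np.abs(np.fft.rfft(chunk))
        peaks.append(freqs[np.argmax(spectrum)])
    return np.array(peaks)


# Usage (not run here):
#   from scipy.io import wavfile
#   rate, sig = wavfile.read("file.wav")
#   print(dominant_frequencies(sig, rate))
```

The frequency resolution is `rate / chunk_size`, so larger chunks separate neighbouring notes better at the cost of time resolution.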
I believe Python can read .dat files. You can use SoX to turn mp3s or wavs or whatever into .dat files that are simply a text list of "time - Left amp - Right amp"
The command is simply
sox soundfile.mp3 soundfile.dat
http://sox.sourceforge.net/
SoX is a command-line tool. I run it in Terminal on my Mac, but any shell should work, depending on your operating system.
Hope that helps!
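Reading the resulting .dat file back into Python could be sketched as below. This assumes the usual SoX text layout: comment lines starting with `;` (sample rate and channel count), followed by one row per sample with the time in the first column and one amplitude column per channel. `read_sox_dat` is a hypothetical helper name.

```python
import io

import numpy as np


def read_sox_dat(source):
    """Return (times, amplitudes) from a SoX .dat file path or file object."""
    # Lines beginning with ';' are header comments and are skipped
    table = np.loadtxt(source, comments=";", ndmin=2)
    return table[:, 0], table[:, 1:]


# Small in-memory example standing in for a real soundfile.dat
example = io.StringIO(
    "; Sample Rate 8000\n"
    "; Channels 2\n"
    "0         0.10 -0.10\n"
    "0.000125  0.20 -0.20\n"
)
times, amps = read_sox_dat(example)
print(times)       # time stamp of each sample, in seconds
print(amps.shape)  # (number of samples, number of channels)
```

Text files like this are much larger than the original audio, so for long recordings reading a .wav directly is likely the better choice.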
You might want to look at Pure Data too, it's got some nice FFT transforms built into an intuitive graphical programming language.