I'm trying to use librosa to read an .opus file but it runs forever and doesn't load anything (I've waited for around 30 minutes for a 51MB file and still nothing).
Here is the code I am using
path_to_opus = '/my/path/to/file.opus'
y, sr = librosa.load(path_to_opus, sr=16000)
Is there a good way of reading .opus audio files in python fast?
Thanks!
Looking at the librosa documentation, you can specify a res_type argument that sounds useful for you.
This is a quote from the doc:
res_type : str
By default, this uses resampy’s high-quality mode (‘kaiser_best’).
To use a faster method, set res_type=’kaiser_fast’.
To use scipy.signal.resample, set res_type=’scipy’.
You could try something like:
X, sr = librosa.load('myfile.opus', res_type='kaiser_fast', ...)
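Since the quoted docs mention res_type='scipy', here is a small, self-contained sketch of what that resampler does under the hood, using scipy.signal.resample directly on a synthetic signal (the signal and rates here are made up; no .opus file is read):

```python
import numpy as np
from scipy.signal import resample

# A synthetic 1-second, 48 kHz signal (a 440 Hz sine) stands in for
# audio decoded from an .opus file.
orig_sr, target_sr = 48000, 16000
t = np.arange(orig_sr) / orig_sr
y = np.sin(2 * np.pi * 440 * t)

# FFT-based resampling -- what librosa dispatches to for res_type='scipy'.
n_target = int(len(y) * target_sr / orig_sr)
y_16k = resample(y, n_target)
print(len(y), '->', len(y_16k))  # 48000 -> 16000
```

Note that res_type only changes the resampling step; if the decode itself is the bottleneck, a faster resampler won't help.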
My problem
I'm trying to fit a (machine-learning) model that takes in an audiofile (.wav) and predicts the emotion from it (multi-label classification).
I'm trying to read the sample rate and signal from the file, but when calling read(filename) from scipy.io.wavfile, I'm getting ValueError: Incomplete wav chunk.
What I've tried
I've tried switching from scipy.io.wavfile.read() to librosa.load().
They both output the signal and sample rate, but for some reason librosa takes far longer than scipy, which is impractical for my task.
I've tried sr, y = scipy.io.wavfile.read(open(filename, 'r')) as suggested here, to no avail.
I've tried looking into my files and checking what might cause it:
Out of all 2084 wav files, 1057 were good (=scipy managed to read them), and
1027 were bad (=raised the error).
I couldn't seem to find anything pointing to what makes a file pass or fail, but it's nonetheless a strange result, as all the files are taken from the same dataset, from the same origin.
I've heard people saying I could just re-export the files as wav using some software, and it should work.
I didn't try this because (a) I don't have any audio-processing software and it seems like overkill, and (b) I want to understand the actual problem rather than put a band-aid on it.
Minimal, reproducible example
Assume filenames is a subset of all my audio files, containing fn_good and fn_bad, where fn_good is an actual file that gets processed, and fn_bad is an actual file that raises an error.
def extract_features(filenames):
    for fn in filenames:
        sr, y = scipy.io.wavfile.read(fn)
        print('Signal is: ', y)
        print('Sample rate is: ', sr)
Additional info
Inspecting the files with VLC, it seems that the codecs are supported by scipy.io.wavfile; in any case, both files have the same codec, so it's strange that they don't behave the same way...
(Codec screenshots of the GOOD file and the BAD file omitted; VLC reports the same codec for both.)
I don't know why scipy.io.wavfile can't read the file--there might be an invalid chunk in there that other readers simply ignore. Note that even when I read a "good" file with scipy.io.wavfile, a warning (WavFileWarning: Chunk (non-data) not understood, skipping it.) is generated:
In [22]: rate, data = wavfile.read('fearful_song_strong_dogs_act10_f_1.wav')
/Users/warren/mc37/lib/python3.7/site-packages/scipy/io/wavfile.py:273: WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning)
I can read 'fearful_song_strong_dogs_act06_f_0.wav' using wavio (source code on github: wavio), a package I created that wraps Python's standard wave library with functions that understand NumPy arrays:
In [13]: import wavio
In [14]: wav = wavio.read('fearful_song_strong_dogs_act06_f_0.wav')
In [15]: wav
Out[15]: Wav(data.shape=(198598, 1), data.dtype=int16, rate=48000, sampwidth=2)
In [16]: plot(np.arange(wav.data.shape[0])/wav.rate, wav.data[:,0])
Out[16]: [<matplotlib.lines.Line2D at 0x117cd9390>]
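wavio is essentially a NumPy-friendly wrapper around the standard-library wave module. A minimal, self-contained sketch of that idea (the file name and the sine-wave data are made up; the snippet writes its own WAV first so it can be run anywhere):

```python
import wave
import numpy as np

# Write a tiny mono 16-bit WAV with synthetic data, then read it back
# the way wavio does: stdlib wave plus a NumPy view of the raw frames.
rate = 48000
samples = (np.sin(2 * np.pi * 440 * np.arange(rate) / rate) * 32767).astype(np.int16)

with wave.open('tmp_demo.wav', 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)      # 2 bytes per sample -> 16-bit
    w.setframerate(rate)
    w.writeframes(samples.tobytes())

with wave.open('tmp_demo.wav', 'rb') as r:
    data = np.frombuffer(r.readframes(r.getnframes()), dtype=np.int16)

print(data.shape, 'at', rate, 'Hz')  # (48000,) at 48000 Hz
```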
I solved the problem by changing the number 4 to 1 in the file wavfile.py, in this condition in the code:
if not chunk_id:
    raise ValueError("Unexpected end of file.")
elif len(chunk_id) < 4:  # <- changed this 4 to 1
    raise ValueError("Incomplete wav chunk.")
but that was just intuition and good luck. Now I wonder why this works, and what the possible reasons are.
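For context, a sketch of the RIFF layout involved (this is an illustration, not the actual scipy source): a WAV file is a sequence of chunks, each starting with a 4-byte ID followed by a 4-byte little-endian size. If the file is truncated mid-chunk, the next read for a 4-byte ID returns fewer than 4 bytes, which is exactly the condition that raises "Incomplete wav chunk"; relaxing the check to < 1 makes the reader stop silently at the truncated tail instead.

```python
import io
import struct

def list_chunks(buf):
    """Walk the RIFF sub-chunks in a bytes buffer; return (id, size) pairs."""
    f = io.BytesIO(buf)
    riff, total, wave_id = struct.unpack('<4sI4s', f.read(12))  # RIFF header
    chunks = []
    while True:
        chunk_id = f.read(4)
        if not chunk_id:
            break                                   # clean end of file
        if len(chunk_id) < 4:
            raise ValueError("Incomplete wav chunk.")  # the scipy condition
        (size,) = struct.unpack('<I', f.read(4))
        chunks.append((chunk_id, size))
        f.seek(size, 1)                             # skip the chunk body
    return chunks

# A minimal fake WAV: RIFF header plus an empty 'data' chunk.
body = b'WAVE' + b'data' + struct.pack('<I', 0)
wav = b'RIFF' + struct.pack('<I', len(body)) + body
print(list_chunks(wav))      # [(b'data', 0)]
```

Truncating this buffer in the middle of a chunk ID (e.g. `wav[:14]`) reproduces the same ValueError.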
I am attempting to load images, but I can't figure out how to do it using pkgutil.get_data(), which I would prefer to use since I don't want fixed paths in my code.
Currently, I have something like this, which works only when running out of the same folder.
...
self.imagePixmap = QtGui.QPixmap("img/myImage.png")
...
The issue then is, if you run the script from other folders the path is messed up and you get this error:
QPixmap::scaled: Pixmap is a null pixmap
I would like to use something like pkgutil.get_data("img", "myImage.png") to load the images; however, this returns the raw bytes of the image file, whereas the QPixmap() constructor wants a file path.
The only "workaround" I can see is to use part of what they specify here: pkgutil.get_data
and do something like this:
self.myPath = os.path.dirname(sys.modules[__name__].__file__)
self.imagePixmap = QtGui.QPixmap(os.path.join(self.myPath,"img/myImage.png"))
This just seems too much of a kludge to me. Is there a better way?
Here is what I ended up finding out. I should have rtfm'd a little closer. Anyway, you can use the loadFromData() method, which loads the pixmap from a QByteArray (or Python bytes), given the format of the data therein.
self.imagePixmap = QtGui.QPixmap()
self.imagePixmap.loadFromData(get_data(__name__, "img/myImage.png"), 'png')
Here is a link to the information from here:
bool QPixmap::loadFromData(const QByteArray &data, const char *format = Q_NULLPTR, Qt::ImageConversionFlags flags = Qt::AutoColor)
This is an overloaded function.
Loads a pixmap from the binary data using the specified format and conversion flags.
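pkgutil.get_data() returns the file's raw bytes, which is exactly what loadFromData() accepts. A self-contained demo of just the data-loading half (the package name, directory layout, and PNG bytes here are all made up and created in a temp directory, so no real image or Qt is needed):

```python
import importlib
import os
import pkgutil
import sys
import tempfile

# Build a throwaway package with a data file next to its __init__.py.
root = tempfile.mkdtemp()
pkg = os.path.join(root, 'demo_pkg')
os.makedirs(os.path.join(pkg, 'img'))
open(os.path.join(pkg, '__init__.py'), 'w').close()
with open(os.path.join(pkg, 'img', 'myImage.png'), 'wb') as f:
    f.write(b'\x89PNG fake bytes')

sys.path.insert(0, root)
importlib.invalidate_caches()

# get_data resolves the resource relative to the package, not the CWD.
data = pkgutil.get_data('demo_pkg', 'img/myImage.png')
print(type(data), data[:4])  # <class 'bytes'> b'\x89PNG'
```

Those bytes are what you would pass straight to QPixmap.loadFromData().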
I work in a lab where we acquire electrophysiological recordings (across 4 recording channels) using custom Labview VIs, which save the acquired data as a .DAT (binary) file. The analysis of these files can then be continued in more Labview VIs, however I would like to analyse all my recordings in Python. First, I need walk through all of my files and convert them out of binary!
I have tried numpy.fromfile(filename), but the numbers I get out make no sense to me:
array([ 3.44316221e-282, 1.58456331e+029, 1.73060724e-077, ...,
4.15038967e+262, -1.56447362e-090, 1.80454329e+070])
To try and get further I looked up the .DAT header format to understand how to grab the bytes and translate them - how many bytes the data is saved in etc:
http://zone.ni.com/reference/en-XX/help/370859J-01/header/header/headerallghd_allgemein/
But I can't work out what to do. When I type "head filename" into terminal, below is what I see.
e.g. >> head 2014_04_10c1slice2_rest.DAT
DTL?
0???? ##????
empty array
PF?c ƀ????l?N?"1'.+?K13:13:27;0.00010000-08??t???DY
??N?t?x???D?
?uv?tgD?~I??
[... many more lines of unprintable binary, snipped ...]
Any help or suggestions on what to do next would be really appreciated.
Thanks.
P.S. There is an old (broken) MATLAB file that seems to have been intended to convert these files. I think it could be helpful, but having spent a couple of days trying to understand it, I am still stuck. http://www.mathworks.co.uk/matlabcentral/fileexchange/27195-load-labview-binary-data
Based on this link it looks like the following should do the trick:
import struct

binaryFile = open('Measurement_4.bin', mode='rb')
(offset,) = struct.unpack('>d', binaryFile.read(8))  # first big-endian double
Note that mode is set to 'rb' for binary.
With numpy you can directly do this as
data = numpy.fromfile('Measurement_4.bin', dtype='>d')
Please note that if you are just using Python as an intermediate and want to go back to LabVIEW with the data, you should instead use the function Read from Binary file.vi to read the binary file using native LabVIEW.
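As a quick self-contained check of the '>d' (big-endian double) interpretation, using a synthetic byte buffer instead of the real Measurement_4.bin (the values are made up):

```python
import struct
import numpy as np

# Pack three doubles in big-endian order, the byte order LabVIEW
# uses by default when writing binary files.
values = [1.5, -2.25, 1e-4]
raw = b''.join(struct.pack('>d', v) for v in values)

# struct route: one value at a time.
(first,) = struct.unpack('>d', raw[:8])

# numpy route: the whole buffer at once, same dtype string.
data = np.frombuffer(raw, dtype='>d')
print(first, data.tolist())  # 1.5 [1.5, -2.25, 0.0001]
```

If the dtype or byte order is wrong, you get exactly the kind of nonsense values shown in the question, which is a useful sanity check.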
DAT is a pretty generic suffix, not necessarily something pointing to a specific format. If I'm understanding correctly, that help section is for DIAdem, which may be completely unrelated to how your data is saved from LV.
What you want is this help section, which tells you how LV flattens data to be stored on disk - http://zone.ni.com/reference/en-XX/help/371361J-01/lvconcepts/flattened_data/
You will need to look at the LV code to see exactly what kind of data you're saving and how the write file function is configured (byte order, size prepending, etc.) and then use that document to translate it to the actual representation.
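As a concrete illustration of that flattening scheme (hedged: this assumes the LabVIEW defaults of big-endian byte order and a 4-byte size prepended to arrays; check how your VI's write function is actually configured):

```python
import struct
import numpy as np

def read_prepended_array(buf, offset=0):
    """Parse one LabVIEW-style flattened 1-D DBL array:
    a big-endian int32 element count, then that many big-endian doubles.
    Returns the array and the offset just past it."""
    (n,) = struct.unpack_from('>i', buf, offset)
    data = np.frombuffer(buf, dtype='>d', count=n, offset=offset + 4)
    return data, offset + 4 + 8 * n

# Synthetic flattened array: count=3 followed by three doubles.
payload = struct.pack('>i', 3) + struct.pack('>3d', 0.5, 1.0, -4.0)
arr, end = read_prepended_array(payload)
print(arr.tolist(), end)  # [0.5, 1.0, -4.0] 28
```

Returning the end offset lets you chain calls to walk a file containing several flattened values back to back.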
I've found pyDub, and it seems like just what I need:
http://pydub.com/
The only issue is with generating silence. Can pyDub do this?
Essentially the workflow I want is:
Take all the WAV files in a directory
Piece them together in filename order with 1 sec of silence in between
Generate a single MP3 of the result
Is this possible? I realize I could create a WAV of silence and do it that way (spacer GIF flashback, anyone?), but I'd prefer to generate the silence programmatically, because I may want to experiment with the duration of silence and/or the bitrate of the MP3.
I greatly appreciate any responses.
The pydub sequences are composed of pydub.AudioSegment instances. The pydub quickstart documentation only shows how to create AudioSegments from files.
However, reading the source, or even more easily, running pydoc pydub.AudioSegment, reveals:
pydub.AudioSegment = class AudioSegment(__builtin__.object)
| AudioSegments are *immutable* objects representing segments of audio
| that can be manipulated using python code.
…
| silent(cls, duration=1000) from __builtin__.type
| Generate a silent audio segment.
| duration specified in milliseconds (default: 1000ms).
which would be called like (following the usage in the quick start guide):
from pydub import AudioSegment
second_of_silence = AudioSegment.silent() # use default
second_of_silence = AudioSegment.silent(duration=1000) # or be explicit
now second_of_silence would be an AudioSegment, just like song in the example
song = AudioSegment.from_wav("never_gonna_give_you_up.wav")
and could be manipulated, composed, etc. with no blank audio files needed.
I am trying to write some code that will extract the amplitude data from an mp3 as a function of time. I wrote up a rough version in MATLAB a while back using this function: http://labrosa.ee.columbia.edu/matlab/mp3read.html. However, I am having trouble finding a Python equivalent.
I've done a lot of research, and so far I've gathered that I need to use something like mpg321 to convert the .mp3 into a .wav. I haven't been able to figure out how to get that to work.
The next step will be reading the data from the .wav file, which I also haven't had any success with. Has anyone done anything similar or could recommend some libraries to help with this? Thanks!
You can use the subprocess module to call mpg123:
import subprocess
import sys

inname = 'foo.mp3'
outname = 'out.wav'
try:
    subprocess.check_call(['mpg123', '-w', outname, inname])
except subprocess.CalledProcessError as e:
    print(e)
    sys.exit(1)
For reading wav files you should use the wave module, like this:
import wave
import numpy as np
wr = wave.open('input.wav', 'r')
sz = 44100 # Read and process 1 second at a time.
da = np.frombuffer(wr.readframes(sz), dtype=np.int16)
wr.close()
left, right = da[0::2], da[1::2]
After that, left and right contain the samples of the left and right channels (assuming a stereo file).
You can find a more elaborate example here.
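Putting the pieces together, here is a self-contained sketch of the read-and-split step. It generates its own stereo WAV first, since foo.mp3/out.wav above are placeholders; the file name and sine-wave contents are made up:

```python
import wave
import numpy as np

# Create a small stereo 16-bit WAV so the example can run anywhere:
# 440 Hz on the left channel, 660 Hz on the right.
sr = 44100
t = np.arange(sr) / sr
left_src = (np.sin(2 * np.pi * 440 * t) * 30000).astype(np.int16)
right_src = (np.sin(2 * np.pi * 660 * t) * 30000).astype(np.int16)
interleaved = np.empty(2 * sr, dtype=np.int16)
interleaved[0::2], interleaved[1::2] = left_src, right_src

with wave.open('stereo_demo.wav', 'wb') as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(sr)
    w.writeframes(interleaved.tobytes())

# Read one second of frames back and de-interleave the channels.
with wave.open('stereo_demo.wav', 'rb') as wr:
    da = np.frombuffer(wr.readframes(sr), dtype=np.int16)
left, right = da[0::2], da[1::2]
print(len(left), len(right))  # 44100 44100
```

Each element of left/right is a signed 16-bit amplitude sample, which is the per-time amplitude data the question is after.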
Here is a project in pure python where you can decode an MP3 file about 10x slower than realtime: http://portalfire.wordpress.com/category/pymp3/
The rest is done by Fourier mathematics etc.:
How to analyse frequency of wave file
and have a look at the python module wave:
http://docs.python.org/2/library/wave.html
The PyMedia library seems to be stable and to deal with what you need.