Convolving Room Impulse Response with a Wav File (python)

I have written the following code, which is supposed to put echo over an available sound file. Unfortunately the output is very noisy and I don't understand why. Can anybody help me with this? Is there a step I have skipped?
# Convolve a room impulse response with a sound sample (both stereo)
import sys
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

inp = wavfile.read(sound_path + sound_file_name)
IR = wavfile.read(IR_path + IR_file_name)
if inp[0] != IR[0]:
    print("Size mismatch")
    sys.exit(-1)
else:
    rate = inp[0]
print(sound_file_name)
out_0 = fftconvolve(inp[1][:, 1], IR[1][:, 0])
out_1 = fftconvolve(inp[1][:, 1], IR[1][:, 1])
in_counter += 1
out = np.vstack((out_0, out_1)).T
out[:inp[1].shape[0]] = out[:inp[1].shape[0]] + inp[1]
wavfile.write(sound_path + sound_file_name + '_echoed.wav', rate, out)

Adding echo to a sound file is just that... adding echo. Your code doesn't look like it's adding two sounds together; it looks like it's transforming the input sound into something else.
Your data flow should look something like this:
source sound ----------------------------->|
                                           + ----------> target sound
source sound ---> convolution echo ------->|
Note that your echo sound is going to be longer than your original sound (i.e. it has a "tail.")
Adding two sounds together is simply a matter of adding each of the individual samples together from both sounds to produce a new output wave. I don't think vstack does that.
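As a minimal sketch of that data flow (assuming NumPy float arrays for the dry signal and the convolved echo, mono here for simplicity; the function name and the 0.5 gain are illustrative choices, not anything from the question):

```python
# Sketch of the "dry + wet" mix described above: the convolved echo is longer
# than the source, so align on the longer array, then add sample-by-sample.
import numpy as np

def mix_dry_wet(dry, wet, wet_gain=0.5):
    """Add the dry signal onto the (longer) convolved echo, then renormalize."""
    out = wet.astype(np.float64) * wet_gain
    out[:len(dry)] += dry.astype(np.float64)  # element-wise addition, not stacking
    return out / np.max(np.abs(out))          # renormalize to avoid clipping
```

For example, `mix_dry_wet(np.array([1.0, 0.0]), np.array([0.0, 0.0, 1.0]))` keeps the dry attack at full level and appends the echo tail at half level.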

Apparently WAV files are imported as int16 arrays, and modifications should be done after converting them to floats:
http://nbviewer.ipython.org/github/mgeier/python-audio/blob/master/audio-files/audio-files-with-pysoundfile.ipynb
After convolution one needs to renormalize again. And that's it.
Hope this helps others too.
import sys
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve
from utility import pcm2float, float2pcm

input_rate, input_sig = wavfile.read(sound_path + sound_file_name)
input_sig = pcm2float(input_sig, 'float32')
IR_rate, IR_sig = wavfile.read(IR_path + IR_file_name)
IR_sig = pcm2float(IR_sig, 'float32')
if input_rate != IR_rate:
    print("Sample rate mismatch")
    sys.exit(-1)
else:
    rate = input_rate
print(sound_file_name)
con_len = -1
out_0 = fftconvolve(input_sig[:con_len, 0], IR_sig[:con_len, 0])
out_0 = out_0 / np.max(np.abs(out_0))
out_1 = fftconvolve(input_sig[:con_len, 1], IR_sig[:con_len, 1])
out_1 = out_1 / np.max(np.abs(out_1))  # fixed: the original divided out_0 here
in_counter += 1
out = np.vstack((out_0, out_1)).T
wavfile.write(sound_path + sound_file_name + '_' + IR_file_name + '_echoed.wav', rate, float2pcm(out, 'int16'))
One can download utility from the above link.
UPDATE: Although this generates a working output, it's still not as good as the result from the original Openair website for convolving.


How can I fix this value error: Incomplete wav chunk error? [duplicate]

My problem
I'm trying to fit a (machine-learning) model that takes in an audio file (.wav) and predicts the emotion from it (multi-label classification).
I'm trying to read the sample rate and signal from the file, but when calling read(filename) from scipy.io.wavfile, I'm getting ValueError: Incomplete wav chunk.
What I've tried
I've tried switching from scipy.io.wavfile.read() to librosa.load().
They both output the signal and sample rate, but for some reason librosa takes far longer than scipy, which is impractical for my task.
I've tried sr, y = scipy.io.wavfile.read(open(filename, 'r')) as suggested here, to no avail.
I've tried looking into my files and checking what might cause it:
Out of all 2084 wav files, 1057 were good (=scipy managed to read them), and
1027 were bad (=raised the error).
I couldn't find anything indicating what makes a file pass or fail, which is a weird result, as all files are taken from the same dataset with the same origin.
I've heard people saying I could just re-export the files as wav using some software, and it should work.
I didn't try this because a) I don't have any audio-processing software and it seems like overkill, and b) I want to understand the actual problem rather than put a bandaid on it.
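One way to produce that good/bad split (a sketch, assuming scipy is installed and you pass in your list of file paths; the function name is mine) is to partition the files by whether scipy can read them:

```python
# Partition wav files into readable/unreadable by catching the ValueError
# that scipy.io.wavfile.read raises (e.g. "Incomplete wav chunk").
from scipy.io import wavfile

def split_readable(filenames):
    good, bad = [], []
    for fn in filenames:
        try:
            wavfile.read(fn)
            good.append(fn)
        except ValueError:
            bad.append(fn)
    return good, bad
```

Logging `len(good)` and `len(bad)` gives exactly the 1057/1027 style counts above, and `bad` is then the list to inspect further.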
Minimal, reproducible example
Assume filenames is a subset of all my audio files, containing fn_good and fn_bad, where fn_good is an actual file that gets processed, and fn_bad is an actual file that raises an error.
def extract_features(filenames):
    for fn in filenames:
        sr, y = scipy.io.wavfile.read(fn)
        print('Signal is: ', y)
        print('Sample rate is: ', sr)
Additional info
Using VLC, it seems that the codecs are supported by scipy.io.wavfile, but in either case, both files have the same codec, so it's weird they don't have the same effect...
(Screenshots of the codec info for the GOOD file and the BAD file omitted; both show the same codec.)
I don't know why scipy.io.wavfile can't read the file; there might be an invalid chunk in there that other readers simply ignore. Note that even when I read a "good" file with scipy.io.wavfile, a warning (WavFileWarning: Chunk (non-data) not understood, skipping it.) is generated:
In [22]: rate, data = wavfile.read('fearful_song_strong_dogs_act10_f_1.wav')
/Users/warren/mc37/lib/python3.7/site-packages/scipy/io/wavfile.py:273: WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning)
I can read 'fearful_song_strong_dogs_act06_f_0.wav' using wavio (source code on github: wavio), a package I created that wraps Python's standard wave library with functions that understand NumPy arrays:
In [13]: import wavio
In [14]: wav = wavio.read('fearful_song_strong_dogs_act06_f_0.wav')
In [15]: wav
Out[15]: Wav(data.shape=(198598, 1), data.dtype=int16, rate=48000, sampwidth=2)
In [16]: plot(np.arange(wav.data.shape[0])/wav.rate, wav.data[:,0])
Out[16]: [<matplotlib.lines.Line2D at 0x117cd9390>]
I solved the problem by changing the number "4" to "1" in the file wavfile.py, in this condition of the code:

if not chunk_id:
    raise ValueError("Unexpected end of file.")
elif len(chunk_id) < 1:  # originally: len(chunk_id) < 4
    raise ValueError("Incomplete wav chunk.")

But that was just intuition and good luck. Now I wonder why this works and what the possible reasons are.
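For what it's worth, a no-patching alternative is the standard library's wave module (the same library wavio wraps), which can read some files that scipy.io.wavfile rejects. A sketch, assuming 16-bit PCM files (the function name is mine):

```python
# Read a 16-bit PCM wav with the stdlib wave module instead of patching scipy.
import wave
import numpy as np

def read_wav16(fn):
    with wave.open(fn, 'rb') as w:
        assert w.getsampwidth() == 2, "sketch assumes 16-bit samples"
        data = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
        if w.getnchannels() > 1:
            data = data.reshape(-1, w.getnchannels())  # one column per channel
        return w.getframerate(), data
```

This keeps the (rate, signal) return shape of scipy's reader, so it can be dropped into the feature-extraction loop above.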

pdflatex hang after large number of figures

I have a script that generates a number of figures and puts them in the appendix of a report, e.g.
Appendix
********
.. figure:: images/generated/image_1.png
.. figure:: images/generated/image_2.png
.. figure:: images/generated/image_3.png
... etc
It looks like after a large number (~50) of images, my pdflatex command hangs, pointing to one of the graphics in my .tex file around here:
...
\begin{figure}[htbp]
\centering
\noindent\sphinxincludegraphics{{image_49}.png}
\end{figure}
\begin{figure}[htbp]
\centering
\noindent\sphinxincludegraphics{{image_50}.png} <--- here
\end{figure}
\begin{figure}[htbp]
\centering
\noindent\sphinxincludegraphics{{image_51}.png}
\end{figure}
...
When pdflatex fails, I can't make much of the console output. I get a number of lines like these, which seem to be good news:
<image_48.png, id=451, 411.939pt x 327.3831pt>
File: image_48.png Graphic file (type png)
<use image_48.png>
Package pdftex.def Info: image_48.png used on input line 1251.
(pdftex.def) Requested size: 411.93797pt x 327.3823pt.
<image_49.png, id=452, 411.939pt x 327.3831pt>
File: image_49.png Graphic file (type png)
<use image_49.png>
Package pdftex.def Info: image_49.png used on input line 1257.
(pdftex.def) Requested size: 411.93797pt x 327.3823pt.
Then after the last successful image (~50) it starts outputting
! Output loop---100 consecutive dead cycles.
\end@float ...loatpenalty <-\@Mii \penalty -\@Miv
    \@tempdima \prevdepth \vbo...
l.1258 \end{figure}
I've concluded that your \output is awry; it never does a
\shipout, so I'm shipping \box255 out myself. Next time
increase \maxdeadcycles if you want me to be more patient!
[9
! Undefined control sequence.
\reserved@a ->\@nil
l.1258 \end{figure}
The control sequence at the end of the top line
of your error message was never \def'ed. If you have
misspelled it (e.g., `\hobx'), type `I' and the correct
spelling (e.g., `I\hbox'). Otherwise just continue,
and I'll forget about whatever was undefined.
If all I do is reduce the number of figures, it will run and produce a pdf without issue. Is there a hard limit to the number of images a section can have? Is there somewhere else I can look in the build log to narrow down why this is happening?
This seemed to be a combination of a couple of things.
The first symptom was essentially an error caused by too many unprocessed floats. The fix was to add the following to the babel element of latex_elements:
\usepackage[maxfloats=256]{morefloats}
The second symptom was the complaint about Output loop---100 consecutive dead cycles. so the fix was simply to increase the allowed number of cycles:
\maxdeadcycles=1000
After these two adjustments, the pdflatex command will finish successfully now, even with a large number of figures.
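In a Sphinx conf.py that might look something like the sketch below. The answer above put the lines in the babel element; I'm showing them in the 'preamble' key, which is another common place for extra LaTeX setup, so adjust to your configuration:

```python
# conf.py (sketch): raise pdflatex's float and dead-cycle limits.
latex_elements = {
    'preamble': r'''
\usepackage[maxfloats=256]{morefloats}
\maxdeadcycles=1000
''',
}
```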
I had this problem and the above suggestions did not work. I was, however, able to get it to run fine by inserting subsections, which may or may not be compatible with your objectives. The script generates code as follows, which is then input into another code snippet to preview the generated images.
(I'm generating svg plots from c++, converting to png, and previewing essentially raw data for selection into later plots that go into an actual document, not just a collection of images.)
\subsection{svghappy2.tyrosine.png}
\begin{figure}[htbp]
\testplot{svghappy2_tyrosine.png}
\caption{svghappy2.tyrosine.png}
\end{figure}
\subsection{svghappy2.valine.png}
\begin{figure}[htbp]
\testplot{svghappy2_valine.png}
\caption{svghappy2.valine.png}
\end{figure}
The problem arises from the compiler having a hard time placing all the images, so splitting them up helps. As @mike-marchywka noted, sections may do the trick, but so would other things, such as \pagebreak or \FloatBarrier from the placeins package.

Read binary data off Windows clipboard, in Blender (python)

EDIT: Figured THIS part out, but see 2nd post below for another question.
(a little backstory here, skip ahead for the TLDR :) )
I'm currently trying to write a few scripts for Blender to help improve the level-creation workflow for a game that I play (Natural Selection 2). Currently, to move geometry from the level editor to Blender, I have to:
1) Save a file from the editor as an .obj
2) Import the obj into Blender and make my changes.
3) Export to the game's level format using an exporter script I wrote.
4) Re-open the file in a new instance of the editor.
5) Copy the level data from the new instance.
6) Paste into the main level file.
This is quite a pain to do, and quite clearly discourages using the tool at all except for major edits. My idea for an improved workflow:
1) Copy data to the clipboard in the editor.
2) Run an importer script in Blender to load the data.
3) Run an exporter script in Blender to save the data.
4) Paste back into the original file.
This not only cuts out two whole steps of the tedious process, but also eliminates the need for extra files cluttering up my desktop. So far, though, I haven't found a way to read Windows clipboard data into Blender... at least not without some really elaborate installation steps (e.g. install Python 3.1, install pywin32, move x, y, z to the Blender directory, uninstall Python 3.1, etc...).
TLDR
I need help finding a way to write/read BINARY data to/from the clipboard in Blender. I'm not concerned about cross-platform capability -- the game tools are Windows only.
Ideally -- though obviously beggars can't be choosers here -- the solution would not make it too difficult to install the script for the layman. I'm (hopefully) not the only person who is going to be using this, so I'd like to keep the installation instructions as simple as possible. If there's a solution available in the python standard library, that'd be awesome!
Things I've looked at already/am looking at now
Pyperclip -- plaintext ONLY. I need to be able to read BINARY data off the clipboard.
pywin32 -- Kept getting missing DLL file errors, so I'm sure I'm doing something wrong. Need to take another stab at this, but the steps I had to take were pretty involved (see last sentence above TLDR section :) )
Tkinter -- didn't read too far into this one, as it seemed to only read plain text.
ctypes -- actually just discovered this in the process of writing this post. Looks scary as hell, but I'll give it a shot.
Okay I finally got this working. Here's the code for those interested:
from ctypes import *

kernel32 = windll.kernel32
user32 = windll.user32

user32.OpenClipboard(0)
CF_SPARK = user32.RegisterClipboardFormatW("application/spark editor")
if user32.IsClipboardFormatAvailable(CF_SPARK):
    data = user32.GetClipboardData(CF_SPARK)
    size = kernel32.GlobalSize(data)
    data_locked = kernel32.GlobalLock(data)
    text = string_at(data_locked, size)  # copy out the raw bytes
    kernel32.GlobalUnlock(data)
else:
    print('No spark data in clipboard!')
user32.CloseClipboard()
Welp... this is a new record for me (posting a question and almost immediately finding an answer).
For those interested, I found this: How do I read text from the (windows) clipboard from python?
It's exactly what I'm after... sort of. I used that code as a jumping-off point.
Instead of CF_TEXT = 1
I used CF_SPARK = user32.RegisterClipboardFormatW("application/spark editor")
Here's where I got that function name from: http://msdn.microsoft.com/en-us/library/windows/desktop/ms649049(v=vs.85).aspx
The 'W' is there because for whatever reason, Blender doesn't see the plain-old "RegisterClipboardFormat" function, you have to use "...FormatW" or "...FormatA". Not sure why that is. If somebody knows, I'd love to hear about it! :)
Anyways, haven't gotten it actually working yet: still need to find a way to break this "data" object up into bytes so I can actually work with it, but that shouldn't be too hard.
Scratch that, it's giving me quite a bit of difficulty.
Here's my code
from ctypes import *
from binascii import hexlify

kernel32 = windll.kernel32
user32 = windll.user32

user32.OpenClipboard(0)
CF_SPARK = user32.RegisterClipboardFormatW("application/spark editor")
if user32.IsClipboardFormatAvailable(CF_SPARK):
    data = user32.GetClipboardData(CF_SPARK)
    data_locked = kernel32.GlobalLock(data)
    print(data_locked)
    text = c_char_p(data_locked)
    print(text)
    print(hexlify(text))
    kernel32.GlobalUnlock(data_locked)
else:
    print('No spark data in clipboard!')
user32.CloseClipboard()
There aren't any errors, but the output is wrong. The line print(hexlify(text)) yields b'e0cb0c1100000000', when I should be getting something that's 946 bytes long, the first 4 of which should be 01 00 00 00. (Here's the clipboard data, saved out from InsideClipboard as a .bin file: https://www.dropbox.com/s/bf8yhi1h5z5xvzv/testLevel.bin?dl=1 )
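Two things seem to go wrong in that attempt: hexlify(text) on a c_char_p instance hexlifies the pointer's own storage (which would explain seeing 8 address-like bytes), and even c_char_p(...).value would stop at the first NUL byte, while the level data starts with 01 00 00 00. That is why the working code at the top uses string_at with the size from GlobalSize. A platform-independent sketch of the truncation (using a local buffer instead of the clipboard; the byte string is made up):

```python
# Why c_char_p truncated the data: .value stops at the first NUL byte,
# while string_at with an explicit size copies the whole buffer.
from ctypes import c_char_p, string_at, create_string_buffer, addressof
from binascii import hexlify

raw = b'\x01\x00\x00\x00rest-of-data'  # starts with 01 00 00 00, like the level data
buf = create_string_buffer(raw, len(raw))
addr = addressof(buf)

print(hexlify(c_char_p(addr).value))       # b'01' -- cut off at the first NUL
print(hexlify(string_at(addr, len(raw))))  # the full 16 bytes
```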

Binary file (Labview .DAT file) conversion using Python

I work in a lab where we acquire electrophysiological recordings (across 4 recording channels) using custom LabVIEW VIs, which save the acquired data as .DAT (binary) files. The analysis of these files can then be continued in more LabVIEW VIs, but I would like to analyse all my recordings in Python. First, I need to walk through all of my files and convert them out of binary!
I have tried numpy.fromfile(filename), but the numbers I get out make no sense to me:
array([ 3.44316221e-282, 1.58456331e+029, 1.73060724e-077, ...,
4.15038967e+262, -1.56447362e-090, 1.80454329e+070])
To get further, I looked up the .DAT header format to understand how to grab the bytes and translate them (how many bytes the data is saved in, etc.):
http://zone.ni.com/reference/en-XX/help/370859J-01/header/header/headerallghd_allgemein/
But I can't work out what to do. When I type "head filename" into terminal, below is what I see.
e.g. >> head 2014_04_10c1slice2_rest.DAT
DTL?
0???? ##????
empty array
PF?c ƀ????l?N?"1'.+?K13:13:27;0.00010000-08??t?޾??DY
[... many more lines of similar binary garbage ...]
Any help or suggestions on what to do next would be really appreciated.
Thanks.
P.S. There is an old (broken) MATLAB file that seems to have been intended to convert these files. I think it could be helpful, but having spent a couple of days trying to understand it I am still stuck. http://www.mathworks.co.uk/matlabcentral/fileexchange/27195-load-labview-binary-data
Based on this link it looks like the following should do the trick:
import struct

binaryFile = open('Measurement_4.bin', mode='rb')
(offset,) = struct.unpack('>d', binaryFile.read(8))  # '>d' = big-endian double
Note that mode is set to 'rb' for binary.
With numpy you can directly do this as
data = numpy.fromfile('Measurement_4.bin', dtype='>d')
Please note that if you are just using Python as an intermediate and want to go back to LabVIEW with the data, you should instead use the function Read from Binary file.vi to read the binary file using native LabVIEW.
DAT is a pretty generic suffix, not necessarily something pointing to a specific format. If I'm understanding correctly, that help section is for DIAdem, which may be completely unrelated to how your data is saved from LV.
What you want is this help section, which tells you how LV flattens data to be stored on disk - http://zone.ni.com/reference/en-XX/help/371361J-01/lvconcepts/flattened_data/
You will need to look at the LV code to see exactly what kind of data you're saving and how the write file function is configured (byte order, size prepending, etc.) and then use that document to translate it to the actual representation.
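As an illustration of what that translation can look like, here is a hypothetical sketch. It assumes the VI writes big-endian doubles with the "prepend array or string size" option on, so a 4-byte i32 element count precedes the data; your VI's actual configuration (byte order, element type, size prepending) may differ, which is exactly why you need to check the LabVIEW code first:

```python
# Read a LabVIEW-flattened 1-D array of doubles: big-endian i32 count,
# then that many big-endian float64 values.
import struct
import numpy as np

def read_lv_double_array(f):
    (n,) = struct.unpack('>i', f.read(4))        # element count
    return np.frombuffer(f.read(8 * n), dtype='>f8')
```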

Read amplitude data from mp3

I am trying to write some code that will extract the amplitude data from an mp3 as a function of time. I wrote up a rough version on MATLAB a while back using this function: http://labrosa.ee.columbia.edu/matlab/mp3read.html However I am having trouble finding a Python equivalent.
I've done a lot of research, and so far I've gathered that I need to use something like mpg321 to convert the .mp3 into a .wav. I haven't been able to figure out how to get that to work.
The next step will be reading the data from the .wav file, which I also haven't had any success with. Has anyone done anything similar or could recommend some libraries to help with this? Thanks!
You can use the subprocess module to call mpg123:
import subprocess
import sys

inname = 'foo.mp3'
outname = 'out.wav'
try:
    subprocess.check_call(['mpg123', '-w', outname, inname])
except subprocess.CalledProcessError as e:
    print(e)
    sys.exit(1)
For reading wav files you should use the wave module, like this:
import wave
import numpy as np

wr = wave.open('input.wav', 'r')
sz = 44100  # read and process 1 second at a time
da = np.frombuffer(wr.readframes(sz), dtype=np.int16)
wr.close()
left, right = da[0::2], da[1::2]
After that, left and right contain the samples of the two stereo channels.
You can find a more elaborate example here.
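From there, amplitude as a function of time is just a windowed reduction over the samples. A sketch (the function name and the 10 ms window are my choices; pass in one channel from above, converted to float):

```python
# RMS amplitude per window: a simple "amplitude vs. time" curve.
import numpy as np

def rms_envelope(samples, rate=44100, win_ms=10):
    win = int(rate * win_ms / 1000)            # samples per window
    n = len(samples) // win
    frames = samples[:n * win].astype(np.float64).reshape(n, win)
    return np.sqrt((frames ** 2).mean(axis=1))  # one RMS value per window
```

Each returned value covers win_ms of audio, so the i-th entry corresponds to time i * win_ms / 1000 seconds.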
Here is a project in pure python where you can decode an MP3 file about 10x slower than realtime: http://portalfire.wordpress.com/category/pymp3/
The rest is done by Fourier mathematics etc.:
How to analyse frequency of wave file
and have a look at the python module wave:
http://docs.python.org/2/library/wave.html
The PyMedia library seems to be stable and deals with what you need.
