Importing a WAV file into a Jupyter Notebook - Python

I'm having trouble importing a WAV file into a Jupyter Notebook. I want to load an audio file from my desktop and perform an FFT on it. Does anyone know how to do this?

You can follow the examples at http://people.csail.mit.edu/hubert/pyaudio/docs/.
I also want to do FFT analyses on WAV files, and I use this approach (only the essential bits shown):
NOTE: this assumes a 16-bit stereo WAV file; the struct.unpack call does not work with 24-bit samples.
import wave
import numpy as np
import struct
# sound_file_name is the path to your 16-bit stereo WAV file
wf = wave.open(sound_file_name, 'rb')
n_frames = wf.getnframes()
all_frames = wf.readframes(n_frames)
wf.close()
# Unpack each 2-byte little-endian sample into a Python int
value_list = []
for x in range(0, len(all_frames), 2):
    value_list += struct.unpack('<h', all_frames[x:x+2])
# De-interleave: one row per stereo channel
two_channel_values = np.transpose(np.reshape(np.asanyarray(value_list), (len(value_list) // 2, 2)))
Now you have an array of two vectors, each containing the amplitude values of one stereo channel.
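Since the question asks about the FFT step, here is a minimal sketch of how the decoded samples could be fed to NumPy's FFT. It is not part of the answer above: the helper names (write_test_tone, dominant_frequency) are made up for illustration, the test tone is synthetic, and only 16-bit PCM is handled.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path

import numpy as np


def write_test_tone(path, freq_hz=440.0, rate=8000, seconds=1.0):
    """Write a mono 16-bit sine tone (synthetic data, for illustration only)."""
    n = int(rate * seconds)
    samples = [int(20000 * math.sin(2 * math.pi * freq_hz * i / rate)) for i in range(n)]
    with wave.open(str(path), 'wb') as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(struct.pack(f'<{n}h', *samples))


def dominant_frequency(path):
    """Return the frequency (Hz) of the strongest FFT bin of the first channel."""
    with wave.open(str(path), 'rb') as wf:
        if wf.getsampwidth() != 2:
            raise ValueError('this sketch only handles 16-bit samples')
        rate = wf.getframerate()
        n_channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # One row per frame, one column per channel
    samples = np.frombuffer(raw, dtype='<i2').reshape(-1, n_channels)
    channel = samples[:, 0].astype(np.float64)
    spectrum = np.abs(np.fft.rfft(channel))
    freqs = np.fft.rfftfreq(len(channel), d=1.0 / rate)
    return float(freqs[np.argmax(spectrum)])


with tempfile.TemporaryDirectory() as tmp:
    tone_path = Path(tmp) / 'tone.wav'
    write_test_tone(tone_path)
    peak_hz = dominant_frequency(tone_path)
```

In a notebook you would skip write_test_tone and pass the path of your own file to dominant_frequency.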

Related

Pass_band filter in Python

I've developed a script that, given an input file, extracts the voice signal and outputs the signal WITHOUT the voice (that is, the signal that contains the noise):
!pip install pydub
from pydub import AudioSegment
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
audio = AudioSegment.from_file('fileInput.mp3')
samples = audio.get_array_of_samples()
plt.plot(list(samples))
from scipy import signal
sos = signal.butter(10, [100, 4000], 'bandstop', fs=44100, output='sos')
filtered = signal.sosfilt(sos, np.array(samples))
plt.figure(figsize=(10,10))
plt.plot(np.array(samples))
plt.plot(filtered)
plt.title('After 1 - 10 Hz pass-band filter')
plt.tight_layout()
plt.show()
To export the filtered file (the one that contains the noise), I wrote the following lines:
from scipy.io.wavfile import write
write('./test.wav', 44100, filtered.astype(np.int16))
This code saves a file, but the file doesn't have the same length as the original (input) one.
As you can see, the input file is 36 seconds long, whereas the output is 1:12 ...
The input file is stereo. The pydub documentation states that:
AudioSegment(…).get_array_of_samples()
Returns the raw audio data as an array of (numeric) samples. Note: if the audio has multiple channels, the samples for each channel will be serialized – for example, stereo audio would look like [sample_1_L, sample_1_R, sample_2_L, sample_2_R, …]
For SciPy this is just one "long" channel; it cannot know that the samples are interleaved like this. Writing that array out as a single channel is also why the output is twice as long as the input. A filter also has state, meaning it cannot process data that is shuffled like this and produce the desired output.
Either reshape the data from the AudioSegment into two mono channels, for example:
[sample1L, sample2L, ...]
and
[sample1R, sample2R, ...]
and process these individually,
OR
simply convert the AudioSegment to mono, like so:
audio = AudioSegment.from_file('fileInput.mp3')
audio = audio.set_channels(1)
Either way, I highly recommend using the sample rate of the input file wherever a sample rate is required. Otherwise, loading a file with a different sample rate will shift the filter frequencies and change the length and playback speed of the output file. For example:
sos = signal.butter(10, [100, 4000], 'bandstop', fs=audio.frame_rate, output='sos')
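The first option (processing the channels individually) could look roughly like the sketch below. It uses synthetic interleaved data instead of a real AudioSegment, so frame_rate and n_channels are stand-ins for audio.frame_rate and audio.channels, and the two test tones are assumptions chosen to fall inside and outside the stop band.

```python
import numpy as np
from scipy import signal

frame_rate = 44100  # stand-in for audio.frame_rate
n_channels = 2      # stand-in for audio.channels

# Synthetic interleaved stereo data: [L0, R0, L1, R1, ...]
t = np.arange(frame_rate) / frame_rate
left = (10000 * np.sin(2 * np.pi * 1000 * t)).astype(np.int16)   # 1 kHz, inside the stop band
right = (10000 * np.sin(2 * np.pi * 8000 * t)).astype(np.int16)  # 8 kHz, outside the stop band
interleaved = np.empty(left.size * n_channels, dtype=np.int16)
interleaved[0::2] = left
interleaved[1::2] = right

# One column per channel, so the filter can run along the time axis
frames = interleaved.reshape(-1, n_channels).astype(np.float64)
sos = signal.butter(10, [100, 4000], 'bandstop', fs=frame_rate, output='sos')
filtered = signal.sosfilt(sos, frames, axis=0)

# Clip before converting back to 16-bit integers to avoid wrap-around
filtered_int16 = np.clip(filtered, -32768, 32767).astype(np.int16)
```

A 2D int16 array of shape (n_frames, n_channels) like filtered_int16 can be passed to scipy.io.wavfile.write, which then writes a stereo file of the original length.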

How to get numpy arrays output of .wav file format

I am new to Python, and I am trying to train my voice recognition model. I want to read a .wav file and get its contents as NumPy arrays. How can I do that?
In keeping with @Marco's comment, you can have a look at the SciPy library and, in particular, at scipy.io.
from scipy.io import wavfile
To read your file ('filename.wav'), simply do
output = wavfile.read('filename.wav')
This will output a tuple (which I named 'output'):
output[0], the sampling rate
output[1], the sample array you want to analyze
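As a small illustration of that tuple, the sketch below writes a synthetic stereo file and reads it back. The file name, sample rate, and tones are made up, and the division by 32768 is just one common way to scale 16-bit samples to floats.

```python
import tempfile
from pathlib import Path

import numpy as np
from scipy.io import wavfile

# Build one second of synthetic 16-bit stereo audio (illustrative data only)
rate_in = 8000
t = np.arange(rate_in) / rate_in
stereo = np.stack([np.sin(2 * np.pi * 220 * t), np.sin(2 * np.pi * 330 * t)], axis=1)
stereo_int16 = (stereo * 32767).astype(np.int16)

with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / 'example.wav')
    wavfile.write(path, rate_in, stereo_int16)
    # The tuple can be unpacked directly into sampling rate and samples
    rate, data = wavfile.read(path)

# Scale the integer samples to floats in [-1, 1), which many models expect
as_float = data.astype(np.float32) / 32768.0
```

For a stereo file, data has shape (n_samples, 2); for a mono file it is one-dimensional.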
This is possible in a few lines with wave (built in) and numpy (obviously). You don't need to use librosa, scipy or soundfile. The latter gave me problems reading WAV files, and that's the whole reason I'm writing here now.
import numpy as np
import wave
# Open the file with wave
with wave.open('filename.wav') as f:
    # Read the whole file into a buffer. If you are dealing with a large file
    # then you should read it in blocks and process them separately.
    buffer = f.readframes(f.getnframes())
    # Convert the buffer to a numpy array, choosing the dtype from the sample
    # width in bytes. The output will be a 1D array with interleaved channels.
    interleaved = np.frombuffer(buffer, dtype=f'int{f.getsampwidth()*8}')
    # Reshape it into a 2D array separating the channels in columns.
    data = np.reshape(interleaved, (-1, f.getnchannels()))
I like to pack it into a function that also returns the sampling frequency and works with pathlib.Path objects. That way the result can be played using sounddevice:
# play_wav.py
import sounddevice as sd
import numpy as np
import wave
from typing import Tuple
from pathlib import Path
# Utility function that reads the whole `wav` file content into a numpy array
def wave_read(filename: Path) -> Tuple[np.ndarray, int]:
    with wave.open(str(filename), 'rb') as f:
        buffer = f.readframes(f.getnframes())
        inter = np.frombuffer(buffer, dtype=f'int{f.getsampwidth()*8}')
        return np.reshape(inter, (-1, f.getnchannels())), f.getframerate()
if __name__ == '__main__':
    # Play all files in the current directory
    for wav_file in Path().glob('*.wav'):
        print(f"Playing {wav_file}")
        data, fs = wave_read(wav_file)
        sd.play(data, samplerate=fs, blocking=True)
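One caveat with building the dtype from the sample width: 'int24' is not a NumPy dtype, and 8-bit WAV samples are unsigned. The sketch below shows one possible way to cover those widths. The function name pcm_bytes_to_array is made up for this example, and it assumes plain little-endian PCM data as returned by wave's readframes.

```python
import struct

import numpy as np


def pcm_bytes_to_array(buffer: bytes, sampwidth: int) -> np.ndarray:
    """Decode little-endian PCM bytes into a 1D integer array (interleaved channels)."""
    if sampwidth == 1:
        # 8-bit WAV samples are unsigned
        return np.frombuffer(buffer, dtype=np.uint8)
    if sampwidth in (2, 4):
        return np.frombuffer(buffer, dtype=f'<i{sampwidth}')
    if sampwidth == 3:
        # No 24-bit dtype exists: combine the three bytes by hand, then sign-extend
        b = np.frombuffer(buffer, dtype=np.uint8).reshape(-1, 3).astype(np.int32)
        values = b[:, 0] | (b[:, 1] << 8) | (b[:, 2] << 16)
        return np.where(values >= (1 << 23), values - (1 << 24), values)
    raise ValueError(f'unsupported sample width: {sampwidth}')


s8 = pcm_bytes_to_array(bytes([0, 128, 255]), 1)
s16 = pcm_bytes_to_array(struct.pack('<2h', -2, 300), 2)
s24 = pcm_bytes_to_array(b'\x01\x00\x00\xff\xff\xff\x00\x00\x80', 3)
```

Inside wave_read, the np.frombuffer call could be swapped for pcm_bytes_to_array(buffer, f.getsampwidth()) if files with other bit depths are expected.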

How to play a stereo aiff file with pyaudio?

This plays mono aifc files, but for any stereo files I get a loud blast of static:
import pyaudio
import aifc
CHUNK = 1024
wf = aifc.open('C:\\path_to_file.aiff', 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
data = wf.readframes(CHUNK)
while data:  # readframes returns an empty result at end of file
    stream.write(data)
    data = wf.readframes(CHUNK)
stream.stop_stream()
stream.close()
p.terminate()
The stereo file I am testing with: https://archive.org/details/TestAifAiffFile
I'm on Windows 7, in case that matters.
Swapping every other sample does the trick. Load the whole file into data, then do
import numpy
a = numpy.frombuffer(data, dtype='<i2').copy()  # frombuffer is read-only, so copy
temp = a[1::2].copy()  # copy first; a plain slice is a view that would change with a
a[1::2] = a[::2]
a[::2] = temp
Then play a as though it were a string of audio samples rather than a NumPy array (for example, by writing a.tobytes() to the stream). I've tested it on two different AIFF files, and in both cases it preserves both channels and plays correctly.
This probably works because the file has the opposite byte order to what PyAudio expects.
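If byte order really is the cause, a more direct fix would be to swap the bytes of each sample rather than swapping neighbouring samples. The sketch below assumes 16-bit samples, that AIFF data arrives big-endian, and that the output stream expects the machine's native byte order. The function name aiff_frames_to_native is made up for this example.

```python
import array
import struct
import sys


def aiff_frames_to_native(data: bytes) -> bytes:
    """Convert 16-bit big-endian PCM bytes to the machine's native byte order."""
    samples = array.array('h')
    samples.frombytes(data)
    if sys.byteorder == 'little':
        # The bytes were big-endian, so swap each 2-byte sample in place
        samples.byteswap()
    return samples.tobytes()


big_endian = struct.pack('>3h', 1, -2, 300)
native = aiff_frames_to_native(big_endian)
```

In the playback loop this would be applied to each chunk, i.e. stream.write(aiff_frames_to_native(data)).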
I tried your code on Linux, and I also get a horrible noise.
The aifc module seems to read the file correctly, I converted it to a NumPy array and this looks fine:
import numpy as np
data = wf.readframes(wf.getnframes())
sig = np.frombuffer(data, dtype='<i2').reshape(-1, wf.getnchannels())
So I guess the problem is in PyAudio or in your usage of it.
I don't know a solution but I can offer you an alternative: Use soundfile and sounddevice.
import soundfile as sf
import sounddevice as sd
data, fs = sf.read('02DayIsDone.aif')
sd.play(data, fs, blocking=True)
Instead of doing tricky things with the byte string, you could also consider the audioread package instead of aifc/wave/... It decodes your audio file using whichever backend is available and, regardless of the file format, always returns buffers of 16-bit little-endian signed integer PCM data that you can feed to PyAudio.

How to adjust the volume of an audio file from potentiometer readings sent serially by an Arduino board, using Python scripts

I want to adjust the volume of an MP3 file while it is playing by turning a potentiometer. I am reading the potentiometer signal over serial from an Arduino board with Python scripts. With the pydub library I am able to read the file, but I cannot adjust its volume while it is playing. This is the code I came up with after a long search.
I have included only the pydub part. For your information, I'm using VLC media player to change the volume.
>>> from pydub import AudioSegment
>>> song = AudioSegment.from_wav(r"C:\Users\RAJU\Desktop\En_Iniya_Ponnilave.wav")
While the file is playing, I cannot adjust the value. Please, could someone explain how to do it?
First you need to decode your audio signal to raw audio and split it into X frames. You can then manipulate the audio, and at every frame you can change the volume, the pitch, the speed, etc.
To change the volume you just need to multiply your raw audio vector by a factor (this can be derived from your potentiometer signal).
This factor can differ depending on whether your vector is in short-int or floating-point format!
One way to get raw audio data from WAV files in Python is the wave library:
import wave
import numpy
spf = wave.open('wavfile.wav', 'rb')
# Extract raw audio from the WAV file
signal = spf.readframes(-1)
decoded = numpy.frombuffer(signal, dtype='float32')
Now you can multiply the decoded vector by a factor. For example, to increase the level by 10 dB, compute 10^(dB/20); in Python, 10**(10/20) = 3.1623.
newsignal = decoded * 3.1623
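The decibel formula can be wrapped in a small helper, together with one possible mapping from a potentiometer reading to a gain. This is only a sketch: the 0 to 1023 range assumes a 10-bit analog reading as on common Arduino boards, and the -40 dB to 0 dB volume range is an arbitrary choice for illustration.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def pot_to_gain(reading: int, min_db: float = -40.0, max_db: float = 0.0,
                max_reading: int = 1023) -> float:
    """Map a potentiometer reading linearly onto a dB range, then to a gain."""
    reading = max(0, min(max_reading, reading))  # guard against out-of-range values
    db = min_db + (max_db - min_db) * reading / max_reading
    return db_to_gain(db)


gain_plus_10db = db_to_gain(10)
gain_at_max = pot_to_gain(1023)
gain_at_min = pot_to_gain(0)
```

Mapping the knob position to decibels rather than directly to the factor tends to feel more natural, because perceived loudness is roughly logarithmic.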
Now you need to encode the vector again to play the modified audio; you can use "from struct import pack" and PyAudio to do it:
import pyaudio
from struct import pack
pyaud = pyaudio.PyAudio()
stream = pyaud.open(
    format=pyaudio.paFloat32,
    channels=1,
    rate=44100,
    output=True)
EncodeAgain = pack("%df" % len(newsignal), *list(newsignal))
Finally, play the audio. Note that you do this for every frame, in a loop; the processing is fast enough that the latency can be imperceptible.
stream.write(EncodeAgain)
PS: This example is for floating-point format!
Ederwander, I tried coding it as you said, but when packing the data I get all zeros, so nothing is streamed. I understand the problem may be in converting between data types. This is the code I have written. Please look at it and suggest a fix.
import sys
import serial
import time
import os
from pydub import AudioSegment
import wave
from struct import pack
import numpy
import pyaudio
CHUNK = 1024
wf = wave.open(r'C:\Users\RAJU\Desktop\En_Iniya_Ponnilave.wav', 'rb')
# instantiate PyAudio (1)
p = pyaudio.PyAudio()
# open stream (2)
stream = p.open(format = p.get_format_from_width(wf.getsampwidth()),channels = wf.getnchannels(),rate = wf.getframerate(),output = True)
# read data
data_read = wf.readframes(CHUNK)
decoded = numpy.fromstring(data_read, 'int32', sep = '');
data = decoded*3.123
while(1):
    EncodeAgain = struct.pack(h,data)
    stream.write(EncodeAgain)
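A few things stand out in the code above: only one chunk is ever read, the name h passed to struct.pack is undefined, and the samples are decoded as int32 although most WAV files hold 16-bit data. As a hedged sketch of the per-chunk step, assuming 16-bit PCM, the helper below (scale_int16_chunk is a made-up name) scales one chunk and clips it so that loud samples do not wrap around.

```python
import array
import struct
import sys


def scale_int16_chunk(chunk: bytes, gain: float) -> bytes:
    """Scale 16-bit little-endian PCM bytes by gain, clipping to the int16 range."""
    samples = array.array('h')
    samples.frombytes(chunk)
    if sys.byteorder == 'big':
        samples.byteswap()  # WAV data is little-endian
    scaled = array.array(
        'h', (max(-32768, min(32767, round(s * gain))) for s in samples))
    if sys.byteorder == 'big':
        scaled.byteswap()
    return scaled.tobytes()


quiet = struct.pack('<4h', 1000, -1000, 20000, -20000)
louder = struct.unpack('<4h', scale_int16_chunk(quiet, 2.0))
```

In a playback loop, each result of wf.readframes(CHUNK) would be passed through scale_int16_chunk with the current gain before stream.write, and the loop would end when readframes returns no more data.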

Playing a sound from a wave form stored in an array

I'm currently experimenting with generating sounds in Python, and I'm curious how I can take an array representing a waveform (with a sample rate of 44100 Hz) and play it. I'm looking for pure Python here, rather than relying on a library that supports more than just the .wav format.
You can use the sounddevice module. Install it with pip install sounddevice, but you need this first: sudo apt-get install libportaudio2
The absolute basics:
import numpy as np
import sounddevice as sd
sd.play(myarray)
#may need to be normalised like in below example
#myarray must be a numpy array. If not, convert with np.array(myarray)
A few more options:
import numpy as np
import sounddevice as sd
#variables
samplfreq = 100 #the sampling frequency of your data (mine=100Hz, yours=44100)
factor = 10 #incr./decr frequency (speed up / slow down by a factor) (normal speed = 1)
#data
print('..interpolating data')
arr = myarray
#normalise the data to between -1 and 1. If your data wasn't/isn't normalised it will be very noisy when played here
sd.play( arr / np.max(np.abs(arr)), samplfreq*factor)
You should use a library. Writing it all in pure python could be many thousands of lines of code, to interface with the audio hardware!
With a library, e.g. audiere, it will be as simple as this:
import audiere
ds = audiere.open_device()
os = ds.open_array(input_array, 44100)
os.play()
There's also pyglet, pygame, and many others..
Edit: The audiere module mentioned above no longer appears to be maintained, but my advice to rely on a library stays the same. Take your pick of a current project here:
https://wiki.python.org/moin/Audio/
https://pythonbasics.org/python-play-sound/
The reason there aren't many high-level stdlib "batteries included" here is that interactions with the audio hardware can be very platform-dependent.
I think you could look at this list:
http://wiki.python.org/moin/PythonInMusic
It lists many useful tools for working with sound.
To play sound given an array input_array of 16-bit samples, you can use the following, which is a modified example from the PyAudio documentation page:
import pyaudio
# instantiate PyAudio (1)
p = pyaudio.PyAudio()
# open stream (2), 2 is size in bytes of int16
stream = p.open(format=p.get_format_from_width(2),
                channels=1,
                rate=44100,
                output=True)
# play stream (3), blocking call
stream.write(input_array)
# stop stream (4)
stream.stop_stream()
stream.close()
# close PyAudio (5)
p.terminate()
Here's a snippet of code taken from this stackoverflow answer, with an added example to play a numpy array (scipy loaded sound file):
from wave import open as waveOpen
from ossaudiodev import open as ossOpen
from ossaudiodev import AFMT_S16_NE
import numpy as np
from scipy.io import wavfile
# from https://stackoverflow.com/questions/307305/play-a-sound-with-python/311634#311634
# run this: sudo modprobe snd-pcm-oss
s = waveOpen('example.wav','rb')
(nc,sw,fr,nf,comptype, compname) = s.getparams( )
dsp = ossOpen('/dev/dsp','w')
print(nc,sw,fr,nf,comptype, compname)
_, snp = wavfile.read('example.wav')
print(snp)
dsp.setparameters(AFMT_S16_NE, nc, fr)
data = s.readframes(nf)
s.close()
dsp.write(snp.tobytes())
dsp.write(data)
dsp.close()
Basically you can just call the tobytes() method; the returned bytes object can then be played.
P.S. This method is super fast.
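For the "pure Python" part of the question, the standard library can at least turn a waveform into 16-bit PCM bytes and wrap them in a WAV container; actually sending them to the sound card still needs one of the libraries above. The sketch below assumes the waveform is a sequence of floats in the range -1 to 1, and floats_to_int16_bytes is a made-up helper name.

```python
import io
import math
import struct
import wave


def floats_to_int16_bytes(samples):
    """Clamp floats to [-1, 1] and pack them as little-endian 16-bit PCM."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f'<{len(ints)}h', *ints)


rate = 44100
# A tenth of a second of a 440 Hz sine wave as example input
tone = [math.sin(2 * math.pi * 440 * i / rate) for i in range(rate // 10)]
pcm = floats_to_int16_bytes(tone)

# Wrap the raw samples in a WAV container held in memory
buf = io.BytesIO()
with wave.open(buf, 'wb') as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(rate)
    wf.writeframes(pcm)
wav_bytes = buf.getvalue()
```

The pcm bytes are in the same format that stream.write and dsp.write consume in the examples above, for a mono 16-bit stream at 44100 Hz.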
