NumPy array holding .wav audio data for sounddevice - python

I would like to use sounddevice's playrec feature. To start I would like to just get sd.play() to work, I am new to Python and have never worked with NumPy, I have gotten audio to play using pyaudio, but I need the simultaneous play record feature in sounddevice. When I try to play an audio .wav file I get: TypeError: Unsupported data type: 'string288'. I think it has something to do with having to store the .wav file in a numpy array, but I have no idea how to do that. Here is what I have:
import sounddevice as sd
import numpy as np
sd.default.samplerate = 44100
sd.play('test.wav')
sd.wait

The documentation of sounddevice.play() says:
sounddevice.play(data, samplerate=None, mapping=None, blocking=False, loop=False, **kwargs)
where data is an "array-like".
It can't work with an audio file name, as you tried. The audio file has first to be read, and interpreted as a numpy array.
This code should work:
data, fs = sf.read(filename, dtype='float32')
sd.play(data, fs)
You'll find more examples here.

Related

Couldn't store audio as MP3 file using soundfile

My intention is to process MP3 file using Librosa library (normalize volume, trim silences, etc). However, as Librosa doesn't support MP3 format I use audioread library to load audio; however, I could not find the function in audioread that writes back the file, for that purpose I have loaded soundfile and saved processed file into WAV. Unfortunately, I am able to save only one channel (MONO) not Stereo.
Kindly advise, what library can I use to load and write MP3 file and process it using Librosa? or how can I write both channels into WAV or MP3 using soundfile?
import audioread, librosa
import soundfile as sf
filename="../sounds/music.mp3"
audio_file = audioread.audio_open(filename)
audio, sr = librosa.load(audio_file, sr= 44100)
clip = librosa.effects.trim(audio, top_db= 10)
sf.write('../sounds/output/out.wav', clip[0], sr, 'PCM_24')
Soundfile supports multichannel saving just fine. However, Librosa works with audio arrays where the dimensions are: (N_channels, N_samples). Soundfile on the other hand works with: (N_samples, N_channels). You can use numpy to transpose from one format to the other:
sf.write('../sounds/output/out.wav', np.transpose(clip), sr, 'PCM_24')

How can I convert a .wav to .mp3 in-memory?

I have a numpy array from a some.npy file that contains data of an audio file that is encoded in the .wav format.
The some.npy was created with sig = librosa.load(some_wav_file, sr=22050) and np.save('some.npy', sig).
I want to convert this numpy array as if its content was encoded with .mp3 instead.
Unfortunately, I am restricted to the use of in-memory file objects for two reasons.
I have many .npy files. They are cached in advance and it would be highly inefficient to have that much "real" I/O when actually running the application.
Conflicting access rights of people who are executing the application on a server.
First, I was looking for a way to convert the data in the numpy array directly, but there seems to be no library function. So is there a simple way to achieve this with in-memory file objects?
NOTE: I found this question How to convert MP3 to WAV in Python and its solution could be theoretically adapted but this is not in-memory.
You can read and write memory using BytesIO, like this:
import BytesIO
# Create "in-memory" buffer
memoryBuff = io.BytesIO()
And you can read and write MP3 using pydub module:
from pydub import AudioSegment
# Read a file in
sound = AudioSegment.from_wav('stereo_file.wav')
# Write to memory buffer as MP3
sound.export(memoryBuff, format='mp3')
Your MP3 data is now available at memoryBuff.getvalue()
You can convert between AudioSegments and Numpy arrays using this answer.
I finally found a working solution. This is what I wanted.
from pydub import AudioSegment
wav = np.load('some.npy')
with io.BytesIO() as inmemoryfile:
compression_format = 'mp3'
n_channels = 2 if wav.shape[0] == 2 else 1 # stereo and mono files
AudioSegment(wav.tobytes(), frame_rate=my_sample_rate, sample_width=wav.dtype.itemsize,
channels=n_channels).export(inmemoryfile, format=compression_format)
wav = np.array(AudioSegment.from_file_using_temporary_files(inmemoryfile)
.get_array_of_samples())
There exists a wrapper package (audiosegment) with which one could convert the last line to:
wav = audiosegment.AudioSegment.to_numpy_array(AudioSegment.from_file_using_temporary_files(inmemoryfile))

`st.audio` does not take a numpy array?

The streamlit docs (https://docs.streamlit.io/en/stable/api.html#streamlit.audio) state that streamlit.audio can take a numpy ndarray containing raw sample data and display an audio player. I tried this as follows on a local host:
import streamlit as st
import soundfile
data, sr = soundfile.read('test.wav')
st.audio(data)
It successfully displays an audio player and throws no errors, however there is no available sound.
Am I doing anything wrong or are the docs incorrect?
Is your code correct? It appears you unpack data and sr to variables, but then you pass audio to the np.array function. Where does audio come from, or should that be data instead?

How to get numpy arrays output of .wav file format

I am new to Python, and I am trying to train my audio voice recognition model. I want to read a .wav file and get output of that .wav file into Numpy arrays. How can I do that?
In keeping with #Marco's comment, you can have a look at the Scipy library and, in particular, at scipy.io.
from scipy.io import wavfile
To read your file ('filename.wav'), simply do
output = wavfile.read('filename.wav')
This will output a tuple (which I named 'output'):
output[0], the sampling rate
output[1], the sample array you want to analyze
This is possible with a few lines with wave (built in) and numpy (obviously). You don't need to use librosa, scipy or soundfile. The latest gave me problems reading wav files and it's the whole reason I'm writting here now.
import numpy as np
import wave
# Start opening the file with wave
with wave.open('filename.wav') as f:
# Read the whole file into a buffer. If you are dealing with a large file
# then you should read it in blocks and process them separately.
buffer = f.readframes(f.getnframes())
# Convert the buffer to a numpy array by checking the size of the sample
# with in bytes. The output will be a 1D array with interleaved channels.
interleaved = np.frombuffer(buffer, dtype=f'int{f.getsampwidth()*8}')
# Reshape it into a 2D array separating the channels in columns.
data = np.reshape(interleaved, (-1, f.getnchannels()))
I like to pack it into a function that returns the sampling frequency and works with pathlib.Path objects. In this way it can be played using sounddevice
# play_wav.py
import sounddevice as sd
import numpy as np
import wave
from typing import Tuple
from pathlib import Path
# Utility function that reads the whole `wav` file content into a numpy array
def wave_read(filename: Path) -> Tuple[np.ndarray, int]:
with wave.open(str(filename), 'rb') as f:
buffer = f.readframes(f.getnframes())
inter = np.frombuffer(buffer, dtype=f'int{f.getsampwidth()*8}')
return np.reshape(inter, (-1, f.getnchannels())), f.getframerate()
if __name__ == '__main__':
# Play all files in the current directory
for wav_file in Path().glob('*.wav'):
print(f"Playing {wav_file}")
data, fs = wave_read(wav_file)
sd.play(data, samplerate=fs, blocking=True)

Playing a sound from a wave form stored in an array

I'm currently experimenting with generating sounds in Python, and I'm curious how I can take a n array representing a waveform (with a sample rate of 44100 hz), and play it. I'm looking for pure Python here, rather than relying on a library that supports more than just .wav format.
or use the sounddevice module. Install using pip install sounddevice, but you need this first: sudo apt-get install libportaudio2
absolute basic:
import numpy as np
import sounddevice as sd
sd.play(myarray)
#may need to be normalised like in below example
#myarray must be a numpy array. If not, convert with np.array(myarray)
A few more options:
import numpy as np
import sounddevice as sd
#variables
samplfreq = 100 #the sampling frequency of your data (mine=100Hz, yours=44100)
factor = 10 #incr./decr frequency (speed up / slow down by a factor) (normal speed = 1)
#data
print('..interpolating data')
arr = myarray
#normalise the data to between -1 and 1. If your data wasn't/isn't normalised it will be very noisy when played here
sd.play( arr / np.max(np.abs(arr)), samplfreq*factor)
You should use a library. Writing it all in pure python could be many thousands of lines of code, to interface with the audio hardware!
With a library, e.g. audiere, it will be as simple as this:
import audiere
ds = audiere.open_device()
os = ds.open_array(input_array, 44100)
os.play()
There's also pyglet, pygame, and many others..
Edit: audiere module mentioned above appears no longer maintained, but my advice to rely on a library stays the same. Take your pick of a current project here:
https://wiki.python.org/moin/Audio/
https://pythonbasics.org/python-play-sound/
The reason there's not many high-level stdlib "batteries included" here is because interactions with the audio hardware can be very platform-dependent.
I think you may look this list
http://wiki.python.org/moin/PythonInMusic
It list many useful tools for working with sound.
To play sound given array input_array of 16 bit samples. This is modified example from pyadio documentation page
import pyaudio
# instantiate PyAudio (1)
p = pyaudio.PyAudio()
# open stream (2), 2 is size in bytes of int16
stream = p.open(format=p.get_format_from_width(2),
channels=1,
rate=44100,
output=True)
# play stream (3), blocking call
stream.write(input_array)
# stop stream (4)
stream.stop_stream()
stream.close()
# close PyAudio (5)
p.terminate()
Here's a snippet of code taken from this stackoverflow answer, with an added example to play a numpy array (scipy loaded sound file):
from wave import open as waveOpen
from ossaudiodev import open as ossOpen
from ossaudiodev import AFMT_S16_NE
import numpy as np
from scipy.io import wavfile
# from https://stackoverflow.com/questions/307305/play-a-sound-with-python/311634#311634
# run this: sudo modprobe snd-pcm-oss
s = waveOpen('example.wav','rb')
(nc,sw,fr,nf,comptype, compname) = s.getparams( )
dsp = ossOpen('/dev/dsp','w')
print(nc,sw,fr,nf,comptype, compname)
_, snp = wavfile.read('example.wav')
print(snp)
dsp.setparameters(AFMT_S16_NE, nc, fr)
data = s.readframes(nf)
s.close()
dsp.write(snp.tobytes())
dsp.write(data)
dsp.close()
Basically you can just call the tobytes() method; the returned bytearray then can be played.
P.S. this method is supa fast

Categories