How can I convert a .wav to .mp3 in-memory? - python

I have a numpy array from a some.npy file that contains data of an audio file that is encoded in the .wav format.
The some.npy was created with sig, sr = librosa.load(some_wav_file, sr=22050) and np.save('some.npy', sig).
I want to convert this numpy array as if its content was encoded with .mp3 instead.
Unfortunately, I am restricted to the use of in-memory file objects for two reasons:
1. I have many .npy files. They are cached in advance and it would be highly inefficient to have that much "real" I/O when actually running the application.
2. Conflicting access rights of people who are executing the application on a server.
First, I was looking for a way to convert the data in the numpy array directly, but there seems to be no library function. So is there a simple way to achieve this with in-memory file objects?
NOTE: I found the question "How to convert MP3 to WAV in Python"; its solution could theoretically be adapted, but it is not in-memory.

You can read and write memory using BytesIO, like this:
import io
# Create "in-memory" buffer
memoryBuff = io.BytesIO()
And you can read and write MP3 using the pydub module:
from pydub import AudioSegment
# Read a file in
sound = AudioSegment.from_wav('stereo_file.wav')
# Write to memory buffer as MP3
sound.export(memoryBuff, format='mp3')
Your MP3 data is now available at memoryBuff.getvalue()
You can convert between AudioSegments and Numpy arrays using this answer.
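As a small illustration of the in-memory idea, here is a sketch that uses only the standard library (wave and io) to wrap raw PCM bytes in a WAV container without touching the disk. The helper name pcm_to_wav_bytes, the 22050 Hz rate and the silent test signal are assumptions made for the example, not part of the answer above. The MP3 encoding step itself would still go through pydub as shown.

```python
import io
import wave


def pcm_to_wav_bytes(pcm_bytes, sample_rate=22050, n_channels=1, sample_width=2):
    """Wrap raw integer PCM bytes in a WAV container held entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(n_channels)
        wav_file.setsampwidth(sample_width)   # bytes per sample, 2 = 16 bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    # wave does not close a file object it did not open, so the buffer is still usable
    return buffer.getvalue()


# One second of 16-bit mono silence as stand-in audio data (assumption)
wav_bytes = pcm_to_wav_bytes(b"\x00\x00" * 22050)

# Read it back, again without any file on disk
with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
    frame_rate = wav_file.getframerate()
    n_frames = wav_file.getnframes()
    print(frame_rate, n_frames)
```

The resulting wav_bytes can be wrapped in io.BytesIO again and handed to any function that accepts a file-like object.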

I finally found a working solution. This is what I wanted.
import io
import numpy as np
from pydub import AudioSegment

wav = np.load('some.npy')
with io.BytesIO() as inmemoryfile:
    compression_format = 'mp3'
    n_channels = 2 if wav.shape[0] == 2 else 1  # stereo and mono files
    # my_sample_rate is the sampling rate the array was created with.
    # Note: pydub interprets the bytes as integer PCM of the given sample_width.
    AudioSegment(wav.tobytes(), frame_rate=my_sample_rate, sample_width=wav.dtype.itemsize,
                 channels=n_channels).export(inmemoryfile, format=compression_format)
    wav = np.array(AudioSegment.from_file_using_temporary_files(inmemoryfile)
                   .get_array_of_samples())
There exists a wrapper package (audiosegment) with which one could convert the last line to:
wav = audiosegment.AudioSegment.to_numpy_array(AudioSegment.from_file_using_temporary_files(inmemoryfile))

Related

How to Convert MP3 audio file to 8, 24 or 32 bits in Python?

I have mp3 files that I'm trying to convert to 8, 24 or 32 bits. I was trying to do it with the AudioSegment module using the following:
audio = AudioSegment.from_mp3('audio.mp3')
audio.resample(sample_width=1)
which returns an error: AttributeError: 'AudioSegment' object has no attribute 'resample'
(I found resample(sample_rate_Hz=None, sample_width=None, channels=None, console_output=False) in AudioSegment documentation.)
I tried to convert mp3 files to wav and then change sample width using the following function:
def quantify(audio, k):
    # audio: wave object, k: new sample width
    sw = audio.getsampwidth()
    typ = {1: np.int8, 2: np.int16, 4: np.int32}
    data = get_data(audio)  # returns numpy array
    newdata = data.astype(typ.get(k))
    # I can then write this new data to a new wav file
    return newdata
But this way I think it's added work to convert from mp3 to wav and back to mp3 again. Also, with this method I can't convert to 24 bits, since NumPy has no data type for that from what I've read (I don't know if it's possible with AudioSegment either). Any suggestions?
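One possible way around the missing 24-bit dtype, sketched under assumptions: widen the samples to 32 bits, shift them into 24-bit range, and keep only the three low bytes of each little-endian value. The function name int16_to_pcm24_bytes and the sample values are made up for the illustration. The standard wave module accepts a sample width of 3 bytes, so the packed bytes can be written as a 24-bit WAV.

```python
import io
import wave

import numpy as np


def int16_to_pcm24_bytes(samples):
    """Pack int16 samples as little-endian 24-bit PCM bytes."""
    # Widen to 32 bits and shift so the 16-bit value fills the top of a 24-bit word
    widened = (samples.astype(np.int32) << 8).astype("<i4")
    # View each int32 as 4 little-endian bytes and drop the most significant one
    as_bytes = widened.view(np.uint8).reshape(-1, 4)
    return as_bytes[:, :3].tobytes()


samples = np.array([0, 1, -1, 32767, -32768], dtype=np.int16)
packed = int16_to_pcm24_bytes(samples)

# Write the packed samples as a 24-bit mono WAV held in memory
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(3)      # 3 bytes per sample = 24 bit
    wav_file.setframerate(44100)  # assumed sample rate
    wav_file.writeframes(packed)
wav_bytes = buffer.getvalue()
```

This only changes the container's sample width; it adds no real precision to audio that was decoded from an MP3.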

Couldn't store audio as MP3 file using soundfile

My intention is to process an MP3 file using the Librosa library (normalize volume, trim silences, etc.). However, as Librosa doesn't support the MP3 format, I use the audioread library to load the audio. I could not find a function in audioread that writes the file back, so I loaded soundfile and saved the processed file as WAV. Unfortunately, I am able to save only one channel (mono), not stereo.
Kindly advise: what library can I use to load and write an MP3 file and process it using Librosa? Or how can I write both channels into WAV or MP3 using soundfile?
import audioread, librosa
import soundfile as sf
filename="../sounds/music.mp3"
audio_file = audioread.audio_open(filename)
audio, sr = librosa.load(audio_file, sr= 44100)
clip = librosa.effects.trim(audio, top_db= 10)
sf.write('../sounds/output/out.wav', clip[0], sr, 'PCM_24')
Soundfile supports multichannel saving just fine. However, Librosa works with audio arrays where the dimensions are (N_channels, N_samples), while Soundfile works with (N_samples, N_channels). Note also that librosa.load downmixes to mono by default, so pass mono=False to keep both channels. You can use numpy to transpose from one layout to the other (clip[0] is the trimmed audio returned by librosa.effects.trim):
sf.write('../sounds/output/out.wav', np.transpose(clip[0]), sr, 'PCM_24')
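To make the two layouts concrete, here is a small numpy-only sketch. The stereo signal is synthetic stand-in data (an assumption), so no audio file or audio library is needed to run it.

```python
import numpy as np

# Stand-in for stereo audio in librosa's layout: (n_channels, n_samples)
n_samples = 1000
left = np.linspace(-1.0, 1.0, n_samples, dtype=np.float32)
right = -left
librosa_style = np.stack([left, right])
print(librosa_style.shape)       # (2, 1000)

# soundfile expects (n_samples, n_channels), so swap the two axes
soundfile_style = librosa_style.T
print(soundfile_style.shape)     # (1000, 2)

# A mono signal is 1-D in both libraries, so transposing it changes nothing
mono = left
print(mono.T.shape)              # (1000,)
```

Because a 1-D array is unchanged by the transpose, the same sf.write call can be used for mono and stereo input.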

how to append audio frames to wav file python

I have a stream of PCM audio frames coming into my Python code. Is there a way to write a frame so that it is appended to an existing .wav file? What I have tried is taking two wav files: I read the data from one and write it to the other, existing wav file.
import numpy
import wave
import scipy.io.wavfile
with open('testing_data.wav', 'rb') as fd:
    contents = fd.read()
contents1=bytearray(contents)
numpy_data = numpy.array(contents1, dtype=float)
scipy.io.wavfile.write("whatstheweatherlike.wav", 8000, numpy_data)
The data is getting appended to the existing wav file, but the wav file is corrupted when I try to play it in a media player.
With the wave library you can do that with something like:
import wave

audiofile1 = "youraudiofile1.wav"
audiofile2 = "youraudiofile2.wav"
concatenated_file = "youraudiofile3.wav"

frames = []
# Read the parameters and all frames of the first file
wave0 = wave.open(audiofile1, 'rb')
frames.append([wave0.getparams(), wave0.readframes(wave0.getnframes())])
wave0.close()
# Read the second file the same way
wave1 = wave.open(audiofile2, 'rb')
frames.append([wave1.getparams(), wave1.readframes(wave1.getnframes())])
wave1.close()
# Write both sets of frames using the first file's parameters
result = wave.open(concatenated_file, 'wb')
result.setparams(frames[0][0])
result.writeframes(frames[0][1])
result.writeframes(frames[1][1])
result.close()
And the order of concatenation is exactly the order of the writes here:
result.writeframes(frames[0][1]) #audiofile1
result.writeframes(frames[1][1]) #audiofile2
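For the streaming case in the question, the same idea can be wrapped in a helper that appends new frames to one existing file. This is only a sketch: append_frames, the temporary path and the 8000 Hz mono format are assumptions for the demo. It rewrites the whole file on every call so that the header stays consistent with the data length, which is simple but not efficient for long streams.

```python
import os
import tempfile
import wave


def append_frames(path, new_frames):
    """Append raw PCM frames to an existing WAV file by rewriting it."""
    with wave.open(path, "rb") as reader:
        params = reader.getparams()
        old_frames = reader.readframes(reader.getnframes())
    with wave.open(path, "wb") as writer:
        writer.setparams(params)
        # The frame count in the header is corrected when the writer closes
        writer.writeframes(old_frames + new_frames)


# Demo on a temporary file (hypothetical path and format)
demo_path = os.path.join(tempfile.mkdtemp(), "stream.wav")
with wave.open(demo_path, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(8000)
    writer.writeframes(b"\x00\x00" * 8000)     # first second of audio

append_frames(demo_path, b"\x01\x00" * 4000)   # half a second more

with wave.open(demo_path, "rb") as reader:
    total_frames = reader.getnframes()
    print(total_frames)
```

Appending raw bytes to the end of the file, as in the question, leaves the sizes stored in the header unchanged, which is a likely reason why players reject the result.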

How to convert numpy array to bytes object without save audio file on disk?

I am now learning to build a TTS project based on Tacotron-2.
Here, the original code in save_wav(wav, path, sr) function has a step to save a numpy array to .wav file by using
wav *= 32767 / max(0.01, np.max(np.abs(wav)))
scipy.io.wavfile.write(path, hparams.sample_rate, wav.astype(np.int16))
However, after obtaining a numpy array using wav *= 32767 / max(0.01, np.max(np.abs(wav))), I want to convert it to a .mp3 file so that it will be easier to send it back as a streaming response.
Right now, I can convert a .wav bytes object to a .mp3 file, but the problem is that I don't know how to convert the numpy array to a .wav bytes object.
I searched and found that I apparently need to set a header for the numpy array, but almost all the posts I looked at use modules like scipy.io.wave and audioop, which first save the numpy array to a .wav file and then read it back with open('filename.wav', 'rb').
(This is the link for the scipy.io.wavfile.write function, whose filename param should be a string or an open file handle, which, from my understanding, means the generated .wav file will be saved on disk.)
Could anyone give any suggestion on how to achieve this?
Use io.BytesIO
There is a simpler and more convenient solution: wrap a bytes buffer in io.BytesIO, which gives it a file-like I/O interface. We can then write to it and read from it like a file:
import io
from scipy.io.wavfile import write
bytes_wav = bytes()
byte_io = io.BytesIO(bytes_wav)
write(byte_io, <audio_sr>, <audio_numpy_array>)
result_bytes = byte_io.read()
Use your data sample rate and values array instead of <audio_sr> and <audio_numpy_array>.
You can operate with result_bytes as bytes of .wav file (as required).
P.S. Also check this simple gist of how to perform values array -> bytes -> values array for wav file.
I finally solved this problem by modifying scipy.io.wavfile.write and pydub's audio_segment.py and creating new modules based on them.
Besides, when you want to operate on wave/mp3 bytes without saving them as a .wav/.mp3 file (normally done through some handy API or Python package), you have to add the header manually. It is not too tough a task if you look into the source code of those excellent packages.
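To give an idea of what adding the header manually involves, here is a minimal sketch of the canonical 44-byte header for uncompressed PCM, built with struct. It is not the code from the modified modules mentioned above. The function name build_wav_header, the 22050 Hz rate and the four sample values are assumptions for the example. The result is checked by parsing it with the standard wave module.

```python
import io
import struct
import wave


def build_wav_header(n_bytes, sample_rate, n_channels=1, sample_width=2):
    """Return the 44-byte header of a canonical PCM WAV file."""
    byte_rate = sample_rate * n_channels * sample_width
    block_align = n_channels * sample_width
    return struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF", 36 + n_bytes, b"WAVE",  # RIFF chunk: size excludes the first 8 bytes
        b"fmt ", 16,                     # fmt chunk with 16 bytes of payload
        1,                               # format tag 1 = uncompressed PCM
        n_channels, sample_rate, byte_rate, block_align,
        sample_width * 8,                # bits per sample
        b"data", n_bytes,                # data chunk: size of the PCM payload
    )


# Four 16-bit mono samples as stand-in data
pcm = struct.pack("<4h", 0, 1000, -1000, 0)
wav_bytes = build_wav_header(len(pcm), 22050) + pcm

# The standard library parses the hand-built bytes like a normal WAV file
with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
    frame_rate = reader.getframerate()
    n_frames = reader.getnframes()
    print(frame_rate, n_frames)
```

For an int16 numpy array, the payload would be the array's little-endian bytes (for example from tobytes()), prefixed with this header.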

pydub accessing the sampling rate(Hz) and the audio signal from an mp3 file

Just found out this interesting python package pydub which converts any audio file to mp3, wav, etc.
As far as I have read its documentation, the process is as follows:
read the mp3 audio file using from_mp3()
creates a wav file using export().
Just curious if there is a way to access the sampling rate and the audio signal(of 1-dimensional array, supposing it is a mono) directly from the mp3 file without converting it to a wav file. I am working on thousands of audio files and it might be expensive to convert all of them to wav file.
If you aren't interested in the actual audio content of the file, you may be able to use pydub.utils.mediainfo():
>>> from pydub.utils import mediainfo
>>> info = mediainfo("/path/to/file.mp3")
>>> print(info['sample_rate'])
44100
>>> print(info['channels'])
1
This uses libav's avprobe utility (or FFmpeg's ffprobe, whichever is installed), and returns all kinds of info. I suggest giving it a try :)
Should be much faster than opening each mp3 using AudioSegment.from_mp3(…)
In pydub, frame_rate is the sample rate, so you can read it like this:
from pydub import AudioSegment
filename = "hoge.wav"
myaudio = AudioSegment.from_file(filename)
print(myaudio.frame_rate)
