Couldn't store audio as MP3 file using soundfile - python

My intention is to process an MP3 file using the Librosa library (normalize volume, trim silences, etc.). However, as Librosa doesn't support the MP3 format, I use the audioread library to load the audio. I could not find a function in audioread that writes the file back, so I used soundfile to save the processed audio as WAV. Unfortunately, I am only able to save one channel (mono), not stereo.
Kindly advise: what library can I use to load and write an MP3 file and process it using Librosa? Or how can I write both channels into WAV or MP3 using soundfile?
import audioread, librosa
import soundfile as sf
filename="../sounds/music.mp3"
audio_file = audioread.audio_open(filename)
audio, sr = librosa.load(audio_file, sr= 44100)
clip = librosa.effects.trim(audio, top_db= 10)
sf.write('../sounds/output/out.wav', clip[0], sr, 'PCM_24')

Soundfile supports multichannel saving just fine. However, Librosa works with audio arrays of shape (N_channels, N_samples), while Soundfile expects (N_samples, N_channels). Note also that librosa.load() downmixes to mono by default, so pass mono=False when loading to keep both channels. librosa.effects.trim() returns a tuple (trimmed_audio, index), so transpose its first element with numpy:
import numpy as np
sf.write('../sounds/output/out.wav', np.transpose(clip[0]), sr, 'PCM_24')
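The layout conversion can be sketched with NumPy alone. The helper name to_soundfile_layout is hypothetical (it is not part of Librosa or Soundfile), and a synthetic array stands in for audio loaded with mono=False:

```python
import numpy as np


def to_soundfile_layout(audio):
    """Convert Librosa-style audio to the layout soundfile.write() expects.

    Hypothetical helper: mono arrays of shape (n_samples,) pass through,
    multichannel arrays of shape (n_channels, n_samples) are transposed.
    """
    audio = np.asarray(audio)
    if audio.ndim == 1:
        return audio
    return audio.T


# Synthetic stand-in for stereo audio in Librosa's layout.
stereo = np.zeros((2, 1000), dtype=np.float32)
frames = to_soundfile_layout(stereo)
print(frames.shape)  # (1000, 2)
```

Handling the 1-D case separately means the same call works whether the file was loaded as mono or stereo.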

Related

How can I set an audio file for a video clip with moviepy without re-encoding

So far I tried:
from moviepy.editor import *
videoclip = VideoFileClip("filename.mp4")
audioclip = AudioFileClip("audioname.mp3")
new_audioclip = CompositeAudioClip([audioclip])
videoclip.audio = new_audioclip
videoclip.write_videofile("new_filename.mp4")
But it takes a very long time.
I'd like to do it without re-encoding. I would also prefer opening the video or audio clip from bytes in moviepy.
One way to do it is to use ffmpeg_merge_video_audio from MoviePy's ffmpeg_tools module.
ffmpeg_merge_video_audio merges a video file and an audio file into a single output movie file.
By default the merging is performed without re-encoding.
Code sample:
from moviepy.video.io import ffmpeg_tools
ffmpeg_tools.ffmpeg_merge_video_audio("filename.mp4", "audioname.mp3", 'new_filename.mp4') # Merge audio and video without re-encoding
Note:
As far as I know, it's not possible to do it "from bytes" using MoviePy.
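For reference, a stream copy with the ffmpeg command-line tool looks roughly like the sketch below. The helper build_merge_command is hypothetical, and this is not necessarily the exact command MoviePy runs. It assumes ffmpeg is on the PATH and that the output container accepts both input codecs.

```python
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that copies both streams without re-encoding."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", video_path,        # input 0: video
        "-i", audio_path,        # input 1: audio
        "-map", "0:v:0",         # take the first video stream of input 0
        "-map", "1:a:0",         # take the first audio stream of input 1
        "-c:v", "copy",          # copy video packets as-is
        "-c:a", "copy",          # copy audio packets as-is
        output_path,
    ]


cmd = build_merge_command("filename.mp4", "audioname.mp3", "new_filename.mp4")
print(" ".join(cmd))
# To actually run it (requires ffmpeg to be installed):
# subprocess.run(cmd, check=True)
```

Because both codecs are set to copy, ffmpeg only remuxes the packets, which is why this is much faster than write_videofile.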

How can I convert a .wav to .mp3 in-memory?

I have a numpy array from a some.npy file that contains data of an audio file that is encoded in the .wav format.
The some.npy was created with sig = librosa.load(some_wav_file, sr=22050) and np.save('some.npy', sig).
I want to convert this numpy array as if its content was encoded with .mp3 instead.
Unfortunately, I am restricted to the use of in-memory file objects for two reasons.
I have many .npy files. They are cached in advance and it would be highly inefficient to have that much "real" I/O when actually running the application.
Conflicting access rights of people who are executing the application on a server.
First, I was looking for a way to convert the data in the numpy array directly, but there seems to be no library function. So is there a simple way to achieve this with in-memory file objects?
NOTE: I found this question How to convert MP3 to WAV in Python and its solution could be theoretically adapted but this is not in-memory.
You can read and write memory using BytesIO, like this:
import io
# Create "in-memory" buffer
memoryBuff = io.BytesIO()
And you can read and write MP3 using pydub module:
from pydub import AudioSegment
# Read a file in
sound = AudioSegment.from_wav('stereo_file.wav')
# Write to memory buffer as MP3
sound.export(memoryBuff, format='mp3')
Your MP3 data is now available at memoryBuff.getvalue()
You can convert between AudioSegments and Numpy arrays using this answer.
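One detail of in-memory buffers that is easy to trip over: after writing, the position sits at the end of the buffer, so read() returns nothing until you seek back to the start, whereas getvalue() always returns the full contents. A minimal standard-library sketch (the payload bytes are a placeholder, not real MP3 data):

```python
import io

buffer = io.BytesIO()
buffer.write(b"placeholder audio bytes")

at_end = buffer.read()          # position is at the end, so this is empty
everything = buffer.getvalue()  # full contents, independent of position

buffer.seek(0)                  # rewind before handing the buffer to a reader
from_start = buffer.read()
```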
I finally found a working solution. This is what I wanted.
import io
import numpy as np
from pydub import AudioSegment
wav = np.load('some.npy')
with io.BytesIO() as inmemoryfile:
    compression_format = 'mp3'
    n_channels = 2 if wav.shape[0] == 2 else 1  # stereo and mono files
    # my_sample_rate is the rate the audio was loaded with (22050 in the question)
    AudioSegment(wav.tobytes(), frame_rate=my_sample_rate, sample_width=wav.dtype.itemsize,
                 channels=n_channels).export(inmemoryfile, format=compression_format)
    wav = np.array(AudioSegment.from_file_using_temporary_files(inmemoryfile)
                   .get_array_of_samples())
There exists a wrapper package (audiosegment) with which one could convert the last line to:
wav = audiosegment.AudioSegment.to_numpy_array(AudioSegment.from_file_using_temporary_files(inmemoryfile))
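Note that librosa.load returns floating-point samples, typically in the range [-1.0, 1.0], while pydub treats the raw data passed to AudioSegment as integer PCM samples. A hedged sketch of converting to interleaved 16-bit PCM bytes first; the helper float_to_pcm16_bytes is hypothetical, and the (n_channels, n_samples) layout is assumed from Librosa's convention:

```python
import numpy as np


def float_to_pcm16_bytes(audio):
    """Convert float audio in [-1.0, 1.0] to interleaved 16-bit PCM bytes."""
    audio = np.clip(np.asarray(audio, dtype=np.float32), -1.0, 1.0)
    pcm = np.round(audio * 32767.0).astype("<i2")
    if pcm.ndim == 2:
        # (n_channels, n_samples) -> (n_samples, n_channels), so that
        # tobytes() emits the channels interleaved sample by sample.
        pcm = pcm.T
    return pcm.tobytes()


# Two channels, two samples each.
stereo = np.array([[0.0, 1.0], [-1.0, 0.0]], dtype=np.float32)
raw = float_to_pcm16_bytes(stereo)
```

The resulting bytes could then be passed as AudioSegment(raw, frame_rate=my_sample_rate, sample_width=2, channels=n_channels).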

How To Parse Audio Wav File in Tensorflow

I am new to Python, and want to train an audio model. I converted my audio file to .wav format.
How can I parse those .wav audio files into TensorFlow?
You can use Librosa, a sound processing library. You can install it with
pip install librosa
Then,
import librosa
import tensorflow as tf
data, sampling_rate = librosa.load('data/sound.wav')
# for use in tensorflow
data_tensor = tf.convert_to_tensor(data)
What you need is documented in the link below:
https://www.tensorflow.org/api_docs/python/tf/audio/decode_wav
With
tf.audio.decode_wav(
contents, desired_channels=-1, desired_samples=-1, name=None
)
you can decode a 16-bit PCM WAV file to a float tensor. In return, you get a tuple of Tensor objects (audio, sample_rate) where audio is a tensor of type float32, and sample_rate a tensor of type int32.
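To illustrate what decoding 16-bit PCM into floats amounts to, here is a sketch using only the standard-library wave module and NumPy. It is not TensorFlow's implementation; the helper read_pcm16_wav is hypothetical, and the division by 32768 is an assumption about the usual int16-to-float scaling. The result has shape (n_samples, n_channels) and can be passed to tf.convert_to_tensor.

```python
import io
import wave

import numpy as np


def read_pcm16_wav(fileobj):
    """Read a 16-bit PCM WAV into a float32 array of shape (n_samples, n_channels)."""
    with wave.open(fileobj, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        n_channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    samples = np.frombuffer(raw, dtype="<i2").reshape(-1, n_channels)
    return samples.astype(np.float32) / 32768.0, sample_rate


# Build a tiny mono WAV in memory so the example needs no file on disk.
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(np.array([0, 16384, -32768], dtype="<i2").tobytes())
buf.seek(0)

audio, sample_rate = read_pcm16_wav(buf)
```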

NumPy array holding .wav audio data for sounddevice

I would like to use sounddevice's playrec feature. To start, I would like to just get sd.play() to work. I am new to Python and have never worked with NumPy. I have gotten audio to play using pyaudio, but I need the simultaneous play/record feature in sounddevice. When I try to play a .wav audio file I get: TypeError: Unsupported data type: 'string288'. I think it has something to do with having to store the .wav file in a numpy array, but I have no idea how to do that. Here is what I have:
import sounddevice as sd
import numpy as np
sd.default.samplerate = 44100
sd.play('test.wav')
sd.wait
The documentation of sounddevice.play() says:
sounddevice.play(data, samplerate=None, mapping=None, blocking=False, loop=False, **kwargs)
where data is an "array-like".
It can't work with an audio file name, as you tried. The audio file first has to be read and interpreted as a numpy array.
This code should work:
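import sounddevice as sd
import soundfile as sf
data, fs = sf.read('test.wav', dtype='float32')
sd.play(data, fs, blocking=True)  # blocking=True waits until playback finishes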
You'll find more examples here.
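To see what kind of "array-like" sd.play() expects, it can help to build one by hand. The sketch below generates a one-second sine tone as a float32 NumPy array. The tone parameters are arbitrary choices for illustration, and the playback calls are left as comments because they need an audio output device:

```python
import numpy as np

fs = 44100          # sample rate in Hz
duration = 1.0      # seconds
frequency = 440.0   # Hz

# Time axis with one entry per sample, then a quiet sine wave on it.
t = np.arange(int(fs * duration)) / fs
tone = (0.2 * np.sin(2 * np.pi * frequency * t)).astype(np.float32)

# With sounddevice installed and an output device available:
# import sounddevice as sd
# sd.play(tone, fs)
# sd.wait()
```

A 1-D array like this is played as mono; a 2-D array of shape (n_samples, n_channels) would carry several channels.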

pydub accessing the sampling rate(Hz) and the audio signal from an mp3 file

I just found this interesting Python package, pydub, which converts audio files to mp3, wav, etc.
As far as I have read its documentation, the process is as follows:
read the mp3 audio file using from_mp3()
create a wav file using export().
I am curious whether there is a way to access the sampling rate and the audio signal (as a 1-dimensional array, assuming it is mono) directly from the mp3 file without converting it to a wav file. I am working with thousands of audio files, and it might be expensive to convert all of them to wav.
If you aren't interested in the actual audio content of the file, you may be able to use pydub.utils.mediainfo():
>>> from pydub.utils import mediainfo
>>> info = mediainfo("/path/to/file.mp3")
>>> print(info['sample_rate'])
44100
>>> print(info['channels'])
1
This uses FFmpeg's ffprobe (or Libav's avprobe) utility and returns all kinds of info. I suggest giving it a try :)
It should be much faster than opening each mp3 using AudioSegment.from_mp3(…)
In pydub, frame_rate is the sample rate, so you can get it like this:
from pydub import AudioSegment
filename = "hoge.wav"
myaudio = AudioSegment.from_file(filename)
print(myaudio.frame_rate)
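For the signal itself, pydub's get_array_of_samples() returns an array.array of interleaved integer samples. A sketch of turning such an array into a normalized NumPy array; the helper samples_to_float is hypothetical, and a hand-built array.array stands in for real decoded audio:

```python
from array import array

import numpy as np


def samples_to_float(samples, sample_width, channels):
    """Convert interleaved integer samples to floats in [-1.0, 1.0).

    Returns a 1-D array for mono input and an array of shape
    (n_samples, n_channels) otherwise.
    """
    data = np.asarray(samples, dtype=np.float32).reshape(-1, channels)
    full_scale = float(1 << (8 * sample_width - 1))  # 32768 for 16-bit audio
    data = data / full_scale
    return data[:, 0] if channels == 1 else data


# Stand-in for seg.get_array_of_samples() on a 16-bit stereo segment.
interleaved = array("h", [0, 16384, -32768, 8192])
signal = samples_to_float(interleaved, sample_width=2, channels=2)
```

With pydub this would be called as samples_to_float(myaudio.get_array_of_samples(), myaudio.sample_width, myaudio.channels), with myaudio.frame_rate giving the sampling rate.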
