I have small-sized sound files stored in MongoDB as BSON.
Task is to retrieve Binary data from the database, convert it to an appropriate format and send back to the front end.
The problem is with the converting. I have found pydub can be used for this.
My code is as follows
query_param = json_data['retriever']
query_param1 = query_param.replace('"', "")
data = db.soundData
y = data.find_one({'name': query_param1})
s = y['data']  # here I retrieve the binary data
AudioSegment.from_file(s).export(x, format="mp3")
return send_file(x, 'audio/mp3')
The question is about the AudioSegment line, since it does not follow the pattern of
AudioSegment.from_wav("/input/file.wav").export("/output/file.mp3", format="mp3")
and the error 'bytes' object has no attribute 'read' is thrown. Is this achievable with pydub?
AudioSegment.from_file() takes a file path or file-like object as its first argument. Assuming you have the raw bytes of a whole wave file (including the wave headers, not just the audio data), then you can:
import io
s = io.BytesIO(y['data'])
AudioSegment.from_file(s).export(x, format='mp3')
If you only have the bytes of the audio samples you would need to know some metadata about your audio data:
AudioSegment(y['data'], sample_width=???, frame_rate=???, channels=???)
sample_width is the number of bytes in each sample (so for 16-bit/CD audio, you'd use 2)
frame_rate is number of samples/second (aka, sample rate, for CD audio it's 44100)
channels is how many audio streams there are (stereo is 2, mono is 1, etc.)
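The difference between the two cases is whether the metadata travels with the bytes. A stdlib-only sketch (using a made-up 440 Hz test tone as stand-in data) showing that raw sample bytes carry no metadata, while wrapping them in a WAV container records it in the header so pydub can read it unaided:

```python
import io
import math
import struct
import wave

frame_rate = 44100  # CD-quality sample rate

# Raw 16-bit mono samples of a 440 Hz tone -- no headers, so pydub would
# need sample_width=2, frame_rate=44100, channels=1 passed explicitly.
raw = b"".join(
    struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * 440 * n / frame_rate)))
    for n in range(frame_rate // 10)
)

# Wrapping the same bytes in a WAV container stores that metadata in the
# header, which is what AudioSegment.from_file(io.BytesIO(...)) relies on.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)          # 16-bit = 2 bytes per sample
    w.setframerate(frame_rate)
    w.writeframes(raw)
wav_bytes = buf.getvalue()

with wave.open(io.BytesIO(wav_bytes), "rb") as r:
    print(r.getnchannels(), r.getsampwidth(), r.getframerate())  # 1 2 44100
```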
Related
I have mp3 files that I'm trying to convert to 8, 24 or 32 bits, I was trying to do it using AudioSegment module using the following:
audio = AudioSegment.from_mp3('audio.mp3')
audio.resample(sample_width=1)
which returns an error: AttributeError: 'AudioSegment' object has no attribute 'resample'
(I found resample(sample_rate_Hz=None, sample_width=None, channels=None, console_output=False) in AudioSegment documentation.)
I tried to convert mp3 files to wav and then change sample width using the following function:
def quantify(audio, k):
    # audio: wave object, k: new sample width
    sw = audio.getsampwidth()
    typ = {1: np.int8, 2: np.int16, 4: np.int32}
    data = get_data(audio)  # returns a numpy array
    newdata = data.astype(typ.get(k))
    # I can then write this new data to a new wav file
    return newdata
But this way I think it adds work to convert from mp3 to wav and back to mp3 again. Also, using this method I can't convert to 24 bits, since NumPy has no data type for that from what I've read (I don't know if it's possible with AudioSegment either). Any suggestions?
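For the 1-, 2- and 4-byte cases, pydub's AudioSegment.set_sample_width() may already do what the question wants. For 24-bit output specifically, one workaround that avoids NumPy entirely is to widen each 16-bit sample by hand, since Python's int.to_bytes can produce 3-byte integers. A rough stdlib-only sketch (the function name is mine, not a library API):

```python
import struct

def widen_16_to_24(pcm16: bytes) -> bytes:
    """Convert little-endian 16-bit PCM to 24-bit PCM by shifting each
    sample left 8 bits and emitting it as a signed 3-byte integer."""
    out = bytearray()
    for (s,) in struct.iter_unpack("<h", pcm16):
        out += (s << 8).to_bytes(3, "little", signed=True)
    return bytes(out)

samples = struct.pack("<3h", 1, -1, 12345)
print(len(widen_16_to_24(samples)))  # 9 -- three 3-byte samples
```

The resulting bytes would still need to be written into a container (e.g. a WAV with sample width 3) to be playable.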
I have a service that sends text to an external text-to-speech service that returns audio in a response. This is how I access the audio:
res = requests.get(TTS_SERVICE_URL, params={"text":text_to_synth})
bytes_content = io.BytesIO(bytes(res.content))
audio = bytes_content.getvalue()
Now I would like to send multiple lines of text in different requests, receive all the audio content as bytes, merge them into one audio clip, and then play it. Can anyone guide me as to how I would merge the bytes_content into one audio byte stream?
I got this to work; posting the answer here in case someone else faces the same problem. I solved it as follows.
Read the bytes_content into a numpy array using soundfile:
import soundfile as sf

data, samplerate = sf.read(bytes_content)
datas.append(data)
where datas is an empty array where each file to be concatenated is added
Then combine the files again:
import numpy as np

combined = np.concatenate(datas)
and convert back to a byte stream if needed
out = io.BytesIO()
sf.write(out, combined, samplerate=samplerate, format="wav")
I am pretty sure that this isn't the right way to do things, but this is what worked for me
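If the service returns WAV responses, the same concatenate-then-rewrite idea also works without numpy/soundfile: read the frames of each in-memory file and write them into one container. A sketch, assuming every response shares the same sample rate and format (merge_wav_bytes is my name for it):

```python
import io
import wave

def merge_wav_bytes(wav_blobs):
    """Concatenate in-memory WAV files that share one format into a single WAV."""
    frames, params = [], None
    for blob in wav_blobs:
        with wave.open(io.BytesIO(blob), "rb") as r:
            if params is None:
                params = r.getparams()  # take the format from the first file
            frames.append(r.readframes(r.getnframes()))
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setparams(params)  # the frame count is patched on close
        w.writeframes(b"".join(frames))
    return out.getvalue()
```

This silently assumes all inputs really do share one format; mismatched sample rates would need resampling first.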
I'm working on creating an embedded compression system, similar to those found on professional audio mixers. I am capturing the audio samples with PyAudio, based on its "wire" example.
What's Supposed to Happen
Those samples are sectioned into "chunks", thanks to the library and streamed shortly after recording. I'm simply attempting to compress the chunks if the incoming signal becomes too loud. However, there are mismatched types.
The types which are being used are:
data = samples from the stream <type 'str'> - a byte string
chunk = batch of audio bytes <type 'int'> - always returns 1024
stream.write(data, chunk) <type 'NoneType'>
compressed_segment = to be compressed <class 'pydub.audio_segment.AudioSegment'>
What's Happening
PyAudio's stream.read() returns a string, which is stored in data. I need to convert these string samples to an AudioSegment object in order to use the compression function.
As a result, what ends up happening is that I get several errors related to the type conversion, depending on how I have everything set up. I know that it's not the right type. So how can I make this type conversion work?
Here's 2 ways I've tried to do the conversion within the for i in range loop
1. Creating a "wave" object before compression
wave_file = wave.open(f="compress.wav", mode="wb")
wave_file.writeframes(data)
frame_rate = wave_file.getframerate()
wave_file.setnchannels(2)
# Create the proper file
compressed = AudioSegment.from_raw(wave_file)
compress(compressed) # Calling compress_dynamic_range in Pydub
Exception wave.Error: Error('# channels not specified',) in <bound method Wave_write.del of <wave.Wave_write instance at 0x000000000612FE88>> ignored
2. Sending RAW PyAudio data to compress method
data = stream.read(chunk)
compress(chunk) # Calling compress_dynamic_range in Pydub
thresh_rms = seg.max_possible_amplitude * db_to_float(threshold)
AttributeError: 'int' object has no attribute 'max_possible_amplitude'
The first error which was thrown because the wave file was written to before # of channels was set can be fixed as follows:
# inside the for i in range loop
wave_file = wave.open(f="compress.wav(%s)" % i, mode="wb")
wave_file.setnchannels(channels)
wave_file.setsampwidth(sample_width)
wave_file.setframerate(sample_rate)
wave_file.writeframesraw(data)  # write only after all attributes are set
wave_file.close()
# send the temp file to the compressor
compressed = AudioSegment.from_wav("compress.wav(%s)" % i)
compress(compressed)
This can then be sent to the pydub function compress_dynamic_range.
However...
A more efficient way to do this, without creating temp wav files, is to create a simple AudioSegment object in the following way. One can also stream the compressed sound back to PyAudio using stream.write().
sound = AudioSegment(data, sample_width=2, channels=2, frame_rate=44100)
stream.write(sound.raw_data, chunk) # stream via speakers / headphones
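For the per-chunk flow around those two lines, each chunk can be inspected and attenuated before it is written back out. As a crude stdlib stand-in for pydub's compress_dynamic_range (a hard limiter rather than a real compressor; the function name and threshold are mine):

```python
import struct

def limit_chunk(data: bytes, threshold: int = 16000) -> bytes:
    """Scale a chunk of 16-bit PCM down so its peak does not exceed threshold."""
    samples = [s for (s,) in struct.iter_unpack("<h", data)]
    peak = max((abs(s) for s in samples), default=0)
    if peak <= threshold:
        return data  # quiet chunk: pass through untouched
    gain = threshold / peak
    return b"".join(struct.pack("<h", int(s * gain)) for s in samples)
```

In the read/write loop this would slot in as stream.write(limit_chunk(data), chunk), at the cost of per-chunk gain jumps that a real compressor smooths with attack/release times.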
Just found out this interesting Python package pydub, which converts any audio file to mp3, wav, etc.
As far as I have read its documentation, the process is as follows:
read the mp3 audio file using from_mp3()
create a wav file using export().
Just curious if there is a way to access the sampling rate and the audio signal (a 1-dimensional array, assuming it is mono) directly from the mp3 file, without converting it to a wav file. I am working on thousands of audio files and it might be expensive to convert all of them to wav files.
If you aren't interested in the actual audio content of the file, you may be able to use pydub.utils.mediainfo():
>>> from pydub.utils import mediainfo
>>> info = mediainfo("/path/to/file.mp3")
>>> print(info['sample_rate'])
44100
>>> print(info['channels'])
1
This uses the avprobe utility from avlib (or ffprobe from ffmpeg) and returns all kinds of info. I suggest giving it a try :)
Should be much faster than opening each mp3 using AudioSegment.from_mp3(…)
frame_rate means sample rate, so you can get it like below:
from pydub import AudioSegment
filename = "hoge.wav"
myaudio = AudioSegment.from_file(filename)
print(myaudio.frame_rate)
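To also get the 1-D signal the question asks about, pydub exposes myaudio.get_array_of_samples() alongside frame_rate. For WAV input the same pair is available from the stdlib alone; a sketch using an in-memory file as a stand-in for hoge.wav:

```python
import array
import io
import wave

# Build a tiny 16-bit mono file in memory as a stand-in for "hoge.wav".
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(array.array("h", [0, 1000, -1000, 500]).tobytes())
buf.seek(0)

with wave.open(buf, "rb") as r:
    sample_rate = r.getframerate()
    # 1-D array of samples; assumes a little-endian platform, since WAV
    # data is little-endian and array("h") uses native byte order.
    signal = array.array("h", r.readframes(r.getnframes()))
```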
I want to adjust the volume of an mp3 file while it is playing by turning a potentiometer. I am reading the potentiometer signal serially via an Arduino board with Python scripts. With the help of the pydub library I am able to read the file, but I cannot adjust its volume while it is playing. This is the code I have after a long search.
I have included only the pydub portion. For your information, I'm using VLC media player for changing the volume.
>>> from pydub import AudioSegment
>>> song = AudioSegment.from_wav(r"C:\Users\RAJU\Desktop\En_Iniya_Ponnilave.wav")
While the file is playing, I cannot adjust the volume. Please, someone explain how to do it.
First you need to decode your audio signal to raw audio and split the signal into X frames; then at every frame you can manipulate the audio: change the volume, change the pitch, change the speed, etc.
To change the volume you just need to multiply your raw audio vector by a factor (this can be your potentiometer data signal).
This factor is applied differently depending on whether your vector is in short int or floating point format!
One way to get raw audio data from wav files in Python is the wave library:
import wave
import numpy

spf = wave.open('wavfile.wav', 'r')

# Extract raw audio from the wav file
signal = spf.readframes(-1)
decoded = numpy.frombuffer(signal, dtype=numpy.float32)
Now you can multiply the vector decoded by a factor. For example, if you want to increase the volume by 10 dB you need to calculate 10^(dBValue/20); in Python, 10**(10/20) = 3.1623:
newsignal = decoded * 3.1623
Now you need to encode the vector again to play the new framed audio; you can use pack from the struct module and PyAudio to do it:
import pyaudio

pyaud = pyaudio.PyAudio()
stream = pyaud.open(
    format=pyaudio.paFloat32,
    channels=1,
    rate=44100,
    output=True,
    input=True)
EncodeAgain = pack("%df"%(len(newsignal)), *list(newsignal))
And finally play your framed audio. Note that you will do this at every frame and play it in one loop; this process is very fast and the latency can be imperceptible!
stream.write(EncodeAgain)
PS: This example is for float point format !
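For the short-int case mentioned above, the same multiplication works on 16-bit samples, but the result must be clipped back into the int16 range before packing, or loud passages will overflow. A stdlib-only sketch (the function name is mine):

```python
import struct

def apply_gain_int16(raw: bytes, factor: float) -> bytes:
    """Multiply little-endian 16-bit PCM samples by factor,
    clipping the result to the int16 range before repacking."""
    out = bytearray()
    for (s,) in struct.iter_unpack("<h", raw):
        v = int(s * factor)
        out += struct.pack("<h", max(-32768, min(32767, v)))
    return bytes(out)

# +10 dB: quiet samples scale up, already-loud ones clip at the rails.
louder = apply_gain_int16(struct.pack("<2h", 1000, -20000), 3.1623)
```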
Ederwander, as you said, I have tried coding it, but when packing the data I'm getting all zeros, so it is not streaming. I understand the problem may be in converting the data types. This is the code I have written; please look at it and offer suggestions.
import sys
import serial
import time
import os
from pydub import AudioSegment
import wave
from struct import pack
import numpy
import pyaudio
CHUNK = 1024
wf = wave.open(r'C:\Users\RAJU\Desktop\En_Iniya_Ponnilave.wav', 'rb')

# instantiate PyAudio (1)
p = pyaudio.PyAudio()

# open stream (2)
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)

# read, scale and play back one chunk at a time (assuming a 16-bit wav)
data_read = wf.readframes(CHUNK)
while data_read:
    decoded = numpy.frombuffer(data_read, dtype=numpy.int16)
    data = numpy.clip(decoded * 3.1623, -32768, 32767).astype(numpy.int16)
    stream.write(pack("%dh" % len(data), *data))
    data_read = wf.readframes(CHUNK)