This plays mono aifc files, but for any stereo files I get a loud blast of static:
import pyaudio
import aifc
CHUNK = 1024
wf = aifc.open('C:\\path_to_file.aiff', 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
data = wf.readframes(CHUNK)
while data:  # readframes() returns an empty result at end of file
    stream.write(data)
    data = wf.readframes(CHUNK)
stream.stop_stream()
stream.close()
p.terminate()
The stereo file I am testing with: https://archive.org/details/TestAifAiffFile
I'm on Windows 7, if that's important.
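Swapping the two bytes of every 16-bit sample does the trick. Load the whole file into data, then do
a = numpy.frombuffer(data, dtype='u1').copy()  # one element per byte; copy() makes the array writable
temp = a[1::2].copy()  # copy, otherwise temp is a view that changes along with a
a[1::2] = a[::2]
a[::2] = temp
Then play a.tobytes() as though it were a string of audio samples rather than a numpy array. I've tested it on two different aiff files, and in both cases it preserves both channels and plays correctly.
This probably works because the file has the opposite byte order to what pyaudio expects: AIFF stores samples big-endian, while PyAudio expects them in the machine's native byte order (little-endian on x86).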
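Since AIFF stores its samples big-endian, the byte-order fix can also be written as a dtype conversion instead of a manual swap. This is only a minimal sketch, assuming 16-bit samples and a little-endian target; big_to_little_endian_16 is a name made up for illustration, not part of aifc or PyAudio.

```python
import numpy as np

def big_to_little_endian_16(frames):
    """Reinterpret big-endian 16-bit PCM bytes and return little-endian bytes.

    Assumption: 2 bytes per sample. Interleaved stereo is handled as-is,
    because only the byte order inside each sample changes.
    """
    samples = np.frombuffer(frames, dtype=">i2")   # read as big-endian int16
    return samples.astype("<i2").tobytes()         # re-encode as little-endian

# Stand-in for wf.readframes(...): four big-endian samples
big = np.array([1, -2, 256, 32767], dtype=">i2").tobytes()
little = big_to_little_endian_16(big)
```

In the playback loop this would become stream.write(big_to_little_endian_16(data)).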
I tried your code on Linux, and I also get a horrible noise.
The aifc module seems to read the file correctly: I converted the data to a NumPy array, and it looks fine:
import numpy as np
data = wf.readframes(wf.getnframes())
sig = np.frombuffer(data, dtype='>i2').reshape(-1, wf.getnchannels())  # AIFF samples are big-endian
So I guess the problem is in PyAudio or in your usage of it.
I don't know a solution but I can offer you an alternative: Use soundfile and sounddevice.
import soundfile as sf
import sounddevice as sd
data, fs = sf.read('02DayIsDone.aif')
sd.play(data, fs, blocking=True)
Instead of doing tricky things with the byte string, you could also consider the audioread package instead of aifc/wave/etc. It decodes your audio file using whichever backend is available and, regardless of the file format, always returns buffers of 16-bit little-endian signed integer PCM data that you can feed to pyaudio.
Related
I'm doing almost everything exactly as the pyaudio documentation describes. The main difference: in the stream loop, I am running my song data through a filter that outputs audio data as an array. The resulting output is only white noise. Here's my code:
import librosa as lr
y, sr = lr.load(song_filename)
# open stream
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
while _:
    x = my_filter(parameter, y=y[CHUNK])
    stream.write(x)
    CHUNK = updateCHUNK()
My guess is that librosa uses NumPy arrays while pyaudio uses bytes. So I tried this:
while _:
    x = my_filter(parameter, y=y[CHUNK])
    x = x.tobytes()
    stream.write(x)
    CHUNK = updateCHUNK()
Nothing was fixed by .tobytes(). Can anyone suggest solutions to this issue?
Edit:
I realize that I'm not ready to look for solutions. Still trying to understand what the problem is.
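One thing worth checking, as a guess from the snippet above: librosa.load returns float32 samples in the range [-1, 1], while a stream opened with the format taken from wf.getsampwidth() usually expects 16-bit integers (assuming wf is an ordinary 16-bit WAV). Calling .tobytes() on float data then yields bytes that the stream interprets as unrelated int16 values, which sounds like noise. The sketch below converts floats to 16-bit PCM; float_to_int16_bytes is a hypothetical helper name.

```python
import numpy as np

def float_to_int16_bytes(x):
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    x = np.asarray(x, dtype=np.float32)
    clipped = np.clip(x, -1.0, 1.0)                 # avoid integer wrap-around
    return (clipped * 32767.0).astype("<i2").tobytes()

# Stand-in for one filtered chunk of librosa output
chunk = np.array([0.0, 0.5, -1.0, 2.0], dtype=np.float32)
pcm = float_to_int16_bytes(chunk)
```

The alternative is to leave the data as float32 and open the stream with format=pyaudio.paFloat32, writing x.astype(numpy.float32).tobytes(). Either way, the stream's rate should match the sr returned by librosa.load.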
I am looking for ways to encode MP3 files directly from the microphone without saving to an intermediate WAV file. There are tons of examples for saving to a WAV file and tons of examples for converting a WAV file to MP3, but I have had no luck finding a way to save an MP3 directly from the mic. For example, I am using the code below, found on the web, to record to a WAV file.
I am hoping to get suggestions on how to convert the frames list (the pyaudio stream reads) to an MP3 directly or, alternatively, how to stream the pyaudio microphone input directly to an MP3 via ffmpeg without populating a list/array with the read data. Thank you very much!
import pyaudio
import wave
# the file name output you want to record into
filename = "recorded.wav"
# set the chunk size of 1024 samples
chunk = 1024
# sample format
FORMAT = pyaudio.paInt16
# mono, change to 2 if you want stereo
channels = 1
# 44100 samples per second
sample_rate = 44100
record_seconds = 5
# initialize PyAudio object
p = pyaudio.PyAudio()
# open stream object as input & output
stream = p.open(format=FORMAT,
                channels=channels,
                rate=sample_rate,
                input=True,
                output=True,
                frames_per_buffer=chunk)
frames = []
print("Recording...")
for i in range(int(44100 / chunk * record_seconds)):
    data = stream.read(chunk)
    frames.append(data)
print("Finished recording.")
# stop and close stream
stream.stop_stream()
stream.close()
# terminate pyaudio object
p.terminate()
# save audio file
# open the file in 'write bytes' mode
wf = wave.open(filename, "wb")
# set the channels
wf.setnchannels(channels)
# set the sample format
wf.setsampwidth(p.get_sample_size(FORMAT))
# set the sample rate
wf.setframerate(sample_rate)
# write the frames as bytes
wf.writeframes(b"".join(frames))
# close the file
wf.close()
I was able to find a way to convert the pyaudio pcm stream to mp3 without saving to an intermediate wav file using a lame 3.1 binary from rarewares. I'm sure it can be done with ffmpeg as well but since ffmpeg uses lame to encode to mp3 I thought I would just focus on lame.
For converting the raw pcm array to an mp3 directly, remove all the wave file operations and replace with the following. This pipes the data into lame all in one go.
import subprocess
raw_pcm = b''.join(frames)
# -r: raw PCM input, -m m: mono, "-": read the input from stdin
l = subprocess.Popen(["lame", "-r", "-m", "m", "-", "recorded.mp3"], stdin=subprocess.PIPE)
l.communicate(input=raw_pcm)
For piping the pcm data into lame as it is read, I used the following. I'm sure you could do this in a stream callback if you wished.
l = subprocess.Popen(["lame", "-r", "-m", "m", "-", "recorded.mp3"], stdin=subprocess.PIPE)
for i in range(int(44100 / chunk * record_seconds)):
    l.stdin.write(stream.read(chunk))
l.stdin.close()  # signal end of input so lame can finish the file
l.wait()
I should note that, either way, lame did not start encoding until the data had finished piping in. When piping in the data on each stream read, I assumed the encoding would start right away, but that was not the case.
Also, using .stdin.write may cause some trouble if the stdout and stderr buffers aren't read (this applies when stdout or stderr is redirected to a pipe). That is something I need to look into further.
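The chunk-by-chunk piping can be wrapped in a small helper. The following is only a sketch under assumptions: pipe_chunks is a made-up name, and the child process here is a Python one-liner standing in for the encoder so the example runs without lame installed. Sending the child's stdout and stderr to DEVNULL is one way to sidestep the unread-buffer concern mentioned above.

```python
import subprocess
import sys

def pipe_chunks(cmd, chunks):
    """Write each bytes chunk to the child's stdin, close it, return the exit code."""
    proc = subprocess.Popen(cmd,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.DEVNULL,   # nothing to read back
                            stderr=subprocess.DEVNULL)
    for chunk in chunks:
        proc.stdin.write(chunk)
    proc.stdin.close()        # end of input: lets the child finish
    return proc.wait()

# Stand-in "encoder": exits with status 0 only if it received exactly 4096 bytes
child = [sys.executable, "-c",
         "import sys; n = len(sys.stdin.buffer.read()); sys.exit(0 if n == 4096 else 1)"]
status = pipe_chunks(child, [b"\x00" * 1024] * 4)
```

With lame, cmd would be the argument list from the answer and chunks a generator such as (stream.read(chunk) for _ in range(n)).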
Currently I'm using NumPy to generate the WAV file from a NumPy array. I wonder if it's possible to play the NumPy array in realtime before it's actually written to the hard drive. All examples I found using PyAudio rely on writing the NumPy array to a WAV file first, but I'd like to have a preview function that just spits out the NumPy array to the audio output.
Should be cross-platform, too. I'm using Python 3 (Anaconda distribution).
This worked! Thanks for the help!
# Imports are not shown in the original; array and int16 are presumably from numpy,
# and write is presumably scipy.io.wavfile.write.
def generate_sample(self, ob, preview):
    print("* Generating sample...")
    tone_out = array(ob, dtype=int16)
    if preview:
        print("* Previewing audio file...")
        bytestream = tone_out.tobytes()
        pya = pyaudio.PyAudio()
        stream = pya.open(format=pya.get_format_from_width(width=2), channels=1, rate=OUTPUT_SAMPLE_RATE, output=True)
        stream.write(bytestream)
        stream.stop_stream()
        stream.close()
        pya.terminate()
        print("* Preview completed!")
    else:
        write('sound.wav', SAMPLE_RATE, tone_out)
        print("* Wrote audio file!")
Seems so simple now, but when you don't know Python very well, it seems like hell.
This is really simple with python-sounddevice:
import sounddevice as sd
sd.play(myarray, 44100)
As you can see in the examples, pyaudio just reads data from the WAV file and writes that to the stream.
It is not necessary to write a WAV file first; you just need a stream of data in the right format.
I'm adding the example below in case the link ever goes dead (note that I didn't write this code):
"""PyAudio Example: Play a WAVE file."""
import pyaudio
import wave
import sys
CHUNK = 1024
if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)
wf = wave.open(sys.argv[1], 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
data = wf.readframes(CHUNK)
while data:  # under Python 3 readframes() returns bytes, so comparing with '' would never end the loop
    stream.write(data)
    data = wf.readframes(CHUNK)
stream.stop_stream()
stream.close()
p.terminate()
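To illustrate what "a stream of data in the right format" can look like without any file, here is a rough sketch that builds the bytes for a sine tone directly from a NumPy array. The function name sine_pcm16 and its defaults are made up for this example; it assumes a mono, 16-bit stream.

```python
import numpy as np

def sine_pcm16(freq_hz, seconds, rate=44100, amplitude=0.5):
    """Return little-endian 16-bit mono PCM bytes for a sine tone."""
    t = np.arange(int(seconds * rate)) / rate            # sample times in seconds
    wave_form = amplitude * np.sin(2 * np.pi * freq_hz * t)
    return (wave_form * 32767).astype("<i2").tobytes()   # scale to the int16 range

pcm = sine_pcm16(440.0, 0.25)
```

These bytes can be passed to stream.write() on a stream opened with format=pyaudio.paInt16, channels=1 and rate=44100, exactly where the example above writes the frames read from the WAV file.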
I'm trying to create a spectrogram program (in python), which will analyze and display the frequency spectrum from a microphone input in real time. I am using a template program for recording audio from here: http://people.csail.mit.edu/hubert/pyaudio/#examples (recording example)
This template program works fine, but I am unsure of the format of the data that is being returned from the data = stream.read(CHUNK) line. I have done some research on the .wav format, which is used in this program, but I cannot find the meaning of the actual data bytes themselves, just definitions for the metadata in the .wav file.
I understand this program uses 16 bit samples, and the 'chunks' are stored in python strings. I was hoping somebody could help me understand exactly what the data in each sample represents. Even just a link to a source for this information would be helpful. I tried googling, but I don't think I know the terminology well enough to search accurately.
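stream.read gives you binary data. Each 16-bit sample is a signed integer between -32768 and 32767 that gives the amplitude of the signal at that instant. To get the sample values as integers, you can use numpy.frombuffer (numpy.fromstring in older NumPy versions) to turn the data into a NumPy array, or use Python's built-in struct.unpack.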
Example:
import pyaudio
import numpy
import struct
CHUNK = 128
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=44100, input=True, frames_per_buffer=CHUNK)
data = stream.read(CHUNK)
print(numpy.frombuffer(data, dtype=numpy.int16))  # use external numpy module
print(struct.unpack('h' * CHUNK, data))  # use built-in struct module
stream.stop_stream()
stream.close()
p.terminate()
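For the spectrogram use case, the decoded samples can go straight into an FFT. The sketch below uses a synthetic chunk as a stand-in for stream.read(CHUNK), so it runs without a microphone; the RATE and CHUNK values and the test tone are assumptions chosen for illustration. It also shows that the struct and NumPy decodings give the same numbers.

```python
import struct
import numpy as np

RATE = 44100
CHUNK = 1024

# Synthetic stand-in for stream.read(CHUNK): a tone sitting exactly on FFT bin 10
n = np.arange(CHUNK)
tone = 0.5 * 32767 * np.sin(2 * np.pi * 10 * n / CHUNK)
data = tone.astype("<i2").tobytes()

# Two equivalent ways to decode the bytes into signed 16-bit integers
samples = np.frombuffer(data, dtype="<i2")
unpacked = struct.unpack("<%dh" % CHUNK, data)

# Magnitude spectrum of the chunk and the frequency of its strongest bin
spectrum = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(CHUNK, d=1.0 / RATE)
peak_bin = int(np.argmax(spectrum))
peak_hz = freqs[peak_bin]
```

Repeating this for each chunk read from the stream gives one column of the spectrogram per chunk.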
I want to adjust the volume of an MP3 file while it is playing by turning a potentiometer. I am reading the potentiometer signal over a serial connection from an Arduino board with Python scripts. With the pydub library I am able to read the file, but I cannot adjust its volume while it is playing. This is the code I have written after a long search.
I have included only the pydub part. For your information, I'm using VLC media player for changing the volume.
>>> from pydub import AudioSegment
>>> song = AudioSegment.from_wav(r"C:\Users\RAJU\Desktop\En_Iniya_Ponnilave.wav")
While the file is playing, I cannot adjust the volume. Could someone please explain how to do it?
First you need to decode your audio signal to raw audio and split it into frames. Then you can manipulate the audio frame by frame: at every frame you can change the volume, the pitch, the speed, etc.
To change the volume, you just need to multiply your raw audio vector by a factor (this can be derived from your potentiometer signal).
The appropriate factor differs depending on whether your vector is in short-integer or floating-point format!
One way to get raw audio data from WAV files in Python is the wave library:
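import wave
import numpy
spf = wave.open('wavfile.wav', 'rb')
# Extract raw audio from the WAV file
signal = spf.readframes(-1)
# Assumes 16-bit PCM (the wave module does not decode floating-point WAV data);
# scale to floats in [-1.0, 1.0) so the paFloat32 stream below can play them
decoded = numpy.frombuffer(signal, dtype=numpy.int16).astype(numpy.float32) / 32768.0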
Now you can multiply the decoded vector by a factor. For example, to increase the level by 10 dB, compute 10^(dB/20); in Python, 10 ** (10 / 20.0) ≈ 3.1623.
newsignal = decoded * 3.1623
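The dB-to-factor conversion can be put into a small function. This is just a sketch: db_to_gain and apply_gain are illustrative names, and the clipping assumes floating-point samples normalized to [-1, 1], since a gain above 1 can push samples out of range.

```python
import numpy as np

def db_to_gain(db):
    """Linear amplitude factor for a level change in decibels."""
    return 10.0 ** (db / 20.0)

def apply_gain(samples, db):
    """Scale float samples by a dB amount and clip to the valid [-1, 1] range."""
    out = np.asarray(samples, dtype=np.float32) * db_to_gain(db)
    return np.clip(out, -1.0, 1.0)

louder = apply_gain([0.1, 0.5], 10)   # the second sample would exceed 1.0 and is clipped
```

A negative dB value gives a factor below 1, which is the safer direction for a volume knob.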
Now you need to encode the vector again to play the new audio; you can use pack from the struct module and pyaudio to do it!
import pyaudio
from struct import pack
pyaud = pyaudio.PyAudio()
stream = pyaud.open(
    format=pyaudio.paFloat32,
    channels=1,
    rate=44100,
    output=True,
    input=True)
EncodeAgain = pack("%df" % len(newsignal), *list(newsignal))
And finally, play your audio. Note that you do this for every frame, inside one loop; the processing is fast enough that the latency can be imperceptible!
stream.write(EncodeAgain)
PS: This example is for floating-point format!
Ederwander, as you said, I have tried coding it, but when packing the data I get all zeros, so nothing is streamed. I understand the problem may lie in converting between data types. This is the code I have written. Please look at it and suggest a fix.
import sys
import serial
import time
import os
from pydub import AudioSegment
import wave
from struct import pack
import numpy
import pyaudio
CHUNK = 1024
wf = wave.open('C:\Users\RAJU\Desktop\En_Iniya_Ponnilave.wav', 'rb')
# instantiate PyAudio (1)
p = pyaudio.PyAudio()
# open stream (2)
stream = p.open(format = p.get_format_from_width(wf.getsampwidth()),channels = wf.getnchannels(),rate = wf.getframerate(),output = True)
# read data
data_read = wf.readframes(CHUNK)
decoded = numpy.fromstring(data_read, 'int32', sep = '');
data = decoded*3.123
while(1):
    EncodeAgain = struct.pack(h,data)
    stream.write(EncodeAgain)
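A possible way to repair the loop, offered as a sketch rather than a tested drop-in: decode each chunk with a dtype that matches the file's sample width (16-bit is assumed here, not int32), scale and clip it, convert it back to bytes, and read the next chunk inside the loop. scale_chunk is a hypothetical helper name.

```python
import numpy as np

def scale_chunk(data, factor):
    """Scale a chunk of little-endian 16-bit PCM bytes by factor, with clipping."""
    samples = np.frombuffer(data, dtype="<i2").astype(np.float64)
    scaled = np.clip(samples * factor, -32768, 32767)   # stay inside the int16 range
    return scaled.astype("<i2").tobytes()

# Stand-in for wf.readframes(CHUNK)
chunk = np.array([100, -200, 20000, -20000], dtype="<i2").tobytes()
louder = scale_chunk(chunk, 3.123)

# In the playback loop (factor could be refreshed from the potentiometer each pass):
# data_read = wf.readframes(CHUNK)
# while data_read:
#     stream.write(scale_chunk(data_read, factor))
#     data_read = wf.readframes(CHUNK)
```

Because the output is 16-bit again, the stream opened with p.get_format_from_width(wf.getsampwidth()) can stay as it is, and no struct.pack call is needed.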