Python: Realtime audio streaming with PyAudio (or something else)? - python

Currently I'm using NumPy to generate the WAV file from a NumPy array. I wonder if it's possible to play the NumPy array in realtime before it's actually written to the hard drive. All examples I found using PyAudio rely on writing the NumPy array to a WAV file first, but I'd like to have a preview function that just spits out the NumPy array to the audio output.
Should be cross-platform, too. I'm using Python 3 (Anaconda distribution).

This has worked! Thanks for help!
def generate_sample(self, ob, preview):
print("* Generating sample...")
tone_out = array(ob, dtype=int16)
if preview:
print("* Previewing audio file...")
bytestream = tone_out.tobytes()
pya = pyaudio.PyAudio()
stream = pya.open(format=pya.get_format_from_width(width=2), channels=1, rate=OUTPUT_SAMPLE_RATE, output=True)
stream.write(bytestream)
stream.stop_stream()
stream.close()
pya.terminate()
print("* Preview completed!")
else:
write('sound.wav', SAMPLE_RATE, tone_out)
print("* Wrote audio file!")
Seems so simple now, but when you don't know Python very well, it seems like hell.

This is really simple with python-sounddevice:
import sounddevice as sd
sd.play(myarray, 44100)

As you can see in the examples, pyaudio just reads data from the WAV file and writes that to the stream.
It is not necessary to write a WAV file first, you just need a stream of data in the right format.
I'm adding the example below in case the link ever goes dead (note that I didn't write this code):
"""PyAudio Example: Play a WAVE file."""
import pyaudio
import wave
import sys
CHUNK = 1024
if len(sys.argv) < 2:
print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
sys.exit(-1)
wf = wave.open(sys.argv[1], 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
channels=wf.getnchannels(),
rate=wf.getframerate(),
output=True)
data = wf.readframes(CHUNK)
while data != '':
stream.write(data)
data = wf.readframes(CHUNK)
stream.stop_stream()
stream.close()
p.terminate()

Related

Convert PyAudio microphone input stream to mp3

I am looking for ways to directly encode mp3 files from the microphone without saving to an intermediate wav file. There are tons of examples for saving to a wav file out there and a ton of examples for converting a wav file to mp3. But I have had no luck finding a way to save an mp3 directly from the mic. For example I am using the below example found on the webs to record to a wav file.
Am hoping to get suggestions on how to convert the frames list (pyaudio stream reads) to an mp3 directly. Or alternatively, stream the pyaudio microphone input directly to an mp3 via ffmpeg without populating a list/array with read data. Thank you very much!
import pyaudio
import wave
# the file name output you want to record into
filename = "recorded.wav"
# set the chunk size of 1024 samples
chunk = 1024
# sample format
FORMAT = pyaudio.paInt16
# mono, change to 2 if you want stereo
channels = 1
# 44100 samples per second
sample_rate = 44100
record_seconds = 5
# initialize PyAudio object
p = pyaudio.PyAudio()
# open stream object as input & output
stream = p.open(format=FORMAT,
channels=channels,
rate=sample_rate,
input=True,
output=True,
frames_per_buffer=chunk)
frames = []
print("Recording...")
for i in range(int(44100 / chunk * record_seconds)):
data = stream.read(chunk)
frames.append(data)
print("Finished recording.")
# stop and close stream
stream.stop_stream()
stream.close()
# terminate pyaudio object
p.terminate()
# save audio file
# open the file in 'write bytes' mode
wf = wave.open(filename, "wb")
# set the channels
wf.setnchannels(channels)
# set the sample format
wf.setsampwidth(p.get_sample_size(FORMAT))
# set the sample rate
wf.setframerate(sample_rate)
# write the frames as bytes
wf.writeframes(b"".join(frames))
# close the file
wf.close()
I was able to find a way to convert the pyaudio pcm stream to mp3 without saving to an intermediate wav file using a lame 3.1 binary from rarewares. I'm sure it can be done with ffmpeg as well but since ffmpeg uses lame to encode to mp3 I thought I would just focus on lame.
For converting the raw pcm array to an mp3 directly, remove all the wave file operations and replace with the following. This pipes the data into lame all in one go.
raw_pcm = b''.join(frames)
l = subprocess.Popen("lame - -r -m m recorded.mp3", stdin=subprocess.PIPE)
l.communicate(input=raw_pcm)
For piping the pcm data into lame as it is read, I used the following. I'm sure you could do this in a stream callback if you wished.
l = subprocess.Popen("lame - -r -m m recorded.mp3", stdin=subprocess.PIPE)
for i in range(int(44100 / chunk * record_seconds)):
l.stdin.write(stream.read(chunk))
I should note, that either way, lame did not start encoding until after the data was finished piping in. When piping in the data on each stream read, I assumed the encoding would start right away, but that was not the case.
Also, using .stdin.write may cause some trouble if stdout and stderr buffers arent read. Something I need to look into further.

What are the differences between a WAV file (.wav) and a WAVE audio file (.wave)?

I am trying to use the PyAudio library to record guitar audio through my USB audio interface in a python project. When I use audio applications such as Audacity to save the audio I get a WAV (.wav) file which can be played using apps such as Groove music, windows media player etc. and I am able to manipulate the files as I need.
However, now I need to implement recording into the project and when I use PyAudio to record guitar input, it saves the audio as a WAVE Audio File (.wave) file which cannot be manipulated in the program and cannot be played using the playsound library. When I try to play it from my file manager it will only play using Itunes while Groove music and windows media player don't support it.
Anywhere I check online describes WAVE and WAV files as the same thing so I am unsure why I am having this issue. My code is as shown below. Any help or advice would be appreciated!
import pyaudio
import wave
from playsound import playsound
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "live_guitar_input.wave"
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)
print("NOW RECORDING")
frames = []
for i in range(0, int(RATE/CHUNK*RECORD_SECONDS)):
data = stream.read(CHUNK)
frames.append(data)
print("Finished Recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
playsound(WAVE_OUTPUT_FILENAME)
As pointed out by the O.P. "wave" and "wav" are the same thing. But the file manager application does not recognize wave extension. The solution is just to rename the "file.wave" to "file.wav".

Python read playing sound data [duplicate]

I'm trying to record the output from my computer speakers with PyAudio.
I tried to modify the code example given in the PyAudio documentation, but it doesn't work.
Technically, there's no error. I obtain the file output.wav and I can open it, but there's no sound. On Audacity, I can only see a straight line.
What's going wrong?
import pyaudio
import wave
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
SPEAKERS = p.get_default_output_device_info()["hostApi"] #The part I have modified
stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK,
input_host_api_specific_stream_info=SPEAKERS) #The part I have modified
print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
data = stream.read(CHUNK)
frames.append(data)
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
In case someone is still stumbling over this like me, I found a PyAudio fork to record the output on windows.
Explanation:
The official PyAudio build isn't able to record the output. BUT with Windows Vista and above, a new API, WASAPI was introduced, which includes the ability to open a stream to an output device in loopback mode. In this mode the stream will behave like an input stream, with the ability to record the outgoing audio stream.
To set the mode, one has to set a special flag (AUDCLNT_STREAMFLAGS_LOOPBACK). Since this flag is not supported in the official build one needs to edit PortAudio as well as PyAudio, to add loopback support.
New option:
"as_loopback":(true|false)
If you create an application on windows platform, you can use default stereo mixer virtual device to record your PC's output.
1) Enable stereo mixer.
2) Connect PyAudio to your stereo mixer, this way:
p = pyaudio.PyAudio()
stream = p.open(format = FORMAT,
channels = CHANNELS,
rate = RATE,
input = True,
input_device_index = dev_index,
frames_per_buffer = CHUNK)
where dev_index is an index of your stereo mixer.
3) List your devices to get required index:
for i in range(p.get_device_count()):
print(p.get_device_info_by_index(i))
Alternatively, you can automatically get index by device name:
for i in range(p.get_device_count()):
dev = p.get_device_info_by_index(i)
if (dev['name'] == 'Stereo Mix (Realtek(R) Audio)' and dev['hostApi'] == 0):
dev_index = dev['index'];
print('dev_index', dev_index)
4) Continue to work with pyAudio as in the case of recording from a microphone:
data = stream.read(CHUNK)
I got to record my speaker output with pyaudio with some configuration and code from pyaudio's documentation.
Code
"""PyAudio example: Record a few seconds of audio and save to a WAVE file."""
import pyaudio
import wave
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)
print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
data = stream.read(CHUNK)
frames.append(data)
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
Configuration
First, with pulseaudio running, create a loopback device:
pacmd load-module module-loopback latency_msec=5
Then set the default (fallback) to this loopback device in pavucontrol:
Then you can start the script, wait 5 seconds, and you should have an output.wav.
You can't record from an output stream as though it were input. To record, you need to connect PyAudio to an input device, like a microphone. At least that's the normal way to do things.
Try connecting to a microphone first, and see if you get anything. If this works, then try doing something unusual.
As a small speedup to your iterations, rather than recording and looking at the file, it's often easier just to print out the max for a few chunks to make sure you're bringing in data. Usually just watching the numbers scroll by and comparing them to the sound gives a quick estimate of whether things are correctly connected.
import audioop
mx = audioop.max(data, 2)
print mx
The speaker is an output stream even if you open it as an input. The hostApi value of the speaker is probably 0.
You can check the 'maxInputChannels' and 'maxOutputChannels' of every connected devices and the maxInputChannels for the speaker shall be 0.
You can't write to an input stream and you can't read from an output stream.
You can detect the available devices with the following code:
import pyaudio
# detect devices:
p = pyaudio.PyAudio()
host_info = p.get_host_api_info_by_index(0)
device_count = host_info.get('deviceCount')
devices = []
# iterate between devices:
for i in range(0, device_count):
device = p.get_device_info_by_host_api_device_index(0, i)
devices.append(device['name'])
print devices
After you get all the connected devices you can check the 'hostApi' of each devices.
For instance if the speaker index is 5 than:
p.get_device_info_by_host_api_device_index(0, 5)['hostApi']

How to play a stereo aiff file with pyaudio?

This plays mono aifc files, but for any stereo files I get a loud blast of static:
import pyaudio
import aifc
CHUNK = 1024
wf = aifc.open('C:\\path_to_file.aiff', 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
channels=wf.getnchannels(),
rate=wf.getframerate(),
output=True)
data = wf.readframes(CHUNK)
while data != '':
stream.write(data)
data = wf.readframes(CHUNK)
stream.stop_stream()
stream.close()
p.terminate()
The stereo file I am testing with: https://archive.org/details/TestAifAiffFile
I'm on windows 7, if that's important.
Swapping every other sample does the trick. Load up the whole file into data, then do
a = numpy.fromstring(data, dtype = '<i2')
temp = a[1::2]
a[1::2] = a[::2]
a[::2] = temp
Then play a as though it were a string of audio samples rather than a numpy array. I've tested it on two different aiff files, and in both cases it preserves both channels and plays correctly.
This works probably because the file has an opposite byte order to what pyaudio expects.
I tried your code on Linux, and I also get a horrible noise.
The aifc module seems to read the file correctly, I converted it to a NumPy array and this looks fine:
import numpy as np
data = wf.readframes(wf.getnframes())
sig = np.frombuffer(data, dtype='<i2').reshape(-1, wf.getnchannels())
So I guess the problem is in PyAudio or in your usage of it.
I don't know a solution but I can offer you an alternative: Use soundfile and sounddevice.
import soundfile as sf
import sounddevice as sd
data, fs = sf.read('02DayIsDone.aif')
sd.play(data, fs, blocking=True)
Instead of doing tricky stuffs with the byte chain, you could also consider the audioread package instead of aifc/wave/... It decodes your audio file using whichever backend is available and regardless of the file format, it always returns buffers of 16-bit little-endian signed integer PCM data that you can feed pyaudio with.

Record speakers output with PyAudio

I'm trying to record the output from my computer speakers with PyAudio.
I tried to modify the code example given in the PyAudio documentation, but it doesn't work.
Technically, there's no error. I obtain the file output.wav and I can open it, but there's no sound. On Audacity, I can only see a straight line.
What's going wrong?
import pyaudio
import wave
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
SPEAKERS = p.get_default_output_device_info()["hostApi"] #The part I have modified
stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK,
input_host_api_specific_stream_info=SPEAKERS) #The part I have modified
print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
data = stream.read(CHUNK)
frames.append(data)
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
In case someone is still stumbling over this like me, I found a PyAudio fork to record the output on windows.
Explanation:
The official PyAudio build isn't able to record the output. BUT with Windows Vista and above, a new API, WASAPI was introduced, which includes the ability to open a stream to an output device in loopback mode. In this mode the stream will behave like an input stream, with the ability to record the outgoing audio stream.
To set the mode, one has to set a special flag (AUDCLNT_STREAMFLAGS_LOOPBACK). Since this flag is not supported in the official build one needs to edit PortAudio as well as PyAudio, to add loopback support.
New option:
"as_loopback":(true|false)
If you create an application on windows platform, you can use default stereo mixer virtual device to record your PC's output.
1) Enable stereo mixer.
2) Connect PyAudio to your stereo mixer, this way:
p = pyaudio.PyAudio()
stream = p.open(format = FORMAT,
channels = CHANNELS,
rate = RATE,
input = True,
input_device_index = dev_index,
frames_per_buffer = CHUNK)
where dev_index is an index of your stereo mixer.
3) List your devices to get required index:
for i in range(p.get_device_count()):
print(p.get_device_info_by_index(i))
Alternatively, you can automatically get index by device name:
for i in range(p.get_device_count()):
dev = p.get_device_info_by_index(i)
if (dev['name'] == 'Stereo Mix (Realtek(R) Audio)' and dev['hostApi'] == 0):
dev_index = dev['index'];
print('dev_index', dev_index)
4) Continue to work with pyAudio as in the case of recording from a microphone:
data = stream.read(CHUNK)
I got to record my speaker output with pyaudio with some configuration and code from pyaudio's documentation.
Code
"""PyAudio example: Record a few seconds of audio and save to a WAVE file."""
import pyaudio
import wave
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)
print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
data = stream.read(CHUNK)
frames.append(data)
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
Configuration
First, with pulseaudio running, create a loopback device:
pacmd load-module module-loopback latency_msec=5
Then set the default (fallback) to this loopback device in pavucontrol:
Then you can start the script, wait 5 seconds, and you should have an output.wav.
You can't record from an output stream as though it were input. To record, you need to connect PyAudio to an input device, like a microphone. At least that's the normal way to do things.
Try connecting to a microphone first, and see if you get anything. If this works, then try doing something unusual.
As a small speedup to your iterations, rather than recording and looking at the file, it's often easier just to print out the max for a few chunks to make sure you're bringing in data. Usually just watching the numbers scroll by and comparing them to the sound gives a quick estimate of whether things are correctly connected.
import audioop
mx = audioop.max(data, 2)
print mx
The speaker is an output stream even if you open it as an input. The hostApi value of the speaker is probably 0.
You can check the 'maxInputChannels' and 'maxOutputChannels' of every connected devices and the maxInputChannels for the speaker shall be 0.
You can't write to an input stream and you can't read from an output stream.
You can detect the available devices with the following code:
import pyaudio
# detect devices:
p = pyaudio.PyAudio()
host_info = p.get_host_api_info_by_index(0)
device_count = host_info.get('deviceCount')
devices = []
# iterate between devices:
for i in range(0, device_count):
device = p.get_device_info_by_host_api_device_index(0, i)
devices.append(device['name'])
print devices
After you get all the connected devices you can check the 'hostApi' of each devices.
For instance if the speaker index is 5 than:
p.get_device_info_by_host_api_device_index(0, 5)['hostApi']

Categories