Python - recording and playing microphone input

I am working on an app that receives audio from the user (with a microphone) and plays it back. Does anyone have a way/module that can store audio as an object (not as a .wav/.mp3) from a microphone?
Btw, it's on Windows, if it matters.
Thank you all for your help!

PyAudio can be used to read audio from a stream object as raw bytes, without going through a file.
On Windows you can install it with python -m pip install pyaudio.
Here is an example taken from the PyAudio site that reads audio from the microphone for 5 seconds and plays each chunk back immediately.
You can modify it to keep the data for a different duration, manipulate it, and then play it back.
Caution: a longer duration increases the memory requirement.
"""
PyAudio Example: Make a wire between input and output (i.e., record a
few samples and play them back immediately).
"""
import pyaudio
CHUNK = 1024
WIDTH = 2
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(WIDTH),
                channels=CHANNELS,
                rate=RATE,
                input=True,
                output=True,
                frames_per_buffer=CHUNK)
print("* recording")
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)  # read audio stream
    stream.write(data, CHUNK)  # play back audio stream
print("* done")
stream.stop_stream()
stream.close()
p.terminate()
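To keep the recording as an object instead of playing each chunk right away, the chunks can be collected in memory first. The following is only a minimal sketch and not part of the PyAudio example: estimate_bytes, record_to_bytes, play_from_bytes and demo_with_pyaudio are hypothetical helper names, and the first three accept any object with read/write methods shaped like a PyAudio stream, so the logic can be tried without audio hardware.

```python
def estimate_bytes(rate, channels, width, seconds):
    """Approximate memory needed for uncompressed PCM audio."""
    return rate * channels * width * seconds


def record_to_bytes(stream, chunk, rate, seconds):
    """Read `seconds` of audio from `stream` into a single bytes object.

    `stream` is assumed to behave like a PyAudio input stream, i.e. it
    has a read(num_frames) method that returns raw bytes.
    """
    frames = []
    for _ in range(int(rate / chunk * seconds)):
        frames.append(stream.read(chunk))
    return b"".join(frames)


def play_from_bytes(stream, data, chunk, channels, width):
    """Write `data` to `stream`, one chunk of frames at a time."""
    step = chunk * channels * width  # bytes in one chunk of frames
    for start in range(0, len(data), step):
        stream.write(data[start:start + step])


def demo_with_pyaudio():
    """Record 5 seconds, keep them in memory, then play them back."""
    import pyaudio  # imported here so the helpers above stay hardware-free

    chunk, width, channels, rate, seconds = 1024, 2, 2, 44100, 5
    p = pyaudio.PyAudio()
    stream = p.open(format=p.get_format_from_width(width),
                    channels=channels, rate=rate,
                    input=True, output=True, frames_per_buffer=chunk)
    audio = record_to_bytes(stream, chunk, rate, seconds)  # the in-memory object
    print("captured", len(audio), "bytes")
    play_from_bytes(stream, audio, chunk, channels, width)
    stream.stop_stream()
    stream.close()
    p.terminate()
```

With the settings from the example (44100 Hz, 2 channels, 2 bytes per sample), estimate_bytes gives 882000 bytes for 5 seconds, which is the memory growth the caution above refers to. Calling demo_with_pyaudio() requires PyAudio and a working microphone.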

Related

Python: Real-time input into the microphone

I just want to know if there is a way to input something real-time into the microphone with python. I am planning to make an open-source real-time noise cancellation app like Krisp.

You can give pyaudio a shot.
python -m pip install pyaudio
PyAudio example
import pyaudio
import wave
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
CHUNK = 1024
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "your-rockstar-voice.wav"
audio = pyaudio.PyAudio()
# start Recording
stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE, input=True,
                    frames_per_buffer=CHUNK)
print("recording...")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("finished recording")
# stop Recording
stream.stop_stream()
stream.close()
audio.terminate()
waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
waveFile.setnchannels(CHANNELS)
waveFile.setsampwidth(audio.get_sample_size(FORMAT))
waveFile.setframerate(RATE)
waveFile.writeframes(b''.join(frames))
waveFile.close()
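If the goal is to hand the audio to other code rather than to leave a file on disk, the same wave calls can target an in-memory buffer. This is a minimal sketch using only the standard library; frames_to_wav_bytes is a hypothetical helper name, and the frames argument is assumed to be a list of raw PCM chunks like the one built above.

```python
import io
import wave


def frames_to_wav_bytes(frames, channels, sample_width, rate):
    """Pack raw PCM chunks into a complete WAV image held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(rate)
        wav_file.writeframes(b"".join(frames))
    # The wave module leaves a caller-supplied buffer open, so it can
    # still be read here.
    return buffer.getvalue()
```

The returned bytes start with a normal RIFF/WAVE header, so they can be parsed again with wave.open(io.BytesIO(blob), "rb").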

I was searching for a Python solution to the live noise cancelling problem, because I have a noisy neighbor. In my search I found this:
rattlesnake - A python application that does noise cancellation
As far as I can tell, its live mode captures noise from a microphone while playing an audio file. The output stream plays the audio file mixed with the inverted noise waveform to cancel the noise, like earphones with a noise cancelling system.
I'm planning to build a noise cancelling system on a Raspberry Pi running something like this live mode, to create a silence zone at home. As far as I can tell, this requires some changes to the original code, because the live mode requires an mp3 file as a parameter.
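The idea of an inverted waveform can be illustrated on raw PCM data of the kind stream.read returns. This is only a sketch of the sample-level inversion, assuming little-endian signed 16-bit samples; it is not taken from rattlesnake, and invert_pcm16 is a hypothetical name. A real system also has to deal with latency between capture and playback, which this ignores.

```python
import struct


def invert_pcm16(data):
    """Return `data` with every signed 16-bit sample negated."""
    count = len(data) // 2
    samples = struct.unpack("<%dh" % count, data[:count * 2])
    # -(-32768) does not fit in 16 bits, so clamp it to 32767.
    inverted = [min(-s, 32767) for s in samples]
    return struct.pack("<%dh" % count, *inverted)
```

Adding a chunk to its inverted copy sample by sample gives silence (apart from the clamped edge case), which is the cancelling effect described above.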

What are the differences between a WAV file (.wav) and a WAVE audio file (.wave)?

I am trying to use the PyAudio library to record guitar audio through my USB audio interface in a Python project. When I use audio applications such as Audacity to save the audio, I get a WAV (.wav) file which can be played with apps such as Groove Music and Windows Media Player, and which I can manipulate as needed.
However, I now need to implement recording in the project itself, and when I use PyAudio to record guitar input, it saves the audio as a WAVE Audio File (.wave), which cannot be manipulated in the program and cannot be played with the playsound library. When I try to play it from my file manager, it only plays in iTunes; Groove Music and Windows Media Player don't support it.
Everything I find online describes WAVE and WAV files as the same thing, so I am unsure why I am having this issue. My code is shown below. Any help or advice would be appreciated!
import pyaudio
import wave
from playsound import playsound
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "live_guitar_input.wave"
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)
print("NOW RECORDING")
frames = []
for i in range(0, int(RATE/CHUNK*RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("Finished Recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
playsound(WAVE_OUTPUT_FILENAME)

As the O.P. pointed out, "wave" and "wav" are the same format. However, the file manager application does not recognize the .wave extension. The solution is simply to rename "file.wave" to "file.wav" (or to set WAVE_OUTPUT_FILENAME to a name ending in .wav in the first place).
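For files that were already saved with the longer extension, the rename can be scripted. A minimal sketch using the standard library; rename_to_wav is a hypothetical helper name, and it only changes the file name, not the contents.

```python
from pathlib import Path


def rename_to_wav(path):
    """Rename a file ending in .wave so that it ends in .wav instead."""
    source = Path(path)
    if source.suffix.lower() != ".wave":
        return source  # nothing to do
    target = source.with_suffix(".wav")
    source.rename(target)
    return target
```

For example, rename_to_wav("live_guitar_input.wave") would leave a file named live_guitar_input.wav in the same folder.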

Python read playing sound data [duplicate]

I'm trying to record the output from my computer speakers with PyAudio.
I tried to modify the code example given in the PyAudio documentation, but it doesn't work.
Technically, there's no error. I get the file output.wav and I can open it, but there's no sound. In Audacity, I only see a flat line.
What's going wrong?
import pyaudio
import wave
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
SPEAKERS = p.get_default_output_device_info()["hostApi"] #The part I have modified
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK,
                input_host_api_specific_stream_info=SPEAKERS) #The part I have modified
print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

In case someone is still stumbling over this like me, I found a PyAudio fork that can record the output on Windows.
Explanation:
The official PyAudio build isn't able to record the output. However, Windows Vista and above include a newer API, WASAPI, which can open a stream to an output device in loopback mode. In this mode the stream behaves like an input stream, so the outgoing audio can be recorded.
To enable this mode, a special flag (AUDCLNT_STREAMFLAGS_LOOPBACK) has to be set. Since this flag is not supported in the official build, both PortAudio and PyAudio need to be edited to add loopback support.
New option:
"as_loopback":(true|false)

If you are building an application on Windows, you can use the default Stereo Mix virtual device to record your PC's output.
1) Enable Stereo Mix.
2) Connect PyAudio to the Stereo Mix device like this:
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                input_device_index=dev_index,
                frames_per_buffer=CHUNK)
where dev_index is the index of your Stereo Mix device.
3) List your devices to find the required index:
for i in range(p.get_device_count()):
    print(p.get_device_info_by_index(i))
Alternatively, you can look up the index automatically by device name:
for i in range(p.get_device_count()):
    dev = p.get_device_info_by_index(i)
    if dev['name'] == 'Stereo Mix (Realtek(R) Audio)' and dev['hostApi'] == 0:
        dev_index = dev['index']
        print('dev_index', dev_index)
4) Continue working with PyAudio just as when recording from a microphone:
data = stream.read(CHUNK)
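The exact device name differs between machines, so matching on a fragment of the name may be more forgiving than the full string. A minimal sketch; find_input_device is a hypothetical helper name, and devices is assumed to be a list of dicts shaped like the ones get_device_info_by_index returns (with 'name', 'index' and 'maxInputChannels' keys).

```python
def find_input_device(devices, name_fragment):
    """Return the index of the first input-capable device whose name
    contains `name_fragment` (case-insensitive), or None if none matches.
    """
    wanted = name_fragment.lower()
    for dev in devices:
        if wanted in dev["name"].lower() and dev["maxInputChannels"] > 0:
            return dev["index"]
    return None


# Possible use with PyAudio (needs an initialised `p = pyaudio.PyAudio()`):
# devices = [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
# dev_index = find_input_device(devices, "Stereo Mix")
```

A return value of None usually means the Stereo Mix device is still disabled or is named differently on that system.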

I managed to record my speaker output with PyAudio, using some configuration and the code from PyAudio's documentation.
Code
"""PyAudio example: Record a few seconds of audio and save to a WAVE file."""
import pyaudio
import wave
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)
print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
Configuration
First, with PulseAudio running, create a loopback device:
pacmd load-module module-loopback latency_msec=5
Then set the default (fallback) device to this loopback device in pavucontrol.
Then you can start the script, wait 5 seconds, and you should have an output.wav.

You can't record from an output stream as though it were input. To record, you need to connect PyAudio to an input device, like a microphone. At least that's the normal way to do things.
Try connecting to a microphone first, and see if you get anything. If this works, then try doing something unusual.
As a small speedup to your iterations, rather than recording and looking at the file, it's often easier just to print out the max for a few chunks to make sure you're bringing in data. Usually just watching the numbers scroll by and comparing them to the sound gives a quick estimate of whether things are correctly connected.
import audioop
mx = audioop.max(data, 2)  # peak absolute sample value; 2 = bytes per sample
print(mx)
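The audioop module was removed from the standard library in Python 3.13, so on newer interpreters the same check needs another route. A minimal sketch of an equivalent peak measurement, assuming little-endian signed 16-bit samples (paInt16); peak_pcm16 is a hypothetical name.

```python
import struct


def peak_pcm16(data):
    """Largest absolute sample value in little-endian 16-bit PCM `data`."""
    count = len(data) // 2
    if count == 0:
        return 0
    samples = struct.unpack("<%dh" % count, data[:count * 2])
    return max(abs(s) for s in samples)
```

Printing peak_pcm16(data) for each chunk gives the same kind of scrolling numbers: values near 0 for every chunk suggest that no real signal is arriving.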

The speaker is an output stream even if you open it as an input. The hostApi value of the speaker is probably 0.
You can check the 'maxInputChannels' and 'maxOutputChannels' of every connected device; maxInputChannels for the speaker should be 0.
You can't write to an input stream and you can't read from an output stream.
You can detect the available devices with the following code:
import pyaudio
# detect devices:
p = pyaudio.PyAudio()
host_info = p.get_host_api_info_by_index(0)
device_count = host_info.get('deviceCount')
devices = []
# iterate over devices:
for i in range(0, device_count):
    device = p.get_device_info_by_host_api_device_index(0, i)
    devices.append(device['name'])
print(devices)
After you have the list of connected devices, you can check the 'hostApi' of each device.
For instance, if the speaker index is 5, then:
p.get_device_info_by_host_api_device_index(0, 5)['hostApi']
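The channel counts mentioned above can be used to sort devices into those that can be read from and those that can be written to. A minimal sketch; split_by_direction is a hypothetical helper name, and devices is assumed to be a list of PyAudio-style device-info dicts.

```python
def split_by_direction(devices):
    """Group device names by whether they can capture or play audio.

    A device with both input and output channels appears in both lists.
    """
    inputs = [d["name"] for d in devices if d["maxInputChannels"] > 0]
    outputs = [d["name"] for d in devices if d["maxOutputChannels"] > 0]
    return inputs, outputs
```

A speaker that shows up only in the second list cannot be opened with input=True, which matches the silent recording described in the question.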

Is playing back an ongoing recording possible?

I want to record the audio captured by my laptop's mic and then, after some delay, play it back through the headphones connected to the laptop. What I tried is recording the incoming audio in 10-second batches as a background process and, once the first 10-second clip is recorded, starting to play it back in the background through the headphones. The problem is that when I combine all the clips at the end of the recording, some samples are missing: they are lost while one recording is stopped and the next one is started.
So, is it possible to let the recording continue and, after some samples are collected, start playing that ongoing recording? Or is there any other workaround for the lost samples?

If you just want to record and play back, PyAudio has good basic examples in its documentation.
However, if you need to customize the delay between recording and playback, there are various approaches, depending on how much complexity and effort you want.
One way is to record and save chunks of audio to files and play them back sequentially after some delay.
It is also possible to keep small chunks as objects in memory (although I haven't tried that yet).
Playback and recording can be threaded or spawned as processes to run simultaneously. I attempted multiprocessing; however, since I don't have a multi-core CPU, it may not appear to be working. You are welcome to develop it further.
So, as discussed, we first record and save chunks of audio to files using the record_audio function.
import pyaudio
import wave
import time
from multiprocessing import Process
def record_audio(AUDIO_FILE):
    # create audio stream
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    # begin recording
    print("* recording audio clip: {0}".format(AUDIO_FILE))
    frames = []
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    # print("* done recording audio clip: {0}".format(AUDIO_FILE))
    # clean up objects
    stream.stop_stream()
    stream.close()
    # save frames to audio clip
    print("* sending data to audio file: {0}".format(AUDIO_FILE))
    wf = wave.open(AUDIO_FILE, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()
Next, we define the play_audio function to play the audio clips.
def play_audio(AUDIO_FILE):
    # open saved audio clip
    wf2 = wave.open(AUDIO_FILE, 'rb')
    # introduce playback delay
    time.sleep(AUDIO_DELAY)
    # define playback audio stream
    stream2 = p.open(format=p.get_format_from_width(wf2.getsampwidth()),
                     channels=wf2.getnchannels(),
                     rate=wf2.getframerate(),
                     output=True)
    data = wf2.readframes(CHUNK)
    print(" *************************** playing back audio file: {0}".format(AUDIO_FILE))
    # readframes returns an empty result at end of file, which ends the loop
    while data:
        stream2.write(data)
        data = wf2.readframes(CHUNK)
    stream2.stop_stream()
    stream2.close()
    wf2.close()
    p.terminate()
Then we put the two functions together in main and (attempt to) kick them off simultaneously.
if __name__ == '__main__':
    CHUNK = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 2  # stereo
    RATE = 44100
    RECORD_SECONDS = 5  # record chunks of 5 sec
    TOTAL_RECORD_NUMBER = 5  # total chunks to record and play
    AUDIO_DELAY = 5.0  # playback delay in seconds
    x = 0
    while x < TOTAL_RECORD_NUMBER:
        # define audio file clip
        AUDIO_FILE = "audio{0}.wav".format(x)
        # initialize pyaudio
        p = pyaudio.PyAudio()
        # kick off record audio function process
        # (note: record_audio(AUDIO_FILE) is evaluated right here, in the main process)
        p1 = Process(target = record_audio(AUDIO_FILE))
        p1.start()
        # kick off play audio function process
        # (note: play_audio(AUDIO_FILE) is also evaluated here, before the Process is created)
        p2 = Process(target = play_audio(AUDIO_FILE))
        p2.start()
        p1.join()
        p2.join()
        # increment record counter
        x += 1
Output:
Python 2.7.9 (default, Dec 10 2014, 12:24:55) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>>
* recording audio clip: audio0.wav
* sending data to audio file: audio0.wav
*************************** playing back audio file: audio0.wav
* recording audio clip: audio1.wav
* sending data to audio file: audio1.wav
*************************** playing back audio file: audio1.wav
* recording audio clip: audio2.wav
* sending data to audio file: audio2.wav
*************************** playing back audio file: audio2.wav
* recording audio clip: audio3.wav
* sending data to audio file: audio3.wav
*************************** playing back audio file: audio3.wav
* recording audio clip: audio4.wav
* sending data to audio file: audio4.wav
*************************** playing back audio file: audio4.wav
>>>
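As you can see, the processes did not run simultaneously. The reason is visible in the code: Process(target = record_audio(AUDIO_FILE)) calls record_audio immediately in the main process and passes its return value (None) as the target, so each clip is fully recorded, and then fully played, before the next one starts. Passing the function and its arguments separately, as in Process(target=record_audio, args=(AUDIO_FILE,)), is what would let it run in a separate process. You may develop it further.
Hope this helps.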
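An alternative that avoids stopping and restarting the recording is to keep reading from one stream and hold the chunks in a buffer until they are old enough to play. The following is a minimal sketch of that idea, not a tested PyAudio program: delayed_passthrough is a hypothetical name, and both stream arguments only need read/write methods shaped like a PyAudio stream.

```python
from collections import deque


def delayed_passthrough(in_stream, out_stream, chunk, delay_chunks, total_chunks):
    """Copy audio from in_stream to out_stream, delayed by delay_chunks chunks.

    Recording never stops: each loop iteration reads one chunk and, once
    the buffer holds more than `delay_chunks` chunks, plays the oldest one.
    Returns everything that was recorded, as a list of chunks.
    """
    pending = deque()  # chunks recorded but not yet played
    recorded = []      # the complete recording, in order
    for _ in range(total_chunks):
        data = in_stream.read(chunk)
        recorded.append(data)
        pending.append(data)
        if len(pending) > delay_chunks:
            out_stream.write(pending.popleft())
    # Input has ended: play whatever is still waiting in the buffer.
    while pending:
        out_stream.write(pending.popleft())
    return recorded
```

With PyAudio, both arguments could be the same stream opened with input=True and output=True, and a 10-second delay would correspond to delay_chunks = int(10 * RATE / CHUNK). Whether the blocking read and write calls keep up without input overflow on a given machine is something to test.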
