I'm going to implement a voice chat in Python. I have seen a few examples of how to play and how to record sound, and many of them use the PyAudio library.
I can record my voice and save it to a .wav file, and I can play a .wav file. What I want is to record for 5 seconds and then play the recording back directly. I don't want to save it to a file first and then play it; that is not suitable for voice chat.
Here is my audio record code:
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT, channels=1, rate=RATE,
                input=True, output=True,
                frames_per_buffer=CHUNK_SIZE)
num_silent = 0
snd_started = False
r = array('h')
while 1:
    # little endian, signed short
    snd_data = array('h', stream.read(CHUNK_SIZE))
    if byteorder == 'big':
        snd_data.byteswap()
    r.extend(snd_data)
    silent = is_silent(snd_data)
    if silent and snd_started:
        num_silent += 1
    elif not silent and not snd_started:
        snd_started = True
    if snd_started and num_silent > 30:
        break
Now I want to play it without saving. I don't know how to do it.
I do not like this library. Try 'sounddevice' and 'soundfile' instead; they are very easy to use.
To record and play back your voice, use this:
import sounddevice as sd
import soundfile as sf
sr = 44100
duration = 5
myrecording = sd.rec(int(duration * sr), samplerate=sr, channels=2)
sd.wait()
sd.play(myrecording, sr)
sd.wait()  # sd.play returns immediately, so wait until playback has finished
sf.write("New Record.wav", myrecording, sr)
Having looked through the PyAudio documentation, you've got everything set up as it should be. What you're forgetting is that stream is a duplex descriptor: you can read from it to record sound (as you have done with stream.read) and write to it to play sound (with stream.write).
Thus the last few lines of your example code should be:
# Play back collected sound.
stream.write(r.tobytes())  # stream.write expects raw bytes, so convert the array first
# Cleanup the stream and stop PyAudio
stream.stop_stream()
stream.close()
p.terminate()
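For the "record 5 seconds, then play" part of the question, here is a minimal sketch of the two pieces of arithmetic involved, written so that it runs without any audio hardware. The helper names chunks_needed and to_playback_bytes are made up for this illustration and are not part of PyAudio.

```python
import math
from array import array

def chunks_needed(seconds, rate, chunk_size):
    """Number of stream.read(chunk_size) calls that cover `seconds` of audio."""
    return math.ceil(seconds * rate / chunk_size)

def to_playback_bytes(samples):
    """Return 16-bit samples as raw bytes (native byte order) for stream.write."""
    return array('h', samples).tobytes()

# 5 seconds at 44100 Hz, read 1024 frames at a time
n_chunks = chunks_needed(5, 44100, 1024)
payload = to_playback_bytes([0, 1, -1])
```

With a real duplex stream you would call stream.read(CHUNK_SIZE) that many times, collect the samples in memory as in the question, and then pass the converted bytes to stream.write. Nothing is written to disk.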
Related
I am trying to record from my mic with PyAudio. The problem is that when I record using
FORMAT = pyaudio.paUInt8
I cannot hear any sound when the recorded file is played back. If I use the paInt16 format instead, I can hear the recorded voice without any problem. I am using VLC for playback. My code is below:
import pyaudio
import wave
import threading
#FORMAT = pyaudio.paInt16 # working properly
FORMAT = pyaudio.paUInt8 # Not hearing any sound on play back
CHANNELS = 1
RATE = 8000
CHUNK = 2040
WAVE_OUTPUT_FILENAME = "file.wav"
stop_ = False
audio = pyaudio.PyAudio()
stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE, input=True, input_device_index=0,
                    frames_per_buffer=CHUNK)
def stop():
    global stop_
    while True:
        if not input('Press Enter >>>'):
            print('exit')
            stop_ = True
t = threading.Thread(target=stop, daemon=True).start()
frames = []
while True:
    data = stream.read(CHUNK)
    frames.append(data)
    if stop_:
        break
stream.stop_stream()
stream.close()
audio.terminate()
waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
waveFile.setnchannels(CHANNELS)
waveFile.setsampwidth(audio.get_sample_size(FORMAT))
waveFile.setframerate(RATE)
waveFile.writeframes(b''.join(frames))
waveFile.close()
If I convert the file recorded with pyaudio.paInt16 to uint8 format using Audacity, it plays fine in VLC.
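As a rough code equivalent of that Audacity conversion, here is a sketch that turns signed 16-bit PCM into unsigned 8-bit PCM. It relies on the fact that WAV stores 8-bit samples as unsigned values (silence at 128) and 16-bit samples as signed values (silence at 0). The function name int16_to_uint8 is made up for this example, and Audacity may additionally apply dithering, which this does not.

```python
from array import array

def int16_to_uint8(pcm16):
    """Convert native-endian signed 16-bit PCM bytes to unsigned 8-bit PCM bytes."""
    samples = array('h')
    samples.frombytes(pcm16)
    # Keep the top 8 bits, then shift the signed range -128..127 up to 0..255.
    return bytes((s >> 8) + 128 for s in samples)

# Minimum, zero and maximum 16-bit sample values
converted = int16_to_uint8(array('h', [-32768, 0, 32767]).tobytes())
```

The result can be written with setsampwidth(1), since each sample now takes one byte.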
I'm writing a simple player in Python using the PyAudio library, with some basic functionality such as starting playback, pausing, and setting the start position.
I started from the first example in the documentation:
import pyaudio
import wave
import sys
CHUNK = 1024
if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)
wf = wave.open(sys.argv[1], 'rb')
# instantiate PyAudio (1)
p = pyaudio.PyAudio()
# open stream (2)
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
# read data
data = wf.readframes(CHUNK)
# play stream (3)
while len(data) > 0:
    stream.write(data)
    data = wf.readframes(CHUNK)
# stop stream (4)
stream.stop_stream()
stream.close()
# close PyAudio (5)
p.terminate()
It works perfectly, but I don't know where to add a frame offset so that playback starts at a specific frame.
I saw that there are different libraries available, but PyAudio allows me to read the raw data from the file in real time, and I need this functionality.
Do you have any suggestions?
You just have to work out how far to move into the audio. Note that the wave module's readframes() and setpos() count frames (one sample for every channel), not bytes, so the offset is simply seconds times the sample rate.
nbytes = wf.getsampwidth() # Number of bytes per sample for 1 channel
nchannels = wf.getnchannels() # Number of channels
sample_rate = wf.getframerate() # Number of frames per second, e.g. 44100
nbytes_per_frame = nbytes * nchannels # Size of one frame, only needed if you want byte counts
skip_seconds = 5 # Skip 5 seconds
skip_frames = min(int(skip_seconds * sample_rate), wf.getnframes()) # Don't seek past the end
wf.setpos(skip_frames) # Jump to the offset without reading the skipped data
Start playing the file from that offset:
# read data
data = wf.readframes(CHUNK)
# play stream (3)
while len(data) > 0:
    stream.write(data)
    data = wf.readframes(CHUNK)
# stop stream (4)
stream.stop_stream()
stream.close()
# close PyAudio (5)
p.terminate()
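To check the frame arithmetic without a sound card, here is a small self-contained sketch that builds a silent WAV file in memory and seeks into it. The helper name seek_seconds and the 8000 Hz test file are assumptions made for the example only.

```python
import io
import wave

def seek_seconds(wav_file, skip_seconds):
    """Move an open wave reader to `skip_seconds` and return the frame index."""
    target = int(skip_seconds * wav_file.getframerate())
    target = min(target, wav_file.getnframes())  # setpos raises past the end
    wav_file.setpos(target)
    return target

# Build a 2-second silent mono 16-bit file in memory.
buf = io.BytesIO()
with wave.open(buf, 'wb') as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b'\x00\x00' * 16000)
buf.seek(0)

wf = wave.open(buf, 'rb')
start = seek_seconds(wf, 1)                 # one second in
remaining = wf.getnframes() - wf.tell()     # frames left to play
first_chunk = wf.readframes(1024)           # what the playback loop would write
```

After the seek, the usual readframes/stream.write loop continues from the new position.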
I am working on a speech interface in Python, and I am having trouble with audio playback.
What do you use to play back simple mp3 files on the Raspberry Pi?
I need to play audio and, 2 seconds before the end of the playback, start another task (opening the microphone stream).
How can I achieve this? My problem is that I haven't yet found a way to read the current position of the playback in seconds. If I could read it, I would just start a new thread when the current time reaches audiolength - 2 seconds.
I hope you can help me or have some experience with this.
I found a solution to this.
PyAudio provides a way to play audio chunk by chunk. That lets you keep track of how much has been played so far and compare it to the overall length of the audio.
import pyaudio
import wave
class AudioPlayer():
    """AudioPlayer class"""
    def __init__(self):
        self.chunk = 1024
        self.audio = pyaudio.PyAudio()
        self._running = True
    def play(self, audiopath):
        self._running = True
        # storing how many frames we have read already
        self.chunktotal = 0
        wf = wave.open(audiopath, 'rb')
        stream = self.audio.open(format=self.audio.get_format_from_width(wf.getsampwidth()),
                                 channels=wf.getnchannels(),
                                 rate=wf.getframerate(),
                                 output=True)
        print(wf.getframerate())
        # read data (based on the chunk size)
        data = wf.readframes(self.chunk)
        # THIS IS THE TOTAL LENGTH OF THE AUDIO (in seconds)
        audiolength = wf.getnframes() / float(wf.getframerate())
        # bytes per frame, so the last (shorter) chunk is counted correctly
        frame_size = wf.getsampwidth() * wf.getnchannels()
        # readframes returns bytes, so the end of the file is b'' (not '')
        while self._running and data != b'':
            stream.write(data)
            self.chunktotal = self.chunktotal + len(data) // frame_size
            # calculating the percentage
            percentage = (self.chunktotal / wf.getnframes()) * 100
            # calculating the current seconds
            current_seconds = self.chunktotal / float(wf.getframerate())
            data = wf.readframes(self.chunk)
        # cleanup stream
        stream.close()
        wf.close()
    def stop(self):
        self._running = False
Hope it helps someone,
Alex
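To make the "2 seconds before the end" part explicit, here is a hedged sketch of the same chunk-counting idea with a callback that fires once when no more than a given number of seconds remain. The function play_with_early_trigger and its parameters are invented for this illustration; the write argument stands in for stream.write, so the sketch runs on an in-memory file without PyAudio. The position is counted in frames handed to write, which is only an approximation of what has actually left the speakers.

```python
import io
import wave

def play_with_early_trigger(wf, write, on_near_end, lead_seconds=2.0, chunk=1024):
    """Feed `wf` to `write` chunk by chunk and call `on_near_end` once
    when no more than `lead_seconds` of audio remain."""
    rate = wf.getframerate()
    total = wf.getnframes()
    frame_size = wf.getsampwidth() * wf.getnchannels()
    frames_done = 0
    fired = False
    data = wf.readframes(chunk)
    while data:
        remaining = (total - frames_done) / rate
        if not fired and remaining <= lead_seconds:
            on_near_end()
            fired = True
        write(data)
        frames_done += len(data) // frame_size
        data = wf.readframes(chunk)
    return frames_done / rate

# Demo on a 3-second silent mono file held in memory.
buf = io.BytesIO()
with wave.open(buf, 'wb') as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b'\x00\x00' * 24000)
buf.seek(0)

written = []   # stands in for the audio device
events = []    # records how many chunks were written when the trigger fired
seconds_played = play_with_early_trigger(
    wave.open(buf, 'rb'), written.append, lambda: events.append(len(written)))
```

With a real stream you would pass stream.write as write and, for on_near_end, a function that starts the thread opening the microphone.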
Try just_playback. It's a wrapper I wrote around miniaudio that provides playback control functionality like pausing, resuming, seeking, getting the current playback positions and setting the playback volume.
I have been searching for this since last week. I also tried PyAudio, but when I used a fork of it, the system audio was mixed with the microphone audio. I could not find any other module for this, so I finally asked the question.
Edit:
import pyaudio
import wave
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
SPEAKERS = p.get_default_output_device_info()["hostApi"] #The modified part
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK,
                input_host_api_specific_stream_info=SPEAKERS,
                as_loopback=True) #The part I have modified
print("* recording")
frames = []
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS) + 1):
    data = stream.read(CHUNK)
    frames.append(data)
print("* done recording")
stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
This code was taken from Stack Overflow. It records the speaker output, but the output is mixed with the microphone input.
Also the pyaudio module used was from the fork : https://github.com/intxcc/pyaudio_portaudio.
using https://github.com/intxcc/pyaudio_portaudio
This only records the audio of the device specified by "device_id"
import pyaudio
import wave
chunk = 1024 # Record in chunks of 1024 samples
sample_format = pyaudio.paInt16 # 16 bits per sample
channels = 2
fs = 44100 # Record at 44100 samples per second
seconds = 3
filename = "output.wav"
p = pyaudio.PyAudio() # Create an interface to PortAudio
#Select Device
print ( "Available devices:\n")
for i in range(0, p.get_device_count()):
    info = p.get_device_info_by_index(i)
    print(str(info["index"]) + ": \t %s \n \t %s \n" % (info["name"], p.get_host_api_info_by_index(info["hostApi"])["name"]))
#ToDo change to your device ID
device_id = 7
device_info = p.get_device_info_by_index(device_id)
channels = max(device_info["maxInputChannels"], device_info["maxOutputChannels"])
# Use the device's own rate for the loop count and the WAV header too,
# otherwise the saved file plays at the wrong speed when the rates differ
fs = int(device_info["defaultSampleRate"])
# https://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.Stream.__init__
stream = p.open(format=sample_format,
                channels=channels,
                rate=int(device_info["defaultSampleRate"]),
                input=True,
                frames_per_buffer=chunk,
                input_device_index=device_info["index"],
                as_loopback=True
                )
frames = [] # Initialize array to store frames
print('\nRecording', device_id, '...\n')
# Store data in chunks for 3 seconds
for i in range(0, int(fs / chunk * seconds)):
    data = stream.read(chunk)
    frames.append(data)
# Stop and close the stream
stream.stop_stream()
stream.close()
# Terminate the PortAudio interface
p.terminate()
print('Finished recording')
# Save the recorded data as a WAV file
wf = wave.open(filename, 'wb')
wf.setnchannels(channels)
wf.setsampwidth(p.get_sample_size(sample_format))
wf.setframerate(fs)
wf.writeframes(b''.join(frames))
wf.close()
P.S. check out https://github.com/intxcc/pyaudio_portaudio/tree/master/example
This can be done with soundcard. You will have to figure out which device index to use for your loopback. This code prints out the ones you will have to choose from. I found the correct one by looping over all of them and seeing which produced non zeros when speakers were playing.
pip install soundcard
import soundcard as sc
import time
# get a list of all speakers:
speakers = sc.all_speakers()
# get the current default speaker on your system:
default_speaker = sc.default_speaker()
# get a list of all microphones, including loopback devices:
mics = sc.all_microphones(include_loopback=True)
# pick the loopback device of your speaker (replace the placeholder with its index):
default_mic = mics[index of your speaker loopback here]
for i in range(len(mics)):
    try:
        print(f"{i}: {mics[i].name}")
    except Exception as e:
        print(e)
with default_mic.recorder(samplerate=148000) as mic, \
        default_speaker.player(samplerate=148000) as sp:
    print("Recording...")
    data = mic.record(numframes=1000000)
    print("Done...Stop your sound so you can hear playback")
    time.sleep(5)
    sp.play(data)
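The "loop over all of them and see which produces non-zeros" step could look roughly like the sketch below. It only covers the comparison, not the recording itself, and the name active_devices, the threshold value and the input layout (a dict mapping a device index to a sequence of frames, each frame holding one sample per channel) are assumptions for this illustration.

```python
def active_devices(recordings, threshold=1e-4):
    """Return the indices whose recording contains a sample louder than `threshold`."""
    active = []
    for index, frames in recordings.items():
        peak = max((abs(sample) for frame in frames for sample in frame), default=0.0)
        if peak > threshold:
            active.append(index)
    return active

# Hypothetical short recordings taken while the speakers were playing
recordings = {
    0: [[0.0, 0.0], [0.0, 0.0]],    # silent device
    1: [[0.0, 0.2], [-0.1, 0.0]],   # device that picked up the speaker output
    2: [],                          # device that returned nothing
}
candidates = active_devices(recordings)
```

The recordings would come from the mic.record(...) call shown above, run once for each entry in mics while something is playing.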
I installed a virtual sound card (BlackHole) on a Mac to record the system audio, and it worked.
I only record the system audio, without the microphone audio, as I don't need it.
On Ubuntu, you can use 'pavucontrol' to change the recording source. An example of recording audio directly from the speakers (without using a microphone):
First you run a script like the one below:
import pyaudio
mic = pyaudio.PyAudio()
stream = mic.open(format=pyaudio.paInt16, channels=1, rate=44100, input=True, output=True, frames_per_buffer=2048)
stream.start_stream()
if __name__ == '__main__':
    while True:
        data = stream.read(1024)
        # Do something with sound
Then you can change the recording source (Recording tab) from 'Built-in Audio Analog Stereo' to 'Monitor of Built-in Audio Analog Stereo'.
With this approach, you can analyze the sound from the speakers during the video call.
I've seen the tutorial on the PyAudio website for making a fixed-length recording, but I was wondering how I could do the same with a recording of no fixed length. Basically, I want to create buttons to start and end the recording, but I haven't found anything on the matter. Any ideas? I am not looking for an alternative library.
Best is to use the non-blocking way of recording, i.e. you provide a callback function that gets called from the moment you start the stream and keeps getting called for every block/buffer that gets processed until you stop the stream.
In that callback function you check a boolean, for example: when it is true you write the incoming buffer to a data structure, and when it is false you ignore the incoming buffer. This boolean can be set by clicking a button, for example.
EDIT: look at the example of wire audio: http://people.csail.mit.edu/hubert/pyaudio/#wire-callback-example
The stream is opened with an argument
stream_callback=my_callback
Where my_callback is a regular function declared as
def my_callback(in_data, frame_count, time_info, status)
This function will be called every time a new buffer is available. in_data contains the input, which is what you want to record. In this example, in_data is simply returned in a tuple together with pyaudio.paContinue, which means that the incoming buffer from the input device is copied back into the output buffer sent to the output device (it's the same device, so it's actually routing input to output, aka a wire). See the API docs for a bit more explanation: http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.PyAudio.open
So in this function you can do something like (this is an extract from some code I've written, which is not complete: I use some functions not depicted. Also I play a sinewave on one channel and noise on the other in 24bit format.):
record_on = False
playback_on = False
recorded_frames = queue.Queue()
def callback_play_sine(in_data, frame_count, time_info, status):
    if record_on:
        global recorded_frames
        recorded_frames.put(in_data)
    if playback_on:
        left_channel_data = mysine.next_block(frame_count) * MAX_INT24 * gain
        right_channel_data = ((np.random.rand(frame_count) * 2) - 1) * MAX_INT24 * gain
        data = interleave_channels(max_nr_of_channels, (left_output_channel, left_channel_data), (right_output_channel, right_channel_data))
        data = convert_int32_to_24bit_bytestream(data)
    else:
        # tostring() was removed from recent NumPy versions; tobytes() is the replacement
        data = np.zeros(frame_count*max_nr_of_channels).tobytes()
    if stop_callback:
        callback_flag = pyaudio.paComplete
    else:
        callback_flag = pyaudio.paContinue
    return data, callback_flag
You can then set record_on and playback_on to True or False from another part of your code while the stream is open/running, causing recording and playback to start or stop independently without interrupting the stream.
I copy in_data into a (thread-safe) queue, which another thread reads from in order to write the data to disk; otherwise the queue would get big after a while.
BTW: PyAudio is based on PortAudio, which has much more documentation and helpful tips. For example (http://portaudio.com/docs/v19-doxydocs/writing_a_callback.html): the callback function has to finish before a new buffer is presented, else buffers will be lost. So writing to a file inside the callback function is usually not a good idea (though writes to a file get buffered, and I don't know whether it blocks when the data is eventually written to disk).
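Stripped down to just the start/stop recording part, the idea can be sketched as below. The class name ToggleRecorder is made up for this example, and PA_CONTINUE is a local stand-in for pyaudio.paContinue so that the sketch runs without PyAudio installed. It assumes an input-only stream, for which the callback has no output data to return.

```python
import queue

PA_CONTINUE = 0  # same value as pyaudio.paContinue, defined locally for this sketch

class ToggleRecorder:
    """Keeps incoming buffers only while recording is switched on."""
    def __init__(self):
        self.recording = False
        self.frames = queue.Queue()  # thread-safe hand-off to the rest of the program

    def start(self):
        self.recording = True

    def stop(self):
        self.recording = False

    def callback(self, in_data, frame_count, time_info, status):
        # Runs for every buffer; do as little work as possible here.
        if self.recording:
            self.frames.put(in_data)
        return (None, PA_CONTINUE)

    def drain(self):
        """Collect everything recorded so far as one bytes object."""
        chunks = []
        while True:
            try:
                chunks.append(self.frames.get_nowait())
            except queue.Empty:
                return b''.join(chunks)

# Simulate the stream calling the callback while buttons toggle recording.
rec = ToggleRecorder()
rec.callback(b'ignored', 1, None, 0)   # before "start" is pressed
rec.start()
result = rec.callback(b'kept', 1, None, 0)
rec.stop()
rec.callback(b'ignored', 1, None, 0)   # after "stop" is pressed
captured = rec.drain()
```

With PyAudio you would pass stream_callback=rec.callback when opening the stream and wire rec.start and rec.stop to your buttons.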
import pyaudio
import wave
import pygame, sys
from pygame.locals import *
pygame.init()
scr = pygame.display.set_mode((640, 480))
recording = True
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)
print("* recording")
frames = []
while True:
    if recording:
        data = stream.read(CHUNK)
        frames.append(data)
    for event in pygame.event.get():
        if event.type == KEYDOWN and recording:
            print("* done recording")
            stream.stop_stream()
            stream.close()
            p.terminate()
            wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
            wf.setnchannels(CHANNELS)
            wf.setsampwidth(p.get_sample_size(FORMAT))
            wf.setframerate(RATE)
            wf.writeframes(b''.join(frames))
            wf.close()
            recording = False
        if event.type == QUIT:
            pygame.quit(); sys.exit()
This is what I came up with when compiling it to an exe. Passing arguments to the exe:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-t', dest='time', action='store')
args = parser.parse_args()
time = int(args.time)
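A slightly more defensive variant, as a sketch: letting argparse do the integer conversion and supplying a default means the script does not crash when -t is left out. The function name parse_record_seconds and the default of 5 seconds are assumptions for this example, and the destination is called seconds here only to avoid shadowing the time module.

```python
import argparse

def parse_record_seconds(argv=None):
    """Return the recording length in seconds taken from the -t option."""
    parser = argparse.ArgumentParser(description="Record for a given number of seconds.")
    parser.add_argument('-t', dest='seconds', type=int, default=5,
                        help="recording length in seconds (assumed default: 5)")
    args = parser.parse_args(argv)
    return args.seconds

# argv=None would read sys.argv; explicit lists are used here for illustration
from_flag = parse_record_seconds(['-t', '3'])
from_default = parse_record_seconds([])
```

The returned value can be used wherever RECORD_SECONDS appears in the recording loop.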