How to record system audio in python - python

I tried to record audio with this code:
import sounddevice
from scipy.io.wavfile import write
fs = 44100
second = 3
file = sounddevice.rec(int(second * fs), samplerate=fs, channels=2)
sounddevice.wait()
write('output.wav', fs, file)
but it only records the microphone input, while I want to record the system's media sound.

Most operating systems do not offer the default output as a source to record from. You need to use audio routing tools to achieve that.
For example, on macOS, one could install a tool called Soundflower or one of its later alternatives. Then, at the OS level, send all output to that Soundflower "device"; after selecting it as the input source in sounddevice, you can record that output. Device selection is described at https://python-sounddevice.readthedocs.io/en/0.4.5/usage.html#device-selection
On Linux, JACK can probably do the same as Soundflower does on macOS. On Windows, no idea...
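Once such a routing device exists, picking it as the source is mostly a lookup by name. Here is a minimal sketch of that step; find_input_device and record_from are hypothetical helper names, and the device name you pass in (for example "Soundflower") depends on what the routing tool registers on your machine. It assumes the device list has the shape returned by sounddevice.query_devices().

```python
def find_input_device(devices, name_fragment):
    """Return the index of the first input-capable device whose name contains name_fragment."""
    wanted = name_fragment.lower()
    for index, dev in enumerate(devices):
        if wanted in dev["name"].lower() and dev["max_input_channels"] > 0:
            return index
    return None


def record_from(name_fragment, seconds=3, fs=44100, path="output.wav"):
    """Record from the matching device; needs sounddevice and scipy installed."""
    import sounddevice
    from scipy.io.wavfile import write

    device = find_input_device(sounddevice.query_devices(), name_fragment)
    if device is None:
        raise RuntimeError(f"no input device matching {name_fragment!r}")
    # Do not ask for more channels than the device offers
    channels = min(2, sounddevice.query_devices(device)["max_input_channels"])
    data = sounddevice.rec(int(seconds * fs), samplerate=fs,
                           channels=channels, device=device)
    sounddevice.wait()
    write(path, fs, data)
```

With the OS output routed to the virtual device, record_from("Soundflower") would then capture whatever is playing, under the assumption that the device name contains that text.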

Related

Python - Reading a large audio file to a stream?

The Question
I want to load an audio file of any type (mp3, m4a, flac, etc) and write it to an output stream.
I tried using pydub, but it loads the entire file at once which takes forever and runs out of memory easily.
I also tried using python-vlc, but it's been unreliable and too much of a black box.
So, how can I open large audio files chunk-by-chunk for streaming?
Edit #1
I found half of a solution here, but I'll need to do more research for the other half.
TL;DR: Use subprocess and ffmpeg to convert the file to wav data, and pipe that data into np.frombuffer. The problem is, the subprocess still has to finish before frombuffer is used.
...unless it's possible to have the pipe written to on 1 thread while np reads it from another thread, which I haven't tested yet. For now, this problem is not solved.
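For what it's worth, the pipe can be read while ffmpeg is still running, without a second thread, by reading fixed-size pieces from the process's stdout. The following is only a sketch under that assumption: ffmpeg_pcm_command and iter_chunks are hypothetical helper names, and the output is fixed to raw 16-bit PCM so that no WAV header needs parsing.

```python
import subprocess


def ffmpeg_pcm_command(path, rate=44100, channels=2):
    """Build an ffmpeg command that decodes `path` to raw 16-bit PCM on stdout."""
    return ["ffmpeg", "-loglevel", "error", "-i", path,
            "-f", "s16le", "-acodec", "pcm_s16le",
            "-ar", str(rate), "-ac", str(channels), "-"]


def iter_chunks(command, chunk_bytes=4096):
    """Yield the command's stdout in pieces while the process is still running."""
    with subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
        while True:
            chunk = proc.stdout.read(chunk_bytes)
            if not chunk:  # empty bytes means the process closed its stdout
                break
            yield chunk
```

Each chunk from iter_chunks(ffmpeg_pcm_command("my_audio_file.mp3")) could then go through np.frombuffer(chunk, dtype=np.int16). Picking chunk_bytes as a multiple of the frame size (2 bytes times the channel count) keeps samples from being split across chunks.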
I think the Python package https://github.com/irmen/pyminiaudio can be helpful. You can stream an audio file like this:
import miniaudio
audio_path = "my_audio_file.mp3"
target_sampling_rate = 44100  # the input audio will be resampled at this sampling rate
n_channels = 1  # either 1 or 2
waveform_duration = 30  # in seconds
offset = 15  # this means that we read only in the interval [15s, duration of file]
waveform_generator = miniaudio.stream_file(
    filename=audio_path,
    sample_rate=target_sampling_rate,
    seek_frame=int(offset * target_sampling_rate),
    frames_to_read=int(waveform_duration * target_sampling_rate),
    output_format=miniaudio.SampleFormat.FLOAT32,
    nchannels=n_channels)
for waveform in waveform_generator:
    pass  # do something with the waveform...
I know for sure that this works with mp3, ogg, wav and flac, but for some reason it does not work with mp4/aac, and I am actually looking for a way to read mp4/aac.

check if any devices on windows are playing sound python

I'm trying to detect system sounds on windows and I figured I could use the pyaudio module since winrt didn't work for me.
I've got this code that lists all the devices, and I know I can open streams with pyaudio
import pyaudio
p = pyaudio.PyAudio()
for i in range(p.get_device_count()):
    dev = p.get_device_info_by_index(i)
    print(dev)
but how can I tell whether any of these devices are currently outputting sound? Do I open a stream for each one and take the root mean square of the samples? If this is an XY problem and I'd be better off using another module, please let me know.
I've done this before with a library called PyCaw (Python Core Audio for Windows).
First, run pip install pycaw.
For some reason this install would only work when running as administrator, but that could be an issue specific to my machine
The following will list the processes currently outputting audio.
from pycaw.pycaw import AudioUtilities
sessions = AudioUtilities.GetAllSessions()
for session in sessions:
    print(session.Process)
For some reason there always seems to be a None reference at the end, but you should be able to deal with that pretty easily.
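Dealing with the None entry can be as simple as skipping it. A small sketch, where session_process_names is a hypothetical helper and session.Process is assumed to be a psutil-style process object with a name() method. Note that this lists the sessions that exist, which is not necessarily the same as the ones producing sound right now.

```python
def session_process_names(sessions):
    """Return the process names of audio sessions, skipping sessions without a process."""
    names = []
    for session in sessions:
        if session.Process is None:  # no process attached to this session
            continue
        names.append(session.Process.name())
    return names
```

On Windows this would be called as session_process_names(AudioUtilities.GetAllSessions()).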
In Python I didn't find any really effective module for capturing a PC's internal audio. Instead, I found a way to expose the output devices as an input using Stereo Mixer on Windows.
Please refer to this link to enable Stereo Mixer in your Windows environment.
After that, find the index at which the Stereo Mixer is located using this code:
import pyaudio
p = pyaudio.PyAudio()
for i in range(p.get_device_count()):
    print("\n\n Index " + str(i) + " :\n")
    dev = p.get_device_info_by_index(i)
    print(dev.get('name'))
    print("------------------------------------------------------------")
This code will list all the audio devices along with their indexes. Note the index of Stereo Mixer in your pc.
Here comes the main code to detect sound, replace the device variable's value with the index of Stereo Mixer. For me it was 2.
import pyaudio
import audioop
p = pyaudio.PyAudio()
CHUNK = 1024
FORMAT = pyaudio.paInt16
RATE = 44100
silent_threshold = 1
device = 2
stream = p.open(format=FORMAT,
                channels=p.get_device_info_by_index(device).get('maxInputChannels'),
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK,
                input_device_index=device)
try:
    while True:  # runs until interrupted with Ctrl+C
        data = stream.read(CHUNK)
        threshold = audioop.max(data, 2)  # peak absolute sample value in this chunk
        print(threshold)
        if threshold > silent_threshold:
            print("Sound found at index " + str(device))
        else:
            print("Sound not found at index " + str(device))
except KeyboardInterrupt:
    pass
finally:
    stream.close()
    p.terminate()
Note the use of silent_threshold: the more you increase its value, the less sensitive the detection becomes. For example, use 1 to detect absolute silence or 20 to detect quietness; the right value depends on the sound playback on your PC.
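One caveat: the audioop module was removed from the standard library in Python 3.13. The same peak value, and the root mean square the question asks about, can be computed without it. This is a sketch for 16-bit samples in native byte order (what paInt16 delivers); peak_and_rms is a hypothetical helper name.

```python
import math
from array import array


def peak_and_rms(data):
    """Return (peak, rms) for a buffer of native-endian signed 16-bit samples."""
    samples = array("h")
    samples.frombytes(data)
    if not samples:
        return 0, 0.0
    peak = max(abs(s) for s in samples)  # same idea as audioop.max(data, 2)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms
```

In the loop above, peak, rms = peak_and_rms(data) could replace the audioop.max call; comparing rms against the threshold instead of peak should be less sensitive to single stray samples.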

Getting the internal audio output of the speaker on macOS with python

I want to get the internal audio output of the speaker on macOS with python. I got it working on Windows but I can't get it running on macOS.
At the beginning we used PyAudio, but I figured that SoundDevice is the better option.
This is the working script for Windows:
import sounddevice as REC
import scipy.io.wavfile
# Recording properties
SAMPLE_RATE = 48000
SECONDS = 10
# Channels
MONO = 1
STEREO = 2
print(REC.query_devices())
# Command to get all devices listed: py -m sounddevice
# Device you want to record
REC.default.device = "PC-Lautsprecher (Realtek HD Audio output with SST)"
print(f'Recording for {SECONDS} seconds')
# Starts recording
recording = REC.rec(int(SECONDS * SAMPLE_RATE), samplerate=SAMPLE_RATE, channels=STEREO)
REC.wait() # Waits for recording to finish
print("done recording")
scipy.io.wavfile.write("test.wav", SAMPLE_RATE, recording)
Does anyone know a way to get this running on macOS (Core Audio)? Are there any other libraries that would work on macOS?

Playing audio in python at given timestamp

I am trying to find a way in python to play a section of an audio file given a start and end time.
For example, say I have an audio file that is 1 min in duration. I want to play the section from 0:30 to 0:45 seconds.
I do not want to process or splice the file, only playback of the given section.
Any suggestions would be greatly appreciated!
Update:
I found a great solution using pydub:
https://github.com/jiaaro/pydub
from pydub import AudioSegment
from pydub.playback import play
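audiofile = "audio.wav"  # path to the audio file (placeholder)
start_ms = 30 * 1000  # start of clip in milliseconds (0:30 in the example above)
end_ms = 45 * 1000  # end of clip in milliseconds (0:45 in the example above)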
sound = AudioSegment.from_file(audiofile, format="wav")
splice = sound[start_ms:end_ms]
play(splice)
Step one is to get Python to play the entire audio file; several libraries are available for this. Check whether the library has a time-specific API call. Failing that, you can always roll up your sleeves and implement it yourself: read the audio file into a buffer, or stream the file and stop streaming at the end of the chosen section.
Another alternative is to leverage command-line tools like ffmpeg, which is the Swiss Army knife of audio processing. ffmpeg has command-line parameters for a specific start and stop time. Also look at its sibling, ffplay.
Similar to ffplay/ffmpeg is another command-line audio tool called sox.
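As a rough illustration of the command-line route from Python, ffplay accepts -ss (start position) and -t (duration), so a section can be played without touching the file. The helper names ffplay_segment_command and play_segment below are hypothetical, and the sketch assumes ffplay is on the PATH.

```python
import subprocess


def ffplay_segment_command(path, start_s, end_s):
    """Build an ffplay command that plays only the part from start_s to end_s."""
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    return ["ffplay", "-nodisp", "-autoexit",
            "-ss", str(start_s), "-t", str(end_s - start_s), path]


def play_segment(path, start_s, end_s):
    """Run ffplay and block until the segment has finished playing."""
    subprocess.run(ffplay_segment_command(path, start_s, end_s), check=True)
```

For the example in the question, play_segment("audio.wav", 30, 45) would play from 0:30 to 0:45. The -nodisp and -autoexit flags suppress the display window and make ffplay quit when playback ends.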
Use PyMedia and Player. Look at the functions SeekTo() and SeekEndTime(). I think you will be able to find a right solution after playing around with these functions.
I always have trouble installing external libraries and if you are running your code on a server and you don't have sudo privileges then it becomes even more cumbersome. Don't even get me started on ffmpeg installation.
So, here's an alternative solution with scipy and native IPython that avoids the hassle of installing some other library.
from scipy.io import wavfile # to read and write audio files
import IPython #to play them in jupyter notebook without the hassle of some other library
def PlayAudioSegment(filepath, start, end, channel='none'):
    # get sample rate and audio data
    sample_rate, audio_data = wavfile.read(filepath)  # where filepath = 'directory/audio.wav'
    # get length in minutes of audio file
    print('duration: ', audio_data.shape[0] / sample_rate / 60, 'min')
    # splice the audio with preferred start and end times
    spliced_audio = audio_data[start * sample_rate : end * sample_rate, :]
    # choose left or right channel if preferred (0 or 1 for left and right, respectively; or leave as a string to keep as stereo)
    spliced_audio = spliced_audio[:, channel] if type(channel) == int else spliced_audio
    # playback natively with IPython; shape needs to be (nChannel, nSamples)
    return IPython.display.Audio(spliced_audio.T, rate=sample_rate)
Use like this:
filepath = 'directory_with_file/audio.wav'
start = 30 # in seconds
end = 45 # in seconds
channel = 0 # left channel
PlayAudioSegment(filepath,start,end,channel)
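A variant of the same idea that avoids loading the whole file: the standard-library wave module can seek to a frame and read only the frames in the requested range. This is a sketch for uncompressed WAV files only, and read_segment is a hypothetical helper name.

```python
import wave


def read_segment(path, start_s, end_s):
    """Return (params, frames) for the part of a WAV file between start_s and end_s."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        # Clamp both positions to the file length so setpos() never overshoots
        first = min(int(start_s * rate), wf.getnframes())
        last = min(int(end_s * rate), wf.getnframes())
        wf.setpos(first)
        frames = wf.readframes(max(0, last - first))
        return wf.getparams(), frames
```

The returned bytes could then be written to any output stream opened with the same sample width, channel count and rate, for example a PyAudio stream.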

PyAudio outputs slow/crackling/garbled audio

I'm trying to use pyaudio to play some wave files but I'm always having slow/crackling/garbled outputs.
When I play this wave file as described below, the audio plays just fine:
$ wget http://www.freespecialeffects.co.uk/soundfx/sirens/police_s.wav
$ aplay police_s.wav
However, when I run the 'play_wave.py' example from /pyaudio/test, the audio is so slow and garbled that it is useless for any application.
"""PyAudio Example: Play a wave file."""
import pyaudio
import wave
import sys
CHUNK = 1024
#if len(sys.argv) < 2:
# print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
# sys.exit(-1)
wf = wave.open('police_s.wav', 'rb')
# instantiate PyAudio (1)
p = pyaudio.PyAudio()
# open stream (2)
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
channels=wf.getnchannels(),
rate=wf.getframerate(),
output=True)
# read data
data = wf.readframes(CHUNK)
# play stream (3)
while data != b'':  # readframes() returns bytes; on Python 3, comparing with '' never ends the loop
    stream.write(data)
    data = wf.readframes(CHUNK)
# stop stream (4)
stream.stop_stream()
stream.close()
# close PyAudio (5)
p.terminate()
To reproduce a similar poor quality on your laptop/PC, just make the CHUNK = 1 (the output is pretty similar on my Ubuntu)
Additional information:
What I tried:
1- Another Raspberry Pi B+.
2- Change the audio samples per buffer:
As I supposed the problem was the audio samples per buffer (the CHUNK variable in this example), I made a loop to increment CHUNK by 1 and played the audio for each increment. I could notice a slight difference for some CHUNK values, but nothing even close to the quality that I get when I play it with aplay. However, I could notice a big difference between these two files:
1- police_s.wav = 8 bits , 22000Hz, Mono , 176 kb/s -> Way better than the beat.wav played by the same CHUNK (2048)
2- beat.wav = 16bits , 44100Hz, Stereo, 1411 kb/s
When I play the same audio through the example /pyaudio/test/play_wave_callback.py, the output is almost perfect, excepting some interruptions at the end of the audio. So I saw that it doesn't set the CHUNK. It uses the frame_count parameter in the callback function, so I printed it and saw that it was 1024 ¬¬, the same default value that came with the example /pyaudio/test/play_wave.py and that results in a garbled audio.
3- pyaudio 0.2.4:
Since hayderOICO mentioned he was using pyaudio 0.2.4 and said "I'm using PyAudio fine.", I decided to give that older version a try, but I got the same result...
4- Added disable_audio_dither=1 to config.txt
I'm using: Raspberry Pi B+ Raspbian python 2.7.3 pyaudio v0.2.8 portaudio19-dev TRRS analog audio
How I installed everything:
1st try:
$ sudo apt-get update
$ sudo apt-get -y install python-dev python-numpy cython python-smbus portaudio19-dev
$ git clone http://people.csail.mit.edu/hubert/git/pyaudio.git
$ cd pyaudio
$ sudo python setup.py install
2nd try:
$ sudo apt-get install python-pyaudio
3rd try:
From GitHub: https://github.com/jleb/pyaudio
It's very frustrating having the library's example not working properly on Pi. I don't think it's a hardware limitation since the same audio plays well with aplay and other libraries like pygame and SDL2.
I am new to Raspberry Pi and audio programming, so I hope to be doing something stupid...
As I am already using a specific wrapper for pyaudio, I really would like to keep using it instead of moving to another library...
I appreciate any help, suggestions and advice.
Thanks in advance!
I had a similar problem on my Raspberry Pi (and on my Mac). In my experience the pyaudio library is a pain to work with (after 2 weeks of battling it, I ended up using pygame). What worked for me was to check the default sample rate of the audio output, make sure the file's rate is the same, and play sounds back as numpy arrays.
So on the RPi, I'd check (extrapolating from ubuntu here...) the files
/etc/pulse/daemon.conf
and
/etc/asound.conf or
~/.asoundrc
Does direct playback of numpy arrays work? If so you could do it in a roundabout way... Here is some code to test if you like
import pyaudio
import numpy as np
def gensin(frequency, duration, sampRate):
    """Frequency in Hz, duration in seconds and sampRate in Hz."""
    cycles = np.linspace(0, duration * 2 * np.pi, num=duration * sampRate)
    wave = np.sin(cycles * frequency, dtype='float32')
    t = np.divide(cycles, 2 * np.pi)
    return t, wave
frequency = 8000  # in Hz
duration = 1  # in seconds
sampRate = 44100  # in Hz
t, sinWav = gensin(frequency, duration, sampRate)
p = pyaudio.PyAudio()
# The samples are float32, so the stream format has to be paFloat32 (not paInt32)
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=sampRate,
                output=True)
stream.start_stream()
stream.write(sinWav.tobytes())  # pass raw bytes so the frame count is computed correctly
This worked on my RPi and mac, but as said before I ended up using pygame because even with on and off ramps I couldn't get rid of crackling at the beginning and end and sample rate wasn't something that could be easily changed. I would really recommend against pyaudio, but if you are set on it I wish you the best of luck! :)
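The sample-rate check mentioned above can be done from Python as well. This is a sketch using only the standard-library wave module; describe_wav and needs_resampling are hypothetical helper names, and the device rate is assumed to come from something like p.get_default_output_device_info()["defaultSampleRate"] in PyAudio.

```python
import wave


def describe_wav(path):
    """Return sample rate, channel count, sample width and bitrate of a WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        width_bytes = wf.getsampwidth()
    return {
        "rate": rate,
        "channels": channels,
        "bits": 8 * width_bytes,
        "kbit_per_s": rate * channels * 8 * width_bytes / 1000,
    }


def needs_resampling(path, device_rate):
    """True when the file's sample rate differs from the output device's rate."""
    return describe_wav(path)["rate"] != int(device_rate)
```

For the two files in the question this reproduces the quoted figures: 8 bit, 22000 Hz, mono is 176 kb/s, and 16 bit, 44100 Hz, stereo is 1411.2 kb/s.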
I'm facing the same issue on a Raspberry Pi with pyaudio. Using a larger chunk size (e.g. 40960) and passing it as frames_per_buffer to p.open makes music playback much smoother, with less popping and static noise than a smaller chunk size (e.g. 1024).
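A plausible way to read this: each buffer holds frames_per_buffer / rate seconds of audio, and the Python loop has to deliver the next chunk before that time runs out. The arithmetic can be sketched as follows; chunk_duration_ms and chunk_bytes are hypothetical helper names, and this is a back-of-the-envelope model, not a description of PortAudio's internals.

```python
def chunk_duration_ms(frames_per_buffer, sample_rate):
    """How many milliseconds of audio one buffer holds."""
    return 1000.0 * frames_per_buffer / sample_rate


def chunk_bytes(frames_per_buffer, channels, sample_width_bytes):
    """How many bytes one buffer occupies."""
    return frames_per_buffer * channels * sample_width_bytes
```

At 44100 Hz, a chunk of 1024 frames covers about 23 ms, while 40960 frames cover about 929 ms, which gives a slow CPU far more slack between writes at the cost of higher latency.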
I had the same problem. But this fixed it: As the first response says, just pass the "frames_per_buffer" argument to the p.open call:
CHUNK=4096
wf = wave.open(filename, 'rb')
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
channels=wf.getnchannels(),
rate=wf.getframerate(),
output=True, frames_per_buffer=CHUNK)
data = wf.readframes(CHUNK)
while len(data) > 0:
    print(len(data))
    stream.write(data)
    data = wf.readframes(CHUNK)
