I am trying to wrap my head around how I'd go about isolating and amplifying specific sound streams in real time. I am playing with code that classifies specific sounds in a given sound environment (e.g. identifying the guitar in a clip of a song) -- that part is not difficult.
The issue is how one then selectively amplifies the target audio stream. The most common output from existing audio diarizers is a probability of the audio signal belonging to a given class. The crux appears to be using that real-time class probability to identify the stream and then amplify it.
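One simple (and admittedly crude) way to connect the two steps, assuming the classifier emits a per-frame probability aligned with your audio frames, is to treat that probability as a gain envelope. This is only a sketch; `amplify_by_probability` and the boost factor are illustrative names, not an established API:

```python
import numpy as np

def amplify_by_probability(frames, probs, boost=4.0):
    """Scale each audio frame by a gain derived from the classifier's
    per-frame probability that the target sound is present.

    frames: (n_frames, frame_len) float array of audio frames
    probs:  (n_frames,) probabilities in [0, 1] from the classifier
    """
    # Gain ramps linearly from 1.0 (target absent) up to `boost`
    # (target confidently present).
    gains = 1.0 + (boost - 1.0) * np.clip(probs, 0.0, 1.0)
    return frames * gains[:, None]

# Illustration: three frames with probabilities 0, 0.5, 1
frames = np.ones((3, 4))
out = amplify_by_probability(frames, np.array([0.0, 0.5, 1.0]))
```

In practice you'd probably want to smooth `probs` over time (or apply the gain in the time-frequency domain) to avoid pumping artifacts, but the idea is the same: the class probability modulates the gain.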
I am new to signals and SciPy, but I may need to use it to remove square-wave noise on multiple channels. I have tried a few things with FFT but nothing seems to make sense so far. I am hoping to get a few clues here that I can try.

Problem: I have a series of 6 sensors transmitting data via USB at 1 Hz per sensor (yes, very slow). Every once in a while, they capture an external motor noise along with the signal, which I am trying to remove (see attached figure; the square wave in the figure is the noise). My original idea was to buffer the incoming data for 60 seconds, use an FFT to identify the frequency common to all the sensor channels, and remove it, but that did not work. The code is basically too useless to even share here. Thank you for your input.
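For what it's worth, rather than zeroing FFT bins, a notch filter may behave better: estimate the interfering frequency from the spectrum summed across channels, then remove it with `scipy.signal.iirnotch`. A sketch, assuming batch processing of a buffered window at the 1 Hz sample rate (note that a square wave also carries odd harmonics, so you may need additional notches at those frequencies):

```python
import numpy as np
from scipy import signal

FS = 1.0  # 1 sample/sec per sensor, as in the question

def remove_common_tone(channels, fs=FS, q=10.0):
    """Find the non-DC frequency bin with the largest magnitude summed
    across all channels, then notch that frequency out of every channel.

    channels: (n_sensors, n_samples) array of buffered sensor data
    """
    channels = np.asarray(channels, dtype=float)
    spectra = np.abs(np.fft.rfft(channels, axis=1))
    freqs = np.fft.rfftfreq(channels.shape[1], d=1.0 / fs)
    peak = 1 + np.argmax(spectra[:, 1:].sum(axis=0))  # skip the DC bin
    b, a = signal.iirnotch(freqs[peak], Q=q, fs=fs)
    # filtfilt gives zero-phase filtering, which is fine for offline batches
    return signal.filtfilt(b, a, channels, axis=1)
```

The `Q` value and the peak-picking heuristic are guesses you would tune: a higher `Q` gives a narrower notch, and summing spectra across sensors exploits the fact that the motor noise is common to all channels while the signals of interest presumably are not.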
I am trying to understand how a program outputs sound at a low level. I got this question while thinking about programming a synthesizer.
Rather than playing back an unchanging file (like an mp3), a synthesizer creates the sound in real time.
Imagine I have a synthesizer program with a GUI:
A slider can change the pitch of a simple sine wave. There has to be some rate at which the program sends the pitch values to a system API, which somehow transforms the digital data into an analog signal for the speakers.
How does this whole process work?
How could I access such an API on a Unix system, e.g. with Python?
Can you recommend any resources for further reading on sound synthesis or real-time data processing?
Thanks for taking the time to read the question! :)
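On Unix systems the usual route from Python is a PortAudio wrapper such as `sounddevice` or `pyaudio`: you register a callback, and the audio API calls it whenever the sound card needs the next block of samples. A minimal sketch of the idea (the `sounddevice` lines are commented out so the generation logic runs without audio hardware; the names here are illustrative):

```python
import numpy as np
# import sounddevice as sd   # PortAudio bindings: pip install sounddevice

RATE = 44100
freq = 440.0   # in a real synth, the GUI slider would update this value
phase = 0.0    # carried between blocks so consecutive chunks join smoothly

def next_block(frames):
    """Return the next `frames` samples of a sine wave, advancing a
    persistent phase so consecutive blocks are continuous."""
    global phase
    inc = 2 * np.pi * freq / RATE                # phase step per sample
    block = np.sin(phase + inc * np.arange(frames)).astype(np.float32)
    phase = (phase + inc * frames) % (2 * np.pi)
    return block

def callback(outdata, frames, time, status):
    # The audio API invokes this whenever it needs more samples.
    outdata[:, 0] = next_block(frames)

# stream = sd.OutputStream(channels=1, samplerate=RATE, callback=callback)
# stream.start()   # audio now flows until the stream is stopped
```

The key point for the slider question: the program does not send "pitch values" to the API; it sends raw sample blocks, and the pitch change shows up because the generator reads the updated `freq` the next time the callback fires.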
I need to
read in variable data from sensors
use those data to generate audio
spit out the generated audio to individual audio output channels in real time
My trouble is with item 3.
Parts 1 and 2 have a lot in common with a guitar effects pedal, I should think: take in some variable, then adjust the audio output in real time as the input variable changes, but never stop sending a signal while doing it.
I have had no trouble using pyaudio to drive wav files to specific channels using the mapping[] parameter of pyaudio.play nor have I had trouble generating sine waves dynamically and sending them out using pyaudio.stream.play.
I'm working with 8 audio output channels. My problem is that stream.play only lets you specify a count of channels and as far as I can tell I can't say, for example, "stream generated_audio to channel 5".
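One way around this, assuming an interleaved 8-channel output stream: build an (n_samples, 8) buffer with silence everywhere except the target column, then write the interleaved bytes. This is a sketch; `route_to_channel` is a hypothetical helper, and the commented usage assumes a PyAudio stream opened with `channels=8`:

```python
import numpy as np

N_CHANNELS = 8

def route_to_channel(mono, channel, n_channels=N_CHANNELS):
    """Interleave a mono int16 signal into an n-channel frame buffer,
    with silence on every channel except `channel` (0-based)."""
    frames = np.zeros((len(mono), n_channels), dtype=np.int16)
    frames[:, channel] = mono
    # NumPy stores this row-major, so tobytes() yields interleaved frames:
    # ch0, ch1, ..., ch7, ch0, ch1, ...
    return frames.tobytes()

# e.g., with a stream opened as channels=8, send audio only to channel 5:
# stream.write(route_to_channel(generated_audio, 5))
```

Since the stream is interleaved, "which channel" is entirely a matter of which column you fill, so you can also mix different generated signals into different columns of the same buffer.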
I am trying to write a simple audio function generator in Python, to be run on a Raspberry Pi (model 2). The code essentially does this:
Generate 1 second of the audio signal (say, a sine wave, or a square wave, etc)
Play it repeatedly in a loop
For example:
import pyaudio
from numpy import linspace, sin, pi, int16

def note(freq, len, amp=1, rate=44100):
    t = linspace(0, len, len * rate)
    data = sin(2 * pi * freq * t) * amp
    return data.astype(int16)  # two-byte integers

RATE = 44100
FREQ = 261.6

pa = pyaudio.PyAudio()
s = pa.open(output=True,
            channels=2,
            rate=RATE,
            format=pyaudio.paInt16,
            output_device_index=2)

# generate 1 second of sound
tone = note(FREQ, 1, amp=10000, rate=RATE)

# play it forever
while True:
    s.write(tone)
The problem is that every iteration of the loop results in an audible "tick" in the audio, even when using an external USB sound card. Is there any way to avoid this, rather than trying to rewrite everything in C?
I tried using the pyaudio callback interface, but that actually sounded worse (like maybe my Pi was flatulent).
The generated audio needs to be short because it will ultimately be adjusted dynamically with an external control, and anything more than 1 second latency on control changes just feels awkward. Is there a better way to produce these signals from within Python code?
You're hearing a "tick" because there's a discontinuity in the audio you're sending. One second of 261.6 Hz contains 261.6 cycles, so you end up with a fraction of a cycle (about 0.6) left over at the end, and the waveform jumps when the loop restarts.
You'll need to either change the frequency so that there are a whole number of cycles per second (e.g, 262 Hz), change the duration such that it's long enough for a whole number of cycles, or generate a new audio clip every second that starts in the right phase to fit where the last chunk left off.
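The third option (carrying the phase across chunks) is the most flexible, since it keeps working when the frequency changes. A sketch; `tone_chunks` is a hypothetical helper, and the PyAudio stream `s` is assumed from the question's code:

```python
import numpy as np

RATE = 44100

def tone_chunks(freq, chunk=4410, amp=10000, rate=RATE):
    """Yield successive int16 sine-wave chunks that are phase-continuous,
    so there is no click at chunk boundaries even for non-integer cycle
    counts (e.g. 261.6 Hz)."""
    phase = 0.0
    while True:
        inc = 2 * np.pi * freq / rate                 # phase step per sample
        data = amp * np.sin(phase + inc * np.arange(chunk))
        phase = (phase + inc * chunk) % (2 * np.pi)   # carry phase over
        yield data.astype(np.int16)

# gen = tone_chunks(261.6)
# while True:
#     s.write(next(gen).tobytes())
```

Short chunks like the 0.1-second one here also cut the control latency the question worries about: a frequency change takes effect at the next chunk boundary rather than a full second later.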
I was looking for a similar question to yours, and found a variation that plays a pre-calculated length by concatenating a bunch of pre-calculated chunks.
http://milkandtang.com/blog/2013/02/16/making-noise-in-python/
Using a for loop with a 1-second pre-calculated chunk and a "play_tone" function seems to generate smooth-sounding output, but this is on a PC. If this doesn't work for you, it may be that the Raspberry Pi has a different back-end implementation that doesn't like successive writes.