I want to play an ultrasound in the interval of 1 second infinitely and I use this code to do it. But it produce a sound when switching between ultrasound frequency and sleep. Is there any way that I can avoid this?
import pyaudio,time
import numpy
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 1 # in seconds, may be float
f = 19000.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
samples = (numpy.sin(2*numpy.pi*numpy.arange(fs*duration)*f/fs)).astype(numpy.float32)
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
channels=1,
rate=fs,
output=True)
# play. May repeat with different volume values (if done interactively)
while True:
stream.write(volume*samples)
time.sleep(0.5)
stream.stop_stream()
stream.close()
p.terminate()
Related
I'm trying to write a python script that produces a sine wave audio:
import pyaudio
import numpy as np
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 1.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
samples = (np.sin(2*np.pi*np.arange(fs*duration)*f/fs)).astype(np.float32)
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
channels=1,
rate=fs,
output=True)
# play. May repeat with different volume values (if done interactively)
stream.write(volume*samples)
stream.stop_stream()
stream.close()
p.terminate()
But when I try to execute in VSCode,I get:
SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats
I'm running Python 3.10.4
I saw a question posted on Stack Overflow about generating a sound based on a sine wav in Python. Here was the code that produced the output I was looking for:
import pyaudio
import numpy as np
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 1.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
samples = (np.sin(2*np.pi*np.arange(fs*duration)*f/fs)).astype(np.float32)
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
channels=1,
rate=fs,
output=True)
# play. May repeat with different volume values (if done interactively)
stream.write(volume*samples)
stream.stop_stream()
stream.close()
p.terminate()
What if instead of me using the line...
samples = (np.sin(2*np.pi*np.arange(fs*duration)*f/fs)).astype(np.float32)
I want to write whatever lines of code it would take, to where the samples (numpy array) would contain the values of a certain 'position' inside a specific wave file instead. Example, if I have a .wav file that has a sine wav stored inside the .wav file, and the .wav file is 1 second long, then it would 'replace' the line of code above. Is there a way to do that?
So I'm trying to generate a controllable tone with python. I'm using PyAudio and the simple code is like this (source):
import pyaudio
import numpy as np
p = pyaudio.PyAudio()
fs = 44100 # sampling rate, Hz, must be integer
duration = 1.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
samples = (np.sin(2*np.pi*np.arange(fs*duration)*f/fs)).astype(np.float32)
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
channels=1,
rate=fs,
output=True)
# play. May repeat with different volume values (if done interactively)
stream.write(samples.tobytes())
stream.stop_stream()
stream.close()
p.terminate()
the problem is when I played 2 different frequencies in sequence, it didn't play smoothly. I think it is because of the distortion between end of first frequency and beginning of second frequency. I'm trying to give a smooth transition with this code:
fstart = 440.0 # first frequency
fstop = 523.6 # second frequency
transition = 0.5 # transition time
finest = 10 # finest of transition
duration = 2.0 # duration of first and second frequency
samples = []
samples = np.asarray(samples)
# print(duration)
# print(fs*duration)
samples = np.append(samples, (np.sin(2*np.pi*np.arange(fs*duration)*fstart/fs)).astype(np.float32)).astype(np.float32)
for x in range(0, finest):
freq = fstart + (fstop-fstart)*x/finest
time = transition/finest
# print(time)
# print(fs*time)
samples = np.append(samples, (np.sin(2*np.pi*np.arange(fs*time)*freq/fs)).astype(np.float32)).astype(np.float32)
samples = np.append(samples, (np.sin(2*np.pi*np.arange(fs*duration)*fstop/fs)).astype(np.float32)).astype(np.float32)
but as expected, it also didn't go well. And when the finest is set to high, the sound was very distorted. Any idea to make this smooth like frequency sweep sound?
Thank you.
I'm trying to write naiv low pass filter using Python.
Values of the Fourier Transformant higher than a specific frequency should be equal to 0, right?
As far as I know that should to work.
But after an inverse fourier transformation what I get is just noise.
Program1 records RECORD_SECONDS from microphone and writes information about fft in fft.bin file.
Program2 reads from this file, do ifft and plays result on speakers.
In addition, I figured out, that every, even very little change in fft causes Program2 to fail.
Where do I make mistake?
Program1:
import pickle
import pyaudio
import wave
import numpy as np
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1 #1-mono, 2-stereo
RATE = 44100
RECORD_SECONDS = 2
p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)
f = open("fft.bin", "wb")
Tsamp = 1./RATE
#arguments for a fft
fft_x_arg = np.fft.rfftfreq(CHUNK/2, Tsamp)
#max freq
Fmax = 4000
print("* recording")
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
#read one chunk from mic
SigString = stream.read(CHUNK)
#convert string to int
SigInt = np.fromstring(SigString, 'int')
#calculate fft
fft_Sig = np.fft.rfft(SigInt)
"""
#apply low pass filter, maximum freq = Fmax
j=0
for value in fft_x_arg:
if value > Fmax:
fft_Sig[j] = 0
j=j+1
"""
#write one chunk of data to file
pickle.dump(fft_Sig,f)
print("* done recording")
f.close()
stream.stop_stream()
stream.close()
p.terminate()
Program2:
import pyaudio
import pickle
import numpy as np
CHUNK = 1024
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
channels=1,
rate=44100/2, #anyway, why 44100 Hz plays twice faster than normal?
output=True)
f = open("fft.bin", "rb")
#load first value from file
fft_Sig = pickle.load(f)
#calculate ifft and cast do int
SigInt = np.int16(np.fft.irfft(fft_Sig))
#convert once more - to string
SigString = np.ndarray.tostring(SigInt)
while SigString != '':
#play sound
stream.write(SigString)
fft_Sig = pickle.load(f)
SigInt = np.int16(np.fft.irfft(fft_Sig))
SigString = np.ndarray.tostring(SigInt)
f.close()
stream.stop_stream()
stream.close()
p.terminate()
FFTs operate on complex numbers. You might be able to feed them real numbers (which will get converted to complex by setting the imaginary part to 0) but their outputs will always be complex.
This is probably throwing off your sample counting by 2 among other things. It should also be trashing your output because you're not converting back to real data.
Also, you forgot to apply a 1/N scale factor to the IFFT output. And you need to keep in mind that the frequency range of an FFT is half negative, that is it's approximately the range -1/(2T) <= f < 1/(2T). BTW, 1/(2T) is known as the Nyquist frequency, and for real input data, the negative half of the FFT output will mirror the positive half (i.e. for y(f) = F{x(t)} (where F{} is the forward Fourier transform) y(f) == y(-f).
I think you need to read up a bit more on DSP algorithms using FFTs. What you're trying to do is called a brick wall filter.
Also, something that will help you a lot is matplotlib, which will help you see what the data looks like at intermediate steps. You need to look at this intermediate data to find out where things are going wrong.
I need to generate a sine wave sound in Python, and I need to be able to control frequency, duration, and relative volume. By 'generate' I mean that I want it to play though the speakers immediately, not save to a file.
What is the easiest way to do this?
Version with numpy:
import time
import numpy as np
import pyaudio
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 5.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
samples = (np.sin(2 * np.pi * np.arange(fs * duration) * f / fs)).astype(np.float32)
# per #yahweh comment explicitly convert to bytes sequence
output_bytes = (volume * samples).tobytes()
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
channels=1,
rate=fs,
output=True)
# play. May repeat with different volume values (if done interactively)
start_time = time.time()
stream.write(output_bytes)
print("Played sound for {:.2f} seconds".format(time.time() - start_time))
stream.stop_stream()
stream.close()
p.terminate()
Version without numpy:
import array
import math
import time
import pyaudio
p = pyaudio.PyAudio()
volume = 0.5 # range [0.0, 1.0]
fs = 44100 # sampling rate, Hz, must be integer
duration = 5.0 # in seconds, may be float
f = 440.0 # sine frequency, Hz, may be float
# generate samples, note conversion to float32 array
num_samples = int(fs * duration)
samples = [volume * math.sin(2 * math.pi * k * f / fs) for k in range(0, num_samples)]
# per #yahweh comment explicitly convert to bytes sequence
output_bytes = array.array('f', samples).tobytes()
# for paFloat32 sample values must be in range [-1.0, 1.0]
stream = p.open(format=pyaudio.paFloat32,
channels=1,
rate=fs,
output=True)
# play. May repeat with different volume values (if done interactively)
start_time = time.time()
stream.write(output_bytes)
print("Played sound for {:.2f} seconds".format(time.time() - start_time))
stream.stop_stream()
stream.close()
p.terminate()
ivan-onys gave an excellent answer, but there is a little addition to it:
this script will produce 4 times shorter sound than expected because Pyaudio write method needs string data of float32, but when you pass numpy array to this method, it converts whole array as entity to a string, therefore you have to convert data in numpy array to the byte sequence yourself like this:
samples = (np.sin(2*np.pi*np.arange(fs*duration)*f/fs)).astype(np.float32).tobytes()
and you have to change this line as well:
stream.write(samples)
One of the more consistent andeasy to install ways to deal with sound in Python is the Pygame multimedia libraries.
I'd recomend using it - there is the pygame.sndarray submodule that allows you to manipulate numbers in a data vector that become a high-level sound object that can be playerd in the pygame.mixer module.
The documentation in the pygame.org site should be enough for using the sndarray module.
Today for Python 3.5+ the best way is to install the packages recommended by the developer.
http://people.csail.mit.edu/hubert/pyaudio/
For Debian do
sudo apt-get install python3-all-dev portaudio19-dev
before trying to install pyaudio
The script from ivan_onys produces a signal that is four times shorter than intended. If a TypeError is returned when volume is a float, try adding .tobytes() to the following line instead.
stream.write((volume*samples).tobytes())
#mm_ float32 = 32 bits, and 8 bits = 1 byte, so float32 = 4 bytes. When samples are passed to stream.write as float32, byte count (duration) is divided by 4. Writing samples back .tobytes() corrects for quartering the sample count when writing to float32.
I the bregman lab toolbox you have a set of functions that does exactly what you want. This python module is a little bit buggy but you can adapt this code to get your own functions