PyAudio and Audition CS6 recording different sample values - python

What shall be evaluated and achieved:
I try to record audio data with a minimum of influence by hard- and especially software. After using Adobe Audition for some time I stumbled across PyAudio and was driven by curiosity as well as the possibility to refresh my Python knowledge.
As the fact displayed in the headline above may have given away I compared the sample values of two wave files (indeed sections of them) and had to find out that both programmes produce different output.
As I am definitely at my wit`s end, I do hope to find someone who could help me.
What has been done so far:
An M-Audio “M-Track Two-Channel USB Interface” has been used to record Audio Data with Audition CS6 and PyAudio simultaneously as the following steps are executed in the given order…
Audition is prepared for recording by opening “Prefrences/ Audio Hardware” and selecting the audio interface, a sample rate of 48 kHz and a latency of 250 ms (this value has been examined thoughout the last years as to be the second lowest I can get without getting the warning for lost samples – if I understood the purpose correctly I just have to worry about loosing samples cause monitoring is not an issue).
A new file with one channel, a sample rate of 48 kHz and a bit depth of 24 bit is opened.
The Python code (displayed below) is started and leads to a countdown being used to change over to Audition and start the recording 10 s before Python starts its.)
Wait until Python prints the “end of programme” message.
Stop and save the data recorded by Audition.
Now data has to be examined:
Both files (one recorded by Audition and Python respectively) are opened in Audition (Multitrack session). As Audition was started and terminated manually the two files have completely different beginning and ending times. Then they are aligned visually so that small extracts (which visually – by waveform shape – contain the same data) can be cut out and saved.
A Python programme has been written opening, reading and displaying the sample values using the default wave module and matplotlib.pyplot respectively (graphs are shown below).
Differences in both waveforms and a big question mark are revealed…
Does anybody have an idea why Audition is showing different sample values and specifically where precisely the mistake (is there any?) hides??
some (interesting) observations
a) When calling the pyaudio.PyAudio().get_default_input_device_info() method the default sample rate is listed as 44,1 kHz even though the default M-Track sample rate is said to be 48 kHz by its specifications (indeed Audition recognizes the 48 kHz by resampling incoming data if another rate was selected). Any ideas why and how to change this?
b) Aligning both files using the beginning of the sequence covered by PyAudio and checking whether they are still “in phase” at the end reveals no – PyAudio is shorter and seems to have lost samples (even though no exception was raised and the “exception on overflow” argument is “True”)
c) Using the “frames_per_buffer” keyword in the stream open method I was unable to align both files, having no idea where Python got its data from.
d) Using the “.get_default_input_device_info()” method and trying different sample rates (22,05 k, 44,1 k, 48 k, 192 k) I always receive True as an output.
Official Specifications M-Track:
bit depth = 24 bit
sample rate = 48 kHz
input via XLR
output via USB
Specifications Computer and Software:
Windows 8.1
I5-3230M # 2,6 GHz
8 GB RAM
Python 3.4.2 with PyAudio 0.2.11 – 32 bit
Audition CS6 Version 5.0.2
Python Code
import pyaudio
import wave
import time
formate = pyaudio.paInt24
channels = 1
framerate = 48000
fileName = 'test ' + '.wav'
chunk = 6144
# output of stream.get_read_available() at different positions
p = pyaudio.PyAudio()
stream = p.open(format=formate,
channels=channels,
rate=framerate,
input=True)
#frames_per_buffer=chunk) # observation c
# COUNTDOWN
for n in range(0, 30):
print(n)
time.sleep(1)
# get data
sampleList = []
for i in range(0, 79):
data = stream.read(chunk, exception_on_overflow = True)
sampleList.append(data)
print('end -', time.strftime('%d.%m.%Y %H:%M:%S', time.gmtime(time.time())))
stream.stop_stream()
stream.close()
p.terminate()
# produce file
file = wave.open(fileName, 'w')
file.setnchannels(channels)
file.setframerate(framerate)
file.setsampwidth(p.get_sample_size(formate))
file.writeframes(b''.join(sampleList))
file.close()
Figure 1: first comparison Audition – PyAudio
image 1
Figure 2: second comparison Audition - Pyaudio
image 2

Related

Python nidaqmx stream read does not change on every read

What I'm trying to do is setup 16 analog input channels, sample them constantly at a given rate and read 1 sample from each channel when calling the read function. Ideally I would like to read the newest sample so I can timestamp it when reading.
The problem is that the readings do not change from read to read, only after a few seconds. If I adjust the sampling speed, I can get to a situation where I get an error saying the software can't keep up with the hardware sampling rate.
Which part of my code is wrong?
import numpy
import nidaqmx
from nidaqmx.stream_readers import AnalogSingleChannelReader, AnalogMultiChannelReader
from nidaqmx.constants import Edge, AcquisitionType
# Create a task and a reader
task = nidaqmx.Task()
values_read = numpy.zeros(16, dtype = numpy.float64)
task.ai_channels.add_ai_current_chan('cDAQ1Mod2/ai0:15')
task.timing.cfg_samp_clk_timing(rate = 1000, active_edge = Edge.RISING, sample_mode = AcquisitionType.CONTINUOUS, samps_per_chan = 1)
reader = AnalogMultiChannelReader(task.in_stream)
task.start()
while 1:
reader.read_one_sample(values_read)
print(values_read)
The sampling rate is 1000 but you are reading only one sample each time. Usually, each Read call takes a few milliseconds. You are not reading fast enough hence the buffer overflow error.
Suggestions:
Reduce sample rate.
Read more samples per Read call.
Since you want to read only the latest data and timestamp yourself, you can use the On Demand software timed acquisition. See example ai_voltage_sw_timed.py

Inaccurate real-time audio FFT interpretation with Python

I'm trying to use Python to create a live music visualization. The libraries I'm using are SoundCard (for live audio capture) and Librosa (for short-time Fourier transform).
However I suspect I'm not interpreting the audio data correctly. Looking at the 100Hz-200Hz bin, I get a constant stream of sound even when the song doesn't contain that much bass (or really, any whatsoever). I admit I am a bit in over my head with all the audio processing FFT stuff, since it's not really my expertise and the math beats me most of the time.
This is the function that captures and analyses the audio. lb is set to the speakers and works properly. Fs is set to 48000 and I record 1000 frames in the attempt of keeping 48FPS. fftwindowsize is set to 2048*8 because... I'm not sure. I increased the number until Librosa stopped throwing warnings.
def audioanalysis():
with lb.recorder(samplerate=Fs) as mic:
rawdata = mic.record(numframes=1000)
datalen: int = int(rawdata.size/2)
monodata = numpy.empty(datalen)
for x in range(0, datalen):
monodata[x] = max(rawdata[x][0], rawdata[x][1])
data = numpy.abs(librosa.stft(monodata, n_fft=fftwindowsize, hop_length=1024))
return librosa.amplitude_to_db(data, ref=numpy.max)
And the code for making buckets:
frequencies = librosa.core.fft_frequencies(n_fft=fftwindowsize)
freq_index_ratio = len(frequencies)/frequencies[len(frequencies)-1] / 2
[...]
for i in range(0,buckets):
avg = 0
for j in range (i * bucketsize, (i+1)*bucketsize):
avg += amp(spectrogram=spectrogram, freq=j)
amps[i] = avg/bucketsize
def amp(spectrogram, freq) -> float:
return spectrogram[int(freq*freq_index_ratio)]
Over the course of a song, amps[1] (so 100Hz-200Hz) stays in the -50dB to -30dB range, which isn't really useful or representative of the song playing.
Is my FFT analysis wrong? Is there no way to better interpret short samples of sound?
P.S. I know my Python code isn't excellent. This is my first project in Python :)

Why does PyAudio cut off audio from NumPy array?

I accidentally forgot to convert some NumPy arrays to bytes objects when using PyAudio, but to my surprise it still played audio, even if it sounded a bit off. I wrote a little test script (see below) for playing 1 second of a 440Hz tone, and it seems like writing a NumPy array directly to a PyAudio Stream cuts that tone short.
Can anyone explain why this happens? I thought a NumPy array was a contiguous sequence of bytes with some header information about its dtype and strides, so I would've predicted that PyAudio played the full second of the tone after some garbled audio from the header, not cut the tone off.
# script segment
import pyaudio
import numpy as np
RATE = 48000
p = pyaudio.PyAudio()
stream = p.open(format = pyaudio.paFloat32, channels = 1, rate = RATE, output = True)
TONE = 440
SECONDS = 1
t = np.arange(0, 2*np.pi*TONE*SECONDS, 2*np.pi*TONE/RATE)
sina = np.sin(t).astype(np.float32)
sinb = sina.tobytes()
# console commands segment
stream.write(sinb) # bytes object plays 1 second of 440Hz tone
stream.write(sina) # still plays 440Hz tone, but noticeably shorter than 1 second
The problem is more subtle than you describe. Your first call is passing a bytes array of size 192,000. The second call is passing a list of float32 values with size 48,000. pyaudio handles both of them, and passes the buffer to portaudio to be played.
However, when you opened pyaudio, you told it you were sending paFloat32 data, which has 4 bytes per sample. The pyaudio write handler takes the length of the array you gave it, and divides by the number of channels times the sample size to determine how many audio samples there are. In your second call, the length of the array is 48,000, which it divides by 4, and thereby tells portaudio "there are 12,000 samples here".
So, everyone understood the format, but were confused about the size. If you change the second call to
stream.write(sina, 48000)
then no one has to guess, and it works perfectly fine.

reading 14-bit data using i2c, python, and raspberry pi

I'm trying to read data from this barometric pressure sensor on a raspberry pi using python & i2c/smbus.
The sensor's data sheet (page 10) says it will output a digital value in the range 0-16383 (2**14). So far it seems like I have to read whole bytes, so I'm not sure how to get a 14 bit value. (I had a link to the data sheet, but SO says I need more reputation before I can add more links to posts.)
This sample uses Adafruit's I2C python library, which is basically a wrapper around SMBus.
import Adafruit_I2C
import time
# sensor returns a 14-bit reading
max_output = 2**14
# per data sheet, max_output == 1.6 bar
max_bar = 1.6
# i2c address specified in data sheet
sensor = Adafruit_I2C.Adafruit_I2C(0x78)
while True:
reading = sensor.readU16(0, little_endian=False)
# reading is sometimes, but not always, greater than 2**14
# this adjustment feels pretty hacky/wrong
while reading > max_output:
reading = reading >> 1
bar = reading / float(max_output) * max_bar
print bar
time.sleep(1)
I compare these readings to the output from my handheld GPS, which includes a barometer. I sometimes get readings which are somewhat close (1030 millibar when the GPS reads 1001 millibar), but the sensor then dips drastically (down to 930 millibar) for a few readings. I have a suspicion that this is due to how I'm reading the data, but no real evidence to back that up.
At this point, I'm not sure what to try next.
Some things I've guessed at, but would appreciate some more-informed help with:
How can I read just the 14 bits that the sensor is outputting?
What endian-ness are the returned values? Assuming big-endian produced values which seemed more sane, but I may be conflating multiple problems here.
How can I tell which register to read from? This isn't mentioned in the data-sheet anywhere. I guessed that register 0 is probably the only one.
You should be masking the output of the sensor, not shifting it. e.g. reading = reading & (max_output-1) should probably do it.
The top two bits are the status bits, so if they are set sometimes they could mean things like: normal mode or stale data indicator.

Capturing audio with python on the raspberry pi fails

I am currently working on an easy-to-use audio capturing device for digitizing old casette tapes (i.e. low fidelity). This device is based on a raspberry pi with an usb sound card, which does nothinge else than starting the listed python script on bootup.
import alsaaudio
import wave
import os.path
import RPi.GPIO as GPIO
import key
import usbstick
import time
try:
# Define storage
path = '/mnt/usb/'
prefix = 'Titel '
extension = '.wav'
# Configure GPIOs
GPIO.setmode(GPIO.BOARD)
button_shutdown = key.key(7)
button_record = key.key(11)
GPIO.setup(12, GPIO.OUT)
GPIO.setup(15, GPIO.OUT)
GPIO.output(12, GPIO.HIGH)
# Start thread to detect external memory
usb = usbstick.usbstick(path, 13)
# Configure volume
m = alsaaudio.Mixer('Mic', 0, 1)
m.setvolume(100, 0, 'capture')
# Only run until shutdown button gets pressed
while not (button_shutdown.pressed()):
# Only record if record button is pressed and memory is mounted
if (button_record.pressed() and usb.ismounted()):
# Create object to read input
inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NORMAL, 'sysdefault:CARD=Device')
inp.setchannels(1)
inp.setrate(44100)
inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)
inp.setperiodsize(1024)
# Find next file name
i = 0
filename = ''
while (True):
i += 1
filename = path + prefix + str(i) + extension
if not (os.path.exists(filename)):
break
print 'Neue Aufnahme wird gespeichert unter ' + filename
# Create wave file
wavfile = wave.open(filename, 'w')
wavfile.setnchannels(1)
wavfile.setsampwidth(2)
wavfile.setframerate(44100)
# Record sound
while (button_record.pressed()):
l, data = inp.read()
wavfile.writeframes(data)
GPIO.output(15, GPIO.HIGH)
# Stop record an save
print 'Aufnahme beendet\n'
inp.close()
wavfile.close()
GPIO.output(15, GPIO.LOW)
# Record has been started but no memory is mounted
elif (button_record.pressed() and not usb.ismounted()):
print 'Massenspeichergeraet nicht gefunden'
print 'Warte auf Massenspeichergeraet'
# Restart after timeout
timestamp = time.time()
while not (usb.ismounted()):
if ((time.time() - timestamp) > 120):
time.sleep(5)
print 'Timeout.'
#reboot()
#myexit()
print 'Massenspeichergeraet gefunden'
myexit()
except KeyboardInterrupt:
myexit()
According to the documentation pyaudio, the routine inp.read() or alsaaudio.PCM.read() respectively should usually wait until a full period of 1024 samples has been captured. It then should return the number of captured samples as well as the samples itself. Most of the time it returns exactly one period of 1024 samples. I don't think that I have a performance problem, since I would expect it to return several periods then.
THE VERY MYSTERIOUS BEHAVIOR: After 01:30 of recording, inp.read() takes some milliseconds longer than normal to process (this is a useful information in my ignorant opinion) and then returns -32 and faulty data. Then the stream continues. After half a minute at 02:00 it takes about a second (i.e. longer than the first time) to process and returns -32 and faulty data again. This procedere repeats then every minute (02:30-03:00, 03:30-04:00, 04:30-05:00). This timing specification was roughly taken by hand.
-32 seems to result from the following code line in /pyalsaaudio-0.7/alsaaudio.c
return -EPIPE;
Some words about how this expresses: If the data stream is directly written into the wave file, i.e. including the faulty period, the file contains sections of white noise. These sections last 30 seconds. This is because the samples usually consist of 2 bytes. When the faulty period (1 byte) is written, the byte order gets inverted. With the next faulty period it gets inverted again and therefore is correct. If faulty data is refused and only correct data is written into the wave file, the file 'jumps' every 30 seconds.
I think the problem can either be found in
1. the sound card (but I tested 2 different)
2. the computing performance of the raspberry pi
3. the lib pyaudio
Further note: I am pretty new to the linux and python topic. If you need any logs or something, please describe how I can find them.
To cut a long story short: Could someone please help me? How can I solve this error?
EDIT: I already did this usb firmware updating stuff, which is needed, since the usb can be overwhelmed. BTW: What exactly is this EPIPE failure?
Upon further inverstigation, I found out, that this error is not python/pyaudio specific. When I try to record a wave file with arecord, I get the following output. The overrun messages are sent according to the timing described above.
pi#raspberrypi ~ $ sudo arecord -D sysdefault:CARD=Device -B buffersize=192000 -f cd -t wav /mnt/usb/test.wav
Recording WAVE '/mnt/usb/test.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo
overrun!!! (at least -1871413807.430 ms long)
overrun!!! (at least -1871413807.433 ms long)
overrun!!! (at least -1871413807.341 ms long)
overrun!!! (at least -1871413807.442 ms long)
Referring to this thread at raspberrypi.org, the problem seems to be the (partly) limited write speed to the SD card or the usb storage device with a raspberry pi. Recording to RAM (with tmpfs) or compressing the audio data (e.g. to mp3 with lame) before writing it somewhere else could be a good solution in some cases.
I can't say why the write speed is to low. In my opinion, the data stream is 192 kByte/s for 48 kSamples/s, 16 Bit, stereo. Any SD card or usb mass storage should be able to handle this. As seen above, buffering the audio stream doesn't help.

Categories