Why do I have such a distortion with pygame sndarray objects?

I'm using sndarray from pygame to play with basic sound synthesis. The problem is that whatever I do, I get an awful distortion in the generated sound.
In the code I'll provide at the end of the question, you'll see a bunch of code coming from here and there. The main part comes from an MIT source I found online, which uses Numeric for the math and array handling; since I can't install Numeric for now, I decided to use NumPy instead.
At first I thought the problem came from the int format of my arrays, but if I cast the values to numpy.int16, I don't get any sound at all.
Also, I can't find anything on Google about this kind of behavior from pygame / sndarray.
Any idea?
Thanks!
Code :
global_sample_rate = 44100

def sine_array_onecycle(hz, peak):
    length = global_sample_rate / float(hz)
    omega = numpy.pi * 2 / length
    xvalues = numpy.arange(int(length)) * omega
    return peak * numpy.sin(xvalues)

def zipstereo(f):
    return numpy.array(zip(f, f))

def make_sound(arr, n_samples=global_sample_rate):
    return pygame.sndarray.make_sound(zipstereo(numpy.resize(numpy.array(arr), (n_samples,))))

def sine(hz, peak):
    snd = make_sound(sine_array_onecycle(hz, peak), global_sample_rate)
    return snd
Hope I didn't make any lame mistake; I'm pretty new to the world of Python.

Presuming you have some initialization code like
pygame.mixer.pre_init(44100, -16, 2) # 44.1kHz, 16-bit signed, stereo
sndarray expects you to be passing it 16-bit integer arrays, not float arrays.
Your "peak" value needs to make sense given the 16-bit integer representation. So, if your float array has values in the range -1.0 to +1.0, then you need to multiply by 2**15 to get it scaled appropriately.
To be clear, you may want a conversion like:
numpy.int16(float_array*(2**15))
My best guess is that you had a float array with a low peak value like 1.0, so when converting it to int16 almost everything got rounded to 0 or +/-1, which you wouldn't be able to hear. When passing the float array directly, you were probably just getting random bits (when interpreted as 16-bit integers), so it sounded like harsh noise (I stumbled through that phase on my way to getting this working).
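Putting that together, the float-to-int16 step might look like this (a sketch; the sine generator is adapted from the question's code, and I scale by 2**15 - 1 rather than 2**15 so a peak of exactly 1.0 doesn't overflow int16):

```python
import numpy

SAMPLE_RATE = 44100

def sine_array_onecycle(hz, peak=1.0):
    # One cycle of a sine wave as floats in [-peak, +peak]
    length = SAMPLE_RATE / float(hz)
    omega = numpy.pi * 2 / length
    return peak * numpy.sin(numpy.arange(int(length)) * omega)

# Scale floats in [-1.0, 1.0] into the signed 16-bit range, then cast
float_wave = sine_array_onecycle(440)
int_wave = numpy.int16(float_wave * (2**15 - 1))
```

An array like int_wave (duplicated into two columns for stereo) is what pygame.sndarray.make_sound expects with the -16, 2-channel mixer settings.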

Related

Reverse decode function of c-struct in Python

I am using this function, found on GitHub, to read some data from an HID stream in Python (the stream is read with h.read(64)):
def decode_bytes(byte_1, byte_2, byte_3, byte_4):
    bytes_reversed_and_concatenated = byte_4 * (16 ** 6) + byte_3 * (16 ** 4) + byte_2 * (16 ** 2) + byte_1
    bytes_hex = hex(bytes_reversed_and_concatenated)[2:]
    bytes_decimal = str(round(struct.unpack('!f', bytes.fromhex(bytes_hex))[0], 1))
    return bytes_decimal
The function converts four bytes (hex values passed as integers) from the stream into a Python float, which is returned as a string. I've read that a C-struct float representation takes up four bytes, so I guess that explains why the function takes four bytes as input. But apart from that, I'm pretty blank as to how and why the function works.
I have two questions:
First I would very much like to get a better understanding of how the function works. Why does it reverse the byte order and what is up with the 16 ** 6, 16 ** 4 and so on? I am having a hard time figuring out, what that does in Python.
Second I would like to reverse the function. Meaning I would like to be able to supply a float as an input and get out a list of four integer-hex-values, which I can write back via the HID-interface. But I have no idea, where to start.
I was hoping to get some pointers in the right direction. Any help is much appreciated.
So the comment from @user2357112 helped me figure everything out. The working and much simpler function now looks like this:
def decode_bytes(byte_1, byte_2, byte_3, byte_4):
    return_value = struct.unpack('<f', bytes([byte_1, byte_2, byte_3, byte_4]))
    return str(round(return_value[0], 1))
And if I want to wrap a float back up as a bytes array I do this:
struct.pack('<f', float(float_value))
Also I learned a bit about Endianness along the way. Thanks.
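As a quick sanity check (a sketch with a made-up value), the simplified decoder and the pack call round-trip:

```python
import struct

def decode_bytes(byte_1, byte_2, byte_3, byte_4):
    # Four bytes -> little-endian 32-bit float -> string, as in the answer
    return_value = struct.unpack('<f', bytes([byte_1, byte_2, byte_3, byte_4]))
    return str(round(return_value[0], 1))

def encode_float(float_value):
    # The inverse: one float -> a list of four byte values
    return list(struct.pack('<f', float(float_value)))

raw = encode_float(12.3)    # four integers, ready to write back over HID
value = decode_bytes(*raw)  # -> '12.3'
```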

pyAudio what is the in_data in callbacks?

I'm trying to get the frequencies from the array generated from pyAudio's callback().
def callback(in_data, frame_count, time_info, flag):
    audio_data = np.fromstring(in_data, dtype=np.float32)
    freq_data = np.fft.fft(audio_data)
    freq = np.abs(freq_data)
    # Operations here
    recovered_signal = np.fft.ifft(filtered_freq).astype(np.float32).tostring()
I'm getting a 2048-length array and am not sure how to proceed. I've narrowed down what operations I need and tried applying an FFT, but realized that I need to unpack the data first, and pyAudio's documentation is a little lacking (and sometimes not even online).
Part of my problem is I'm not understanding what in_data is. From what I can tell from research, it's bytes, which numpy converts into an array for me. However, reading an article on signal-processing for python gave me the impression I should be able to extract this into frequencies, and then perform this on it for a basic passband filter.
filtered_freq = []
index = 0
for f in freq:
    if index > LOWCUT and index < HIGHCUT:
        if f > 1:
            filtered_freq.append(f)
            #print(index)
        else:
            filtered_freq.append(0)
    else:
        filtered_freq.append(0)
    index += 1
I've looked at np.fft.fftfreq as well, but that also still seems to produce an array of 2048 length, instead of an array containing all the frequencies and their power.
Edit: I know that with two channels the samples are interleaved; my issue is mostly not understanding what the array converted by numpy represents and how it can be used.
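On the np.fft.fftfreq point: it does not return powers. It returns, for each of the N FFT bins, the frequency that bin corresponds to, so it is meant to be paired element-wise with np.abs(freq_data). A sketch, assuming mono float32 data at 44.1 kHz (the rate and channel count are assumptions, not from pyAudio itself):

```python
import numpy as np

RATE = 44100
N = 2048

# Synthetic stand-in for in_data: a 1 kHz sine sampled at RATE
t = np.arange(N) / RATE
audio_data = np.sin(2 * np.pi * 1000.0 * t).astype(np.float32)

freq_data = np.fft.fft(audio_data)
magnitudes = np.abs(freq_data)               # magnitude per bin
bin_freqs = np.fft.fftfreq(N, d=1.0 / RATE)  # frequency (Hz) per bin

# The strongest bin in the positive-frequency half sits near 1 kHz
peak_bin = np.argmax(magnitudes[:N // 2])
```

The frequency resolution is RATE / N (about 21.5 Hz here), which is why a 2048-point FFT of a 2048-sample chunk can never give an exact per-Hz breakdown.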

Python f.read() and Octave fread(): reading a binary file and getting the same values

I'm reading a binary file with signal samples both in Octave and Python.
The thing is, I want to obtain the same values for both codes, which is not the case.
The binary file is basically a signal in complex format I,Q, recorded as 16-bit ints.
So, based on the Octave code:
[data, cnt_data] = fread(fid, 2 * secondOfData * fs, 'int16');
and then:
data = data(1:2:end) + 1i * data(2:2:end);
It seems simple: just read the binary data as 16-bit ints, then build the final array of complex numbers.
Therefore I assumed that in Python I needed to do the following:
rel=int(f.read(2).encode("hex"),16)
img=int(f.read(2).encode("hex"),16)
in_clean.append(complex(rel,img))
Ok, the main problem I have is that both real and imaginary parts values are not the same.
For instance, in Octave, the first value is: -20390 - 10053i
While in Python (applying the code above), the value is: (23216+48088j)
As signs are different, the first thing I thought was that maybe the endianness of the computer that recorded the file and the one I'm using for reading the file are different. So I turned to unpack function, as it allows you to force the endian type.
I was not able to find an "int16" in the unpack documentation:
https://docs.python.org/2/library/struct.html
Therefore I went for the "i" option adding "x" (padding bytes) in order to meet the requirement of 32 bits from the table in the "struct" documentation.
So with:
struct.unpack("i","xx"+f.read(2))[0]
the result is (-1336248200-658802568j).
Using struct.unpack("<i","xx"+f.read(2))[0] provides the same result.
With:
struct.unpack(">i","xx"+f.read(2))[0]
The value is: (2021153456+2021178328j)
With:
struct.unpack(">i",f.read(2)+"xx")[0]
The value is: (1521514616-1143441288j)
With:
struct.unpack("<i",f.read(2)+"xx")[0]
The value is: (2021175386+2021185723j)
I also tried with numpy and "frombuffer":
np.frombuffer(f.read(1).encode("hex"),dtype=np.int16)
which provides: (24885+12386j)
So, any idea about what I'm doing wrong? I'd like to obtain the same value as in Octave.
What is the proper way of reading and interpreting the values in Python so I can obtain the same value as Octave's fread with 'int16'?
I've been searching on the Internet for an answer for this but I was not able to find a method that provides the same value
Thanks a lot
Best regards
It looks like the binary data in your question is 5ab0bbd8. To unpack signed 16 bit integers with struct.unpack, you use the 'h' format character. From that (23216+48088j) output, it appears that the data is encoded as little-endian, so we need to use < as the first item in the format string.
from struct import unpack
data = b'\x5a\xb0\xbb\xd8'
# The wrong way
rel=int(data[:2].encode("hex"),16)
img=int(data[2:].encode("hex"),16)
c = complex(rel, img)
print c
# The right way
rel, img = unpack('<hh', data)
c = complex(rel, img)
print c
output
(23216+48088j)
(-20390-10053j)
Note that rel, img = unpack('<hh', data) will also work correctly on Python 3.
FWIW, in Python 3, you could also decode 2 bytes to a signed integer like this:
def int16_bytes_to_int(b):
    n = int.from_bytes(b, 'little')
    if n > 0x7fff:
        n -= 0x10000
    return n
The rough equivalent in Python 2 is:
def int16_bytes_to_int(b):
    lo, hi = b
    n = (ord(hi) << 8) + ord(lo)
    if n > 0x7fff:
        n -= 0x10000
    return n
But having to do that subtraction to handle signed numbers is annoying, and using struct.unpack is bound to be much more efficient.
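For the original task (a whole file of interleaved int16 I/Q samples), numpy can also replicate Octave's fread in one call. A sketch using the four bytes from the question as stand-in file contents; for a real file you would use np.fromfile(f, dtype='<i2') instead of frombuffer:

```python
import numpy as np

# Stand-in for the file contents: the four bytes from the question
raw = b'\x5a\xb0\xbb\xd8'

# '<i2' = little-endian signed 16-bit, matching Octave's fread(..., 'int16')
samples = np.frombuffer(raw, dtype='<i2').astype(np.float64)

# Interleaved I,Q -> complex, like data(1:2:end) + 1i*data(2:2:end) in Octave
iq = samples[0::2] + 1j * samples[1::2]
# iq[0] is (-20390-10053j), matching the Octave result
```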

Python plot of force in Lennard-Jones system gives TypeError

I am trying to plot the force on the ith particle as a function of its distance from the jth particle (i.e. xi-xj) in a Lennard-Jones system. The force is given by

f = 48*epsilon*( sigma**12/(xi-xj)**13 - sigma**6/2/(xi-xj)**7 ) * sign(xj-xi)

where sigma and epsilon are two parameters, xi is a known quantity and xj is variable. The force points from the ith particle to the jth particle.
The code that I have written for this is given below.
from pylab import *
from numpy import *

#~~~ ARGON VALUES ~~~~~~~~(in natural units)~~~~~~~~~~~~~~~~
epsilon = 0.0122  # depth of potential well
sigma = 0.335     # dist of closest approach

xi = 0.00
xj = linspace(0.1, 1.0, 300)

f = 48.0*epsilon*( ((sigma**12.0)/((xi-xj)**13.0)) - ((sigma**6.0)/2.0/((xi-xj)**7.0)) ) * float(xj-xi)/abs(xi-xj)

plot(xj, f, label='force')
legend()
show()
But it gives me this following error.
f = 48.0*epsilon*( ((sigma**12.0)/((xi-xj)**11.0)) - ((sigma**6.0)/2.0/((xi-xj)**5.0)) ) * float(xj-xi)/abs(xi-xj)
TypeError: only length-1 arrays can be converted to Python scalars
Can someone help me solve this problem? Thanks in advance.
The error is with this part of the expression:
float(xj-xi)
Look at the answer to a related question. It appears to be a conflict between Python built-in functions and NumPy functions.
If you take out the 'float' it at least returns. Does it give the correct numbers?
f = 48.0*epsilon*( ((sigma**12.0)/((xi-xj)**11.0)) - ((sigma**6.0)/2.0/((xi-xj)**5.0)) ) * (xj-xi)/abs(xi-xj)
Instead of the term float(xj-xi)/abs(xi-xj) you should use
sign(xj-xi)
If you really want to do the division, since xi and xj are already floats you could just do:
(xj-xi)/abs(xi-xj)
More generally, if you need to convert a numpy array of ints to floats you could use either of:
1.0*(xj-xi)
(xj-xi).astype(float)
Even more generally, it's helpful in debugging to not use equations that stretch across the page because with smaller terms you can identify the location of the errors more easily. It also often runs faster. For example, here you calculate xi-xj four times, when really it only needs to be done once. And it would be easier to read:
x = xi - xj
f = 48*epsilon*(sigma**12/x**13 - sigma**6/2/x**7)
f *= sign(-x)
The TypeError is due to float(xi-xj). float() cannot convert an iterable to a single scalar value. Instead, iterate over xj and convert each value in xi-xj to float. This can be easily done with
x = [float(j - xi) for j in xj]
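Pulling the suggestions together, a runnable version of the script (with sign() in place of the float() division, and x computed once) could be:

```python
import numpy as np

# Argon parameters in natural units, from the question
epsilon = 0.0122  # depth of the potential well
sigma = 0.335     # distance of closest approach

xi = 0.00
xj = np.linspace(0.1, 1.0, 300)

x = xi - xj  # computed once instead of four times
f = 48.0 * epsilon * (sigma**12 / x**13 - sigma**6 / 2.0 / x**7)
f *= np.sign(-x)  # direction from particle i toward particle j
```

plot(xj, f, label='force') then works on the resulting array without the TypeError.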

Process RGBA data efficiently using python?

I'm trying to process an RGBA buffer (list of chars), and run "unpremultiply" on each pixel. The algorithm is color_out=color*255/alpha.
This is what I came up with:
def rgba_unpremultiply(data):
    for i in range(0, len(data), 4):
        a = ord(data[i+3])
        if a != 0:
            data[i] = chr(255*ord(data[i])/a)
            data[i+1] = chr(255*ord(data[i+1])/a)
            data[i+2] = chr(255*ord(data[i+2])/a)
    return data
It works but causes a major drawback in performance.
I'm wondering besides writing a C module, what are my options to optimize this particular function?
This is exactly the kind of code NumPy is great for.
import numpy

def rgba_unpremultiply(data):
    a = numpy.fromstring(data, 'B')  # Treat the string as an array of bytes
    a = a.astype('I')  # Cast to uints, since temporary values need to be larger than a byte
    alpha = a[3::4]  # Every 4th element, starting from index 3
    alpha = numpy.where(alpha == 0, 255, alpha)  # Don't modify colors where alpha is 0
    a[0::4] = a[0::4] * 255 // alpha  # Operates on whole slices instead of looping per element
    a[1::4] = a[1::4] * 255 // alpha
    a[2::4] = a[2::4] * 255 // alpha
    return a.astype('B').tostring()  # Cast back to bytes
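On Python 3, numpy.fromstring and tostring are deprecated in favor of frombuffer and tobytes; the same approach, plus a quick check on one made-up premultiplied pixel:

```python
import numpy as np

def rgba_unpremultiply(data):
    a = np.frombuffer(data, dtype=np.uint8).astype(np.uint32)
    alpha = a[3::4]
    alpha = np.where(alpha == 0, 255, alpha)  # avoid dividing by zero
    a[0::4] = a[0::4] * 255 // alpha
    a[1::4] = a[1::4] * 255 // alpha
    a[2::4] = a[2::4] * 255 // alpha
    return a.astype(np.uint8).tobytes()

# RGB (64, 32, 16) premultiplied by alpha 128 -> roughly doubled back up
pixel = bytes([64, 32, 16, 128])
out = rgba_unpremultiply(pixel)  # -> b'\x7f\x3f\x1f\x80'
```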
How big is data? Assuming this is Python 2.x, try using xrange instead of range so you don't allocate a large list up front.
You could convert all the data to integers for working with them so you're not constantly converting to and from characters.
Look into using numpy to vectorize this. I suspect that simply storing the data as integers and using a numpy array will greatly improve the performance.
And another relatively simple thing you could do is write a little Cython:
http://wiki.cython.org/examples/mandelbrot
Basically Cython will compile your above function into C code with just a few lines of type hints. It greatly reduces the barrier to writing a C extension.
I don't have a concrete answer, but some useful pointers might be:
Python's array module
numpy
OpenCV if you have actual image data
There are some minor things you can do, but I do not think you can improve a lot.
Anyway, here's some hint:
def rgba_unpremultiply(data):
    # xrange() is more performant than range(); it does not precompute the whole list
    for i in xrange(0, len(data), 4):
        a = ord(data[i+3])
        if a != 0:
            # Not sure about this, but maybe (c << 8) - c is faster than c*255,
            # so you could rearrange the code to do that.
            # Check for an actual performance improvement.
            data[i] = chr(((ord(data[i]) << 8) - ord(data[i]))/a)
            data[i+1] = chr(255*ord(data[i+1])/a)
            data[i+2] = chr(255*ord(data[i+2])/a)
    return data
I've just run a quick benchmark on << vs *, and there doesn't seem to be a measurable difference, but you can do a better evaluation within your project.
Anyway, a C module may be a good idea, even if the problem does not seem to be language-related.
