WAV FFT: Slice indices must be integers - python

I am trying to perform some analysis on a .wav file, I have taken the code from the following question (Python Scipy FFT wav files) and it seems to give exactly what I need however when running the code I run into the following error:
TypeError: slice indices must be integers or None or have an index method
This occurs on line 9 of my code. I don't undertand why this occurs, because I thought that the abs function would make it an integer.
import matplotlib.pyplot as plt
from scipy.fftpack import fft
from scipy.io import wavfile # get the api
fs, data = wavfile.read('New Recording 2.wav') # load the data
a = data.T[0] # this is a two channel soundtrack, I get the first track
b=[(ele/2**8.)*2-1 for ele in a] # this is 8-bit track, b is now normalized on [-1,1)
c = fft(b) # calculate fourier transform (complex numbers list)
d = len(c)/2 # you only need half of the fft list (real signal symmetry)
plt.plot(abs(c[:(d-1)]),'r')
plt.show()
plt.savefig("Test.png", bbox_inches = "tight")

abs() doesn't make your number into an integer. It just turns negative numbers into positive numbers. When len(c) is an odd number your variable d is a float that ends in x.5.
What you want is probably round(d) instead of abs(d)

Related

maximum allowed dimention exceeded

I am attempting to make a painting based on the mass of the universe with pi and the gravitational constant of earth at sea level converted to binary. i've done the math and i have the right dimentions and it should only be less than a megabyte of ram but im running into maximum allowed dimention exceeded value error.
Here is the code:
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
boshi = 123456789098765432135790864234579086542098765432135321 # universal mass
genesis = boshi ** 31467 # padding
artifice = np.binary_repr(genesis) # formatting
A = int(artifice)
D = np.array(A).reshape(A, (1348, 4117))
plt.imsave('hello_world.png', D, cmap=cm.gray) # save image
I keep running into the error at D = np.array..., and maybe my reshape is too big but its only a little bigger than 4k. seems like this should be no problem for gpu enhanced colab. Doesn't run on my home machine either with the same error. Would this be fixed with more ram?
Making it Work
The problem is that artifice = np.binary_repr(genesis) creates a string. The string consists of 1348 * 4117 = 5549716 digits, all of them zeros and ones. If you convert the string to a python integer, A = int(artifice), you will (A) wait a very long time, and (B) get a non-iterable object. The array you create with np.array(A) will have a single element.
The good news is that you can bypass the time-consuming step entirely using the fact that the string artifice is already an iterable:
D = np.array(list(artifice), dtype=np.uint8).reshape(1348, 4117)
The step list(artifice) will take a couple of seconds since it has to split up the string, but everything else should be quite fast.
Plotting is easy from there with plt.imsave('hello_world.png', D, cmap=cm.gray):
Colormaps
You can easily change the color map to coolwarm or whatever you want when you save the image. Keep in mind that your image is binary, so only two of the values will actually matter:
plt.imsave('hello_world2.png', D, cmap=cm.coolwarm)
Exploration
You have an opportunity here to add plenty of color to your image. Normally, a PNG is 8-bit. For example, instead of converting genesis to bits, you can take the bytes from it to construct an image. You can also take nibbles (half-bytes) to construct an indexed image with 16 colors. With a little padding, you can even make sure that you have a multiple of three data points, and create a full color RGB image in any number of ways. I will not go into the more complex options, but I would like to explore making a simple image from the bytes.
5549716 bits is 693715 = 5 * 11 * 12613 bytes (with four leading zero bits). This is a very nasty factorization leading to an image size of 55x12613, so let's remove that upper nibble: while 693716's factorization is just as bad as 693715's, 693714 factors very nicely into 597 * 1162.
You can convert your integer to an array of bytes using its own to_bytes method:
from math import ceil
byte_genesis = genesis.to_bytes(ceil(genesis.bit_length() / 8), 'big')
The reason that I use the built-in ceil rather than np.ceil is that it return an integer rather than a float.
Converting the huge integer is very fast because the bytes object has direct access to the data of the integer: even if it makes a copy, it does virtually no processing. It may even share the buffer since both bytes and int are nominally immutable. Similarly, you can create a numpy array from the bytes as just a view to the same memory location using np.frombuffer:
img = np.frombuffer(byte_genesis, dtype=np.uint8)[1:].reshape(597, 1162)
The [1:] is necessary to chop off the leading nibble, since bytes_genesis must be large enough to hold the entirety of genesis. You could also chop off on the bytes side:
img = np.frombuffer(byte_genesis[1:], dtype=np.uint8).reshape(597, 1162)
The results are identical. Here is what the picture looks like:
plt.imsave('hello_world3.png', img, cmap=cm.viridis)
The result is too large to upload (because it's not a binary image), but here is a randomly selected sample:
I am not sure if this is aesthetically what you are looking for, but hopefully this provides you with a place to start looking at how to convert very large numbers into data buffers.
More Options, Because this is Interesting
I wanted to look at using nibbles rather than bytes here, since that would allow you to have 16 colors per pixel, and twice as many pixels. You can get an 1162x1194 image starting from
temp = np.frombuffer(byte_genesis, dtype=np.uint8)[1:]
Here is one way to unpack the nibbles:
img = np.empty((1162, 1194), dtype=np.uint8)
img.ravel()[::2] = np.bitwise_and(temp >> 4, 0x0F)
img.ravel()[1::2] = np.bitwise_and(temp, 0x0F)
With a colormap like jet, you get:
plt.imsave('hello_world4.png', img, cmap=cm.jet)
Another option, going in the opposite direction in a manner of speaking) is not to use colormaps at all. Instead, you can divide your space by a factor of three and generate your own colors in RGB space. Luckily, one of the prime factors of 693714 is 3. You can therefore have a 398x581 image (693714 == 3 * 398 * 581). How you interpret the data is even more than usual up to you.
Side Note Before I Continue
With the black-and-white binary image, you could control the color, size and orientation of the image. With 8-bit data, you could control how the bits were sampled (8 or fewer, as in the 4-bit example), the endianness of your interpretation, the color map, and the image size. With full color, you can treat each triple as a separate color, treat the entire dataset as three consecutive color planes, or even do something like apply a Bayer filter to the array. All in addition to the other options like size, ordering, number of bits per sample, etc.
The following will show the color triples and three color planes options for now.
Full Color Images
To treat each set of 3 consecutive bytes as an RGB triple, you can do something like this:
img = temp.reshape(398, 581, 3)
plt.imsave('hello_world5.png', img)
Notice that there is no colormap in this case.
Interpreting the data as three color planes requires an extra step because plt.imsave expects the last dimension to have size 3. np.rollaxis is a good tool for this:
img = np.rollaxis(temp.reshape(3, 398, 581), 0, 3)
plt.imsave('hello_world6.png', img)
I could not reproduce your problem, because the line A = int(artifice) took like forever. I replaced it with a ,for loop to cast each digit on its own. The code worked then and produced the desired image.
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
boshi = 123456789098765432135790864234579086542098765432135321
genesis = boshi ** 31467
artifice = np.binary_repr(genesis)
D = np.zeros((1348, 4117), dtype=int)
for i, val in enumerate(D):
D[i] = int(artifice[i])
plt.imsave('hello_world.png', D, cmap=cm.gray)

Handling large long values in matplotlib

I currently implement a 2D plot which shall be used to relate those two values to a visual "landscape":
x-axis: huge binary discrete values with a length of up to
3000 digits (2^3000)
y-axis: calculated value (no problem)
It seems that matplotlib can not handle such huge values.
As it represents a landscape, the values itself are not important. What is important is a visual representation of the function itself.
I tried to log-scale the values, which did not solve the problem. This is the current code:
import numpy as np
import matplotlib.pyplot as plt
'''
convert binary list to gray code to maintain hamming distance
'''
def indtogray(self, ind):
return ind[:1] + [i ^ ishift for i, ishift in zip(ind[:-1], ind[1:])]
'''
Create int from gray value
'''
def graytoint(self, gray):
i = 0
for bit in gray:
i = i*2 + bit
return i
'''
Create example list of binary lists
'''
def create(self, n, size):
return [[np.random.randint(2) for _ in range(size)] for _ in range(n)]
def showPlot(self, toolbox, neval):
individuals = self.create(100, 2000)
fitnesses = map(np.sum, individuals)
fig,ax = plt.subplots()
values = map(self.graytoint, map(self.indtogray, individuals))
full = zip(*sorted(zip(values, fitnesses)))
line = ax.plot(full[0], full[1], 'r-')
plt.show()
if __name__ == '__main__':
show()
I get the following error:
OverflowError: long int too large to convert to float
Anyone an idea?
The error just means that your number is too big and it cannot convert it to float. Things you can do is take the logarithm of x.
Now if you have up to 3000 binary digits, this means that the largest decimal number is pow(2,3000). If you take log(pow(2,3000), you should get 2079.44154~ which you should then be able convert to a float. I would double check to see if the number you have is in binary format but in decimal representation. Meaning if x[0] = 10, make sure that it is ten and not 2 in binary. Otherwise, a 2^3000 number in binary format would be very large.

Matplotlib Magnitude_spectrum Units in Python for Comparing Guitar Strings

I'm using matplotlib's magnitude_spectrum to compare the tonal characteristics of guitar strings. Magnitude_spectrum shows the y axis as having units of "Magnitude (energy)". I use two different 'processes' to compare the FFT. Process 2 (for lack of a better description) is much easier to interpret- code & graphs below
My questions are:
In terms of units, what does "Magnitude (energy)" mean and how does it relate to dB?
Using #Process 2 (see code & graphs below), what type of units am I looking at, dB?
If #Process 2 is not dB, then what is the best way to scale it to dB?
My code below (simplified) shows an example of what I'm talking about/looking at.
import numpy as np
from scipy.io.wavfile import read
from pylab import plot
from pylab import plot, psd, magnitude_spectrum
import matplotlib.pyplot as plt
#Hello Signal!!!
(fs, x) = read('C:\Desktop\Spectral Work\EB_AB_1_2.wav')
#Remove silence out of beginning of signal with threshold of 1000
def indices(a, func):
#This allows to use the lambda function for equivalent of find() in matlab
return [i for (i, val) in enumerate(a) if func(val)]
#Make the signal smaller so it uses less resources
x_tiny = x[0:100000]
#threshold is 1000, 0 is calling the first index greater than 1000
thresh = indices(x_tiny, lambda y: y > 1000)[1]
# backs signal up 20 bins, so to not ignore the initial pluck sound...
thresh_start = thresh-20
#starts at threshstart ends at end of signal (-1 is just a referencing thing)
analysis_signal = x[thresh_start-1:]
#Split signal so it is 1 second long
one_sec = 1*fs
onesec = x[thresh_start-1:one_sec+thresh_start-1]
#process 1
(spectrum, freqs, _) = magnitude_spectrum(onesec, Fs=fs)
#process 2
spectrum1 = spectrum/len(spectrum)
I don't know how to bulk process on multiple .wav files so I run this code separately on a whole bunch of different .wav files and i put them into excel to compare. But for the sake of not looking at ugly graphs, I graphed it in Python. Here's what #process1 and #process2 look like when graphed:
Process 1
Process 2
Magnetude is just the absolute value of the frequency spectrum. As you have labelled in Process 1 "Energy" is a good way to think about it.
Both Process 1 and Process 2 are in the same units. The only difference is that the values in Process 2 has been divided by the total length of the array (a scalar, hence no change of units). Normally this happens as part of the FFT, but sometimes it does not (e.g. numpy.FFT doesn't include the divide by length).
The easiest way to scale it to dB is:
(spectrum, freqs, _) = magnitude_spectrum(onesec, Fs=fs, scale='dB')
If you wanted to do this yourself then you would need to do something like:
spectrum2 = 20*numpy.log10(spectrum)
**It is worth noting that I'm not sure if you should be applying the /len(spectrum) or not. I would suggest using the scale='dB' !!
To convert to dB, take the log of any non-zero spectrum magnitudes, and scale (scale to match a calibrated mic and sound source if available, or use an arbitrarily scale to make the levels look familiar otherwise), before plotting.
For zero magnitude values, perhaps just replace or clamp the log with whatever you want to be on the bottom of your log plot (certainly not negative-infinity).

How can MATLAB function wavread() be implemented in Python?

In Matlab:
[s, fs] = wavread(file);
Since the value of each element in s is between -1 and 1, so we import scikits.audiolab using its wavread function.
from scikits.audiolab import wavread
s, fs, enc = wavread(filename)
But when I red the same input wav file, the value of s in Matlab and Python were totally different. How could I get the same output of s as in MATLAB?
p.s. The wav file is simple 16-bit mono channel file sampled in 44100Hz.
The Matlab wavread() function returns normalised data as default, i.e. it scales all the data to the range -1 to +1. If your audio file is in 16-bit format then the raw data values will be in the range -32768 to +32767, which should match the range returned by scikits.audiolab.wavread().
To normalise this data to within the range -1 to +1 all you need to do is divide the data array by the value with the maximum magnitude (using the numpy module in this case):
from scikits.audiolab import wavread
import numpy
s, fs, enc = wavread(filename) # s in range -32768 to +32767
s = numpy.array(s)
s = s / max(abs(s)) # s in range -1 to +1
Note that you can also use the 'native' option with the Matlab function to return un-normalised data values, as suggested by horchler in his comment.
[s, fs] = wavread(file, 'native');

Moving average of an array in Python

I have an array where discreet sinewave values are recorded and stored. I want to find the max and min of the waveform. Since the sinewave data is recorded voltages using a DAQ, there will be some noise, so I want to do a weighted average. Assuming self.yArray contains my sinewave values, here is my code so far:
filterarray = []
filtersize = 2
length = len(self.yArray)
for x in range (0, length-(filtersize+1)):
for y in range (0,filtersize):
summation = sum(self.yArray[x+y])
ave = summation/filtersize
filterarray.append(ave)
My issue seems to be in the second for loop, where depending on my averaging window size (filtersize), I want to sum up the values in the window to take the average of them. I receive an error saying:
summation = sum(self.yArray[x+y])
TypeError: 'float' object is not iterable
I am an EE with very little experience in programming, so any help would be greatly appreciated!
The other answers correctly describe your error, but this type of problem really calls out for using numpy. Numpy will run faster, be more memory efficient, and is more expressive and convenient for this type of problem. Here's an example:
import numpy as np
import matplotlib.pyplot as plt
# make a sine wave with noise
times = np.arange(0, 10*np.pi, .01)
noise = .1*np.random.ranf(len(times))
wfm = np.sin(times) + noise
# smoothing it with a running average in one line using a convolution
# using a convolution, you could also easily smooth with other filters
# like a Gaussian, etc.
n_ave = 20
smoothed = np.convolve(wfm, np.ones(n_ave)/n_ave, mode='same')
plt.plot(times, wfm, times, -.5+smoothed)
plt.show()
If you don't want to use numpy, it should also be noted that there's a logical error in your program that results in the TypeError. The problem is that in the line
summation = sum(self.yArray[x+y])
you're using sum within the loop where your also calculating the sum. So either you need to use sum without the loop, or loop through the array and add up all the elements, but not both (and it's doing both, ie, applying sum to the indexed array element, that leads to the error in the first place). That is, here are two solutions:
filterarray = []
filtersize = 2
length = len(self.yArray)
for x in range (0, length-(filtersize+1)):
summation = sum(self.yArray[x:x+filtersize]) # sum over section of array
ave = summation/filtersize
filterarray.append(ave)
or
filterarray = []
filtersize = 2
length = len(self.yArray)
for x in range (0, length-(filtersize+1)):
summation = 0.
for y in range (0,filtersize):
summation = self.yArray[x+y]
ave = summation/filtersize
filterarray.append(ave)
self.yArray[x+y] is returning a single item out of the self.yArray list. If you are trying to get a subset of the yArray, you can use the slice operator instead:
summation = sum(self.yArray[x:y])
to return an iterable that the sum builtin can use.
A bit more information about python slices can be found here (scroll down to the "Sequences" section): http://docs.python.org/2/reference/datamodel.html#the-standard-type-hierarchy
You could use numpy, like:
import numpy
filtersize = 2
ysums = numpy.cumsum(numpy.array(self.yArray, dtype=float))
ylags = numpy.roll(ysums, filtersize)
ylags[0:filtersize] = 0.0
moving_avg = (ysums - ylags) / filtersize
Your original code attempts to call sum on the float value stored at yArray[x+y], where x+y is evaluating to some integer representing the index of that float value.
Try:
summation = sum(self.yArray[x:y])
Indeed numpy is the way to go. One of the nice features of python is list comprehensions, allowing you to do away with the typical nested for loop constructs. Here goes an example, for your particular problem...
import numpy as np
step=2
res=[np.sum(myarr[i:i+step],dtype=np.float)/step for i in range(len(myarr)-step+1)]

Categories