Handling large long values in matplotlib - python

I am implementing a 2D plot that maps the following two values onto a visual "landscape":
x-axis: huge discrete binary values with a length of up to 3000 digits (up to 2^3000)
y-axis: calculated value (no problem)
It seems that matplotlib cannot handle such huge values.
Since the plot represents a landscape, the values themselves are not important. What matters is a visual representation of the function.
I tried to log-scale the values, which did not solve the problem. This is the current code:
import numpy as np
import matplotlib.pyplot as plt
'''
convert binary list to gray code to maintain hamming distance
'''
def indtogray(self, ind):
    return ind[:1] + [i ^ ishift for i, ishift in zip(ind[:-1], ind[1:])]
'''
Create int from gray value
'''
def graytoint(self, gray):
    i = 0
    for bit in gray:
        i = i*2 + bit
    return i
'''
Create example list of binary lists
'''
def create(self, n, size):
    return [[np.random.randint(2) for _ in range(size)] for _ in range(n)]
def showPlot(self, toolbox, neval):
    individuals = self.create(100, 2000)
    fitnesses = map(np.sum, individuals)
    fig,ax = plt.subplots()
    values = map(self.graytoint, map(self.indtogray, individuals))
    full = zip(*sorted(zip(values, fitnesses)))
    line = ax.plot(full[0], full[1], 'r-')
    plt.show()
if __name__ == '__main__':
    show()
I get the following error:
OverflowError: long int too large to convert to float
Does anyone have an idea?

The error just means that your number is too large to be converted to a float. One thing you can do is take the logarithm of x.
If you have up to 3000 binary digits, the largest possible value is pow(2, 3000). Taking log(pow(2, 3000)) gives about 2079.44154, which converts to a float without trouble. I would also double-check whether your number is stored in binary format but in decimal representation: if x[0] = 10, make sure that it means ten and not two written in binary. Otherwise, a 3000-digit string of ones and zeros read as a decimal number would be far larger than 2^3000.
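As a minimal sketch of the logarithm suggestion (the sample values and the names x_values and log_x are made up for illustration), math.log accepts Python integers that are too large for float, so the conversion can happen before anything reaches matplotlib:

```python
import math

# Hypothetical sample of huge x values, similar in size to the question's data
x_values = [2**3000, 2**2999 + 12345, 2**10]

# Direct conversion fails for integers beyond the float range
try:
    float(x_values[0])
    overflowed = False
except OverflowError:
    overflowed = True

# math.log works on arbitrarily large Python ints and returns ordinary floats
log_x = [math.log(v) for v in x_values]
```

The resulting log_x list holds plain floats, so it could be passed to ax.plot in place of the raw integers. The ordering of the points is preserved because the logarithm is monotonic.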

Related

WAV FFT: Slice indices must be integers

I am trying to perform some analysis on a .wav file. I have taken the code from the following question (Python Scipy FFT wav files) and it seems to give exactly what I need; however, when running the code I run into the following error:
TypeError: slice indices must be integers or None or have an index method
This occurs on line 9 of my code. I don't understand why this occurs, because I thought that the abs function would make it an integer.
import matplotlib.pyplot as plt
from scipy.fftpack import fft
from scipy.io import wavfile # get the api
fs, data = wavfile.read('New Recording 2.wav') # load the data
a = data.T[0] # this is a two channel soundtrack, I get the first track
b=[(ele/2**8.)*2-1 for ele in a] # this is 8-bit track, b is now normalized on [-1,1)
c = fft(b) # calculate fourier transform (complex numbers list)
d = len(c)/2 # you only need half of the fft list (real signal symmetry)
plt.plot(abs(c[:(d-1)]),'r')
plt.show()
plt.savefig("Test.png", bbox_inches = "tight")
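abs() doesn't make your number into an integer. It just turns negative numbers into positive numbers. In Python 3, len(c)/2 always produces a float (one that ends in .5 when len(c) is odd), and a float cannot be used as a slice index.
What you want is probably to convert d to an integer, for example with round(d) or with integer division, len(c)//2.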
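A small sketch of the integer-index fix follows. It assumes a synthetic 440 Hz sine wave instead of the WAV file and uses np.fft.fft rather than scipy.fftpack, so the names fs, signal and peak_bin are illustrative only:

```python
import numpy as np

# Synthetic stand-in for the WAV data: one second of a 440 Hz sine wave
fs = 8000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 440 * t)

c = np.fft.fft(signal)   # complex spectrum
d = len(c) // 2          # integer division keeps d usable as a slice index

half = np.abs(c[:d - 1])           # magnitude of the first half of the spectrum
peak_bin = int(np.argmax(half))    # index of the strongest frequency bin
```

Because the signal lasts exactly one second, each bin corresponds to 1 Hz, so the peak should sit at the tone's frequency.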

maximum allowed dimension exceeded

I am attempting to make a painting based on the mass of the universe, with pi and the gravitational constant of Earth at sea level converted to binary. I've done the math and I have the right dimensions, and it should take less than a megabyte of RAM, but I'm running into a "maximum allowed dimension exceeded" ValueError.
Here is the code:
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
boshi = 123456789098765432135790864234579086542098765432135321 # universal mass
genesis = boshi ** 31467 # padding
artifice = np.binary_repr(genesis) # formatting
A = int(artifice)
D = np.array(A).reshape(A, (1348, 4117))
plt.imsave('hello_world.png', D, cmap=cm.gray) # save image
I keep running into the error at D = np.array..., and maybe my reshape is too big, but it's only a little bigger than 4K. It seems like this should be no problem for GPU-enhanced Colab. It doesn't run on my home machine either, failing with the same error. Would this be fixed with more RAM?
Making it Work
The problem is that artifice = np.binary_repr(genesis) creates a string. The string consists of 1348 * 4117 = 5549716 digits, all of them zeros and ones. If you convert the string to a python integer, A = int(artifice), you will (A) wait a very long time, and (B) get a non-iterable object. The array you create with np.array(A) will have a single element.
The good news is that you can bypass the time-consuming step entirely using the fact that the string artifice is already an iterable:
D = np.array(list(artifice), dtype=np.uint8).reshape(1348, 4117)
The step list(artifice) will take a couple of seconds since it has to split up the string, but everything else should be quite fast.
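For comparison, here is a hedged sketch of another way to turn the digit string into an array without building a Python list. It is not the method from this answer: it assumes the string contains only the ASCII characters '0' and '1', and it uses a small made-up value so the result is easy to inspect:

```python
import numpy as np

# Small stand-in for genesis so the output can be checked by eye
value = 0b1011001110001111
bits = np.binary_repr(value)   # '1011001110001111'

# View the ASCII bytes of the string and subtract the code of '0' (48),
# which maps '0' -> 0 and '1' -> 1
codes = np.frombuffer(bits.encode('ascii'), dtype=np.uint8)
D = (codes - ord('0')).reshape(4, 4)
```

The same two lines should scale to the full 1348 x 4117 image by changing only the reshape arguments.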
Plotting is easy from there with plt.imsave('hello_world.png', D, cmap=cm.gray).
Colormaps
You can easily change the color map to coolwarm or whatever you want when you save the image. Keep in mind that your image is binary, so only two of the values will actually matter:
plt.imsave('hello_world2.png', D, cmap=cm.coolwarm)
Exploration
You have an opportunity here to add plenty of color to your image. Normally, a PNG is 8-bit. For example, instead of converting genesis to bits, you can take the bytes from it to construct an image. You can also take nibbles (half-bytes) to construct an indexed image with 16 colors. With a little padding, you can even make sure that you have a multiple of three data points, and create a full color RGB image in any number of ways. I will not go into the more complex options, but I would like to explore making a simple image from the bytes.
5549716 bits is 693715 = 5 * 11 * 12613 bytes (with four leading zero bits). This is a very nasty factorization leading to an image size of 55x12613, so let's remove that upper nibble: while 693716's factorization is just as bad as 693715's, 693714 factors very nicely into 597 * 1162.
You can convert your integer to an array of bytes using its own to_bytes method:
from math import ceil
byte_genesis = genesis.to_bytes(ceil(genesis.bit_length() / 8), 'big')
The reason that I use math.ceil rather than np.ceil is that it returns an integer rather than a float.
Converting the huge integer is very fast because the bytes object has direct access to the data of the integer: even if it makes a copy, it does virtually no processing. It may even share the buffer since both bytes and int are nominally immutable. Similarly, you can create a numpy array from the bytes as just a view to the same memory location using np.frombuffer:
img = np.frombuffer(byte_genesis, dtype=np.uint8)[1:].reshape(597, 1162)
The [1:] is necessary to chop off the leading byte, which holds only the extra upper nibble, since byte_genesis must be large enough to hold the entirety of genesis. You could also chop off on the bytes side:
img = np.frombuffer(byte_genesis[1:], dtype=np.uint8).reshape(597, 1162)
The results are identical. Here is what the picture looks like:
plt.imsave('hello_world3.png', img, cmap=cm.viridis)
The result is too large to upload (because it's not a binary image), but here is a randomly selected sample:
I am not sure if this is aesthetically what you are looking for, but hopefully this provides you with a place to start looking at how to convert very large numbers into data buffers.
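The byte-based conversion in this section can be checked on a small integer. This is only a sketch: the value 0x1ABCD is made up so that, like genesis, it has a partial leading byte:

```python
from math import ceil
import numpy as np

# Made-up value with 17 significant bits, so the top byte holds only one bit
value = 0x1ABCD

n_bytes = ceil(value.bit_length() / 8)   # 3 bytes are needed
raw = value.to_bytes(n_bytes, 'big')     # b'\x01\xab\xcd'

full = np.frombuffer(raw, dtype=np.uint8)   # read-only view of the bytes
trimmed = full[1:]                          # drop the partial leading byte
img = trimmed.reshape(1, 2)                 # tiny 1 x 2 "image"
```

Note that np.frombuffer on a bytes object gives a read-only array, so any later in-place modification would need a copy first.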
More Options, Because this is Interesting
I wanted to look at using nibbles rather than bytes here, since that would allow you to have 16 colors per pixel, and twice as many pixels. You can get an 1162x1194 image starting from
temp = np.frombuffer(byte_genesis, dtype=np.uint8)[1:]
Here is one way to unpack the nibbles:
img = np.empty((1162, 1194), dtype=np.uint8)
img.ravel()[::2] = np.bitwise_and(temp >> 4, 0x0F)
img.ravel()[1::2] = np.bitwise_and(temp, 0x0F)
With a colormap like jet, you get:
plt.imsave('hello_world4.png', img, cmap=cm.jet)
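To see what the nibble unpacking does, here is a tiny sketch on two made-up bytes. Each byte yields two 4-bit values, high nibble first:

```python
import numpy as np

# Two illustrative bytes: 0xAB and 0xCD
temp = np.array([0xAB, 0xCD], dtype=np.uint8)

img = np.empty((2, 2), dtype=np.uint8)
# ravel() on a freshly allocated contiguous array returns a view,
# so these assignments write into img itself
img.ravel()[::2] = np.bitwise_and(temp >> 4, 0x0F)   # high nibbles: 0xA, 0xC
img.ravel()[1::2] = np.bitwise_and(temp, 0x0F)       # low nibbles: 0xB, 0xD
```

Every value in img lies between 0 and 15, which is why a 16-color indexed image is enough to display it.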
Another option (going in the opposite direction, in a manner of speaking) is not to use colormaps at all. Instead, you can divide your space by a factor of three and generate your own colors in RGB space. Luckily, one of the prime factors of 693714 is 3. You can therefore have a 398x581 image (693714 == 3 * 398 * 581). How you interpret the data is, even more than usual, up to you.
Side Note Before I Continue
With the black-and-white binary image, you could control the color, size and orientation of the image. With 8-bit data, you could control how the bits were sampled (8 or fewer, as in the 4-bit example), the endianness of your interpretation, the color map, and the image size. With full color, you can treat each triple as a separate color, treat the entire dataset as three consecutive color planes, or even do something like apply a Bayer filter to the array. All in addition to the other options like size, ordering, number of bits per sample, etc.
The following will show the color triples and three color planes options for now.
Full Color Images
To treat each set of 3 consecutive bytes as an RGB triple, you can do something like this:
img = temp.reshape(398, 581, 3)
plt.imsave('hello_world5.png', img)
Notice that there is no colormap in this case.
Interpreting the data as three color planes requires an extra step because plt.imsave expects the last dimension to have size 3. np.rollaxis is a good tool for this:
img = np.rollaxis(temp.reshape(3, 398, 581), 0, 3)
plt.imsave('hello_world6.png', img)
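The difference between the two interpretations is easier to see on a tiny array. The sketch below uses twelve made-up values and np.moveaxis, which for this case is equivalent to the np.rollaxis call above:

```python
import numpy as np

# Twelve illustrative bytes: 0, 1, ..., 11
temp = np.arange(12, dtype=np.uint8)

# Interpretation 1: every 3 consecutive bytes form one RGB pixel
triples = temp.reshape(2, 2, 3)

# Interpretation 2: three consecutive color planes, moved to the last axis
planes = np.moveaxis(temp.reshape(3, 2, 2), 0, -1)

# Same result as the rollaxis form used in the answer
planes_rollaxis = np.rollaxis(temp.reshape(3, 2, 2), 0, 3)
```

Both arrays have shape (2, 2, 3), but the first pixel is (0, 1, 2) in the triple layout and (0, 4, 8) in the plane layout.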
I could not reproduce your problem, because the line A = int(artifice) took forever. I replaced it with a for loop that casts each digit on its own. The code then worked and produced the desired image.
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
boshi = 123456789098765432135790864234579086542098765432135321
genesis = boshi ** 31467
artifice = np.binary_repr(genesis)
D = np.zeros(1348 * 4117, dtype=int)
for i, digit in enumerate(artifice):
    D[i] = int(digit)  # cast each binary digit on its own
D = D.reshape(1348, 4117)
plt.imsave('hello_world.png', D, cmap=cm.gray)

BeamDeflection Plot

I'm having trouble with my script not showing a plot.
The plot must show the deflection of the beam as a function of the x-coordinate of the entire beam. I don't know if I can make the statements: "x[i]>a[v]" if x is not given...
import numpy as np #Imports NumPy
import matplotlib.pyplot as plt
def beamPlot(beamLength, loadPositions, loadForces, beamSupport):
    l=beamLength #Scalar
    a=loadPositions #Vector
    W=loadForces #Vector
    x=np.array(range(0,l))
    E=200*10**9 #Constant [N/m^2]
    I=0.001 #Constant [m^4]
    #Makes an empty vector with the same size as x
    y=np.empty_like(x)
    for i in range(np.size(x)): #Continues as long as the vector x
        for v in range(np.size(a)):
            if a[v]==[ ] and W[v]==[ ]:
                return np.zeros(np.size(x))
            elif beamSupport=="both" and x[i]<a[v]:
                y[i]=np.sum(((W[v]*(l-a[v])*x[i])/(6*E*I*l))*(l**2-x[i]**2-(l-a[v])**2))
            elif beamSupport=="both" and x[i]>=a[v]:
                y[i]=np.sum(W[v]*a[v]*(l-x[i])/(6*E*I*l)*(l**2-(l-x[i])**2-a[v]**2))
            elif beamSupport=="cantilever" and x[i]<a[v]:
                y[i]=np.sum((W[v]*x[i]**2)/(6*E*I)*(3*a[v]-x[i]))
            elif beamSupport=="cantilever" and x[i]>=a[v]:
                y[i]=np.sum((W[v]*a[v]**2)/(6*E*I)*(3*x[i]-a[v]))
    deflection=y
    plt.ylim([0,10000])
    plt.xlim([0,l])
    plt.title("Beam deflection")
    plt.plot(x, deflection)
    plt.show()
Your array x is created from range(0,l), which means that its elements are of type int. You create the y array using np.empty_like(), which means that it also has elements of type int. Unless you are using huge values for the loads, the float values produced by your calculations are truncated to 0 when converted to int, so the plot is a flat line at y=0.
You can fix this by specifying that y should contain float values when it is created by adding dtype=float to:
y=np.empty_like(x, dtype=float)
You should also remove the plt.ylim([0,10000]) call and instead let matplotlib autoscale your y-axis, since the displacements are probably not going to be this large for any reasonable values of loads (given your stiffness).
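The truncation described above can be reproduced in a few lines. This is a minimal sketch with made-up deflection values (the factor 1.5e-6 is arbitrary), not the beam formula from the question:

```python
import numpy as np

x = np.array(range(0, 5))                 # integer array, as in the question

y_int = np.empty_like(x)                  # inherits the integer dtype of x
y_float = np.empty_like(x, dtype=float)   # explicitly float

for i in range(np.size(x)):
    value = 1.5e-6 * x[i]    # small, deflection-like float value
    y_int[i] = value         # silently truncated to 0
    y_float[i] = value       # stored as computed
```

Plotting y_int would give the flat line at zero, while y_float keeps the small values that autoscaling can then display.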

Adding + sign to exponent in matplotlib axes

I have a log-log plot where the range goes from 10^-3 to 10^+3. I would like values ≥10^0 to have a + sign in the exponent analogous to how values <10^0 have a - sign in the exponent. Is there an easy way to do this in matplotlib?
I looked into FuncFormatter but it seems overly complex to achieve this and also I couldn't get it to work.
You can do this with a FuncFormatter from the matplotlib.ticker module. You need a condition on whether the tick's value is greater than or less than 1. So, if log10(tick value) is >0, then add the + sign in the label string, if not, then it will get its minus sign automatically.
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
# sample data
x = y = np.logspace(-3,3)
# create a figure
fig,ax = plt.subplots(1)
# plot sample data
ax.loglog(x,y)
# this is the function the FuncFormatter will use
def mylogfmt(x,pos):
    logx = np.log10(x) # to get the exponent
    if logx < 0:
        # negative sign is added automatically
        return u"$10^{{{:.0f}}}$".format(logx)
    else:
        # we need to explicitly add the positive sign
        return u"$10^{{+{:.0f}}}$".format(logx)
# Define the formatter
formatter = ticker.FuncFormatter(mylogfmt)
# Set the major_formatter on x and/or y axes here
ax.xaxis.set_major_formatter(formatter)
ax.yaxis.set_major_formatter(formatter)
plt.show()
Some explanation of the format string:
"$10^{{+{:.0f}}}$".format(logx)
The double braces {{ and }} are turned into single literal braces { and } by format and then passed to LaTeX, to signify that everything within them should be raised as an exponent. We need double braces because single braces are used by Python to mark the replacement field, in this case {:.0f}. For more explanation of format specifications, see the docs here, but the TL;DR for your case is that we are formatting a float with a precision of 0 decimal places (i.e. printing it essentially as an integer); the exponent is a float in this case because np.log10 returns a float. (One could alternatively convert the output of np.log10 to an int and then format the string as an int; which one you use is just a matter of preference.)
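The brace escaping can be tried outside of matplotlib. The sketch below is a variation rather than the answer's code: the helper name signed_exponent_label is made up, and it relies on the standard + flag of the format mini-language, which prints a sign for both positive and negative integers:

```python
import math

def signed_exponent_label(x):
    """Return a LaTeX label such as $10^{+3}$ or $10^{-3}$ for a power of ten."""
    exponent = int(round(math.log10(x)))
    # '{{' and '}}' become literal braces; '{:+d}' always prints the sign
    return "$10^{{{:+d}}}$".format(exponent)

labels = [signed_exponent_label(v) for v in (0.001, 1, 1000)]
```

A function like this could be wrapped as ticker.FuncFormatter(lambda x, pos: signed_exponent_label(x)), since FuncFormatter calls its function with the tick value and position.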
I hope this is what you mean:
def fmt(y, pos):
    a, b = '{:.2e}'.format(y).split('e')
    b = int(b)
    if b >= 0:
        format_example = r'$10^{{+{}}}$'.format(b)
    else:
        format_example = r'$10^{{{}}}$'.format(b)
    return format_example
Then use FuncFormatter, e.g. for a colorbar: plt.colorbar(name_of_plot, ticks=list_with_tick_locations, format=ticker.FuncFormatter(fmt)). I think you have to add import matplotlib.ticker as ticker.
Regards

Moving average of an array in Python

I have an array where discrete sine wave values are recorded and stored. I want to find the max and min of the waveform. Since the sine wave data is recorded voltages using a DAQ, there will be some noise, so I want to do a weighted average. Assuming self.yArray contains my sine wave values, here is my code so far:
filterarray = []
filtersize = 2
length = len(self.yArray)
for x in range (0, length-(filtersize+1)):
    for y in range (0,filtersize):
        summation = sum(self.yArray[x+y])
    ave = summation/filtersize
    filterarray.append(ave)
My issue seems to be in the second for loop, where depending on my averaging window size (filtersize), I want to sum up the values in the window to take the average of them. I receive an error saying:
summation = sum(self.yArray[x+y])
TypeError: 'float' object is not iterable
I am an EE with very little experience in programming, so any help would be greatly appreciated!
The other answers correctly describe your error, but this type of problem really calls out for using numpy. Numpy will run faster, be more memory efficient, and is more expressive and convenient for this type of problem. Here's an example:
import numpy as np
import matplotlib.pyplot as plt
# make a sine wave with noise
times = np.arange(0, 10*np.pi, .01)
noise = .1*np.random.ranf(len(times))
wfm = np.sin(times) + noise
# smoothing it with a running average in one line using a convolution
# using a convolution, you could also easily smooth with other filters
# like a Gaussian, etc.
n_ave = 20
smoothed = np.convolve(wfm, np.ones(n_ave)/n_ave, mode='same')
plt.plot(times, wfm, times, -.5+smoothed)
plt.show()
If you don't want to use numpy, it should also be noted that there's a logical error in your program that results in the TypeError. The problem is that in the line
summation = sum(self.yArray[x+y])
you're using sum within the loop where you're also calculating the sum. So either you need to use sum without the loop, or loop through the array and add up all the elements, but not both (and it's doing both, i.e. applying sum to the indexed array element, that leads to the error in the first place). That is, here are two solutions:
filterarray = []
filtersize = 2
length = len(self.yArray)
for x in range (0, length-(filtersize+1)):
    summation = sum(self.yArray[x:x+filtersize]) # sum over section of array
    ave = summation/filtersize
    filterarray.append(ave)
or
filterarray = []
filtersize = 2
length = len(self.yArray)
for x in range (0, length-(filtersize+1)):
    summation = 0.
    for y in range (0,filtersize):
        summation += self.yArray[x+y]  # accumulate the values in the window
    ave = summation/filtersize
    filterarray.append(ave)
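The slice-based variant can be wrapped in a small standalone function for testing. This is a sketch, not the answer's exact loop: the name moving_average is made up, and the range here covers every full window (len(values) - window + 1 of them), which is slightly more than the range used in the snippets above:

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values (full windows only)."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

smoothed = moving_average([1.0, 2.0, 4.0, 8.0], 2)
```

With a window of 2, each output value is the mean of a sample and its successor, so four inputs give three averages.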
self.yArray[x+y] is returning a single item out of the self.yArray list. If you are trying to get a subset of the yArray, you can use the slice operator instead:
summation = sum(self.yArray[x:x+filtersize])
to return an iterable that the sum builtin can use.
A bit more information about python slices can be found here (scroll down to the "Sequences" section): http://docs.python.org/2/reference/datamodel.html#the-standard-type-hierarchy
You could use numpy, like:
import numpy
filtersize = 2
ysums = numpy.cumsum(numpy.array(self.yArray, dtype=float))
ylags = numpy.roll(ysums, filtersize)
ylags[0:filtersize] = 0.0
moving_avg = (ysums - ylags) / filtersize
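To check what the cumulative-sum approach produces, the sketch below runs it on a short made-up array and compares the full windows against np.convolve with mode='valid'. The names y, full_windows and reference are illustrative:

```python
import numpy as np

y = np.array([1.0, 2.0, 4.0, 8.0, 16.0])   # made-up sample data
filtersize = 2

ysums = np.cumsum(y)                  # running totals
ylags = np.roll(ysums, filtersize)    # running totals shifted by the window
ylags[0:filtersize] = 0.0             # nothing to subtract for the first entries
moving_avg = (ysums - ylags) / filtersize

# The first filtersize - 1 entries cover partial windows; the rest are full
full_windows = moving_avg[filtersize - 1:]
reference = np.convolve(y, np.ones(filtersize) / filtersize, mode="valid")
```

The first entry of moving_avg is the sum of a single sample divided by filtersize, so it is usually best to discard those leading partial-window values before looking for the maximum and minimum.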
Your original code attempts to call sum on the float value stored at yArray[x+y], where x+y is evaluating to some integer representing the index of that float value.
Try:
summation = sum(self.yArray[x:x+filtersize])
Indeed numpy is the way to go. One of the nice features of python is list comprehensions, allowing you to do away with the typical nested for loop constructs. Here goes an example, for your particular problem...
import numpy as np
step=2
res=[np.sum(myarr[i:i+step],dtype=float)/step for i in range(len(myarr)-step+1)]
