Can matplotlib plot decreasing arrays? - python

I am processing some data collected in a driving simulator, and I needed to plot the velocity against the location. I managed to convert the velocity and location values into 2 numpy arrays. Due to the settings of the simulator, the location array is continuously decreasing. The sample array is [5712.114 5711.662 5711.209 ... 3185.806 3185.525 3185.243]. Similarly, the velocity array is also decreasing because we were testing the brake behavior. Example array: [27.134 27.134 27.134 ... 16.87 16.872 16.874].
So, when I plot these 2 arrays, I should see a negatively sloped line, and both the x and y axes should show decreasing numbers. I used the code below to plot them:
plotting_x = np.array(df["SubjectX"].iloc[start_index-2999:end_index+3000])
plotting_y = np.array(df["Velocity"].iloc[start_index-2999:end_index+3000])
plt.plot(plotting_x, plotting_y, "r")
What I saw is the graph attached here. Does anyone know what went wrong? Does Matplotlib not allow decreasing series? Thanks!

The problem is that by default matplotlib draws the x axis with values increasing from left to right, so it places the points following that rule. Try to reverse it by doing:
ax = plt.gca()
ax.invert_xaxis()
After the plot call.
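As a minimal sketch of the suggestion above: the arrays below are synthetic stand-ins (assumed values, not the simulator data), and the variable names are made up for illustration. It shows that Matplotlib accepts decreasing arrays as they are, and that invert_xaxis only changes the direction in which the x values are drawn.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins for the simulator columns (assumed values, not real data).
location = np.linspace(5712.0, 3185.0, 200)   # strictly decreasing
velocity = np.linspace(27.0, 17.0, 200)       # decreasing while braking

fig, ax = plt.subplots()
ax.plot(location, velocity, "r")

# By default the x limits are ascending, so the first sample sits on the right.
left_before, right_before = ax.get_xlim()

# Flip the axis so the first sample (largest location) is drawn on the left.
ax.invert_xaxis()
left_after, right_after = ax.get_xlim()
```

Calling plt.show() afterwards displays the figure; the data arrays themselves are never reordered.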

From what I understand, since both the position and the velocity are decreasing, there is nothing wrong with the plot, simply the first point is in the top right corner and the last is in the bottom left.
At a first glance, I would also say that the position is always decreasing (the vehicle never jumps back) while the velocity has a more interesting behaviour.
You can check if this is the case by plotting in two steps with two colours:
plotting_x = np.array(df["SubjectX"].iloc[start_index-2999:end_index])
plotting_y = np.array(df["Velocity"].iloc[start_index-2999:end_index])
plt.plot(plotting_x, plotting_y, "r", label="first")
and
plotting_x = np.array(df["SubjectX"].iloc[start_index:end_index+3000])
plotting_y = np.array(df["Velocity"].iloc[start_index:end_index+3000])
plt.plot(plotting_x, plotting_y, "b", label="second")
then:
plt.legend()
plt.show()
To get a more usual representation you can invert the axis or use:
plotting_x = some_number - np.array(df["SubjectX"].iloc[start_index-2999:end_index+3000])
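The claim that the position always decreases while the velocity does not can also be checked numerically. This is only a sketch: the helper name is_strictly_decreasing is made up here, and the arrays contain just the six values quoted in the question (the elided middle is left out).

```python
import numpy as np

def is_strictly_decreasing(values):
    """Return True if every element is smaller than the one before it."""
    return bool(np.all(np.diff(values) < 0))

# Only the values quoted in the question; the elided middle part is omitted.
location = np.array([5712.114, 5711.662, 5711.209, 3185.806, 3185.525, 3185.243])
velocity = np.array([27.134, 27.134, 27.134, 16.87, 16.872, 16.874])

print(is_strictly_decreasing(location))  # the position never jumps back
print(is_strictly_decreasing(velocity))  # the velocity has plateaus and small rises
```

On the real data, pass plotting_x and plotting_y instead of the short arrays.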

Related

How can I produce multiple plots on one graph where each plot has a different color? Can I set a colormap to an array of scalar variables?

I have a series of simple mass-radius relationships (so a 2d plot) that I'd like to include in one plot according to how well of a fit it is to my data. I have the radii (x), masses (y), and a separate 1d array that quantifies how well the M-R relationship fits to my data. This 1d array can be likened to error, but it isn't calculated using a standard Python function (I calculate it myself).
Ideally, my end result is a series of ~2000 mass-radius relationships on one plot, where each mass-radius relationship is color coded according to its agreement with my data. So something like this, but instead of two colors, it's on a grayscale:
Here's a snippet of what I'm trying to do but obviously isn't working, as I didn't even define a colormap:
for i in range(10):
    plt.plot(x,y,c=error[i])
plt.colorbar()
plt.show()
And again, I'd like to have each element in error correspond to a color in greyscale.
I know this is simple so I'm definitely outing myself as an amateur here, but I really appreciate any help!
EDIT: Here is the code snippet where I made the plot:
for i in range(2396):
    if eps[i]==0.:
        plt.plot(f[i,:,1],f[i,:,0],c='g',linewidth=0.1)
    else:
        plt.plot(f[i,:,1],f[i,:,0],c='r',linewidth=0.1)
plt.xlabel('Radius')
plt.ylabel('Mass')
plt.title('Neutron Star Mass-Radius Relationships')
You have one fit value for each series of points:
Here is a script to plot multiple series on a single plot, where each series (i.e. each line) is colored based on a third fit variable:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
fit = np.random.rand(25)
# On Matplotlib 3.9+ mpl.cm.get_cmap is gone; use mpl.colormaps['binary'] there
cmap = mpl.cm.get_cmap('binary')
color_gradients = cmap(fit) # this line changed! it was incorrect before
fig, (ax1,ax2) = plt.subplots(1,2, gridspec_kw={'width_ratios': [30, 1]})
for i,_ in enumerate(fit):
    # random, sorted data for each series
    x = sorted(np.random.randint(100, size=25))
    y = sorted(np.random.randint(100, size=25))
    ax1.plot(x, y, c=color_gradients[i])
cb = mpl.colorbar.ColorbarBase(ax2, cmap=cmap,
                               orientation='vertical',
                               ticks=[0,1])
Now responding to your questions from the comments:
How does fit play into the rest of the plot?
fit is an array of random decimals between 0 and 1, corresponding to the "error" values for each series:
>>>fit
array([0.76458568, 0.15017328, 0.70686393, 0.98885091, 0.18449953,
0.62506401, 0.49513702, 0.69138913, 0.96844495, 0.48937011,
0.09878352, 0.68965829, 0.13524182, 0.95419698, 0.39844843,
0.63095159, 0.95933663, 0.00693236, 0.98212815, 0.16262205,
0.26274884, 0.56880703, 0.68233984, 0.18304883, 0.66759496])
fit is used to generate the divisions of the color gradient in these lines:
cmap = mpl.cm.get_cmap('binary')
color_gradients = cmap(fit)
I'm not sure where the specific documentation for this is, but basically, passing an array of numbers between 0 and 1 to the cmap will return an array of RGBA color values, one for each number in the array passed:
>>>color_gradients
array([[0.23529412, 0.23529412, 0.23529412, 1. ],
[0.85098039, 0.85098039, 0.85098039, 1. ],
[0.29411765, 0.29411765, 0.29411765, 1. ],
[0.00784314, 0.00784314, 0.00784314, 1. ],
.
.
.
So this array can be used to assign specific colors to each line, based on their fit. And it assumes the higher numbers are better fits, and that you want better fits to be colored darker.
Note that before I had color_gradient_divisions = [(1/len(fit))*i for i in range(len(fit))], which was incorrect as it evenly divides the color map into 25 pieces, not actually returning values corresponding to the fit.
The cmap is also passed to the colorbar when constructing it. Often you can just call plt.colorbar to create one, but here matplotlib doesn't automatically know what to create a color bar for, as the lines are separate and manually colored. So instead, we create 2 axes, one for the plot and one for the colorbar (spacing them accordingly with the gridspec_kw argument), and then use mpl.colorbar.ColorbarBase to make the colorbar (I also removed a norm argument b/c I don't think it is needed).
why have you used an underscore in the for loop?
This is a pattern in Python, typically meaning "I'm not using this thing". enumerate returns an iterator of tuples with the structure (value index, value). So enumerate(fit) returns (0, 0.76458568), (1, 0.15017328), etc (based on the data shown above). I am only using the index (i) to get the corresponding position (and color) in color_gradients (ax1.plot(x, y, c=color_gradients[i])). Even though the values from fit are being returned by enumerate, I am not using them, so I instead point them to _. If I was using them within the loop, I would use a typical variable name instead.
enumerate is the encouraged way to loop over an iterable if you need to access both the count of the values and the values themselves. People tend to use for i in range(len(fit)) also to do this (which works fine!) but the further I've gone with Python the more I've seen people avoiding that.
This was a little bit of a confusing example; I set my loop to iterate over fit b/c I was conceptualizing "creating one graph for each value in fit". But I could have just looped over color_gradients (for c in color_gradients) which might be more clear.
But in your real data, something like enumerate may be helpful if you are looping over multiple aligned arrays. In my example, I just create new random data within each loop. But you will likely want to have an array of fit values, an array of color values, an array (of series) of radii, and an array (of series) of masses, such that the ith element of each array corresponds to the same star. You may be iterating over one array and want to access the same position in another (zip is used for this also).
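For the aligned-arrays case described above, a sketch using zip might look like the following. Everything here is an assumption for illustration: the arrays radii, masses and fit are random stand-ins, and the colorbar is built from a ScalarMappable with a Normalize instead of the two-axes ColorbarBase layout used earlier.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)

# Hypothetical aligned arrays: one fit value and one (radius, mass) curve per star.
n_series, n_points = 25, 25
fit = rng.random(n_series)
radii = np.sort(rng.integers(0, 100, size=(n_series, n_points)), axis=1)
masses = np.sort(rng.integers(0, 100, size=(n_series, n_points)), axis=1)

cmap = plt.get_cmap("binary")
norm = Normalize(vmin=0.0, vmax=1.0)   # maps fit values onto the 0..1 colormap range

fig, ax = plt.subplots()
# zip walks the aligned arrays together, so no index bookkeeping is needed.
for r, m, f in zip(radii, masses, fit):
    ax.plot(r, m, color=cmap(norm(f)))

# A ScalarMappable gives the colorbar something to describe.
fig.colorbar(ScalarMappable(norm=norm, cmap=cmap), ax=ax, label="fit")
```

If the real fit values do not lie between 0 and 1, setting vmin and vmax to their actual range keeps the colors and the colorbar in agreement.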
I'll leave this second answer here, even though it wasn't what OP was getting at:
You have one fit value for each point:
Here, each pair of x,y coordinates has its own fit value:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.randint(100, size=25)
y = np.random.randint(100, size=25)
fit = np.random.rand(25)
plt.scatter(x, y, c=fit, cmap='binary')
plt.colorbar()
Note that with either approach, poorly fitting points or lines may be invisible, because the 'binary' colormap draws values near 0 in (almost) white.

Trying to plot some data in matplotlib with numpy

I'm trying to simulate Conway's Game of Life in python (here is some of the code), and now I need to handle the output. Right now, I'm just plotting points in matplotlib, but I want something like what this guy did (that script shows an error on my PC but it generates the images anyway). I understand that the code I am looking for is:
plt.imshow(A, cmap='bone', interpolation='nearest')
plt.axis('off')
and that A is a numpy array, like a matrix, with just True and False as entries.
By the way, I've already realized that instead of True and False I can put 1's and 0's.
I have the data of living cells as a set of points ([(x1,y1),(x2,y2),....,(xn,yn)]) of the plane(coordinates all integers). As you can see, my script is finite(it uses a for loop until 30), so I preset the plots' axis before the loop...for example, the minimum x coordinate of the plots is the minimum coordinate of the initial points minus 30, assuring then that all the points are visible in the last image.
To represent each configuration, I had the idea to do:
SuperArray=np.zeros((maxx+30,maxy+30))  # np.zeros takes the shape as a tuple
for (i,j) in livecells:
    SuperArray[i,j]=1
But that idea won't work, because the indices of SuperArray are all positive, and my coordinates may be negative. To solve this I was thinking of translating ALL of the points in livecells so that their coordinates are positive. I would do that by adding |minx|+30 to the x coordinate and |miny|+30 to the y coordinate of each (x,y) in livecells... I haven't put it into practice yet, but it seems too complicated and memory consuming... Do you guys have any suggestions?
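The translation idea described in the question can be sketched roughly as below. The function name cells_to_grid, the margin parameter and the example pattern are all made up for illustration, and the sketch assumes livecells is a non-empty collection of integer (x, y) pairs.

```python
import numpy as np

def cells_to_grid(livecells, margin=0):
    """Build a 0/1 array from integer (x, y) points, shifting by the minimum."""
    points = np.array(sorted(livecells))        # shape (n, 2)
    min_xy = points.min(axis=0) - margin
    max_xy = points.max(axis=0) + margin
    shape = tuple(max_xy - min_xy + 1)
    grid = np.zeros(shape, dtype=np.uint8)
    shifted = points - min_xy                   # all indices are now >= 0
    grid[shifted[:, 0], shifted[:, 1]] = 1
    return grid, min_xy

# Example pattern (a glider) with some negative coordinates.
livecells = {(-2, 0), (-1, 0), (0, 0), (0, 1), (-1, 2)}
grid, offset = cells_to_grid(livecells, margin=1)
```

The resulting grid can be passed to plt.imshow(grid.T, cmap='bone', interpolation='nearest', origin='lower'). With a fixed margin such as 30, a uint8 array of this kind stays small, so memory should not be a concern for a 30-step run.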

Got an extra line on python plot

I'm using pyplot to show the FFT of the signal 'a'. Here is the code:
myFFT = numpy.fft.fft(a)
x = numpy.arange(len(a))
fig2 = plt.figure(2)
plt.plot(numpy.fft.fftfreq(x.shape[-1]), myFFT)
fig2.show()
and I get this figure.
There is a line from the beginning to the end of the signal in the frequency domain. How can I remove this line? Am I doing something wrong with pyplot?
Instead of sorted, you might want to use np.fft.fftshift to center your 0th frequency; this deals properly with odd- and even-size signals. Most importantly, you need to apply the transform to both the x and y vectors you are plotting.
plt.plot(np.fft.fftshift(np.fft.fftfreq(x.shape[-1])), np.fft.fftshift(myFFT))
You might also want to display the amplitude or phase of the FFT (np.abs or np.angle) - as-is, you are just plotting the real-part.
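A self-contained sketch of the fftshift approach, using an assumed test signal (a sine with 5 cycles over 64 samples) in place of the original 'a':

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed test signal: a 5-cycle sine sampled at 64 points.
n = 64
t = np.arange(n)
a = np.sin(2 * np.pi * 5 * t / n)

spectrum = np.fft.fft(a)
freqs = np.fft.fftfreq(n)

# Shift both arrays the same way so each frequency keeps its own coefficient.
freqs_shifted = np.fft.fftshift(freqs)
spectrum_shifted = np.fft.fftshift(spectrum)

fig, ax = plt.subplots()
ax.plot(freqs_shifted, np.abs(spectrum_shifted))   # amplitude, not just the real part
ax.set_xlabel("frequency (cycles per sample)")
ax.set_ylabel("|FFT|")
```

Because the shifted frequencies are in increasing order, the line no longer doubles back across the plot.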
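Have a look at plt.plot(numpy.fft.fftfreq(x.shape[-1])): the frequencies are not in increasing order. They run from 0 up to the highest positive frequency, then jump to the most negative one and climb back towards 0, so the last point lands next to the first, hence the graph "makes a loop".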
You can do plt.plot(sorted(numpy.fft.fftfreq(x.shape[-1])),myFFT) or plt.plot(myFFT)

Getting correct XY axes when plotting numpy array

Beginning python/numpy user here. I do an analysis of a 2D function in the XY plane. Using 2 loops through x and y I compute the function value and store it into an array for later plotting. I ran into a couple of problems.
Let's say my XY range is -10 to 10. How do I accommodate that when storing the computed value into my data array? (Only non-negative numbers are allowed as indices.) For now I just add an offset to x and y to make them positive.
From my data I know that the extreme is at x=-3 and y=2. When I plot the computed array, first of all the axes labels are wrong. I would like Y to go the mathematical way (up).
I would like the axes labels to run from -10 to 10. I tried 'extend' but that did not come out right.
Again from my data I know that the extreme is at x=-3 and y=2. In the plot when I hover the mouse over the graphics, the max value is shown at x=12 and y=7. Seems x and y have been swapped. Though when I move the mouse the displayed x and y numbers run as follows. X grows larger when moving the mouse right etc. (OK) Y runs the wrong way, grows larger when moving DOWN.
As side note it would be nice to have the function value shown in the plot window as well next to x and y.
Here is my code:
size = 10
q = np.zeros((2*size,2*size))
for xs in range(-size,+size):
    for ys in range(-size,+size):
        # shift by size so the indices are non-negative
        q[xs+size,ys+size] = my_function_of_x_and_y(xs,ys)
im = plt.imshow(q, cmap='rainbow', interpolation='none')
plt.show()
One more thing. I would like not to mess with the q array too badly as I later want to find the extreme spot in it.
idxmin = np.argmin(q)
xmin,ymin = np.unravel_index(idxmin, q.shape)
xmin= xmin-size
ymin= ymin-size
So that I get this:
>>> xmin,ymin
(-3, 2)
>>>
Here is my plot:
(source: dyndns.ws)
Here is the desired plot (made in photoshop) (axis lines would be nice):
(source: dyndns.ws)
Not too sure why setting extend did not work for you, but this is how I have implemented it (note that the keyword is spelled extent):
q = np.random.randint(-10,10, size=(20, 20))
im = plt.imshow(q, cmap='rainbow', interpolation='none',extent=[-10,10,-10,10])
plt.vlines(0,10,-10)
plt.hlines(0,10,-10)
plt.show()
Use the vlines and hlines methods to draw the centre lines.
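The swapped x/y and the downward-growing Y described in the question could be handled along these lines. This is a sketch under assumptions: f is a hypothetical stand-in for my_function_of_x_and_y with its minimum placed at (-3, 2), and the half-cell padding in extent is one possible choice so that tick labels line up with cell centres.

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x, y):
    # Hypothetical stand-in for my_function_of_x_and_y, with its minimum at (-3, 2).
    return (x + 3) ** 2 + (y - 2) ** 2

size = 10
xs = np.arange(-size, size)
ys = np.arange(-size, size)
q = np.array([[f(x, y) for y in ys] for x in xs])   # q[i, j] holds f(xs[i], ys[j])

fig, ax = plt.subplots()
# imshow draws the first index vertically, so transpose to put x on the horizontal axis;
# origin="lower" makes y grow upward; extent maps cell centres to the real coordinates.
extent = [xs[0] - 0.5, xs[-1] + 0.5, ys[0] - 0.5, ys[-1] + 0.5]
ax.imshow(q.T, cmap="rainbow", interpolation="none", origin="lower", extent=extent)
ax.axhline(0, color="k", linewidth=0.5)
ax.axvline(0, color="k", linewidth=0.5)

# The array itself is untouched, so the minimum can still be located as before.
i, j = np.unravel_index(np.argmin(q), q.shape)
xmin, ymin = xs[i], ys[j]
```

Only the view passed to imshow is transposed; q keeps its original layout for the argmin lookup.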

Boxplot on distance Data - set Box manually to values

I have a bunch of 2d points and angles. To visualise the amount of movement I wanted to use a boxplot and plot the difference to the mean of the points.
I successfully visualised the angle jitter using python and matplotlib in the following boxplot:
Now I want to do the same for my position data. After computing the euclidean distance all the data is positive, so a naive boxplot will give wrong results. For an example see the boxplot at the bottom: points that are exactly on the mean have a distance of zero and are now outliers.
So my Question is:
How can I set the bottom end of the box and the whiskers manually to zero?
If I should take another approach, like a bar chart, please tell me (I would like to use the same style though).
Edit:
It looks similar to the following plot at the moment (this is a plot of the distance the angles have from their mean).
As you can see, the boxplot doesn't cover the zero. That is correct for the data, but not for the meaning behind it! Zero is perfect (since it represents a point that was exactly in the middle of the angles) but it is not included in the boxplot.
I found out it has already been asked before in this question on SO. While not an exact duplicate, the other question contains the answer!
In matplotlib 1.4 there will probably be a faster way to do it, but for now the answer in the other thread seems to be the best way to go.
Edit:
Well, it turned out that I couldn't use their approach since I have plt.boxplot(data, patch_artist=True) to get all the other fancy stuff.
So I had to resort to the following ugly final solution:
N = 12  # number of my plots
upperBoxPoints = []
for d in data:
    upperBoxPoints.append(np.percentile(d, 75))
w = 0.5  # I had to tune the width by hand
ind = range(0, N)  # compute the correct placement from number and width
ind = [x + 0.5 + (w/2) for x in ind]
for i in range(N):
    rect = ax.bar(ind[i], menMeans[i], w, color=color[i], edgecolor='gray', linewidth=2, zorder=10)
    # ind[i]            position
    # menMeans[i]       height of box (presumably the upperBoxPoints[i] computed above)
    # w                 width
    # color=color[i]    as you can see I have a complex color scheme; use hex strings like '#AAAAAA' for colors, html names didn't work for me
    # edgecolor='gray'  just like the other one
    # linewidth=2       ditto
    # zorder=10         IMPORTANT: you have to use at least 2 to draw it over the other stuff (but not too high, or it ends up over your horizontal orientation lines)
And the final result:
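A self-contained sketch of the same idea, with randomly generated stand-in data (the real distance data is not available here) and a median marker added as an assumption about what the final plot should show:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Hypothetical distance data: 12 groups of non-negative values.
data = [np.abs(rng.normal(0.0, 1.0 + 0.1 * k, size=200)) for k in range(12)]

upper = [np.percentile(d, 75) for d in data]   # top of each "box"
medians = [np.median(d) for d in data]
positions = np.arange(1, len(data) + 1)
width = 0.5

fig, ax = plt.subplots()
# Bars start at zero by default, so each box always includes the ideal value 0.
bars = ax.bar(positions, upper, width, color="#AAAAAA",
              edgecolor="gray", linewidth=2, zorder=2)
# Mark the median inside each bar, as a boxplot would.
ax.hlines(medians, positions - width / 2, positions + width / 2,
          colors="k", zorder=3)
```

Here the bars are drawn on their own; overlaying them on an existing plt.boxplot figure would need the positions and zorder matched to that plot.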
