Boxplot on distance Data - set Box manually to values - python

I have a bunch of 2D points and angles. To visualise the amount of movement I wanted to use a boxplot and plot the difference to the mean of the points.
I successfully visualised the angle jitter using Python and matplotlib in the following boxplot:
Now I want to do the same for my position data. After computing the Euclidean distance all the data is non-negative, so a naive boxplot will give misleading results. For an example see the boxplot at the bottom: points that are exactly on the mean have a distance of zero and now show up as outliers.
So my question is:
How can I manually set the bottom end of the box and the whiskers to zero?
If I should take another approach, like a bar chart, please tell me (I would like to use the same style though).
Edit:
It looks similar to the following plot at the moment (this is a plot of the distance the angles have from their mean).
As you can see, the boxplot doesn't cover zero. That is correct for the data, but not for the meaning behind it! Zero is perfect (since it represents a point that was exactly in the middle of the angles), but it is not included in the boxplot.

I found out this has already been asked before in this question on SO. While not an exact duplicate, the other question contains the answer!
In matplotlib 1.4 there will probably be a faster way to do it, but for now the answer in the other thread seems to be the best way to go.
Edit:
Well, it turned out that I couldn't use their approach since I call plt.boxplot(data, patch_artist=True) to get all the other fancy stuff.
So I had to resort to the following ugly final solution:
N = 12  # number of my plots
# upper edge of each box: the 75th percentile of each data set
upperBoxPoints = []
for d in data:
    upperBoxPoints.append(np.percentile(d, 75))
w = 0.5  # I had to tune the width by hand
ind = range(0, N)  # compute the correct placement from number and width
ind = [x + 0.5 + (w / 2) for x in ind]  # left edge of each bar
for i in range(N):
    rect = ax.bar(ind[i], upperBoxPoints[i], w, align='edge', color=color[i], edgecolor='gray', linewidth=2, zorder=10)
# ind[i]             position (left edge of the bar)
# upperBoxPoints[i]  height of the box
# w                  width
# align='edge'       ind[i] is the left edge; newer matplotlib versions centre bars by default
# color=color[i]     as you can see I have a complex color scheme; use hex codes like '#AAAAAA' for colors, html names won't work
# edgecolor='gray'   just like the other one
# linewidth=2        ditto
# zorder=10          IMPORTANT: you have to use at least 2 to draw it over the other stuff (but not too high, or it ends up over your horizontal orientation lines)
And the final result:
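For readers who want to try this idea end to end, here is a minimal, self-contained sketch of the same workaround. The random points, the four data sets and the single grey colour are made up for illustration; only the idea (a bar from zero up to the 75th percentile, drawn at each boxplot position) comes from the solution above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Made-up stand-in data: four sets of 2D points with increasing jitter
point_sets = [rng.normal(0.0, s, size=(200, 2)) for s in (0.5, 1.0, 1.5, 2.0)]
# Euclidean distance of every point to the mean of its own set (never negative)
data = [np.linalg.norm(p - p.mean(axis=0), axis=1) for p in point_sets]

fig, ax = plt.subplots()
ax.boxplot(data, patch_artist=True)  # boxes are placed at x = 1, 2, ..., len(data)

w = 0.5  # bar width, tuned by eye to match the box width
q3 = [np.percentile(d, 75) for d in data]  # upper edge of each box
positions = np.arange(1, len(data) + 1)
# One bar per box, reaching from 0 up to the 75th percentile
bars = ax.bar(positions, q3, w, align="center", color="#AAAAAA",
              edgecolor="gray", linewidth=2, zorder=10)
ax.set_ylim(bottom=0)
```

Call plt.show() or fig.savefig(...) to look at the result. With a zorder this high the bar is also drawn over the median line, so that value probably needs tuning for your own plot.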

Related

Can matplotlib plot decreasing arrays?

I am processing some data collected in a driving simulator, and I needed to plot the velocity against the location. I managed to convert the velocity and location values into 2 numpy arrays. Due to the settings of the simulator, the location array is continuously decreasing. The sample array is [5712.114 5711.662 5711.209 ... 3185.806 3185.525 3185.243]. Similarly, the velocity array is also decreasing because we were testing the brake behavior. Example array: [27.134 27.134 27.134 ... 16.87 16.872 16.874].
So, when I plot these 2 arrays, what I should see should be a negatively sloped line, and both x and y axis should have decreasing numbers. I used the code below to plot them:
plotting_x = np.array(df["SubjectX"].iloc[start_index-2999:end_index+3000])
plotting_y = np.array(df["Velocity"].iloc[start_index-2999:end_index+3000])
plt.plot(plotting_x, plotting_y, "r")
What I saw is the graph attached here. Does anyone know what went wrong? Does Matplotlib not allow decreasing series? Thanks!
The problem is that by default matplotlib always draws the x axis increasing from left to right, so it will map the points following that rule. Try to reverse it by doing:
ax = plt.gca()
ax.invert_xaxis()
After the plot call.
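As a quick check of this suggestion, here is a small sketch with made-up arrays standing in for the simulator data (the numbers only mimic the ranges quoted in the question). After invert_xaxis() the x limits are stored in decreasing order, so the first sample ends up on the left.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-in for the simulator data: position and velocity both decrease
location = np.linspace(5712.0, 3185.0, 500)
velocity = np.linspace(27.1, 16.9, 500)

fig, ax = plt.subplots()
ax.plot(location, velocity, "r")
ax.invert_xaxis()  # x values now decrease from left to right

x_left, x_right = ax.get_xlim()
```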
From what I understand, since both the position and the velocity are decreasing, there is nothing wrong with the plot, simply the first point is in the top right corner and the last is in the bottom left.
At a first glance, I would also say that the position is always decreasing (the vehicle never jumps back) while the velocity has a more interesting behaviour.
You can check if this is the case plotting in two steps with two colours:
plotting_x = np.array(df["SubjectX"].iloc[start_index-2999:end_index])
plotting_y = np.array(df["Velocity"].iloc[start_index-2999:end_index])
plt.plot(plotting_x, plotting_y, "r", label="first")
and
plotting_x = np.array(df["SubjectX"].iloc[start_index:end_index+3000])
plotting_y = np.array(df["Velocity"].iloc[start_index:end_index+3000])
plt.plot(plotting_x, plotting_y, "b", label="second")
then:
plt.legend()
plt.show()
To get a more usual representation you can invert the axis or use:
plotting_x = some_number - np.array(df["SubjectX"].iloc[start_index-2999:end_index+3000])
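A self-contained sketch of the two-colour check and the shifted x axis might look like the following. The arrays, the split index and the choice of the first position as some_number are all assumptions made for illustration; in the question the values come from df["SubjectX"] and df["Velocity"].

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up stand-in arrays: constant speed first, then braking
location = np.linspace(5712.0, 3185.0, 600)
velocity = np.concatenate([np.full(300, 27.134), np.linspace(27.134, 16.87, 300)])
split = 300  # hypothetical index playing the role of start_index / end_index

# Assumption: use the first position as some_number, so x becomes distance travelled
travelled = location[0] - location

fig, ax = plt.subplots()
ax.plot(travelled[:split], velocity[:split], "r", label="first")
ax.plot(travelled[split:], velocity[split:], "b", label="second")
ax.legend()
```

With this choice the x values increase with time, so the plot reads left to right in the usual way without inverting the axis.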

Is there a way I can align the histogram with the function plot in this graph?

Here's a plot I currently have: (using Python)
The darkorange curve is my function, generated from
plt.plot(x,Yt,color = 'darkorange')
while the histogram comes from
plt.bar(dic.keys(), dic.values(), width=np.abs((rang2-rang1)/N), color='lightcoral')
From this graph we can see they are not quite aligned at the bottom (where both of them should be 0). Is there a way I can make them aligned? Thanks!!
You might need to play around with the offset number below:
offset = 0.01
Yt = [y-offset for y in Yt]
plt.plot(x,Yt,color = 'darkorange')
Note that if you only want to offset outside the peak range of the function (the spiky part in the middle), you would need a non-constant offset.
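Instead of tuning the offset by hand, one option is to estimate it from the curve itself. The sketch below assumes that the flat tails of the curve are supposed to sit at 0, so the minimum of the curve is taken as the offset; the curve and the bars are made-up data, not the ones from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up curve: a narrow peak sitting on a small constant baseline of 0.01
x = np.linspace(-5.0, 5.0, 401)
Yt = 0.01 + np.exp(-x**2 / 0.1)

# Made-up histogram bars that do go down to 0 away from the peak
centres = np.linspace(-5.0, 5.0, 41)
heights = np.exp(-centres**2 / 0.1)

# Assumption: the curve's flat tails should sit at 0, so its minimum is the offset
offset = Yt.min()
Yt_shifted = Yt - offset

fig, ax = plt.subplots()
ax.bar(centres, heights, width=centres[1] - centres[0], color="lightcoral")
ax.plot(x, Yt_shifted, color="darkorange")
```

Using a NumPy array for Yt makes the subtraction a one-liner; the list comprehension from the answer does the same thing for a plain list.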

matplotlib: get axis ratio of plot

I need to produce scatter plots for several 2D data sets automatically.
By default the aspect ratio is set with ax.set_aspect(aspect='equal'), which works most of the time because the x,y values are distributed more or less in a square region.
Sometimes though, I encounter a data set that, when plotted with the equal ratio, looks like this:
i.e., too narrow along one axis. For the above image, the axes are approximately 1:8.
In such a case, an aspect ratio of ax.set_aspect(aspect='auto') would result in a much better plot:
Now, I don't want to set aspect='auto' as my default for all data sets because using aspect='equal' is actually the correct way of displaying such a scatter plot.
I need to fall back to using ax.set_aspect(aspect='auto') only for cases such as the one above.
The question: is there a way to know beforehand whether the aspect ratio of a plot will be too narrow if aspect='equal' is used? For example, by getting the actual aspect ratio of the plotted data set.
This way, based on such a number, I can adjust the aspect ratio to something more sane looking (i.e.: auto or some other aspect ratio) instead of 'equal'.
Something like this ought to do,
aspect = (max(x) - min(x)) / (max(y) - min(y))
The axes method get_data_ratio gives the aspect ratio of the bounds of your data as displayed.¹
ax.get_data_ratio()
for example:
M = 4.0
ax.set_aspect('equal' if 1/M < ax.get_data_ratio() < M else 'auto')
¹This is the reciprocal of @farenorth's answer when the axes are zoomed right around the data, i.e., when max(y) == max(ax.get_ylim()), since it is calculated using the ranges in ax.get_ybound and ax.get_xbound.
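Wrapped into a small helper, the get_data_ratio check could be used like this. The function name choose_aspect, the max_ratio parameter and the two test data sets are hypothetical; the threshold of 4 is the M from the example above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

def choose_aspect(ax, max_ratio=4.0):
    """Return 'equal' unless the data box is more elongated than max_ratio."""
    ratio = ax.get_data_ratio()  # y extent divided by x extent of the current view
    return "equal" if 1.0 / max_ratio < ratio < max_ratio else "auto"

# Made-up data sets: one roughly square, one about 1:8
x = np.linspace(0.0, 1.0, 50)
fig, (ax_square, ax_narrow) = plt.subplots(1, 2)
ax_square.scatter(x, x[::-1])
ax_narrow.scatter(x, 8.0 * x)

# Decide after plotting, because the ratio is taken from the current axis bounds
aspect_square = choose_aspect(ax_square)
aspect_narrow = choose_aspect(ax_narrow)
ax_square.set_aspect(aspect_square)
ax_narrow.set_aspect(aspect_narrow)
```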

Getting correct XY axes when plotting numpy array

Beginning python/numpy user here. I do an analysis of a 2D function in the XY plane. Using 2 loops through x and y I compute the function value and store it into an array for later plotting. I ran into a couple of problems.
Let's say my XY range is -10 to 10. How do I accommodate that when storing the computed values in my data array? (Only non-negative numbers are allowed as indices.) For now I just add size to x and y to make them non-negative.
From my data I know that the extreme is at x=-3 and y=2. When I plot the computed array, first of all the axes labels are wrong. I would like Y to go the mathematical way (up).
I would like the axes labels to run from -10 to 10. I tried 'extend' but that did not come out right.
Again, from my data I know that the extreme is at x=-3 and y=2. In the plot, when I hover the mouse over the graphics, the max value is shown at x=12 and y=7. It seems x and y have been swapped. When I move the mouse, the displayed x and y numbers run as follows: X grows larger when moving the mouse right (OK), but Y runs the wrong way and grows larger when moving DOWN.
As a side note, it would be nice to have the function value shown in the plot window as well, next to x and y.
Here is my code:
size = 10
q = np.zeros((2*size, 2*size))
for xs in range(-size, +size):
    for ys in range(-size, +size):
        # shift by size so that the indices are non-negative
        q[xs+size, ys+size] = my_function_of_x_and_y(xs, ys)
im = plt.imshow(q, cmap='rainbow', interpolation='none')
plt.show()
One more thing. I would like not to mess with the q array too badly as I later want to find the extreme spot in it.
idxmin = np.argmin(q)
xmin,ymin = np.unravel_index(idxmin, q.shape)
xmin= xmin-size
ymin= ymin-size
So that I get this:
>>> xmin,ymin
(-3, 2)
>>>
Here is my plot:
(source: dyndns.ws)
Here is the desired plot (made in Photoshop; axis lines would be nice):
(source: dyndns.ws)
Not too sure why setting extent did not work for you, but this is how I have implemented it:
q = np.random.randint(-10,10, size=(20, 20))
im = plt.imshow(q, cmap='rainbow', interpolation='none',extent=[-10,10,-10,10])
plt.vlines(0,10,-10)
plt.hlines(0,10,-10)
plt.show()
Use the vlines and hlines methods to draw the centre lines.
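The swapped axes and the downward-running Y from the question can be handled in the imshow call as well. The sketch below is one possible way, not the only one: it keeps the q[x index, y index] layout from the question, transposes only for display, and uses origin='lower' so that y grows upwards. The test function is made up so that its minimum sits at x=-3, y=2, and the half-pixel shift in extent is an assumption that each pixel should be centred on its integer coordinate.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

def my_function_of_x_and_y(x, y):
    # Made-up test function with its minimum at x = -3, y = 2
    return (x + 3) ** 2 + (y - 2) ** 2

size = 10
coords = np.arange(-size, size)  # -10, -9, ..., 9
X, Y = np.meshgrid(coords, coords, indexing="ij")
q = my_function_of_x_and_y(X, Y)  # q[ix, iy], same layout as in the question

fig, ax = plt.subplots()
# imshow treats the first index as the row (vertical axis), so transpose to put
# x on the horizontal axis; origin='lower' makes y grow upwards.
extent = [-size - 0.5, size - 0.5, -size - 0.5, size - 0.5]
im = ax.imshow(q.T, cmap="rainbow", interpolation="none",
               origin="lower", extent=extent)
ax.axvline(0, color="k")
ax.axhline(0, color="k")

# q itself is untouched, so the minimum can still be looked up as before
ix, iy = np.unravel_index(np.argmin(q), q.shape)
xmin, ymin = int(coords[ix]), int(coords[iy])
```

Because only the displayed copy is transposed, the argmin lookup from the question keeps working on q unchanged.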

matplotlib radar plot min values

I started with the matplotlib radar example but values below some min values disappear.
I have a gist here.
The result looks like
As you can see in the gist, the values for D and E in series A are both 3 but they don't show up at all.
There is some scaling going on.
In order to find out what the problem is I started with the original values and removed one by one.
When I removed one whole series then the scale would shrink.
Here is an example (removing Factor 5): the scale in the [0, 0.2] range shrinks.
From
to
I don't care so much about the scaling but I would like my values at 3 score to show up.
Many thanks
Actually, the values for D and E in series A do show up, although they are plotted in the center of the plot. This is because the limits of your "y-axis" are autoscaled.
If you want to have a fixed "minimum radius", you can simply put ax.set_ylim(bottom=0) in your for-loop.
If you want the minimum radius to be a number relative to the lowest plotted value, you can include something like ax.set_ylim(np.asarray(list(data.values())).flatten().min() - margin) in the for-loop, where margin is the distance from the lowest plotted value to the center of the plot. (On Python 3, data.values() is a view, hence the list(...) call.)
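Here is a small stand-alone sketch of the relative-limit variant on a plain polar axes. It does not use the radar projection from the matplotlib example or the data from the gist; the two series and the five axes are made up, with series A having its two lowest scores at 3 as in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up scores for five axes; series A has its two lowest values at 3
data = {"A": [8, 7, 6, 3, 3], "B": [5, 9, 4, 6, 7]}
angles = np.linspace(0.0, 2.0 * np.pi, 5, endpoint=False)

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for name, values in data.items():
    # Repeat the first point to close the polygon
    ax.plot(np.append(angles, angles[0]), values + values[:1], marker="o", label=name)

margin = 1  # distance from the lowest plotted value to the centre of the plot
lowest = np.asarray(list(data.values())).min()
ax.set_ylim(bottom=lowest - margin)  # or ax.set_ylim(bottom=0) for a fixed centre
r_bottom = ax.get_ylim()[0]
```

With margin = 1 the centre of the plot corresponds to a radius of 2, so the points at 3 sit one unit away from the centre instead of collapsing onto it.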
With fixed center at radius 0 (added markers to better show that the points are plotted):
By setting margin = 1, and using the relative y-limits, I get this output:
