Non-overlapping scatter plot labels using matplotlib - python

I have a scatter plot with a number of points. Each point has a string of varying length associated with it that I'd like to use as a label, but I can't fit them all. So I'd like to iterate through my data points from most to least important, and in each case apply a label only if it would not overlap an existing label. One of the commenters mentions solving a knapsack problem to find an optimal solution. In my case the greedy algorithm (always label the most important remaining point that can be labeled without overlap) would be a good start and might suffice.
Here's a toy example. Could I get Python to label only as many points as it can without overlapping?
import matplotlib.pyplot as plt
import numpy as np
npoints = 100
xs = np.random.rand(npoints)
ys = np.random.rand(npoints)
plt.scatter(xs, ys)
labels = iter(dir(np))
for x, y in zip(xs, ys):
    # Ideally I'd condition the next line on whether or not the new label would overlap with an existing one
    plt.annotate(next(labels), xy=(x, y))
plt.show()

You can draw all the annotations first, then use a mask array to check for overlap and call set_visible(False) to hide the ones that collide. Here is an example:
import numpy as np
import pylab as pl
import random
import string
import math
random.seed(0)
np.random.seed(0)
n = 100
labels = ["".join(random.sample(string.ascii_letters, random.randint(4, 10))) for _ in range(n)]
x, y = np.random.randn(2, n)
fig, ax = pl.subplots()
ax.scatter(x, y)
# Create every annotation first; overlapping ones are hidden afterwards
ann = []
for i in range(n):
    ann.append(ax.annotate(labels[i], xy=(x[i], y[i])))
# Boolean pixel mask of the canvas, indexed as [x, y]
mask = np.zeros(fig.canvas.get_width_height(), bool)
# Draw once so that the text extents are known
fig.canvas.draw()
for a in ann:
    bbox = a.get_window_extent()
    x0 = int(bbox.x0)
    x1 = int(math.ceil(bbox.x1))
    y0 = int(bbox.y0)
    y1 = int(math.ceil(bbox.y1))
    s = np.s_[x0:x1+1, y0:y1+1]
    if np.any(mask[s]):
        # Some pixel is already taken by an earlier label
        a.set_visible(False)
    else:
        mask[s] = True
pl.show()
the output:

Just as an additional note: for my code to work, I had to pass an additional renderer=fig.canvas.get_renderer() argument to get_window_extent() rather than rely on the default get_window_extent(renderer=None). I think whether this extra argument is needed depends on the operating system. https://github.com/matplotlib/matplotlib/issues/10874
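The mask approach above hides labels in the order they were created. Below is a minimal sketch of the greedy, importance-ordered variant the question asks for. It compares bounding boxes with Bbox.overlaps instead of a pixel mask. The function name label_greedily and the importance array are hypothetical, and fig.canvas.get_renderer() is assumed to be available (it is on Agg-based backends).

```python
import numpy as np
import matplotlib.pyplot as plt


def label_greedily(ax, xs, ys, labels, importance):
    """Annotate every point, then hide labels that overlap a more important one."""
    fig = ax.figure
    annotations = [ax.annotate(lab, xy=(x, y)) for x, y, lab in zip(xs, ys, labels)]
    fig.canvas.draw()  # text extents are only known after a draw
    renderer = fig.canvas.get_renderer()  # assumption: Agg-based backend
    kept_boxes = []
    for i in np.argsort(importance)[::-1]:  # most important first
        box = annotations[i].get_window_extent(renderer=renderer)
        if any(box.overlaps(other) for other in kept_boxes):
            annotations[i].set_visible(False)
        else:
            kept_boxes.append(box)
    return annotations


rng = np.random.default_rng(0)
n = 100
xs, ys = rng.random((2, n))
labels = [f"label{i}" for i in range(n)]
importance = rng.random(n)  # hypothetical ranking, higher = more important

fig, ax = plt.subplots()
ax.scatter(xs, ys)
annotations = label_greedily(ax, xs, ys, labels, importance)
visible = [a for a in annotations if a.get_visible()]
```

Call plt.show() afterwards to display the figure. The pairwise box comparison is quadratic in the number of kept labels, which should be fine for a few hundred points; the pixel mask is the cheaper choice for much larger sets.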

Related

Python Matplotlib -> give each x axis a numeric Label

I want to give each x point a numeric label, counting up from 0.
The label should appear at whichever point is highest for that x value.
Function:
def plot_pics(self, figure, title, x, a1, a2, a3, labelx, labely):
    ax = figure.add_subplot(111)
    ax.plot(x, a1, '-o')
    ax.plot(x, a2, '-o')
    ax.plot(x, a3, '-o')
    ax.legend(['left region', 'center region', 'right region'])
    ax.set_xlabel(labelx)
    ax.set_ylabel(labely)
    ax.set_title(title)
    figure.canvas.draw_idle()
Here is a minimal working example of what I think you want to achieve.
import numpy as np
import matplotlib.pyplot as plt
# Random plotting data
x_arr = np.arange(10)
rand_arr = np.random.random((10, 3))
# Plot everything
plt.plot(x_arr, rand_arr, '-o')
# Find the maximum for every x-value
rand_max_arr = np.max(rand_arr, axis=1)
x_offset = 0.5
y_offset = 0.04
for x, y in zip(x_arr, rand_max_arr):
    plt.text(x - x_offset, y + y_offset, "point {:d}".format(x), bbox=dict(facecolor="white"))
plt.show()
It generates the following plot.
For testing purposes I create 3 arrays of 10 random numbers each. You then find the maximum for each x value and attach a text to it via plt.text(), using the x value and the maximum found as the coordinates. The offsets move the text so that it interferes as little as possible with the plotted maxima themselves.
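The offsets above are in data units, so they may need retuning whenever the axis limits change. As a hedged alternative sketch, ax.annotate can place the text a fixed number of points above each maximum via textcoords="offset points". The 8-point offset and the variable names here are arbitrary choices, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x_arr = np.arange(10)
y_arr = rng.random((10, 3))  # three series sharing the same x values

fig, ax = plt.subplots()
ax.plot(x_arr, y_arr, "-o")

# Highest of the three series at every x value
y_max = y_arr.max(axis=1)

texts = []
for x, y in zip(x_arr, y_max):
    # xytext is measured in points relative to xy, independent of axis scale
    texts.append(
        ax.annotate(str(x), xy=(x, y), xytext=(0, 8),
                    textcoords="offset points", ha="center")
    )

ax.margins(y=0.15)  # leave headroom so the top labels are not clipped
```

Call plt.show() afterwards to display the figure.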

Plotting an intersection when graph touches an x-axis

So I'm making a graphical calculator, which shows the intersections between graphs and axes. I found that the method from "Intersection of two graphs in Python, find the x value" works most of the time. However, trying to plot the x-axis intersection of x**2 like this
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(-5, 5, 0.01)
g = (x) ** 2
plt.plot(x, g, '-')
idx = np.argwhere(np.diff(np.sign(g))).flatten()
plt.plot(x[idx], g[idx], 'ro')
plt.show()
doesn't put the dot at the (0,0) point. I assumed it has something to do with the fact that 0 is not in g, so the graph doesn't actually pass through the point exactly and instead gets really close to it. So I experimented with changing idx to
epsilon = 0.0001
# or another real small number
idx = g < epsilon
Unfortunately, that only seemed to make a lot of points near the actual x-intercept, instead of just one.
You are close. Instead, I just search for where the absolute value of the derivative is at a minimum. Plotting it with
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(-5, 5, 0.01)
g = x**2
plt.plot(np.abs(np.diff(g)))
plt.show()
which shows that the minimum should be at index 500:
Then all you need to do is return the index of the minimum value with argmin and plot that point
idx = np.argmin(np.abs(np.diff(g)))
plt.plot(x, g, '-')
plt.scatter(x[idx],g[idx])
plt.show()
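You'll need to modify the idx variable to return multiple roots, but for the question you posted, this should be sufficient. Note that this finds the point where the slope is flattest, which coincides with the x-intercept here only because the parabola touches the axis at its vertex.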
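One possible way to handle several roots, and both crossings and touches, is to look for local minima of |g| that are close to zero rather than minima of the derivative. This is a swapped-in technique, not the method from the answer above. The helper name zero_indices and the tolerance value are assumptions; the tolerance has to be chosen to suit the grid spacing and the slope near the root.

```python
import numpy as np


def zero_indices(g, tol):
    """Indices of interior local minima of |g| that lie within tol of zero."""
    abs_g = np.abs(g)
    # A sample is a local minimum if it is no larger than both neighbours
    is_local_min = (abs_g[1:-1] <= abs_g[:-2]) & (abs_g[1:-1] <= abs_g[2:])
    idx = np.flatnonzero(is_local_min) + 1  # +1: the comparison skipped index 0
    return idx[abs_g[idx] < tol]


x = np.arange(-5, 5, 0.01)
touch = zero_indices(x**2, tol=1e-3)      # curve touches the axis once
cross = zero_indices(x**2 - 4, tol=1e-3)  # curve crosses the axis twice
```

The resulting indices can be passed to plt.plot(x[idx], g[idx], 'ro') as in the question. A flat stretch of exact zeros would return every index in that stretch, so this sketch assumes isolated roots.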

Adjusting Plotted Values of Contour Plots

I'm making contour plots which are basically analytical or numerical solutions to a fluid dynamic system. I don't think the technical stuff really matters too much, but here are my plots. The first plot is the numerical (matrix system) solution, and the second plot is the nice closed form (single formula) solution.
As can be seen, my second plot has the bubbles on the right hand side. Looking at the legend/scale, I have negative values. I'd like to not have negative values, or not plot them, although I'm not sure how to adjust this within my code. I've spent some time looking into how to adjust the z values to being positive only, but I can't seem to get it. I'll drop my plot code, and then my nice closed form function that is used in the plot.
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
import scipy.special as sp1
from mpl_toolkits.mplot3d import Axes3D
def v(r, z, gamma):
    a = r*(1 - z/gamma)
    sums = 0
    for n in range(1, 26):
        # sp1.iv(order, argument) is the modified Bessel function of the first kind;
        # the comma after the order 1 was missing and is assumed here
        sums += ((sp1.iv(1, (n*np.pi*r)/gamma))/(n*sp1.iv(1, (n*np.pi)/gamma)))*np.sin(n*np.pi*z/gamma)
    return a - (2/np.pi)*sums
def plot_contour(a, filename=None, zlabel='v(r,z)', cmap=plt.cm.gnuplot):
    fig = plt.figure(figsize=(5, 4))
    ax = fig.add_subplot(111)
    x = np.arange(a.shape[0])
    y = np.arange(a.shape[1])
    X, Y = np.meshgrid(x, y)
    Z = a[X, Y]
    cset = ax.contourf(X, Y, Z, 20, cmap=cmap)
    ax.set_xlabel('r')
    ax.set_ylabel('z')
    ax.set_title('\u0393=2.5')
    ax.axis('off')
    ax.set_aspect(1)
    cb = fig.colorbar(cset, shrink=0.5, aspect=5)
    cb.set_label(zlabel)
    if filename:
        fig.savefig(filename, dpi=1600)
        plt.close(fig)
        return filename
    else:
        return ax
...
plot_contour(v1, 'gamma25e+1')
This is all the necessary code. The rest of it is the matrix solution stuff, which is just a bunch of linear algebra. Any help on what I need to add or adjust to prevent negative values from showing up on the second plot would be appreciated. It should look exactly like the first.
I've spent some time looking into how to adjust the z values to being positive only
What you can do depends greatly on what you want to do with the results below zero. If your sole purpose is to make the points below zero show as zero, you can simply set them to zero; however, that would be showing a wrong result.
x = np.arange(a.shape[0])
y = np.arange(a.shape[1])
X, Y = np.meshgrid(x, y)
Z = a[X, Y]
Z[Z < 0] = 0
Another solution is to subtract the minimum value of your data so that the minimum value of the result is 0.
x = np.arange(a.shape[0])
y = np.arange(a.shape[1])
X, Y = np.meshgrid(x, y)
Z = a[X, Y]
Z -= np.amin(Z)
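A third option, if the goal is to not plot the negative values at all, is to mask them: contourf leaves masked cells blank. The sketch below compares clipping with masking. The array V is a hypothetical stand-in that dips slightly below zero, not the closed-form solution from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the sampled solution
r = np.linspace(0, 1, 50)
z = np.linspace(0, 2.5, 50)
R, Zc = np.meshgrid(r, z)
V = R * (1 - Zc / 2.5) - 0.05

V_clipped = np.clip(V, 0, None)    # negatives are drawn as 0
V_masked = np.ma.masked_less(V, 0)  # negatives are not drawn at all

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
c1 = ax1.contourf(R, Zc, V_clipped, 20)
c2 = ax2.contourf(R, Zc, V_masked, 20)
ax1.set_title("clipped")
ax2.set_title("masked")
fig.colorbar(c1, ax=ax1)
fig.colorbar(c2, ax=ax2)
```

Masking keeps the remaining values untouched, so the colorbar still reports the true data, whereas clipping and shifting both alter what is shown.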

Plot and function with three variables in python

I have an equation, given below:
sin(x)*sin(y)*sin(z)+cos(x)*sin(y)*cos(z)=0
I know how to plot a function z=f(x,y) using matplotlib, but I don't know how to plot the equation above. I tried the following MATLAB MuPAD code:
Plot(sin(x)*sin(y)*sin(z)+cos(x)*sin(y)*cos(z),#3d)
This will be much easier if you can isolate z. Your equation is the same as sin(z)/cos(z) = -cos(x)*sin(y)/(sin(x)*sin(y)) so z = atan(-cos(x)*sin(y)/(sin(x)*sin(y))).
Please don't mistake me, but I think your given equation to plot can be reduced to a simple 2D plot.
sin(x)*sin(y)*sin(z)+cos(x)*sin(y)*cos(z) = 0
sin(y)[sin(x)*sin(z)+cos(x)*cos(z)] = 0
sin(y)*cos(x-z) = 0
Hence sin(y) = 0 or cos(x-z)=0
Hence y = n*pi (1) or x-z = (2*n + 1)*pi/2
This implies x = z + (2*n + 1)*pi/2 (2)
For (1), y is constant for each n, which gives a straight line. In the second case, you get parallel lines in the x-z plane that cut the x-axis at (2*n + 1)*pi/2, spaced pi apart along the x-axis (one line for each fixed n).
Assuming sin(y) is nonzero (y is not a multiple of pi), you can simplify the plot to a 2D plot of just x and z.
And answering your original question, you need to use mplot3d to plot 3D plots. But as with any graphing tool, you need values or points of x, y, z. (You can compute the possible points by programming). Then you feed those points to the plot, like below.
from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = plt.axes(projection="3d")
xs = [] # X values
ys = [] # Y values
zs = [] # Z values
ax.plot3D(xs, ys, zs)
plt.show()
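As a hedged sketch of how the point lists might be filled in, the code below samples the two solution families derived above, x = z + (2*n + 1)*pi/2 and y = n*pi, checks numerically that they satisfy the original equation, and draws them as surfaces. The grid range, the choice of n values and the helper f are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt


def f(x, y, z):
    """Left-hand side of the equation from the question."""
    return np.sin(x) * np.sin(y) * np.sin(z) + np.cos(x) * np.sin(y) * np.cos(z)


u = np.linspace(-np.pi, np.pi, 30)
Y, Z = np.meshgrid(u, u)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

residuals = []
for n in (-1, 0):
    X = Z + (2 * n + 1) * np.pi / 2  # solution family (2)
    residuals.append(np.abs(f(X, Y, Z)).max())
    ax.plot_surface(X, Y, Z, alpha=0.5)

Xp, Zp = np.meshgrid(u, u)
Yp = np.zeros_like(Xp)  # solution family (1) with n = 0
residuals.append(np.abs(f(Xp, Yp, Zp)).max())
ax.plot_surface(Xp, Yp, Zp, alpha=0.3)

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
```

Call plt.show() afterwards to display the figure. The residuals list holds the largest absolute value of the left-hand side on each sampled surface, which should be zero up to floating-point rounding.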

Plot staggered histograms/lines as in FACS

My question is basically exactly the same as this one but for matplotlib. I'm sure it has something to do with axes or subplots, but I don't think I fully understand those paradigms (a fuller explanation would be great).
As I loop through a set of comparisons, I'd like the base y value of each new plot to be set slightly below the previous one to get something like this:
One other (potential) wrinkle is that I'm generating these plots in a loop, so I don't necessarily know how many plots there will be at the outset. I think this is one of the things that I'm getting hung up on with subplots/axes, because it seems like you need to set them ahead of time.
Any ideas would be greatly appreciated.
EDIT: I made a little progress I think:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x = np.random.random(100)
y = np.random.random(100)
fig = plt.figure()
ax = fig.add_axes([1,1,1,1])
ax2 = fig.add_axes([1.02,.9,1,1])
ax.plot(x, color='red')
ax.fill_between([i for i in range(len(x))], 0, x, color='red', alpha=0.5)
ax2.plot(y, color='green')
ax2.fill_between([i for i in range(len(y))], 0, y, color='green', alpha=0.5)
Gives me:
Which is close to what I want...
Is this the sort of thing you want?
What I did was define the y-distance between the baselines of each curve. For the ith curve, I calculated the minimum Y-value, then set that minimum to be i times the y-distance, adjusting the height of the entire curve accordingly. I used a decreasing z-order to ensure that the filled part of the curves were not obscured by the baselines.
Here's the code:
import numpy as np
import matplotlib.pyplot as plt
delta_Y = .5
zorder = 0
for i, Y in enumerate(data):
    baseline = min(Y)
    #change needed for minimum of Y to be delta_Y above previous curve
    y_change = delta_Y * i - baseline
    Y = Y + y_change
    plt.fill_between(np.linspace(0, 1000, 1000), Y, np.ones(1000) * delta_Y * i, zorder = zorder)
    zorder -= 1
Code that generates dummy data:
def gauss(X):
    return np.exp(-X**2 / 2.0)
#create data
X = np.linspace(-10, 10, 100)
data = []
for i in range(10):
    arr = np.zeros(1000)
    arr[i * 100: i * 100 + 100] = gauss(X)
    data.append(arr)
data.reverse()
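The offset-baseline idea above does not require knowing the number of curves in advance, since each baseline depends only on the loop index. A minimal self-contained sketch, assuming the curves arrive one at a time from a generator (plot_staggered and curves are hypothetical names):

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_staggered(ax, curves, delta_y=0.5):
    """Draw each curve on its own baseline, delta_y above the previous one."""
    baselines = []
    for i, y in enumerate(curves):
        y = np.asarray(y, dtype=float)
        baseline = delta_y * i
        shifted = y - y.min() + baseline  # curve minimum sits on the baseline
        xs = np.arange(len(y))
        # Decreasing z-order keeps earlier (lower) curves in front
        ax.fill_between(xs, baseline, shifted, zorder=-i, alpha=0.8)
        ax.plot(xs, shifted, color="black", linewidth=0.8, zorder=-i)
        baselines.append(baseline)
    return baselines


def curves():
    """Hypothetical data source yielding an unknown number of 1-D arrays."""
    rng = np.random.default_rng(0)
    for _ in range(3):
        yield rng.random(100)


fig, ax = plt.subplots()
baselines = plot_staggered(ax, curves(), delta_y=0.5)
```

Call plt.show() afterwards to display the figure. Everything is drawn on a single Axes, so no subplots have to be created ahead of time.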
You could also look into installing JoyPy through:
pip install joypy
Pretty dynamic tool created by Leonardo Taccari, if what you are looking into is "stacked" distribution plots like so:
Example 1 - Joy Plot using JoyPy:
Example 2 - Joy Plot on Iris dataset:
Leonardo also has a neat description of the package and how to use it here.
Alternatively Seaborn has a package but I found it less easy to use.
Hope that helps!
So I managed to get a little bit farther by adding an additional Axes instance in each loop.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#instantiate data sets
x = np.random.random(100)
y = np.random.random(100)
z = np.random.random(100)
plots = [x, y, z]
fig = plt.figure()
#Sets the default vertical position
pos = 1
def making_plot(ax, p):
    ax.plot(p)
    # Prevents the background from covering over the earlier plots
    # (set_axis_bgcolor was removed from matplotlib; set_facecolor replaces it)
    ax.set_facecolor('none')
for p in plots:
    ax = fig.add_axes([1,pos,1,1])
    pos -= 0.3
    making_plot(ax, p)
plt.show()
Clearly, I could spend more time making this prettier, but this does the job.
