I want to give each x point a label, numbered from 0 upwards.
The label should be shown at the highest point for that x value.
Function:
def plot_pics(self, figure, title, x, a1, a2, a3, labelx, labely):
    ax = figure.add_subplot(111)
    ax.plot(x, a1, '-o')
    ax.plot(x, a2, '-o')
    ax.plot(x, a3, '-o')
    ax.legend(['left region', 'center region', 'right region'])
    ax.set_xlabel(labelx)
    ax.set_ylabel(labely)
    ax.set_title(title)
    figure.canvas.draw_idle()
Here is a minimal working example of what I think you want to achieve.
import numpy as np
import matplotlib.pyplot as plt
# Random plotting data
x_arr = np.arange(10)
rand_arr = np.random.random((10, 3))
# Plot everything
plt.plot(x_arr, rand_arr, '-o')
# Find the maximum for every x-value
rand_max_arr = np.max(rand_arr, axis=1)
x_offset = 0.5
y_offset = 0.04
for x, y in zip(x_arr, rand_max_arr):
    plt.text(x - x_offset, y + y_offset, "point {:d}".format(x), bbox=dict(facecolor="white"))
plt.show()
It generates the following plot.
For testing purposes I create 3 arrays of 10 random numbers each. You then find the maximum for each x-point and attach a text to it via plt.text(), using the x-point and the found maximum as coordinates. The offsets move the text so that it interferes only minimally with the plotted maxima themselves.
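Adapted to the three-series function from the question, the same idea might look like the sketch below. The helper name label_highest_points, the sample data and the 8-point vertical offset are assumptions made for illustration. It uses ax.annotate with textcoords="offset points" rather than plt.text, so the offset is given in points and does not depend on the data scale.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def label_highest_points(ax, x, *series):
    """Annotate, for each x value, the highest point among all series.

    Hypothetical helper: labels are the running index 0, 1, 2, ...
    """
    x = np.asarray(x)
    # Stack the series row-wise and take the maximum per column (per x value)
    y_max = np.max(np.vstack(series), axis=0)
    for i, (xi, yi) in enumerate(zip(x, y_max)):
        ax.annotate(str(i), xy=(xi, yi), xytext=(0, 8),
                    textcoords="offset points", ha="center")
    return y_max


# Made-up sample data standing in for a1, a2, a3 from the question
x = np.arange(5)
a1 = np.array([1.0, 4.0, 2.0, 0.5, 3.0])
a2 = np.array([2.0, 1.0, 5.0, 0.2, 1.0])
a3 = np.array([0.5, 3.0, 1.0, 0.9, 6.0])

fig, ax = plt.subplots()
for a in (a1, a2, a3):
    ax.plot(x, a, "-o")
y_max = label_highest_points(ax, x, a1, a2, a3)
```

Inside plot_pics, the call would go after the three ax.plot lines, passing a1, a2 and a3 as the series.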
I have a 2D output matrix (say, Z) which was calculated as a function of two variables x,y.
x varies in a non-uniform manner like [1e-5,5e-5,1e-4,5e-4,1e-3,5e-3,1e-2]
y varies in a uniform manner like [300,400,500,600,700,800]
[ say, Z = np.random.rand(7,6) ]
I was trying to plot a colormap of the matrix Z by first creating a meshgrid for x,y and then using pcolormesh. Since my x values are non-uniform, I do not know how to proceed. Any inputs would be greatly appreciated.
No need for meshgrids; regarding the non-uniform axes: In your case a log-scale works fine:
import numpy as np
from matplotlib import pyplot as plt
x = [1e-5,5e-5,1e-4,5e-4,1e-3,5e-3,1e-2]
y = [300,400,500,600,700,800]
# either enlarge x and y by one number (right-most
# endpoint for those bins), or make Z smaller as I did
Z = np.random.rand(6,5)
fig = plt.figure()
ax = fig.gca()
ax.pcolormesh(x,y,Z.T)
ax.set_xscale("log")
fig.show()
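The other option mentioned in the comment, enlarging x and y by one value so that the full 7x6 matrix is drawn, could be sketched as follows. The helper edges_from_centers is hypothetical: it treats the given values as bin centers and places edges halfway between neighbours (halfway in log10 space for x), which is only one possible convention.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def edges_from_centers(centers, log=False):
    """Return len(centers) + 1 bin edges lying midway between the centers.

    Hypothetical helper; with log=True the midpoints are taken in log10 space.
    """
    c = np.asarray(centers, dtype=float)
    if log:
        c = np.log10(c)
    mid = 0.5 * (c[:-1] + c[1:])
    # Extrapolate the outer edges symmetrically around the first/last center
    first = 2 * c[0] - mid[0]
    last = 2 * c[-1] - mid[-1]
    edges = np.concatenate([[first], mid, [last]])
    return 10.0 ** edges if log else edges


x = [1e-5, 5e-5, 1e-4, 5e-4, 1e-3, 5e-3, 1e-2]
y = [300, 400, 500, 600, 700, 800]
Z = np.random.rand(7, 6)  # full-size matrix, one value per (x, y) pair

x_edges = edges_from_centers(x, log=True)
y_edges = edges_from_centers(y)

fig, ax = plt.subplots()
# Z.T has shape (6, 7) = (len(y_edges) - 1, len(x_edges) - 1)
mesh = ax.pcolormesh(x_edges, y_edges, Z.T)
ax.set_xscale("log")
```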
I am creating a plot in python. Is there a way to re-scale the axis by a factor? The yscale and xscale commands only allow me to turn log scale off.
Edit:
For example, if I have a plot where the x scale goes from 1 nm to 50 nm, the x scale will range from 1x10^(-9) to 50x10^(-9) and I want it to range from 1 to 50. Thus, I want the plot function to divide the x values placed on the plot by 10^(-9).
As you have noticed, xscale and yscale do not support a simple linear re-scaling (unfortunately). As an alternative to Hooked's answer, instead of messing with the data, you can trick the labels like so:
ticks = ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/scale))
ax.xaxis.set_major_formatter(ticks)
A complete example showing both x and y scaling:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
# Generate data
x = np.linspace(0, 1e-9)
y = 1e3*np.sin(2*np.pi*x/1e-9) # one period, 1k amplitude
# setup figures
fig = plt.figure()
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
# plot two identical plots
ax1.plot(x, y)
ax2.plot(x, y)
# Change only ax2
scale_x = 1e-9
scale_y = 1e3
ticks_x = ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/scale_x))
ax2.xaxis.set_major_formatter(ticks_x)
ticks_y = ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/scale_y))
ax2.yaxis.set_major_formatter(ticks_y)
ax1.set_xlabel("meters")
ax1.set_ylabel('volt')
ax2.set_xlabel("nanometers")
ax2.set_ylabel('kilovolt')
plt.show()
And finally I have the credits for a picture:
Note that, if you have text.usetex: true as I have, you may want to enclose the labels in $, like so: '${0:g}$'.
Instead of changing the ticks, why not change the units instead? Make a separate array X of x-values whose units are in nm. This way, when you plot the data it is already in the correct format! Just make sure you add a xlabel to indicate the units (which should always be done anyways).
from pylab import *
# Generate random test data in your range
N = 200
epsilon = 10**(-9.0)
X = epsilon*(50*random(N) + 1)
Y = random(N)
# X2 now has the "units" of nanometers by scaling X
X2 = (1/epsilon) * X
subplot(121)
scatter(X,Y)
xlim(epsilon,50*epsilon)
xlabel("meters")
subplot(122)
scatter(X2,Y)
xlim(1, 50)
xlabel("nanometers")
show()
To set the range of the x-axis, you can use set_xlim(left, right); see the Matplotlib docs for details.
Update:
It looks like you want an identical plot, but only change the 'tick values', you can do that by getting the tick values and then just changing them to whatever you want. So for your need it would be like this:
ticks = your_plot.get_xticks()*10**9
your_plot.set_xticklabels(ticks)
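A runnable version of this idea might look like the sketch below, where ax stands in for your_plot and the data are made up. Pinning the tick positions with set_xticks before relabelling them is an addition here: recent Matplotlib versions warn when set_xticklabels is used without fixed tick positions, and fixed labels would otherwise go stale when the view changes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up data in meters, spanning 1 nm to 50 nm
x = np.linspace(1e-9, 50e-9, 50)
y = np.sin(x / 1e-8)

fig, ax = plt.subplots()
ax.plot(x, y)

xlim = ax.get_xlim()      # remember the view, set_xticks may widen it
ticks = ax.get_xticks()   # tick positions in meters
ax.set_xticks(ticks)      # pin the positions before relabelling
ax.set_xticklabels(["{:g}".format(t * 1e9) for t in ticks])
ax.set_xlim(xlim)
ax.set_xlabel("nanometers")

tick_texts = [label.get_text() for label in ax.get_xticklabels()]
```

Because the labels are fixed strings, they do not update on zoom or pan; the FuncFormatter approach shown earlier does.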
I am working with a large number of 3D points, each with x,y,z values stored in numpy arrays.
For background, the points will always fall within a cylinder of fixed radius, and height = max z value of the points.
My objective is to split the bounding cylinder (or column if it is easier) into e.g. 1 m height strata, and then count the number of points within each cell of a regular grid (e.g. 1 m x 1 m) overlaid on each stratum.
Conceptually, the operation would be the same as overlaying a raster and counting the points intersecting each pixel.
The grid of cells can form a square or a disk, it doesn't matter.
After a lot of searching and reading, my current thinking is to use some combination of numpy.linspace and numpy.meshgrid to generate the vertices of each cell stored within an array and test each cell against each point to see if it is 'in'. This seems inefficient, especially when working with thousands of points.
The numpy / scipy suite seems well suited to the problem, but I have not found a solution yet. Any suggestions would be much appreciated.
I have included a few example points and some code to visualize the data.
# Setup
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Load in X,Y,Z values from a sub-sample of 10 points for testing
# XY Values are scaled to a reasonable point of origin
z_vals = np.array([3.08,4.46,0.27,2.40,0.48,0.21,0.31,3.28,4.09,1.75])
x_vals = np.array([22.88,20.00,20.36,24.11,40.48,29.08,36.02,29.14,32.20,18.96])
y_vals = np.array([31.31,25.04,31.86,41.81,38.23,31.57,42.65,18.09,35.78,31.78])
# This plot is instructive to visualize the problem
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x_vals, y_vals, z_vals, c='b', marker='o')
plt.show()
I am not sure I understand exactly what you are looking for, but since every "cell" seems to have a 1 m side in all directions, couldn't you:
round all your values to integers (rasterize your data), probably with some floor function;
create a bijection from these integer coordinates to something more convenient with something like:
(64**2)*x + (64)*y + z # assuming all values are in [0,63]
You can put z at the beginning instead if you want to focus more easily on height later;
compute the histogram of each "cell" (several functions from numpy/scipy can do it);
revert the bijection if needed (i.e. recover the "true" coordinates of each cell once the count is known).
Maybe I didn't understand well, but in case it helps...
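The steps above can be sketched with numpy as follows, using the sample points from the question. Using np.ravel_multi_index and np.unravel_index for the bijection (instead of the hand-written 64-based formula) and np.bincount for the histogram are choices made for this illustration, as is shifting the integer coordinates so they start at zero.

```python
import numpy as np

# Sample points from the question
z_vals = np.array([3.08, 4.46, 0.27, 2.40, 0.48, 0.21, 0.31, 3.28, 4.09, 1.75])
x_vals = np.array([22.88, 20.00, 20.36, 24.11, 40.48, 29.08, 36.02, 29.14, 32.20, 18.96])
y_vals = np.array([31.31, 25.04, 31.86, 41.81, 38.23, 31.57, 42.65, 18.09, 35.78, 31.78])

# 1. Rasterize: floor every coordinate to its 1 m cell index
pts = np.column_stack([x_vals, y_vals, z_vals])
idx = np.floor(pts).astype(int)
origin = idx.min(axis=0)          # shift so that indices start at 0
idx -= origin
shape = tuple(idx.max(axis=0) + 1)

# 2. Bijection: (ix, iy, iz) -> single flat integer
flat = np.ravel_multi_index(tuple(idx.T), shape)

# 3. Histogram of the flat indices, then back to a 3D grid of counts
counts_flat = np.bincount(flat, minlength=int(np.prod(shape)))
counts = counts_flat.reshape(shape)

# 4. Revert the bijection: true integer coordinates of the occupied cells
occupied = np.column_stack(
    np.unravel_index(np.flatnonzero(counts_flat), shape)
) + origin
```

Here counts[i, j, k] is the number of points in the cell whose lower corner is origin + (i, j, k).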
Thanks @Baruchel. It turns out the n-dimensional histograms suggested by @DilithiumMatrix provide a fairly simple solution to the problem I posted. After some reading, here is my current solution for anyone else who faces a similar problem.
As this is my first Python/Numpy effort any improvements/suggestions, especially regarding performance, would be welcome. Thanks.
# Setup
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Load in X,Y,Z values from a sub-sample of 10 points for testing
# XY Values are scaled to a reasonable point of origin
z_vals = np.array([3.08,4.46,0.27,2.40,0.48,0.21,0.31,3.28,4.09,1.75])
x_vals = np.array([22.88,20.00,20.36,24.11,40.48,29.08,36.02,29.14,32.20,18.96])
y_vals = np.array([31.31,25.04,31.86,41.81,38.23,31.57,42.65,18.09,35.78,31.78])
# Updated code below
# Variables needed for 2D,3D histograms
xmax, ymax, zmax = int(x_vals.max())+1, int(y_vals.max())+1, int(z_vals.max())+1
xmin, ymin, zmin = int(x_vals.min()), int(y_vals.min()), int(z_vals.min())
xrange, yrange, zrange = xmax-xmin, ymax-ymin, zmax-zmin
xedges = np.linspace(xmin, xmax, (xrange + 1), dtype=int)
yedges = np.linspace(ymin, ymax, (yrange + 1), dtype=int)
zedges = np.linspace(zmin, zmax, (zrange + 1), dtype=int)
# Make the 2D histogram
h2d, xedges, yedges = np.histogram2d(x_vals, y_vals, bins=(xedges, yedges))
assert h2d.sum() == len(x_vals), "Unclassified points in the array"
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.imshow(h2d.transpose(), extent=extent, interpolation='none', origin='lower')
# histogram2d puts x along the first axis, while imshow draws the first axis as rows (y) from the top,
# so the transpose and origin='lower' are needed to make the array line up
# Plot settings, not sure yet on matplotlib update/override objects
plt.grid(True, which='both')
plt.xticks(xedges)
plt.yticks(yedges)
plt.xlabel('X-Axis')
plt.ylabel('Y-Axis')
plt.plot(x_vals, y_vals, 'ro')
plt.show()
# 3-dimensional histogram with 1 x 1 x 1 m bins. Produces point counts in each 1m3 cell.
xyzstack = np.stack([x_vals,y_vals,z_vals], axis=1)
h3d, Hedges = np.histogramdd(xyzstack, bins=(xedges, yedges, zedges))
assert h3d.sum() == len(x_vals), "Unclassified points in the array"
h3d.shape # Shape of the array should be same as the edge dimensions
testzbin = np.sum(np.logical_and(z_vals >= 1, z_vals < 2)) # Slice to test with
np.sum(h3d[:,:,1]) == testzbin # Test num points in second bins
np.sum(h3d, axis=2) # Sum of all vertical points above each x,y 'pixel'
# h2d and np.sum(h3d, axis=2) match as long as every point falls inside the z edges
# Remaining issue - how to get a r x c count of empty z bins.
# i.e. for each 'pixel' how many z bins contained no points?
# Possible solution is to reshape to use logical operators
count2d = h3d.reshape(xrange * yrange, zrange) # Maintain dimensions per num 3D cells defined
zerobins = (count2d == 0).sum(1)
zerobins.shape
# Get back to x,y grid with counts - ready for output as image with counts=pixel digital number
bincount_pixels = zerobins.reshape(xrange,yrange)
# Appears to work, perhaps there is a way without reshaping?
PS if you are facing a similar problem scikit patch extraction looks like another possible solution.
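On the open question in the code comments above (counting empty z bins per pixel without reshaping), one possibility is to compare the 3D histogram with zero and sum along the z axis directly. The sketch below rebuilds the histogram from the sample points with np.arange edges, a simplification of the linspace calls above, and checks the result against the reshape-based version.

```python
import numpy as np

# Sample points from the question
z_vals = np.array([3.08, 4.46, 0.27, 2.40, 0.48, 0.21, 0.31, 3.28, 4.09, 1.75])
x_vals = np.array([22.88, 20.00, 20.36, 24.11, 40.48, 29.08, 36.02, 29.14, 32.20, 18.96])
y_vals = np.array([31.31, 25.04, 31.86, 41.81, 38.23, 31.57, 42.65, 18.09, 35.78, 31.78])


def unit_edges(values):
    """Integer bin edges with 1 m spacing covering all values."""
    return np.arange(int(values.min()), int(values.max()) + 2)


xedges, yedges, zedges = unit_edges(x_vals), unit_edges(y_vals), unit_edges(z_vals)
h3d, _ = np.histogramdd(np.column_stack([x_vals, y_vals, z_vals]),
                        bins=(xedges, yedges, zedges))

# Number of empty z bins above each x,y pixel, without any reshape
empty_per_pixel = (h3d == 0).sum(axis=2)

# Reshape-based version from the answer, for comparison
nx, ny, nz = h3d.shape
via_reshape = (h3d.reshape(nx * ny, nz) == 0).sum(axis=1).reshape(nx, ny)
```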
I have a scatter plot with a number of points. Each point has a string associated with it (varying in length) that I'd like to show as a label, but I can't fit them all. So I'd like to iterate through my data points from most to least important, and in each case apply a label only if it would not overlap an existing label. One of the commenters mentions solving a knapsack problem to find an optimal solution. In my case the greedy algorithm (always label the most important remaining point that can be labeled without overlap) would be a good start and might suffice.
Here's a toy example. Could I get Python to label only as many points as it can without overlapping?
import matplotlib.pylab as plt, numpy as np
npoints = 100
xs = np.random.rand(npoints)
ys = np.random.rand(npoints)
plt.scatter(xs, ys)
labels = iter(dir(np))
for x, y in zip(xs, ys):
    # Ideally I'd condition the next line on whether or not the new label would overlap with an existing one
    plt.annotate(next(labels), xy=(x, y))
plt.show()
You can draw all the annotations first, then use a mask array to check for overlap and call set_visible() to hide the overlapping ones. Here is an example:
import numpy as np
import pylab as pl
import random
import string
import math
random.seed(0)
np.random.seed(0)
n = 100
labels = ["".join(random.sample(string.ascii_letters, random.randint(4, 10))) for _ in range(n)]
x, y = np.random.randn(2, n)
fig, ax = pl.subplots()
ax.scatter(x, y)
ann = []
for i in range(n):
    ann.append(ax.annotate(labels[i], xy = (x[i], y[i])))
mask = np.zeros(fig.canvas.get_width_height(), bool)
fig.canvas.draw()
for a in ann:
    bbox = a.get_window_extent()
    x0 = int(bbox.x0)
    x1 = int(math.ceil(bbox.x1))
    y0 = int(bbox.y0)
    y1 = int(math.ceil(bbox.y1))
    s = np.s_[x0:x1+1, y0:y1+1]
    if np.any(mask[s]):
        a.set_visible(False)
    else:
        mask[s] = True
the output:
Just as an additional note: for my code to work, I had to pass an additional renderer=fig.canvas.get_renderer() parameter to the get_window_extent() method rather than rely on the default get_window_extent(renderer=None). I think whether this parameter is needed depends on the operating system. https://github.com/matplotlib/matplotlib/issues/10874
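A variant of the same greedy idea, sketched below, compares bounding boxes directly with Bbox.overlaps instead of a pixel mask, passes the renderer explicitly as described in the note above, and visits the points from most to least important as the question asks. The importance array and the label strings are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
n = 100
x, y = rng.randn(2, n)
labels = ["label{}".format(i) for i in range(n)]
importance = rng.rand(n)  # hypothetical per-point importance score

fig, ax = plt.subplots()
ax.scatter(x, y)
annotations = [ax.annotate(lab, xy=(xi, yi))
               for lab, xi, yi in zip(labels, x, y)]

# Text extents are only known after a draw
fig.canvas.draw()
renderer = fig.canvas.get_renderer()

kept = []  # bounding boxes of the labels that stay visible
for i in np.argsort(-importance):  # most important first
    bbox = annotations[i].get_window_extent(renderer=renderer)
    if any(bbox.overlaps(other) for other in kept):
        annotations[i].set_visible(False)
    else:
        kept.append(bbox)
```

Checking each new box against every kept box is quadratic in the number of labels, which should be acceptable for a few hundred points; the pixel mask above avoids that cost.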
I am trying to plot a picture like this in Python.
I have three parameters for plotting.
x:
[ 0.03570416 0.05201517 0.05418171 0.01868341 0.07116423 0.07547471]
y:
[-0.32079484 -0.53330218 -1.02866859 -0.94808545 -0.51682506 -0.26788337]
z:
[-0.32079484 -0.53330218 -1.02866859 -0.94808545 -0.51682506 -0.26788337]
So x is the x-axis and y is the y-axis; z, however, is the intensity of the pixel.
I came up with this code:
z = np.array(reals)
x = np.array(ra)
y = np.array(dec)
nrows, ncols = 10, 10
grid = z.reshape((nrows, ncols))
plt.imshow(grid, extent=(x.min(), x.max(), y.max(), y.min()), interpolation='nearest', cmap=cm.gist_rainbow)
plt.title('This is a phase function')
plt.xlabel('ra')
plt.ylabel('dec')
plt.show()
However I get this error:
grid = z.reshape((nrows, ncols))
ValueError: total size of new array must be unchanged
ra, dec and reals are normal arrays of the same size. I calculated them beforehand and then created the numpy arrays from them.
The data you show is not consistent with making an image, but you could make a scatter plot with it.
The two basic types of plots for z values at (x,y) coordinate pairs are:
scatter plots, where for each (x,y) pair, a z-value is specified.
image (imshow, pcolor, pcolormesh, contour), where an x-axis with m regularly spaced values, and a y-axis with n regularly spaced values are specified, and then an array of z-values with size (m,n) is given.
Your data looks more like the former type, so I'm suggesting a scatter plot.
Here's what a scatter plot looks like (btw, your y and z values are the same, which is probably a mistake):
import numpy as np
import matplotlib.pyplot as plt
x = np.array([ 0.03570416, 0.05201517, 0.05418171, 0.01868341, 0.07116423, 0.07547471])
y = np.array([-0.32079484, -0.53330218, -1.02866859, -0.94808545, -0.51682506, -0.26788337])
z = np.array([-0.32079484, -0.53330218, -1.02866859, -0.94808545, -0.51682506, -0.26788337])
plt.scatter(x, y, c=z, s =250)
plt.show()
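For contrast, here is a minimal sketch of the image-type case described above, with made-up data on a regular 4x3 grid. It also shows where the ValueError in the question comes from: reshape only succeeds when the number of z values equals nrows * ncols, so 6 values cannot fill a 10x10 grid.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

m, n = 4, 3                          # m x-values, n y-values (made up)
x = np.linspace(0.0, 0.3, m)         # regularly spaced x-axis
y = np.linspace(-1.0, 0.0, n)        # regularly spaced y-axis
z = np.arange(m * n, dtype=float)    # exactly one z value per grid cell

# imshow expects rows to run along y and columns along x
grid = z.reshape(n, m)

fig, ax = plt.subplots()
ax.imshow(grid, extent=(x.min(), x.max(), y.min(), y.max()),
          origin="lower", interpolation="nearest", aspect="auto")

# Six values cannot be reshaped into a 10x10 grid, as in the question
try:
    np.arange(6.0).reshape(10, 10)
    reshape_failed = False
except ValueError:
    reshape_failed = True
```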