Changing size of scattered points in matplotlib - python

I am doing some plotting using cartopy and matplotlib, and I am producing a few images using the same set of points with a different domain size shown in each image. As my domain size gets bigger, the size of each plotted point remains fixed, so eventually as I zoom out, things get scrunched up, overlapped, and generally messy. I want to change the size of the points, and I know that I could do so by plotting them again, but I am wondering if there is a way to change their size without going through that process another time.
This is the line that I am using to plot the points:
plt.scatter(longs, lats, color = str(platformColors[platform]), zorder = 2, s = 8, marker = 'o')
And this is the line that I am using to change the domain size:
ax.set_extent([lon-offset, lon+offset, lat-offset, lat+offset])
Any advice would be greatly appreciated!

The collection returned by scatter has a set_sizes method, which you can use to set new sizes. For example:
import matplotlib.pylab as pl
import numpy as np
x = np.random.random(10)
y = np.random.random(10)
s = np.random.random(10)*100
pl.figure()
l=pl.scatter(x,y,s=s)
s = np.random.random(10)*100
l.set_sizes(s)
It seems that set_sizes only accepts arrays, so for a constant marker size you could do something like:
l.set_sizes(np.ones(x.size)*100)
Or for a relative change, something like:
l.set_sizes(l.get_sizes()*2)
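As a self-contained sketch of the relative resize above: the helper name rescale_markers and the factor 0.5 are illustrative and not part of matplotlib. Since s is a marker area in points squared, a scatter created with a scalar s may store only a single size, so the helper broadcasts it to one entry per point before scaling.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.random(10), rng.random(10)

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=8, marker="o")  # scalar size, as in the question


def rescale_markers(collection, n_points, factor):
    """Hypothetical helper: scale the marker areas of an existing scatter."""
    # get_sizes() may hold a single value when s was a scalar; broadcast it
    # to one entry per point so set_sizes always receives a full array.
    current = np.broadcast_to(collection.get_sizes(), (n_points,))
    collection.set_sizes(current * factor)
    return collection.get_sizes()


new_sizes = rescale_markers(points, x.size, 0.5)  # halve the marker area
fig.canvas.draw()  # redraw only; the points are not plotted a second time
```

Calling the helper after each ax.set_extent(...) with a factor chosen for the new domain would shrink or grow the existing markers without another scatter call.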

http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.scatter
These are the parameters that plt.scatter takes. The s parameter is the size of the scattered points, so change s to whatever you like, for example:
plt.scatter(longs, lats, color = str(platformColors[platform]), zorder = 2, s = 20, marker = 'o')

Related

How can I adjust Axes sizes in matplotlib polar plots? [duplicate]

I am starting to play around with creating polar plots in Matplotlib that do NOT encompass an entire circle - i.e. a "wedge" plot - by setting the thetamin and thetamax properties. This is something I was waiting for for a long time, and I am glad they have it done :)
However, I have noticed that the figure location inside the axes seems to change in a strange manner when using this feature; depending on the wedge angular aperture, it can be difficult to fine tune the figure so it looks nice.
Here's an example:
import numpy as np
import matplotlib.pyplot as plt
# get 4 polar axes in a 2x2 grid
fig, axes = plt.subplots(2, 2, subplot_kw={'projection': 'polar'},
                         figsize=(8, 8))
# set facecolor to better display the boundaries
# (as suggested by ImportanceOfBeingErnest)
fig.set_facecolor('paleturquoise')
for i, theta_max in enumerate([2*np.pi, np.pi, 2*np.pi/3, np.pi/3]):
    # define theta vector with varying end point and some data to plot
    theta = np.linspace(0, theta_max, 181)
    data = (1/6)*np.abs(np.sin(3*theta)/np.sin(theta/2))
    # set 'thetamin' and 'thetamax' according to data
    axes[i//2, i%2].set_thetamin(0)
    axes[i//2, i%2].set_thetamax(theta_max*180/np.pi)
    # actually plot the data, fine tune radius limits and add labels
    axes[i//2, i%2].plot(theta, data)
    axes[i//2, i%2].set_ylim([0, 1])
    axes[i//2, i%2].set_xlabel('Magnitude', fontsize=15)
    axes[i//2, i%2].set_ylabel('Angles', fontsize=15)
fig.set_tight_layout(True)
#fig.savefig('fig.png', facecolor='skyblue')
The labels are in awkward locations and over the tick labels, but can be moved closer or further away from the axes by adding an extra labelpad parameter to set_xlabel, set_ylabel commands, so it's not a big issue.
Unfortunately, I have the impression that the plot is adjusted to fit inside the existing axes dimensions, which in turn leads to very awkward white space above and below the half circle plot (which of course is the one I need to use).
It sounds like something that should be reasonably easy to get rid of - I mean, the wedge plots are doing it automatically - but I can't seem to figure out how to do it for the half circle. Can anyone shed some light on this?
EDIT: Apologies, my question was not very clear; I want to create a half circle polar plot, but it seems that using set_thetamin() you end up with large amounts of white space around the image (especially above and below) which I would rather have removed, if possible.
It's the kind of thing that tight_layout() normally takes care of, but it doesn't seem to be doing the trick here. I tried manually changing the figure window size after plotting, but the white space simply scales with the changes. Below is a minimum working example; I can get the xlabel closer to the image if I want to, but the saved image file still contains tons of white space around it.
Does anyone know how to remove this white space?
import numpy as np
import matplotlib.pyplot as plt
# get a half circle polar plot
fig1, ax1 = plt.subplots(1, 1, subplot_kw={'projection': 'polar'})
# set facecolor to better display the boundaries
# (as suggested by ImportanceOfBeingErnest)
fig1.set_facecolor('skyblue')
theta_min = 0
theta_max = np.pi
theta = np.linspace(theta_min, theta_max, 181)
data = (1/6)*np.abs(np.sin(3*theta)/np.sin(theta/2))
# set 'thetamin' and 'thetamax' according to data
ax1.set_thetamin(0)
ax1.set_thetamax(theta_max*180/np.pi)
# actually plot the data, fine tune radius limits and add labels
ax1.plot(theta, data)
ax1.set_ylim([0, 1])
ax1.set_xlabel('Magnitude', fontsize=15)
ax1.set_ylabel('Angles', fontsize=15)
fig1.set_tight_layout(True)
#fig1.savefig('fig1.png', facecolor='skyblue')
EDIT 2: Added background color to figures to better show the boundaries, as suggested in ImportanceOfBeingErnest's answer.
It seems the wedge of the "truncated" polar axes is placed such that it sits in the middle of the original axes. There seem to be some constructs called LockedBBox and _WedgeBbox in the game, which I have never seen before and do not fully understand. Those seem to be created at draw time, such that manipulating them from the outside seems somewhere between hard and impossible.
One hack can be to manipulate the original axes such that the resulting wedge turns up at the desired position. This is not really deterministic, but rather looking for some good values by trial and error.
The parameters to adjust in this case are the figure size (figsize), the padding of the labels (labelpad, as already pointed out in the question) and finally the axes' position (ax.set_position([left, bottom, width, height])).
The result could then look like
import numpy as np
import matplotlib.pyplot as plt
# get a half circle polar plot
fig1, ax1 = plt.subplots(1, 1, figsize=(6,3.4), subplot_kw={'projection': 'polar'})
theta_min = 1.e-9
theta_max = np.pi
theta = np.linspace(theta_min, theta_max, 181)
data = (1/6.)*np.abs(np.sin(3*theta)/np.sin(theta/2.))
# set 'thetamin' and 'thetamax' according to data
ax1.set_thetamin(0)
ax1.set_thetamax(theta_max*180./np.pi)
# actually plot the data, fine tune radius limits and add labels
ax1.plot(theta, data)
ax1.set_ylim([0, 1])
ax1.set_xlabel('Magnitude', fontsize=15, labelpad=-60)
ax1.set_ylabel('Angles', fontsize=15)
ax1.set_position( [0.1, -0.45, 0.8, 2])
plt.show()
Here I've set some color to the background of the figure to better see the boundary.

Setting both axes logarithmic in bar plot matplotlib

I have already binned data to plot a histogram. For this reason I'm using the plt.bar() function. I'd like to set both axes in the plot to a logarithmic scale.
Setting plt.bar(x, y, width=10, color='b', log=True) lets me set the y-axis to log, but I can't set the x-axis to logarithmic that way.
I've tried plt.xscale('log'), but unfortunately this doesn't work right: the x-axis ticks vanish and the bars no longer have equal widths.
I would be grateful for any help.
By default, the bars of a barplot have a width of 0.8. Therefore they appear larger for smaller x values on a logarithmic scale. If instead of specifying a constant width, one uses the distance between the bin edges and supplies this to the width argument, the bars will have the correct width. One would also need to set the align to "edge" for this to work.
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(1)
x = np.logspace(0, 5, num=21)
y = (np.sin(1.e-2*(x[:-1]-20))+3)**10
fig, ax = plt.subplots()
ax.bar(x[:-1], y, width=np.diff(x), log=True,ec="k", align="edge")
ax.set_xscale("log")
plt.show()
I cannot reproduce missing ticklabels for a logarithmic scaling. This may be due to some settings in the code that are not shown in the question or due to the fact that an older matplotlib version is used. The example here works fine with matplotlib 2.0.
If the goal is to have equal width bars, assuming datapoints are not equidistant, then the most proper solution is to set width as
plt.bar(x, y, width=c*np.array(x), color='b', log=True) for a constant c appropriate for the plot. Alignment can be anything.
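To see why a width proportional to x gives equal-looking bars, note that a left-aligned bar spans from x to x + c*x, so its extent on a log axis is log10(x + c*x) - log10(x) = log10(1 + c), which does not depend on x. The sketch below checks this numerically; the data values and c = 0.5 are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1.0, 3.0, 20.0, 150.0, 4000.0])  # made-up, unevenly spaced positions
y = np.array([5.0, 40.0, 300.0, 20.0, 2.0])    # made-up bar heights
c = 0.5                                        # illustrative constant

widths = c * x
# Extent of each left-aligned bar in decades on the log axis.
decades = np.log10(x + widths) - np.log10(x)

fig, ax = plt.subplots()
ax.bar(x, y, width=widths, align="edge", log=True, color="b")
ax.set_xscale("log")
```

With centered alignment the bar spans x*(1 - c/2) to x*(1 + c/2), which is also a constant ratio, so the bars still look equally wide as long as c is below 2.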
I know this is a very old question and you may have solved it already, but I came to this post because I had a similar problem on the y axis, and I managed to solve it just by using ax.set_ylim(df['my data'].min()+100, df['my data'].max()+100). My y axis holds some sensitive information that I thought would be best shown on a log scale, but when I set the log scale I couldn't see the numbers properly (like the x axis in this post), so I dropped the idea of using log and used the min and max arguments instead. That scales my graph much like a log scale would. I am still looking for a way that doesn't need the +100 offsets in set_ylim.
While this does not actually use pyplot.bar, I think this method could be helpful in achieving what the OP is trying to do. I found this to be easier than trying to calibrate the width as a function of the log-scale, though it's more steps. Create a line collection whose width is independent of the chart scale.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.collections as coll
#Generate data and sort into bins
a = np.random.logseries(0.5, 1000)
hist, bin_edges = np.histogram(a, bins=20, density=False)
x = bin_edges[:-1] # remove the top-end from bin_edges to match dimensions of hist
lines = []
for i in range(len(x)):
    pair=[(x[i],0), (x[i], hist[i])]
    lines.append(pair)
linecoll = coll.LineCollection(lines, linewidths=10, linestyles='solid')
fig, ax = plt.subplots()
ax.add_collection(linecoll)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlim(min(x)/10,max(x)*10)
ax.set_ylim(0.1,1.1*max(hist)) #since this is an unweighted histogram, the logy doesn't make much sense.
Resulting plot - no frills
One drawback is that the "bars" will be centered, but this could be changed by offsetting the x-values by half of the linewidth value ... I think it would be
x_new = x + (linewidth/2)*10**round(np.log10(x),0).

Tricontourf plot with a hole in the middle.

I have some data defined on a regular Cartesian grids. I'd like to show only some of them with a condition based on the radius from the center. This will effectively create a ring-like structure with a hole in the center. As a result, I cannot use imshow. tricontourf or tripcolor are what I found to deal with it. My code looks something like this:
R = np.sqrt(x**2+y**2)
flag = (R<150)*(R>10)
plt.tricontourf(x[flag], y[flag], data[flag], 100)
where x and y are the mesh grids on which data is defined. The problem here is that both tricontourf and tripcolor try to fill the middle of the ring, which I would like to leave blank.
To be more specific, the plot on the left is similar to what I want, but I can only get the one on the right with the piece of code shown above.
The following shows how to mask some parts of the plot based on a condition. Using imshow is perfectly possible, and that's what the script below is doing.
The idea is to set all the unwanted parts of the plots to nan. To make the nan values disappear, we can set their alpha to 0, basically making the plot transparent at those points.
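import copy
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-150, 150, 300)
y = np.linspace(-150, 150, 300)
X,Y = np.meshgrid(x,y)
data = np.exp(-(X/80.)**2-(Y/80.)**2)
R = np.sqrt(X**2+Y**2)
flag =np.logical_not( (R<110) * (R>10) )
data[flag] = np.nan
# copy the colormap so the registered jet colormap is not modified
palette = copy.copy(plt.cm.jet)
palette.set_bad(alpha = 0.0)
# pass the palette explicitly, otherwise imshow uses the default colormap
im = plt.imshow(data, cmap=palette)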
plt.colorbar(im)
plt.savefig(__file__+".png")
plt.show()
Just to add that also tricontourf can do what you're asking about. This example from the matplotlib gallery shows exactly what you're looking for, while this question on SO deals with a similar issue in a more comprehensive way.
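A minimal sketch of the tricontourf route, assuming the approach of masking triangles of a Triangulation by the radius of their centroid: the grid spacing, the test function and the number of levels are illustrative choices, while the radii 10 and 150 come from the question. The Delaunay triangulation fills the convex hull of the points, so the triangles bridging the hole have to be masked explicitly with set_mask.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.tri as mtri
import numpy as np

# Illustrative data on a regular grid (spacing of 5 units).
xs = np.linspace(-150, 150, 61)
X, Y = np.meshgrid(xs, xs)
R = np.hypot(X, Y)
data = np.exp(-(X / 80.0) ** 2 - (Y / 80.0) ** 2)

# Keep only the ring, as in the question.
keep = (R < 150) & (R > 10)
x, y, z = X[keep], Y[keep], data[keep]

# Triangulate the ring points; this also creates triangles across the hole.
triang = mtri.Triangulation(x, y)

# Radius of each triangle's centroid; triangles centred inside the hole are masked.
centroid_r = np.hypot(x[triang.triangles].mean(axis=1),
                      y[triang.triangles].mean(axis=1))
hole_mask = centroid_r < 10
triang.set_mask(hole_mask)

fig, ax = plt.subplots()
cs = ax.tricontourf(triang, z, 20)
ax.set_aspect("equal")
fig.colorbar(cs)
```

The centroid test is only an approximation of the hole boundary; a finer grid makes the masked region follow the circle more closely.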
Try creating fake data points in the inner hole and set them to np.nan or np.inf. Alternately, you could set them to a high value (in your case, say simply 1) and then pass limits to the colour scale so that these high regions are not plotted.

Contour labels in Python

I would like to plot a series of contour lines, but the spacing between the label and the line increases higher up the page. I've attached an example plot. I want no decimal points, hence I used fmt, but this seems to change the spacing at different points (ideally I want around half a centimetre gap between the contour line break and the writing).
As an aside, I also tried to use the manual locations so it'd plot each label at a certain place, but as there are two separate contour lines with the same value I'm not sure if this is possible. Thanks!
Here is my code;
from netCDF4 import Dataset
import numpy as np
from matplotlib import pyplot as plt
#############################
# #
# Parameter Setup #
# #
#############################
myfile = '/home/ubuntu/Documents/Control/puma/run/Control.nc' #Control U
myfile2 = '/home/ubuntu/Documents/Control_Trop_40K/puma/run/ControlTrop.nc' #Perturbed U
Title = 'U'
Units = '!U'
Variable = 'ua'
#############################
#############################
Import = Dataset(myfile, mode='r')
Import2 = Dataset(myfile2, mode='r')
latt = Import.variables['lat'][:]
level = Import.variables['lev'][:]
UControl = Import.variables[Variable][:]
#UPerturbed = Import2.variables[Variable][:]
#UChange = UPerturbed - UControl
#UChange = np.mean(UChange[:,:,:,0], axis=0)
UControl = np.mean(UControl[:,:,:,0], axis=0)
Contourrange = [10]
CS=plt.contour(latt,level,UControl,Contourrange, colors='k')
plt.clabel(CS, fontsize=10, inline=1,fmt = '%1.0f',ticks=Contourrange)
plt.gca().invert_yaxis()
plt.yscale('log', nonposy='clip')
plt.xticks(np.round((np.arange(-90, 91, 30)), 2))
plt.xlim(xmin=-90)
plt.yticks([900,800,700,600,500,400,300,200,100,50])
plt.gca().set_yticklabels([900,800,700,600,500,400,300,200,100,50])
plt.xlabel('Latitude')
plt.ylabel('Pressure (hPa)')
plt.title(Title)
plt.show()
The pictures are:
You are manually defining the values at which ticks should appear:
plt.yticks([900,800,700,600,500,400,300,200,100,50])
Since you also have chosen a logarithmic scale, and since the increment you specified is constant, matplotlib needs to vary the space between ticks to comply with both your requirements.
If you absolutely do not want this behavior, either get rid of the log option, or let matplotlib automatically set ticks for you. Alternatively, you could provide the plt.yticks function with an array of exponentially increasing/decreasing numbers. Like this:
plt.yticks([10**3,10**2,10**1])
You will have to make sure you are using the correct base (I simply assumed a base 10), and you will have to find suitable numbers to span your range of values.
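A short sketch of generating such ticks, assuming base 10 and a range of 10 to 1000 (both chosen only for illustration). Note that Python's exponent operator is **; the expression 10 ^ 3 is a bitwise XOR and evaluates to 9. On a log axis, ticks that are equally spaced in the exponent are equally spaced on the page.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Three ticks from 10**3 down to 10**1, evenly spaced in the exponent.
ticks = np.logspace(3, 1, num=3)

# Spacing between neighbouring ticks, measured in decades.
spacing = np.diff(np.log10(ticks))

fig, ax = plt.subplots()
ax.set_yscale("log")
ax.set_ylim(1000, 10)   # descending limits give an inverted pressure-style axis
ax.set_yticks(ticks)
ax.set_yticklabels([f"{t:.0f}" for t in ticks])
```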

Speeding up matplotlib scatter plots

I'm trying to make an interactive program which primarily uses matplotlib to make scatter plots of rather a lot of points (10k-100k or so). Right now it works, but changes take too long to render. Small numbers of points are ok, but once the number rises things get frustrating in a hurry. So, I'm working on ways to speed up scatter, but I'm not having much luck.
There's the obvious way to do things (the way it's implemented now)
(I realize the plot redraws without updating. I didn't want to alter the fps result with large calls to random).
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import time
X = np.random.randn(10000) #x pos
Y = np.random.randn(10000) #y pos
C = np.random.random(10000) #will be color
S = (1+np.random.randn(10000)**2)*3 #size
#build the colors from a color map
colors = mpl.cm.jet(C)
#there are easier ways to do static alpha, but this allows
#per point alpha later on.
colors[:,3] = 0.1
fig, ax = plt.subplots()
fig.show()
background = fig.canvas.copy_from_bbox(ax.bbox)
#this makes the base collection
coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None',marker='D')
fig.canvas.draw()
sTime = time.time()
for i in range(10):
    print(i)
    #don't change anything, but redraw the plot
    ax.cla()
    coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None',marker='D')
    fig.canvas.draw()
#note: elapsed/10 is seconds per frame, not frames per second
print('%2.1f FPS'%( (time.time()-sTime)/10 ))
Which gives a speedy 0.7 fps
Alternatively, I can edit the collection returned by scatter. For that, I can change color and position, but don't know how to change the size of each point. That would I think look something like this
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import time
X = np.random.randn(10000) #x pos
Y = np.random.randn(10000) #y pos
C = np.random.random(10000) #will be color
S = (1+np.random.randn(10000)**2)*3 #size
#build the colors from a color map
colors = mpl.cm.jet(C)
#there are easier ways to do static alpha, but this allows
#per point alpha later on.
colors[:,3] = 0.1
fig, ax = plt.subplots()
fig.show()
background = fig.canvas.copy_from_bbox(ax.bbox)
#this makes the base collection
coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None', marker='D')
fig.canvas.draw()
sTime = time.time()
for i in range(10):
    print(i)
    #don't change anything, but redraw the plot
    coll.set_facecolors(colors)
    coll.set_offsets( np.array([X,Y]).T )
    #for starters lets not change anything!
    fig.canvas.restore_region(background)
    ax.draw_artist(coll)
    fig.canvas.blit(ax.bbox)
#note: elapsed/10 is seconds per frame, not frames per second
print('%2.1f FPS'%( (time.time()-sTime)/10 ))
This results in a slower 0.7 fps. I wanted to try using CircleCollection or RegularPolygonCollection, as this would allow me to change the sizes easily, and I don't care about changing the marker. But, I can't get either to draw so I have no idea if they'd be faster. So, at this point I'm looking for ideas.
I've been through this a few times trying to speed up scatter plots with large numbers of points, variously trying:
Different marker types
Limiting colours
Cutting down the dataset
Using a heatmap / grid instead of a scatter plot
And none of these things worked. Matplotlib is just not very performant when it comes to scatter plots. My only recommendation is to use a different plotting library, though I haven't personally found one that was suitable. I know this doesn't help much, but it may save you some hours of fruitless tinkering.
We are actively working on performance for large matplotlib scatter plots.
I'd encourage you to get involved in the conversation (http://matplotlib.1069221.n5.nabble.com/mpl-1-2-1-Speedup-code-by-removing-startswith-calls-and-some-for-loops-td41767.html) and, even better, test out the pull request that has been submitted to make life much better for a similar case (https://github.com/matplotlib/matplotlib/pull/2156).
HTH
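On the part of the question about changing the size of each point on an existing collection: the collection returned by scatter has a set_sizes method, so sizes can be updated in the same blitting loop as colors and offsets. The sketch below is a minimal illustration under assumed settings (1000 points, 3 frames, the Agg backend, and animated=True so the collection is left out of the cached background); it makes no claim about the resulting frame rate.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
n = 1000                                   # illustrative point count
X, Y = rng.standard_normal(n), rng.standard_normal(n)
S = (1 + rng.standard_normal(n) ** 2) * 3  # base sizes, as in the question

fig, ax = plt.subplots()
# animated=True keeps the collection out of the normal draw, so the
# background captured below contains only the empty axes.
coll = ax.scatter(X, Y, s=S, edgecolor="none", marker="D", animated=True)
fig.canvas.draw()
background = fig.canvas.copy_from_bbox(ax.bbox)

for frame in range(3):
    # Update per-point sizes and positions on the existing collection.
    coll.set_sizes(S * (frame + 1))
    coll.set_offsets(np.column_stack([X, Y]))
    # Restore the cached background and redraw only the collection.
    fig.canvas.restore_region(background)
    ax.draw_artist(coll)
    fig.canvas.blit(ax.bbox)
```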
