Plot: too many ticks on the X axis (Python)

The plot has too many ticks on the X axis when it is first loaded (see image01).
If I zoom on the X axis, the plot then renders correctly.
Can you give me some advice on where to look? The Plot constructor parameters seem fine.
date_range = (735599.0, 735745.0)
x = (735610.5, 735647.0, 735647.5, 735648.5, 735669.0, 735699.0, 735701.5, 735702.5, 735709.5, 735725.5, 735728.5, 735735.5, 735736.0)
y = (227891.25361545716, 205090.4880046467, 208352.59317388065, 175462.99296699322, 98209.836461969651, 275063.37219361769, 219456.93600708069, 230731.12613806152, 209043.19805037521, 218297.51486296533, 208036.88967207001, 206311.71988471842, 216036.56824433553)
y0 = 218206.79192
x_after = (735610.5, 735647.0, 735647.5, 735701.5, 735702.5, 735709.5, 735725.5, 735728.5, 735735.5, 735736.0)
y_after = (227891.25361545716, 205090.4880046467, 208352.59317388065, 219456.93600708069, 230731.12613806152, 209043.19805037521, 218297.51486296533, 208036.88967207001, 206311.71988471842, 216036.56824433553)
linex = -39.1175584541
liney = 28993493.5251
ax.plot_date(x, numpy.array(y) / y0, color='r', xdate=True, marker='x')
ax.plot_date(x_after, numpy.array(y_after) / y0, color='r', xdate=True)
ax.set_xlim(date_range)
steps = list(ax.get_xlim())
steps.append(steps[-1] + 2)
steps = [steps[0] - 2] + steps
ax.plot(steps, numpy.array([linex * a + liney for a in steps]) / y0, color='b')
Thank you for your help.
Manuel

If you have so many xtick labels that they all run together on the plot, you can reduce them using pyplot.xticks. The arguments are the positions the labels apply to, the labels themselves, and an optional rotation.
import numpy as np
import matplotlib.pyplot as plt
y = np.arange(10000)
ticks = y - 5000
plt.plot(y)
k = 1000
ys = y[::k]
ys = np.append(ys, y[-1])
labels = ticks[::k]
labels = np.append(labels, ticks[-1])
plt.xticks(ys,labels, rotation='vertical')
plt.show()
plt.close()
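As an alternative to computing tick positions by hand, a locator can cap the number of ticks. The following is a minimal sketch, not part of the original answer; it assumes `matplotlib.ticker.MaxNLocator` suits the data (plain numeric x values), and `nbins=5` is an arbitrary value chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MaxNLocator

y = np.arange(10000)

fig, ax = plt.subplots()
ax.plot(y)

# Ask for at most 5 intervals, i.e. at most 6 ticks inside the view limits.
ax.xaxis.set_major_locator(MaxNLocator(nbins=5))

# Count only the ticks that fall inside the visible range.
lo, hi = ax.get_xlim()
visible_ticks = [t for t in ax.get_xticks() if lo <= t <= hi]
print(len(visible_ticks))
```

Drop the `matplotlib.use("Agg")` line and call `plt.show()` to view the figure interactively.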

I'm not sure I understand exactly what you want to do, but would rotating your xticklabels be sufficient?
# Add this code at the end of your script
# It will rotate the labels contained in your date range
plt.xticks(rotation=70)
If I test your code, I have 7 labels but the rotation argument is changed to 0 (horizontal)
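For the date axis in the question, one way to keep the tick count under control is to place a fixed number of ticks across `date_range` and let a `DateFormatter` label them. This is only a sketch under assumptions: the choice of six ticks and the `%Y-%m-%d` format are illustrative, and which calendar dates these numbers map to depends on Matplotlib's date epoch setting.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

date_range = (735599.0, 735745.0)  # date numbers copied from the question

fig, ax = plt.subplots()
ax.set_xlim(date_range)

# Place exactly six evenly spaced ticks across the range (illustrative choice).
tick_positions = np.linspace(date_range[0], date_range[1], 6)
ax.set_xticks(tick_positions)

# Render the date numbers as calendar dates and tilt the labels.
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()

print(len(ax.get_xticks()))
```

Fixed positions do not adapt when zooming; if that matters, a date locator from `matplotlib.dates` may be a better fit.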

Related

Plotting bars hist and PDF line (via kdeplot)

I'm trying to plot a bar histogram of interest rates and overlay a PDF line on it. I have looked for solutions and found a way with kdeplot.
The result is pretty strange: the kdeplot line is much higher than the histogram bars, and I don't know how to fix it.
After applying kdeplot:
Before applying kdeplot:
Here is the code that I'm using:
df=pd.read_excel('interestrate.xlsx')
k=0.0005
bin_steps = np.arange(start = df['Interest rate Real'].min(), stop = df['Interest rate Real'].max(), step = k)
ax = df['Interest rate Real'].hist(bins = bin_steps, figsize=[10,5])
ax1 = df['Interest rate Real']
vals = ax.get_xticks()
ax.set_xticklabels(['{:,.2%}'.format(x) for x in vals])
ax.set_yticklabels(['{:,.2%}'.format(x) for x in vals])
ax.set_title("PDF for Real Interest Rate")
#sns.kdeplot(ax1)
The following code snippet should set you in the right direction (just insert your data):
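import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st
y = np.random.randn(1000) # your data goes here
plt.hist(y, 50, density=True)  # density=True puts the bars on the same scale as the KDE
mn, mx = plt.xlim()
plt.xlim(mn, mx)  # freeze the limits so the KDE line does not rescale the axis
x = np.linspace(mn, mx, 301)
kde = st.gaussian_kde(y)
plt.plot(x, kde.pdf(x));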
Alternatively with seaborn:
import seaborn as sns
plt.hist(y,50, density=True)
sns.kdeplot(y);
or as simple as (note that distplot has been deprecated since seaborn 0.11 in favour of histplot and displot):
sns.distplot(y)
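A likely reason the KDE line towers over the bars in the question is a units mismatch: the bars show raw counts, while a KDE is a density that integrates to 1. The sketch below uses made-up normally distributed rates rather than the asker's spreadsheet, and shows that `density=True` rescales each bar by 1 / (N * bin width), which can be well above 1 for a narrow bin width such as 0.0005 and a small sample.

```python
import numpy as np

rng = np.random.default_rng(0)
rates = rng.normal(loc=0.02, scale=0.005, size=200)  # hypothetical interest rates

bin_width = 0.0005  # same step as in the question
edges = np.arange(rates.min(), rates.max() + bin_width, bin_width)

counts, _ = np.histogram(rates, bins=edges)
density, _ = np.histogram(rates, bins=edges, density=True)

# A density histogram has total area 1, like a KDE curve.
area = float(np.sum(density * np.diff(edges)))

# Ratio between density bars and count bars, taken over non-empty bins.
nonempty = counts > 0
factor = density[nonempty] / counts[nonempty]

print(area)
print(factor[0], 1 / (rates.size * bin_width))  # 1 / (200 * 0.0005) = 10
```

So the fix is to draw the histogram as a density, as in the snippets above, rather than to rescale the KDE.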

matplotlib: change axis ticks of ndim histogram plotted with seaborn.heatmap

Motivation:
I'm trying to visualize a dataset of many n-dimensional vectors (let's say I have 10k vectors with n=300 dimensions). What I'd like to do is calculate a histogram for each of the n dimensions and plot it as a single line in a bins*n heatmap.
So far I've got this:
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline
import seaborn as sns
# sample data:
vectors = np.random.randn(10000, 300) + np.random.randn(300)
def ndhist(vectors, bins=500):
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, bins = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig = plt.figure(figsize=(16, 9))
    sns.heatmap(hists)
    axes = fig.gca()
    axes.set(ylabel='dimensions', xlabel='values')
    print(dims)
    print(limits)
ndhist(vectors)
This generates the following output:
300
(-6.538069472429366, 6.52159540162285)
Problem / Question:
How can I change the axis ticks?
For the y-axis, I'd like to simply change this back to matplotlib's default, so it picks nice ticks like 0, 50, 100, ..., 250 (bonus points for 299 or 300).
For the x-axis, I'd like to convert the shown bin indices into the bin (left) boundaries; then, as above, I'd like to change this back to matplotlib's default selection of some "nice" ticks like -5, -2.5, 0, 2.5, 5 (bonus points for also including the actual limits -6.538, 6.522).
Own solution attempts:
I've tried many things like the following already:
def ndhist_axlabels(vectors, bins=500):
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, bins = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig = plt.figure(figsize=(16, 9))
    sns.heatmap(hists, yticklabels=False, xticklabels=False)
    axes = fig.gca()
    axes.set(ylabel='dimensions', xlabel='values')
    #plt.xticks(np.linspace(*limits, len(bins)), bins)
    plt.xticks(range(len(bins)), bins)
    axes.xaxis.set_major_locator(matplotlib.ticker.AutoLocator())
    plt.yticks(range(dims+1), range(dims+1))
    axes.yaxis.set_major_locator(matplotlib.ticker.AutoLocator())
    print(dims)
    print(limits)
ndhist_axlabels(vectors)
As you can see however, the axes labels are pretty wrong. My guess is that the extent or limits are somewhere stored in the original axis, but lost when switching back to the AutoLocator. Would greatly appreciate a nudge in the right direction.
Maybe you're overthinking this. To plot image data, one can use imshow and get the ticking and formatting for free.
import numpy as np
from matplotlib import pyplot as plt
# sample data:
vectors = np.random.randn(10000, 300) + np.random.randn(300)
def ndhist(vectors, bins=500):
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, _ = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig, ax = plt.subplots(figsize=(16, 9))
    extent = [limits[0], limits[-1], hists.shape[0]-0.5, -0.5]
    im = ax.imshow(hists, extent=extent, aspect="auto")
    fig.colorbar(im)
    ax.set(ylabel='dimensions', xlabel='values')
ndhist(vectors)
plt.show()
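A related option, not mentioned in the answers here, is `pcolormesh`, which accepts the bin edges directly so the x-axis is in data units and gets Matplotlib's default ticks. A minimal sketch follows, using a smaller made-up dataset than the one in the question to keep it quick; the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.standard_normal((1000, 30)) + rng.standard_normal(30)

n_bins = 50
limits = (vectors.min(), vectors.max())
edges = np.linspace(limits[0], limits[1], n_bins + 1)

# One histogram per dimension, all sharing the same bin edges.
hists = np.array([np.histogram(vectors[:, d], bins=edges)[0]
                  for d in range(vectors.shape[1])])

fig, ax = plt.subplots(figsize=(8, 4.5))
# Row edges at -0.5, 0.5, ... so that integer ticks sit at row centres.
row_edges = np.arange(hists.shape[0] + 1) - 0.5
mesh = ax.pcolormesh(edges, row_edges, hists)
fig.colorbar(mesh, ax=ax)
ax.set(xlabel="values", ylabel="dimensions")
ax.invert_yaxis()  # dimension 0 at the top, as in a heatmap
```

Unlike `imshow` with `extent`, this also copes with unevenly spaced bin edges, at the cost of slower drawing for large arrays.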
If you read the docs, you will notice that the xticklabels/yticklabels arguments are overloaded, such that if you provide an integer instead of a string, it will interpret the argument as xtickevery/ytickevery and place ticks only at the corresponding locations. So in your case, seaborn.heatmap(hists, yticklabels=50) fixes your y-axis problem.
Regarding your xtick labels, I would simply provide them explicitly:
xtickevery = 50
xticklabels = ['{:.1f}'.format(b) if ii%xtickevery == 0 else '' for ii, b in enumerate(bins)]
sns.heatmap(hists, yticklabels=50, xticklabels=xticklabels)
Finally came up with a version that works for me for now and uses AutoLocator based on some simple linear mapping...
import time
def ndhist(vectors, bins=1000, title=None):
    t = time.time()
    limits = (vectors.min(), vectors.max())
    hists = []
    dims = vectors.shape[1]
    for dim in range(dims):
        h, bs = np.histogram(vectors[:, dim], bins=bins, range=limits)
        hists.append(h)
    hists = np.array(hists)
    fig = plt.figure(figsize=(16, 12))
    sns.heatmap(
        hists,
        yticklabels=50,
        xticklabels=False
    )
    axes = fig.gca()
    axes.set(
        ylabel=f'dimensions ({dims} total)',
        xlabel=f'values (min: {limits[0]:.4g}, max: {limits[1]:.4g}, {bins} bins)',
        title=title,
    )
    def val_to_idx(val):
        # calc (linearly interpolated) index loc for given val
        return bins*(val - limits[0])/(limits[1] - limits[0])
    xlabels = [round(l, 3) for l in limits] + [
        v for v in matplotlib.ticker.AutoLocator().tick_values(*limits)[1:-1]
    ]
    # drop auto-gen labels that might be too close to limits
    d = (xlabels[4] - xlabels[3])/3
    if (xlabels[1] - xlabels[-1]) < d:
        del xlabels[-1]
    if (xlabels[2] - xlabels[0]) < d:
        del xlabels[2]
    xticks = [val_to_idx(val) for val in xlabels]
    axes.set_xticks(xticks)
    axes.set_xticklabels([f'{l:.4g}' for l in xlabels])
    plt.show()
    print(f'histogram generated in {time.time() - t:.2f}s')
ndhist(np.random.randn(100000, 300), bins=1000, title='randn')
Thanks to Paul for his answer giving me the idea.
If there's an easier or more elegant solution, I'd still be interested though.
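The core of the approach above is a linear map from data values to heatmap column positions. A standalone sketch of just that step follows; the function name `value_to_index` is made up for illustration, and the example limits are the ones printed in the question.

```python
import numpy as np
from matplotlib.ticker import AutoLocator

def value_to_index(value, limits, n_bins):
    """Linearly map a data value to a (fractional) heatmap column position."""
    lo, hi = limits
    return n_bins * (value - lo) / (hi - lo)

limits = (-6.538069472429366, 6.52159540162285)
n_bins = 500

# "Nice" tick values as chosen by Matplotlib, keeping only those inside the limits.
candidates = AutoLocator().tick_values(*limits)
tick_values = [float(v) for v in candidates if limits[0] <= v <= limits[1]]

# Column positions for axes.set_xticks(...); the labels are the data values.
tick_positions = [value_to_index(v, limits, n_bins) for v in tick_values]
tick_labels = [f"{v:g}" for v in tick_values]

for position, label in zip(tick_positions, tick_labels):
    print(f"{label:>6} -> column {position:.1f}")
```

The positions and labels can then be passed to `axes.set_xticks` and `axes.set_xticklabels` on the heatmap axes, as in the function above.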

Colormap with colored quiver

I am plotting a map with arrows on top of it. These arrows represent wind directions, average wind speed (per direction) and the occurrence (per direction).
The direction is indicated by the direction of the arrow. The length of the arrow indicates the average wind speed in that direction. The color of the arrow indicates the occurrence of winds in that direction.
This all works fine with the script below:
windData = pd.read_csv(src+'.txt', sep='\t', names=['lat', 'lon', 'wind_dir_start', 'wind_dir_end', 'total_num_data_points','num_data_points', 'avg_windspeed']).dropna()
# plot map
m = Basemap(llcrnrlon=minLon, llcrnrlat=minLat, urcrnrlon=maxLon, urcrnrlat=maxLat, resolution='i')
Left, Bottom = m(minLon, minLat)
Right, Top = m(maxLon, maxLat)
# get x y
x, y = m(windData['lon'], windData['lat'])
# angles
angleStart = -windData['wind_start']+90
angleStart[angleStart<0] = np.radians(angleStart[angleStart<0]+360.)
angleEnd = -windData['wind_end']+90
angleEnd[angleEnd<0] = np.radians(angleEnd[angleEnd<0]+360.)
angle = angleStart + math.radians(binSize/2.)
xux = np.cos(angle) * windData['avg_windspeed']
yuy = np.sin(angle) * windData['avg_windspeed']
# occurence
occurence = (windData['num_data_points']/windData['total_num_data_points'])
xi = np.linspace(minLon, maxLon, 300)
yi = np.linspace(minLat, maxLat, 300)
# plotting
## xux and yuy are used negatively because they are measured as "coming from" and displayed as "going to"
# To make things more readable I left a threshold for the occurence out
# I usually plot x, y, xux, yuy and the colors as var[occurence>threshold]
Q = m.quiver(x, y, -xux, -yuy, scale=75, zorder=6, color=cm.jet, width=0.0003*Width, cmap=cm.jet)
qk = plt.quiverkey(Q, 0.5, 0.92, 3, r'$3 \frac{m}{s}$', labelpos='S', fontproperties={'weight': 'bold'})
m.scatter(x, y, c='k', s=20*np.ones(len(x)), zorder=10, vmin=4.5, vmax=39.)
This plot shows the arrows well, but now I want to add a colorbar next to the plot that indicates the percentage of occurrence. How would I do this?
OK
Usual imports, plus import matplotlib
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
Fake the data to be plotted (tx for the MCVE)
NP = 10
np.random.seed(1)
x = np.random.random(NP)
y = np.random.random(NP)
angle = 1.07+np.random.random(NP) # NE to NW
velocity = 1.50+np.random.random(NP)
o = np.random.random(NP)
occurrence = o/np.sum(o)
dx = np.cos(angle)*velocity
dy = np.sin(angle)*velocity
Create a mappable so that Matplotlib has no reason to complain "RuntimeError: No mappable was found to use for colorbar creation."
norm = matplotlib.colors.Normalize()
norm.autoscale(occurrence)
cm = matplotlib.cm.copper
sm = matplotlib.cm.ScalarMappable(cmap=cm, norm=norm)
sm.set_array([])
and plot the data
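plt.quiver(x, y, dx, dy, color=cm(norm(occurrence)))  # colour by the same values the norm was scaled to
plt.colorbar(sm, ax=plt.gca())  # recent Matplotlib needs the target axes when the mappable is not attached to one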
plt.show()
References: "A logarithmic colorbar in matplotlib scatter plot", "Drawing a colorbar aside a line plot, using Matplotlib", and "Different colours for arrows in quiver plot".
P.S. In recent Matplotlib releases (certainly in 3.x) the sm.set_array incantation is no longer necessary.
Do you want the colorbar to show the different wind speeds? If so, it might be sufficient to place plt.colorbar() between the lines Q = m.quiver(...) and qk = ....
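Another route, sketched here with made-up data, is to pass the per-arrow values to `quiver` as its optional fifth positional argument. The returned `Quiver` object is then itself a mappable, so it can be handed to `colorbar` directly without building a separate `ScalarMappable`. The variable names and the colormap are illustrative, and the sketch uses plain Matplotlib axes rather than Basemap.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
n_arrows = 10
x = rng.random(n_arrows)
y = rng.random(n_arrows)
angle = 1.07 + rng.random(n_arrows)   # radians, made-up directions
speed = 1.5 + rng.random(n_arrows)    # made-up wind speeds
weights = rng.random(n_arrows)
occurrence = weights / weights.sum()  # fractions that sum to 1

fig, ax = plt.subplots()
# The fifth positional argument is the array that gets colour-mapped.
quiv = ax.quiver(x, y, np.cos(angle) * speed, np.sin(angle) * speed,
                 occurrence, cmap="copper")
cbar = fig.colorbar(quiv, ax=ax)
cbar.set_label("occurrence (fraction)")
```

With this form the arrow colours and the colorbar share one normalisation, so they cannot drift apart.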

Linear Regression: Extending line past data and adding a legend

I have the following code:
import math
import numpy as np
import pylab as plt1
from matplotlib import pyplot as plt
uH2 = 1.90866638
uHe = 3.60187307
eH2 = 213.38
eHe = 31.96
R = float(uH2*eH2)/(uHe*eHe)
C_Values = []
Delta = []
kHeST = []
J_f21 = []
data = np.genfromtxt("Lamda_HeHCL.txt", unpack=True);
J_i1=data[1];
J_f1=data[2];
kHe=data[7]
data = np.genfromtxt("Basecol_Basic_New_1.txt", unpack=True);
J_i2=data[0];
J_f2=data[1];
kH2=data[5]
print kHe
print kH2
kHe = map(float, kHe)
kH2 = map(float, kH2)
kHe = np.array(kHe)
kH2= np.array(kH2)
g = len(kH2)
for n in range(0,g):
    if J_f2[n] == 1:
        Jf21 = J_f2[n]
        J_f21.append(Jf21)
        ratio = kHe[n]/kH2[n]
        C = (((math.log(float(kH2[n]),10)))-(math.log(float(kHe[n]),10)))/math.log(R,10)
        C_Values.append(C)
        St = abs(J_f1[n] - J_i1[n])
        Delta.append(St)
print C_Values
print Delta
print J_f21
fig, ax = plt.subplots()
ax.scatter(Delta,C_Values)
for i, txt in enumerate(J_f21):
    ax.annotate(txt, (Delta[i],C_Values[i]))
plt.plot(np.unique(Delta), np.poly1d(np.polyfit(Delta, C_Values, 1))(np.unique(Delta)))
plt.plot(Delta, C_Values)
fit = np.polyfit(Delta,C_Values,1)
fit_fn = np.poly1d(fit)
# fit_fn is now a function which takes in x and returns an estimate for y
plt.scatter(Delta,C_Values, Delta, fit_fn(Delta))
plt.xlim(0, 12)
plt.ylim(-3, 3)
In this code, I am trying to plot a linear regression that extends past the data and touches the x-axis. I am also trying to add a legend to the plot that shows the slope of the plot. Using the code, I was able to plot this graph.
Here is some trash data I have been using to try and extend the line and add a legend to my code.
x =[5,7,9,15,20]
y =[10,9,8,7,6]
I would also like it to be a scatter except for the linear regression line.
Given that you don't provide the data you're loading from files I was unable to test this, but off the top of my head:
To extend the line past the plot, you could turn this line
plt.plot(np.unique(Delta), np.poly1d(np.polyfit(Delta, C_Values, 1))(np.unique(Delta)))
Into something like
x = np.linspace(0, 12, 50) # both 0 and 12 are from visually inspecting the plot
plt.plot(x, np.poly1d(np.polyfit(Delta, C_Values, 1))(x))
But if you want the line extended to the x-axis,
polynomial = np.polyfit(Delta, C_Values, 1)
x = np.linspace(0, *np.roots(polynomial))
plt.plot(x, np.poly1d(polynomial)(x))
As for the scatter plot thing, it seems to me you could just remove this line:
plt.plot(Delta, C_Values)
Oh right, as for the legend, add a label to the plots you make, like this:
plt.plot(x, np.poly1d(polynomial)(x), label='Linear regression')
and add a call to plt.legend() just before plt.show().
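Putting the pieces of this answer together with the sample data from the question gives something like the sketch below. It has not been run against the asker's real files; the label text and the number of points on the line are arbitrary choices, and it assumes the fitted slope is non-zero so that the line actually crosses the x-axis.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Sample data from the question.
x = np.array([5, 7, 9, 15, 20])
y = np.array([10, 9, 8, 7, 6])

# Degree-1 fit: polyfit returns the slope first, then the intercept.
slope, intercept = np.polyfit(x, y, 1)
x_cross = -intercept / slope  # where the fitted line meets y = 0

fig, ax = plt.subplots()
ax.scatter(x, y, label="data")

# Draw the fit from x = 0 out to the x-axis crossing.
x_line = np.linspace(0, x_cross, 50)
ax.plot(x_line, slope * x_line + intercept,
        label=f"linear fit, slope = {slope:.3f}")

ax.axhline(0, color="gray", linewidth=0.8)  # make the x-axis crossing visible
ax.legend()
```

Here the slope is shown through the legend label; an `ax.annotate` call next to the line would work as well.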

Trying to set y axis labels and ticks aligned to 2D faces

This is my plot:
If I were to draw your attention to the axis labelled 'B' you'll see that everything is not as it should be.
The plot was produced using this:
def newPoly3D(self):
    from matplotlib.cm import autumn
    # This passes a pandas dataframe of shape (data on rows x 4 columns)
    df = self.loadData()
    fig = plt.figure(figsize=(10,10))
    ax = fig.gca(projection='3d')
    vels = [1.42,1.11,0.81,0.50]
    which_joints = df.columns
    L = len(which_joints)
    dmin,dmax = df.min().min(),df.max().max()
    dix = df.index.values
    offset=-5
    for i,j in enumerate(which_joints):
        ax.add_collection3d(plt.fill_between(dix,df[j],
                                             dmin,
                                             lw=1.5,
                                             alpha=0.3/float(i+1.),
                                             facecolor=autumn(i/float(L))),
                            zs=vels[i],
                            zdir='y')
    ax.grid(False)
    ax.set_xlabel('A')
    ax.set_xlim([0,df.index[-1]])
    ax.set_xticks([])
    ax.xaxis.set_ticklabels([])
    ax.set_axis_off
    ax.set_ylabel('B')
    ax.set_ylim([0.4, max(vels)+0.075])
    ax.set_yticks(vels)
    ax.tick_params(direction='out', pad=10)
    ax.set_zlabel('C')
    ax.set_zlim([dmin,dmax])
    ax.xaxis.labelpad = -10
    ax.yaxis.labelpad = 15
    ax.zaxis.labelpad = 15
    # Note the inversion of the axis
    plt.gca().invert_yaxis()
First, I want to align the ticks on the y-axis (labelled B) with each coloured face. As you can see, they are currently offset slightly downwards.
Second, I want to align the y-axis tick labels with those ticks. As you can see, they are currently offset a long way downwards, and I do not know why.
EDIT:
Here is some example data; each column represents one coloured face on the above plot.
-13.216256 -7.851065 -9.965357 -25.502654
-13.216253 -7.851063 -9.965355 -25.502653
-13.216247 -7.851060 -9.965350 -25.502651
-13.216236 -7.851052 -9.965342 -25.502647
-13.216214 -7.851038 -9.965324 -25.502639
-13.216169 -7.851008 -9.965289 -25.502623
-13.216079 -7.850949 -9.965219 -25.502592
-13.215900 -7.850830 -9.965078 -25.502529
Here we are again, with a simpler plot, reproduced with this data:
k = 10
df = pd.DataFrame(np.array([range(k),
[x + 1 for x in range(k)],
[x + 4 for x in range(k)],
[x + 9 for x in range(k)]]).T,columns=list('abcd'))
If you want to try this with the above function, comment out the df line in the function and change its signature to def newPoly3D(df): so that you can pass in the test df above.
