Matplotlib not showing histogram correctly when saving figure - python

Using matplotlib, I am plotting two histograms in one figure. The goal is to add them to a LaTeX document later. I am interested in the difference between the two, so I make them semi-transparent and plot them on top of each other. In Spyder, when I plot inline, the image looks fine (see the wanted plot).
When I export the image as a PNG using plt.savefig(), the image looks like this. However, a PNG does not work well in LaTeX documents because the scaling gets ruined. When I export it as a PDF file instead, the bars of the histogram seem to overlap, which makes them look as if they have edges (see the ugly plot).
I think the problem is caused by the vector format: when zooming in and out of the PDF, the overlap changes. When zoomed in completely, it looks identical to the PNG; when zoomed out, the overlap becomes much larger. I would be very grateful if anyone knew the solution to this.
Things I have tried already:
changing linewidth/edgecolor
changing the matplotlibrc file
changing the distance of the bins using rwidth
Code I am using:
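import numpy as np
import matplotlib.pyplot as plt
# prediction, hedging_error and I are defined elsewhere
binwidth = (np.max(prediction) - np.min(prediction)) / (2*I**(1/3))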
kwargs = dict(alpha=0.5, bins=np.arange(min(prediction), max(prediction) + binwidth, binwidth))
fig_path = '***.pdf'
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.hist(prediction.flatten(), **kwargs, label = 'NN')
ax.hist(hedging_error, **kwargs, label = 'BS')
ax.set_xlim((-3,3))
ax.set_xlabel('Hedging error')
ax.set_ylabel('Count')
ax.legend()
fig.savefig(fig_path)

To remove the bin edges, use plt.hist(..., histtype='stepfilled'), which draws each histogram as a single filled outline instead of one rectangle per bin:
https://matplotlib.org/stable/gallery/statistics/histogram_histtypes.html
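A minimal sketch of the histtype='stepfilled' suggestion. It assumes synthetic normal samples in place of prediction and hedging_error; the names nn_error, bs_error, and out_path are placeholders and do not come from the question. With the default bar type, every bin is its own semi-transparent rectangle, and PDF viewers tend to render the shared borders as seams. With stepfilled, each histogram is one polygon, so there are no internal borders to render.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Synthetic stand-ins for the two samples (assumption, not the asker's data)
nn_error = rng.normal(0.0, 0.8, 5000)
bs_error = rng.normal(0.2, 1.0, 5000)

# Shared bins so the two histograms are directly comparable
bins = np.linspace(-3, 3, 41)
kwargs = dict(alpha=0.5, bins=bins, histtype="stepfilled")

fig, ax = plt.subplots()
_, _, nn_patches = ax.hist(nn_error, label="NN", **kwargs)
_, _, bs_patches = ax.hist(bs_error, label="BS", **kwargs)
ax.set_xlim(-3, 3)
ax.set_xlabel("Hedging error")
ax.set_ylabel("Count")
ax.legend()

# Placeholder output location; a vector format keeps the plot scalable in LaTeX
out_path = os.path.join(tempfile.mkdtemp(), "hist_stepfilled.pdf")
fig.savefig(out_path)
plt.close(fig)
```

Each call to ax.hist here returns a list with a single Polygon rather than a container of bars, which is what removes the per-bin edges.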

Related

How to save a pyplot figure in its maximized state

I am trying to save a pyplot figure in its maximized form as displayed when I call plt.show() because maximizing the graph correctly displays the data, while a smaller 'windowed' version of the plot that is currently getting saved has the data incorrectly shifted/formatted.
Current code:
mng = plt.get_current_fig_manager()
mng.window.showMaximized()
plt.savefig(path + '.png', dpi=fig.dpi)
plt.show()
I use the current figure manager to force plt.show() to display the figure maximized, which it does correctly, but plt.savefig() still saves the figure in the smaller format.
I am looking into ways to get the dimensions of the maximized window in inches and then pass them to savefig(), but I was wondering if there is a better approach?
Try setting the figure size before plotting, with plt.figure(figsize=(width, height)). See the example below:
plt.figure(figsize=(10, 6))  # width and height in inches
plt.plot(...)
...
plt.savefig(...)
That way, savefig will use the defined dimensions.
Another solution is to get the current figure, set its size in inches, and then save the figure object instead of calling plt.savefig:
fig = matplotlib.pyplot.gcf()
fig.set_size_inches(18.5, 10.5)
fig.savefig('test2png.png', dpi=100)
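A short sketch of how the figure size and dpi determine the saved image size, assuming a 10 x 6 inch figure and dpi=100 (both values are illustrative). The PNG is written to an in-memory buffer only to keep the example self-contained.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Figure size is given in inches: (width, height)
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot([0, 1, 2], [0, 1, 4])

# Saved pixel size = inches * dpi, independent of any on-screen window size
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
plt.close(fig)

buf.seek(0)
height_px, width_px = plt.imread(buf, format="png").shape[:2]
print(width_px, height_px)  # 1000 600
```

To approximate a maximized window, pick the figsize and dpi whose product matches the screen resolution you care about.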

How to read a saved image and place it in coordinates without any distortions?

I can't get past what is probably a very simple obstacle. First, I do some spatial operations with shapefiles, plot the results, and save the image:
# read different shapefiles, overlay them, sjoin them
...
# plotting results:
fig, ax = plt.subplots(figsize=[10, 10])
ax.set_xlim(left=9686238.14, right=9727068.02)
ax.set_ylim(bottom=7070076.66, top=7152463.12)
# various plotting calls, like objects.plot(ax=ax, column = 'NAME', cmap='Pastel2', k=6, legend=False) and many others
plt.axis('equal')
plt.savefig('back11.png', dpi=300)
plt.close()
This gives me a nice picture, back11.png:
Second, I read that picture back and, using the same coordinates, expect to get an absolutely identical one, map11.png:
fig, ax = plt.subplots(figsize=[10, 10])
ax.set_xlim(left=9686238.14, right=9727068.02)
ax.set_ylim(bottom=7070076.66, top=7152463.12)
back = plt.imread('back11.png')
ax.imshow(back, extent=[9686238.14, 9727068.02, 7070076.66, 7152463.12])
plt.axis('equal')
plt.savefig('map11.png', dpi=300)
plt.close()
But what I actually get is something else (map11.png):
What is the origin of this strange mismatch?
When matplotlib shows an image with plt.imshow, it automatically adds axes and white space around it, regardless of the image content. Your image happens to be another plot, which already contains axes and white space of its own. To solve that problem, use
plt.subplots_adjust(0, 0, 1, 1)
plt.axis('off')
which should output nothing but the image.
On the other hand, you have to specify plt.figure(figsize=xxx, dpi=xxx) correctly in order to get exactly the stored image (correct size, no interpolation or resampling). If you simply want to look at the image from Python and you are in a Jupyter notebook, you can use Pillow: a PIL.Image object is rendered directly by the Jupyter REPL.
If you are not inside Jupyter, you can also open the image directly with the OS image viewer. That is at least more convenient than matplotlib for displaying the "exact" image.
By the way, when displaying the image, the same parameters no longer apply (since it is an image and all parameters are baked into its content). Therefore, there's no need (and it's wrong) to write all those magic numbers. Also, if you want to save the image without a white border and axes, use the code above before calling plt.savefig.
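A minimal sketch of the margin-free round trip described in this answer. It assumes a small made-up extent (xmin, xmax, ymin, ymax) and placeholder file names; substitute the real map coordinates. The background is saved with the axes filling the whole figure, so the PNG contains only the data area and can be placed back with extent without being squeezed by its own borders.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical map extent (assumption, not the asker's coordinates)
xmin, xmax, ymin, ymax = 0.0, 4.0, 0.0, 4.0
out_dir = tempfile.mkdtemp()
back_path = os.path.join(out_dir, "back.png")
map_path = os.path.join(out_dir, "map.png")

# Step 1: draw the background with no margins and no axes decorations
fig, ax = plt.subplots(figsize=(4, 4))
ax.plot([xmin, xmax], [ymin, ymax])
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)
ax.axis("off")
fig.subplots_adjust(left=0, bottom=0, right=1, top=1)
fig.savefig(back_path, dpi=100)
plt.close(fig)

# Step 2: read it back and pin it to the same data coordinates
back = plt.imread(back_path)
fig, ax = plt.subplots(figsize=(4, 4))
ax.imshow(back, extent=[xmin, xmax, ymin, ymax], aspect="auto")
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)
ax.axis("off")
fig.subplots_adjust(left=0, bottom=0, right=1, top=1)
fig.savefig(map_path, dpi=100)
plt.close(fig)

result = plt.imread(map_path)
```

Both files come out at figsize times dpi pixels, so the second image is not rescaled relative to the first.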

Is It Possible To Set Transparency When Using LineCollection in Matplotlib?

There are a number of helpful posts for using LineCollections in Matplotlib.
I have working code, but am having trouble figuring out how to set the transparency of the lines. For example, in Pandas it's as easy as doing:
df.plot(kind='line',alpha=.25)
However, I chose the LineCollection method because I want to plot a dataframe with >15k lines and the above example does not work.
I've tried adding ax.set_alpha(.25) in my code:
fig, ax = plt.subplots()
ax.set_xlim(np.min(may_days), np.max(may_days))
ax.set_ylim(np.min(may_segments.min()), np.max(may_segments.max()))
line_segments = LineCollection(may_segments,cmap='jet')
line_segments.set_array(may_days)
ax.add_collection(line_segments)
ax.set_alpha(.05)
ax.set_title('Daily May Data')
plt.show()
but there is no change.
Unfortunately I cannot provide a sample of the data with which I'm working; however, I've found the second example in this Matplotlib gallery doc to be easy to copy.
You do it the same way you'd do it in pandas: pass alpha as a keyword argument when creating the collection.
line_segments = LineCollection(may_segments, cmap='jet', alpha=0.05)
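A runnable sketch of the same idea, assuming synthetic sine curves in place of may_segments and a simple index in place of may_days (both are stand-ins, not the asker's data). The transparency belongs to the collection, which is why calling ax.set_alpha on the Axes did not change the lines.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

x = np.linspace(0, 2 * np.pi, 50)
n_lines = 200
# Each segment is an (N, 2) array of (x, y) points; synthetic data (assumption)
segments = [np.column_stack([x, np.sin(x + phase)])
            for phase in np.linspace(0, np.pi, n_lines)]

fig, ax = plt.subplots()
# alpha is set on the collection itself, here at construction time
line_segments = LineCollection(segments, cmap="jet", alpha=0.05)
line_segments.set_array(np.arange(n_lines))  # one value per line for the colormap
ax.add_collection(line_segments)

# Collections do not autoscale the view here, so set the limits explicitly
ax.set_xlim(x.min(), x.max())
ax.set_ylim(-1.1, 1.1)
ax.set_title("Semi-transparent LineCollection")
```

Calling line_segments.set_alpha(0.05) after construction has the same effect.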

holoviews/bokeh gridline issue

I am attempting to make a heat map with holoviews (currently using the bokeh backend). I have a data frame ('dep_df') with 3 columns: X, Y, type. X and Y are the dimension labels, and type is a categorical variable between 0 and n (where n is an integer). Here's my code:
dep_hm = hv.HeatMap(dep_df[["X", "Y", "type"]], label="DEP population")
TOOLS = ['hover']
colors = palettes.d3['Category20b'][5]
%%opts HeatMap [width=300, height=300, xaxis=None, yaxis=None, show_grid=True]
grid_style = {'grid_line_color': 'white', 'grid_line_width': 1.5}
dep_hm.options(cmap=ListedColormap(colors), gridstyle=grid_style, tools=TOOLS, invert_axes=True)
The plot looks correct in the Jupyter notebook, except that the y grid lines don't show (only the x grid lines), and it shows all tools instead of just 'hover' as I specified. Even among the grid lines that do show, there is always one missing exactly in the middle (I have had that issue even in plain Bokeh implementations of this heatmap).
Another issue is that I've tried saving the file to HTML using both bokeh.io and renderer.save(), and in both cases none of the formatting options are applied (hiding the axes, inverting the axes, and limiting the toolbar). It seems to just save the plot with default options.
Thanks for your help.
renderer.save() doesn't read the notebook magic, i.e. %%opts HeatMap [width=300, height=300, xaxis=None, yaxis=None, show_grid=True].
You have to use your_variable.options(width=300, height=300, xaxis=None, yaxis=None, show_grid=True) to make it stick. See the "Simplified format" section of http://holoviews.org/user_guide/Customizing_Plots.html
Not sure about your other issue though.

Matplotlib/Latex issues when using \odot as marker

I'm trying to use the LaTeX symbol \odot as a marker in a scatter plot, but I also need LaTeX-style ticks, and for some reason the two are not playing well together. I can successfully use marker='$\\odot$' with usetex=False, like this, but when I set it to True (to get the tick font right), I get ! LaTeX Error: File 'type1cm.sty' not found. I've already made sure I have the sty file installed and in the correct directory and that I have all the dependencies installed (as suggested here). Plus, I can still have usetex=True and use any of the normal pyplot markers, just not anything involving math font, although I can have \odot in the label for the legend. I've also already tried appending amsmath to the rc params but still keep getting the type1cm error. I've also tried using a raw string literal, to no avail.
So basically when usetex=True, I can use math symbols in the label for the legend, just not as the actual marker. Has anyone experienced this issue before?
My current workaround is to plot a large unfilled circle and overplot a small filled circle (basically simulating the \odot). Then I run into an issue with the legend, so I have to create a transparent legend showing the large unfilled circles and then plot the smaller filled circles behind it by hand, like this, which ends up wonky, but it has the axes tick font I need. This becomes very frustrating if I have to change the axes limits, because I have to repeat the process of figuring out where to plot the small filled circles all over again.
Does anyone know if there is a better workaround than this? Would it be possible to use the overplotting scheme like I have been, but then create a custom proxy artist to display the \odot symbol (in the different colors/sizes) in the legend?
Mac OSX, matplotlib 1.4.2, python 2.7, matplotlib is using pdfTeX thru TeX Live 2017/Mac Ports 2017
Edit: Here is my code
plt.rc('text', usetex=True)
plt.rc('font', family='serif')
f, ax1 = plt.subplots(1,1)
x = np.arange(20)
y = x
ax1.scatter(x, y, marker='$\\odot$', edgecolors='b', s=200, label = 'Test') #used with usetex=False
#ax1.scatter(x, y, marker='o', edgecolors='b', s=200, label = 'Test') #used with usetex=True
ax1.tick_params(labelsize=24)
leg = ax1.legend(scatterpoints=1, loc='lower right', borderaxespad=0., handletextpad=0.)#, fontsize=18) # borderpad=0.,)
I'm not sure how much I can help without seeing your code, but this worked for me:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
x1 = [1,2]; x2 = [1,2]
y1 = [1,1]; y2 = [2,2]
mpl.rc('text', usetex = True)
fig, ax = plt.subplots(1,1)
ax.scatter(x1,y1, label='A1', marker=r'$\odot$',s=150, c='b')
ax.scatter(x2,y2, label='A2', marker=r'$\odot$',s=50, c='b')
ax.set_xlim(0,3)
ax.set_ylim(0,3)
ax.legend()
fig.show()
If this doesn't help let me know!
