How to save figures to pdf as raster images in matplotlib - python

I have some complex graphs made using matplotlib. Saving them to a pdf using the savefig command uses a vector format, and the pdf takes ages to open. Is there any way to save the figure to pdf as a raster image to get around this problem?

You can force individual figure elements to be rasterized like this:
text(1,1,'foobar',rasterized=True)
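The same keyword works on most plot artists, not just text. Below is a minimal sketch, assuming a recent Matplotlib and NumPy; the function name save_rasterized_pdf, the random data, and the file name are made up for illustration. Only the artist that receives rasterized=True is embedded in the PDF as a bitmap (at the dpi passed to savefig), while the axes, ticks, and labels stay vector.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def save_rasterized_pdf(path, n_points=50_000, dpi=200):
    """Save a dense scatter plot to PDF with only the markers rasterized."""
    rng = np.random.default_rng(0)
    x, y = rng.normal(size=(2, n_points))

    fig, ax = plt.subplots()
    # rasterized=True turns this one artist into a bitmap inside the PDF;
    # the axes, tick labels, and title remain vector graphics.
    ax.scatter(x, y, s=1, rasterized=True)
    ax.set_title("dense scatter, rasterized markers")

    # dpi sets the resolution of the rasterized parts of the PDF.
    fig.savefig(path, dpi=dpi)
    plt.close(fig)
    return path


if __name__ == "__main__":
    save_rasterized_pdf("rasterized_scatter.pdf")
```

Raising dpi makes the bitmap part sharper at the cost of file size, so it is worth trying a few values on the real figure.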

Not that I know of, but you can use the `convert` program (ImageMagick) to convert a JPG to a PDF: `convert file.jpg file.pdf`.

Related

Export pyvis graph as vector or .png image. Is there a way?

I'm looking for a way to export a huge graph generated with pyvis into a vector graphic (.svg) or at least a .png file. Is there a way to do it? So far I've only found the option to save/export as an .html file.
Thanks in advance.
For me this worked:
save the graph to HTML file
open the HTML in a new tab
right-click -> save image as ...

How to save grayscale image in Python?

I am trying to save a grayscale image using matplotlib savefig(). I find that the png file which is saved after the use of matplotlib savefig() is a bit different from the output image which is showed when the code runs. The output image which is generated when the code is running contains more details than the saved figure.
How can I save the output plot in such a manner that all details are stored in the output image?
My code is given below:
import cv2
import matplotlib.pyplot as plt
plt.figure(1)
img_DR = cv2.imread('image.tif',0)
edges_DR = cv2.Canny(img_DR,20,40)
plt.imshow(edges_DR,cmap = 'gray')
plt.savefig('DR.png')
plt.show()
The input file (‘image.tif’) can be found here.
Following is the output image which is generated when the code is running:
Below is the saved image:
Although the two aforementioned images denote the same picture, one can notice that they are slightly different. A keen look at the circular periphery of the two images shows that they are different.
Save the actual image to file, not the figure. The DPI between the figure and the actual created image from your processing will be different. Since you're using OpenCV, use cv2.imwrite. In your case:
cv2.imwrite('DR.png', edges_DR)
Use the PNG format as JPEG is lossy and would thus give you a reduction in quality to promote small file sizes. If accuracy is the key here, use a lossless compression standard and PNG is one example.
If you are somehow opposed to using OpenCV, Matplotlib has an equivalent image writing method called imsave which has the same syntax as cv2.imwrite:
plt.imsave('DR.png', edges_DR, cmap='gray')
Note that I am enforcing the colour map to be grayscale for imsave as it is not automatically inferred like how OpenCV writes images to file.
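To see why writing the array directly keeps every pixel, here is a small sketch that saves the same array both ways and compares the pixel dimensions of the resulting files. The helper name saved_shapes and the synthetic striped array (a stand-in for the Canny output) are assumptions for illustration; only Matplotlib and NumPy are needed.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def saved_shapes(image, dpi=100):
    """Return (savefig_shape, imsave_shape) in pixels for the same 2-D array."""
    tmp = tempfile.mkdtemp()
    fig_path = os.path.join(tmp, "figure.png")
    img_path = os.path.join(tmp, "image.png")

    # savefig writes the whole figure canvas: axes, margins, and a resampled image.
    fig, ax = plt.subplots()
    ax.imshow(image, cmap="gray")
    fig.savefig(fig_path, dpi=dpi)
    plt.close(fig)

    # imsave writes the array itself, one file pixel per array element.
    plt.imsave(img_path, image, cmap="gray")

    return plt.imread(fig_path).shape[:2], plt.imread(img_path).shape[:2]


if __name__ == "__main__":
    # Synthetic stand-in for an edge map such as the output of cv2.Canny.
    edges = np.zeros((300, 500), dtype=np.uint8)
    edges[::7, :] = 255
    print(saved_shapes(edges))
```

The file written by imsave has exactly the array's height and width, whereas the savefig output has the size of the figure canvas, so the image inside it has been resampled.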
Since you are using cv2 to load the image, why not use it to save the image as well?
I think the command you are looking for is :
cv2.imwrite('gray.jpg', gray_image)
Using a DPI that matches the image size seems to make a difference.
The image is of size width=2240 and height=1488 (img_DR.shape). Using fig.get_size_inches() I see that the image size in inches is array([7.24, 5.34]). So an appropriate dpi is about 310 since 2240/7.24=309.4 and 1488/5.34=278.65.
Now I do plt.savefig('DR.png', dpi=310) and get
One experiment to do would be to choose a high enough DPI, calculate height and width of figure in inches, for example width_inch = width_pixel/DPI and set figure size using plt.figure(figsize=(width_inch, height_inch)), and see if the displayed image itself would increase/decrease in quality.
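A rough sketch of that experiment, assuming Matplotlib and NumPy; the function name save_at_native_resolution is hypothetical. It sizes the figure as pixels/DPI, lets a single axes fill the whole figure so that no margins are added, and saves at the same DPI.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def save_at_native_resolution(image, path, dpi=100):
    """Save a 2-D array through savefig, aiming for one output pixel per element."""
    height_px, width_px = image.shape
    # Figure size in inches = size in pixels / DPI.
    fig = plt.figure(figsize=(width_px / dpi, height_px / dpi), dpi=dpi)
    # A single axes covering the full figure, with ticks and frame hidden.
    ax = fig.add_axes([0, 0, 1, 1])
    ax.set_axis_off()
    ax.imshow(image, cmap="gray", interpolation="nearest", aspect="auto")
    # Save at the same DPI that was used to compute the figure size.
    fig.savefig(path, dpi=dpi)
    plt.close(fig)
    return path


if __name__ == "__main__":
    demo = np.random.default_rng(0).integers(0, 256, size=(150, 200))
    save_at_native_resolution(demo, "native.png", dpi=100)
```

For sizes that do not divide evenly by the DPI, floating-point rounding of the inch values may leave the output a pixel off, so it is worth checking the saved file's dimensions.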
Hope this helps.

Creating an infographic with multiple PDF's in Python

I've created multiple charts using Matplotlib and saved them as PDFs. I need to combine up to 5 PDFs into one PDF; as this will be done many times, the task needs to be automated with Python. The reason I'm combining PDFs instead of .jpg or .png files is that PDF scales best and doesn't result in a fuzzy image. I've tried using code from "Is there a matplotlib flowable for ReportLab?", but I don't understand how to control the image placement. ReportLab has a function, .drawImage(file, x-coord, y-coord), which allows for specific placement of the image on the page; unfortunately, this function only takes .jpg or .png files, which are too low quality. If anyone has any suggestions on how to combine PDFs, it would be greatly appreciated!
If anyone stumbles upon this: I've found that the best way to do this is in LaTeX. There is a Python package called PyLaTeX, but there is no documentation, so instead I will create a LaTeX template and then call LaTeX with subprocess.call from my Python script to create the infographic.
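A minimal sketch of that template approach, under several assumptions: pdflatex is installed and on the PATH, the charts already exist as PDF files, and the template, layout, and function names (build_tex, compile_pdf) are invented here for illustration. It uses subprocess.run rather than subprocess.call so that a failed LaTeX run raises an error.

```python
import subprocess
from pathlib import Path

# Hypothetical template; <<BODY>> is replaced by one \includegraphics per chart.
TEMPLATE = r"""\documentclass{article}
\usepackage{graphicx}
\usepackage[margin=1cm]{geometry}
\pagestyle{empty}
\begin{document}
\noindent
<<BODY>>
\end{document}
"""


def build_tex(chart_paths, width=0.48):
    """Return LaTeX source that places each chart PDF at a fraction of the text width."""
    lines = []
    for p in chart_paths:
        # Absolute POSIX-style paths, so the file can be compiled from any directory.
        abs_path = Path(p).resolve().as_posix()
        lines.append(
            r"\includegraphics[width=" + f"{width:.2f}" + r"\textwidth]{" + abs_path + "}"
        )
    return TEMPLATE.replace("<<BODY>>", "\n".join(lines))


def compile_pdf(tex_source, out_dir="build", jobname="infographic"):
    """Write the source to out_dir and run pdflatex on it (assumed to be on PATH)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / f"{jobname}.tex").write_text(tex_source)
    subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", f"{jobname}.tex"],
        cwd=out,
        check=True,
    )
    return out / f"{jobname}.pdf"
```

Because \includegraphics embeds PDF pages as vector content under pdflatex, the charts should stay sharp at any zoom level; placement is then controlled in the template rather than in Python.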

Is it possible to generate vector based pdf using wordcloud

I am using wordcloud in python to generate word clouds.
I was able to reproduce this example on my machine, and then tried to change the last line plt.show() to plt.savefig('image.pdf') to have a pdf output.
I got a PDF with the same result; however, the PDF seems to be pixel-based instead of vector-based. When I zoom in on a particular point in the PDF, it becomes a very low-quality picture.
Is there any way to produce vector-based pdf using wordcloud? If not, is there any other library that I can produce vector-based (pdf) wordclouds in Python?
If wordcloud can generate any sort of vector output such as ps or svg, inkscape can usually convert it to a PDF without rasterizing it. You can even do this headless, e.g. inkscape my.svg -A my.pdf.
Hmm, looking at wordcloud, it looks like it uses PIL. I don't think that PIL can produce vector images. But if you could use the logic in wordcloud and separate it from PIL, you can get vector fonts onto PDFs by drawing onto a reportlab canvas.
You can save the images in a vector format so that they will be scalable without quality loss. Such formats are PDF and EPS. Just change the extension to .pdf or .eps and matplotlib will write the correct image format.
plt.savefig('destination_path.eps', format='eps')
plt.savefig('destination_path.pdf', format='pdf')
I have found that eps/pdf files work best.

matplotlib: saved imshow pdf looks different from the plot window

The following figure was plotted using imshow in matplotlib with option interpolation='none':
However, after I saved it as a pdf file, the saved pdf file looks quite different:
The problem is: the blue patterns become very blurry.
My question is: How can I save a pdf figure that looks exactly like the plot window?
I solved this problem by specifying the dpi in savefig for the pdf file type. Even though I read online that, in theory, dpi is not supposed to make a difference in the vector-based pdf format, it did solve the problem for me in practice.
import numpy as np
import matplotlib.pyplot as plt
plt.imshow(np.random.random((10,10)))
plt.savefig("test.pdf", dpi=300)
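One possible explanation, offered as an assumption rather than a diagnosis of the figure in the question: with interpolation='none', Matplotlib's PDF backend embeds the image at its native resolution and leaves the scaling to the PDF viewer, which may smooth it. With interpolation='nearest', Matplotlib resamples the image itself at the dpi passed to savefig, so the blocks are baked in with hard edges. A minimal sketch (the function name save_imshow_pdf is hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def save_imshow_pdf(path, dpi=300):
    """Save a small image to PDF, resampled by Matplotlib rather than the viewer."""
    data = np.random.default_rng(0).random((10, 10))
    fig, ax = plt.subplots()
    # 'nearest' asks Matplotlib to upsample the image at the output dpi,
    # instead of embedding the raw 10x10 image as 'none' does for PDF.
    ax.imshow(data, interpolation="nearest")
    fig.savefig(path, dpi=dpi)
    plt.close(fig)
    return path


if __name__ == "__main__":
    save_imshow_pdf("test_nearest.pdf")
```

Whether this removes the blur depends on the viewer, so it is worth comparing the output against a version saved with interpolation='none'.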
PDF is a vector image format. This means it is up to the program you open it in to interpret how it should be drawn. This has benefits when you want to zoom in and out of an image arbitrarily while keeping high quality. However, some programs can modify the image through anti-aliasing.
Your best bet for consistency is to use a pixel-based image format. I would suggest saving it as a .png.
