How to save webkit page image resources from memory? - python

I open a page with Python, GTK, and WebKit. How can I save an image from that page without downloading it again from the internet?

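Here is a Python program that saves a rendered web page to an image: http://pastie.org/4572412
Note that it uses Qt's WebKit bindings (QImage, QPainter) rather than GTK, and it renders the whole page rather than extracting individual image resources.
This should be the section of primary interest to you: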
size = self.browser.mainFrame().contentsSize()
if width > 0:
    size.setWidth(width)
self.browser.setViewportSize(size)
# Render the virtual browser's viewport to an image.
image = QImage(self.browser.viewportSize(), QImage.Format_ARGB32)
paint = QPainter(image)  # Have the painter's target be our image object
self.browser.mainFrame().render(paint)  # Render browser window to painter
paint.end()
image = image.scaledToWidth(width)  # Ensure the image is your desired width
extension = os.path.splitext(filename)[1][1:].upper()  # Save the file with your desired image extension
if extension not in self.image_extensions:
    raise ValueError("'filename' must be a valid extension: {0}".format(self.image_extensions))
image.save(filename, extension)
Hope this helps!

Related

How to load a PNG or JPG image with Python 3 tkinter on Debian Linux

I am new to the forum and am learning on a Linux system.
I want to make an application that also works well on Windows.
The problem is with "load image": I can browse for a file, but I can't load the image.
Thanks a lot.
Here is the code:
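from tkinter.filedialog import askopenfilename

def parcourir():  # "parcourir" = browse
    global imageName
    imn = askopenfilename(initialdir="/", title="choose an image",
                          filetypes=(("png files", "*.png"), ("jpeg files", "*.jpg")))
    if imn:
        imageName = imn  # was misspelled "imageNmae", so the global was never updated
    if imageName:
        texte = imageName.split("/")
        photoEntre.configure(text=".../" + texte[-1])  # was misspelled "confiure"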
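The function above only stores the chosen path and updates a label; nothing reads the file into an image object. A minimal sketch of that missing step follows, assuming Pillow is installed (on Debian, ImageTk is packaged separately as python3-pil.imagetk) and that a Tk root window already exists. The names label_text and load_photo and the label parameter are hypothetical, not from the original code.

```python
import os


def label_text(path):
    """Return a short display name such as '.../cat.png' for a file path."""
    return ".../" + os.path.basename(path)


def load_photo(path, label):
    """Load a PNG or JPG from `path` and show it in the tkinter Label `label`.

    Requires an existing Tk root window.
    """
    # Imported here so label_text() stays usable without Pillow or a display.
    from PIL import Image, ImageTk

    photo = ImageTk.PhotoImage(Image.open(path))
    label.configure(image=photo)
    # Keep a reference; otherwise the image can be garbage-collected
    # and the label shows up empty.
    label.image = photo
    return photo
```

Calling load_photo(imn, some_label) right after askopenfilename returns would display the picture, and os.path.basename avoids splitting on "/" by hand.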

Python PDF image extraction yields 2 images instead of one (black background and white background)

Using the Python fitz module (PyMuPDF), I am trying to extract images from PDF files. It extracts images correctly for most PDFs, but for some it does not work properly.
The PDF page contains only one image, but extraction yields multiple images, and the text is split between a black-background image and a white-background image (the data of one image is divided into 3 images). Below is the code, along with the PDF and the gray images.
import io
import fitz
from PIL import Image
# open the file
pdf_file = fitz.open("image path")
# iterate over PDF pages
for page_index in range(len(pdf_file)):
    # get the page itself
    page = pdf_file[page_index]
    image_list = page.get_images()
    # printing number of images found in this page
    if image_list:
        print(f"[+] Found a total of {len(image_list)} images in page {page_index}")
    else:
        print("[!] No images found on page", page_index)
    for image_index, img in enumerate(image_list, start=0):
        xref = img[0]
        base_image = pdf_file.extract_image(xref)
        image_bytes = base_image["image"]
        image_ext = base_image["ext"]
        image = Image.open(io.BytesIO(image_bytes))
        image.save("path to save")  # placeholder: replace with a real output path

python GTK3 Image from PIL jpg format

I'm trying to open a JPG image from a buffer (fetched as a BLOB from an Oracle database) as a Gtk.Image() and add it to a Gtk window, but I get the error "Expected Gtk.Widget, but got PIL.JpegImagePlugin.JpegImageFile". I can display the image with show(), and I can window.add a JPEG loaded from a path on disk and then show the window, but when I try to add the JPG from the buffer I get the error. Here is what I have so far:
my_source=Here_I_get_BLOB_img_from_database()
newimage=Gtk.Image()
newimage=my_source.read()
image=io.BytesIO(newimage)
dt=Image.open(image)
newwindow = Gtk.Window(modal=True)
At this point I have the JPG in a buffer and I can do:
dt.show() # to display the image in system imageviewer
I can also save dt as a JPG, or add to newwindow the result of image.set_from_file("path with jpg extension"), but I don't know how to do this:
newwindow.add(dt)
Or anything with a similar effect. What is the simplest way to do this?
What worked for me:
Gtk.Image can be built from a Pixbuf object, which can be loaded from a stream, for example. So:
from gi.repository import (...) GdkPixbuf, GLib, Gio
(...)
my_source=Here_I_get_BLOB_img_from_database()
newimage=my_source.read()
glib=GLib.Bytes.new(newimage)
stream = Gio.MemoryInputStream.new_from_bytes(glib)
pixbuf = GdkPixbuf.Pixbuf.new_from_stream(stream, None)
image = Gtk.Image.new_from_pixbuf(pixbuf)  # constructor called on the class, not on a throwaway instance
my_window = Gtk.Window(modal=True, title="Image")
my_window.add(image)
image.show()
my_window.show()

Convert PDF page to image with pyPDF2 and BytesIO

I have a function that gets a page from a PDF file via PyPDF2 and should convert the first page to a PNG (or JPG) with Pillow (the PIL fork):
from PyPDF2 import PdfFileWriter, PdfFileReader
import os
from PIL import Image
import io
# Open PDF Source #
app_path = os.path.dirname(__file__)
src_pdf= PdfFileReader(open(os.path.join(app_path, "../../../uploads/%s" % filename), "rb"))
# Get the first page of the PDF #
dst_pdf = PdfFileWriter()
dst_pdf.addPage(src_pdf.getPage(0))
# Create BytesIO #
pdf_bytes = io.BytesIO()
dst_pdf.write(pdf_bytes)
pdf_bytes.seek(0)
file_name = "../../../uploads/%s_p%s.png" % (name, pagenum)
img = Image.open(pdf_bytes)
img.save(file_name, 'PNG')
pdf_bytes.flush()
That results in an error:
OSError: cannot identify image file <_io.BytesIO object at 0x0000023440F3A8E0>
I found some threads with a similar issue (PIL open() method not working with BytesIO), but I cannot see where I am going wrong here, as I have already added pdf_bytes.seek(0).
Any hints appreciated.
Per the PyPDF2 documentation:
write(stream): Writes the collection of pages added to this object out as a PDF file.
Parameters: stream – An object to write the file to. The object must support the write method and the tell method, similar to a file object.
So the object pdf_bytes contains a PDF file, not an image file.
Code like the above sometimes appears to work because some PDF files contain nothing but a single embedded JPEG. If yours is an ordinary PDF, you can't just read its bytes and parse them as an image.
For a more robust implementation, see: https://stackoverflow.com/a/34116472/334999
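To see why Image.open rejects the buffer, it can help to look at its first few bytes. A minimal sketch, assuming only the well-known file signatures for PDF, PNG, and JPEG; sniff_format is a hypothetical helper, not part of PyPDF2 or Pillow:

```python
def sniff_format(data):
    """Guess a file format from the leading magic bytes of `data`."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"
```

For the buffer in the question, sniff_format(pdf_bytes.getvalue()) would report "pdf", which is not a format Pillow can open as an image.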
import fitz  # PyMuPDF
# To get better resolution
zoom_x = 2.0  # horizontal zoom
zoom_y = 2.0  # vertical zoom
mat = fitz.Matrix(zoom_x, zoom_y)  # zoom factor 2 in each dimension
filename = "/xyz/abcd/1234.pdf"  # name of pdf file you want to render
doc = fitz.open(filename)
for page in doc:
    pix = page.get_pixmap(matrix=mat)  # render page to an image
    # include the page number so later pages do not overwrite earlier ones
    pix.save("/xyz/abcd/1234-%i.png" % page.number)  # store image as a PNG
Credit: Convert PDF to Image in Python Using PyMuPDF
https://towardsdatascience.com/convert-pdf-to-image-in-python-using-pymupdf-9cc8f602525b

Upload Image To Imgur After Resizing In PIL

I am writing a script that gets an image from a link. The image is then resized using the PIL module and uploaded to Imgur using pyimgur. I don't want to save the image on disk; instead I want to manipulate the image in memory and then upload it from memory to Imgur.
The Script:
from pyimgur import Imgur
import cStringIO
import requests
from PIL import Image
LINK = "http://pngimg.com/upload/cat_PNG106.png"
CLIENT_ID = '29619ae5d125ae6'
im = Imgur(CLIENT_ID)
def _upload_image(img, title):
    uploaded_image = im.upload_image(img, title=title)
    return uploaded_image.link
def _resize_image(width, height, link):
    # Retrieve our source image from a URL
    fp = requests.get(link)
    # Load the URL data into an image
    img = cStringIO.StringIO(fp.content)
    im = Image.open(img)
    # Resize the image
    im2 = im.resize((width, height), Image.NEAREST)
    # Save the image into a cStringIO object to avoid writing to disk
    out_im2 = cStringIO.StringIO()
    im2.save(out_im2, 'png')
    return out_im2.getvalue()
When I run this script I get this error: TypeError: file() argument 1 must be encoded string without NULL bytes, not str
Does anyone have a solution in mind?
It looks like the same problem as this, and the solution is to use StringIO.
A common tip for searching such issues is to search using the generic part of the error message/string.
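The in-memory approach can be made concrete. The sketch below assumes Python 3, where io.BytesIO takes the place of cStringIO for binary data, and Pillow. The function names resize_to_png_bytes and build_upload_payload are hypothetical. It also assumes the image is sent as a base64 string straight to Imgur's HTTP API rather than through pyimgur, since the traceback suggests upload_image treats its first argument as a file path; the form field names are an assumption to check against Imgur's API documentation.

```python
import base64
import io


def resize_to_png_bytes(image_bytes, width, height):
    """Resize an encoded image entirely in memory and return PNG bytes."""
    # Imported lazily so the payload helper below works without Pillow.
    from PIL import Image

    with Image.open(io.BytesIO(image_bytes)) as src:
        resized = src.resize((width, height), Image.NEAREST)
    out = io.BytesIO()
    resized.save(out, format="PNG")
    return out.getvalue()


def build_upload_payload(png_bytes, title):
    """Build form fields for a base64 image upload.

    The field names are assumed to match Imgur's v3 image endpoint.
    """
    return {
        "image": base64.b64encode(png_bytes).decode("ascii"),
        "type": "base64",
        "title": title,
    }
```

With the response from requests.get(link) in hand, build_upload_payload(resize_to_png_bytes(fp.content, width, height), title) yields a dictionary that could be posted with requests.post, so nothing touches the disk.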
