I am not an expert by any means. I am trying to extract a PDF page as an image so I can process it later. I put together the following code from other recommendations on this site.
import os
import fitz
from PIL import Image

dir = r'C:\Users\...'
files = os.listdir(dir)
doc = fitz.open(os.path.join(dir, files[21]))
page = doc.loadPage(2)  # named load_page() in newer PyMuPDF releases
zoom = 2
mat = fitz.Matrix(zoom, zoom)
pix = page.getPixmap(matrix=mat)  # named get_pixmap() in newer PyMuPDF releases
img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
Usually this would give me the pixel information of the image, but in this case it returns a list of white pixels. I have no idea why. The image (img) is displayed when I ask for it, but its data is not.
I would appreciate any help.
If you want to convert a PDF to images and then process them, you might use something along these lines. This simple example reads in 5 pages of the PDF and, for the last page, computes what fraction of the image is a particular color, first the slow way and then the fast way.
import pdf2image
import numpy as np

# details:
# https://pypi.org/project/pdf2image/
# Get first five pages, just for testing
images = pdf2image.convert_from_path('test.pdf', first_page=1, last_page=5)

i = 1
for image in images:
    print(i, " shape: ", image.size)
    image.save('output' + str(i) + '.jpg', 'JPEG')
    i = i + 1

# Look at last image
color_test = (255, 255, 255)  # assumed target color (white); the original value was lost
specific_color = 0
other = 0
for i in range(image.width):
    for j in range(image.height):
        x = image.getpixel((i, j))
        if x[0] == color_test[0] and x[1] == color_test[1] and x[2] == color_test[2]:
            specific_color += 1
        else:
            other += 1
print("frac of specific color = ", specific_color / (specific_color + other))

# faster!
image_np = np.array(image)
a = np.where(np.all(image_np == color_test, axis=-1))
print("(faster) frac of color = ", len(a[0]) / (image.width * image.height))
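The "fast way" can also be wrapped in a small helper and checked on synthetic data. This is only a sketch: `color_fraction` is a hypothetical name, and the 4x4 test array stands in for a real rendered page.

```python
import numpy as np

def color_fraction(pixels: np.ndarray, color: tuple) -> float:
    """Return the fraction of pixels in an (H, W, 3) array equal to `color`."""
    # Compare every pixel with the target and require all three channels to match.
    matches = np.all(pixels == np.asarray(color), axis=-1)
    return float(matches.mean())

# Synthetic 4x4 image: white everywhere except one red pixel.
demo = np.full((4, 4, 3), 255, dtype=np.uint8)
demo[0, 0] = (255, 0, 0)
white_fraction = color_fraction(demo, (255, 255, 255))
print(white_fraction)
```

For a Pillow image, `np.array(image.convert("RGB"))` gives an array in the shape this helper expects.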
The code works if I take a shortcut and replace doc.loadPage with doc.getPagePixmap:
import os
import fitz
from PIL import Image

dir = r'C:\Users\...'
files = os.listdir(dir)
doc = fitz.open(os.path.join(dir, files[21]))
pix = doc.getPagePixmap(2)
img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
I still don't know why the longer code fails, and the working method doesn't allow me to get a higher-resolution version of the extracted page.
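One thing that may be worth ruling out: a PDF page usually begins with a white margin, so the first values in the pixel data can all be white even when the page rendered correctly. This is a guess rather than a diagnosis. A minimal sketch of a check, assuming an RGB Pillow image; `is_blank_white` is a hypothetical helper and the two test images are synthetic stand-ins for a rendered page.

```python
from PIL import Image

def is_blank_white(image: Image.Image) -> bool:
    """True if every pixel of the image is pure white."""
    # getextrema() returns one (min, max) pair per band for multi-band images.
    return all(low == 255 for low, _high in image.convert("RGB").getextrema())

# Synthetic stand-ins for a rendered page (not a real PDF).
blank_page = Image.new("RGB", (40, 40), "white")
page_with_ink = blank_page.copy()
page_with_ink.putpixel((20, 20), (0, 0, 0))

print(is_blank_white(blank_page), is_blank_white(page_with_ink))
```

If this returns False for the zoomed image, the pixel data is there and only the first rows are white.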
I have a functional update view in which I am trying to compress uploaded images before saving them. However, when I try to compress the image, nothing happens and the image is saved at exactly the same size.
I think I might be saving it incorrectly, but I am unsure how to save it correctly. Please let me know. Thank you!
import io
from PIL import Image

def get_compressed_image(file):
    image = Image.open(file)
    with io.BytesIO() as output:
        image.save(output, format=image.format, quality=20, optimize=True)
        contents = output.getvalue()
    return contents

def updated_form_view(request):
    if initial_form.is_valid():
        updated_form = initial_form.save(commit=False)
        updated_form.username = request.user.username
        # compressing image here
        updated_form.form_image.file.image = get_compressed_image(updated_form.form_image)
To get a smaller file, we must reduce the resolution of the picture, like this:
from PIL import Image
img = Image.open("logo.jpg")
amount = 1.5 # higher amount: more reduction. lower: less reduction
width, height = img.size
new_size = int(width // amount), int(height // amount)
compressed = img.resize(new_size, Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10
Lower resolution plus no EXIF info means a smaller file.
Also note that your code does work, but it does not change the resolution of the picture. It only lowers the JPEG quality, which makes the image look pixelated.
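Combining the two ideas (downscaling and a lower JPEG quality) and comparing byte sizes might look like the sketch below. `compress_jpeg` is a hypothetical helper, and the noisy test image is synthetic data standing in for an upload.

```python
import io
import os
from PIL import Image

def compress_jpeg(image: Image.Image, amount: float = 1.5, quality: int = 20) -> bytes:
    """Downscale by `amount`, re-encode as JPEG, and return the encoded bytes."""
    width, height = image.size
    new_size = max(1, int(width // amount)), max(1, int(height // amount))
    resized = image.convert("RGB").resize(new_size, Image.LANCZOS)
    with io.BytesIO() as output:
        resized.save(output, format="JPEG", quality=quality, optimize=True)
        return output.getvalue()

# Synthetic noisy image standing in for an uploaded photo.
source = Image.frombytes("RGB", (200, 200), os.urandom(200 * 200 * 3))
with io.BytesIO() as buffer:
    source.save(buffer, format="JPEG", quality=95)
    original_bytes = buffer.getvalue()

compressed_bytes = compress_jpeg(source)
print(len(original_bytes), len(compressed_bytes))
```

In a Django view the returned bytes would still have to be wrapped in a file object (for example `django.core.files.base.ContentFile`) before being assigned to the image field. That step is not shown here and depends on the model.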
I'm relatively new to Python, and I'm trying to replace an image at a given location in a PDF.
The idea is to check whether the extracted image in the PDF matches the image I want to replace. If it does, I extract the location and put the new image in its place. I'm done with the extracting and checking part. Could someone please help me with the latter part?
Step 1: convert mypdf.pdf to full_page_image.jpg
from pdf2image import convert_from_path
pages = convert_from_path('mypdf.pdf', 500)
pages[x].save('full_page_image.jpg', 'JPEG') #where x is your page number minus one
Step 2: overlay image_to_be_added onto full_page_image
import cv2
import numpy as np
full_page_image = cv2.imread('full_page_image.jpg')
image_to_be_added = cv2.imread('image_to_be_added.jpg')
final_image = full_page_image.copy()
final_image[100:400,100:400,:] = image_to_be_added[100:400,100:400,:] #adjust the numbers according to the dimensions of the image_to_be_added
cv2.imwrite('final_image.jpg', final_image)
Step 3: convert final_image.jpg to final_pdf.pdf
from PIL import Image
final_image2 = Image.open(r'final_image.jpg')
final_image3 = final_image2.convert('RGB')
final_image3.save(r'final_pdf.pdf')
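The overlay in Step 2 can be written as a small function that places the whole replacement image at a chosen position and checks that it fits. This is a sketch on synthetic arrays; `paste_patch` and the coordinates are hypothetical, and real inputs would come from `cv2.imread`.

```python
import numpy as np

def paste_patch(page: np.ndarray, patch: np.ndarray, top: int, left: int) -> np.ndarray:
    """Return a copy of `page` with `patch` written at row `top`, column `left`."""
    height, width = patch.shape[:2]
    if top < 0 or left < 0 or top + height > page.shape[0] or left + width > page.shape[1]:
        raise ValueError("patch does not fit inside the page at that position")
    result = page.copy()
    result[top:top + height, left:left + width] = patch
    return result

# Synthetic data: a white 500x500 "page" and a black 300x300 replacement image.
page = np.full((500, 500, 3), 255, dtype=np.uint8)
patch = np.zeros((300, 300, 3), dtype=np.uint8)
final_image = paste_patch(page, patch, top=100, left=100)
print(final_image[100, 100], final_image[99, 99])
```

Note that this rasterizes the page, so any text in the resulting PDF is no longer selectable.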
I'm trying to extract images from a PDF using PyPDF2, but the image my code produces looks very different from what it should. See the example below:
But this is how it should really look:
Here's the pdf I'm using:
Here's my code:
import PyPDF2

pdf_filename = "SAMPLE.pdf"
pdf_file = open(pdf_filename, 'rb')
cond_scan_reader = PyPDF2.PdfFileReader(pdf_file)
page = cond_scan_reader.getPage(0)
xObject = page['/Resources']['/XObject'].getObject()
i = 0
for obj in xObject:
    # print(xObject[obj])
    if xObject[obj]['/Subtype'] == '/Image':
        if xObject[obj]['/Filter'] == '/DCTDecode':
            data = xObject[obj]._data
            # write the raw JPEG stream to disk unchanged
            with open("{}".format(i) + ".jpg", "wb") as img:
                img.write(data)
            i += 1
And since I need to keep the image in its colour mode, I can't just convert it to RGB if it was CMYK, because I need that information.
Also, I'm trying to get the DPI of the images I extract from a PDF. Is that information always stored in the image?
Thanks in advance
I used pdfreader to extract the image from your example.
The image uses ICCBased colorspace with the value of N=4 and Intent value of RelativeColorimetric. This means that the "closest" PDF colorspace is DeviceCMYK.
All you need is to convert the image to RGB and invert the colors.
Here is the code:
from pdfreader import SimplePDFViewer
import PIL.ImageOps
fd = open("SAMPLE PDF.pdf", "rb")
viewer = SimplePDFViewer(fd)
viewer.render()  # the canvas is only populated after the page is rendered
img = viewer.canvas.images['Im0']
# this displays ICCBased 4 RelativeColorimetric
print(img.ColorSpace[0], img.ColorSpace[1].N, img.Intent)
pil_image = img.to_Pillow()
pil_image = pil_image.convert("RGB")
inverted = PIL.ImageOps.invert(pil_image)
Read more on PDF objects: Image (sec. 8.9.5), InlineImage (sec. 8.9.7)
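Since the question asks to keep the original colour mode, the inversion can also be done without converting to RGB. A minimal sketch, assuming the extracted image is an 8-bit CMYK Pillow image whose channel values are stored inverted (whether that holds for a given PDF is an assumption); `invert_cmyk` is a hypothetical helper and the sample image is synthetic.

```python
from PIL import Image

def invert_cmyk(image: Image.Image) -> Image.Image:
    """Invert every channel of a CMYK image without leaving CMYK mode."""
    if image.mode != "CMYK":
        raise ValueError("expected a CMYK image, got " + image.mode)
    # CMYK pixels are stored as four 8-bit values, so inverting is 255 - value.
    inverted = bytes(255 - value for value in image.tobytes())
    return Image.frombytes("CMYK", image.size, inverted)

# Synthetic 2x2 CMYK image (not the PDF from the question).
sample = Image.new("CMYK", (2, 2), (0, 255, 10, 200))
fixed = invert_cmyk(sample)
print(fixed.mode, fixed.getpixel((0, 0)))
```

The pure-Python byte loop is slow on large images. The same subtraction on a NumPy array would be faster.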
Hope this works: you probably need to use another library such as Pillow:
Here is an example:
from PIL import Image
image = Image.open("path_to_image")
if image.mode == 'CMYK':
    image = image.convert('RGB')
Reference: Convert from CMYK to RGB
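On the DPI part of the question, which the answers above leave open: the value is optional metadata, so it is not always present. Within a PDF, the effective resolution also depends on how large the image is drawn on the page, which the image data itself does not record. A sketch of reading whatever Pillow finds, using a hypothetical `read_dpi` helper and synthetic images:

```python
import io
from PIL import Image

def read_dpi(image: Image.Image):
    """Return the (x, y) DPI recorded in the image metadata, or None if absent."""
    return image.info.get("dpi")

# A JPEG saved with an explicit density carries the value in its header.
with io.BytesIO() as buffer:
    Image.new("RGB", (10, 10), "white").save(buffer, format="JPEG", dpi=(300, 300))
    tagged = Image.open(io.BytesIO(buffer.getvalue()))
    tagged.load()

# An image created in memory has no metadata at all.
untagged = Image.new("RGB", (10, 10), "white")
print(read_dpi(tagged), read_dpi(untagged))
```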
I'm trying to locate an image, then store another image relative to the first one within an array. Afterwards, I want those images to drop into a word document using the docx library. Currently, I'm getting the following error, despite a few different solutions I've tried below. Here's the code:
import sys
import PIL
import pyautogui
import docx
import numpy

def grab_paperclip_images():
    '''
    This'll look at the documents that're on
    the current screen, and create images of
    each document with a paperclip. I'll be
    testing on an unsorted screen first.
    '''
    image_array = []
    clip_array = find_all_objects("WHITE_PAPERCLIP.png")
    for item in clip_array:
        coordinates = item[0]+45, item[1], 222, item[3]
        # reconstructed: the post lost this line; the output below shows PIL images
        image_array.append(pyautogui.screenshot(region=coordinates))
    return image_array

doc = docx.Document()
images = grab_paperclip_images()
for image in images:
    #print image
    #yields: [<PIL.Image.Image image mode=RGB size=222x12 at 0x7CC7770>,etc]
    #Tried this - no dice
    #img = PIL.Image.open(image)
    doc.add_picture(image)  # reconstructed: the call that inserts the image was cut off
Please let me know what I'm misunderstanding, and if you see any suggestions to make the code more pythonic, better scoped, etc.
As always, thanks for the help, sincerely!
Figured out a way around this. I had to save the images to disk. I could still reference the array, but I couldn't reference the image without saving it. Here's my workaround:
def grab_paperclip_images():
    '''
    This'll look at the documents that're on
    the current screen, and create images of
    each document with a paperclip. I'll be
    testing it on an unsorted screen first.
    '''
    # bottom_record = pyautogui.screenshot(   <- this call is truncated in the original post
    image_array = []
    clip_array = find_all_objects("WHITE_PAPERCLIP.png")
    count = 0
    for item in clip_array:
        coordinates = item[0]+45, item[1], 222, item[3]
        filename = "image"+str(count)+".png"
        image = pyautogui.screenshot(filename, region=coordinates)
        image_array.append(filename)  # reconstructed: keep the saved file's name
        count += 1
    return image_array

doc = docx.Document()
images = grab_paperclip_images()
for image in images:
    doc.add_picture(image)  # reconstructed: each entry is now a path on disk
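Saving to disk is one option. Another is to encode each screenshot into an in-memory stream, because python-docx's `add_picture` accepts a file-like object as well as a path. A minimal sketch of the encoding step, with `image_to_png_stream` as a hypothetical helper and a blank image standing in for a real screenshot:

```python
import io
from PIL import Image

def image_to_png_stream(image: Image.Image) -> io.BytesIO:
    """Encode a PIL image as PNG into an in-memory stream, rewound to the start."""
    stream = io.BytesIO()
    image.save(stream, format="PNG")
    stream.seek(0)
    return stream

# Synthetic stand-in for a screenshot (same size as in the question's output).
screenshot = Image.new("RGB", (222, 12), "white")
stream = image_to_png_stream(screenshot)
print(len(stream.getvalue()))
```

With this in place, the loop body would be `doc.add_picture(image_to_png_stream(image))`, and no temporary files are needed.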
I am currently trying to make a movie out of images, but I could not find anything helpful.
Here is my code so far:
import time
from PIL import ImageGrab

x = 0
while True:
    x += 1
    movie = None  # I don't know what goes here
    for _ in range(x):
        pass  # the loop body is missing from the original post
You could consider using an external tool like ffmpeg to merge the images into a movie (see answer here), or you could try to use OpenCV to combine the images into a movie like the example here.
Below is a code snippet I used to combine all PNG files from a folder called "images" into a video.
import cv2
import os
image_folder = 'images'
video_name = 'video.avi'
images = [img for img in os.listdir(image_folder) if img.endswith(".png")]
frame = cv2.imread(os.path.join(image_folder, images[0]))
height, width, layers = frame.shape
video = cv2.VideoWriter(video_name, 0, 1, (width,height))
for image in images:
    video.write(cv2.imread(os.path.join(image_folder, image)))

# finalize the file and release OpenCV resources
cv2.destroyAllWindows()
video.release()
It seems that the most commented-on part of this answer is the use of VideoWriter. You can look up its documentation in the link in this answer (static), or you can do a bit of digging of your own. The first parameter is the filename, followed by an integer (fourcc in the documentation, the codec used), the FPS count, and a tuple of the frame dimensions. If you really like digging into that can of worms, here's the fourcc video codecs list.
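To make the fourcc integer less mysterious: it is four ASCII characters packed into one 32-bit value, first character in the lowest byte. The sketch below reproduces that packing in plain Python. `fourcc` here is a hypothetical stand-in written for illustration; in OpenCV itself the equivalent call is `cv2.VideoWriter_fourcc(*"MJPG")`.

```python
def fourcc(code: str) -> int:
    """Pack a four-character codec code into the integer form OpenCV expects."""
    if len(code) != 4:
        raise ValueError("a fourcc code has exactly four characters")
    c1, c2, c3, c4 = (ord(ch) & 255 for ch in code)
    # First character goes in the lowest byte, last character in the highest.
    return c1 | (c2 << 8) | (c3 << 16) | (c4 << 24)

mjpg = fourcc("MJPG")
print(mjpg)
```

Passing such a value instead of 0 selects a specific codec, provided that codec is available on the machine.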
Thanks, but I found an alternative solution using ffmpeg:
import os

def save():
    os.system("ffmpeg -r 1 -i img%01d.png -vcodec mpeg4 -y movie.mp4")
But thank you for your help :)
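The same ffmpeg call can be made through `subprocess.run` with an argument list, which avoids shell quoting problems and reports failures. A sketch using the flags from the command above; `build_ffmpeg_command` is a hypothetical helper, and ffmpeg is assumed to be on the PATH when `save` is actually called.

```python
import subprocess

def build_ffmpeg_command(pattern: str, output: str, fps: int = 1) -> list:
    """Build the ffmpeg invocation as an argument list."""
    return ["ffmpeg", "-r", str(fps), "-i", pattern,
            "-vcodec", "mpeg4", "-y", output]

def save(pattern: str = "img%01d.png", output: str = "movie.mp4") -> None:
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(pattern, output), check=True)

# Only build and show the command here; calling save() would run ffmpeg.
command = build_ffmpeg_command("img%01d.png", "movie.mp4")
print(" ".join(command))
```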
Here is a minimal example using moviepy. For me this was the easiest solution.
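import os
import moviepy.video.io.ImageSequenceClip

image_folder = 'folder_with_images'  # assumed placeholder; not given in the post
fps = 1  # assumed placeholder; not given in the post

image_files = [os.path.join(image_folder, img)
               for img in os.listdir(image_folder)
               if img.endswith(".png")]
clip = moviepy.video.io.ImageSequenceClip.ImageSequenceClip(image_files, fps=fps)
clip.write_videofile('my_video.mp4')  # assumed output name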
I use the ffmpeg-python binding. You can find more information here.
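import ffmpeg
stream = ffmpeg.input('/path/to/jpegs/*.jpg', pattern_type='glob', framerate=25)
stream.output('movie.mp4').run()  # output name assumed; the rest of the chain was cut off in the post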
When using moviepy's ImageSequenceClip it is important that the images are in an ordered sequence.
While the documentation states that the frames can be ordered alphanumerically under the hood, I found this not to be the case.
So, if you are having problems, make sure to manually order the frames first.
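One way to order the frames manually is a "natural" sort key, so that `image_2.png` comes before `image_10.png`. A minimal sketch, with `natural_key` as a hypothetical helper and made-up file names:

```python
import re

def natural_key(filename: str) -> list:
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", filename)]

# Hypothetical file names for illustration.
frames = ["image_10.png", "image_2.png", "image_1.png"]
ordered = sorted(frames, key=natural_key)
print(ordered)
```

The sorted list can then be passed to ImageSequenceClip in place of the raw directory listing.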
@Wei Shan Lee (and others): Sure, my whole code looks like this:
import os
import moviepy.video.io.ImageSequenceClip
from PIL import Image, ImageFile
image_files = []
for img_number in range(1,20):
    image_files.append(path_to_images + 'image_folder/image_' + str(img_number) + '.png')
fps = 30
clip = moviepy.video.io.ImageSequenceClip.ImageSequenceClip(image_files, fps=fps)
clip.write_videofile(path_to_videos + 'my_new_video.mp4')
I've created a function to do this. It is similar to the first answer (using OpenCV), but I wanted to add that for me the ".mp4" format did not work. That's why the function raises an error for it.
import cv2
import typing
import numpy as np

def write_video(video_path_out: str,
                frames_sequence: typing.List[np.ndarray]):
    if ".mp4" in video_path_out: raise ValueError("[ERROR] This method does not support .mp4; try .avi instead")
    height, width, _ = frames_sequence[0].shape
    # 0 is the fourcc (codec) argument
    # 1 means each image will be played with 1 sec delay (1fps)
    out = cv2.VideoWriter(video_path_out, 0, 1, (width, height))
    for frame in frames_sequence:
        out.write(frame)
    out.release()

# you can use as many images as you need, I just use 3 for this example
# put your img1_path, img2_path, img3_path
img1 = cv2.imread(img1_path)
img2 = cv2.imread(img2_path)
img3 = cv2.imread(img3_path)
# img1 can be the output of cv2.imread, which is a np.ndarray; you can also use PIL
# if you'd like to.
frames_sequence = [img1, img2, img3]
write_video(video_path_out="mypath_outvideo.avi",
            frames_sequence=frames_sequence)
Hope it's useful!
A little hacky, but this avoids creating the file and just lets you watch it in real time.
import glob
from PIL import Image
import cv2
import numpy as np
import time
####### PARAMS
imgs_path = "/Users/user/Desktop/lidar_rig/ouster_data_wide_angle_cam_v9/imgs/*"
cur_img_index = 0
ds_st_index = 0
ds_en_index = -1
fps = 35 # tweak this
###### PARAMS
def cnvt_pil_to_cv2(pil_img):
open_cv_image = np.array(pil_img)
# Convert RGB to BGR
open_cv_image = open_cv_image[:, :, ::-1].copy()
return open_cv_image
img_files = sorted(glob.glob(imgs_path), key = lambda x: int(x.split('/')[-1].split('.')[0]))[ds_st_index:ds_en_index][cur_img_index:]
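cnt = 0
for img_pth in img_files:
    if not cnt % 50:  ## DDD -- avoid mem overflow
        # reconstructed: close old windows, since every frame opens a new one
        cv2.destroyAllWindows()
    img = Image.open(img_pth).resize((750, 750))
    cv2.imshow(img_pth.split("/")[-1], cnvt_pil_to_cv2(img))
    cv2.waitKey(1)  # reconstructed: lets OpenCV actually draw the window
    time.sleep(1 / fps)  # reconstructed: pace playback using the fps parameter
    cnt += 1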