PDF is blue when it is generated from an image - python

I am converting a JPG to PDF using the PIL library. Below is my code.
im = PIL.Image.open(filename)
PIL.Image.Image.save(im, newfilename, "PDF", resoultion = 200.0,quality = 100)
But the output PDF file is blurry and the colors of the image have changed.
Is there any class in PIL that can be used to avoid this?
Thanks in advance.

I have had success with this code to convert a JPG to PDF:
from PIL import Image
inputfilename = "aaron.jpg"
outputfilename = "aaron.pdf"
im = Image.open(inputfilename)
# im.info is a dict, so look the DPI up with .get() rather than hasattr().
dpi = im.info.get("dpi")
if dpi:
    dpi = dpi[0]  # Assume horizontal DPI is same as vertical DPI.
else:
    dpi = 72  # Assume it's 72 if it's not specified in the JPG.
im.save(outputfilename, resolution=dpi, quality=100)
im.save() also works without the resolution and quality parameters, but I included them here because your example showed them.
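One possible cause of the color change (an assumption, the question does not confirm it) is a JPEG stored in a mode other than RGB, such as CMYK. Below is a minimal sketch that normalises the mode before writing the PDF. The helper name jpg_to_pdf_bytes and the default_dpi parameter are hypothetical, not part of PIL:

```python
import io
from PIL import Image


def jpg_to_pdf_bytes(im, default_dpi=72):
    """Encode a PIL image as a single-page PDF and return the bytes.

    Hypothetical helper for illustration only.
    """
    # im.info is a plain dict; "dpi" is present only if the file declared it.
    dpi = im.info.get("dpi", (default_dpi, default_dpi))[0]
    # Assumption: converting to RGB avoids color shifts from CMYK or palette input.
    if im.mode != "RGB":
        im = im.convert("RGB")
    buf = io.BytesIO()
    im.save(buf, format="PDF", resolution=float(dpi))
    return buf.getvalue()
```

To use it, open the JPG with Image.open(), pass the image to jpg_to_pdf_bytes(), and write the returned bytes to a file opened in "wb" mode.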

Related

I want to compress an image in view before saving, but the file size stays the same

I have a function-based update view in which I am trying to compress uploaded images before saving them. However, when I try to compress the image, nothing happens and the image is saved with exactly the same size.
I think I might be saving it wrong, but I am unsure of how to save it correctly. Please let me know. Thank you!
import io
from PIL import Image
def get_compressed_image(file):
    image = Image.open(file)
    with io.BytesIO() as output:
        image.save(output, format=image.format, quality=20, optimize=True)
        contents = output.getvalue()
    return contents
def updated_form_view(request):
    ...
    if initial_form.is_valid():
        initial_form.clean()
        updated_form = initial_form.save(commit=False)
        updated_form.username = request.user.username
        # compressing image here
        updated_form.form_image.file.image = get_compressed_image(updated_form.form_image)
        updated_form.save()
We must reduce the resolution of the picture, like this:
from PIL import Image
img = Image.open("logo.jpg")
amount = 1.5  # higher amount: more reduction; lower: less reduction
width, height = img.size
new_size = int(width // amount), int(height // amount)
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
compressed = img.resize(new_size, Image.LANCZOS)
compressed.save("compressed.jpg")
Lower resolution + no EXIF info = smaller file size.
Note that your code does work, but it does not change the resolution of the picture; it only pixelates the image.
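The two ideas (lower resolution and lower JPEG quality) can be combined in memory. This is a rough sketch, not the asker's view code; the function name compress_to_bytes and its scale and quality defaults are made up for illustration. The returned bytes would still have to be wrapped in a file object and assigned to the model field, which is not shown here:

```python
import io
from PIL import Image


def compress_to_bytes(img, scale=1.5, quality=60):
    """Downscale a PIL image and re-encode it as JPEG, returning the bytes.

    Hypothetical helper; scale > 1 shrinks the image.
    """
    width, height = img.size
    new_size = (max(1, int(width / scale)), max(1, int(height / scale)))
    # JPEG has no alpha channel, so convert to RGB before encoding.
    resized = img.convert("RGB").resize(new_size, Image.LANCZOS)
    buf = io.BytesIO()
    # Read the buffer before it goes out of scope; no EXIF data is written.
    resized.save(buf, format="JPEG", quality=quality, optimize=True)
    return buf.getvalue()
```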

Is it possible to pytesseract a bytes image?

I am trying to crop an image with cv2 (converting it to bytes so that I do not need to save it) and afterwards run pytesseract on it.
This way I won't need to save the image twice during the process.
First when i create the image
When cropping the image
Process...
## CROPPING THE IMAGE REGION
ys, xs = np.nonzero(mask2)
ymin, ymax = ys.min(), ys.max()
xmin, xmax = xs.min(), xs.max()
croped = image[ymin:ymax, xmin:xmax]
pts = np.int32([[xmin, ymin],[xmin,ymax],[xmax,ymax],[xmax,ymin]])
cv2.drawContours(image, [pts], -1, (0,255,0), 1, cv2.LINE_AA)
#OPENCV IMAGE TO BYTES WITHOUT SAVING TO DISK
is_success, im_buf_arr = cv2.imencode(".jpg", croped)
byte_im = im_buf_arr.tobytes()
#PYTESSERACT IMAGE USING A BYTES FILE
Results = pytesseract.image_to_string(byte_im, lang="eng")
print(Results)
Unfortunately I get the error: Unsupported image object
Am I missing something? Is there a way to do this without needing to save the file when cropping? Any help is highly appreciated.
You have croped, which is a numpy array.
According to the pytesseract examples, you can simply do this:
# tesseract needs the right channel order
cropped_rgb = cv2.cvtColor(croped, cv2.COLOR_BGR2RGB)
# give the numpy array directly to pytesseract, no PIL or other acrobatics necessary
Results = pytesseract.image_to_string(cropped_rgb, lang="eng")
Alternatively, convert the numpy array to a PIL image first:
from PIL import Image
img_tesseract = Image.fromarray(croped)
Results = pytesseract.image_to_string(img_tesseract, lang="eng")
Or, if you only have the encoded bytes, decode them into a PIL image:
from PIL import Image
import io
def bytes_to_image(image_bytes):
    io_bytes = io.BytesIO(image_bytes)
    return Image.open(io_bytes)
pytesseract.image_to_data(bytes_to_image(byte_im), lang='eng')
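To see why the raw bytes are rejected, it may help to look at the round trip on its own. pytesseract accepts a PIL image, a numpy array, or a file path, but not an encoded byte string. The sketch below uses only numpy and Pillow (no cv2 or pytesseract call), and the array is a stand-in for the cropped region; array_to_png_bytes is a hypothetical helper name:

```python
import io
import numpy as np
from PIL import Image


def array_to_png_bytes(arr):
    """Encode an RGB uint8 array as PNG bytes without touching the disk."""
    buf = io.BytesIO()
    Image.fromarray(arr).save(buf, format="PNG")
    return buf.getvalue()


def bytes_to_image(image_bytes):
    """Decode encoded image bytes back into a PIL image."""
    return Image.open(io.BytesIO(image_bytes))


# Stand-in for a cropped region that is already in RGB channel order.
cropped_rgb = np.zeros((20, 30, 3), dtype=np.uint8)
encoded = array_to_png_bytes(cropped_rgb)   # bytes: not accepted by pytesseract
decoded = bytes_to_image(encoded)           # PIL image: accepted
```

Either decoded or cropped_rgb could then be passed to pytesseract.image_to_string(); encoded could not.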

Python add noise to image breaks PNG

I'm trying to create an image system in Python 3 to be used in a web app. The idea is to load an image from disk and add some random noise to it. When I try this, I get what looks like a totally random image that does not resemble the original:
import cv2
import numpy as np
from skimage.util import random_noise
from random import randint
from pathlib import Path
from PIL import Image
import io
image_files = [
    {
        'name': 'test1',
        'file': 'test1.png'
    },
    {
        'name': 'test2',
        'file': 'test2.png'
    }
]
def gen_image():
    rand_image = randint(0, len(image_files)-1)
    image_file = image_files[rand_image]['file']
    image_name = image_files[rand_image]['name']
    image_path = str(Path().absolute())+'/img/'+image_file
    img = cv2.imread(image_path)
    noise_img = random_noise(img, mode='s&p', amount=0.1)
    img = Image.fromarray(noise_img, 'RGB')
    fp = io.BytesIO()
    img.save(fp, format="PNG")
    content = fp.getvalue()
    return content
gen_image()
I have also tried using pypng:
import png
# Added the following to gen_image()
content = png.from_array(noise_img, mode='L;1')
content.save('image.png')
How can I load a PNG (with alpha transparency) from disk, add some noise to it, and return it so that it can be displayed by web server code (flask, aiohttp, etc.)?
As indicated in the answer by makayla, this makes it better: noise_img = (noise_img*255).astype(np.uint8) but the colors are still wrong and there's no transparency.
Here's the updated function for that:
def gen_image():
    rand_image = randint(0, len(image_files)-1)
    image_file = image_files[rand_image]['file']
    image_name = image_files[rand_image]['name']
    image_path = str(Path().absolute())+'/img/'+image_file
    img = cv2.imread(image_path)
    cv2.imshow('dst_rt', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # Problem exists somewhere below this line.
    img = random_noise(img, mode='s&p', amount=0.1)
    img = (img*255).astype(np.uint8)
    img = Image.fromarray(img, 'RGB')
    fp = io.BytesIO()
    img.save(fp, format="png")
    content = fp.getvalue()
    return content
This will popup a pre-noise image and return the noised image. RGB (And alpha) problem exists in returned image.
I think the problem is it needs to be RGBA but when I change to that, I get ValueError: buffer is not large enough
Given all the new information I am updating my answer with a few more tips for debugging the issue.
I found a site here which creates sample transparent images. I created a 64x64 cyan (R=0, G=255, B=255) image with a transparency layer of 0.5. I used this to test your code.
I read in the image two ways to compare: im1 = cv2.imread(fileName) and im2 = cv2.imread(fileName,cv2.IMREAD_UNCHANGED). np.shape(im1) returned (64,64,3) and np.shape(im2) returned (64,64,4). This is why that flag is required--the default imread settings in opencv will read in a transparent image as a normal RGB image.
However, OpenCV reads images in as BGR instead of RGB, and since you are not saving the image out with OpenCV, you'll need to convert it to the correct channel order; otherwise the image will have reversed colors. For example, my cyan image, when viewed with the channels reversed, appears yellow.
You can change this using openCV's color conversion function like this im = cv2.cvtColor(im, cv2.COLOR_BGRA2RGBA) (Here is a list of all the color conversion codes). Again, double check the size of your image if you need to, it should still have four channels since you converted it to RGBA.
You can now add your noise to your image. Just so you know, this is also going to add noise to your alpha channel as well, randomly making some pixels more transparent and others less transparent. The random_noise function from skimage converts your image to float and returns it as float. This means the image values, normally integers ranging from 0 to 255, are converted to decimal values from 0 to 1. Your line img = Image.fromarray(noise_img, 'RGB') does not know what to do with the floating point noise_img. That's why the image is all messed up when you save it, as well as when I tried to show it.
So I took my cyan image, added noise, and then converted the floats back to 8 bits.
noise_img = random_noise(im, mode='s&p', amount=0.1)
noise_img = (noise_img*255).astype(np.uint8)
img = Image.fromarray(noise_img, 'RGBA')
It now looks like this (screenshot) using img.show():
I used the PIL library to save out my image instead of openCV so it's as close to your code as possible.
fp = 'saved_im.png'
img.save(fp, format="png")
I loaded the image into PowerPoint and overlaid it on a red circle to double-check that the transparency was preserved when saving with this method.
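If the noise should not touch the alpha channel, one option is to apply it to the color channels only. The sketch below is not skimage's random_noise; it is a simple numpy-only salt-and-pepper substitute, and the function name add_salt_and_pepper_rgba and its parameters are made up for illustration. It works directly on a uint8 RGBA array, so no float conversion is needed:

```python
import numpy as np


def add_salt_and_pepper_rgba(rgba, amount=0.1, seed=None):
    """Add salt-and-pepper noise to the R, G, B channels of a uint8 RGBA array.

    The alpha channel is left unchanged. Hypothetical helper, not skimage.
    """
    rng = np.random.default_rng(seed)
    out = rgba.copy()
    height, width = rgba.shape[:2]
    # One random draw per pixel decides: pepper, salt, or unchanged.
    draw = rng.random((height, width))
    pepper = draw < amount / 2
    salt = (draw >= amount / 2) & (draw < amount)
    out[pepper, :3] = 0      # black pixels
    out[salt, :3] = 255      # white pixels
    return out
```

The result can be passed to Image.fromarray(noisy, 'RGBA') and saved as PNG in the same way as above.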

Extract an image from a PDF in python

I'm trying to extract images from a PDF using PyPDF2, but when my code extracts one, the image looks very different from how it should actually look. See the example below:
But this is how it should really look:
Here's the pdf I'm using:
https://www.hbp.com/resources/SAMPLE%20PDF.pdf
Here's my code:
import PyPDF2
pdf_filename = "SAMPLE.pdf"
pdf_file = open(pdf_filename, 'rb')
cond_scan_reader = PyPDF2.PdfFileReader(pdf_file)
page = cond_scan_reader.getPage(0)
xObject = page['/Resources']['/XObject'].getObject()
i = 0
for obj in xObject:
    # print(xObject[obj])
    if xObject[obj]['/Subtype'] == '/Image':
        if xObject[obj]['/Filter'] == '/DCTDecode':
            data = xObject[obj]._data
            img = open("{}".format(i) + ".jpg", "wb")
            img.write(data)
            img.close()
            i += 1
And since I need to keep the image in its colour mode, I can't just convert it to RGB if it was CMYK, because I need that information.
Also, I'm trying to get dpi from images I get from a pdf, is that information always stored in the image?
Thanks in advance
I used pdfreader to extract the image from your example.
The image uses ICCBased colorspace with the value of N=4 and Intent value of RelativeColorimetric. This means that the "closest" PDF colorspace is DeviceCMYK.
All you need is to convert the image to RGB and invert the colors.
Here is the code:
from pdfreader import SimplePDFViewer
import PIL.ImageOps
fd = open("SAMPLE PDF.pdf", "rb")
viewer = SimplePDFViewer(fd)
viewer.render()
img = viewer.canvas.images['Im0']
# this displays ICCBased 4 RelativeColorimetric
print(img.ColorSpace[0], img.ColorSpace[1].N, img.Intent)
pil_image = img.to_Pillow()
pil_image = pil_image.convert("RGB")
inverted = PIL.ImageOps.invert(pil_image)
inverted.save("sample.png")
Read more on PDF objects: Image (sec. 8.9.5), InlineImage (sec. 8.9.7)
Hope this works: you probably need to use another library such as Pillow.
Here is an example:
from PIL import Image
image = Image.open("path_to_image")
if image.mode == 'CMYK':
    image = image.convert('RGB')
# PIL images are written with save(); there is no Image.write() method.
image.save("path_to_image.jpg")
Reference: Convert from CMYK to RGB

Python Image Library Image Resolution when Resizing

I am trying to shrink some JPEG images from 24 x 36 inches to 11 x 16.5 inches using the Python Imaging Library. Since PIL deals in pixels, this should mean scaling from 7200 x 4800 pixels to 3300 x 2200 pixels, with my resolution set at 200 pixels/inch. However, when I run my script, PIL changes the resolution to 72 pixels/inch and I end up with a larger image than I had before.
from PIL import Image
im = Image.open("image.jpg")
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
if im.size == (7200, 4800):
    out = im.resize((3300, 2200), Image.LANCZOS)
elif im.size == (4800, 7200):
    out = im.resize((2200, 3300), Image.LANCZOS)
out.show()
Is there a way to maintain my image resolution when I'm resizing my images?
Thanks for any help!
To preserve the DPI, you need to specify it when saving; the info attribute is not always preserved across image manipulations:
dpi = im.info['dpi'] # Warning, throws KeyError if no DPI was set to begin with
# resize, etc.
out.save("out.jpg", dpi=dpi)
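Put together, and avoiding the KeyError mentioned in the comment, this could look roughly like the sketch below. The function name resize_keep_dpi and the 72 DPI fallback are assumptions for illustration; the result is written to an in-memory buffer so it is easy to check:

```python
import io
from PIL import Image


def resize_keep_dpi(im, size, default_dpi=(72, 72)):
    """Resize an image and re-encode it as JPEG with the original DPI.

    Hypothetical helper; falls back to default_dpi if none was recorded.
    """
    # Read the DPI before resizing, since info may not survive manipulation.
    dpi = im.info.get("dpi", default_dpi)
    out = im.resize(size, Image.LANCZOS)
    buf = io.BytesIO()
    out.save(buf, format="JPEG", dpi=dpi)
    return buf.getvalue()
```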
