Convert JPG image to HEIC in Python [duplicate] - python
The High Efficiency Image File Format (HEIF) is the default when AirDropping an image from an iPhone to an OSX device. I want to edit and modify these .HEIC files with Python.
I could modify phone settings to save as JPG by default but that doesn't really solve the problem of being able to work with the filetype from others. I still want to be able to process HEIC files for doing file conversion, extracting metadata, etc. (Example Use Case -- Geocoding)
Pillow
Here is the result of working with Python 3.7 and Pillow when trying to read a file of this type.
$ ipython
Python 3.7.0 (default, Oct 2 2018, 09:20:07)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from PIL import Image
In [2]: img = Image.open('IMG_2292.HEIC')
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-2-fe47106ce80b> in <module>
----> 1 img = Image.open('IMG_2292.HEIC')
~/.env/py3/lib/python3.7/site-packages/PIL/Image.py in open(fp, mode)
2685 warnings.warn(message)
2686 raise IOError("cannot identify image file %r"
-> 2687 % (filename if filename else fp))
2688
2689 #
OSError: cannot identify image file 'IMG_2292.HEIC'
It looks like support in python-pillow was requested (#2806) but there are licensing / patent issues preventing it there.
ImageMagick + Wand
It appears that ImageMagick may be an option. After doing a brew install imagemagick and pip install wand however I was unsuccessful.
$ ipython
Python 3.7.0 (default, Oct 2 2018, 09:20:07)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from wand.image import Image
In [2]: with Image(filename='img.jpg') as img:
...: print(img.size)
...:
(4032, 3024)
In [3]: with Image(filename='img.HEIC') as img:
...: print(img.size)
...:
---------------------------------------------------------------------------
MissingDelegateError Traceback (most recent call last)
<ipython-input-3-9d6f58c40f95> in <module>
----> 1 with Image(filename='ces2.HEIC') as img:
2 print(img.size)
3
~/.env/py3/lib/python3.7/site-packages/wand/image.py in __init__(self, image, blob, file, filename, format, width, height, depth, background, resolution, pseudo)
4603 self.read(blob=blob, resolution=resolution)
4604 elif filename is not None:
-> 4605 self.read(filename=filename, resolution=resolution)
4606 # clear the wand format, otherwise any subsequent call to
4607 # MagickGetImageBlob will silently change the image to this
~/.env/py3/lib/python3.7/site-packages/wand/image.py in read(self, file, filename, blob, resolution)
4894 r = library.MagickReadImage(self.wand, filename)
4895 if not r:
-> 4896 self.raise_exception()
4897
4898 def save(self, file=None, filename=None):
~/.env/py3/lib/python3.7/site-packages/wand/resource.py in raise_exception(self, stacklevel)
220 warnings.warn(e, stacklevel=stacklevel + 1)
221 elif isinstance(e, Exception):
--> 222 raise e
223
224 def __enter__(self):
MissingDelegateError: no decode delegate for this image format `HEIC' # error/constitute.c/ReadImage/556
Any other alternatives available to do a conversion programmatically?
Consider using PIL in conjunction with pillow-heif:
pip3 install pillow-heif
from PIL import Image
from pillow_heif import register_heif_opener
register_heif_opener()
image = Image.open('image.heic')
That said, I'm not aware of any licensing/patent issues that would prevent HEIF support in Pillow (see this or this). libheif is widely adopted and free to use, provided you do not bundle the HEIF decoder with a device and fulfill the requirements of the LGPLv3 license.
Check out this library: it is a Python 3 wrapper around the libheif library and should serve your purpose of file conversion and metadata extraction:
https://github.com/david-poirier-csn/pyheif
https://pypi.org/project/pyheif/
Example usage:
import io
import whatimage
import pyheif
from PIL import Image

def decodeImage(bytesIo):
    fmt = whatimage.identify_image(bytesIo)
    if fmt in ['heic', 'avif']:
        i = pyheif.read_heif(bytesIo)
        # Extract metadata etc
        for metadata in i.metadata or []:
            if metadata['type']=='Exif':
                pass  # do whatever
        # Convert to other file format like jpeg
        s = io.BytesIO()
        pi = Image.frombytes(
            mode=i.mode, size=i.size, data=i.data)
        pi.save(s, format="jpeg")
        ...
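For context on what the fmt in ['heic', 'avif'] check is deciding: HEIF-family files are ISO base media files, so they begin with an ftyp box whose major brand names the flavour. The following is a minimal standard-library sketch of that idea, not the whatimage implementation. The function name sniff_heif_brand and the brand sets are illustrative assumptions, and the sets are not exhaustive.

```python
from typing import Optional

# Brands commonly seen in HEIC/HEIF and AVIF files (illustrative, not exhaustive).
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"mif1", b"msf1"}
AVIF_BRANDS = {b"avif", b"avis"}


def sniff_heif_brand(data: bytes) -> Optional[str]:
    """Return 'heic', 'avif', or None based on the leading ftyp box.

    Hypothetical helper: it only inspects the major brand in bytes 8..12
    and ignores the compatible-brands list that follows it.
    """
    if len(data) < 12 or data[4:8] != b"ftyp":
        return None
    brand = data[8:12]
    if brand in HEIF_BRANDS:
        return "heic"
    if brand in AVIF_BRANDS:
        return "avif"
    return None
```

Reading the first few dozen bytes of a file and passing them to this helper is enough; the whole image does not have to be loaded just to decide whether it is worth handing to a HEIF decoder.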
I was quite successful with the Wand package:
Install Wand:
https://docs.wand-py.org/en/0.6.4/
Code for conversion:
from wand.image import Image
import os

SourceFolder="K:/HeicFolder"
TargetFolder="K:/JpgFolder"

for file in os.listdir(SourceFolder):
    SourceFile=SourceFolder + "/" + file
    TargetFile=TargetFolder + "/" + file.replace(".HEIC",".JPG")
    img=Image(filename=SourceFile)
    img.format='jpg'
    img.save(filename=TargetFile)
    img.close()
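One detail of the loop above is that file.replace(".HEIC", ".JPG") is case-sensitive and every directory entry is processed, whatever its type. A minimal sketch of how the source and target paths could be planned with pathlib follows; plan_conversions is a hypothetical helper name, and it deliberately leaves the conversion itself to whichever library is in use.

```python
from pathlib import Path
from typing import List, Tuple


def plan_conversions(source_dir: str, target_dir: str) -> List[Tuple[Path, Path]]:
    """Pair every HEIC file in source_dir with a .jpg path in target_dir.

    Hypothetical helper: it only builds paths; the actual conversion is
    left to whichever library is in use (Wand, pillow-heif, pyheif, ...).
    """
    pairs = []
    for src in sorted(Path(source_dir).iterdir()):
        # Compare the suffix case-insensitively so .heic and .HEIC both match.
        if src.is_file() and src.suffix.lower() == ".heic":
            pairs.append((src, Path(target_dir) / (src.stem + ".jpg")))
    return pairs
```

Each (source, target) pair can then be fed to the conversion call, for example Image(filename=str(source)) followed by img.save(filename=str(target)) in the Wand version.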
Here is another solution to convert HEIC to JPG while keeping the metadata intact. It is based on mara004's solution above; however, I was not able to extract the image's timestamp that way, so I had to add some code. Put the HEIC files in dir_of_interest before applying the function:
import os
from PIL import Image, ExifTags
from pillow_heif import register_heif_opener
from datetime import datetime
import piexif
import re

register_heif_opener()

def convert_heic_to_jpeg(dir_of_interest):
    filenames = os.listdir(dir_of_interest)
    filenames_matched = [re.search(r"\.HEIC$|\.heic$", filename) for filename in filenames]
    # Extract files of interest
    HEIC_files = []
    for index, filename in enumerate(filenames_matched):
        if filename:
            HEIC_files.append(filenames[index])
    # Convert files to jpg while keeping the timestamp
    for filename in HEIC_files:
        image = Image.open(dir_of_interest + "/" + filename)
        image_exif = image.getexif()
        if image_exif:
            # Make a map with tag names and grab the datetime
            exif = { ExifTags.TAGS[k]: v for k, v in image_exif.items() if k in ExifTags.TAGS and type(v) is not bytes }
            date = datetime.strptime(exif['DateTime'], '%Y:%m:%d %H:%M:%S')
            # Load exif data via piexif
            exif_dict = piexif.load(image.info["exif"])
            # Update exif data with orientation and datetime
            exif_dict["0th"][piexif.ImageIFD.DateTime] = date.strftime("%Y:%m:%d %H:%M:%S")
            exif_dict["0th"][piexif.ImageIFD.Orientation] = 1
            exif_bytes = piexif.dump(exif_dict)
            # Save image as jpeg
            image.save(dir_of_interest + "/" + os.path.splitext(filename)[0] + ".jpg", "jpeg", exif= exif_bytes)
        else:
            print(f"Unable to get exif data for {filename}")
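The timestamp handling above relies on the EXIF DateTime layout 'YYYY:MM:DD HH:MM:SS' and raises if the tag is missing or malformed. As a hedged sketch, a more forgiving parser could look like the following; parse_exif_datetime and EXIF_DATETIME_FORMAT are hypothetical names, and stripping padding characters is a defensive assumption rather than something the answer above requires.

```python
from datetime import datetime
from typing import Optional

EXIF_DATETIME_FORMAT = "%Y:%m:%d %H:%M:%S"


def parse_exif_datetime(value: Optional[str]) -> Optional[datetime]:
    """Parse an EXIF DateTime string, returning None if missing or malformed."""
    if not value:
        return None
    try:
        # Defensive: strip any padding NULs or spaces before parsing.
        return datetime.strptime(value.strip("\x00 "), EXIF_DATETIME_FORMAT)
    except ValueError:
        return None
```

With this, parse_exif_datetime(exif.get('DateTime')) returns None instead of raising, so a file without a usable timestamp can still be converted and the DateTime update skipped.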
Adding to the answer by danial, I just had to modify the byte array slightly to get a valid data stream for further work. The first 6 bytes are 'Exif\x00\x00'; dropping these gives you a raw format that you can pipe into any image processing tool.
import io
import pyheif
import PIL.Image
import exifread

def read_heic(path: str):
    with open(path, 'rb') as file:
        image = pyheif.read_heif(file)
    for metadata in image.metadata or []:
        if metadata['type'] == 'Exif':
            fstream = io.BytesIO(metadata['data'][6:])
            # now just convert to jpeg
            pi = PIL.Image.open(fstream)
            pi.save("file.jpg", "JPEG")
            # or do EXIF processing with exifread
            tags = exifread.process_file(fstream)
At least this worked for me.
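Slicing with [6:] assumes the marker is always there. A small standard-library sketch that drops the prefix only when it is actually present, and checks that what remains starts like a TIFF header, might look like this; strip_exif_prefix and looks_like_tiff are hypothetical helper names, not part of pyheif or exifread.

```python
EXIF_PREFIX = b"Exif\x00\x00"


def strip_exif_prefix(data: bytes) -> bytes:
    """Drop the leading 'Exif\\x00\\x00' marker if it is present."""
    if data.startswith(EXIF_PREFIX):
        return data[len(EXIF_PREFIX):]
    return data


def looks_like_tiff(data: bytes) -> bool:
    """True if data starts with a TIFF byte-order header ('II*\\x00' or 'MM\\x00*')."""
    return data[:4] in (b"II*\x00", b"MM\x00*")
```

In the function above this would replace metadata['data'][6:] with strip_exif_prefix(metadata['data']), and looks_like_tiff gives a cheap sanity check before the bytes are handed to another tool.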
You can use the pillow_heif library to read HEIF images in a way compatible with PIL.
The example below will import a HEIF picture and save it in png format.
from PIL import Image
import pillow_heif

heif_file = pillow_heif.read_heif("HEIC_file.HEIC")
image = Image.frombytes(
    heif_file.mode,
    heif_file.size,
    heif_file.data,
    "raw",
)
image.save("./picture_name.png", format="png")
This will get the EXIF data from the HEIC file:
import pyheif
import exifread
import io

heif_file = pyheif.read_heif("file.heic")
for metadata in heif_file.metadata:
    if metadata['type'] == 'Exif':
        fstream = io.BytesIO(metadata['data'][6:])
        exifdata = exifread.process_file(fstream,details=False)
        # example to get device model from heic file
        model = str(exifdata.get("Image Model"))
        print(model)
Example of working with HDR (10- or 12-bit) HEIF files using OpenCV and pillow-heif:
import numpy as np
import cv2
import pillow_heif
heif_file = pillow_heif.open_heif("images/rgb12.heif", convert_hdr_to_8bit=False)
heif_file.convert_to("BGRA;16" if heif_file.has_alpha else "BGR;16")
np_array = np.asarray(heif_file)
cv2.imwrite("rgb16.png", np_array)
The input file for this example can be a 10- or 12-bit file.
I am facing the exact same problem as you, wanting a CLI solution. Doing some further research, it seems ImageMagick requires the libheif delegate library. The libheif library itself seems to have some dependencies as well.
I have not had success in getting any of those to work as well, but will continue trying. I suggest you check if those dependencies are available to your configuration.
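One way to check whether a given ImageMagick build has the HEIC delegate is to look at its format listing. The sketch below is an assumption-laden illustration: it assumes the listing printed by the -list format option has rows of the form "FORMAT[*] MODULE MODE description", and the function names are hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def heic_mode_from_format_list(listing: str) -> Optional[str]:
    """Return the mode column (for example 'rw+') for HEIC, or None if not listed.

    Assumes each row looks like '<FORMAT>[*] <MODULE> <MODE> <description>'.
    """
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].rstrip("*") == "HEIC":
            return fields[2]
    return None


def imagemagick_heic_mode() -> Optional[str]:
    """Ask the installed ImageMagick which formats it supports."""
    # ImageMagick 7 ships `magick`; ImageMagick 6 ships `convert`.
    # Caveat: on Windows, `convert` may resolve to an unrelated system tool.
    exe = shutil.which("magick") or shutil.which("convert")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "-list", "format"], capture_output=True, text=True, check=False
    )
    return heic_mode_from_format_list(result.stdout)
```

If imagemagick_heic_mode() returns None, the build most likely has no HEIC support and Wand will keep raising MissingDelegateError; a mode starting with 'r' suggests reading should work.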
A simple solution, after going over multiple responses from people.
Please install the whatimage, pyheif and PIL libraries before running this code.
[NOTE]: I used this command to install Pillow.
python3 -m pip install Pillow
Also, installing all these libraries was a lot easier on Linux. I recommend WSL for Windows.
Code:
import whatimage
import pyheif
from PIL import Image
import os

def decodeImage(bytesIo, index):
    with open(bytesIo, 'rb') as f:
        data = f.read()
    fmt = whatimage.identify_image(data)
    if fmt in ['heic', 'avif']:
        i = pyheif.read_heif(data)
        pi = Image.frombytes(mode=i.mode, size=i.size, data=i.data)
        pi.save("new" + str(index) + ".jpg", format="jpeg")

# For my use I had my python file inside the same folder as the heic files
source = "./"
for index,file in enumerate(os.listdir(source)):
    decodeImage(file, index)
It looks like there is a solution called heic-to-jpg, but I am not sure how it would work in Colab.
The first answer works, but since it just calls save with a BytesIO object as the argument, it doesn't actually write the new JPEG file to disk. If you create a file object with open (in binary mode) and pass that instead, it saves to that file, i.e.:
import whatimage
import pyheif
from PIL import Image

def decodeImage(bytesIo):
    fmt = whatimage.identify_image(bytesIo)
    if fmt in ['heic', 'avif']:
        i = pyheif.read_heif(bytesIo)
        # Convert to other file format like jpeg
        pi = Image.frombytes(
            mode=i.mode, size=i.size, data=i.data)
        # Open in binary mode: JPEG data is bytes, not text
        with open('my-new-image.jpg', mode='wb') as s:
            pi.save(s, format="jpeg")
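To make the BytesIO point concrete: saving into a BytesIO keeps the encoded image in memory, which is useful for uploads or further processing, but nothing reaches the disk until the buffer is written out. The sketch below uses only the standard library; write_buffer_to_file is a hypothetical helper name.

```python
import io
from pathlib import Path


def write_buffer_to_file(buffer: io.BytesIO, path: str) -> int:
    """Write the full contents of an in-memory buffer to path in binary mode.

    Returns the number of bytes written. getvalue() returns everything in
    the buffer regardless of the current stream position.
    """
    data = buffer.getvalue()
    Path(path).write_bytes(data)
    return len(data)
```

After pi.save(s, format="jpeg") with s = io.BytesIO(), calling write_buffer_to_file(s, "my-new-image.jpg") persists the same bytes. Pillow's save also accepts a filename directly, which is the simplest option when the in-memory copy is not needed.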
I use the pillow_heif library. For example, I use this script when I have a folder of HEIF files I want to convert to PNG.
from PIL import Image
import pillow_heif
import os
from tqdm import tqdm
import argparse

def get_images(heic_folder):
    # Get all the heic images in the folder
    imgs = [os.path.join(heic_folder, f) for f in os.listdir(heic_folder) if f.endswith('.HEIC')]
    # Name of the folder where the png files will be stored
    png_folder = heic_folder + "_png"
    # If it doesn't exist, create the folder
    if not os.path.exists(png_folder):
        os.mkdir(png_folder)
    for img in tqdm(imgs):
        heif_file = pillow_heif.read_heif(img)
        image = Image.frombytes(
            heif_file.mode,
            heif_file.size,
            heif_file.data,
            "raw",
        )
        image.save(os.path.join(png_folder, os.path.basename(img).split('.')[0]) + '.png', format="png")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Convert heic images to png')
    parser.add_argument('heic_folder', type=str, help='Folder with heic images')
    args = parser.parse_args()
    get_images(args.heic_folder)
Related
Open webp file with Pillow
On an Anaconda set up, I want to run the following simple code:
from PIL import Image
img = Image.open('test.webp')
This works on my Linux machine, but on Windows:
UserWarning: image file could not be identified because WEBP support not installed
warnings.warn(message)
---------------------------------------------------------------------------
UnidentifiedImageError Traceback (most recent call last)
<ipython-input-3-79ee787a81b3> in <module>
----> 1 img = Image.open('test.webp')
~\miniconda3\lib\site-packages\PIL\Image.py in open(fp, mode, formats)
3028 for message in accept_warnings:
3029 warnings.warn(message)
-> 3030 raise UnidentifiedImageError(
3031 "cannot identify image file %r" % (filename if filename else fp)
3032 )
UnidentifiedImageError: cannot identify image file 'test.webp'
I do have the libwebp package installed, and libwebp.dll is present in the Library\bin directory of my Anaconda set up. Any idea?
Looks like no solution is in sight. Assuming dwebp from the libwebp library is available in the system, this works for me to generate a Pillow Image object from a WEBP file:
import os
import subprocess
from PIL import Image, UnidentifiedImageError
from io import BytesIO

class dwebpException(Exception):
    pass

def dwebp(file: str):
    webp = subprocess.run(
        f"dwebp {file} -quiet -o -", shell=True, capture_output=True
    )
    if webp.returncode != 0:
        raise dwebpException(webp.stderr.decode())
    else:
        return Image.open(BytesIO(webp.stdout))

filename = 'test.webp'
try:
    img = Image.open(filename)
except UnidentifiedImageError:
    if os.path.splitext(filename)[1].lower() == ".webp":
        img = dwebp(filename)
    else:
        raise
The question is whether Anaconda actually builds Pillow with webp support. Looking on the anaconda website I couldn't determine this. However, conda-forge does build Pillow with webp support, see here. So you might want to consider using that.
How can i pass image itself (np.array) not path of it to zxing library for decode pdf417
Code:
import zxing
from PIL import Image

reader = zxing.BarCodeReader()
path = 'C:/Users/UI UX/Desktop/Uasa.png'
im = Image.open(path)
barcode = reader.decode(path)
print(barcode)
The code above works fine and returns this result: BarCode(raw='P<E....
I need to use this code instead:
import zxing
import cv2

reader = zxing.BarCodeReader()
path = 'C:/Users/UI UX/Desktop/Uasa.png'
img = cv2.imread (path)
cv2.imshow('img', img)
cv2.waitKey(0)
barcode = reader.decode(img)
print(barcode)
but this code returns an error:
TypeError: expected str, bytes or os.PathLike object, not numpy.ndarray
In another program I have the image as base64; could that help somewhere here? Can anybody help me with this?
ZXing does not support passing an image directly as it is using an external application to process the barcode image. If you're not locked into using the ZXing library for decoding PDF417 barcodes you can take a look at the PyPI package pdf417decoder. If you're starting with a Numpy array like in your example then you have to convert it to a PIL image first.
import cv2
from pdf417decoder import PDF417Decoder
from PIL import Image

npimg = cv2.imread (path)
cv2.imshow('img', npimg)
cv2.waitKey(0)
img = Image.fromarray(npimg)
decoder = PDF417Decoder(img)
if (decoder.decode() > 0):
    print(decoder.barcode_data_index_to_string(0))
else:
    print("Failed to decode barcode.")
You cannot. if you look at the source code you will see that what it does is call a java app with the provided path (Specifically com.google.zxing.client.j2se.CommandLineRunner). If you need to pre-process your image then you will have to save it somewhere and pass the path to it to your library
I fixed this by:
path = os.getcwd()
# print(path)
writeStatus = cv2.imwrite(os.path.join(path, 'test.jpg'), pdf_image)
if writeStatus is True:
    print("image written")
else:
    print("problem")  # or raise exception, handle problem, etc.
sss = (os.path.join(path, 'test.jpg'))
# print(sss)
pp = sss.replace('\\', '/')
# print(pp)
reader = zxing.BarCodeReader()
barcode = reader.decode(pp)
The zxing package is not recommended. It is just a command line tool to invoke Java ZXing libraries. You should use zxing-cpp, which is a Python module built with ZXing C++ code. Here is the sample code:
import cv2
import zxingcpp

img = cv2.imread('myimage.png')
results = zxingcpp.read_barcodes(img)
for result in results:
    print("Found barcode:\n Text: '{}'\n Format: {}\n Position: {}"
          .format(result.text, result.format, result.position))
if len(results) == 0:
    print("Could not find any barcode.")
Extracting text from scanned PDF without saving the scan as a new file image
I would like to extract text from scanned PDFs. My "test" code is as follows:
from pdf2image import convert_from_path
from pytesseract import image_to_string
from PIL import Image

converted_scan = convert_from_path('test.pdf', 500)
for i in converted_scan:
    i.save('scan_image.png', 'png')
text = image_to_string(Image.open('scan_image.png'))
with open('scan_text_output.txt', 'w') as outfile:
    outfile.write(text.replace('\n\n', '\n'))
I would like to know if there is a way to extract the content of the image directly from the object converted_scan, without saving the scan as a new "physical" image file on the disk? Basically, I would like to skip this part:
for i in converted_scan:
    i.save('scan_image.png', 'png')
I have a few thousand scans to extract text from. Although the generated image files are not particularly heavy, the overhead is not negligible and I find it a bit overkill.
EDIT
Here's a slightly different, more compact approach than Colonder's answer, based on this post. For .pdf files with many pages, it might be worth adding a progress bar to each loop using e.g. the tqdm module.
from wand.image import Image as w_img
from PIL import Image as p_img
import pyocr.builders
import regex, pyocr, io

infile = 'my_file.pdf'
tools = pyocr.get_available_tools()
tool = tools[0]
req_image = []
txt = ''

# to convert pdf to img and extract text
with w_img(filename = infile, resolution = 200) as scan:
    image_png = scan.convert('png')
    for i in image_png.sequence:
        img_page = w_img(image = i)
        req_image.append(img_page.make_blob('png'))

for i in req_image:
    content = tool.image_to_string(
        p_img.open(io.BytesIO(i)),
        lang = tool.get_available_languages()[0],
        builder = pyocr.builders.TextBuilder()
    )
    txt += content

# to save the output as a .txt file
with open(infile[:-4] + '.txt', 'w') as outfile:
    full_txt = regex.sub(r'\n+', '\n', txt)
    outfile.write(full_txt)
UPDATE MAY 2021
I realized that although pdf2image is simply calling a subprocess, one doesn't have to save images to subsequently OCR them. What you can do is simply (you can use pytesseract as the OCR library as well):
from pdf2image import convert_from_path

for img in convert_from_path("some_pdf.pdf", 300):
    txt = tool.image_to_string(img, lang=lang, builder=pyocr.builders.TextBuilder())
EDIT: you can also try the pdftotext library.
pdf2image is a simple wrapper around pdftoppm and pdftocairo. Internally it does nothing more than call subprocess. This script should do what you want, but you need the wand library as well as pyocr (I think this is a matter of preference, so feel free to use any library for text extraction you want).
from PIL import Image as Pimage, ImageDraw
from wand.image import Image as Wimage
import sys
import numpy as np
from io import BytesIO
import pyocr
import pyocr.builders

def _convert_pdf2jpg(in_file_path: str, resolution: int=300) -> Pimage:
    """
    Convert PDF file to JPG
    :param in_file_path: path of pdf file to convert
    :param resolution: resolution with which to read the PDF file
    :return: PIL Image
    """
    with Wimage(filename=in_file_path, resolution=resolution).convert("jpg") as all_pages:
        for page in all_pages.sequence:
            with Wimage(page) as single_page_image:
                # transform wand image to bytes in order to transform it into PIL image
                yield Pimage.open(BytesIO(bytearray(single_page_image.make_blob(format="jpeg"))))

tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
# The tools are returned in the recommended order of usage
tool = tools[0]
print("Will use tool '%s'" % (tool.get_name()))
# Ex: Will use tool 'libtesseract'
langs = tool.get_available_languages()
print("Available languages: %s" % ", ".join(langs))
lang = langs[0]
print("Will use lang '%s'" % (lang))
# Ex: Will use lang 'fra'
# Note that languages are NOT sorted in any way. Please refer
# to the system locale settings for the default language
# to use.
for img in _convert_pdf2jpg("some_pdf.pdf"):
    txt = tool.image_to_string(img, lang=lang, builder=pyocr.builders.TextBuilder())
How to get letters from an image using python
I want to capture the letters (characters and numbers) from an image using Python. Please help me: how can I do it? Please explain with some sample code.
I hope this will help you out if your image is clear (with little noise). Use the "PyTesser" project from Google in this case. PyTesser is an Optical Character Recognition module for Python. It takes as input an image or image file and outputs a string. You can get PyTesser from this link. Here's an example:
>>> from pytesser import *
>>> image = Image.open('fnord.tif') # Open image object using PIL
>>> print image_to_string(image) # Run tesseract.exe on image
fnord
>>> print image_file_to_string('fnord.tif')
fnord
I use tesseract for this. There is also a Python library for it: https://code.google.com/p/python-tesseract/
Example from the main page:
import tesseract
api = tesseract.TessBaseAPI()
api.Init(".","eng",tesseract.OEM_DEFAULT)
api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
api.SetPageSegMode(tesseract.PSM_AUTO)
mImgFile = "eurotext.jpg"
mBuffer=open(mImgFile,"rb").read()
result = tesseract.ProcessPagesBuffer(mBuffer,len(mBuffer),api)
print "result(ProcessPagesBuffer)=",result
Here is my code for Python3 not using the tesseract library but the .exe file:
import os
import subprocess
import tempfile

def tesser_exe():
    path = os.path.join(os.environ['Programfiles'], 'Tesseract-OCR', 'tesseract.exe')
    if not os.path.exists(path):
        raise NotImplementedError('You must first install tesseract from https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-setup-3.02.02.exe&can=2&q=')
    return path

def text_from_image_file(image_name):
    assert image_name.lower().endswith('.bmp')
    output_name = tempfile.mktemp()
    exe_file = tesser_exe() # path to the tesseract.exe file from
    return_code = subprocess.call([exe_file, image_name, output_name, '-psm', '7'])
    if return_code != 0:
        raise NotImplementedError('error handling not implemented')
    return open(output_name + '.txt', encoding = 'utf8').read()