Pillow's basic Image.resize function doesn't appear to have any options for sRGB-aware filtering. Is there a way to do sRGB-aware resizing in Pillow?
I could do it manually by converting the image to float and applying the SRGB transforms myself...but I'm hoping there's a built-in way.
I ended up implementing sRGB-aware resize myself using the following routine. It takes an 8-bit RGB image, a target size, and a resampling filter.
from PIL import Image
import numpy as np

def SRGBResize(im, size, filter):
    # Convert to a numpy array of floats in [0, 1]
    arr = np.array(im, dtype=np.float32) / 255.0
    # Convert sRGB -> linear
    arr = np.where(arr <= 0.04045, arr / 12.92, ((arr + 0.055) / 1.055) ** 2.4)
    # Resize each channel with PIL (a float array becomes a mode 'F' image)
    arrOut = np.zeros((size[1], size[0], arr.shape[2]))
    for i in range(arr.shape[2]):
        chan = Image.fromarray(arr[:, :, i])
        chan = chan.resize(size, filter)
        arrOut[:, :, i] = np.array(chan).clip(0.0, 1.0)
    # Convert linear -> sRGB
    arrOut = np.where(arrOut <= 0.0031308, 12.92 * arrOut, 1.055 * arrOut ** (1.0 / 2.4) - 0.055)
    # Convert to 8-bit
    arrOut = np.uint8(np.rint(arrOut * 255.0))
    # Convert back to a PIL image
    return Image.fromarray(arrOut)
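A usage sketch (the file names and the choice of Image.LANCZOS are mine, not part of the original routine):

im = Image.open("input.png").convert("RGB")
out = SRGBResize(im, (439, 371), Image.LANCZOS)
out.save("output.png")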
After a lot of reading and trial and error I have stumbled upon a good solution. It assumes an sRGB image, converts it to linear colour space to do the resizing, then converts back to sRGB.
There is a slight downside: a colour depth of 8 bits per pixel is used even while the image is in its linear form, which results in a loss of variance in darker regions. Reading this issue post, it seems there is unfortunately no way to convert to a higher depth using Pillow.
from PIL import Image
from PIL.ImageCms import profileToProfile
SRGB_PROFILE = 'sRGB.icc'
LINEARIZED_PROFILE = 'linearized-sRGB.icc'
im = Image.open(IN_PATH)
im = profileToProfile(im, SRGB_PROFILE, LINEARIZED_PROFILE)
im = im.resize((WIDTH, HEIGHT), Image.LANCZOS)  # named Image.ANTIALIAS in older Pillow
im = profileToProfile(im, LINEARIZED_PROFILE, SRGB_PROFILE)
im.save(OUT_PATH)
You'll need a linearised ICC colour profile, as Pillow/lcms can't do the conversion without one. You can get one from this issue post; the author mentions in the file "no copyright, use freely". You'll also need an sRGB profile, which should be easily obtainable from your OS or online.
Much of the processing time is taken up computing the transformations from sRGB and back again. If you are going to be doing a lot of these operations you can store these transformations to re-use them like so:
from PIL.ImageCms import buildTransform, applyTransform
SRGB_TO_LINEARIZED = buildTransform(SRGB_PROFILE, LINEARIZED_PROFILE, 'RGB', 'RGB')
LINEARIZED_TO_SRGB = buildTransform(LINEARIZED_PROFILE, SRGB_PROFILE, 'RGB', 'RGB')
im = applyTransform(im, SRGB_TO_LINEARIZED)
im = im.resize((WIDTH, HEIGHT), Image.LANCZOS)  # named Image.ANTIALIAS in older Pillow
im = applyTransform(im, LINEARIZED_TO_SRGB)
I hope this helps and I'd be interested to hear if anyone has any ideas on resolving the 8 bit colour space issue.
99% of image resize implementations will not get sRGB right (and sRGB is, unfortunately, 99.9% of image material), and those that do usually do it right by default and give you the option to opt out of gamma de/encoding.
[opinionated mode on, read with care]
In other words, if there is no option you likely have to add the code yourself, or just use pamscale. If a library doesn't get sRGB right, it will have other flaws anyway.
[opinionated mode off]
You could de/encode yourself as discussed in
http://www.imagemagick.org/discourse-server/viewtopic.php?t=15955
but from a quick glance it seems Pillow is not capable of doing that trick.
I am doing denoising work and I'm not very familiar with Python. I applied BM3D to get the denoised picture, and I also have the original one.
Now I want to get the noise by doing this:
tmp = img - img_denoised
But it turns out to be a very strange black-and-white figure like this:
So how can I get a proper noise picture? What I wish to get is an image like this:
Edit:
Got an image from the Internet and did the same processing.
After processing:
Edit again:
Providing a simple example:
import cv2
img = cv2.imread("path of the original image")
img_denoised = cv2.imread("path of the denoised image")
tmp = img - img_denoised
cv2.imwrite("test_noise.jpg",tmp)
In your example code, both img and img_denoised are uint8 NumPy arrays. Operations on these arrays produce output of the same type, and they are performed modulo 256: a result that exceeds 255 wraps around to start again from 0, and a negative result wraps around to count down from 255. For example:
np.array([5], np.uint8) - np.array([10], np.uint8)
returns array([251], dtype=uint8). Instead of -5, which cannot be represented in a uint8 value, we get 256 - 5 = 251.
The subtraction img - img_denoised results in some values just above zero, which look black, and some values just below zero, which will be stored as values near 255 and look white.
We can solve this in different ways. One is to force the operation to happen with floating-point values:
tmp = img.astype(float) - img_denoised.astype(float)
We now have an array of floats, about half of them negative. But a JPEG file can only store uint8 values, and casting our float values to uint8 will get us back where we started. So we need to shift the origin (the zero value) to a middle-gray (typically 128):
import numpy as np

tmp = img.astype(float) - img_denoised.astype(float)
tmp += 128
# Clip before casting so out-of-range values don't wrap around
cv2.imwrite("test_noise.jpg", np.clip(tmp, 0, 255).astype(np.uint8))
This is very fiddly, but it works. I prefer using a library that takes care of data types for me, so I don't have to think about them when I don't want to. DIPlib is such a library (disclaimer: I'm an author):
import diplib as dip
img = dip.ImageRead("7yJS3.png")
img_denoised = dip.ImageRead("xjQIy.png")
tmp = img - img_denoised
tmp += 128
dip.ImageWrite(tmp, "test_noise.jpg")
In DIPlib, arithmetic operations automatically promote the images to a floating-point type, unless we explicitly prevent it. Saving as JPEG silently casts the image to uint8 (this is where errors will happen if the pixel values are outside the range of the uint8 type).
With limited information it is hard to pinpoint the problem. Please provide input images and more code.
It looks like the result image is a binary bitmap: only white or black, no gray. Your tmp image's pixel format is probably incorrect, which might be because your img and img_denoised do not have the same pixel format, or both are wrong. Try displaying your input images to see if they look normal.
Your img and img_denoised should have the same pixel format, maybe 8-bit grayscale or 24-bit RGB, and after img - img_denoised the result should still have the same pixel format.
It could also be because the data is unsigned; try making it signed, or add 128 to all pixels and see what happens.
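For example, a minimal sketch of that last suggestion with NumPy (my own illustration, not from the original post):

import numpy as np

# Subtract in a signed type so negative differences survive,
# then re-centre on mid-gray and clip back into uint8 range
tmp = img.astype(np.int16) - img_denoised.astype(np.int16)
tmp = np.clip(tmp + 128, 0, 255).astype(np.uint8)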
What is your image data type/range? 0-1 or 0-255?
If your image is 0-1 float32, the noise image will have a data range of [-1, 1]; around half of the pixels are below 0, and cv2.imshow() will display them as black.
Try
noise = origin - clean
noise = (noise + 1) * 0.5
I want to resize a 476x402 PNG picture to 439x371, and I used the resize method of PIL (Image) or OpenCV; however, it loses some sharpness. After resizing, the picture becomes blurred.
How can I resize (shrink) an image without losing sharpness in Python?
from skimage import transform, data, io
from PIL import Image
import os
import cv2

infile = 'D:/files/script/org/test.png'
outfile = 'D:/files/script/out/test.png'

# PIL
def fixed_size1(width, height):
    im = Image.open(infile)
    out = im.resize((width, height), Image.ANTIALIAS)
    out.save(outfile)

# OpenCV
def fixed_size2(width, height):
    img_array = cv2.imread(infile)
    new_array = cv2.resize(img_array, (width, height), interpolation=cv2.INTER_CUBIC)
    cv2.imwrite(outfile, new_array)

# skimage
def fixed_size3(width, height):
    img = io.imread(infile)
    dst = transform.resize(img, (439, 371))
    io.imsave(outfile, dst)

fixed_size2(371, 439)
src:476x402
resized:439x371
How can you pack 2000 pixels into a box that only holds 1800? You can't.
Putting the same amount of information (stored as pixels in your source image) into a smaller pixel area only works by
- throwing away pixels (i.e. discarding single values or cropping the image, which is not what you want to do), or
- blending neighbouring pixels into some kind of weighted average, replacing, say, 476 pixels with slightly altered 439 pixels.
That is exactly what happens when resizing images. Some kind of algorithm (interpolation=cv2.INTER_CUBIC, others here) tweaks the pixel values to merge/average them so you do not lose too much information.
You can try to change the algorithm or you can apply further postprocessing ("sharpening") to enrich the contrasts again.
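For instance, a simple unsharp-mask sketch with OpenCV; the blur sigma and weights are arbitrary choices of mine and would need tuning:

import cv2

img = cv2.imread("resized.png")  # hypothetical path to the already-resized image
blurred = cv2.GaussianBlur(img, (0, 0), sigmaX=1.0)
# Unsharp mask: add back a scaled difference between the image and its blur
sharpened = cv2.addWeighted(img, 1.5, blurred, -0.5, 0)
cv2.imwrite("sharpened.png", sharpened)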
Upon storing the image, certain formats do "lossy" storage to minimize file size (JPG), while others are lossless (PNG, TIFF, JPG2000, ...); choosing a lossy format might further blur your image.
See
Shrink/resize an image without interpolation
How can I sharpen an image in OpenCV?
I would like to ask one question: I want to implement code which cleans up a picture drawn by hand (by pen). Let us consider this image:
It is drawn with a blue pen and should be converted to a grayscale image using the following code:
from PIL import Image
import cv2
import matplotlib.pyplot as plt

user_test = filename
col = Image.open(user_test)
gray = col.convert('L')
bw = gray.point(lambda x: 0 if x < 100 else 255, '1')
bw.save("bw_image.jpg")
bw  # displays the image in a notebook
img_array = cv2.imread("bw_image.jpg", cv2.IMREAD_GRAYSCALE)
img_array = cv2.bitwise_not(img_array)
print(img_array.size)
plt.imshow(img_array, cmap = plt.cm.binary)
plt.show()
img_size = 28
new_array = cv2.resize(img_array, (img_size,img_size))
plt.imshow(new_array, cmap = plt.cm.binary)
plt.show()
The idea is that I am taking the image directly from the camera, but it is losing the structure of the digit and comes out as an empty, black picture, like this:
Therefore the computer can't understand which digit it is, and the neural network fails to predict its label correctly. Could you please tell me which transformation I should apply in order to detect this image more precisely?
Edit:
I have applied the following code
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

user_test = filename
col = Image.open(user_test)
gray = col.convert('L')
img_array = np.asarray(gray)  # plt.hist needs an array, not a PIL image
plt.hist(img_array.flatten())
plt.show()
and got
You have several issues here, and you can methodically address them.
First of all you're having an issue with thresholding properly.
As I suggested in earlier comments, you can easily see why your original thresholding was unsuccessful.
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from matplotlib import cm
im = Image.open('whatever_path_you_choose.jpg').convert("L")
im = np.asarray(im)
plt.hist(im.flatten(), bins=np.arange(255));
Looking at the image you gave:
Clearly the threshold should be somewhere between 100 and 200, not as in your original code. Also note that this distribution isn't very bimodal, so I'm not sure Otsu's method would work well here.
If we eyeball it (this can be tuned), we can see that thresholding at 145-ish gives decent results in terms of segmentation.
im_thresh = (im >= 145)
plt.imshow(im_thresh, cmap=cm.gray)
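For comparison, you could check what threshold Otsu's method would pick; a quick sketch using skimage (I haven't verified it on this exact image):

from skimage.filters import threshold_otsu

print(threshold_otsu(im))  # compare against the hand-picked 145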
Now you might have an additional issue: horizontal ruled lines. You can address this by writing on blank paper, as suggested. This wasn't exactly your question, but I will try to address it anyway (in a naive fashion). You can try using a Sobel filter (think of it as the derivative of the image, to pick out the lines), followed by a median filter to get approximately the most common pixel intensity; the size of the filter might have to vary for different digits, though. This should clear up some of the lines. For a more rigorous approach, try reading up on the Hough line transform for detecting horizontal lines and whitening them out; see the sketch at the end of this answer.
This is my very naive approach:
from skimage.filters import sobel
from scipy.ndimage import median_filter
#Sobel filter reverses intensities so subtracting the result from 1.0 turns it back to the original
plt.imshow(1.0 - median_filter(sobel(im_thresh), [10, 3]), cmap=cm.gray)
You can try cropping automatically afterwards. Honestly I think most neural networks that could recognize MNIST-like digits could recognize the result I posted at the end as well.
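If you want to try the Hough route mentioned above, here is a rough, untested sketch with OpenCV; the file names and all parameters are my own guesses and would need tuning:

import cv2
import numpy as np

img = cv2.imread("digit.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input path
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=img.shape[1] // 2, maxLineGap=5)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        if abs(y2 - y1) < 3:  # keep only (nearly) horizontal lines
            cv2.line(img, (x1, y1), (x2, y2), 255, 3)  # whiten them out
cv2.imwrite("no_lines.jpg", img)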
Try using the skimage package like this. It has built-in functions for image processing:
from skimage import io
from skimage.restoration import denoise_tv_chambolle
from skimage.filters import threshold_otsu
image = io.imread('path/to/your/image', as_gray=True)
# Denoising (the image was read as grayscale, so there is no channel axis)
denoised_image = denoise_tv_chambolle(image, weight=0.1)
# Thresholding
threshold = threshold_otsu(denoised_image)
thresholded_image = denoised_image > threshold
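Note that thresholded_image is a boolean array; to save it you would have to convert it first, e.g. (my own addition, not part of the original answer):

import numpy as np

io.imsave("thresholded.png", (thresholded_image * 255).astype(np.uint8))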
I am trying to resize a .jpg image with the skimage.transform.resize function. The function returns a weird result (see image below). I am not sure if it is a bug or just wrong use of the function.
import numpy as np
from PIL import Image
from skimage import io, color
from skimage.transform import resize
rgb = io.imread("../../small_dataset/" + file)
# show original image
img = Image.fromarray(rgb, 'RGB')
img.show()
rgb = resize(rgb, (256, 256))
# show resized image
img = Image.fromarray(rgb, 'RGB')
img.show()
Original image:
Resized image:
I already checked skimage resize giving weird output, but I think that my bug has different properties.
Update: The rgb2lab function also shows a similar issue.
The problem is that skimage is converting the pixel data type of your array after resizing the image. The original image has 8 bits per pixel, of type numpy.uint8, and the resized pixels are numpy.float64 variables.
The resize operation is correct, but the result is not being correctly displayed. For solving this issue, I propose 2 different approaches:
To change the data structure of the resulting image. Prior to changing to uint8 values, the pixels have to be converted to a 0-255 scale, as they are on a 0-1 normalized scale:
# ...
# Do the OP operations ...
resized_image = resize(rgb, (256, 256))
# Convert the image to a 0-255 scale.
rescaled_image = 255 * resized_image
# Convert to integer data type pixels.
final_image = rescaled_image.astype(np.uint8)
# show resized image
img = Image.fromarray(final_image, 'RGB')
img.show()
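Alternatively, skimage itself has a helper for this conversion; a variant I'd suggest, not from the original answer:

from skimage import img_as_ubyte

# Scales float values in [0, 1] to the 0-255 uint8 range
final_image = img_as_ubyte(resized_image)
img = Image.fromarray(final_image, 'RGB')
img.show()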
Update: The method below is deprecated, as scipy.misc.imshow has been removed from SciPy.
To use another library for displaying the image. Taking a look at the Image library documentation, there isn't any mode supporting 3xfloat64 pixel images. However, the scipy.misc library has the appropriate tools for converting the array format in order to display it correctly:
from scipy import misc
# ...
# Do OP operations
misc.imshow(resized_image)
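Since scipy.misc.imshow is no longer available, a present-day substitute would be matplotlib; this is my suggestion, not the original answer's:

import matplotlib.pyplot as plt

plt.imshow(resized_image)  # float RGB data in [0, 1] displays correctly
plt.show()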
I would like to convert a PNG32 image (with transparency) to PNG8 with the Python Imaging Library.
So far I have succeeded converting to PNG8 with a solid background.
Below is what I am doing:
from PIL import Image
im = Image.open("logo_256.png")
im = im.convert('RGB').convert('P', palette=Image.ADAPTIVE, colors=255)
im.save("logo_py.png", colors=255)
After much searching on the net, here is the code to accomplish what I asked for:
from PIL import Image
im = Image.open("logo_256.png")
# PIL complains if you don't load explicitly
im.load()
# Get the alpha band
alpha = im.split()[-1]
im = im.convert('RGB').convert('P', palette=Image.ADAPTIVE, colors=255)
# Set all pixel values below 128 to 255,
# and the rest to 0
mask = Image.eval(alpha, lambda a: 255 if a <=128 else 0)
# Paste the color of index 255 and use alpha as a mask
im.paste(255, mask)
# The transparency index is 255
im.save("logo_py.png", transparency=255)
Source: http://nadiana.com/pil-tips-converting-png-gif
Note that the code there does not call im.load(), and thus crashes on my version of OS/Python/PIL (it looks like that is a bug in PIL).
As mentioned by Mark Ransom, your palettized image will only have one transparency level.
When saving your palettized image, you'll have to specify which color index you want to be the transparent color, like this:
im.save("logo_py.png", transparency=0)
to save the image with palettized colors, using the first color as the transparent color.
This is an old question, so perhaps older answers are tuned to older versions of PIL.
But for anyone coming to this with Pillow >= 6.0.0, the following answer is orders of magnitude faster and simpler.
im = Image.open('png32_or_png64_with_alpha.png')
im = im.quantize()
im.save('png8_with_alpha_channel_preserved.png')
Don't use PIL to generate the palette, as it can't handle RGBA properly and has a quite limited quantization algorithm.
Use pngquant instead.
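For example, you could shell out to pngquant from Python; a sketch where the file names are placeholders:

import subprocess

# Quantize to at most 256 colors, preserving the alpha channel
subprocess.run(["pngquant", "--force", "--output", "logo_py.png",
                "256", "logo_256.png"], check=True)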