I'd like some advice on performing a simple image analysis in Python. I need to calculate a value for the "brightness" of an image. I know PIL is the go-to library for doing something like this, and it has a built-in histogram function.
What I need is a "perceived brightness" value so I can decide whether further adjustments to the image are necessary. So what are some of the basic techniques that will work in this situation? Should I just work with the RGB values, or will the histogram give me something close enough?
One possible solution might be to combine the two: generate average R, G, and B values using the histogram, then apply the "perceived brightness" formula.
Using the techniques mentioned in the question, I came up with a few different versions.
Each method returns a value that is close to, but not exactly the same as, the others. Also, all methods run at about the same speed except for the last one, which is much slower depending on the image size.
Convert image to greyscale, return average pixel brightness.
from PIL import Image, ImageStat

def brightness(im_file):
    im = Image.open(im_file).convert('L')
    stat = ImageStat.Stat(im)
    return stat.mean[0]
Convert image to greyscale, return RMS pixel brightness.
def brightness(im_file):
    im = Image.open(im_file).convert('L')
    stat = ImageStat.Stat(im)
    return stat.rms[0]
Average pixels, then transform to "perceived brightness".
import math

def brightness(im_file):
    im = Image.open(im_file)
    stat = ImageStat.Stat(im)
    r, g, b = stat.mean
    return math.sqrt(0.241*(r**2) + 0.691*(g**2) + 0.068*(b**2))
RMS of pixels, then transform to "perceived brightness".
def brightness(im_file):
    im = Image.open(im_file)
    stat = ImageStat.Stat(im)
    r, g, b = stat.rms
    return math.sqrt(0.241*(r**2) + 0.691*(g**2) + 0.068*(b**2))
Calculate "perceived brightness" of pixels, then return average.
def brightness(im_file):
    im = Image.open(im_file)
    stat = ImageStat.Stat(im)
    gs = (math.sqrt(0.241*(r**2) + 0.691*(g**2) + 0.068*(b**2))
          for r, g, b in im.getdata())
    return sum(gs) / stat.count[0]
Update: Test Results
I ran a simulation against 200 images. I found that methods #2 and #4 gave almost identical results, as did methods #3 and #5. Method #1 closely followed #3 and #5 (with a few exceptions).
Given that you're just looking for an average across the whole image, and not per-pixel brightness values, averaging PIL's histogram and applying the brightness function to the output seems like the best approach for that library.
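As a concrete sketch of that approach (the file name "photo.jpg" is a placeholder): derive per-channel means from PIL's flattened RGB histogram, then feed them into the perceived-brightness formula used above.
import math
from PIL import Image

# PIL's histogram() on an RGB image returns 768 counts: 256 bins for R,
# then 256 for G, then 256 for B. A weighted sum per channel gives the
# channel mean.
im = Image.open("photo.jpg").convert("RGB")
hist = im.histogram()
n = im.width * im.height
r, g, b = (sum(i * hist[c * 256 + i] for i in range(256)) / n for c in range(3))
print(math.sqrt(0.241 * r**2 + 0.691 * g**2 + 0.068 * b**2))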
If using ImageMagick (with the PythonMagick bindings), I would suggest using the identify command with the "verbose" option set. This will provide you with a mean value for each channel, saving you the need to sum and average a histogram; you can plug the channel means straight into the brightness formula.
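For example, a rough sketch of shelling out to identify and scraping the per-channel "mean:" lines from its report (the regex and file name are assumptions; double-check the output format of your ImageMagick version):
import re
import subprocess

# `identify -verbose` prints a "mean: <value>" line per channel in its
# channel statistics section; scrape those values from the text report.
report = subprocess.run(["identify", "-verbose", "photo.jpg"],
                        capture_output=True, text=True).stdout
means = [float(m) for m in re.findall(r"mean:\s*([\d.]+)", report)]
print(means)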
I think your best result would come from converting the RGB to grayscale using your favorite formula, then taking the histogram of that result. I'm not sure if the mean or the median of the histogram would be more appropriate, but on most images they are probably similar.
I'm not sure how to do the conversion to grayscale in PIL using an arbitrary formula, but I'm guessing it's possible.
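One route, sketched below, is to go through numpy, where any mix of the channels is easy to express (the weights shown are the standard luma coefficients; substitute whatever formula you like):
import numpy as np
from PIL import Image

# Apply an arbitrary RGB -> greyscale formula via numpy.
arr = np.asarray(Image.open("photo.jpg").convert("RGB"), dtype=float)
grey = 0.299 * arr[..., 0] + 0.587 * arr[..., 1] + 0.114 * arr[..., 2]
im_grey = Image.fromarray(grey.astype(np.uint8))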
The code below will give you the brightness level of an image from 0-10:
1. Calculate the average brightness of the image after converting it to HSV format using OpenCV.
2. Find where this value lies in the list of brightness ranges.
import numpy as np
import cv2
import sys
from collections import namedtuple

# brange: brightness range
# bval: brightness value
BLevel = namedtuple("BLevel", ['brange', 'bval'])

# all possible levels (non-overlapping ranges covering 0-255)
_blevels = [
    BLevel(brange=range(0, 24), bval=0),
    BLevel(brange=range(24, 47), bval=1),
    BLevel(brange=range(47, 70), bval=2),
    BLevel(brange=range(70, 93), bval=3),
    BLevel(brange=range(93, 116), bval=4),
    BLevel(brange=range(116, 140), bval=5),
    BLevel(brange=range(140, 163), bval=6),
    BLevel(brange=range(163, 186), bval=7),
    BLevel(brange=range(186, 209), bval=8),
    BLevel(brange=range(209, 232), bval=9),
    BLevel(brange=range(232, 256), bval=10),
]
def detect_level(h_val):
    h_val = int(h_val)
    for blevel in _blevels:
        if h_val in blevel.brange:
            return blevel.bval
    raise ValueError("Brightness Level Out of Range")
def get_img_avg_brightness():
    if len(sys.argv) < 2:
        print("USAGE: python3.7 brightness.py <image_path>")
        sys.exit(1)
    img = cv2.imread(sys.argv[1])
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    _, _, v = cv2.split(hsv)
    return int(np.average(v.flatten()))
if __name__ == '__main__':
    print("the image brightness level is: {0}".format(
        detect_level(get_img_avg_brightness())))
This can be done by converting the BGR image from cv2 to grayscale and then reading the intensity at pixel coordinates (x, y). It's explained well in this document: https://docs.opencv.org/3.4/d5/d98/tutorial_mat_operations.html. In C++:
Scalar intensity = img.at<uchar>(y, x);
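In Python with cv2 the equivalent is plain numpy indexing; a small sketch (file name and coordinates are placeholders):
import cv2

img = cv2.imread('image.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
y, x = 10, 20
intensity = gray[y, x]  # numpy indexes row (y) first, then column (x)
print(intensity)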
def calculate_brightness(image):
    greyscale_image = image.convert('L')
    histogram = greyscale_image.histogram()
    pixels = sum(histogram)
    brightness = scale = len(histogram)
    for index in range(0, scale):
        ratio = histogram[index] / pixels
        brightness += ratio * (-scale + index)
    # the loop reduces to the mean pixel value, so this returns the
    # average brightness normalised to [0, 1]
    return 1 if brightness == 255 else brightness / scale
Related
I'm working on a project to measure and visualize image similarity. The images in my dataset come from photographs of images in books, some of which have very high or low exposure rates. For example, the images below come from two different books; the one on the top is an over-exposed reprint of the one on the bottom, wherein the exposure looks good:
I'd like to normalize each image's exposure in Python. I thought I could do so with the following naive approach, which attempts to center each pixel value between 0 and 255:
from scipy.ndimage import imread
import sys

def normalize(img):
    '''
    Normalize the exposure of an image.
    #args:
      {numpy.ndarray} img: an array of image pixels with shape:
        (height, width)
    #returns:
      {numpy.ndarray} an image with shape of `img` wherein
        all values are normalized such that the min=0 and max=255
    '''
    _min = img.min()
    _max = img.max()
    return (img - _min) * 255 / (_max - _min)

img = imread(sys.argv[1])
normalized = normalize(img)
Only after running this did I realize that this normalization will only help images whose lightest value is less than 255 or whose darkest value is greater than 0.
Is there a straightforward way to normalize the exposure of an image such as the top image above? I'd be grateful for any thoughts others can offer on this question.
Histogram equalisation works surprisingly well for this kind of thing. It's usually better for photographic images, but it's helpful even on line art, as long as there are some non-black/white pixels.
It works well for colour images too: split the bands up, equalize each one separately, and recombine.
I tried on your sample image:
Using libvips:
$ vips hist_equal sample.jpg x.jpg
Or from Python with pyvips:
import pyvips

x = pyvips.Image.new_from_file("sample.jpg")
x = x.hist_equal()
x.write_to_file("x.jpg")
It's very hard to say if it will work for you without seeing a larger sample of your images, but you may find an "auto-gamma" useful. There is one built into ImageMagick and the description - so that you can calculate it yourself - is:
Automagically adjust gamma level of image.
This calculates the mean values of an image, then applies a calculated
-gamma adjustment so that the mean color in the image will get a value of 50%.
This means that any solid 'gray' image becomes 50% gray.
This works well for real-life images with little or no extreme dark
and light areas, but tends to fail for images with large amounts of
bright sky or dark shadows. It also does not work well for diagrams or
cartoon-like images.
You can try it out yourself on the command line very simply before you go and spend a lot of time coding something that may not work:
convert Tribunal.jpg -auto-gamma result.png
You can do -auto-level as per your own code beforehand, and a thousand other things too:
convert Tribunal.jpg -auto-level -auto-gamma result.png
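If you do want to compute it yourself, the description above translates to a few lines of numpy. This is only a sketch of the idea, not ImageMagick's exact implementation, and it assumes a greyscale image scaled to [0, 1]:
import numpy as np
from PIL import Image

# Pick gamma so that the image's mean maps to 0.5 after correction.
img = np.asarray(Image.open("Tribunal.jpg").convert("L"), dtype=float) / 255
gamma = np.log(img.mean()) / np.log(0.5)
out = np.uint8(255 * img ** (1 / gamma))
Image.fromarray(out).save("result.png")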
I ended up using a numpy implementation of the histogram normalization method #user894763 pointed out. Just save the below as normalize.py then you can call:
python normalize.py cats.jpg
Script:
import sys

import numpy as np
from imageio import imread, imwrite  # scipy.ndimage.imread / scipy.misc.imsave were removed in newer SciPy; imageio provides equivalents

def get_histogram(img):
    '''
    calculate the normalized histogram of an image
    '''
    height, width = img.shape
    hist = [0.0] * 256
    for i in range(height):
        for j in range(width):
            hist[img[i, j]] += 1
    return np.array(hist) / (height * width)

def get_cumulative_sums(hist):
    '''
    find the cumulative sum of a numpy array
    '''
    return [sum(hist[:i+1]) for i in range(len(hist))]

def normalize_histogram(img):
    # calculate the image histogram
    hist = get_histogram(img)

    # get the cumulative distribution function
    cdf = np.array(get_cumulative_sums(hist))

    # determine the normalization values for each unit of the cdf
    sk = np.uint8(255 * cdf)

    # normalize the normalization values
    height, width = img.shape
    Y = np.zeros_like(img)
    for i in range(0, height):
        for j in range(0, width):
            Y[i, j] = sk[img[i, j]]

    # optionally, get the new histogram for comparison
    new_hist = get_histogram(Y)

    # return the transformed image
    return Y

img = imread(sys.argv[1])
normalized = normalize_histogram(img)
imwrite(sys.argv[1] + '-normalized.jpg', normalized)
Output:
I have 10 greyscale brain MRI scans from BrainWeb. They are stored as a 4d numpy array, brains, with shape (10, 181, 217, 181). Each of the 10 brains is made up of 181 slices along the z-plane (going through the top of the head to the neck) where each slice is 181 pixels by 217 pixels in the x (ear to ear) and y (eyes to back of head) planes respectively.
All of the brains are of type dtype('float64'). The maximum pixel intensity across all brains is ~1328 and the minimum is ~0. For example, for the first brain, I calculate this by brains[0].max() giving 1328.338086605072 and brains[0].min() giving 0.0003886114541273855. Below is a plot of a slice of brains[0]:
I want to binarize all these brain images by rescaling the pixel intensities from [0, 1328] to {0, 1}. Is my method correct?
I do this by first normalising the pixel intensities to [0, 1]:
normalized_brains = brains/1328
And then by using the binomial distribution to binarize each pixel:
binarized_brains = np.random.binomial(1, (normalized_brains))
The plotted result looks correct:
A 0 pixel intensity represents black (background) and 1 pixel intensity represents white (brain).
I experimented with another method to normalise an image from this post, but it gave me just a black image. This is because np.finfo(np.float64).max is 1.7976931348623157e+308, so the normalization step
normalized_brains = brains/1.7976931348623157e+308
just returned an array of zeros, which in the binarization step also led to an array of zeros.
Am I binarising my images using a correct method?
Your method of converting the image to a binary image basically amounts to random dithering, which is a poor method of creating the illusion of grey values on a binary medium. Old-fashioned print is a binary medium, and printers have fine-tuned the methods for representing grey-value photographs in print over centuries. This process is called halftoning, and is shaped in part by properties of ink on paper that we do not have to deal with in binary images.
So what methods have people come up with outside of print? Ordered dithering (mostly Bayer matrix), and error diffusion dithering. Read more about dithering on Wikipedia. I wrote a blog post showing how to implement all of these methods in MATLAB some years ago.
I would recommend you use error diffusion dithering for your particular application. Here is some code in MATLAB (taken from my blog post linked above) for the Floyd-Steinberg algorithm; I hope that you can translate this to Python:
img = imread('https://i.stack.imgur.com/d5E9i.png');
img = img(:,:,1);

out = double(img);
sz = size(out);
for ii=1:sz(1)
   for jj=1:sz(2)
      old = out(ii,jj);
      %new = 255*(old >= 128); % Original Floyd-Steinberg
      new = 255*(old >= 128+(rand-0.5)*100); % Simple improvement
      out(ii,jj) = new;
      err = new-old;
      if jj<sz(2)
         % right
         out(ii ,jj+1) = out(ii ,jj+1)-err*(7/16);
      end
      if ii<sz(1)
         if jj<sz(2)
            % right-down
            out(ii+1,jj+1) = out(ii+1,jj+1)-err*(1/16);
         end
         % down
         out(ii+1,jj ) = out(ii+1,jj )-err*(5/16);
         if jj>1
            % left-down
            out(ii+1,jj-1) = out(ii+1,jj-1)-err*(3/16);
         end
      end
   end
end
imshow(out)
Resampling the image before applying the dithering greatly improves the results:
img = imresize(img,4);
% (repeat code above)
imshow(out)
NOTE that the above process expects the input to be in the range [0,255]. It is easy to adapt to a different range, say [0,1328] or [0,1], but it is also easy to scale your images to the [0,255] range.
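Since the question is about Python, here is a rough translation of the MATLAB above (using the original Floyd-Steinberg threshold, without the randomized tweak; input assumed to be a 2-D array in [0, 255]):
import numpy as np

def floyd_steinberg(img):
    # Error diffusion dithering: quantize each pixel, then push the
    # quantization error onto the unvisited neighbours.
    out = img.astype(float).copy()
    h, w = out.shape
    for i in range(h):
        for j in range(w):
            old = out[i, j]
            new = 255.0 if old >= 128 else 0.0
            out[i, j] = new
            err = new - old
            if j + 1 < w:
                out[i, j + 1] -= err * 7 / 16          # right
            if i + 1 < h:
                if j + 1 < w:
                    out[i + 1, j + 1] -= err * 1 / 16  # right-down
                out[i + 1, j] -= err * 5 / 16          # down
                if j > 0:
                    out[i + 1, j - 1] -= err * 3 / 16  # left-down
    return (out / 255).astype(np.uint8)  # values in {0, 1}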
Have you tried a threshold on the image?
This is a common way to binarize images, rather than trying to apply a random binomial distribution. You could try something like:
binarized_brains = (brains > threshold_value).astype(int)
which returns an array of 0s and 1s according to whether the image value was less than or greater than your chosen threshold value.
You will have to experiment with the threshold value to find the best one for your images, but it does not need to be normalized first.
If this doesn't work well, you can also experiment with the thresholding options available in the skimage filters package.
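For example, a minimal sketch using Otsu's method from skimage to pick the threshold automatically for one scan:
from skimage import filters

threshold_value = filters.threshold_otsu(brains[0])  # data-driven threshold
binarized = (brains[0] > threshold_value).astype(int)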
It is easy in OpenCV. As mentioned, a very common way is defining a threshold, but your result looks like you are allocating random values to your intensities instead of thresholding.
import cv2

im = cv2.imread('brain.png', cv2.IMREAD_GRAYSCALE)

# Otsu's method picks the threshold automatically:
(th, brain_bw) = cv2.threshold(im, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# or define a threshold yourself:
# th = (DEFINE HERE)
# (_, brain_bw) = cv2.threshold(im, th, 255, cv2.THRESH_BINARY)

cv2.imwrite('binBrain.png', brain_bw)
brain
binBrain
I'm trying to stretch an image's histogram using a logarithmic transformation. Basically, I am applying a log operation to each pixel's intensity. When I try to change the image's value at each pixel, the new values are not saved, but the histogram looks OK. Also, the maximum value is not correct. This is my code:
import cv2
import numpy as np
import math
from matplotlib import pyplot as plt

img = cv2.imread('messi.jpg', 0)
img2 = img
for i in range(0, img2.shape[0]-1):
    for j in range(0, img2.shape[1]-1):
        if (math.log(1+img2[i,j], 2)) < 0:
            img2[i,j] = 0
        else:
            img2[i,j] = int(math.log(1+img2[i,j], 2))
            print(int(math.log(1+img2[i,j], 2)))

print(img2.ravel().max())
cv2.imshow('LSP', img2)
cv2.waitKey(0)
fig = plt.gcf()
fig.canvas.set_window_title('LSP histogram')
plt.hist(img2.ravel(), 256, [0, 256]); plt.show()
img3 = img2
B = int(img3.max())
A = int(img3.min())
print("Maximum intensity = ", B)
print("minimum intensity = ", A)
This is also the histogram I get:
However, the maximum intensity shows 186! This isn't applying the proper logarithmic operation at all.
Any ideas?
The code you wrote performs a logarithmic transformation applied to the image intensities. The reason why you are getting such a high spurious intensity as the maximum is because your for loops are wrong. Specifically, your range is incorrect. range is exclusive of the ending interval, which means that you must go up to img.shape[0] and img.shape[1] respectively, and not img.shape[0]-1 or img.shape[1]-1. Therefore, you are missing the last row and last column of the image, and these don't get touched by logarithmic operation. The maximum that is reported is from one of these pixels in the last row or column that you didn't touch.
Once you correct this, you don't get those bad intensities anymore:
for i in range(0, img2.shape[0]): # Change
    for j in range(0, img2.shape[1]): # Change
        if (math.log(1+img2[i,j], 2)) < 0:
            img2[i,j] = 0
        else:
            img2[i,j] = int(math.log(1+img2[i,j], 2))
Doing that now gives us:
Maximum intensity =  7
minimum intensity =  0
However, what you're going to get now is a very dark image. The histogram that you have shown us illustrates that all of the image pixels are in the dark range... roughly between [0-7]. Because of that, the majority of your image is going to be dark if you use uint8 as the data type for visualization. Take note that I searched for the Lionel Messi image that's part of the OpenCV tutorials, and this is the image I found:
Source: https://opencv-python-tutroals.readthedocs.org/en/latest/_images/roi.jpg
Your code is converting this to grayscale, and that's fine for the purpose of your question. Now, using the above image, if you actually show what the histogram count looks like as well as what the intensities are per bin in the histogram, this is what we get for img2:
In [41]: np.unique(img2)
Out[41]: array([0, 1, 2, 3, 4, 5, 6, 7], dtype=uint8)
In [42]: np.bincount(img2.ravel())
Out[42]: array([ 86, 88, 394, 3159, 14841, 29765, 58012, 19655])
As you can see, the bulk of the image pixels are hovering in the [0-7] range, which is why everything looks black. If you want to see this better, scale the image by roughly 255 / 7 = 36 so that we can see it better:
img2 = 36*img2
cv2.imshow('LSP',img2)
cv2.waitKey(0)
We get this image:
I also get this histogram:
That personally looks very ugly... at least to me. As such, I would recommend that you choose a more meaningful image transformation if you want to stretch the histogram. In fact, the log operation compresses the dynamic range of the histogram. If you want to stretch the histogram, go the opposite way and try a power-law operation. Specifically, given an input intensity in, the output out is defined as:
out = c*in^(p)
in is the input intensity, p is a power, and c is a constant that scales the image so the maximum intensity gets mapped to the same maximum intensity of the input when you're finished, and not anything larger. That can be done by calculating c so that:
c = (img2.max()) / (img2.max()**p)
... where p is the power you want. In addition, the transformation via power-law can be explained with this nice diagram:
Source: http://www.nptel.ac.in/courses/117104069/chapter_8/8_14.html
Basically, powers that are less than 1 perform an intensity expansion where darker intensities get pushed towards the lighter side. Similarly, powers that are greater than 1 perform an intensity compression where lighter intensities get pushed to the darker side. In your case, you want to expand the histogram, and so you want the first option. Specifically, try making the intensities that are smaller go towards the larger range. This can be done by choosing a power that's smaller than 1... try 0.5 for example.
You'd modify your code so that it is like this:
img2 = img2.astype(float) # Cast to float
c = (img2.max()) / (img2.max()**(0.5))

for i in range(0, img2.shape[0]):
    for j in range(0, img2.shape[1]):
        img2[i,j] = int(c*img2[i,j]**(0.5))

# Cast back to uint8 for display
img2 = img2.astype(np.uint8)
Doing that, I get this image:
I also get this histogram:
Minor Note
If I can suggest something in terms of efficiency, I wouldn't recommend that you loop through the entire image and set each pixel individually... that's not how numpy arrays are supposed to be used. You can achieve what you want, vectorized, in a single line of code.
With your old code, use np.log2 rather than math.log with base 2, since np.log2 operates on whole numpy arrays at once:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Your code
img = cv2.imread('messi.jpg',0)
# New code
img2 = np.log2(1 + img.astype(float)).astype(np.uint8)
# Back to your code
img2 = 36*img2 # Edit from before
cv2.imshow('LSP',img2)
cv2.waitKey(0)
fig = plt.gcf()
fig.canvas.set_window_title('LSP histogram')
plt.hist(img2.ravel(),256,[0,256]); plt.show()
img3 = img2
B = int(img3.max())
A = int(img3.min())
print ("Maximum intensity = ", B)
print ("minimum intensity = ", A)
cv2.destroyAllWindows() # Don't forget this
Similarly, if you want to apply a power-law transformation, it's very simple:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Your code
img = cv2.imread('messi.jpg',0)
# New code
c = img.max() / (img.max()**(0.5))
img2 = (c*img.astype(float)**(0.5)).astype(np.uint8)
#... rest of code as before
I am downloading satellite pictures like this
(source: u0553130 at home.chpc.utah.edu)
Since some images are mostly black, like this one, I don't want to save it.
How can I use python to check if the image is more than 50% black?
You're dealing with gifs which are mostly grayscale by the look of your example image, so you might expect most of the RGB components to be equal.
Using PIL:
from PIL import Image

im = Image.open('im.gif').convert('L')  # greyscale, so each pixel is a single value
pixels = im.getdata()  # get the pixels as a flattened sequence
black_thresh = 50
nblack = 0
for pixel in pixels:
    if pixel < black_thresh:
        nblack += 1
n = len(pixels)

if (nblack / float(n)) > 0.5:
    print("mostly black")
Adjust your threshold for "black" between 0 (pitch black) and 255 (bright white) as appropriate.
The thorough way is to count the pixels using something like PIL, as given in the other answers.
However, if they're all compressed images, you may be able to check the file size, as images with lots of plain-colour areas should compress a lot more than ones with variation like the cloud cover.
With some tests, you could at least find a heuristic of which images with lots of cloud you know you can instantly discard without expensive looping over their pixels. Others closer to 50% can be checked pixel by pixel.
Additionally, when iterating over the pixels, you don't need to count all the black pixels, and then check if at least 50% are black. Instead, stop counting and discard as soon as you know at least 50% are black.
A second optimisation: if you know the images are generally mostly cloudy rather than mostly black, go the other way. Count the number of non-black pixels, and stop and keep the images as soon as that crosses 50%.
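A sketch of the early-exit idea (the black threshold and file name are placeholders):
from PIL import Image

def mostly_black(path, black_thresh=50):
    # Stop scanning as soon as the black-pixel count passes 50%.
    im = Image.open(path).convert('L')
    limit = (im.width * im.height) // 2
    nblack = 0
    for pixel in im.getdata():
        if pixel < black_thresh:
            nblack += 1
            if nblack > limit:
                return True  # already over 50% black
    return False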
Load the image.
Read each pixel and increment a counter if the pixel equals (0, 0, 0).
If the counter <= (image.width * image.height) / 2, save the image.
Or check whether a pixel is almost black by testing if its R (or G or B) component is less than, say, 15.
Utilizing your test image, the most common color has an RGB value of (1, 1, 1). This is very black, but not exactly black. My answer utilizes the PIL library, webcolors and a generous helping of code from this answer.
from PIL import Image
import webcolors

def closest_color(requested_color):
    min_colors = {}
    for key, name in webcolors.css3_hex_to_names.items():
        r_c, g_c, b_c = webcolors.hex_to_rgb(key)
        rd = (r_c - requested_color[0]) ** 2
        gd = (g_c - requested_color[1]) ** 2
        bd = (b_c - requested_color[2]) ** 2
        min_colors[(rd + gd + bd)] = name
    return min_colors[min(min_colors.keys())]

def get_color_name(requested_color):
    try:
        closest_name = actual_name = webcolors.rgb_to_name(requested_color)
    except ValueError:
        closest_name = closest_color(requested_color)
        actual_name = None
    return actual_name, closest_name

if __name__ == '__main__':
    lt = Image.open('test.gif').convert('RGB').getcolors()
    lt.sort(key=lambda tup: tup[0], reverse=True)
    actual_name, closest_name = get_color_name(lt[0][1])  # the color part of the (count, color) tuple
    print(lt[0], actual_name, closest_name)
Output:
(531162, (1, 1, 1)) None black
In this case, you'd be interested in the closest_name variable. The first entry (lt[0]) shows the count and the most common RGB value. That color doesn't have a defined web color name, hence the None for actual_name.
Explanation:
This opens the file you've provided, converts it to RGB, and then runs PIL's getcolors method on the image. The result is a list of tuples in the format (count, RGB_color_value). I then sort the list (in reverse order). Utilizing the functions from the other answer, I pass in the most common RGB color value (the second element of the first tuple in the sorted list).
I'm looking for a way to find the most dominant color/tone in an image using python. Either the average shade or the most common out of RGB will do. I've looked at the Python Imaging library, and could not find anything relating to what I was looking for in their manual, and also briefly at VTK.
I did however find a PHP script which does what I need, here (login required to download). The script seems to resize the image to 150*150, to bring out the dominant colors. However, after that, I am fairly lost. I did consider writing something that would resize the image to a small size and then check every other pixel or so for its color, though I imagine this would be very inefficient (though implementing this idea as a C Python module might be an idea).
However, after all of that, I am still stumped. So I turn to you, SO. Is there an easy, efficient way to find the dominant color in an image?
Here's code making use of Pillow and Scipy's cluster package.
For simplicity I've hardcoded the filename as "image.jpg". Resizing the image is for speed: if you don't mind the wait, comment out the resize call. When run on this sample image,
it usually says the dominant colour is #d8c865, which corresponds roughly to the bright yellowish area to the lower left of the two peppers. I say "usually" because the clustering algorithm used has a degree of randomness to it. There are various ways you could change this, but for your purposes it may suit well. (Check out the options on the kmeans2() variant if you need deterministic results.)
from __future__ import print_function
import binascii
from PIL import Image
import numpy as np
import scipy
import scipy.cluster

NUM_CLUSTERS = 5

print('reading image')
im = Image.open('image.jpg')
im = im.resize((150, 150))  # optional, to reduce time
ar = np.asarray(im)
shape = ar.shape
ar = ar.reshape(np.prod(shape[:2]), shape[2]).astype(float)

print('finding clusters')
codes, dist = scipy.cluster.vq.kmeans(ar, NUM_CLUSTERS)
print('cluster centres:\n', codes)

vecs, dist = scipy.cluster.vq.vq(ar, codes)    # assign codes
counts, bins = np.histogram(vecs, len(codes))  # count occurrences

index_max = np.argmax(counts)  # find most frequent
peak = codes[index_max]
colour = binascii.hexlify(bytearray(int(c) for c in peak)).decode('ascii')
print('most frequent is %s (#%s)' % (peak, colour))
Note: when I expanded the number of clusters to find from 5 to 10 or 15, it frequently gave results that were greenish or bluish. Given the input image, those are reasonable results too... I can't tell which colour is really dominant in that image either, so I don't fault the algorithm!
Also a small bonus: save the reduced-size image with only the N most-frequent colours:
# bonus: save image using only the N most common colours
import imageio
c = ar.copy()
for i, code in enumerate(codes):
    c[np.r_[np.where(vecs == i)], :] = code
imageio.imwrite('clusters.png', c.reshape(*shape).astype(np.uint8))
print('saved clustered image')
Try Color-thief. It is based on Pillow and works great.
Installation
pip install colorthief
Usage
from colorthief import ColorThief
color_thief = ColorThief('/path/to/imagefile')
# get the dominant color
dominant_color = color_thief.get_color(quality=1)
It can also find a color palette:
palette = color_thief.get_palette(color_count=6)
The Python Imaging Library has the getcolors method on Image objects:
im.getcolors() => a list of (count, color) tuples or None
I guess you can still try resizing the image before that and see if it performs any better.
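A quick sketch of that combination (resize first, then let getcolors do the counting; the maxcolors cap matters because getcolors returns None when the image has more distinct colors than the cap):
from PIL import Image

im = Image.open('image.jpg').convert('RGB')
im.thumbnail((150, 150))  # shrink first to speed things up
colors = im.getcolors(maxcolors=im.width * im.height)  # (count, color) tuples
count, dominant = max(colors)  # most frequent color
print(count, dominant)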
You can do this in many different ways. And you don't really need scipy and k-means, since internally Pillow already does that for you when you either resize the image or reduce it to a certain palette.
Solution 1: resize image down to 1 pixel.
def get_dominant_color(pil_img):
    img = pil_img.copy()
    img = img.convert("RGBA")
    img = img.resize((1, 1), resample=0)
    dominant_color = img.getpixel((0, 0))
    return dominant_color
Solution 2: reduce image colors to a palette
def get_dominant_color(pil_img, palette_size=16):
    # Resize image to speed up processing
    img = pil_img.copy()
    img.thumbnail((100, 100))

    # Reduce colors (uses k-means internally)
    paletted = img.convert('P', palette=Image.ADAPTIVE, colors=palette_size)

    # Find the color that occurs most often
    palette = paletted.getpalette()
    color_counts = sorted(paletted.getcolors(), reverse=True)
    palette_index = color_counts[0][1]
    dominant_color = palette[palette_index*3:palette_index*3+3]

    return dominant_color
Both solutions give similar results. The latter probably gives you more accuracy, since we keep the aspect ratio when resizing the image. You also get more control, since you can tweak palette_size.
It's not necessary to use k-means to find the dominant color, as Peter suggests. This overcomplicates a simple problem. You're also restricting yourself by the number of clusters you select, so basically you need an idea of what you're looking at.
As you mentioned and as suggested by zvone, a quick solution to find the most common/dominant color is by using the Pillow library. We just need to sort the pixels by their count number.
from PIL import Image

def find_dominant_color(filename):
    # Resizing parameters
    width, height = 150, 150
    image = Image.open(filename)
    image = image.resize((width, height), resample=0)

    # Get colors from image object
    pixels = image.getcolors(width * height)

    # Sort them by count number (first element of tuple)
    sorted_pixels = sorted(pixels, key=lambda t: t[0])

    # Get the most frequent color
    dominant_color = sorted_pixels[-1][1]
    return dominant_color
The only problem is that the method getcolors() returns None when the amount of colors is more than 256. You can deal with it by resizing the original image.
In all, it might not be the most precise solution, but it gets the job done.
If you're still looking for an answer, here's what worked for me, albeit not terribly efficient:
from PIL import Image

def compute_average_image_color(img):
    width, height = img.size

    r_total = 0
    g_total = 0
    b_total = 0

    count = 0
    for x in range(0, width):
        for y in range(0, height):
            r, g, b = img.getpixel((x, y))
            r_total += r
            g_total += g
            b_total += b
            count += 1

    return (r_total/count, g_total/count, b_total/count)

img = Image.open('image.png').convert('RGB')  # ensure getpixel returns (r, g, b)
#img = img.resize((50, 50))  # Small optimization
average_color = compute_average_image_color(img)
print(average_color)
My solution
Here's my adaptation based on Peter Hansen's solution.
import scipy.cluster
import sklearn.cluster
import numpy
from PIL import Image

def dominant_colors(image):  # PIL image input
    image = image.resize((150, 150))  # optional, to reduce time
    ar = numpy.asarray(image)
    shape = ar.shape
    ar = ar.reshape(numpy.prod(shape[:2]), shape[2]).astype(float)

    kmeans = sklearn.cluster.MiniBatchKMeans(
        n_clusters=10,
        init="k-means++",
        max_iter=20,
        random_state=1000
    ).fit(ar)
    codes = kmeans.cluster_centers_

    vecs, _dist = scipy.cluster.vq.vq(ar, codes)       # assign codes
    counts, _bins = numpy.histogram(vecs, len(codes))  # count occurrences

    colors = []
    for index in numpy.argsort(counts)[::-1]:
        colors.append(tuple([int(code) for code in codes[index]]))
    return colors  # returns colors in order of dominance
What are the differences/improvements?
It's (subjectively) more accurate
It's using the kmeans++ to pick initial cluster centers which gives better results. (kmeans++ may not be the fastest way to pick cluster centers though)
It's faster
Using sklearn.cluster.MiniBatchKMeans is significantly faster and gives very similar colors to the default KMeans algorithm. You can always try the slower sklearn.cluster.KMeans and compare the results and decide whether the tradeoff is worth it.
It's deterministic
I am using a random_state to get consistent output (I believe the original scipy.cluster.vq.kmeans also has a seed parameter). Before adding a random state, I found that certain inputs could have significantly different outputs.
Benchmarks
I decided to very crudely benchmark a few solutions.
Method                                    Time (100 iterations)
Peter Hansen (kmeans)                     58.85
Artem Bernatskyi (Color Thief)            61.29
Artem Bernatskyi (Color Thief palette)    15.69
Pithikos (PIL resize)                      0.11
Pithikos (palette)                         1.68
Mine (mini batch kmeans)                   6.31
You could use PIL to repeatedly resize the image down by a factor of 2 in each dimension until it reaches 1x1. I don't know what algorithm PIL uses for downscaling by large factors, so going directly to 1x1 in a single resize might lose information. It might not be the most efficient, but it will give you the "average" color of the image.
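A sketch of that repeated-halving idea:
from PIL import Image

def average_color_by_halving(im):
    # Halve each dimension until a single pixel remains, then read it.
    im = im.convert('RGB')
    while im.size != (1, 1):
        im = im.resize((max(1, im.width // 2), max(1, im.height // 2)))
    return im.getpixel((0, 0))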
To add to Peter's answer, if PIL is giving you an image with mode "P" or pretty much any mode that isn't "RGBA", then you need to apply an alpha mask to convert it to RGBA. You can do that pretty easily with:
if im.mode == 'P':
    im.putalpha(0)
Below is a C++/Qt based example to guess the predominant image color. You can use PyQt and translate the same into the Python equivalent.
#include <Qt/QtGui>
#include <Qt/QtCore>
#include <QtGui/QApplication>

int main(int argc, char** argv)
{
    QApplication app(argc, argv);

    QPixmap pixmap("logo.png");
    QImage image = pixmap.toImage();
    QRgb col;
    QMap<QRgb, int> rgbcount;
    QRgb greatest = 0;

    int width = pixmap.width();
    int height = pixmap.height();
    int count = 0;
    for (int i = 0; i < width; ++i)
    {
        for (int j = 0; j < height; ++j)
        {
            col = image.pixel(i, j);
            if (rgbcount.contains(col)) {
                rgbcount[col] = rgbcount[col] + 1;
            }
            else {
                rgbcount[col] = 1;
            }
            if (rgbcount[col] > count) {
                greatest = col;
                count = rgbcount[col];
            }
        }
    }

    qDebug() << count << greatest;
    return app.exec();
}
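A rough PyQt5 translation of the same logic ("logo.png" as above; QImage.pixel() returns the QRgb value as an int):
from PyQt5.QtGui import QImage

image = QImage("logo.png")
rgbcount = {}
greatest, count = 0, 0
for i in range(image.width()):
    for j in range(image.height()):
        col = image.pixel(i, j)  # QRgb as an int
        rgbcount[col] = rgbcount.get(col, 0) + 1
        if rgbcount[col] > count:
            greatest = col
            count = rgbcount[col]
print(count, hex(greatest))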
This is a complete script with a function compute_average_image_color().
Just copy and paste it, and change the path of your image.
My image is img_path='./dir/image001.png'
#AVERAGE COLOR, MIN, MAX, STANDARD DEVIATION
#SELECT ONLY NON-TRANSPARENT PIXELS
from PIL import Image
import sys
import os
import os.path
from os import path
import numpy as np
import math

def compute_average_image_color(img_path):
    if not os.path.isfile(img_path):
        print(img_path, "DOESN'T EXIST, EXIT")
        sys.exit()

    #load image
    img = Image.open(img_path).convert('RGBA')
    img = img.resize((50, 50))  # Small optimization

    #DEFINE SOME VARIABLES
    width, height = img.size
    r_total = 0
    g_total = 0
    b_total = 0
    count = 0
    red_list = []
    green_list = []
    blue_list = []

    #READ AND CHECK PIXEL BY PIXEL
    for x in range(0, width):
        for y in range(0, height):
            r, g, b, alpha = img.getpixel((x, y))
            if alpha != 0:
                red_list.append(r)
                green_list.append(g)
                blue_list.append(b)
                r_total += r
                g_total += g
                b_total += b
                count += 1

    #CALCULATE THE AVERAGE COLOR, MIN, MAX, ETC.
    average_color = (round(r_total/count), round(g_total/count), round(b_total/count))
    print(average_color)

    red_list.sort()
    green_list.sort()
    blue_list.sort()
    red_min_max = []
    green_min_max = []
    blue_min_max = []
    red_min_max.append(min(red_list))
    red_min_max.append(max(red_list))
    green_min_max.append(min(green_list))
    green_min_max.append(max(green_list))
    blue_min_max.append(min(blue_list))
    blue_min_max.append(max(blue_list))
    print('red_min_max: ', red_min_max)
    print('green_min_max: ', green_min_max)
    print('blue_min_max: ', blue_min_max)

    #variance and standard deviation
    red_stddev = round(math.sqrt(np.var(red_list)))
    green_stddev = round(math.sqrt(np.var(green_list)))
    blue_stddev = round(math.sqrt(np.var(blue_list)))
    print('red_stddev: ', red_stddev)
    print('green_stddev: ', green_stddev)
    print('blue_stddev: ', blue_stddev)

img_path = './dir/image001.png'
compute_average_image_color(img_path)