Disclaimer: This was part of a homework assignment, however, it has been passed in already. I'm simply looking for the correct solution for future know-how.
The goal of this program was to use the Python OpenCV library to implement image -> image steganography (embedding/extracting images inside other images). This is done with two images of equal size using the least significant bit (LSB) method.
The program allows the user to choose the number of bits used for embedding, so with 1 bit used the embedded image is nearly undetectable to the human eye, and with 7 you can clearly make out the hidden image.
I've implemented the embedding just fine, by taking the most significant bits (MSB) of each RGB byte from the secret image and setting them in the LSB places of the cover image.
My problem is extracting the secret image after it has been embedded. After the code runs, the image that I'm left with seems to be only the blue representation of it. I'm not sure where I went wrong, but I have a feeling it has something to do with my bit manipulation techniques, or use of the OpenCV library. Any help is greatly appreciated, thanks in advance!
Code for extracting:
import cv2
import numpy
def extract(img1, bitsUsed):
    print("Extracting...")

    # Import image & get dimensions
    img = cv2.imread(img1)
    h = img.shape[0]
    w = img.shape[1]

    # Create new image to extract secret image
    # Same dimensions, and rgb channel
    secretImg = numpy.zeros((h, w, 3), numpy.uint8)

    x, y = 0, 0

    # Loop thru each pixel
    while x < w:
        while y < h:
            # Grab the LSB (based on bitsUsed from embedding)
            lsb_B = img.item(y, x, 0) & bitsUsed
            lsb_G = img.item(y, x, 1) & bitsUsed
            lsb_R = img.item(y, x, 2) & bitsUsed

            # Place those bits into MSB positions on new img
            secretImg.itemset((y, x, 0), lsb_B << (8 - bitsUsed))
            secretImg.itemset((y, x, 0), lsb_G << (8 - bitsUsed))
            secretImg.itemset((y, x, 0), lsb_R << (8 - bitsUsed))
            y += 1
        y = 0
        x += 1

    cv2.imwrite("extractedImg.png", secretImg)
njuffa is correct. In the extraction, when you have embedded only 1 bit, you want to AND with 0b00000001 (1), for 2 bits with 0b00000011 (3), for 3 bits with 0b00000111 (7), etc. Generally, for k embedded bits, you want the mask 2**k - 1.
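For instance, a quick throwaway check of those masks:

for k in range(1, 8):
    print(k, bin(2**k - 1))   # 1 -> 0b1, 2 -> 0b11, 3 -> 0b111, ...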
Moreover, cv2.imread() will generate a numpy array of the pixels. Instead of looping through each pixel, you can vectorise your computations. All in all, this is what your code could look like.
import cv2

def embed(cover_file, secret_file, k):
    cover = cv2.imread(cover_file)
    secret = cv2.imread(secret_file)
    mask = 256 - 2**k
    stego = (cover & mask) | (secret >> (8 - k))
    cv2.imwrite('stego.png', stego)

def extract(stego_file, k):
    stego = cv2.imread(stego_file)
    mask = 2**k - 1
    output = (stego & mask) << (8 - k)
    cv2.imwrite('extracted.png', output)
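A hypothetical round trip with these functions (the file names are placeholders; cover.png and secret.png must be the same size):

embed('cover.png', 'secret.png', 2)   # writes stego.png with 2 embedded bits
extract('stego.png', 2)               # recovers the hidden image into extracted.png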
Background: I have images I need to compare for differences. The images are large (on the order of 1400x9000 px), machine-generated and highly constrained (screenshots of a particular piece of linear UI), and are expected to be nearly identical, with differences being one of the following three possibilities:
Image 1 has a section image 2 is missing
Image 1 is missing a section image 2 has
Both images have the given section, but its contents differ
I'm trying to build a tool that highlights the differences for a human reviewer, essentially an image version of line-oriented diff. To that end, I'm trying to scan the images line by line and compare them to decide if the lines are identical. My ultimate goal is an actual diff-like output, where it can detect that sections are missing/added/different and sync the images up as soon as possible for the remaining parts of identical content. For the first cut, though, I'm going with a simpler approach where the two images are overlaid (alpha blended), and the lines which differ are highlighted with a particular colour (i.e. alpha-blended with a third line of solid colour).

At first I tried using the Python Imaging Library, but that was several orders of magnitude too slow, so I decided to try using vips, which should be way faster. However, I have absolutely no idea how to express what I'm after using vips operations. The pseudocode for the simpler version would be essentially:
out = []
# image1 and image2 are expected, but not guaranteed, to have the same height
# they are likely to have different heights if different
# most lines are entirely white pixels
for line1, line2 in zip(image1, image2):
    if line1 == line2:
        out.append(line1)
    else:
        # ALL_RED is a line composed of solid red pixels
        out.append(line1.blend(line2, 0.5).blend(ALL_RED, 0.5))
I'm using pyvips in my project, but I'm also interested in code using plain vips or any other bindings, since the operations are shared and easily translated across dialects.
Edit: adding sample images as requested
Edit 2: full size images with missing/added/changed sections:
reference
comparison
How about just using diff? It's pretty quick. All you need to do is turn your PNGs into text, a scanline at a time, then parse the diff output.
For example:
#!/usr/bin/env python3
import sys
import os
import re
import pyvips
# calculate a checksum for each scanline and write to name_out
def scanline_checksum(name_in, name_out):
    a = pyvips.Image.new_from_file(name_in, access="sequential")

    # unfold colour channels to make a wider 1-band image
    a = a.bandunfold()

    # xyz makes an index image, where the value of each pixel is its coordinate
    b = pyvips.Image.xyz(a.width, a.height)

    # make a pow gradient image ... each pixel is some power of the x coordinate
    b = b[0] ** 0.5

    # now multiply and sum to make a checksum for each scanline
    # "project" returns sum of columns, sum of rows
    sum_of_columns, sum_of_rows = (a * b).project()

    sum_of_rows.write_to_file(name_out)

scanline_checksum(sys.argv[1], "1.csv")
scanline_checksum(sys.argv[2], "2.csv")

os.system("diff 1.csv 2.csv > diff.csv")

for line in open("diff.csv", "r"):
    match = re.match(r"(\d+),(\d+)c(\d+),(\d+)", line)
    if not match:
        continue

    print(line)
For your two test images I see:
$ time ./diff.py 1.png 2.png
264,272c264,272
351,359c351,359
real 0m0.346s
user 0m0.445s
sys 0m0.033s
On this elderly laptop. All you need to do is use those "change" commands to mark up your images.
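For instance, a rough sketch of that mark-up step (my own, untested; it assumes the diff.csv produced above and paints the changed scanlines red on a copy of the first image):

import re
import pyvips

image = pyvips.Image.new_from_file("1.png")
for line in open("diff.csv", "r"):
    match = re.match(r"(\d+),(\d+)c(\d+),(\d+)", line)
    if match:
        top = int(match.group(1)) - 1        # diff line numbers are 1-based
        height = int(match.group(2)) - top   # rows m..n inclusive
        image = image.draw_rect([255, 0, 0], 0, top, image.width, height,
                                fill=True)
image.write_to_file("marked.png")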
If OpenCV and NumPy are options for you, there's a quite simple solution, at least for finding and coloring different rows.
In my approach, I just calculate pixel-wise differences using np.abs, and find non-zero row indices with np.nonzero. With these found row indices, I set up an additional black image and draw red lines for each row. The final blending is just some linear mixing:
0.5 * image1 + 0.5 * image2
for all equal rows, or
0.333 * image1 + 0.333 * image2 + 0.333 * red
for all different rows.
Here's the final code:
import cv2
import numpy as np
# Load images
first = cv2.imread('9gOlq.png', cv2.IMREAD_COLOR)
second = cv2.imread('1Hdx4.png', cv2.IMREAD_COLOR)
# Calculate absolute differences between images
diff = np.abs(np.float32(first) - np.float32(second))
# Find all non-zero rows
nz_rows = np.unique(np.nonzero(diff)[0])
# Set up image with red lines
red = np.zeros(first.shape, np.uint8)
red[nz_rows, :, :] = [0, 0, 255]
# Set up output image
output = np.uint8(0.5 * first + 0.5 * second)
output[nz_rows, :, :] = 0.333 * first[nz_rows, :, :] + 0.333 * second[nz_rows, :, :] + 0.333 * red[nz_rows, :, :]
# Show results
cv2.imshow("diff", np.array(diff, dtype=np.uint8))
cv2.imshow("output", output)
cv2.waitKey()
cv2.destroyAllWindows()
The difference image diff looks like this:
The final output looks like this:
It would be interesting to see two input images with omitted sections as you described in your question. Also, testing this approach using original sized images would be necessary, since you mentioned time is crucial.
Anyway - hope that helps!
The best mosaic code I've found is on this page:
https://github.com/codebox/mosaic
However, the code doesn't work well on my Windows computer, and I think it's too advanced for what it should do. Here are the requirements I've posted on reddit:
1) The main photo already has reduced number of colors (8)
2) I have already every image associated with colour needed to be replaced (e.g. number 1 is supposed to replace black pixels, number 2 replaces green pixels...)
3) I need to enlarge the photo by the small photo's size (9 x 9 small photos will produce an 81-times-bigger image), which should push the pixels "2n" points away from each other; but instead of producing an n x n same-coloured area around every single one of them (this is how I believe enlarging works in general, correct me if I'm wrong), it will just colour the white spaces with an unrecognised colour, which is not associated with any small photo (let's call that colour C)
4) Now all it needs is to run through all non-C coloured pixels and put an image centered on that pixel, which would create the mosaic.
Since I'm pretty new to Python (esp. graphics) and need it just for one use, could someone help me with creating that code? I think the code I took inspiration from is too complicated. Two things I don't need:
1) "approximation" - if the enlargement is less than needed for 100% quality (e.g. the pictures are 9x9, but every side of the original photo can be only 3 times larger, then the program needs to merge some pixels of different colours together, leading to quality loss)
2) colour-to-picture association: my palette of pictures is small, and so is the palette of colours; I can do the association manually
For the ones who didn't get what I mean, here is my idea: https://ibb.co/9GNhqBx
I had a quick go using pyvips:
#!/usr/bin/python3
import sys
import os
import pyvips
if len(sys.argv) != 4:
    print("usage: tile-directory input-image output-image")
    sys.exit(1)

# the size of each tile ... 16x16 for us
tile_size = 16

# load all the tile images, forcing them to the tile size
print(f"loading tiles from {sys.argv[1]} ...")
for root, dirs, files in os.walk(sys.argv[1]):
    tiles = [pyvips.Image.thumbnail(os.path.join(root, name), tile_size,
                                    height=tile_size, size="force")
             for name in files]

# drop any alpha
tiles = [image.flatten() if image.hasalpha() else image
         for image in tiles]

# copy the tiles to memory, since we'll be using them many times
tiles = [image.copy_memory() for image in tiles]

# calculate the average rgb for an image, eg. image -> [12, 13, 128]
def avg_rgb(image):
    m = image.stats()
    return [m(4, i)[0] for i in range(1, 4)]

# find the avg rgb for each tile
tile_colours = [avg_rgb(image) for image in tiles]

# load the main image ... we can do this in streaming mode, since we only
# make a single pass over the image
main = pyvips.Image.new_from_file(sys.argv[2], access="sequential")

# find the abs of an image, treating each pixel as a vector
def pyth(image):
    return sum([band ** 2 for band in image.bandsplit()]) ** 0.5

# calculate a distance map from the main image to each tile colour
distance = [pyth(main - colour) for colour in tile_colours]

# make a distance index -- hide the tile index in the bottom 16 bits of the
# distance measure
index = [(distance[i] << 16) + i for i in range(len(distance))]

# find the minimum distance for each pixel and mask out the bottom 16 bits to
# get the tile index for each pixel
index = index[0].bandrank(index[1:], index=0) & 0xffff

# replicate each tile image to make a set of layers, and zoom the index to
# make an index matching the output size
layers = [tile.replicate(main.width, main.height) for tile in tiles]
index = index.zoom(tile_size, tile_size)

# now for each layer, select pixels matching the index
final = pyvips.Image.black(main.width * tile_size, main.height * tile_size)
for i in range(len(layers)):
    final = (index == i).ifthenelse(layers[i], final)

print(f"writing {sys.argv[3]} ...")
final.write_to_file(sys.argv[3])
I hope it's easy to read. I can run it like this:
$ ./mosaic3.py smallpic/ mainpic/Use\ this.jpg x.png
loading tiles from smallpic/ ...
writing x.png ...
$
It takes about 5s on this 2015 laptop and makes this image:
I had to shrink it for upload, but here's a detail (bottom left of the first H):
Here's a google drive link to the mosaic, perhaps it'll work: https://drive.google.com/file/d/1J3ofrLUhkuvALKN1xamWqfW4sUksIKQl/view?usp=sharing
And here's this code on github: https://github.com/jcupitt/mosaic
I have been working on a piece of code to create a disparity map.
I don't want to use OpenCV for more than loading/saving the images and converting them to grayscale.
So far, I've managed to implement the algorithm explained on this website. I'm using the version of the algorithm that uses the Sum of Absolute Differences (SAD). To test my implementation, I'm using the stereo images from this dataset.
Here's my code:
import cv2
import numpy as np
# Load the stereo images
img = cv2.imread('bow-view1.png')
img2 = cv2.imread('bow-view5.png')
# convert stereo images to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2,cv2.COLOR_BGR2GRAY)
# get the size of the images
# l -> lines
# c -> columns
# v -> channel (RGB)
l,c,v = img.shape
# initialize arrays
minSAD = np.ones((l,c)) * 1000
sad = np.ones((l,c))
winsad = np.ones((l,c))
disp = np.zeros((l,c))
max_shift = 30
# set size of the SAD window
w_l = 2
w_c = 2
for shift in range(max_shift):
    print("New Shift: %d" % (shift))
    for u in range(0, l):
        for v in range(0, c):
            # calculate SAD
            if (u + shift < l):
                sad[u, v] = np.abs((int(gray[u, v]) - int(gray2[u + shift, v])))

            sum_sad = 0
            for d in range(w_l):
                for e in range(w_c):
                    if (u + d < l and v + e < c):
                        sum_sad += sad[u + d, v + e]
            winsad[u, v] = sum_sad

            # Save disparity
            if (sad[u, v] < minSAD[u, v]):
                minSAD[u, v] = winsad[u, v]
                disp[u, v] = shift
print("Process Complete")
# write disparity map to image
cv2.imwrite('outputHT/disparity/sad.png',disp)
print("Disparity Map Generated")
This is the output generated by that code:
I should get an output similar (or very close to) this:
I've tried several window sizes (in the SAD step), but I keep getting results like this one or images that are all black.
Any answer that helps me figure out the problem or that at least points me in the right direction will be very appreciated!
One thing you are missing here is that all the values in the disp array will be between 0 and 30, which all correspond to nearly black pixels, so in order to map these values onto the 0 to 255 range you have to multiply the shift by 8.
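A minimal sketch of that scaling step, assuming max_shift = 30 as in the question:

# stretch disparities from [0, max_shift] to [0, 255] before saving,
# otherwise every value renders as a nearly black pixel
disp_vis = np.uint8(disp * (255.0 / max_shift))
cv2.imwrite('outputHT/disparity/sad.png', disp_vis)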
I have a multiband satellite image stored in the band interleaved pixel (BIP) format along with a separate header file. The header file provides the details such as the number of rows and columns in the image, and the number of bands (can be more than the standard 3).
The image itself is stored like this (assume a 5 band image):
[B1][B2][B3][B4][B5][B1][B2][B3][B4][B5] ... and so on (basically 5 bytes - one for each band - for each pixel starting from the top left corner of the image).
I need to separate out each of these bands as PIL images in Python 3.2 (on Windows 7 64 bit), and currently I think I'm approaching the problem incorrectly. My current code is as follows:
from PIL import Image

def OpenBIPImage(file, width, height, numberOfBands):
    """
    Opens a raw image file in the BIP format and returns a list
    comprising each band as a separate PIL image.
    """
    bandArrays = []
    with open(file, 'rb') as imageFile:
        data = imageFile.read()
    currentPosition = 0
    for i in range(height * width):
        for j in range(numberOfBands):
            if i == 0:
                bandArrays.append(bytearray(data[currentPosition : currentPosition + 1]))
            else:
                bandArrays[j].extend(data[currentPosition : currentPosition + 1])
            currentPosition += 1
    bands = [Image.frombytes('L', (width, height), bytes(bandArray))
             for bandArray in bandArrays]
    return bands
This code takes way too long to open a BIP file; surely there must be a better way to do this. I do have the numpy and scipy libraries as well, but I'm not sure how I can use them, or if they'll even help in any way.
Since the number of bands in the image are also variable, I'm finding it hard to figure out a way to read the file quickly and separate the image into its component bands.
And just for the record, I have tried messing with the list methods in the loops (using slices, not using slices, using only append, using only extend, etc.), but it doesn't particularly make a difference, as the major time is lost in the sheer number of iterations involved (width * height * numberOfBands).
Any suggestions or advice would be really helpful. Thanks.
If you can find a fast function to load the binary data into a big Python list (or numpy array), you can de-interleave the data using the slicing notation:

band0 = biglist[0::nbands]
band1 = biglist[1::nbands]
...
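For example, a minimal numpy sketch of that idea (my own; it assumes 8-bit samples, and open_bip_image is just an illustrative name):

import numpy as np
from PIL import Image

def open_bip_image(path, width, height, nbands):
    # read all the interleaved bytes in one call
    data = np.fromfile(path, dtype=np.uint8, count=width * height * nbands)
    # de-interleave: band i is every nbands-th byte, starting at offset i
    return [Image.fromarray(data[i::nbands].reshape(height, width))  # uint8 2D -> 'L'
            for i in range(nbands)]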
Does that help?
Standard PIL
To load an image from a file, use the open function in the Image module.
>>> import Image
>>> im = Image.open("lena.ppm")
If successful, this function returns an Image object. You can now use instance attributes to examine the file contents.
>>> print im.format, im.size, im.mode
PPM (512, 512) RGB
The format attribute identifies the source of an image. If the image was not read from a file, it is set to None. The size attribute is a 2-tuple containing width and height (in pixels). The mode attribute defines the number and names of the bands in the image, and also the pixel type and depth. Common modes are "L" (luminance) for greyscale images, "RGB" for true colour images, and "CMYK" for pre-press images.
The Python Imaging Library also allows you to work with the individual bands of a multi-band image, such as an RGB image. The split method creates a set of new images, each containing one band from the original multi-band image. The merge function takes a mode and a tuple of images, and combines them into a new image. The following sample swaps the three bands of an RGB image:
Splitting and merging bands
r, g, b = im.split()
im = Image.merge("RGB", (b, g, r))
So I think you should simply derive the mode and then split accordingly.
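For example (a small sketch, with multiband.tif as a stand-in file name):

from PIL import Image

im = Image.open('multiband.tif')
print(im.mode, im.getbands())   # e.g. 'RGB', ('R', 'G', 'B')
bands = im.split()              # one greyscale ('L') image per band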
PIL with Spectral Python (SPy python module)
However, as you pointed out in your comments below, you are not dealing with a normal RGB image with 3 bands. So to deal with that, Spectral Python (a pure Python module which requires PIL) might just be what you are looking for.
Specifically - http://spectralpython.sourceforge.net/class_func_ref.html#spectral.io.bipfile.BipFile
spectral.io.bipfile.BipFile deals with Image files with Band Interleaved Pixel (BIP) format.
Hope this helps.
I suspect that the repeated calls to extend are what's hurting you; it's better to allocate everything first:
from PIL import Image

def OpenBIPImage(file, width, height, numberOfBands):
    """
    Opens a raw image file in the BIP format and returns a list
    comprising each band as a separate PIL image.
    """
    with open(file, 'rb') as imageFile:
        data = imageFile.read()
    # allocate one zero-filled buffer per band up front
    bandArrays = [bytearray(height * width) for _ in range(numberOfBands)]
    currentPosition = 0
    for i in range(height * width):
        for j in range(numberOfBands):
            bandArrays[j][i] = data[currentPosition]
            currentPosition += 1
    bands = [Image.frombytes('L', (width, height), bytes(bandArray))
             for bandArray in bandArrays]
    return bands
My measurements don't show such a slowdown:
import time

def x():
    height, width, numberOfBands = 1401, 801, 6
    before = time.time()
    for i in range(height * width):
        for j in range(numberOfBands):
            pass
    print(time.time() - before)

>>> x()
0.937999963760376
I'm looking for a way to find the most dominant color/tone in an image using python. Either the average shade or the most common out of RGB will do. I've looked at the Python Imaging library, and could not find anything relating to what I was looking for in their manual, and also briefly at VTK.
I did, however, find a PHP script which does what I need, here (login required to download). The script seems to resize the image to 150*150 to bring out the dominant colors. However, after that I am fairly lost. I did consider writing something that would resize the image to a small size and then check every other pixel or so for its color, though I imagine this would be very inefficient (though implementing this idea as a C Python module might be an idea).
However, after all of that, I am still stumped. So I turn to you, SO. Is there an easy, efficient way to find the dominant color in an image?
Here's code making use of Pillow and Scipy's cluster package.
For simplicity I've hardcoded the filename as "image.jpg". Resizing the image is for speed: if you don't mind the wait, comment out the resize call. When run on this sample image,
it usually says the dominant colour is #d8c865, which corresponds roughly to the bright yellowish area to the lower left of the two peppers. I say "usually" because the clustering algorithm used has a degree of randomness to it. There are various ways you could change this, but for your purposes it may suit well. (Check out the options on the kmeans2() variant if you need deterministic results.)
import binascii

from PIL import Image
import numpy as np
import scipy.cluster

NUM_CLUSTERS = 5

print('reading image')
im = Image.open('image.jpg')
im = im.resize((150, 150))      # optional, to reduce time
ar = np.asarray(im)
shape = ar.shape
ar = ar.reshape(np.prod(shape[:2]), shape[2]).astype(float)

print('finding clusters')
codes, dist = scipy.cluster.vq.kmeans(ar, NUM_CLUSTERS)
print('cluster centres:\n', codes)

vecs, dist = scipy.cluster.vq.vq(ar, codes)     # assign codes
counts, bins = np.histogram(vecs, len(codes))   # count occurrences

index_max = np.argmax(counts)                   # find most frequent
peak = codes[index_max]
colour = binascii.hexlify(bytearray(int(c) for c in peak)).decode('ascii')
print('most frequent is %s (#%s)' % (peak, colour))
Note: when I expanded the number of clusters to find from 5 to 10 or 15, it frequently gave results that were greenish or bluish. Given the input image, those are reasonable results too... I can't tell which colour is really dominant in that image either, so I don't fault the algorithm!
Also a small bonus: save the reduced-size image with only the N most-frequent colours:
# bonus: save image using only the N most common colours
import imageio
c = ar.copy()
for i, code in enumerate(codes):
c[scipy.r_[scipy.where(vecs==i)],:] = code
imageio.imwrite('clusters.png', c.reshape(*shape).astype(np.uint8))
print('saved clustered image')
Try Color-thief. It is based on Pillow and works great.
Installation
pip install colorthief
Usage
from colorthief import ColorThief
color_thief = ColorThief('/path/to/imagefile')
# get the dominant color
dominant_color = color_thief.get_color(quality=1)
It can also find a color palette:
palette = color_thief.get_palette(color_count=6)
The Python Imaging Library has the getcolors method on Image objects:

im.getcolors() => a list of (count, color) tuples or None
I guess you can still try resizing the image before that and see if it performs any better.
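Something along these lines, for instance (the 150x150 size is just a guess borrowed from the question):

from PIL import Image

im = Image.open('image.jpg')
im.thumbnail((150, 150))                     # shrink first to keep the colour count down
colors = im.getcolors(maxcolors=150 * 150)   # raise the limit so it doesn't return None
count, color = max(colors, key=lambda t: t[0])
print(count, color)                          # most frequent (count, color) pair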
You can do this in many different ways. And you don't really need scipy and k-means, since Pillow already does that for you internally when you either resize the image or reduce it to a certain palette.
Solution 1: resize image down to 1 pixel.
def get_dominant_color(pil_img):
    img = pil_img.copy()
    img = img.convert("RGBA")
    img = img.resize((1, 1), resample=0)
    dominant_color = img.getpixel((0, 0))
    return dominant_color
Solution 2: reduce image colors to a palette

from PIL import Image

def get_dominant_color(pil_img, palette_size=16):
    # Resize image to speed up processing
    img = pil_img.copy()
    img.thumbnail((100, 100))

    # Reduce colors (uses k-means internally)
    paletted = img.convert('P', palette=Image.ADAPTIVE, colors=palette_size)

    # Find the color that occurs most often
    palette = paletted.getpalette()
    color_counts = sorted(paletted.getcolors(), reverse=True)
    palette_index = color_counts[0][1]
    dominant_color = palette[palette_index*3 : palette_index*3 + 3]

    return dominant_color
Both solutions give similar results. The latter probably gives you more accuracy, since we keep the aspect ratio when resizing the image. You also get more control, since you can tweak the palette_size.
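A possible usage, with image.jpg as a placeholder file:

from PIL import Image

img = Image.open('image.jpg')
print(get_dominant_color(img))  # works with either solution above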
It's not necessary to use k-means to find the dominant color, as Peter suggests. This overcomplicates a simple problem. You're also restricting yourself by the number of clusters you select, so basically you need an idea of what you're looking at.
As you mentioned and as suggested by zvone, a quick solution to find the most common/dominant color is by using the Pillow library. We just need to sort the pixels by their count number.
from PIL import Image

def find_dominant_color(filename):
    # Resizing parameters
    width, height = 150, 150
    image = Image.open(filename)
    image = image.resize((width, height), resample=0)

    # Get colors from image object
    pixels = image.getcolors(width * height)

    # Sort them by count number (first element of tuple)
    sorted_pixels = sorted(pixels, key=lambda t: t[0])

    # Get the most frequent color
    dominant_color = sorted_pixels[-1][1]
    return dominant_color
The only problem is that the method getcolors() returns None when the number of colors in the image is more than 256. You can deal with it by resizing the original image.
In all, it might not be the most precise solution, but it gets the job done.
If you're still looking for an answer, here's what worked for me, albeit not terribly efficient:
from PIL import Image
def compute_average_image_color(img):
    width, height = img.size

    r_total = 0
    g_total = 0
    b_total = 0

    count = 0
    for x in range(0, width):
        for y in range(0, height):
            r, g, b = img.getpixel((x, y))
            r_total += r
            g_total += g
            b_total += b
            count += 1

    return (r_total / count, g_total / count, b_total / count)

img = Image.open('image.png')
#img = img.resize((50, 50))  # Small optimization
average_color = compute_average_image_color(img)
print(average_color)
My solution
Here's my adaptation based on Peter Hansen's solution.
import scipy.cluster
import sklearn.cluster
import numpy
from PIL import Image

def dominant_colors(image):  # PIL image input
    image = image.resize((150, 150))  # optional, to reduce time
    ar = numpy.asarray(image)
    shape = ar.shape
    ar = ar.reshape(numpy.prod(shape[:2]), shape[2]).astype(float)

    kmeans = sklearn.cluster.MiniBatchKMeans(
        n_clusters=10,
        init="k-means++",
        max_iter=20,
        random_state=1000
    ).fit(ar)
    codes = kmeans.cluster_centers_

    vecs, _dist = scipy.cluster.vq.vq(ar, codes)       # assign codes
    counts, _bins = numpy.histogram(vecs, len(codes))  # count occurrences

    colors = []
    for index in numpy.argsort(counts)[::-1]:
        colors.append(tuple([int(code) for code in codes[index]]))
    return colors  # returns colors in order of dominance
What are the differences/improvements?
It's (subjectively) more accurate
It uses kmeans++ to pick the initial cluster centers, which gives better results. (kmeans++ may not be the fastest way to pick cluster centers, though.)
It's faster
Using sklearn.cluster.MiniBatchKMeans is significantly faster and gives very similar colors to the default KMeans algorithm. You can always try the slower sklearn.cluster.KMeans and compare the results and decide whether the tradeoff is worth it.
It's deterministic
I am using a random_state to get consistent output (I believe the original scipy.cluster.vq.kmeans also has a seed parameter). Before adding a random state I found that certain inputs could have significantly different outputs.
Benchmarks
I decided to very crudely benchmark a few solutions.
| Method | Time (100 iterations) |
| --- | --- |
| Peter Hansen (kmeans) | 58.85 |
| Artem Bernatskyi (Color Thief) | 61.29 |
| Artem Bernatskyi (Color Thief palette) | 15.69 |
| Pithikos (PIL resize) | 0.11 |
| Pithikos (palette) | 1.68 |
| Mine (mini batch kmeans) | 6.31 |
You could use PIL to repeatedly resize the image down by a factor of 2 in each dimension until it reaches 1x1. I don't know what algorithm PIL uses for downscaling by large factors, so going directly to 1x1 in a single resize might lose information. It might not be the most efficient, but it will give you the "average" color of the image.
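A small sketch of that idea (the halving loop and the bilinear filter choice are my own assumptions, not a reference implementation):

from PIL import Image

def average_color_by_halving(path):
    img = Image.open(path).convert('RGB')
    # halve each dimension until a single pixel is left
    while img.size != (1, 1):
        w, h = img.size
        img = img.resize((max(1, w // 2), max(1, h // 2)), Image.BILINEAR)
    return img.getpixel((0, 0))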
To add to Peter's answer, if PIL is giving you an image with mode "P" or pretty much any mode that isn't "RGBA", then you need to apply an alpha mask to convert it to RGBA. You can do that pretty easily with:
if im.mode == 'P':
    im.putalpha(0)
Below is a C++ Qt-based example to guess the predominant image colour. You can use PyQt and translate it to the Python equivalent.
#include <Qt/QtGui>
#include <Qt/QtCore>
#include <QtGui/QApplication>
int main(int argc, char** argv)
{
    QApplication app(argc, argv);

    QPixmap pixmap("logo.png");
    QImage image = pixmap.toImage();
    QRgb col;
    QMap<QRgb, int> rgbcount;
    QRgb greatest = 0;

    int width = pixmap.width();
    int height = pixmap.height();
    int count = 0;

    for (int i = 0; i < width; ++i)
    {
        for (int j = 0; j < height; ++j)
        {
            col = image.pixel(i, j);
            if (rgbcount.contains(col)) {
                rgbcount[col] = rgbcount[col] + 1;
            }
            else {
                rgbcount[col] = 1;
            }

            if (rgbcount[col] > count) {
                greatest = col;
                count = rgbcount[col];
            }
        }
    }

    qDebug() << count << greatest;
    return app.exec();
}
This is a complete script with a function compute_average_image_color().
Just copy and paste it, and change the path to your image.
My image is img_path='./dir/image001.png'
#AVERAGE COLOR, MIN, MAX, STANDARD DEVIATION
#SELECT ONLY NON-TRANSPARENT PIXELS
from PIL import Image
import sys
import os
import numpy as np
import math

def compute_average_image_color(img_path):
    if not os.path.isfile(img_path):
        print(img_path, "DOESN'T EXIST, EXIT")
        sys.exit()

    #load image
    img = Image.open(img_path).convert('RGBA')
    img = img.resize((50, 50))  # Small optimization

    #DEFINE SOME VARIABLES
    width, height = img.size
    r_total = 0
    g_total = 0
    b_total = 0
    count = 0
    red_list = []
    green_list = []
    blue_list = []

    #READ AND CHECK PIXEL BY PIXEL
    for x in range(0, width):
        for y in range(0, height):
            r, g, b, alpha = img.getpixel((x, y))
            if alpha != 0:
                red_list.append(r)
                green_list.append(g)
                blue_list.append(b)
                r_total += r
                g_total += g
                b_total += b
                count += 1

    #CALCULATE THE AVERAGE COLOR, MIN, MAX, ETC
    average_color = (round(r_total / count), round(g_total / count), round(b_total / count))
    print(average_color)

    red_list.sort()
    green_list.sort()
    blue_list.sort()

    red_min_max = [min(red_list), max(red_list)]
    green_min_max = [min(green_list), max(green_list)]
    blue_min_max = [min(blue_list), max(blue_list)]
    print('red_min_max: ', red_min_max)
    print('green_min_max: ', green_min_max)
    print('blue_min_max: ', blue_min_max)

    #variance and standard deviation
    red_stddev = round(math.sqrt(np.var(red_list)))
    green_stddev = round(math.sqrt(np.var(green_list)))
    blue_stddev = round(math.sqrt(np.var(blue_list)))
    print('red_stddev: ', red_stddev)
    print('green_stddev: ', green_stddev)
    print('blue_stddev: ', blue_stddev)
img_path='./dir/image001.png'
compute_average_image_color(img_path)