I'm playing around with a camera for a microscope using Micro-Manager 1.4. Using the Python interface, I've managed to access the camera, change exposure time etc, and I can capture individual images.
However, each image is returned as a NumPy array where each pixel is represented as a single integer, e.g. "7765869". As far as I can find online, this matches Java's "BufferedImage" packing, which means the RGB values are encoded as:
BufferedImage = R * 2^16 + G * 2^8 + B
My question is: How can I, using e.g. Numpy or OpenCV, convert this kind of array into a more handy array where each pixel is a RGB triplet of uint8 values? Needless to say, the conversion should be as efficient as possible.
The easiest approach is to let numpy do the conversion for you. Your numpy array will probably be of type np.uint32. If you view it as an array of np.uint8, every pixel expands into its four raw bytes. With the packing above, on a little-endian machine that gives a BGR0 layout: the values of B, G, and R for each pixel, plus an empty np.uint8 following (the least significant byte comes first). It's easy to reshape and discard that zero value:
>>> img = np.array([7765869, 16777215], dtype=np.uint32)
>>> img.view(np.uint8)
array([109, 127, 118, 0, 255, 255, 255, 0], dtype=uint8)
>>> img.view(np.uint8).reshape(img.shape+(4,))[..., :3]
array([[109, 127, 118],
       [255, 255, 255]], dtype=uint8)
The best thing is that there is no calculation or copying of data, just a reinterpretation of the contents of your original image: I don't think you can get much more efficient than that! (If you need R, G, B order rather than B, G, R, append [..., ::-1] to reverse the last axis; that is still just a view.)
I recall that some OpenCV operations require a contiguous array, so you may have to append .copy() to the end of that expression to really get rid of the column of zeros rather than simply ignore it. Of course, that triggers the copy of data the code above had avoided.
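For instance, a minimal sketch of that copy step; np.ascontiguousarray does the same job here as appending .copy():
import numpy as np

img = np.array([7765869, 16777215], dtype=np.uint32)
# The sliced view is non-contiguous because of the skipped padding byte.
view = img.view(np.uint8).reshape(img.shape + (4,))[..., :3]
print(view.flags['C_CONTIGUOUS'])    # False

# Force a contiguous copy for APIs (e.g. some OpenCV calls) that need one.
packed = np.ascontiguousarray(view)
print(packed.flags['C_CONTIGUOUS'])  # True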
One way is:
Red = BufferedImage // 2**16
Green = (BufferedImage % 2**16) // 2**8
Blue = BufferedImage % 2**8
However, I doubt it's the most elegant (Pythonic?) or the fastest way.
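As a quick sanity check, applying this to the sample value from the question recovers the expected channels:
>>> BufferedImage = 7765869
>>> BufferedImage // 2**16, (BufferedImage % 2**16) // 2**8, BufferedImage % 2**8
(118, 127, 109)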
rgbs = [((x&0xff0000)>>16,(x&0xff00)>>8,x&0xff) for x in values]
at least I think ...
afaik the formula above can also be written as
BufferedRGB = (RED << 16) + (GREEN << 8) + BLUE
(note the parentheses: in Python, + binds more tightly than <<, so without them the values are shifted by the wrong amounts)
red, green, blue = 0xFF, 0x99, 0xAA
(red << 16) + (green << 8) + blue  # = 0xFF99AA (packed into one value)
#apply a bitmask to get colors back
red = (0xFF99AA & 0xFF0000) >> 16 # = 0xFF
green = (0xFF99AA & 0xFF00) >> 8 # = 0x99
blue = 0xFF99AA & 0xFF # = 0xAA
which is somewhat more readable to me and makes it clear what is going on
The fastest approach would probably be to keep this in numpy:
import numpy as np

x = np.array([211*2**16 + 11*2**8 + 7])  # test data
r = np.bitwise_and(np.right_shift(x, 16), 255)
g = np.bitwise_and(np.right_shift(x, 8), 255)
b = np.bitwise_and(x, 255)
print(r, g, b)
[211] [11] [7]
Related
I have a numpy array where each element has 3 values (RGB) from 0 to 255, and it spans from [0, 0, 0] to [255, 255, 255] with 256 elements evenly spaced. I want to plot it as a 16 by 16 grid but have no idea how to map the colors (as the numpy array) to the data to create the grid.
import numpy as np
# create an evenly spaced RGB representation as integers
all_colors_int = np.linspace(0, (255 << 16) + (255 << 8) + 255, dtype=int)
# convert the evenly spaced integers to RGB representation
rgb_colors = np.array(tuple(((((255<<16)&k)>>16), ((255<<8)&k)>>8, (255)&k) for k in all_colors_int))
# data to fit the rgb_colors as colors into a plot as a 16 by 16 numpy array
data = np.array(tuple((k,p) for k in range(16) for p in range(16)))
So, how do I map rgb_colors as colors onto the data array to create the grid plot?
There's quite a bit going on here, and I think it's valuable to talk about it.
linspace
I suggest you read the linspace documentation.
https://numpy.org/doc/stable/reference/generated/numpy.linspace.html
If you want a 16x16 grid, then you should start by generating 16x16 = 256 values. However, if you inspect the shape of the all_colors_int array, you'll notice that it contains only 50 values, which is the default for linspace's num argument.
all_colors_int = np.linspace(0, (255 << 16) + (255 << 8) + 255, dtype=int)
print(all_colors_int.shape) # (50,)
Make sure you specify this third 'num' argument to generate the correct quantity of RGB pixels.
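For example:
all_colors_int = np.linspace(0, (255 << 16) + (255 << 8) + 255, num=256, dtype=int)
print(all_colors_int.shape)  # (256,)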
As a further side note, (255 << 16) + (255 << 8) + 255 is equivalent to (2^24)-1. The 2^N-1 formula is usually what's used to fill the first N bits of an integer with 1's.
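A quick check of that equivalence:
print(hex((1 << 24) - 1))                    # 0xffffff
print(hex((255 << 16) + (255 << 8) + 255))   # 0xffffff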
numpy is faster
On your next line, your for loop manually iterates over all of the elements in Python.
rgb_colors = np.array(tuple(((((255<<16)&k)>>16), ((255<<8)&k)>>8, (255)&k) for k in all_colors_int))
While this might work, this isn't considered the correct way to use numpy arrays.
You can directly perform bitwise operations to the entire numpy array without the python for loop. For example, to extract bits [16, 24) (which is usually the red channel in an RGB integer):
# Shift over so the 16th bit is now bit 0, then select only the first 8 bits.
RedChannel = (all_colors_int >> 16) & 255
Building the grid
There are many ways to do this in numpy; however, I would suggest the following approach.
Images are usually represented as a 3-dimensional numpy array of the form
(HEIGHT, WIDTH, CHANNELS)
First, reshape your numpy int array into the 16x16 grid that you want.
reshaped = all_colors_int.reshape((16, 16))
Again, the numpy documentation is really great, give it a read:
https://numpy.org/doc/stable/reference/generated/numpy.reshape.html
Now, extract the red, green and blue channels from this reshaped array, as described above. If you operate directly on the numpy array, you won't need a nested for loop to iterate over the 16x16 grid; numpy will handle this for you.
RedChannel = (reshaped >> 16) & 255
GreenChannel = (reshaped >> 8) & 255
BlueChannel = reshaped & 255
And then finally, we can combine the three 16x16 grids into a single 16x16x3 grid, using the numpy stack function
https://numpy.org/doc/stable/reference/generated/numpy.stack.html
grid_rgb = np.stack((
    RedChannel,
    GreenChannel,
    BlueChannel
), axis=2).astype(np.uint8)
Notice two things here:
When we 'stack' arrays, we create a new dimension. The axis=2 argument tells numpy to add this new dimension at index 2 (i.e. the third axis). Without this, the shape of our grid would be (3, 16, 16) instead of (16, 16, 3)
The .astype(np.uint8) casts all of the values in this numpy array to the uint8 data type. This makes the grid compatible with other image manipulation libraries, such as OpenCV and PIL.
Show the image
We can use PIL for this.
If you want to use OpenCV, then remember that OpenCV interprets images as BGR not RGB and so your channels will be inverted.
# Show Image
from PIL import Image
Image.fromarray(grid_rgb).show()
If you've done everything right, you'll see an image... And it's all gray.
Why is it gray?
There are over 16 million possible colours. Selecting only 256 evenly spaced values means stepping by (2^24 - 1) / 255 = 65793 = 0x010101, i.e. each step increments R, G and B by exactly the same amount, so every selected pixel has equal R, G and B values, which results in an image without any colour.
If you want to see some colours, you'll need to either show a bigger image (e.g. 256x256) or use a dimension that's not a power of two. For example, try a prime number, as this will add a small amount of pseudo-randomness to the RGB selection; e.g. try 17.
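Putting the steps together, here's a minimal end-to-end sketch of the grid, using n = 17 as suggested so the three channels actually diverge:
import numpy as np
from PIL import Image

n = 17  # a non-power-of-two grid size produces visible colour variation
all_colors_int = np.linspace(0, (1 << 24) - 1, num=n * n, dtype=int)
reshaped = all_colors_int.reshape((n, n))

# Extract the three channels with bitwise operations, then stack to HxWx3.
grid_rgb = np.stack((
    (reshaped >> 16) & 255,  # red
    (reshaped >> 8) & 255,   # green
    reshaped & 255,          # blue
), axis=2).astype(np.uint8)

# Upscale with nearest-neighbour so the tiny grid is easy to inspect.
Image.fromarray(grid_rgb).resize((n * 16, n * 16), Image.NEAREST).show()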
Best of luck.
Based solely on the title 'How to plot a normalized RGB map' rather than the approach you've provided, it appears that you'd like to plot a colour spectrum in RGB.
The following approach can be taken to manually construct this.
import cv2
import matplotlib.pyplot as plt
import numpy as np
h = np.repeat(np.arange(0, 180), 180).reshape(180, 180)
s = np.ones((180, 180))*255
v = np.ones((180, 180))*255
hsv = np.stack((h, s, v), axis=2).astype('uint8')
rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)
plt.imshow(rgb)
Explanation:
It's generally easier to construct (and decompose) a colour palette using the HSV (hue, saturation, value) colour scale; where hue is the colour itself, saturation can be thought of as the intensity and value as the distance from black. Therefore, there's really only one value to worry about, hue. Saturation and value can be set to 255, for 'full intensity'.
cv2 is used here to simply convert the constructed HSV colourscale to RGB and matplotlib is used to plot the image. (I didn't use cv2 for plotting as it doesn't play nicely with Jupyter.)
The actual spectrum values are constructed in numpy.
Breakdown:
Create the colour spectrum of hue and plug 255 in for the saturation and value. Why is 180 used? OpenCV stores hue for 8-bit images in the range [0, 180): the usual 0-360 degree hue circle is halved so that it fits in a uint8.
h = np.repeat(np.arange(0, 180), 180).reshape(180, 180)
s = np.ones((180, 180))*255
v = np.ones((180, 180))*255
Stack the three channels H+S+V into a 3-dimensional array, convert the array values to unsigned 8-bit integers, and have cv2 convert from HSV to RGB for us, to be lazy and save us working out the math.
hsv = np.stack((h, s, v), axis=2).astype('uint8')
rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)
Plot the RGB image.
plt.imshow(rgb)
I'll explain what I mean with this example:
I converted an image to a 3D np.array of this form: [row * col * [r, g, b, alpha, x, y]]
Notes:
The 3rd dimension is a numpy array containing the rgba values of the pixel as well as its coordinates
To be precise, the rgb values are integers between 0 and 255
I want to modify each pixel (img[row][col]) using a function _func, and I am not sure if I can use np.vectorize for this.
An example of _func would be to change the pixel's rgb values so that they are all equal (by calculating their mean), so that the pixel becomes gray (_func doesn't modify the alpha or the x, y of the pixel):
_func(np.array([100, 50, 0, 100, 255, 0, 0])) -> np.array([50, 50, 50, 100, 255, 0, 0])
Can someone show me something equivalent to this code that actually works?
# imports
img = f(img.bmp)  # convert img to numpy array, f defined earlier

def _func(arr):
    "_func implementation here"

func = np.vectorize(_func, depth=2)  # depth=2 because there are rows and columns
new_img = func(img)
I know that this depth parameter doesn't exist, but can I implement this somehow?
Thank you for any help.
(btw, I started learning numpy this evening for a little fun project, so I am a complete beginner at numpy)
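For what it's worth, the closest working equivalent to the hypothetical depth=2 is np.apply_along_axis, which maps a function over the last axis, i.e. over each pixel array. A minimal sketch, assuming the graying _func described above:
import numpy as np

def _func(pixel):
    # Replace r, g, b with their mean; leave alpha and x, y untouched.
    out = pixel.copy()
    out[:3] = int(pixel[:3].mean())
    return out

img = np.zeros((2, 2, 7), dtype=int)
img[0, 0] = [100, 50, 0, 100, 255, 0, 0]

new_img = np.apply_along_axis(_func, 2, img)
print(new_img[0, 0])  # [ 50  50  50 100 255   0   0]
Note that apply_along_axis (like np.vectorize) is still a Python-level loop under the hood, so for speed you would compute the mean on the whole array at once, e.g. img[..., :3] = img[..., :3].mean(axis=2, keepdims=True).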
I have a large set of large images (5000 x 10000 pixels, 3 RGB channels) from a semantic segmentation process. I am trying to create a new image containing the most "common" value for each pixel, i.e. the per-pixel mode over the complete set. The images have some particularities. They all have the same size, but they sometimes contain black pixels that represent "no information" and must be excluded from the mode calculation. Merging the whole image set, I want to determine which pixel colour tuple (r, g, b) is the most common at each position and store that as a new image without black pixels.
I have tried using scipy's stats.mode to analyse a list of np.arrays from the images, but this method does not exclude the (0, 0, 0) tuple the way nan_policy='omit' excludes NaNs, so the calculation returns a black image; (0, 0, 0) is the most frequent pixel colour after all.
I also tried replacing the (0, 0, 0) tuples with NaN values, but then RAM usage goes up really fast; it is not efficient.
Could anyone give me a hint of some vectorised method to implement this stat calculation?
Thanks!
Some sample images: img1, img2, img3, img4
It sounds like you stored mixed tuples and NaN values in a numpy array. This is not very efficient, because that would be an object array that needs to handle memory allocation separately for each pixel.
It is better to convert each RGB tuple to a single floating-point value. A single-precision float can store integers up to 2**24 - 1 without loss of precision; that is just enough for storing 24-bit RGB values.
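A quick check of that precision claim:
import numpy as np
print(np.float32(2**24 - 1) == 2**24 - 1)  # True: exactly representable
print(np.float32(2**24 + 1) == 2**24 + 1)  # False: rounds to 2**24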
Here is how to do it with 5 images of 50x100 pixels.
import numpy as np
from scipy.stats import mode as stats_mode

ny, nx = 50, 100
imgs = np.random.randint(255, size=(5, ny, nx, 3), dtype=np.uint8)
imgs[:3, ny//2, nx//2, :] = 0                # ignore these
imgs[3:, ny//2, nx//2, :] = [255, 255, 254]  # find this

my = 10  # slice size - must divide ny
mode_img = np.zeros((ny, nx, 3), dtype=np.uint8)
flt_imgs = np.zeros((5, my, nx), dtype=np.float32)

for iy in range(0, ny, my):
    yslice = slice(iy, iy+my)
    # Pack each RGB triple into one float: r*65536 + g*256 + b.
    flt_imgs[:] = imgs[:, yslice, :, 0].astype(np.float32)*(256*256)
    flt_imgs += imgs[:, yslice, :, 1].astype(np.float32)*256
    flt_imgs += imgs[:, yslice, :, 2]
    flt_imgs[flt_imgs == 0] = np.nan  # all-black pixels drop out of the mode
    mode_result = stats_mode(flt_imgs, axis=0, nan_policy='omit')
    imode = mode_result.mode[0].astype(np.int32)
    # Unpack the winning float back into R, G, B channels.
    mode_img[yslice, :, 0] = (imode >> 16) & 0xff
    mode_img[yslice, :, 1] = (imode >> 8) & 0xff
    mode_img[yslice, :, 2] = imode & 0xff
print(f'Found mode: {mode_img[ny//2, nx//2]}')
Output:
Found mode: [255 255 254]
I wrote this code to switch the red and blue values in the RGB array from a given image:
from PIL import Image
import numpy as np
image = Image.open("image.jpg")
RGBarr = np.asarray(image)
newRGB = np.full_like(RGBarr, 1)
red = RGBarr[..., 0]
green = RGBarr[...,1]
blue = RGBarr[..., 2]
newRGB[..., 0] = blue
newRGB[..., 1] = green
newRGB[..., 2] = red
inv_image = Image.fromarray(newRGB, 'RGB')
inv_image.save('inv_image.png')
inv_image.show()
I tried it with multiple images, and it works almost every time. However, in some cases I get the following error:
raise ValueError("not enough image data")
ValueError: not enough image data
That can be fixed if I do not specify the mode in Image.fromarray(obj, mode), but even doing that I am not sure if the result I obtain is the "correct" one.
Is there a way to determine what mode should be used for a certain image?
I hope this is not a dumb question, but I am sort of new in this image processing business.
The error occurs when you try to read images which are not RGB, like grayscale images or RGBA images. To keep the rest of your code valid, the easiest way is to enforce RGB input by using:
image = Image.open("image.jpg").convert('RGB')
Then, possible grayscale or RGBA images are converted to RGB, and can be processed as regular RGB images.
As you found out yourself,
inv_image = Image.fromarray(newRGB)
also works, but then the processing in the rest of your code is no longer correct (the slicing no longer addresses the desired dimensions/axes). Supporting grayscale or RGBA images properly would require further changes to your code.
Hope that helps!
EDIT: To incorporate furas' idea to get rid of NumPy, here's a PIL only way of swapping the channels. Notice: You still need the enforced RGB input.
from PIL import Image
image = Image.open('image.jpg').convert('RGB')
r, g, b = image.split()
inv_image = Image.merge('RGB', (b, g, r))
inv_image.save('inv_image.png')
inv_image.show()
If you want to re-order RGB channels to BGR with Numpy, it is much simpler to do this:
BGR = RGB[...,::-1]
which just addresses the last index (i.e. the channels) in reverse. It has the benefit of being O(1), meaning it takes the same amount of time regardless of the size of the array, because it returns a view rather than copying. On my Mac, it takes 180ns to do BGR->RGB with a 10x10 image and just the same with a 10,000x10,000 image.
In general, you may want some other ordering rather than straight reversal, so if you want BGR->BRG, you can do:
BRG = BGR[...,(0,2,1)]
Or, if you want to make a 3-channel greyscale image by repeating the Green channel three times (because the green is usually the least noisy - see Wikipedia Bayer array article), you can simply do this:
RGBgrey = BGR[...,(1,1,1)]
If you want to get rid of Numpy, you can do it straight in PIL/Pillow using a matrix multiplication:
# Open image
im = Image.open('image.jpg')
# Define matrix to re-order RGB->BGR
Matrix = ( 0, 0, 1, 0,
           0, 1, 0, 0,
           1, 0, 0, 0)
# Apply the matrix: RGB -> BGR
BGR = im.convert("RGB", Matrix)
You can understand the matrix like this:
newR = 0*oldR + 0*oldG + 1*oldB + 0 offset
newG = 0*oldR + 1*oldG + 0*oldB + 0 offset
newB = 1*oldR + 0*oldG + 0*oldB + 0 offset
(Input and result images omitted.)
I have a project where I want to locate a bunch of arrows in images that look like so: ibb.co/dSCAYQ
with the following template: ibb.co/jpRUtQ
I'm using cv2's template matching feature in Python. My algorithm is to rotate the template 360 degrees and match for each rotation. I get the following result: ibb.co/kDFB7k
As you can see, it works well except for the 2 arrows that are really close, such that another arrow is in the black region of the template.
I am trying to use a mask, but it seems that cv2 is not applying my masks at all, i.e. no matter what values that mask array has, the matching is the same. Have been trying this for two days but cv2's limited documentation is not helping.
Here is my code:
import numpy as np
import cv2
import os
from scipy import misc, ndimage
STRIPPED_DIR = ...  # image dir
TMPL_DIR = ...  # template dir
MATCH_THRESH = 0.9
MATCH_RES = 1 #specifies degree-interval at which to match
def make_templates():
    base = misc.imread(os.path.join(TMPL_DIR, 'base.jpg'))  # The templ that I rotate to make 360 templates
    for deg in range(360):
        print('making template: ' + str(deg))
        tmpl = ndimage.rotate(base, deg)
        misc.imsave(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.jpg'), tmpl)

def make_masks():
    for deg in range(360):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.jpg'), 0)
        ret2, mask = cv2.threshold(tmpl, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        cv2.imwrite(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.jpg'), mask)

def match(img_name):
    img_rgb = cv2.imread(os.path.join(STRIPPED_DIR, img_name))
    img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    for deg in range(0, 360, MATCH_RES):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.jpg'), 0)
        mask = cv2.imread(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.jpg'), 0)
        w, h = tmpl.shape[::-1]
        res = cv2.matchTemplate(img_gray, tmpl, cv2.TM_CCORR_NORMED, mask=mask)
        loc = np.where(res >= MATCH_THRESH)
        for pt in zip(*loc[::-1]):
            cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
    cv2.imwrite('res.png', img_rgb)
Some things that I think could be wrong, but I am not sure how to fix:
The number of channels the mask/tmpl/img should have. I have tried an example with colored 4-channel pngs (a Stack Overflow example), but I am not sure how it translates to grayscale or 3-channel jpegs.
The values of the mask array, e.g. should masked-out pixels be 1 or 255?
Any help is greatly appreciated.
UPDATE
I fixed a trivial error in my code: mask=mask must be passed as a keyword argument to matchTemplate(). This, combined with using mask values of 255, made the difference. However, now I get a ton of false positives, like so:
http://ibb.co/esfTnk Note that the false positives are more strongly correlated than the true positives.
Any pointers on how to fix my masks to resolve this? Right now I am simply using a black-and-white conversion of my templates.
You've already figured out the first questions, but I'll expand a bit on them:
For a binary mask, it should be of type uint8, where the values are simply zero or non-zero: locations with zero are ignored, and locations with non-zero values are included in the match. You can instead pass a float32 mask, in which case it lets you weight the pixels: a value of 0 means ignore, 1 means include, and 0.5 means include but give it only half as much weight as another pixel.
Note that a mask is only supported for TM_SQDIFF and TM_CCORR_NORMED, but that's fine since you're using the latter. Masks for matchTemplate are single channel only. And as you found out, mask is not a positional argument, so it must be passed by keyword, mask=your_mask. All of this is pretty explicit in this page on the OpenCV docs.
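As a small illustration of the two mask flavours (the shapes here are hypothetical; the mask must match your template's size):
import numpy as np

h, w = 32, 32  # template size (illustrative)

# uint8 mask: zero = ignore, anything non-zero = include.
binary_mask = np.full((h, w), 255, dtype=np.uint8)
binary_mask[:, : w // 2] = 0  # ignore the left half of the template

# float32 mask: per-pixel weights between 0 and 1.
weighted_mask = binary_mask.astype(np.float32) / 255.0
weighted_mask[:, -4:] = 0.5  # give the right edge half weight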
Now to the new issue:
It's related to the method you're using and the fact that you're using jpgs. Have a look at the formulas for the normed methods. Where the image is completely zero, you're going to get faulty results because you'll be dividing by zero. But that's not the exact problem, because dividing by zero returns nan, and np.nan > value always returns false, so you'll never be drawing a square from nan values.
Instead, the problem is right at the edge cases, where you get a hint of a non-zero value. Because you're using jpg images, not all black values are exactly 0; in fact, many aren't. Note from the formula that you're dividing by the mean values, and the mean values will be extremely small when you have values like 1, 2, 5, etc. inside your image window, so the correlation value blows up. You should use TM_SQDIFF instead (because it's the only other method which allows a mask). Additionally, because you're using jpg, most of your masks are worthless, since any non-zero value (even 1) counts as an inclusion. You should use pngs for the masks. As long as the templates have a proper mask, it shouldn't matter whether you use jpg or png for the templates.
With TM_SQDIFF, instead of looking for the maximum values, you're looking for the minimum: you want the smallest difference between the template and the image patch. You know that the difference should be really small, exactly 0 for a pixel-perfect match, which you probably won't get. You can play around with the threshold a little bit. Note that you're always going to get pretty close values for every rotation, because of the nature of your template: the little arrow bar hardly adds that many positive values, and it's not guaranteed that the one-degree discretization is exactly right (unless you made the image that way). But even an arrow facing the totally wrong direction is still going to be extremely close, since there's a lot of overlap; and an arrow facing close to the right direction will produce values really close to those for the exactly right direction.
Preview what the result of the square difference is while you're running the code:
res = cv2.matchTemplate(img_gray, tmpl, cv2.TM_SQDIFF, mask=mask)
cv2.imshow("result", res.astype(np.uint8))
if cv2.waitKey(0) & 0xFF == ord('q'):
    break
You can see that basically every orientation of template matches closely.
Anyways, it seems a threshold of 8 nailed it:
The only thing I modified in your code was changing to pngs for all images, switching to TM_SQDIFF, making sure loc looks for values less than the threshold instead of greater than, and using a MATCH_THRESH of 8. At least I think that's all I changed. Have a look just in case:
import numpy as np
import cv2
import os
from scipy import misc, ndimage

STRIPPED_DIR = ...
TMPL_DIR = ...
MATCH_THRESH = 8
MATCH_RES = 1  # specifies degree-interval at which to match

def make_templates():
    base = misc.imread(os.path.join(TMPL_DIR, 'base.jpg'))  # The templ that I rotate to make 360 templates
    for deg in range(360):
        print('making template: ' + str(deg))
        tmpl = ndimage.rotate(base, deg)
        misc.imsave(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.png'), tmpl)

def make_masks():
    for deg in range(360):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.png'), 0)
        ret2, mask = cv2.threshold(tmpl, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        cv2.imwrite(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.png'), mask)

def match(img_name):
    img_rgb = cv2.imread(os.path.join(STRIPPED_DIR, img_name))
    img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    for deg in range(0, 360, MATCH_RES):
        tmpl = cv2.imread(os.path.join(TMPL_DIR, 'tmp' + str(deg) + '.png'), 0)
        mask = cv2.imread(os.path.join(TMPL_DIR, 'mask' + str(deg) + '.png'), 0)
        w, h = tmpl.shape[::-1]
        res = cv2.matchTemplate(img_gray, tmpl, cv2.TM_SQDIFF, mask=mask)
        loc = np.where(res < MATCH_THRESH)
        for pt in zip(*loc[::-1]):
            cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
    cv2.imwrite('res.png', img_rgb)