Image.frombuffer with 16-bit image data - python

If my Windows display is in 32-bit color depth mode, then the following code gets a nice PIL Image from a window:
def image_grab_native(window):
    hwnd = win32gui.GetDesktopWindow()
    left, top, right, bot = get_rect(window)
    w = right - left
    h = bot - top
    hwndDC = win32gui.GetWindowDC(hwnd)
    mfcDC = win32ui.CreateDCFromHandle(hwndDC)
    saveDC = mfcDC.CreateCompatibleDC()
    saveBitMap = win32ui.CreateBitmap()
    saveBitMap.CreateCompatibleBitmap(mfcDC, w, h)
    saveDC.SelectObject(saveBitMap)
    saveDC.BitBlt((0, 0), (w, h), mfcDC, (left, top), win32con.SRCCOPY)
    bmpinfo = saveBitMap.GetInfo()
    bmpstr = saveBitMap.GetBitmapBits(True)
    im = Image.frombuffer(
        'RGB',
        (bmpinfo['bmWidth'], bmpinfo['bmHeight']),
        bmpstr, 'raw', 'BGRX', 0, 1)
    win32gui.DeleteObject(saveBitMap.GetHandle())
    saveDC.DeleteDC()
    mfcDC.DeleteDC()
    win32gui.ReleaseDC(hwnd, hwndDC)
    return im
However, when running in 16-bit mode, I get the error:
>>> image_grab_native(win)
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    image_grab_native(win)
  File "C:\claudiu\bumhunter\finderbot\ezpoker\utils\win32.py", line 204, in image_grab_native
    bmpstr, 'raw', 'BGRX', 0, 1)
  File "c:\python25\lib\site-packages\PIL\Image.py", line 1808, in frombuffer
    return apply(fromstring, (mode, size, data, decoder_name, args))
  File "c:\python25\lib\site-packages\PIL\Image.py", line 1747, in fromstring
    im.fromstring(data, decoder_name, args)
  File "c:\python25\lib\site-packages\PIL\Image.py", line 575, in fromstring
    raise ValueError("not enough image data")
ValueError: not enough image data
How should I form the frombuffer call to work in 16-bit mode? Also how can I make this function work in any bit depth mode, instead of say having to pass it as a parameter?
UPDATE: From this question I learned I must use "BGR;16" instead of "BGRX" for the 2nd mode parameter. It takes a correct picture, either specifying stride or not. The problem is that the pixel values are slightly off on some values:
x    y   native           ImageGrab
280  0   (213, 210, 205)  (214, 211, 206)
280  20  (156, 153, 156)  (156, 154, 156)
280  40  (213, 210, 205)  (214, 211, 206)
300  0   (213, 210, 205)  (214, 211, 206)
This is just a sample of values taken from the same window. The screenshots look identical to the naked eye, but I have to do some pixel manipulation. The reason I want to use the native approach at all is that it's a bit faster and it behaves better when running inside virtual machines with dual monitors (yes, pretty randomly complicated, I know).

For the stride parameter, you need to give the row size in bytes. Your pixels are 16 bits each, so you might naively assume stride = 2 * bmpinfo['bmWidth']; unfortunately, Windows pads each row so that the stride is a multiple of 32 bits. That means you have to round it up to the next multiple of 4: stride = ((stride + 3) // 4) * 4.
The documentation doesn't mention a 16-bit raw format so you'll have to check the Unpack.c module to see what's available.
The final thing you'll notice is that Windows likes to make its bitmaps upside down.
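Putting those pieces together, a minimal sketch of the 16-bit call might look like this, reusing bmpinfo and bmpstr from the question (the "BGR;16" raw mode is the one from the update, and the last argument is the raw decoder's line orientation):
w, h = bmpinfo['bmWidth'], bmpinfo['bmHeight']

stride = 2 * w                    # 16 bits = 2 bytes per pixel
stride = ((stride + 3) // 4) * 4  # pad each row up to a multiple of 4 bytes (32 bits)

# Orientation 1 means the first line in the buffer is the top of the image;
# pass -1 instead if the result comes out upside down.
im = Image.frombuffer('RGB', (w, h), bmpstr, 'raw', 'BGR;16', stride, 1)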
Edit: Your final little problem is easily explained - the conversion from 16 bit to 24 bit is not precisely defined, and an off-by-one difference between two different conversions is perfectly normal. It wouldn't be hard to adjust the data after you've converted it, as I'm sure the differences are constant based on the value.


Python: OpenCV: How to get an undistorted image without the cropping?

This is similar to nkint's question from September 11, 2013. Link is here:
how to get all undistorted image with opencv
I'm a new user, so I didn't have enough reputation/clout to comment on the OP.
I have tried to emulate the code andrewmkeller posted, using Python instead of C++, with some minor changes based on Josh Bosch's response. The result is the following:
#!/usr/bin/env python
import cv2
import numpy as np

def loadUndistortedImage(fileName):
    # load image
    image = cv2.imread(fileName)
    #print(image)

    # set distortion coeff and intrinsic camera matrix (focal length, centerpoint offset, x-y skew)
    cameraMatrix = np.array([[894.96803896, 0, 470.38713516], [0, 901.32629374, 922.41232898], [0, 0, 1]])
    distCoeffs = np.array([[-0.340671222, 0.110426603, -.000867987573, 0.000189669273, -0.0160049526]])

    # setup enlargement and offset for new image
    y_shift = 60  # experiment with
    x_shift = 70  # experiment with
    imageShape = image.shape  # image.size
    print(imageShape)
    imageSize = (int(imageShape[0]) + 2 * y_shift, int(imageShape[1]) + 2 * x_shift, 3)
    print(imageSize)

    # create a new camera matrix with the principal point offset according to the offset above
    newCameraMatrix, validPixROI = cv2.getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, imageSize, 1)
    #newCameraMatrix = cv2.getDefaultNewCameraMatrix(cameraMatrix, imageSize, True)  # imageSize, True

    # create undistortion maps
    R = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    map1, map2 = cv2.initUndistortRectifyMap(cameraMatrix, distCoeffs, R, newCameraMatrix, imageSize, cv2.CV_16SC2)

    # remap
    outputImage = cv2.remap(image, map1, map2, cv2.INTER_LINEAR)

    # save output image as file with "_undistorted" appended to name - only works with .jpg files at the moment
    index = fileName.find('.jpg')
    fixed_fileName = fileName[:index] + '_undistorted' + fileName[index:]
    cv2.imwrite(fixed_fileName, outputImage)
    cv2.imshow('fix_img', outputImage)
    cv2.waitKey(0)
    return

# Undistort the images, then save the restored images
loadUndistortedImage('./calib/WIN_20200626_11_29_16_Pro.jpg')
This seemed good to me, but then problems came up when trying to use cv2.getOptimalNewCameraMatrix or cv2.getDefaultNewCameraMatrix and cv2.initUndistortRectifyMap. I kept getting told that 'the argument takes exactly 2 arguments (3 given)' even though I am putting the parameters as specified in their documentation here:
https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
https://docs.opencv.org/2.4/modules/imgproc/doc/geometric_transformations.html
I can remove the error from "...getDefault..." if I remove the optional params, but I'd rather not do that.
Stacktrace:
Traceback (most recent call last):
  File ".\main.py", line 46, in <module>
    loadUndistortedImage('./<image file name>.jpg')
  File ".\main.py", line 27, in loadUndistortedImage
    newCameraMatrix, validPixROI = cv2.getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, imageSize, 1)
TypeError: function takes exactly 2 arguments (3 given)
I don't have enough reputation to comment, but you could try:
newcameramatrix, _ = cv2.getOptimalNewCameraMatrix(
    camera_matrix, dist_coeffs, (width, height), 1, (width, height)
)
According to this, that's how the function should be called.
Now, instead of getting the undistorted image with cv2.initUndistortRectifyMap, you could just do:
undistorted_image = cv2.undistort(
    image, camera_matrix, dist_coeffs, None, newcameramatrix
)
cv2.imshow("undistorted", undistorted_image)
Following up on my comment on Sebastian Liendo's answer, and also thanks to a Finnish responder on GitHub (whose Issues page is not for this sort of general question, I learned), here are 1) the updated documentation for the Python functions, and 2) the heart of my revised code, which does a decent job of getting around the cropping. (Don't do what I did in the question and post the ENTIRE code, just the part essential to your question.)
https://docs.opencv.org/4.3.0/d9/d0c/group_calib3d.html#ga7a6c4e032c97f03ba747966e6ad862b1
# load image
image = cv2.imread(fileName)
#images = glob.glob(pathName + '*.jpg')  # loop within a specified directory
#for fileName in images:
#    image = cv2.imread(fileName)

# set camera parameters
height, width = image.shape[:2]
cameraMatrix = np.array([[894.96803896, 0, 470.38713516], [0, 901.32629374, 922.41232898], [0, 0, 1]])
distCoeffs = np.array([[-0.340671222, 0.110426603, -.000867987573, 0.000189669273, -0.0160049526]])

# create new camera matrix
newCameraMatrix, validPixROI = cv2.getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, (width, height), 1, (width, height))

# undistort
outputImage = cv2.undistort(image, cameraMatrix, distCoeffs, None, newCameraMatrix)

# crop, modified
x, y, w, h = validPixROI  # (211, 991, 547, 755)
outputImage = outputImage[y-200:y+h+200, x-40:x+w+80]  # fudge factor to minimize cropping
THE ONE CAVEAT: this code still crops a bit of the outer trim of the original capture, but not by much. Minimizing that cropping is the reason for the fudge factor I put in the outputImage = outputImage[...] line.

open cv: add with numpy with maximum value [duplicate]

I am trying to increase the brightness of a grayscale image. cv2.imread() returns a numpy array. I am adding an integer value to every element of the array. Theoretically, this would increase each of them. After that I would be able to apply an upper threshold of 255 and get an image with higher brightness.
Here is the code:
grey = cv2.imread(path+file,0)
print type(grey)
print grey[0]
new = grey + value
print new[0]
res = np.hstack((grey, new))
cv2.imshow('image', res)
cv2.waitKey(0)
cv2.destroyAllWindows()
However, numpy addition apparently does something like that:
new_array = old_array % 256
Every pixel intensity value higher than 255 becomes a remainder of dividing by 256.
As a result, I am getting dark pixels instead of completely white ones.
Here is the output:
<type 'numpy.ndarray'>
[115 114 121 ..., 170 169 167]
[215 214 221 ..., 14 13 11]
How can I switch off this remainder mechanism? Is there any better way to increase brightness in OpenCV?
One idea would be to check, before adding value, whether the addition would overflow: take the difference between 255 and the current pixel value and check whether it is smaller than value. If it is, don't add; set those pixels directly to 255. Otherwise, do the addition. This decision-making can be folded into a mask creation -
mask = (255 - grey) < value
Then, feed this mask/boolean array to np.where to let it choose between 255 and grey+value based on the mask.
Thus, finally we would have the implementation as -
grey_new = np.where((255 - grey) < value,255,grey+value)
Sample run
Let's use a small representative example to demonstrate the steps.
In [340]: grey
Out[340]:
array([[125, 212, 104, 180, 244],
       [105,  26, 132, 145, 157],
       [126, 230, 225, 204,  91],
       [226, 181,  43, 122, 125]], dtype=uint8)

In [341]: value = 100

In [342]: grey + 100  # Bad results (e.g. look at (0,1))
Out[342]:
array([[225,  56, 204,  24,  88],
       [205, 126, 232, 245,   1],
       [226,  74,  69,  48, 191],
       [ 70,  25, 143, 222, 225]], dtype=uint8)

In [343]: np.where((255 - grey) < 100, 255, grey + value)  # Expected results
Out[343]:
array([[225, 255, 204, 255, 255],
       [205, 126, 232, 245, 255],
       [226, 255, 255, 255, 191],
       [255, 255, 143, 222, 225]], dtype=uint8)
Testing on sample image
Using the sample image posted in the question as arr with value set to 50 gives the expected saturated result (output image omitted).
Here is another alternative:
import numpy as np

# convert data type
gray = gray.astype('float32')

# shift pixel intensity by a constant
intensity_shift = 50
gray += intensity_shift

# another option is to use a factor value > 1:
# gray *= factor_intensity

# clip pixel intensity to be in range [0, 255]
gray = np.clip(gray, 0, 255)

# change type back to 'uint8'
gray = gray.astype('uint8')
Briefly, you should add 50 to each value, find maxBrightness, then thisPixel = int(255 * thisPixel / maxBrightness)
You have to run a check for an overflow for each pixel. The method suggested by Divakar is straightforward and fast. You actually might want to increment (by 50 in your case) each value and then normalize it to 255. This would preserve details in bright areas of your image.
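A minimal sketch of that increment-then-normalize idea (my own illustration, not code from the answer; it assumes a uint8 grayscale array named grey and widens the dtype so the addition cannot wrap):
import numpy as np

shifted = grey.astype(np.int32) + 50           # widen dtype so the add cannot overflow
maxBrightness = shifted.max()
grey_new = (255 * shifted // maxBrightness).astype(np.uint8)
Because every shifted value is at most maxBrightness, the rescaled result stays within [0, 255] while keeping the relative differences between bright pixels.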
Use OpenCV's functions. They implement "saturating" math.
new = cv.add(grey, value)
Documentation for cv.add
When you only write new = grey + value, that isn't OpenCV doing the work, that is numpy doing the work. And numpy does nothing special. Wrap-around for integers is standard behavior.
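As a quick illustration of the difference (a minimal sketch, assuming a uint8 grayscale array and the usual import cv2):
import cv2
import numpy as np

grey = np.array([[200, 250]], dtype=np.uint8)
value = 50

print(grey + value)          # numpy wraps around modulo 256: [[250  44]]
print(cv2.add(grey, value))  # OpenCV saturates at 255:       [[250 255]]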
An alternate approach that worked efficiently for me is to "blend" a white image into the original image using the blend function in PIL's Image module.
from PIL import Image

correctionVal = 0.05  # fraction of white to add to the main image
img_file = Image.open(location_filename)
img_file_white = Image.new("RGB", img_file.size, "white")
img_blended = Image.blend(img_file, img_file_white, correctionVal)
# equivalent to: img_blended = img_file * (1 - correctionVal) + img_file_white * correctionVal
Hence, if correctionVal = 0, we get the original image, and if correctionVal = 1, we get pure white.
This function self-corrects for RGB values exceeding 255.
Blending in black (RGB 0, 0, 0) reduces brightness.
I ran into a similar issue, but instead of addition, it was scaling the image pixels in a non-uniform manner.
The 1-D version of this:
a = np.array([100, 200, 250, 252, 255], dtype=np.uint8)
scaling = np.array([1.1, 1.2, 1.4, 1.2, 1.1])
result = np.uint8(a * scaling)
This gets you the overflow issue, of course; the result:
array([110, 240, 94, 46, 24], dtype=uint8)
The np.where works:
result_lim = np.where(a * scaling <= 255, a * scaling, 255)
yields result_lim as:
array([ 110., 240., 255., 255., 255.])
I was wondering about timing, so I ran this test on a 4000 x 6000 image (instead of a 1-D array) and found that np.where(), at least for my conditions, took about 2.5x as long. The option of converting to float, doing the operation, and then clipping, as noted above, was a bit slower than the np.where() method. I don't know if there are better/faster methods for this.
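For reference, here is a small sketch of the two clamping variants being compared (my own illustration with hypothetical array sizes; actual timings will depend on your machine):
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4000, 6000), dtype=np.uint8)
scaling = 1.2

scaled = img * scaling  # float64 intermediate, so no overflow yet

# variant 1: np.where, as in the answer above
out_where = np.where(scaled <= 255, scaled, 255).astype(np.uint8)

# variant 2: convert, clip, convert back
out_clip = np.clip(scaled, 0, 255).astype(np.uint8)

assert (out_where == out_clip).all()  # both clamp to the same result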

function takes exactly 1 argument (3 given)?

I am trying to change the value of a pixel in an image to the closest value I have in my list, and I can't figure out why I can't change the pixel value.
I've tried converting the image to RGB or RGBA, and for some reason sometimes it takes 3 arguments and sometimes 4.
im = Image.open('rick.png') # Can be many different formats.
rgb_im = im.convert('RGBA')
pix = im.load()
height, width = im.size
image = ImageGrab.grab()
COLORS = (
    (0, 0, 0),
    (127, 127, 127),
    (136, 0, 21),
    (237, 28, 36),
    (255, 127, 39),
)
def closest_color(r, g, b, COLORS):
    min_diff = 9999
    answer = None
    for color in COLORS:
        cr, cg, cb = color
        color_diff = abs(r - cr) + abs(g - cg) + abs(b - cb)
        if color_diff < min_diff:
            answer = color
            min_diff = color_diff
    return answer

def read_color(height, width, COLORS, pix):
    for x in range(height):
        for y in range(width):
            r, g, b, a = rgb_im.getpixel((x, y))
            color = closest_color(r, g, b, COLORS)  # color is returned as tuple
            pix[x, y] = color  # Changing color value? -Here I get the error-

read_color(height, width, COLORS, pix)
im.save('try.png')
I keep getting this error even though closest_color() returns a single value (a tuple), and I don't know why. Thank you for your help!
COLORS is a list of colors; I've tested the closest_color() function and it works well.
Error message:
Exception has occurred: TypeError
function takes exactly 1 argument (3 given)
  File "C:\Users\user\Desktop\תוכנות שעשיתי\program.py", line 133, in read_color
    pix[x,y] = color
  File "C:\Users\user\Desktop\תוכנות שעשיתי\program.py", line 137, in <module>
    read_color(height,width, COLORS, pix)
EDIT!
Apparently the code works for most of the images, but not for all of them; for example, this image doesn't work and I get this error.
You are being inconsistent by reading the pixels from the RGBA converted image but setting the pixels in the original maybe-not-RGBA image. Fixing that makes your code work with the sample image.
pix = rgb_im.load()
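In context, a sketch of the fix (saving rgb_im at the end is my own extra adjustment so that the image that was actually modified, not the untouched original, is written out):
im = Image.open('rick.png')
rgb_im = im.convert('RGBA')
pix = rgb_im.load()  # was: pix = im.load()

read_color(height, width, COLORS, pix)
rgb_im.save('try.png')  # save the image the writes went to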

h5py flipping dimensions of images

I've been working on building a machine learning algorithm to recognize images, starting by creating my own h5 database. I've been following this tutorial, and it's been useful, but I keep running into one major error - when using OpenCV in the image processing section of the code, the program is unable to save the processed image because it keeps flipping the height and width of my images. When I try to compile, I get the following error:
Traceback (most recent call last):
  File "array+and+label+data.py", line 79, in <module>
    hdf5_file["train_img"][i, ...] = img[None]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/Users/USER/miniconda2/lib/python2.7/site-packages/h5py/_hl/dataset.py", line 631, in __setitem__
    for fspace in selection.broadcast(mshape):
  File "/Users/USER/miniconda2/lib/python2.7/site-packages/h5py/_hl/selections.py", line 299, in broadcast
    raise TypeError("Can't broadcast %s -> %s" % (target_shape, count))
TypeError: Can't broadcast (1, 240, 320, 3) -> (1, 320, 240, 3)
My images are supposed to all be sized to 320 by 240, but you can see that this is being flipped somehow. Researching around has shown me that this is because OpenCV and NumPy use different conventions for height and width, but I'm not sure how to reconcile this issue within this code without patching my installation of OpenCV. Any ideas on how I can fix this? I'm a relative newbie to Python and all its libraries (though I know Java well)!
Thank you in advance!
Edit: adding more code for context, which is very similar to what's in the tutorial under the "Load images and save them" code example.
The size of my arrays:
train_shape = (len(train_addrs), 320, 240, 3)
val_shape = (len(val_addrs), 320, 240, 3)
test_shape = (len(test_addrs), 320, 240, 3)
The code that loops over the image addresses and resizes them:
# Loop over training image addresses
for i in range(len(train_addrs)):
    # print how many images are saved every 1000 images
    if i % 1000 == 0 and i > 1:
        print('Train data: {}/{}'.format(i, len(train_addrs)))
    # read an image and resize to (320, 240)
    # cv2 loads images as BGR, convert it to RGB
    addr = train_addrs[i]
    img = cv2.imread(addr)
    img = cv2.resize(img, (320, 240), interpolation=cv2.INTER_CUBIC)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # save the image and calculate the mean so far
    hdf5_file["train_img"][i, ...] = img[None]
    mean += img / float(len(train_labels))
Researching around has shown me that this is because OpenCV and NumPy use different conventions for height and width
Not exactly. The only thing that is tricky about images is 2D arrays/matrices are indexed with (row, col) which is opposite from normal Cartesian coordinates (x, y) that we might use for images. Because of this, sometimes when you specify points in OpenCV functions, it wants them in (x, y) coordinates---and similarly, it wants the dimensions of the image to be specified in (w, h) instead of (h, w) like an array would be made. And this is the case inside OpenCV's resize() function. You're passing it in (h, w) but it actually wants (w, h). From the docs for resize():
dsize – output image size; if it equals zero, it is computed as:
dsize = Size(round(fx*src.cols), round(fy*src.rows))
Either dsize or both fx and fy must be non-zero.
So you can see here that the number of columns is the first dimension (the width) and the number of rows is the second (the height).
The simple fix is just to swap your (h, w) to (w, h) inside the resize() function:
img = cv2.resize(img, (240, 320), interpolation=cv2.INTER_CUBIC)
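A quick way to sanity-check the fix is to compare the resized array's shape against the dataset's per-image shape before writing (a sketch reusing addr and train_shape from the question):
import cv2

img = cv2.imread(addr)
img = cv2.resize(img, (240, 320), interpolation=cv2.INTER_CUBIC)  # dsize is (w, h)

print(img.shape)                     # numpy reports (rows, cols, channels) -> (320, 240, 3)
assert img.shape == train_shape[1:]  # matches the dataset's per-image shape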

Python PIL Editing Pixels versus ImageDraw.point

I am working on an image-generation program, and I have an issue trying to directly edit the pixels of an image.
My original method, which works, was simply:
image = Image.new('RGBA', (width, height), background)
drawing_image = ImageDraw.Draw(image)
# in some loop that determines what to draw and at what color
drawing_image.point((x, y), color)
This works fine, but I thought directly editing pixels might be slightly faster. I plan on using "very" high resolutions (maybe 10000px by 10000px), so even a slight decrease in time per pixel will be a large decrease overall.
I tried using this:
image = Image.new('RGBA', (width, height), background)
pixels = image.load()
# in some loop that determines what to draw and at what color
pixels[x][y] = color # note: color is a hex-formatted string, e.g. "#00FF00"
This gives me an error:
Traceback (most recent call last):
  File "my_path\my_file.py", line 100, in <module>
    main()
  File "my_path\my_file.py", line 83, in main
    pixels[x][y] = color
TypeError: argument must be sequence of length 2
How does the actual pixels[x][y] work? I seem to be missing a fundamental concept here (I've never worked with directly editing pixels prior to this), or at least just not understanding what arguments are required. I even tried pixels[x][y] = (0, 0, 0), but that raised the same error.
In addition, is there a faster way to edit the pixels? I've heard that using the pixels[x][y] = some_color is faster than drawing to the image, but I'm open to any other faster method.
Thanks in advance!
You need to pass a tuple index as pixels[(x, y)] or simply pixels[x, y], for example:
#-*- coding: utf-8 -*-
#!python
from PIL import Image
width = 4
height = 4
background = (0, 0, 0, 255)
image = Image.new("RGBA", (width, height), background)
pixels = image.load()
pixels[0, 0] = (255, 0, 0, 255)
pixels[0, 3] = (0, 255, 0, 255)
pixels[3, 3] = (0, 0, 255, 255)
pixels[3, 0] = (255, 255, 255, 255)
image.save("image.png")
