cv2 convert Range and copyTo functions of c++ to python - python

I'm writing a Python video stabilizer, and in one part of the code I need to copy two images into a canvas.
I tried to convert this C++ code to Python but wasn't able to.
Mat cur2;
warpAffine(cur, cur2, T, cur.size());
cur2 = cur2(Range(vert_border, cur2.rows-vert_border),
Range(HORIZONTAL_BORDER_CROP, cur2.cols-HORIZONTAL_BORDER_CROP));
// Resize cur2 back to cur size, for better side by side comparison
resize(cur2, cur2, cur.size());
// Now draw the original and stablised side by side for coolness
Mat canvas = Mat::zeros(cur.rows, cur.cols*2+10, cur.type());
cur.copyTo(canvas(Range::all(), Range(0, cur2.cols)));
cur2.copyTo(canvas(Range::all(), Range(cur2.cols+10, cur2.cols*2+10)));
I wrote this code, but I got an error:
ret, frame = cap.read()
new_frame = transform(frame,data[counter]) #some kind of low pass filter
canvas = np.zeros ((frame_height, frame_width*2+10,3))
np.copyto (canvas[:frame_width], frame)
np.copyto (canvas[frame_width+10:frame_width*2+10], new_frame)
I got a
"couldn't broadcast from shape into shape"
error. But I think I used canvas the wrong way. In the C++ code there is canvas(Range::all(), Range(0, cur2.cols)), which I don't know how to express in Python.
How can I use the Range function and the copyTo function in Python?
And how should I copy an image to a specific part of the canvas?
Any help?

cv::Mat objects are actually NumPy arrays in Python, so in this case you should use NumPy functions rather than OpenCV ones.
To use copyTo as a clone, use copy(), as in:
a = np.zeros((10,10,3), dtype=np.uint8)
b = a.copy()
Ranges are easier in NumPy; just use:
a[y1:y2, x1:x2,:]
which means from row y1 up to (but not including) row y2 and from column x1 up to (but not including) column x2. If you need everything along an axis, leave the : on its own, e.g. for all rows:
a[:, x1:x2,:]
The last colon is for channels, in this case all channels, but you can also limit it. And if you need only one row, column, or channel, you can put the number directly instead of using a "range", like
a[4, x1:x2, 0]
You can also drop the last colon for the channels, and it will use all of them. Like:
a[1:3, 4:8]
Finally, to copy an image into a region of a bigger image you can do something like:
bigImage[y1:y2, x1:x2] = image
You have to make sure that image fits in this region (channels included). That means that if image is of size 640x480 you cannot do this:
bigImage[10:20, 20:30] = image
but you can do something like
bigImage[10:20, 20:30] = image[10:20, 10:20]
assuming both have the same number of channels.
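Applied to the side-by-side canvas from the question, the slicing could look like the sketch below. The helper name side_by_side and the constant-valued stand-in frames are made up for illustration; the sketch assumes both images are 3-channel and the same size. Note that canvas[:frame_width] in the question slices rows rather than columns, and that np.zeros defaults to float64 unless a dtype is given.

```python
import numpy as np

def side_by_side(left, right, gap=10):
    """Place two equally sized 3-channel images next to each other with a black gap."""
    height, width = left.shape[:2]
    # Match the dtype of the input so the canvas can be displayed like the frames.
    canvas = np.zeros((height, width * 2 + gap, 3), dtype=left.dtype)
    canvas[:, :width] = left          # like canvas(Range::all(), Range(0, cols))
    canvas[:, width + gap:] = right   # like canvas(Range::all(), Range(cols+10, cols*2+10))
    return canvas

# Stand-in frames (hypothetical data) in place of frame and new_frame.
frame = np.full((480, 640, 3), 100, dtype=np.uint8)
new_frame = np.full((480, 640, 3), 200, dtype=np.uint8)
canvas = side_by_side(frame, new_frame)
```

The first index selects rows and the second selects columns, so the colon in the first position plays the role of Range::all().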

Related

Cropping ROI regions in an numpy array

I am writing some code to cut two separate ROI regions out of a NumPy array. The array is a mask array with boolean values, and it consists of two major parts, left and right.
I need to crop those left and right parts from my original NumPy array. My code is as follows; it is a section of a function (image and mask are passed to this function):
if mask.shape[-1] > 0:
    # We're treating all instances as one, so collapse the mask into one layer
    mask = (np.sum(mask, -1, keepdims=True) >= 1)
    zeros = np.zeros(image.shape)
    #splash = np.where(mask, image, gray).astype(np.uint8)
    splash = np.where(mask, image, zeros).astype(np.uint8)
I am not sure how to achieve this as I am really new to NumPy. I can splash the image, but what I need is different: I need to crop the two left and right parts, and for this I need to crop or reshape the mask array. I have attached a splashed output sample to this thread.
This is a very typical problem in computer vision. One popular solution is Connected Component Labeling (CCL). OpenCV already has an implementation of it:
https://docs.opencv.org/3.4/d3/dc0/group__imgproc__shape.html#gaedef8c7340499ca391d459122e51bef5
Then you may use blob analysis to crop out the objects:
https://docs.opencv.org/3.4/d0/d7a/classcv_1_1SimpleBlobDetector.html

How to insert numpy asanyarray value to asanyarray?

I have some problems with NumPy array code.
I am writing a program that grabs frames while recording with a camera.
I want to store the image data fetched for each frame in a list or NumPy array, and then save all the pictures at once after capturing.
However, when I try to store the frames in an array, it fails.
The image size is 640x480 and the frame rate is 30 fps.
My setup is Windows 10, pyrealsense, numpy, Python 3.6, cv2 and a RealSense D415 camera.
Example code 1
color = []
color_image = frame.get_color_frame()
color_array = np.asanyarray(color_image.get_data())
color.append(color_array)
With this code I get a few frames, but after 15 frames the camera keeps returning the same frame.
Example code 2
test = np.zeros((480,640,3))
color_image = frame.get_color_frame()
color_array = np.asanyarray(color_image.get_data())
np.insert(test, frame_count, color_array)
ValueError: could not broadcast input array from shape (480,640,3) into shape (480)
Why does this happen?
In case 1, different frames should keep being stored after the 15th frame.
In case 2, I want the value to be inserted into the array correctly. I could not find an example that works like this.

Concatenating Numpy arrays for OpenCV imshow

Using OpenCV and Python, I want to display the left hand half of one image concatenated with the right-hand half of another image, both of the same size - 512x512 pixels. I have identified several ways of doing this, but I am confused about the behaviour of one method. In the following code, assume that only one of the methods is used at any one time and the rest are commented out:
import cv2
import numpy as np
image1 = cv2.imread('img1.png',0)
image2 = cv2.imread('img2.png',0)
#Method 1 - works
image3 = np.concatenate([image1[:,0:256], image2[:,256:512]], axis=1)
#Method 2 - works
image3 = image1[:,:]
image3[:,256:512] = image2[:,256:512]
#Method 3 - works if I don't create image3 with np.zeros first.
#Otherwise displays black image - all zeros - but print displays correct values
image3 = np.zeros(shape=(512,512), dtype=int)
image3[:,0:256] = image1[:,0:256]
image3[:,256:512] = image2[:,256:512]
print(image3)
cv2.imshow("IMAGE", image3)
cv2.waitKey(0)
cv2.destroyAllWindows()
In method 3, I at first mistakenly thought that the new numpy array image 3 would need to be created first and so created an array filled with zeros and then seemingly overwrote that array with the correct values. When I print that array it displays the correct values, but when I show it as an image using cv2.imshow it is all black (i.e. all zeros). Why the difference? I understand that slicing creates a view, not a copy, but can someone please explain what is happening in method 3 and why cv2.imshow displays the underlying array but print doesn't.
Your problem is in:
np.zeros(shape=(512,512), dtype=int)
imshow displays images stored as 32-bit float with a range of 0.0 to 1.0, or as 8-bit (1 to 4 channels) with a range of 0 to 255. You are using int, which NumPy maps to a 32- or 64-bit integer depending on the platform, and which is neither floating point nor 8-bit. To fix it, use np.uint8:
np.zeros(shape=(512,512), dtype=np.uint8)
I think it can also be displayed using matplotlib if you want to keep the int, but I am not 100% sure about it.
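A minimal sketch of method 3 with the dtype fixed is shown below. The constant-valued arrays are stand-ins for the two images (hypothetical data, assumed to be uint8 as cv2.imread returns for 8-bit files), and the imshow call is left out so the snippet runs without a display.

```python
import numpy as np

# Stand-ins for image1 and image2 (hypothetical data, 512x512 greyscale).
image1 = np.full((512, 512), 50, dtype=np.uint8)
image2 = np.full((512, 512), 200, dtype=np.uint8)

# Allocate the destination with the same 8-bit dtype as the sources.
image3 = np.zeros((512, 512), dtype=np.uint8)
image3[:, 0:256] = image1[:, 0:256]
image3[:, 256:512] = image2[:, 256:512]

# An existing integer array can also be converted before display,
# provided its values already lie in the range 0 to 255.
as_int = np.zeros((512, 512), dtype=int)
as_int[:, :] = image3
displayable = as_int.astype(np.uint8)
```

print shows the same numbers for either dtype, which is why the printed values look correct even when the displayed image is black.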

Confused during reshaping array of image

At the moment I'm trying to run a ConvNet. Each image, which later feeds the neural net, is stored as a list. But the list is at the moment created using three for-loops. Have a look:
im = Image.open(os.path.join(p_input_directory, item))
pix = im.load()
image_representation = []
# Get image into byte array
for color in range(0, 3):
    for x in range(0, 32):
        for y in range(0, 32):
            image_representation.append(pix[x, y][color])
I'm pretty sure that this is not the nicest and most efficient way. Because I have to stick to the structure of the list created above, I thought about using numpy and providing an alternative way to get to the same structure.
from PIL import Image
import numpy as np
image = Image.open(os.path.join(p_input_directory, item))
image.load()
image = np.asarray(image, dtype="uint8")
image = np.reshape(image, 3072)
# Sth is missing here...
But I don't know how to reshape and concatenate the image for getting the same structure as above. Can someone help with that?
One approach is to flatten the array in Fortran (column-major) order, which is equivalent to reversing the axes before flattening and matches the loop order above:
image = np.asarray(im, dtype="uint8")
image_representation = image.ravel('F').tolist()
For more detail on the function, see the numpy.ravel documentation.
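To check that the Fortran-order flattening reproduces the triple loop, the comparison below can be run on a random array. The random 32x32x3 array is a stand-in for the image (hypothetical data); it relies on PIL's pix[x, y][color] corresponding to image[y, x, color] in the NumPy array.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for np.asarray(im): axes are (row y, column x, channel).
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Reference: the triple loop from the question, rewritten for array indexing.
looped = []
for color in range(3):
    for x in range(32):
        for y in range(32):
            looped.append(int(image[y, x, color]))

# Fortran order varies the first axis (y) fastest and the last axis (color) slowest.
flat = image.ravel('F').tolist()
same = (flat == looped)
```

Here same evaluates to True, since y is the innermost loop variable and color the outermost in both versions.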

Optimizing performance when reading a satellite image file in python

I have a multiband satellite image stored in the band interleaved pixel (BIP) format along with a separate header file. The header file provides the details such as the number of rows and columns in the image, and the number of bands (can be more than the standard 3).
The image itself is stored like this (assume a 5 band image):
[B1][B2][B3][B4][B5][B1][B2][B3][B4][B5] ... and so on (basically 5 bytes - one for each band - for each pixel starting from the top left corner of the image).
I need to separate out each of these bands as PIL images in Python 3.2 (on Windows 7 64 bit), and currently I think I'm approaching the problem incorrectly. My current code is as follows:
def OpenBIPImage(file, width, height, numberOfBands):
    """
    Opens a raw image file in the BIP format and returns a list
    comprising each band as a separate PIL image.
    """
    bandArrays = []
    with open(file, 'rb') as imageFile:
        data = imageFile.read()
    currentPosition = 0
    for i in range(height * width):
        for j in range(numberOfBands):
            if i == 0:
                bandArrays.append(bytearray(data[currentPosition : currentPosition + 1]))
            else:
                bandArrays[j].extend(data[currentPosition : currentPosition + 1])
            currentPosition += 1
    bands = [Image.frombytes('L', (width, height), bytes(bandArray)) for bandArray in bandArrays]
    return bands
This code takes way too long to open a BIP file, surely there must be a better way to do this. I do have the numpy and scipy libraries as well, but I'm not sure how I can use them, or if they'll even help in any way.
Since the number of bands in the image are also variable, I'm finding it hard to figure out a way to read the file quickly and separate the image into its component bands.
And just for the record, I have tried messing with the list methods in the loops (using slices, not using slices, using only append, using only extend etc), it doesn't particularly make a difference as the major time is lost because of the number of iterations involved - (width * height * numberOfBands).
Any suggestions or advice would be really helpful. Thanks.
If you can find a fast function to load the binary data in a big python list (or numpy array), you can de-interleave the data using the slicing notation:
band0 = biglist[::nbands]
band1 = biglist[1::nbands]
....
Does that help?
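With NumPy, the de-interleaving can be sketched as a reshape followed by a slice per band. The helper name split_bip_bands and the tiny synthetic buffer are made up for illustration, and the sketch assumes one byte per band per pixel, as described in the question.

```python
import numpy as np

def split_bip_bands(data, width, height, nbands):
    """Return a list of (height, width) uint8 arrays, one per band."""
    pixels = np.frombuffer(data, dtype=np.uint8)  # no copy, one value per byte
    if pixels.size != width * height * nbands:
        raise ValueError("data length does not match width * height * nbands")
    cube = pixels.reshape(height, width, nbands)  # the band index varies fastest in BIP
    return [cube[:, :, k] for k in range(nbands)]

# Tiny synthetic file content (hypothetical data): 3x2 pixels, 5 bands.
raw = bytes(range(30))
bands = split_bip_bands(raw, width=3, height=2, nbands=5)
```

Each entry is the same data as raw[k::nbands], just shaped as an image. The arrays are read-only views into the buffer, so a contiguous copy (for example via np.ascontiguousarray) may be needed before passing a band to other libraries.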
Standard PIL
To load an image from a file, use the open function in the Image module.
>>> from PIL import Image
>>> im = Image.open("lena.ppm")
If successful, this function returns an Image object. You can now use instance attributes to examine the file contents.
>>> print(im.format, im.size, im.mode)
PPM (512, 512) RGB
The format attribute identifies the source of an image. If the image was not read from a file, it is set to None. The size attribute is a 2-tuple containing width and height (in pixels). The mode attribute defines the number and names of the bands in the image, and also the pixel type and depth. Common modes are "L" (luminance) for greyscale images, "RGB" for true colour images, and "CMYK" for pre-press images.
The Python Imaging Library also allows you to work with the individual bands of a multi-band image, such as an RGB image. The split method creates a set of new images, each containing one band from the original multi-band image. The merge function takes a mode and a tuple of images, and combines them into a new image. The following sample swaps the three bands of an RGB image:
Splitting and merging bands
r, g, b = im.split()
im = Image.merge("RGB", (b, g, r))
So I think you should simply derive the mode and then split accordingly.
PIL with Spectral Python (SPy python module)
However, as you pointed out in your comments below, you are not dealing with a normal RGB image with 3 bands. So to deal with that, SpectralPython (a pure python module which requires PIL) might just be what you are looking for.
Specifically - http://spectralpython.sourceforge.net/class_func_ref.html#spectral.io.bipfile.BipFile
spectral.io.bipfile.BipFile deals with Image files with Band Interleaved Pixel (BIP) format.
Hope this helps.
I suspect that the repeated calls to extend are the problem; it is better to allocate all the buffers first:
from PIL import Image

def OpenBIPImage(file, width, height, numberOfBands):
    """
    Opens a raw image file in the BIP format and returns a list
    comprising each band as a separate PIL image.
    """
    with open(file, 'rb') as imageFile:
        data = imageFile.read()
    # Preallocate one zero-filled buffer per band instead of growing them.
    bandArrays = [bytearray(height * width) for _ in range(numberOfBands)]
    currentPosition = 0
    for i in range(height * width):
        for j in range(numberOfBands):
            bandArrays[j][i] = data[currentPosition]
            currentPosition += 1
    bands = [Image.frombytes('L', (width, height), bytes(bandArray)) for bandArray in bandArrays]
    return bands
My measurements of the bare loops don't show such a slowdown:
import time

def x():
    height, width, numberOfBands = 1401, 801, 6
    before = time.time()
    for i in range(height * width):
        for j in range(numberOfBands):
            pass
    print(time.time() - before)
>>> x()
0.937999963760376
