Append an array to an image in python

I'm trying to create a piece of data for a CNN in tensorflow. The image corresponds to a state in my environment. I'd like to take the state (the array) and append it to the image for model input.
One condition is that the original structure of the image needs to stay intact, so when I convert the array back to an image, I can see the original image plus the appended array. The final array can have any number of rows, but the number of columns and channels cannot change.
So in other words, I'd like to reshape an array of up to max_length (say, 2,000,000) values into a matrix of n rows, 1920 columns, and 3 channels, padding with 0s if necessary, and then append that to an image of shape (1080, 1920, 3). I have a feeling my approach is more complicated than it needs to be.
My attempt at doing this was to convert the state and image to lists for faster array transformations, add them together sequentially, and then reshape them into a numpy array. I use padding to constrain the array to n rows x 1920 x 3.
The variable max_length is the max length of the padding.
input_space is the state or array I want to append to the image. Its length can be anywhere between 1 and max_length, which is why I use padding, so the state size isn't changing per loop.
The code under the aspect-ratio comment adds further padding to constrain the array to the required dimensions.
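For concreteness, with max_length = 2,000,000: each row holds 1920 * 3 = 5760 values, so the padded state needs ceil(2,000,000 / 5760) = 348 rows (2,004,480 values after padding), and the final input would have shape (1080 + 348, 1920, 3).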
def get_state(input_space, img):
    img = list(np.concatenate(img).flat)
    pad_length = max_length - len(input_space)
    state = img + input_space + [0] * pad_length
    # get pad length for 1920 aspect ratio
    aspect_ratio = (len(state) / 3) / 1920
    remainder = aspect_ratio - np.fix(aspect_ratio)
    aspect_ratio = round(((1 - remainder) * 1920) * 3)
    # pad state to shape into 1920 x n rows and 3 channels
    state = state + [0] * aspect_ratio
    rows = round((len(state) / 3) / 1920)
    return np.reshape(state, (rows, 1920, 3)).T
The function successfully produces an array constrained to n rows, 1920 columns and 3 channels, except when I preview the array as an image the original image structure is lost.
Is there a better way to approach this? I should add that performance is important because the function runs in a loop.

OK, so I put together something that works. Instead of appending the image and state together first and then reshaping them into the desired shape, I reshape the state and then just append it to the image. Seems to work out.
def get_state(input_space, img):
    pad_length = max_length - len(input_space)
    state = input_space + [0] * pad_length
    # get pad length for 1920 aspect ratio
    aspect_ratio = (len(state) / 3) / 1920
    remainder = aspect_ratio - np.fix(aspect_ratio)
    aspect_ratio = round(((1 - remainder) * 1920) * 3)
    # pad state to shape into 1920 x n rows and 3 channels
    state = state + [0] * aspect_ratio
    rows = round((len(state) / 3) / 1920)
    x = list(np.reshape(state, (3, 1920, rows)).T)
    state = list(img) + x
    return np.array(state)
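Since performance matters and the function runs in a loop, a pure-NumPy variant that avoids the Python-list round trip might look like the sketch below. It is untested against the original environment; get_state_np, the dtype handling, and the ceiling division are my own choices, not from the question:
import numpy as np

def get_state_np(input_space, img, max_length=2_000_000):
    # img has shape (1080, 1920, 3); one output row holds 1920 * 3 values
    cols = img.shape[1] * img.shape[2]
    rows = -(-max_length // cols)               # ceiling division
    padded = np.zeros(rows * cols, dtype=img.dtype)
    state = np.asarray(input_space, dtype=img.dtype)
    padded[:state.size] = state                 # zero-pad the state in place
    # reshape row-major so the image rows above stay intact
    state_img = padded.reshape(rows, img.shape[1], img.shape[2])
    return np.vstack([img, state_img])          # stack the state rows under the image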


Image Convolution with callback function in python

I want to loop over the pixels of a binary image in Python and set the value of each pixel depending on a surrounding neighborhood of pixels. This is similar to convolution, but I want to create a method that sets the value of the center pixel using a custom function, rather than normal convolution, which sets the center pixel to the arithmetic mean of the neighborhood.
In essence I would like to create a function that does the following:
def convolve(img, conv_function = lambda subImg: np.mean(subImg)):
    newImage = emptyImage
    for nxn_window in img:
        newImage[center_pixel] = conv_function(nxn_window)
    return newImage
At the moment I have a solution but it is very slow:
# B is the structuring array or convolution window/kernel
def convolve(func):
    def wrapper(img, B):
        # get dimensions of img
        length, width = len(img), len(img[0])
        # half width and length of dimensions
        hw = (int)((len(B) - 1) / 2)
        hh = (int)((len(B[0]) - 1) / 2)
        # convert to npArray for fast operations
        B = np.array(B)
        # initialize empty return image
        retVal = np.zeros([length, width])
        # start loop over the values where the convolution window has a neighborhood
        for row in range(hh, length - hh):
            for pixel in range(hw, width - hw):
                # window as subarray of pixels
                window = [arr[pixel-hh:pixel+hh+1]
                          for arr in img[row-hw:row+hw+1]]
                retVal[row][pixel] = func(window, B)
        return retVal
    return wrapper
With this function as a decorator I then do:
# dilation
@convolve
def __add__(img, B):
    return np.mean(np.logical_and(img, B)) > 0

# erosion
@convolve
def __sub__(img, B):
    return np.mean(np.logical_and(img, B)) == 1
Is there a library that provides this type of function or is there a better way I can loop over the image?
Here's an idea: assign each pixel an array with its neighborhood and then simply apply your custom function to the extended image. It'll be fast BUT will consume more memory (B.size times more memory; if your B.shape is (3, 3) then you'll need 9 times more memory). Try this:
import numpy as np

def convolve2(func):
    def conv(image, kernel):
        """ Apply given filter on an image """
        k = kernel.shape[0]  # which is assumed equal to kernel.shape[1]
        width = k//2  # note that width == 1 for k == 3 but also width == 1 for k == 2
        a = framed(image, width)  # create a frame around an image to compensate for kernel overlap when shifting
        b = np.empty(image.shape + kernel.shape)  # add two more dimensions for each pixel's neighbourhood
        di, dj = image.shape[:2]  # will be used as delta for slicing
        # add the neighbourhood ('kernel size') to each pixel in preparation for the final step
        # in other words: slide the image along the kernel rather than sliding the kernel along the image
        for i in range(k):
            for j in range(k):
                b[..., i, j] = a[i:i+di, j:j+dj]
        # apply the desired function
        return func(b, kernel)
    return conv

def framed(image, width):
    a = np.zeros(np.array(image.shape) + [2 * width, 2 * width])  # only add the frame to the first two dimensions
    a[width:-width, width:-width] = image  # place the image centered inside the frame
    return a
I've used a 512x512-pixel greyscale image and a 3x3 filter for testing:
embossing_kernel = np.array([
    [-2, -1, 0],
    [-1, 1, 1],
    [0, 1, 2]
])

@convolve2
def filter2(img, B):
    return np.sum(img * B, axis=(2,3))

@convolve2
def __add2__(img, B):
    return np.mean(np.logical_and(img, B), axis=(2,3)) > 0

# image_gray is a 2D grayscale image (not color/RGB)
b = filter2(image_gray, embossing_kernel)
To compare with your convolve I've used:
@convolve
def filter(img, B):
    return np.sum(img * B)

@convolve
def __add__(img, B):
    return np.mean(np.logical_and(img, B)) > 0

b = filter(image_gray, embossing_kernel)
The time for convolve was 4.3 s, for convolve2 0.05 s on my machine.
In my case the custom function needs to specify the axes over which to operate, i.e., the additional dimensions holding the neighborhood data. Perhaps the axes could be avoided too but I haven't tried.
Note: this works for 2D (grayscale) images, as you asked about binary images, but it can easily be extended to 3D (color) images. In your case you could probably get rid of the frame (or fill it with zeros or ones, e.g., in case of repeated application of the function).
In case memory is an issue, you might want to adapt the fast implementation of convolve I've posted here: https://stackoverflow.com/a/74288118/20188124.
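As for the library question: scipy.ndimage.generic_filter applies an arbitrary Python callable to each neighborhood (passed in as a flattened 1-D array), which is exactly this pattern, though with a Python callback it tends to be closer in speed to the looped version than to the vectorized one. A minimal sketch:
import numpy as np
from scipy import ndimage

img = np.random.rand(512, 512)
# generic_filter calls the function once per pixel with the flattened
# 3x3 neighbourhood; mode/cval control how the border is padded
out = ndimage.generic_filter(img, np.mean, size=3, mode='constant', cval=0.0)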

Create depth map image as 24-bit (Carla)

I have a depth map encoded in 24 bits (labeled "Original").
With the code below:
carla_img = cv.imread('carla_deep.png', flags=cv.IMREAD_COLOR)
carla_img = carla_img[:, :, :3]
carla_img = carla_img[:,:,::-1]
gray_depth = ((carla_img[:,:,0] + carla_img[:,:,1] * 256.0 + carla_img[:,:,2] * 256.0 * 256.0)/((256.0 * 256.0 * 256.0) - 1))
gray_depth = gray_depth * 1000
I am able to convert it as in the "Converted" image.
As shown here: https://carla.readthedocs.io/en/latest/ref_sensors/
How can I reverse this process (without using any large external libraries; at most OpenCV)? In Python I create a depth map with the help of OpenCV, and I want to save the obtained depth map in Carla's form (24-bit).
This is how I create the depth map:
imgL = cv.imread('leftImg.png',0)
imgR = cv.imread('rightImg.png',0)
stereo = cv.StereoBM_create(numDisparities=128, blockSize=17)
disparity = stereo.compute(imgL,imgR)
CameraFOV = 120
Focus_length = width /(2 * math.tan(CameraFOV * math.pi / 360))
camerasBaseline = 0.3
depthMap = (camerasBaseline * Focus_length) / disparity
How can I save the obtained depth map in the same form as in the picture marked "Original"?
Docs say:
normalized = (R + G * 256 + B * 256 * 256) / (256 * 256 * 256 - 1)
in_meters = 1000 * normalized
So if you have a depth map in_meters, you do the reverse, by rearranging the equations.
You need to make sure your depth map (from block matching) is in units of meters. Your calculations there look sensible, assuming your cameras have a baseline of 0.3 meters.
First variant
Take the calculation apart using division and modulo operations. Various .astype calls are required to turn floats into integers, and wider integers into narrower ones (the usual assumption for pictures).
normalized = in_meters / 1000
BGR = (normalized * (2**24-1)).astype(np.uint32)
BG,R = np.divmod(BGR, 2**8)
B,G = np.divmod(BG, 2**8)
carla_img = np.dstack([B,G,R]).astype(np.uint8) # BGR order
Second variant
One could also do this with a view, reinterpreting the uint32 data as four uint8 values. This assumes a little endian system, which is a fair assumption but one needs to be aware of it.
...
reinterpreted = BGR.view(np.uint8) # lowest byte first, i.e. order is RGBx
reinterpreted.shape = BGR.shape + (4,) # np.view refuses to add a dimension
carla_img = reinterpreted[:,:,(2,1,0)] # select BGR
# this may require a .copy() to get data without holes (OpenCV may want this)
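For completeness, a sketch of the whole round trip using the first variant (my own untested composition; the np.clip guards against the inf depths that zero disparity would produce):
import numpy as np
import cv2 as cv

# depthMap: float array in meters, e.g. from the block matcher above
normalized = np.clip(depthMap / 1000.0, 0.0, 1.0)
value = (normalized * (2**24 - 1)).astype(np.uint32)
bg, r = np.divmod(value, 256)
b, g = np.divmod(bg, 256)
cv.imwrite('carla_depth.png', np.dstack([b, g, r]).astype(np.uint8))  # BGR order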
Disclaimer
I could not test the code in this answer because you haven't provided usable data.

Shift numpy array data to make it only grow

I have a numpy array named data_saw that contains thousands of float numbers.
When visualized, it is a rising sawtooth: growing segments separated by sudden drops.
My task is to shift down each element before a gap by 180 (because each gap here is around 180), so I get a single continuously growing line without gaps.
I've ended up looping through the array and checking for a gap at every index (the last element is for comparison only and is not needed in further calculations, so it is just deleted after alignment):
for i in range(1, len(data_saw)):
    if data_saw[i - 1] > data_saw[i]:
        data_saw[:i] -= 180
data_saw = np.delete(data_saw, -1)
Trying to find out if there are more correct ways to do this with a numpy array. Are there any?
np.diff will tell you when the gap is bigger than some threshold
mask = np.diff(data_saw) < -90
To make mask the same size as data_saw, prepend a zero, because the result of diff is always smaller than the input by one element. For what I have in mind, you'll also want to convert to an integer type:
offset = np.concatenate(([0], mask)).cumsum()
To normalize the data, just add 180 * offset plus some arbitrary bias:
data_fixed = data_saw + 180 * offset
To keep the last segment at its original value (matching the loop above):
data_fixed = data_saw + 180 * (offset - offset[-1])
To keep the second segment as-is:
data_fixed = data_saw + 180 * (offset - offset[1])
You can use a similar method to adjust data not only with arbitrary numbers of gaps, but even arbitrary gap sizes above some threshold.
First, compute the indices corresponding to the original mask using np.flatnonzero:
delta = np.diff(data_saw)
indices = np.flatnonzero(delta < -90)
Now you can simply fill in the bad elements of delta, for example with the average of the two surrounding elements:
delta[indices] = 0.5 * (delta[indices - 1] + delta[indices + 1])
The fixed data is the cumulative sum (with a zero prepended):
data_fixed = np.concatenate(([0], delta)).cumsum() + data_saw[0]
Not sure if it is more correct, but this uses numpy's native methods.
import numpy as np
import matplotlib.pyplot as plt
array = np.arange(90)
array = np.concatenate([array[:30], array[30:] + 180, array[60:] + 2 * 180 + 30])
plt.plot(array)
adjusted = array - np.concatenate([[0], ((np.diff(array) >= 180).cumsum() * 180)])
plt.plot(adjusted)

Remove for loops for faster execution - vectorize

As part of my academic project, I am working on a linear filter for an image. Below is the code, using only NumPy (no external libraries); I want to eliminate the for loops by vectorizing or any other option. How can I achieve vectorization for faster execution? Thanks for the help.
Inputs -
Image.shape - (568, 768)
weightArray.shape - (3, 3)
def apply_filter(image: np.array, weight_array: np.array) -> np.array:
    rows, cols = image.shape
    height, width = weight_array.shape
    output = np.zeros((rows - height + 1, cols - width + 1))
    for rrow in range(rows - height + 1):
        for ccolumn in range(cols - width + 1):
            for hheight in range(height):
                for wwidth in range(width):
                    imgval = image[rrow + hheight, ccolumn + wwidth]
                    filterval = weight_array[hheight, wwidth]
                    output[rrow, ccolumn] += imgval * filterval
    return output
Vectorization is the process of converting each explicit for loop into an equivalent array operation.
In Python, this will involve reimagining your data in terms of slices.
In the code below, I've provided a working vectorization of the kernel loop. This shows how to approach vectorization, but since it is only optimizing the 3x3 array, it doesn't give you the biggest available gains.
If you want to see a larger improvement, you'll vectorize the image array, which I've templated for you as well -- but left some as an exercise.
import numpy as np
from PIL import Image

## no vectorization
def applyFilterMethod1(image: np.array, weightArray: np.array) -> np.array:
    rows, cols = image.shape ; height, width = weightArray.shape
    output = np.zeros((rows - height + 1, cols - width + 1))
    for rrow in range(rows - height + 1):
        for ccolumn in range(cols - width + 1):
            for hheight in range(height):
                for wwidth in range(width):
                    imgval = image[rrow + hheight, ccolumn + wwidth]
                    filterval = weightArray[hheight, wwidth]
                    output[rrow, ccolumn] += imgval * filterval
    return output

## vectorize the kernel loop (~3x improvement)
def applyFilterMethod2(image: np.array, weightArray: np.array) -> np.array:
    rows, cols = image.shape ; height, width = weightArray.shape
    output = np.zeros((rows - height + 1, cols - width + 1))
    for rrow in range(rows - height + 1):
        for ccolumn in range(cols - width + 1):
            imgval = image[rrow:rrow + height, ccolumn:ccolumn + width]
            filterval = weightArray[:, :]
            output[rrow, ccolumn] = sum(sum(imgval * filterval))
    return output

## vectorize the image loop (~50x improvement)
def applyFilterMethod3(image: np.array, weightArray: np.array) -> np.array:
    rows, cols = image.shape ; height, width = weightArray.shape
    output = np.zeros((rows - height + 1, cols - width + 1))
    for hheight in range(height):
        for wwidth in range(width):
            imgval = 0 ## TODO -- construct a compatible slice
            filterval = weightArray[hheight, wwidth]
            output[:, :] += imgval * filterval
    return output

src = Image.open("input.png")
sb = np.asarray(src)
cb = np.array([[1,2,1],[2,4,2],[1,2,1]])
cb = cb/sum(sum(cb)) ## normalize
db = applyFilterMethod2(sb, cb)
dst = Image.fromarray(db)
dst.convert("L").save("output.png")
#src.show() ; dst.show()
Note: You could probably remove all four for loops, with some additional complexity. However, because this would only eliminate the overhead of 9 iterations (in this example), I don't estimate that it would yield any additional performance gains over applyFilterMethod3. Furthermore, although I haven't attempted it, the way I imagine it would be done might add more overhead than it would remove.
FYI: This is a standard image convolution (supporting only grayscale as implemented). I always like to point out that, in order to be mathematically correct, this would need to compensate for the gamma compression that is implicit in nearly every default image encoding -- but this little detail is often ignored.
Discussion
This type of vectorization is often necessary in Python specifically, because the standard Python interpreter is extremely inefficient at processing large for loops. Explicitly iterating over each pixel of an image therefore wastes a lot of time. Ultimately, though, the vectorized implementation does not change the amount of real work performed, so we're only talking about eliminating an overhead aspect of the algorithm.
However, vectorization has a side-benefit: Parallelization. Lumping a large amount of data processing onto a single operator gives the language/library more flexibility in how to optimize the execution. This might include executing your embarrassingly parallel operation on a GPU -- if you have the right tools, for example the Tensorflow image module.
Python's seamless support for array programming is one reason that it has become highly popular for use in machine learning, which can be extremely compute intensive.
Solution
Here's the solution to the imgval assignment, which was left as an exercise above.
imgval = image[hheight:hheight + rows - height + 1, wwidth:wwidth + cols - width + 1]
You can construct an array of sliced views of the image, each shifted by the indices of the weights array, and then multiply it by the weights and take the sum.
def apply_filter(image: np.array, weights: np.array) -> np.array:
    height, width = weights.shape
    indices = np.indices(weights.shape).T.reshape(weights.size, 2)
    views = np.array([image[r:-height+r, c:-width+c] for r, c in indices])
    return np.inner(views.T, weights.T.flatten()).T  # sum product
(I had to transpose and reshape at several points to get the data into the desired shapes and order. There may be simpler ways.)
There is still a sneaky for loop in the form of a list comprehension over the weights indices, but we minimize the operations inside the for loop to creating a set of slice views. The loop could potentially be avoided using sliding_window_view, but it's not clear if that would improve performance; or stride_tricks.as_strided (see answers to this question).
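For reference, here is a minimal sketch of the sliding_window_view route mentioned above (requires NumPy 1.20+; whether it beats the slice-view version is workload-dependent and untested here):
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def apply_filter_swv(image: np.ndarray, weights: np.ndarray) -> np.ndarray:
    # windows has shape (rows-h+1, cols-w+1, h, w): one h x w view per output pixel
    windows = sliding_window_view(image, weights.shape)
    # multiply each window by the kernel and sum over the two window axes
    return np.einsum('ijkl,kl->ij', windows, weights)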

Stochastic Sampling for Image Processing

Can anyone help with the code for this? I want to reduce my image size by using stochastic sampling but cannot work out how to set the limits of my input patch.
# New smaller image
img_small = np.zeros((img.shape[0] // factor, img.shape[1] // factor),
                     dtype=np.int64)
# Loop over the rows of the smaller image
for i in range(img_small.shape[0]):
    # Loop over the columns of the smaller image
    for j in range(img_small.shape[1]):
        # The input patch should consist of rows from factor * i to
        # factor * (i + 1) - 1, and columns from factor * j to
        # factor * (j + 1) - 1
        # input_patch = img[ # Extract the input patch
        # Can use np.random.choice(input_patch.flatten(), ...) to choose random
        # pixels from input_patch
        # img_small[i, j] = # Set the output pixel
        img_small[i, j] =
The limits are given in the comments; simply apply them to the array. Using an example image and your code (I added the image load and converted it to greyscale; you will need to add colour handling if colour is required):
from PIL import Image
import numpy as np
from matplotlib.pyplot import imshow

# load the image and convert to greyscale
image = Image.open('imglrg0.jpg').convert('LA')
# convert image to numpy array
img_lrg = np.asarray(image)
#imshow(img_lrg)

factor = 8
# New smaller image
img_small = np.zeros((img_lrg.shape[0] // factor, img_lrg.shape[1] // factor),
                     dtype=np.int64)
# Loop over the rows of the smaller image
for i in range(img_small.shape[0]):
    # Loop over the columns of the smaller image
    for j in range(img_small.shape[1]):
        # The input patch consists of rows factor * i to factor * (i + 1) - 1
        # and columns factor * j to factor * (j + 1) - 1; Python slice ends
        # are exclusive, so no -1 is needed on the stop indices
        input_patch = img_lrg[i * factor:(i + 1) * factor, j * factor:(j + 1) * factor]
        # choose a random pixel from input_patch for the output pixel
        img_small[i, j] = np.random.choice(input_patch.flatten())

imshow(np.asarray(img_small))
which results in a recognizable, if not great, image for factor=8. Maybe play with the sampling a bit to improve it. (I simply used matplotlib to quickly display the result, so it's off-color.)
Just as an addition on the sampling: choosing the average of three points, like so: img_small[i, j] = np.average(np.random.choice(input_patch.flatten(), 3)), results in a substantial improvement.
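If the double loop ever becomes the bottleneck, here is a vectorized sketch of the same idea, picking one random pixel per factor x factor block (the function name and the default_rng handling are my own choices, and it assumes a 2D greyscale array):
import numpy as np

def stochastic_downsample(img, factor, rng=None):
    # assumes img is a 2D greyscale array
    if rng is None:
        rng = np.random.default_rng()
    # crop so both dimensions divide evenly, then split into factor x factor blocks
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    # draw one random (row, col) offset inside each block
    i = rng.integers(0, factor, size=(h // factor, w // factor))
    j = rng.integers(0, factor, size=(h // factor, w // factor))
    rows = np.arange(h // factor)[:, None]   # broadcast over block rows
    cols = np.arange(w // factor)[None, :]   # broadcast over block columns
    return blocks[rows, i, cols, j]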
