how to detect and slice boxes out of a numpy array


I've got a wide image with 2 images inside it. These 2 images can be seen as 'boxes' in the big image, and the numpy array looks like this:
[
[200,200,200,157,200,200,200,238,256,167,234,266,154,200,200,200,157,200,200,200,238,256,167,234,266,154,200,200,200,157,200,200,200],
[200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200],
[200,144,200,200,132,200,200,238,256,167,234,266,154,200,144,200,200,132,200,200,238,256,167,234,266,154,200,144,200,200,132,200,200],
[200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200],
[200,200,166,200,200,200,200,238,256,167,234,266,154,200,200,166,200,200,200,200,238,256,167,234,266,154,200,200,166,200,200,200,200],
[182,200,200,200,200,200,200,238,256,167,234,266,154,182,200,200,200,200,200,200,238,256,167,234,266,154,182,200,200,200,200,200,200]
]
Because I applied a median filter, the surrounding pixels are all 200, with a little bit of noise here and there. My question is: how can I extract those 2 sub-images from the big image and store each one in its own array, so that I have the pictures separately? My guess would be to slice them out or maybe use edge detection, but I haven't succeeded yet. The array in the question is a mockup, but it represents what the data looks like; the real array is too big to print in Visual Studio. The real pictures differ in their 'white spaces' and in the number of sub-pictures they contain. The size is fixed and always 28 x 200.

I am not sure whether my assumptions about your image data are correct. The following algorithm only works if the image itself does not contain "flat regions" of 200.
import numpy as np
from matplotlib import pyplot as plt
data = np.array([
[200,200,200,157,200,200,200,238,256,167,234,266,154,200,200,200,157,200,200,200,238,256,167,234,266,154,200,200,200,157,200,200,200],
[200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200],
[200,144,200,200,132,200,200,238,256,167,234,266,154,200,144,200,200,132,200,200,238,256,167,234,266,154,200,144,200,200,132,200,200],
[200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200,238,256,167,234,266,154,200,200,200,200,200,200,200],
[200,200,166,200,200,200,200,238,256,167,234,266,154,200,200,166,200,200,200,200,238,256,167,234,266,154,200,200,166,200,200,200,200],
[182,200,200,200,200,200,200,238,256,167,234,266,154,182,200,200,200,200,200,200,238,256,167,234,266,154,182,200,200,200,200,200,200]
])
# filter all regions of (more or less) constant 200
median_data = np.median(data, axis=0)
diff_data = np.append([0], np.diff(median_data))
img_region = (diff_data != 0) & (median_data != 200)
# get image regions, identify longest image as "true" image length
idx_pairs = np.where(np.diff(np.hstack(([False],img_region,[False]))))[0]
region_lengths = np.diff(idx_pairs)[::2]
longest_region = np.max(np.diff(idx_pairs)[::2])
# fix broken images
for i_region, region_length in enumerate(region_lengths):
    if region_length != longest_region:
        try:
            img_region[slice(idx_pairs[i_region*2], idx_pairs[i_region*2]+longest_region)] = True
        except IndexError:
            pass # removed mini-regions
idx_pairs = np.where(np.diff(np.hstack(([False], img_region, [False]))))[0]
# get number of images
n_img = np.sum(np.diff(img_region.astype(int)) == 1)
image_data = data[:, img_region]
# slice images
images = np.split(image_data, n_img, axis=1)
for img in images:
    plt.figure()
    plt.imshow(img)
plt.show()
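A more compact variant of the same idea, offered as a minimal sketch rather than a replacement for the code above: mark every column whose median differs from the background value, then cut the array at the start and end of each contiguous run of marked columns. The function name `split_on_background` and its `background` and `tol` parameters are hypothetical. Like the answer, it assumes that no column inside a sub-image has a median equal to the background value.

```python
import numpy as np

def split_on_background(data, background=200, tol=0):
    """Return the sub-arrays of `data` whose columns differ from the background."""
    # The column median suppresses isolated noise pixels in background columns
    col_median = np.median(data, axis=0)
    is_image = np.abs(col_median - background) > tol
    # Pad with False so runs touching the left or right border are also closed
    padded = np.concatenate(([False], is_image, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, stops = edges[::2], edges[1::2]  # stops are exclusive
    return [data[:, a:b] for a, b in zip(starts, stops)]

# Small mock array: two boxes on a background of 200, plus one noise pixel
demo = np.full((3, 10), 200)
demo[:, 2:5] = [[10, 20, 30], [11, 21, 31], [12, 22, 32]]
demo[:, 7:9] = [[90, 80], [91, 81], [92, 82]]
demo[1, 0] = 150  # isolated noise; the median of column 0 is still 200
boxes = split_on_background(demo)
print([box.shape for box in boxes])
```

Unlike the answer's approach, this sketch does not try to repair boxes that were split by a flat column; it simply returns each run as found.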

Related

Save individual segments from image segmentation

I've been using skimage.segmentation modules to find contiguous segments within an image.
For example, one test image (not reproduced here) segments quite nicely.
I want to be able to view the distinct regions of the original image in isolation (such that the above image would result in 6 roughly rectangular sub-images). I have obtained some degree of success in doing this, but it's been difficult. Is there any pre-existing module I can use to accomplish this?
If not, high-level algorithm advice would be appreciated.
Approach thus far:
image_slic = seg.slic(image, n_segments=6)
borders = seg.find_boundaries(image_slic)
sub_images = []
new_seg = []
for every row of borders:
    new_seg.append([])
    for every pixel in every row:
        if (pixel is not a border and is not already processed):
            new_seg[-1].append(pixel)
            Mark pixel as processed
        elif (pixel is a border and is not already processed):
            break
    if (on the first pixel of a row OR the first unprocessed pixel):
        sub_images.append(new_seg)
        new_seg = []
With this approach, I can generate the four regions from the example image that border the left side without error. While it's not shown in the above pseudo-code, I'm also padding segments with transparent pixels to preserve their shape. This additional consideration makes finding right-side sub-images more difficult.

This can be readily accomplished through NumPy's boolean indexing:
import numpy as np
from skimage import io, segmentation
import matplotlib.pyplot as plt
n_segments = 6
fig_width = 2.5*n_segments
img = io.imread('https://i.imgur.com/G44JEG7.png')
segments = segmentation.slic(img, n_segments=n_segments)
fig, ax = plt.subplots(1, n_segments)
fig.set_figwidth(fig_width)
# Label values need not start at 0, so pair each label with an axis by position
for axis, index in zip(ax, np.unique(segments)):
    segment = img.copy()
    segment[segments!=index] = 0
    axis.imshow(segment)
    axis.set(title=f'Segment {index}')
    axis.set_axis_off()
plt.show()
You could obtain the same result using NumPy's where function like this:
for index in np.unique(segments):
    segment = np.where(np.expand_dims(segments, axis=-1)==index, img, [0, 0, 0])
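The snippets above keep every segment at the full image size. Since the question asks for roughly rectangular sub-images, one possible follow-up is to crop each masked segment to its bounding box. This is only a sketch: `crop_segments` is a hypothetical helper, and it works on any integer label array, such as the one returned by `segmentation.slic`.

```python
import numpy as np

def crop_segments(img, labels):
    """Return {label: sub-image}, each cropped to the label's bounding box."""
    crops = {}
    for label in np.unique(labels):
        mask = labels == label
        rows = np.flatnonzero(mask.any(axis=1))
        cols = np.flatnonzero(mask.any(axis=0))
        r0, r1 = rows[0], rows[-1] + 1
        c0, c1 = cols[0], cols[-1] + 1
        crop = img[r0:r1, c0:c1].copy()
        # Zero the pixels inside the box that belong to other segments
        crop[~mask[r0:r1, c0:c1]] = 0
        crops[int(label)] = crop
    return crops

# Tiny mock example: a uniform RGB image with three labelled regions
img = np.full((4, 6, 3), 9, dtype=np.uint8)
labels = np.zeros((4, 6), dtype=int)
labels[:2, 3:] = 1
labels[2:, 2:] = 2
crops = crop_segments(img, labels)
for label, crop in crops.items():
    print(label, crop.shape)
```

To get transparent padding instead of black, the same mask could be written into an alpha channel rather than used to zero the colour values.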

Remove background of the image using opencv Python

I have two images, one with only the background and the other with the background plus a detectable object (in my case, a car).
I am trying to remove the background so that only the car remains in the resulting image. This is the code with which I am trying to get the desired result:
import numpy as np
import cv2
original_image = cv2.imread('IMG1.jpg', cv2.IMREAD_COLOR)
gray_original = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
background_image = cv2.imread('IMG2.jpg', cv2.IMREAD_COLOR)
gray_background = cv2.cvtColor(background_image, cv2.COLOR_BGR2GRAY)
foreground = np.absolute(gray_original - gray_background)
foreground[foreground > 0] = 255
cv2.imshow('Original Image', foreground)
cv2.waitKey(0)
Subtracting the two images does not give the expected result, which should be an image of the car only.
Also, if you look closely at the two images, you'll see that they are not exactly the same: the camera moved a little, so the background is slightly shifted. My question is: with these two images, how can I subtract the background? I do not want to use grabCut or backgroundSubtractorMOG right now, because I do not yet understand what is going on inside those algorithms.
Also, if possible, please guide me toward a general way of doing this, not only for this specific case: I have a background in one image and background + object in the second image. What would be the best way of doing this? Sorry for such a long question.

I solved your problem using OpenCV's watershed algorithm. You can find the theory and examples of watershed here.
First I selected several points (markers) to dictate where the object I want to keep is, and where the background is. This step is manual, and can vary a lot from image to image. It also takes some repetition until you get the desired result. I suggest using a tool to get the pixel coordinates.
Then I created an integer array of zeros, with the size of the car image, and assigned some values (1:background, [255,192,128,64]:car_parts) to the pixels at the marker positions.
NOTE: When I downloaded your image I had to crop it to get the one with the car. After cropping, the image has a size of 400x601. This may not match the size of the image you have, in which case the markers will be off.
Afterwards I used the watershed algorithm. The 1st input is your image and 2nd input is the marker image (zero everywhere except at marker positions). The result is shown in the image below.
I set all pixels with value greater than 1 to 255 (the car), and the rest (background) to zero. Then I dilated the obtained image with a 3x3 kernel to avoid losing information on the outline of the car. Finally, I used the dilated image as a mask for the original image, using the cv2.bitwise_and() function, and the result lies in the following image:
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Load the image
img = cv2.imread("/path/to/image.png", 3)
# Create a blank image of zeros (same dimension as img)
# It should be grayscale (1 color channel)
marker = np.zeros_like(img[:,:,0]).astype(np.int32)
# This step is manual. The goal is to find the points
# which create the result we want. I suggest using a
# tool to get the pixel coordinates.
# Dictate the background and set the markers to 1
marker[204][95] = 1
marker[240][137] = 1
marker[245][444] = 1
marker[260][427] = 1
marker[257][378] = 1
marker[217][466] = 1
# Dictate the area of interest
# I used different values for each part of the car (for visibility)
marker[235][370] = 255 # car body
marker[135][294] = 64 # rooftop
marker[190][454] = 64 # rear light
marker[167][458] = 64 # rear wing
marker[205][103] = 128 # front bumper
# rear bumper
marker[225][456] = 128
marker[224][461] = 128
marker[216][461] = 128
# front wheel
marker[225][189] = 192
marker[240][147] = 192
# rear wheel
marker[258][409] = 192
marker[257][391] = 192
marker[254][421] = 192
# Now we have set the markers, we use the watershed
# algorithm to generate a marked image
marked = cv2.watershed(img, marker)
# Plot this one. If it does what we want, proceed;
# otherwise edit your markers and repeat
plt.imshow(marked, cmap='gray')
plt.show()
# Make the background black, and what we want to keep white
marked[marked == 1] = 0
marked[marked > 1] = 255
# Use a kernel to dilate the image, to not lose any detail on the outline
# I used a kernel of 3x3 pixels
kernel = np.ones((3,3),np.uint8)
dilation = cv2.dilate(marked.astype(np.float32), kernel, iterations = 1)
# Plot again to check whether the dilation is according to our needs
# If not, repeat by using a smaller/bigger kernel, or more/less iterations
plt.imshow(dilation, cmap='gray')
plt.show()
# Now apply the mask we created on the initial image
final_img = cv2.bitwise_and(img, img, mask=dilation.astype(np.uint8))
# cv2.imread reads the image as BGR, but matplotlib uses RGB
# BGR to RGB so we can plot the image with accurate colors
b, g, r = cv2.split(final_img)
final_img = cv2.merge([r, g, b])
# Plot the final result
plt.imshow(final_img)
plt.show()
If you have a lot of images you will probably need to create a tool to annotate the markers graphically, or even an algorithm to find markers automatically.

The problem is that you're subtracting arrays of unsigned 8-bit integers. This operation can wrap around: a negative result cannot be represented, so 10 - 11 becomes 255.
To demonstrate:
>>> import numpy as np
>>> a = np.array([[10,10]],dtype=np.uint8)
>>> b = np.array([[11,11]],dtype=np.uint8)
>>> a - b
array([[255, 255]], dtype=uint8)
Since you're using OpenCV, the simplest way to achieve your goal is to use cv2.absdiff().
>>> cv2.absdiff(a,b)
array([[1, 1]], dtype=uint8)
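If OpenCV is not at hand, the same wrap-around can be avoided in plain NumPy by widening to a signed type before subtracting. The sketch below also adds a threshold so that small differences (for example from slight camera movement) are not marked as foreground; the function name `foreground_mask` and the threshold value of 25 are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def foreground_mask(frame, background, threshold=25):
    """Return 255 where the two uint8 images differ by more than `threshold`."""
    # int16 can hold every value in [-255, 255], so the subtraction cannot wrap
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

a = np.array([[10, 10, 200]], dtype=np.uint8)
b = np.array([[11, 11, 50]], dtype=np.uint8)
mask = foreground_mask(a, b)
print(mask)
```

With the question's original `foreground > 0` test, the first two pixels (difference of 1) would have been marked as foreground as well.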

I recommend using OpenCV's grabcut algorithm. You first draw a few lines on the foreground and background, and keep doing this until your foreground is sufficiently separated from the background. It is covered here: https://docs.opencv.org/trunk/d8/d83/tutorial_py_grabcut.html
as well as in this video: https://www.youtube.com/watch?v=kAwxLTDDAwU

OpenCV/python: How to change image pixels' values using a formula?

I'm trying to stretch an image's histogram using a logarithmic transformation. Basically, I am applying a log operation to each pixel's intensity. When I'm trying to change image's value in each pixel, the new values are not saved but the histogram looks OK. Also, the maximum value is not correct. This is my code:
import cv2
import numpy as np
import math
from matplotlib import pyplot as plt
img = cv2.imread('messi.jpg',0)
img2 = img
for i in range(0,img2.shape[0]-1):
    for j in range(0,img2.shape[1]-1):
        if (math.log(1+img2[i,j],2)) < 0:
            img2[i,j]=0
        else:
            img2[i,j] = np.int(math.log(1+img2[i,j],2))
            print (np.int(math.log(1+img2[i,j],2)))
print (img2.ravel().max())
cv2.imshow('LSP',img2)
cv2.waitKey(0)
fig = plt.gcf()
fig.canvas.set_window_title('LSP histogram')
plt.hist(img2.ravel(),256,[0,256]); plt.show()
img3 = img2
B = np.int(img3.max())
A = np.int(img3.min())
print ("Maximum intensity = ", B)
print ("minimum intensity = ", A)
This is also the histogram I get:
However, the maximum intensity shows 186! This isn't applying the proper logarithmic operation at all.
Any ideas?

The code you wrote performs a logarithmic transformation applied to the image intensities. The reason why you are getting such a high spurious intensity as the maximum is because your for loops are wrong. Specifically, your range is incorrect. range is exclusive of the ending interval, which means that you must go up to img.shape[0] and img.shape[1] respectively, and not img.shape[0]-1 or img.shape[1]-1. Therefore, you are missing the last row and last column of the image, and these don't get touched by logarithmic operation. The maximum that is reported is from one of these pixels in the last row or column that you didn't touch.
Once you correct this, you don't get those bad intensities anymore:
for i in range(0,img2.shape[0]): # Change
    for j in range(0,img2.shape[1]): # Change
        # int() replaces the removed np.int alias and keeps 1+pixel from wrapping in uint8
        if (math.log(1+int(img2[i,j]),2)) < 0:
            img2[i,j]=0
        else:
            img2[i,j] = int(math.log(1+int(img2[i,j]),2))
Doing that now gives us:
('Maximum intensity = ', 7)
('minimum intensity = ', 0)
However, what you're going to get now is a very dark image. The histogram that you have shown us illustrates that all of the image pixels are in the dark range... roughly between [0-7]. Because of that, the majority of your image is going to be dark if you use uint8 as the data type for visualization. Take note that I searched for the Lionel Messi image that's part of the OpenCV tutorials, and this is the image I found:
Source: https://opencv-python-tutroals.readthedocs.org/en/latest/_images/roi.jpg
Your code is converting this to grayscale, and that's fine for the purpose of your question. Now, using the above image, if you actually show what the histogram count looks like as well as what the intensities are per bin in the histogram, this is what we get for img2:
In [41]: np.unique(img2)
Out[41]: array([0, 1, 2, 3, 4, 5, 6, 7], dtype=uint8)
In [42]: np.bincount(img2.ravel())
Out[42]: array([ 86, 88, 394, 3159, 14841, 29765, 58012, 19655])
As you can see, the bulk of the image pixels are hovering between the [0-7] range, which is why everything looks black. If you want to see the image better, scale it by roughly 255 / 7 = 36 or so:
img2 = 36*img2
cv2.imshow('LSP',img2)
cv2.waitKey(0)
We get this image:
I also get this histogram:
That personally looks very ugly... at least to me. As such, I would recommend that you choose a more meaningful image transformation if you want to stretch the histogram. In fact, the log operation compresses the dynamic range of the histogram. If you want to stretch the histogram, go the opposite way and try a power-law operation. Specifically, given an input intensity, the output is defined as:
out = c*in^(p)
in is the input intensity, p is a power and c is a constant to ensure that you scale the image so that the maximum intensity gets mapped to the same maximum intensity of the input when you're finished and not anything larger. That can be done by calculating c so that:
c = (img2.max()) / (img2.max()**p)
... where p is the power you want. In addition, the transformation via power-law can be explained with this nice diagram:
Source: http://www.nptel.ac.in/courses/117104069/chapter_8/8_14.html
Basically, powers that are less than 1 perform an intensity expansion where darker intensities get pushed towards the lighter side. Similarly, powers that are greater than 1 perform an intensity compression where lighter intensities get pushed to the darker side. In your case, you want to expand the histogram, and so you want the first option. Specifically, try making the intensities that are smaller go towards the larger range. This can be done by choosing a power that's smaller than 1... try 0.5 for example.
You'd modify your code so that it is like this:
img2 = img2.astype(np.float64) # Cast to float
c = (img2.max()) / (img2.max()**(0.5))
for i in range(0,img2.shape[0]): # Full range, including the last row
    for j in range(0,img2.shape[1]): # Full range, including the last column
        img2[i,j] = int(c*img2[i,j]**(0.5))
# Cast back to uint8 for display
img2 = img2.astype(np.uint8)
Doing that, I get this image:
I also get this histogram:
Minor Note
If I can suggest something in terms of efficiency, I wouldn't recommend that you loop through the entire image and set each pixel individually... that's not how numpy arrays are meant to be used. You can achieve what you want vectorized in a single line of code.
With your old code, use np.log2, not math.log with the base 2 with numpy arrays:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Your code
img = cv2.imread('messi.jpg',0)
# New code
img2 = np.log2(1 + img.astype(np.float64)).astype(np.uint8)
# Back to your code
img2 = 36*img2 # Edit from before
cv2.imshow('LSP',img2)
cv2.waitKey(0)
fig = plt.gcf()
fig.canvas.manager.set_window_title('LSP histogram')
plt.hist(img2.ravel(),256,[0,256]); plt.show()
img3 = img2
B = int(img3.max())
A = int(img3.min())
print ("Maximum intensity = ", B)
print ("minimum intensity = ", A)
cv2.destroyAllWindows() # Don't forget this
Similarly, if you want to apply a power-law transformation, it's very simple:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Your code
img = cv2.imread('messi.jpg',0)
# New code
# Compute c from img: img2 does not exist yet at this point
c = (img.max()) / (img.max()**(0.5))
img2 = (c*img.astype(np.float64)**(0.5)).astype(np.uint8)
#... rest of code as before
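The power-law mapping can be checked on a few values without loading an image. The sketch below wraps the formula out = c*in^(p) in a hypothetical helper, `power_law`, and rounds before casting back to the input type; the rounding is an addition of this sketch, because a plain cast truncates and floating-point error could otherwise map the maximum to one less than itself.

```python
import numpy as np

def power_law(img, p=0.5):
    """Apply out = c * in**p, with c chosen so the maximum intensity is preserved."""
    values = img.astype(np.float64)
    peak = values.max()
    if peak == 0:
        return img.copy()  # an all-black image has nothing to stretch
    c = peak / peak**p
    # Round before casting: astype alone would truncate toward zero
    return np.round(c * values**p).astype(img.dtype)

sample = np.array([[0, 64, 255]], dtype=np.uint8)
stretched = power_law(sample, p=0.5)
print(stretched)
```

With p = 0.5, the dark value 64 is pushed up to roughly the middle of the range while 0 and 255 stay where they are, which is the expansion of dark intensities described above.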

Correlating two skeletized images : Python

import cv2
import numpy as np
from PIL import Image
from skimage import morphology
from scipy import signal
img = cv2.imread('thin.jpg',0)
img1 = cv2.imread('thin1.jpg',0)
cv2.imshow('image1',img)
cv2.imshow('image2',img1)
ret,img = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
ret,img1 = cv2.threshold(img1,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
size = np.size(img)
size1 = np.size(img1)
skel = np.zeros(img.shape,np.uint8)
skel1 = np.zeros(img1.shape,np.uint8)
element = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
img = 255 - img
img1 = 255 - img1
img = cv2.dilate(img,element,iterations=8)
img1 = cv2.dilate(img1,element,iterations=8)
done = False
while(not done):
    eroded = cv2.erode(img,element)
    eroded1 = cv2.erode(img1,element)
    temp = cv2.dilate(eroded,element)
    temp1 = cv2.dilate(eroded1,element)
    temp = cv2.subtract(img,temp)
    temp1 = cv2.subtract(img1,temp1)
    skel = cv2.bitwise_or(skel,temp)
    skel1 = cv2.bitwise_or(skel1,temp1)
    img = eroded.copy()
    img1 = eroded1.copy()
    zeros = size - cv2.countNonZero(img)
    if zeros==size:
        done = True
cv2.imshow('IMAGE',skel)
cv2.imshow('TEMPLATE',skel1)
cv2.imwrite("image.jpg",skel)
if cv2.waitKey(0) & 0xFF == ord('q'):
    cv2.destroyAllWindows()
This is the code I used to convert two grayscale images into two skeletonized images by binarization and thinning, and it produces the expected result. Now I want to compare these two skeletonized images to see whether they match or not. How can I correlate them with each other? Do I need to convert the skeletons into 2D arrays? Can anyone suggest a solution? Thanks in advance.

There are a number of ways you can compare the images to see if they match. The simplest is to do a pixelwise subtraction to create a new image and then sum the pixels in the new image. If they sum to zero you have an exact match. The larger the sum the worse the match.
You will however have a problem using most comparison techniques on a skeletonized image. You take the image and reduce it to skinny little lines that are unlikely to overlap for images that only deviate from each other by a little bit.
With skeletonized images you often need to compare features. For example, identify the points of intersection of the skeleton, and use the location of those points for comparing images. In your sample image you might be able to extract the lines (I see three major ones) and then compare images based on the location of the lines.
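The pixelwise subtraction mentioned at the start of this answer can be sketched as follows. The helper name `mismatch_score` is hypothetical; the cast to a signed type is there so that differences between uint8 images do not wrap around.

```python
import numpy as np

def mismatch_score(a, b):
    """Sum of absolute pixel differences: 0 means an exact match."""
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # Widen to int32 so uint8 subtraction cannot wrap around
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

skeleton = np.zeros((5, 5), dtype=np.uint8)
skeleton[2, 1:4] = 255            # a short horizontal line
shifted = np.zeros_like(skeleton)
shifted[3, 1:4] = 255             # the same line, one row lower

same_score = mismatch_score(skeleton, skeleton)
shifted_score = mismatch_score(skeleton, shifted)
print(same_score, shifted_score)
```

The shifted copy scores as badly as a completely different line of the same length would, which illustrates why a direct subtraction is a poor fit for thin skeletons.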

Binary images are already represented as 2D numpy arrays.
This is a complex problem. You can do this by reshaping the images to two vectors (assuming they are exactly the same size), and then calculating the correlation coefficient:
np.corrcoef(img.reshape(-1), img1.reshape(-1))

One possible solution would be to correlate (or subtract) the blurred version of each skeletonized image with one another.
That way, the unavoidable little offsets between skeleton lines wouldn't have such a negative impact on the outcome as if you subtracted the skeletons directly (since the skeleton lines would most probably not overlay exactly over one another).
I'm assuming here that the original images weren't similar to each other in the first place, otherwise you wouldn't need to skeletonize them, right?
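A minimal sketch of the blur-then-correlate idea, using only NumPy. A simple box blur stands in for whatever blur you prefer (a Gaussian blur would be the more usual choice); `box_blur`, `blurred_correlation` and the `radius` parameter are hypothetical names chosen for this illustration.

```python
import numpy as np

def box_blur(img, radius=2):
    """Average each pixel over a (2*radius+1)-wide square, zero-padded at the edges."""
    img = img.astype(np.float64)
    padded = np.pad(img, radius, mode="constant")
    out = np.zeros_like(img)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def blurred_correlation(a, b, radius=2):
    """Correlation coefficient of the two images after blurring both."""
    fa = box_blur(a, radius).ravel()
    fb = box_blur(b, radius).ravel()
    return float(np.corrcoef(fa, fb)[0, 1])

# Two one-pixel-wide lines, offset from each other by a single row
a = np.zeros((20, 20), dtype=np.uint8)
b = np.zeros((20, 20), dtype=np.uint8)
a[10, 2:18] = 255
b[11, 2:18] = 255

raw = float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
blurred = blurred_correlation(a, b)
print(raw, blurred)
```

The raw correlation of the two lines is slightly negative because they never overlap, while the blurred versions correlate strongly. The blur radius sets how much offset is tolerated, so it is a tuning choice rather than a fixed value.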

Finding coordinates of brightest pixel in an image and entering them into an array

I have been asked to write a program to find 'stars' in an image by converting the image file to a numpy array and generating an array of the coordinates of the brightest pixels in the image above a specified threshold (representing background interference).
Once I have located the brightest pixel in the image I must record its x,y coordinates, and set the value of that pixel and surrounding 10X10 pixel area to zero, effectively removing the star from the image.
I already have a helper code which converts the image to an array, and have attempted to tackle the problem as follows;
I have defined a variable
Max = array.max()
and used a while loop;
while Max >= threshold:
    coordinates = numpy.where(array == Max) # find the maximum value
However, I want this to loop over the whole array and find all of the coordinates, not just the first maximum, and also to remove each maximum once it is found by setting the surrounding 10X10 area to zero. I have thought about using a for loop to do this, but I am unsure how I should use it since I am new to Python.
I would appreciate any suggestions,
Thanks

There are a number of different ways to do it with just numpy, etc.
There's the "brute force" way:
from PIL import Image
import numpy as np
im = Image.open('test.bmp')
data = np.array(im)
threshold = 200
window = 5 # This is the "half" window...
ni, nj = data.shape
new_value = 0
for i, j in zip(*np.where(data > threshold)):
    istart, istop = max(0, i-window), min(ni, i+window+1)
    jstart, jstop = max(0, j-window), min(nj, j+window+1)
    data[istart:istop, jstart:jstop] = new_value
Or the faster approach...
from PIL import Image
import numpy as np
import scipy.ndimage
im = Image.open('test.bmp')
data = np.array(im)
threshold = 200
window = 10 # This is the "full" window...
new_value = 0
mask = data > threshold
mask = scipy.ndimage.uniform_filter(mask.astype(float), size=window)
mask = mask > 0
data[mask] = new_value
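Both snippets above blank out the bright regions but do not record where the stars were. A sketch that follows the question more literally (find the brightest pixel, store its x,y coordinates, zero the surrounding 10x10 area, repeat) could look like this. The function name `find_stars` and the `half_window` parameter are hypothetical, and a 2D grayscale array is assumed.

```python
import numpy as np

def find_stars(image, threshold, half_window=5):
    """Return (x, y) of each brightest pixel above `threshold`, brightest first."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    data = image.copy()  # leave the caller's array untouched
    n_rows, n_cols = data.shape
    stars = []
    while True:
        row, col = np.unravel_index(np.argmax(data), data.shape)
        if data[row, col] <= threshold:
            break
        stars.append((int(col), int(row)))  # x is the column, y is the row
        # Zero a window of up to 10x10 pixels, clipped at the image border
        r0, r1 = max(0, row - half_window), min(n_rows, row + half_window)
        c0, c1 = max(0, col - half_window), min(n_cols, col + half_window)
        data[r0:r1, c0:c1] = 0
    return stars

sky = np.zeros((30, 30), dtype=np.uint8)
sky[5, 7] = 250    # first star
sky[6, 8] = 230    # part of the same star, inside its window
sky[20, 22] = 240  # second star
stars = find_stars(sky, threshold=200)
print(stars)
```

Because each found pixel is above a non-negative threshold and is then set to zero, every pass removes at least one bright pixel and the loop terminates.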

Astrometry.net will do this for you:
If you have astronomical imaging of the sky with celestial coordinates
you do not know—or do not trust—then Astrometry.net is for you. Input
an image and we'll give you back astrometric calibration meta-data,
plus lists of known objects falling inside the field of view.
We have built this astrometric calibration service to create correct,
standards-compliant astrometric meta-data for every useful
astronomical image ever taken, past and future, in any state of
archival disarray. We hope this will help organize, annotate and make
searchable all the world's astronomical information.
You don't even have to upload the images to their website. You can download the source. It is licensed under the GPL and uses NumPy, so you can muck around with it if you need to.
Note that you will need to first convert your bitmap to one of the following: JPEG, GIF, PNG, or FITS image.
