How to iterate through the non-zero values of an image? - Python

I found a function online that extracts and displays the dominant colors of an image. To save time, I want to iterate only over the non-zero pixels instead of the whole image. However, the way I changed the function raises an error:
if row != [0,0,0]:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Here is the modified code :
def dominantColor(image):
    from matplotlib.pyplot import imshow, show
    from scipy.cluster.vq import whiten
    from scipy.cluster.vq import kmeans
    import pandas as pd
    r = []
    g = []
    b = []
    for row in image:
        if row != [0,0,0]: #the part I added to the original code
            print(row)
            for temp_r, temp_g, temp_b in row:
                r.append(temp_r)
                g.append(temp_g)
                b.append(temp_b)
    image_df = pd.DataFrame({'red': r, 'green': g, 'blue': b})
    image_df['scaled_color_red'] = whiten(image_df['red'])
    image_df['scaled_color_blue'] = whiten(image_df['blue'])
    image_df['scaled_color_green'] = whiten(image_df['green'])
    cluster_centers, _ = kmeans(image_df[['scaled_color_red','scaled_color_blue','scaled_color_green']], 3)
    dominant_colors = []
    red_std, green_std, blue_std = image_df[['red','green','blue']].std()
    for cluster_center in cluster_centers:
        red_scaled, green_scaled, blue_scaled = cluster_center
        dominant_colors.append((
            red_scaled * red_std / 255,
            green_scaled * green_std / 255,
            blue_scaled * blue_std / 255
        ))
    imshow([dominant_colors])
    show()
    return dominant_colors
How should I correct my iteration loop to remove the error and keep only the non-zero values of my image? (NB: the image is actually mask * original_image.)

You need to call the .all() method on the result of that comparison if you want to compare arrays element-wise. So: if (row == [0,0,0]).all().
import numpy as np
image = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
])
for row in image:
    if not (row == [0, 0, 0]).all():
        print(row)
Result:
[1 0 0]
[0 0 1]

If I understand your code correctly, the answer is in the error log that you posted:
if row != [0,0,0]:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
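So, the .any() method checks whether any of the elements in the row evaluates as true. (Call it as row.any(): the built-in any(row) would raise the same error when each element of the row is itself an RGB array.)
for row in image:
    if row.any(): #enter the if block if any element is not 0
        print(row)
        for temp_r, temp_g, temp_b in row:
            r.append(temp_r)
            g.append(temp_g)
            b.append(temp_b)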
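If the goal is only to collect the non-zero pixels, the Python loop can probably be avoided altogether. The following is a minimal sketch, assuming the image is a NumPy array of shape (height, width, 3); the function name nonzero_pixels is made up for illustration and is not part of the original code.

```python
import numpy as np

def nonzero_pixels(image):
    """Return an (N, 3) array of the pixels that are not pure black."""
    image = np.asarray(image)
    # True where at least one channel of the pixel is non-zero
    keep = image.any(axis=-1)
    # Boolean indexing flattens the two spatial axes into one
    return image[keep]

# Tiny 2x2 example image (assumed data, for illustration only)
image = np.array([
    [[0, 0, 0], [10, 20, 30]],
    [[0, 0, 5], [0, 0, 0]],
])
pixels = nonzero_pixels(image)
r, g, b = pixels.T  # one 1-D array per channel
print(pixels)
```

The r, g and b arrays could then be passed to the DataFrame constructor in place of the three lists built by the loop.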

Related

Find index of list element with maximum value of specific property

This code snippet:
import numpy as np
from skimage.measure import label, regionprops
image = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 2]])
labels = label(image, background=0, connectivity=2)
props = regionprops(labels, image, cache=True)
print(image)
print(np.argmax([p.area for p in props]))
will print:
[[1 1 0]
[1 0 0]
[0 0 2]]
0
0 is an index of props element with maximum value of area property. Is there a more direct way of computing it without the need for creating a temporary array in np.argmax([p.area for p in props])? It doesn't have to use NumPy.
What about using regionprops_table?
from skimage.measure import label, regionprops_table
labels = label(image, background=0, connectivity=2)
out = np.argmax(regionprops_table(labels, image, cache=True, properties=['area'])['area'])
Output: 0
Mozway has given you a great solution. Here I show how to find the index of the list element with the maximum value of a specific property without creating a temporary array.
A straightforward method is to use a for loop:
maxid = -1
maxarea = -1
for i, p in enumerate(props):
    if p.area > maxarea:
        maxid, maxarea = i, p.area
This can be written in one line using functools.reduce.
import functools
maxid, maxarea = functools.reduce(lambda res, i_p: (i_p[0], i_p[1].area) if i_p[1].area > res[1] else res, enumerate(props), (-1, -1))
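Another option that avoids the temporary list is the built-in max with a key function over the indices. This is a small sketch: the Region namedtuple is a hypothetical stand-in for the objects returned by regionprops (only the .area attribute is used), and index_of_max is an illustrative helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for skimage's region objects; only `.area` is used.
Region = namedtuple("Region", ["label", "area"])

def index_of_max(items, key):
    """Return the index of the item for which key(item) is largest."""
    if not items:
        raise ValueError("items must not be empty")
    # max() walks the indices and compares key(items[i]); no list is built
    return max(range(len(items)), key=lambda i: key(items[i]))

props = [Region(1, 3), Region(2, 7), Region(3, 5)]
best = index_of_max(props, key=lambda p: p.area)
print(best)
```

When several items share the maximum, max returns the first one, which matches the behaviour of np.argmax.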

How to get indices from tensor where the scores (values) satisfies a condition after using torch.topk

I have a tensor of shape m x m which is basically the similarity (inner product) of two tensors. I want to get only the values that are above 0.5. How could I do this? (NumPy operations would do too.)
Example:
x = torch.randn((9052, 512))
similarities = x @ x.T
scores, indices = torch.topk(similarities, x.shape[0]) # topk == all the values, sorted
I have tried
mask = torch.ones(scores.size()[0])
mask = 1 - mask.diag()
sim_vec = torch.nonzero((scores >= 0.5)*mask)
Gives me a tensor of shape [39672595, 2]
I've also tried
(scores > 0.5 ).nonzero(as_tuple=True)[0]
It gives me a tensor of shape [51152826]
Expected Result Pseudo Code:
result = []
for i, row in enumerate(scores):
    temp = []
    for j, value in enumerate(row):
        if value > 0.5:
            temp.append(indices[i][j].item())
    result.append(temp)
Update:
The below code gives the Upper Triangle which shows that which element is closest to the other one. Main problem still persists of filtering and getting the index:
import pandas as pd
matrix = pd.DataFrame(scores.numpy().astype(np.float32))
upper_tri = matrix.where(np.triu(np.ones(matrix.shape),k=1).astype(bool))
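One possible way to get the result described by the pseudocode, sketched in NumPy since the question allows it. The function name indices_above is an assumption for illustration; the sort plays the role of torch.topk with k equal to the row length, and the result is a ragged list of lists because each row can keep a different number of indices.

```python
import numpy as np

def indices_above(similarities, threshold=0.5):
    """For each row, return the column indices whose score exceeds
    `threshold`, ordered from the highest score to the lowest."""
    similarities = np.asarray(similarities)
    # Column indices sorted by descending score (like topk over the full row)
    order = np.argsort(-similarities, axis=1)
    scores = np.take_along_axis(similarities, order, axis=1)
    keep = scores > threshold
    # Rows keep different numbers of entries, so build a list per row
    return [row_idx[row_keep].tolist() for row_idx, row_keep in zip(order, keep)]

sim = np.array([
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.4],
    [0.7, 0.4, 1.0],
])
result = indices_above(sim)
print(result)
```

For a 9052 x 9052 matrix the per-row list comprehension is only 9052 iterations, so the heavy work (sorting and thresholding) stays vectorized.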

How to raise an error that coordinates are out of raster bounds (rasterio.sample)?

I'm using the rasterio sample module. list_of_coords contains just 3 random points, but only the 2nd is within the raster bounds:
list_of_coords = [(754.2,4248548.6), (754222.6,4248548.6), (54.9,4248548.4)]
sample = np.array(list(rasterio.sample.sample_gen(raster, list_of_coords))).flatten()
output:
[ 0 896 0]
It works great, but as you can see, if the coordinates are outside the raster image it returns the value 0. Is there any way to let the user know that the coords they put in the list are out of the raster bounds? 0 can also be a valid value for a point within the raster bounds, so a simple loop:
for idx, element in enumerate(sample):
    if element == 0:
        print(f"coords {a[idx]} out of raster")
is not a good solution. Here is what I came up with so far:
Knowing basic info about geographic coordinate system and raster bounds we could write down some "rules". With raster.bounds I got bbox for my raster and I wrote a bit better loop:
for idx, element in enumerate(a):
    if element[0] > 0 and band2.bounds.right > 0 and element[0] > band2.bounds.right\
    or element[0] > 0 and band2.bounds.left > 0 and element[0] < band2.bounds.left: #more conditions
        print(f"coords {a[idx]} out of raster")
output (correct):
coords (754.6, 4248548.6) out of raster
coords (54.6, 4248548.6) out of raster
The problem is that to cover all possibilities I would need to write many more conditions in this loop. Is there a better way to let the user know that a given point is out of the raster?
rasterio.sample.sample_gen provides a masked argument. When True, it yields masked arrays according to the bounding box of the raster dataset.
>>> import rasterio
>>> ds = rasterio.open("raster.tif")
>>> ds.bounds
BoundingBox(left=-0.0001388888888888889, bottom=40.999861111111116, right=1.000138888888889, top=42.00013888888889)
>>> # Inside bbox
>>> next(rasterio.sample.sample_gen(ds, ((0.5, 41.5), ), masked=True))
masked_array(data=[130],
mask=False, # <= No mask (ndim=0)
fill_value=999999,
dtype=int16)
>>> # Outside bbox
>>> next(rasterio.sample.sample_gen(ds, ((0, 0), ), masked=True))
masked_array(data=[0],
mask=[False], # <= Mask ndim=1
fill_value=999999,
dtype=int16)
And then convert to a Python list with None where coordinates are outside the raster bounds:
>>> [None if x.mask.ndim == 1 and not x.mask[0] else x[0]
... for x in rasterio.sample.sample_gen(ds, ((0.5, 41.5), (0, 0)), masked=True)]
[130, None]
The shortest piece of code I can think of. It's similar to how the sample function checks for masking https://github.com/mapbox/rasterio/blob/master/rasterio/sample.py#L46
def sample_or_error(raster_file, x, y):
    with rasterio.open(raster_file) as src:
        row, col = src.index(x, y)
        # row/col fall outside the array when (x, y) is outside the raster
        if any([row < 0, col < 0, row >= src.height, col >= src.width]):
            raise ValueError(f"({x}, {y}) is out of bounds from {src.bounds}")
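The long chain of conditions in the question can likely be reduced to two range checks against the bounding box. The sketch below uses a namedtuple as a hypothetical stand-in with the same field names as the bounds object shown above (left, bottom, right, top); the bounds values and the helper name split_by_bounds are assumptions chosen for illustration, not taken from the real raster.

```python
from collections import namedtuple

# Hypothetical stand-in with the same field names as rasterio's `ds.bounds`.
BoundingBox = namedtuple("BoundingBox", ["left", "bottom", "right", "top"])

def split_by_bounds(coords, bounds):
    """Split (x, y) pairs into those inside and those outside `bounds`."""
    inside, outside = [], []
    for x, y in coords:
        if bounds.left <= x <= bounds.right and bounds.bottom <= y <= bounds.top:
            inside.append((x, y))
        else:
            outside.append((x, y))
    return inside, outside

# Assumed bounds, for illustration only
bounds = BoundingBox(left=700000.0, bottom=4200000.0, right=800000.0, top=4300000.0)
list_of_coords = [(754.2, 4248548.6), (754222.6, 4248548.6), (54.9, 4248548.4)]
inside, outside = split_by_bounds(list_of_coords, bounds)
for xy in outside:
    print(f"coords {xy} out of raster")
```

With a real dataset, passing the dataset's own bounds attribute in place of the stand-in should work the same way, since only the four named fields are read.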

How to detect if a 2D array is inside another 2D array?

So with the help of a stack-overflow member, I have the following code:
data = "needle's (which is a png image) base64 code goes here"
decoded = data.decode('base64')
f = cStringIO.StringIO(decoded)
image = Image.open(f)
needle = image.load()
while True:
    screenshot = ImageGrab.grab()
    haystack = screenshot.load()
    if detectImage(haystack, needle):
        break
    else:
        time.sleep(5)
I've written the following code to check if the needle is in the haystack:
def detectImage(haystack, needle):
    counter = 0
    for hayrow in haystack:
        for haypix in hayrow:
            for needlerow in needle:
                for needlepix in needlerow:
                    if haypix == needlepix:
                        counter += 1
    if counter == 980: #the needle has 980 pixels
        return True
    else:
        return False
The issue is that I get this error for line 3: 'PixelAccess' object is not iterable
It was suggested to me that it would be easier to copy both needle and haystack into a numpy/scipy array. And then I can just use a function that checks to see if the 2D array needle is inside the 2D array haystack.
I need help with:
1) converting those arrays to numpy arrays.
2) a function that checks to see if the 2D array needle is inside the 2D array haystack. My function doesn't work.
These are the images:
Needle:
Haystack:
To convert the image into a numpy array, you should be able to simply do this:
import numpy as np
from PIL import Image
needle = Image.open('needle.png')
haystack = Image.open('haystack.jpg')
needle = np.asarray(needle)
haystack = np.asarray(haystack)
To get you started with finding the needle, note that this will give you a list of all the places where the corner matches:
haystack = np.array([[1,2,3],[3,2,1],[2,1,3]])
needle = np.array([[2,1],[1,3]])
np.where(haystack == needle[0,0])
#(array([0, 1, 2]), row-values
# array([1, 1, 0])) col-values
Then, you can look at all the corner matches, and see if the subhaystack there matches:
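h,w = needle.shape
rows, cols = np.where(haystack == needle[0,0])
for row, col in zip(rows, cols):
    window = haystack[row:row+h, col:col+w]
    # near the edges the window is smaller than the needle; skip those
    if window.shape == needle.shape and np.all(window == needle):
        print("found it at row = %i, col = %i" % (row, col))
        break
else:
    # the else branch of a for loop runs only if the loop never hit break
    print("no needle in haystack")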
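The exact-match search can also be written without an explicit loop over corner candidates. This is a sketch under a few assumptions: both arrays are 2-D (for example grayscale), NumPy is version 1.20 or newer (where sliding_window_view is available), and the name find_exact is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_exact(haystack, needle):
    """Return a list of (row, col) corners where `needle` occurs exactly
    inside `haystack`. Both inputs must be 2-D arrays."""
    haystack = np.asarray(haystack)
    needle = np.asarray(needle)
    h, w = needle.shape
    if h > haystack.shape[0] or w > haystack.shape[1]:
        return []  # the needle cannot fit
    # windows[r, c] is a view of haystack[r:r+h, c:c+w]
    windows = sliding_window_view(haystack, (h, w))
    # A corner is a hit when every element of its window equals the needle
    hits = (windows == needle).all(axis=(-2, -1))
    return [(int(r), int(c)) for r, c in np.argwhere(hits)]

haystack = np.array([[1, 2, 3], [3, 2, 1], [2, 1, 3]])
needle = np.array([[2, 1], [1, 3]])
matches = find_exact(haystack, needle)
print(matches)
```

The comparison materializes one boolean per window element, so for a full screenshot this can use a lot of memory; the corner-first loop above is lighter in that case.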
Below is a more robust version that finds the best match, and if it matches better than some percentage, considers the needle found. Returns the corner coordinate if found, None if not.
def find_needle(needle, haystack, tolerance=.80):
    """ input: PIL.Image objects
    output: (diff, row, col) of the best match if found, else None """
    # convert to grayscale ("L"uminosity) for simplicity; use floats so
    # that subtracting uint8 pixel values cannot wrap around.
    needle = np.asarray(needle.convert('L'), dtype=float)
    haystack = np.asarray(haystack.convert('L'), dtype=float)
    h,w = needle.shape
    H,W = haystack.shape
    L = haystack.max()
    best = (1, None, None) # (diff, row, col)
    rows, cols = np.where(np.abs(haystack - needle[0,0])/L < tolerance)
    for row, col in zip(rows, cols):
        if row+h > H or col+w > W: continue # out of range
        diff = np.mean(np.abs(haystack[row:row+h, col:col+w] - needle))/L
        if diff < best[0]:
            best = (diff, row, col)
    return best if best[0] < tolerance else None
I finally managed to make a numpy-only implementation of a cross correlation search work... The cross-correlation is calculated using the cross-correlation theorem and FFTs.
from __future__ import division
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
def cross_corr(a, b):
    a_rows, a_cols = a.shape[:2]
    b_rows, b_cols = b.shape[:2]
    rows, cols = max(a_rows, b_rows), max(a_cols, b_cols)
    a_f = np.fft.fft2(a, s=(rows, cols), axes=(0, 1))
    b_f = np.fft.fft2(b, s=(rows, cols), axes=(0, 1))
    corr_ab = np.fft.fft2(a_f.conj()*b_f, axes=(0,1))
    return np.rint(corr_ab / rows / cols)
def find_needle(haystack, needle, n=10):
    # convert to float and subtract 128 for better matching
    # (plain float: the np.float alias was removed in NumPy 1.24)
    haystack = haystack.astype(float) - 128
    needle = needle.astype(float) - 128
    target = np.sum(np.sum(needle*needle, axis=0), axis=0)
    corr_hn = cross_corr(haystack, needle)
    delta = np.sum(np.abs(corr_hn - target), axis=-1)
    # `shape` replaces the deprecated `dims` keyword
    return np.unravel_index(np.argsort(delta, axis=None)[:n],
                            shape=haystack.shape[:2])
haystack = np.array(Image.open('haystack.jpg'))
needle = np.array(Image.open('needle.png'))[..., :3]
plt.imshow(haystack, interpolation='nearest')
dy, dx = needle.shape[:2]
candidates = find_needle(haystack, needle, 1)
for y, x in zip(*candidates):
    plt.plot([x, x+dx, x+dx, x, x], [y, y, y+dy,y+dy, y], 'g-', lw=2)
plt.show()
So the highest scoring point is the real needle:
>>> print candidates
(array([553], dtype=int64), array([821], dtype=int64))
You can use matchTemplate in opencv to detect the position:
import cv2
import numpy as np
import pylab as pl
needle = cv2.imread("needle.png")
haystack = cv2.imread("haystack.jpg")
diff = cv2.matchTemplate(haystack, needle, cv2.TM_CCORR_NORMED)
x, y = np.unravel_index(np.argmax(diff), diff.shape)
pl.figure(figsize=(12, 8))
im = pl.imshow(haystack[:,:, ::-1])
ax = pl.gca()
ax.add_artist(pl.Rectangle((y, x), needle.shape[1], needle.shape[0], transform=ax.transData, alpha=0.6))
here is the output:

Python - Remove a row from numpy array?

Hi all, what I want should be really simple for somebody here. I want to remove a row from a numpy array in a loop like:
for i in range(len(self.Finalweight)):
    if self.Finalweight[i] >= self.cutoffOutliers:
        "remove line[i from self.wData"
I'm trying to remove outliers from a dataset. My full code for the method is:
def calculate_Outliers(self):
    def calcWeight(Value):
        pFinal = abs(Value - self.pMed)/ self.pDev_abs_Med
        gradFinal = abs(gradient(Value) - self.gradMed) / self.gradDev_abs_Med
        return pFinal * gradFinal
    self.pMed = median(self.wData[:,self.yColum-1])
    self.pDev_abs_Med = median(abs(self.wData[:,self.yColum-1] - self.pMed))
    self.gradMed = median(gradient(self.wData[:,self.yColum-1]))
    self.gradDev_abs_Med = median(abs(gradient(self.wData[:,self.yColum-1]) - self.gradMed))
    self.workingData= self.wData[calcWeight(self.wData)<self.cutoffOutliers]
    self.xData = self.workingData[:,self.xColum-1]
    self.yData = self.workingData[:,self.yColum-1]
I'm getting the following error:
File "bin/dmtools", line 201, in plot_gride
self.calculate_Outliers()
File "bin/dmtools", line 188, in calculate_Outliers
self.workingData= self.wData[calcWeight(self.wData)>self.cutoffOutliers]
ValueError: too many indices for array
There is actually a tool in NumPy specifically made to mask out outliers and invalid data points: masked arrays. Example from the linked page:
x = numpy.array([1, 2, 3, -1, 5])
mx = numpy.ma.masked_array(x, mask=[0, 0, 0, 1, 0])
print mx.mean()
prints
2.75
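If the rows really need to be removed rather than masked, boolean indexing with a 1-D mask (one entry per row) is the usual approach. The "too many indices" error in the question is likely caused by the mask having more dimensions than the data, since the weight function is applied to the whole 2-D array instead of a single column. Below is a minimal sketch with made-up data; drop_outlier_rows and the simplified weight (without the gradient term) are assumptions for illustration.

```python
import numpy as np

def drop_outlier_rows(data, weights, cutoff):
    """Return the rows of `data` whose weight is below `cutoff`."""
    data = np.asarray(data)
    weights = np.asarray(weights)
    if weights.ndim != 1 or weights.shape[0] != data.shape[0]:
        raise ValueError("weights must be 1-D with one entry per row of data")
    # A 1-D boolean mask selects whole rows of a 2-D array
    return data[weights < cutoff]

# Assumed data: column 0 is x, column 1 is y; the third row is an outlier
data = np.array([[0.0, 1.0], [1.0, 1.1], [2.0, 9.0], [3.0, 0.9]])
y = data[:, 1]
med = np.median(y)
mad = np.median(np.abs(y - med))  # median absolute deviation
weights = np.abs(y - med) / mad   # simplified weight, one value per row
cleaned = drop_outlier_rows(data, weights, cutoff=5.0)
print(cleaned)
```

Computing the weights from the y column alone keeps the mask 1-D, which is what row selection on a 2-D array expects.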
