Improving performance iterating in 2d numpy array

I have two 2D NumPy arrays (images).
The first, image, stores the sum of a movement at each pixel (i, j).
The second, nbCameras, stores the number of cameras that can see a movement at that pixel (i, j).
I want to create a third image, imgFinal, which keeps the value of pixel (i, j) and its neighbours (a 3 x 3 mask) only if the number of cameras that can see pixel (i, j) is greater than 1.
For now I'm using two for loops, which is slow. I'd like to speed up the computation, but I haven't found a good way to do it yet.
I'm also stuck because I want to keep the neighbours of pixel (i, j).
I also tried numpy.vectorize, but I can't keep the neighbours of my pixel in that case.
What would be the best way to increase the speed of this function?
Thanks for your help!
import numpy as np

maskWidth = 3
dstCenterMask = (maskWidth - 1) // 2  # distance from the mask centre to its border
imgFinal = np.zeros(image.shape, dtype=np.float32)
for j in range(dstCenterMask, image.shape[0] - dstCenterMask):
    for i in range(dstCenterMask, image.shape[1] - dstCenterMask):
        if nbCameras[j, i] > 1:
            # copy the whole neighbourhood centred on (j, i)
            imgFinal[j - dstCenterMask : j + dstCenterMask + 1, i - dstCenterMask : i + dstCenterMask + 1] = \
                image[j - dstCenterMask : j + dstCenterMask + 1, i - dstCenterMask : i + dstCenterMask + 1]

This becomes quite elegant with the binary_dilation function from skimage.morphology. It takes a binary array and expands every True pixel into a 3x3 block of True values (or any other size). It also handles pixels at the edges, which I think your implementation does not.
With this mask, imgFinal is easy to calculate:
from skimage.morphology import binary_dilation, square
mask = binary_dilation(nbCameras > 1, square(maskWidth))
imgFinal = np.where(mask, image, 0)
square(3) is just shorthand for np.ones((3,3))
http://scikit-image.org/docs/dev/api/skimage.morphology.html?highlight=dilation#skimage.morphology.dilation
Example use of dilation, for a better explanation of what it does:
In [27]: a
Out[27]:
array([[ 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0.]])
In [28]: binary_dilation(a, square(3))
Out[28]:
array([[1, 1, 0, 0, 0],
[1, 1, 0, 0, 0],
[0, 0, 1, 1, 1],
[0, 0, 1, 1, 1],
[0, 0, 1, 1, 1]], dtype=uint8)
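To make the dilation step concrete, here is a minimal NumPy-only sketch of the same idea, assuming a square structuring element of ones. The helper name dilate_square is hypothetical and is not part of skimage; it pads the mask and ORs together the shifted copies, which should match binary_dilation with square(width) for odd widths.

```python
import numpy as np

def dilate_square(mask, width=3):
    """Dilate a boolean mask with a width x width square of ones (hypothetical helper)."""
    r = width // 2
    rows, cols = mask.shape
    # Pad with False so that neighbourhoods at the edges are handled too.
    padded = np.pad(mask.astype(bool), r, mode="constant", constant_values=False)
    out = np.zeros(mask.shape, dtype=bool)
    # OR together every shifted copy of the mask inside the window.
    for dj in range(width):
        for di in range(width):
            out |= padded[dj:dj + rows, di:di + cols]
    return out

# Same input as the example above: ones at (0, 0) and (3, 3).
a = np.zeros((5, 5))
a[0, 0] = 1
a[3, 3] = 1
dilated = dilate_square(a > 0)

# Keep image values only where the dilated mask is True.
image = np.arange(25, dtype=float).reshape(5, 5)
imgFinal = np.where(dilated, image, 0)
```

The loop runs width * width times regardless of the image size, so the per-pixel work stays inside NumPy.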

Option 1: Try to rewrite the code in a vectorized way. You could convolve with a 3x3 mask like this:
import numpy as np
from scipy.signal import convolve2d
image = np.random.random((100,100))
nbCameras = np.abs(np.random.normal(size=(100,100)).round())
maskWidth = 3
mask = np.ones((maskWidth, maskWidth))
# np.float and np.bool were removed in NumPy 1.24; use the builtins instead
visibilityMask = (nbCameras > 1).astype(float)
visibilityMask = convolve2d(visibilityMask, mask, mode="same").astype(bool)
imgFinal = image.copy()
imgFinal[~visibilityMask] *= 0
import matplotlib.pyplot as plt
for i, (im, title) in enumerate([(image, "image"),
                                 (nbCameras, "nbCameras"),
                                 (visibilityMask, "visibilityMask"),
                                 (imgFinal, "imgFinal")]):
    plt.subplot(2, 2, i + 1)
    plt.title(title)
    plt.imshow(im, cmap=plt.cm.gray)
plt.show()
This produces a 2 x 2 grid of plots showing image, nbCameras, visibilityMask and imgFinal.
Option 2: Use Numba. It compiles Python functions just in time to machine code and is especially useful for speeding up loops.
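As a rough sketch of the Numba option, the original double loop can be moved into a function and decorated with numba.njit. The function name keep_neighbourhoods is hypothetical, and the try/except fallback is only there so the sketch still runs (slowly) where Numba is not installed. Like the original loops, it skips cameras on the border.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: Numba may be missing; fall back to plain Python.
    def njit(func):
        return func

@njit
def keep_neighbourhoods(image, nb_cameras, radius):
    """Copy the neighbourhood of every pixel seen by more than one camera."""
    rows, cols = image.shape
    out = np.zeros_like(image)
    for j in range(radius, rows - radius):
        for i in range(radius, cols - radius):
            if nb_cameras[j, i] > 1:
                for dj in range(-radius, radius + 1):
                    for di in range(-radius, radius + 1):
                        out[j + dj, i + di] = image[j + dj, i + di]
    return out

image = np.arange(25, dtype=np.float64).reshape(5, 5) + 1.0
nb_cameras = np.zeros((5, 5), dtype=np.int64)
nb_cameras[2, 2] = 2  # only the centre pixel is seen by two cameras
img_final = keep_neighbourhoods(image, nb_cameras, 1)
```

The first call includes compilation time, so any timing comparison should be made on a second call.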

This doesn't handle cameras on the edge of the array, but neither does your code:
import numpy as np
from numpy.lib.stride_tricks import as_strided
rows, cols, mask_width = 10, 10, 3
mask_radius = mask_width // 2
image = np.random.rand(rows, cols)
nb_cameras = np.random.randint(3 ,size=(rows, cols))
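# One (mask_width x mask_width) window per interior pixel. Limiting the view to
# the interior keeps as_strided from reading or writing past the end of the
# buffer, and makes its leading dimensions match copy_mask.
view_shape = (rows - 2 * mask_radius, cols - 2 * mask_radius,
              mask_width, mask_width)
image_view = as_strided(image, shape=view_shape, strides=image.strides * 2)
img_final = np.zeros_like(image)
img_final_view = as_strided(img_final, shape=view_shape,
                            strides=img_final.strides * 2)
# copy_mask[j, i] is True when the window centred on
# (j + mask_radius, i + mask_radius) must be copied
copy_mask = nb_cameras[mask_radius:-mask_radius,
                       mask_radius:-mask_radius] > 1
img_final_view[copy_mask] = image_view[copy_mask]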
After running the above code:
>>> nb_cameras
array([[0, 2, 1, 0, 2, 0, 1, 2, 1, 0],
[0, 1, 1, 1, 1, 2, 1, 1, 2, 1],
[1, 2, 2, 2, 1, 2, 1, 0, 2, 0],
[0, 2, 2, 0, 1, 2, 1, 0, 1, 0],
[1, 2, 0, 1, 2, 0, 1, 0, 0, 2],
[2, 0, 1, 1, 1, 1, 1, 1, 0, 1],
[1, 0, 2, 2, 0, 1, 1, 1, 0, 0],
[0, 0, 1, 0, 1, 0, 1, 0, 2, 2],
[0, 1, 0, 1, 1, 2, 1, 1, 2, 2],
[2, 2, 0, 1, 0, 0, 1, 2, 1, 0]])
>>> np.round(img_final, 1)
array([[ 0. , 0. , 0. , 0. , 0.7, 0.5, 0.6, 0.5, 0.6, 0.9],
[ 0.1, 0.6, 1. , 0.2, 0.3, 0.6, 0. , 0.2, 0.9, 0.9],
[ 0.2, 0.3, 0.3, 0.5, 0.2, 0.3, 0.4, 0.1, 0.7, 0.5],
[ 0.9, 0.1, 0.7, 0.8, 0.2, 0.9, 0.9, 0.1, 0.3, 0.3],
[ 0.8, 0.8, 1. , 0.9, 0.2, 0.5, 1. , 0. , 0. , 0. ],
[ 0.2, 0.3, 0.5, 0.4, 0.6, 0.2, 0. , 0. , 0. , 0. ],
[ 0. , 0.2, 1. , 0.2, 0.8, 0. , 0. , 0.7, 0.9, 0.6],
[ 0. , 0.2, 0.9, 0.9, 0.3, 0.4, 0.6, 0.6, 0.3, 0.6],
[ 0. , 0. , 0. , 0. , 0.8, 0.8, 0.1, 0.7, 0.4, 0.4],
[ 0. , 0. , 0. , 0. , 0. , 0.5, 0.1, 0.4, 0.3, 0.9]])
Another option, to manage the edges, is to use a convolution function from scipy.ndimage:
import scipy.ndimage
mask = scipy.ndimage.convolve(nb_cameras > 1, np.ones((3,3)),
mode='constant') != 0
img_final[mask] = image[mask]
>>> np.round(img_final, 1)
array([[ 0.6, 0.8, 0.7, 0.9, 0.7, 0.5, 0.6, 0.5, 0.6, 0.9],
[ 0.1, 0.6, 1. , 0.2, 0.3, 0.6, 0. , 0.2, 0.9, 0.9],
[ 0.2, 0.3, 0.3, 0.5, 0.2, 0.3, 0.4, 0.1, 0.7, 0.5],
[ 0.9, 0.1, 0.7, 0.8, 0.2, 0.9, 0.9, 0.1, 0.3, 0.3],
[ 0.8, 0.8, 1. , 0.9, 0.2, 0.5, 1. , 0. , 0.3, 0.8],
[ 0.2, 0.3, 0.5, 0.4, 0.6, 0.2, 0. , 0. , 0.7, 0.6],
[ 0.2, 0.2, 1. , 0.2, 0.8, 0. , 0. , 0.7, 0.9, 0.6],
[ 0. , 0.2, 0.9, 0.9, 0.3, 0.4, 0.6, 0.6, 0.3, 0.6],
[ 0.4, 1. , 0.8, 0. , 0.8, 0.8, 0.1, 0.7, 0.4, 0.4],
[ 0.9, 0.5, 0.8, 0. , 0. , 0.5, 0.1, 0.4, 0.3, 0.9]])

Related

How to replace all the elements of a numpy array?

Given a numpy array with multiple arrays inside, how do I replace all the values of the array with values from another array?
For example:
import numpy
first_array = numpy.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
second_array = numpy.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6,
0.7, 0.8, 0.9, 1])
Given these arrays, How do I replace 1,2 with 0.1, 0.2 and etc?

Use np.reshape
# import numpy as np
>>> m
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10]])
>>> n
array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
>>> n.reshape(m.shape)
array([[0.1, 0.2],
[0.3, 0.4],
[0.5, 0.6],
[0.7, 0.8],
[0.9, 1. ]])
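One detail worth spelling out, as a small sketch using the arrays from the question: reshape returns a new float array, whereas writing the values into the existing integer array would silently truncate them. The names replaced and in_place are only illustrative.

```python
import numpy as np

first_array = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
second_array = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1])

# A new float array with the shape of first_array.
replaced = second_array.reshape(first_array.shape)

# Assigning into the integer array keeps its dtype, so the fractions are lost.
in_place = first_array.copy()
in_place[...] = second_array.reshape(first_array.shape)

print(replaced.dtype)  # float64
print(in_place.dtype)  # integer dtype, values truncated towards zero
```

If the result has to live in first_array, converting it with astype(float) first avoids the truncation.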

import numpy as np

first_array = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
second_array = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6,0.7, 0.8, 0.9, 1])
np.set_printoptions(formatter={'float':"{0:0.1f}".format})
# convert to float first, otherwise the assigned values would be truncated
first_array = first_array.astype(float)
for i in range(np.shape(first_array)[0]):
    for j in range(np.shape(first_array)[1]):
        # row i, column j of a 2-column array maps to flat index 2*i + j
        first_array[i][j] = second_array[2*i+j]
print(first_array)
Output:
[[0.1 0.2]
[0.3 0.4]
[0.5 0.6]
[0.7 0.8]
[0.9 1.0]]

Index array with the result of .nonzero()

I am having difficulty selecting rows using two conditions in NumPy. The following code does not return the intended output:
tot_length=0.3
steps=0.1
start_val=0.0
list_no =np.arange(start_val, tot_length, steps)
x, y, z = np.meshgrid(*[list_no for _ in range(3)], sparse=True)
a = ((x>=y) & (y>=z)).nonzero() # this maybe the problem
Output:
(array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2]), array([0, 1, 2, 1, 1, 2, 2, 2, 2, 2]), array([0, 0, 0, 0, 1, 0, 1, 0, 1, 2]))
whereas the intended output is:
[[0. 0. 0. ]
[0.1 0. 0. ]
[0.1 0.1 0. ]
[0.1 0.1 0.1]
[0.2 0. 0. ]
[0.2 0.1 0. ]
[0.2 0.1 0.1]
[0.2 0.2 0. ]
[0.2 0.2 0.1]
[0.2 0.2 0.2]]

ndarray.nonzero, like np.where, returns a tuple of index arrays, one per axis. This makes it easy to unpack the indices into separate arrays, each of which can be used to index along its own axis. Stacking them into a 2D array is trivial, though: build a new array from the tuple and transpose it:
ix = np.array(((x>=y) & (y>=z)).nonzero()).T
You can then use the array of indices to index list_no:
list_no[ix]
array([[0. , 0. , 0. ],
[0. , 0.1, 0. ],
[0. , 0.2, 0. ],
[0.1, 0.1, 0. ],
[0.1, 0.1, 0.1],
[0.1, 0.2, 0. ],
[0.1, 0.2, 0.1],
[0.2, 0.2, 0. ],
[0.2, 0.2, 0.1],
[0.2, 0.2, 0.2]])
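The rows above come out in (y, x, z) order because np.meshgrid defaults to indexing="xy", which puts x on the second axis. As a sketch, assuming the goal is the exact row order of the intended output in the question, passing indexing="ij" puts x on the first axis. np.argwhere is used here as a shorthand for stacking and transposing the result of nonzero.

```python
import numpy as np

list_no = np.arange(0.0, 0.3, 0.1)

# indexing="ij" makes axis 0 correspond to x, axis 1 to y, axis 2 to z.
x, y, z = np.meshgrid(list_no, list_no, list_no, sparse=True, indexing="ij")

# One row of (x index, y index, z index) per grid point that passes the test.
ix = np.argwhere((x >= y) & (y >= z))

# Look up the values; each row is (x, y, z) with x >= y >= z.
rows = list_no[ix]
print(rows)
```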

TensorFlow: An alternative to tf.scatter_update

I have two Tensors like this:
template = tf.convert_to_tensor([[1, 0, 0.5, 0.5, 0.3, 0.3],
[1, 0, 0.75, 0.5, 0.3, 0.3],
[1, 0, 0.5, 0.75, 0.3, 0.3],
[1, 0, 0.75, 0.75, 0.3, 0.3]])
patch = tf.convert_to_tensor([[0, 1, 0.43, 0.17, 0.4, 0.4],
[0, 1, 0.18, 0.22, 0.53, 0.6]])
Now I would like to update the second and the last rows of the template with the patch rows to get a value like this:
[[1. 0. 0.5 0.5 0.3 0.3 ]
[0. 1. 0.43 0.17 0.4 0.4 ]
[1. 0. 0.5 0.75 0.3 0.3 ]
[0. 1. 0.18 0.22 0.53 0.6 ]]
With tf.scatter_update it is easy:
var_template = tf.Variable(template)
var_template = tf.scatter_update(var_template, [1, 3], patch)
However, it requires creating a variable. Is there a way to obtain the value using only tensor operations?
I was thinking about tf.where, but then I probably have to broadcast every patch row into the template size and call tf.where for each row.

This should work. It is a bit convoluted, but it uses no variable.
import tensorflow as tf
template = tf.convert_to_tensor([[1, 1, 0.5, 0.5, 0.3, 0.3],
[2, 2, 0.75, 0.5, 0.3, 0.3],
[3, 3, 0.5, 0.75, 0.3, 0.3],
[4, 4, 0.75, 0.75, 0.3, 0.3]])
patch = tf.convert_to_tensor([[1, 1, 1, 0.17, 0.4, 0.4],
[3, 3, 3, 0.22, 0.53, 0.6]])
ind = tf.constant([1,3])
rn_t = tf.range(0, template.shape[0])
def index1d(t, val):
    # 0 if val occurs in t, a very large value otherwise
    return tf.reduce_min(tf.where(tf.equal([t], val)))

def index1dd(t, val):
    # position of val inside t
    return tf.argmax(tf.cast(tf.equal(t, val), tf.int64), axis=0)

r = tf.map_fn(lambda x: tf.where(tf.equal(index1d(ind, x), 0), patch[index1dd(ind, x)], template[x]), rn_t, dtype=tf.float32)
with tf.Session() as sess:
    print(sess.run([r]))

I will add here also my solution. This utility function works pretty much the same as scatter_update, but without using Variables:
def scatter_update_tensor(x, indices, updates):
    '''
    Utility function similar to `tf.scatter_update`, but performing on Tensor
    '''
    x_shape = tf.shape(x)
    patch = tf.scatter_nd(indices, updates, x_shape)
    mask = tf.greater(tf.scatter_nd(indices, tf.ones_like(updates), x_shape), 0)
    return tf.where(mask, patch, x)
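To illustrate the mask-and-select idea behind this function without a TensorFlow session, here is a NumPy analogue. It is only a sketch of the logic, not TensorFlow code: the name scatter_update_array is hypothetical, and plain row indices stand in for the index tensor that tf.scatter_nd expects.

```python
import numpy as np

def scatter_update_array(x, indices, updates):
    """Return a copy of x with the given rows replaced (hypothetical helper)."""
    # Scatter the updates into an array of zeros shaped like x.
    patch = np.zeros_like(x)
    patch[indices] = updates
    # Mark every position that received an update.
    mask = np.zeros(x.shape, dtype=bool)
    mask[indices] = True
    # Take the patch where marked and the original values elsewhere.
    return np.where(mask, patch, x)

template = np.array([[1, 0, 0.5, 0.5, 0.3, 0.3],
                     [1, 0, 0.75, 0.5, 0.3, 0.3],
                     [1, 0, 0.5, 0.75, 0.3, 0.3],
                     [1, 0, 0.75, 0.75, 0.3, 0.3]])
patch_rows = np.array([[0, 1, 0.43, 0.17, 0.4, 0.4],
                       [0, 1, 0.18, 0.22, 0.53, 0.6]])

result = scatter_update_array(template, [1, 3], patch_rows)
print(result)
```

Building the mask from the indices rather than from the update values means that updates containing zeros are still applied.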

Getting the location of value from numpy.where() as single value and append it to another array

I have an array in python created from numpy as:
a = [[1. 0.5 0.3 ... 0.71 0.72 0.73]
[0. 0.4 0.6 ... 0.74 0.75 0.76]
[0. 0.3 0. ... 0.72 0.73 0.74]
...
[0. 0.2 0.3 ... 0.56 0.57 0.58]
[0. 0.1 0.3 ... 0.67 0.68 0.69]]
and another array
b = [[1. 0.5 0.6 ... 0.74 0.75 0.76]]
which I got from np.max(a, axis=0). Now I need the index in array a where the value is equal to the corresponding value in b, for which I used:
locn = []
for i in range(0, len(b[0])):
    for j in range(0, len(a)):
        fav = np.where(a[j][i] == b[0][j])
        locn.append(fav)
print(locn)
I get the output as
[(array([0]),), (array([0]),), (array([0]),), (array([0]),), (array([], dtype=int64),), (array([], dtype=int64),), (array([], dtype=int64),), (array([], dtype=int64),), (array([0]),), (array([0]),), (array([0]),), (array([0]),), (array([], dtype=int64),), ............
I could have used np.where(a == np.max(a)) to get the location of the maximum, but that is not my problem. I need the exact location (like the 1st element of the 1st array, or something like that) and want to append that index to locn. For example, in the first round 1 is the highest value, so I just need to append the index 0 to locn, because 0 is the index at which the element of the inner array equals the maximum value.
How can I do this? Thanks in advance.

You can use the function argmax instead of just max. For example
a = np.random.randint(10, size=(4, 5))
[[8 9 6 4 7]
 [6 4 0 3 6]
 [7 5 9 1 6]
 [1 4 8 8 9]]
np.max(a, axis=0)
array([8, 9, 9, 8, 9])
np.argmax(a, axis=0)
array([0, 0, 2, 3, 3], dtype=int64)
If you want to print the info the way you are describing then you can do
b = np.argmax(a, axis=0)
print('locn'+str(b))
locn[0 0 2 3 3]

Even if the elements to find are not the maxima but, for example, randomly chosen, we can still use argmax on a == b.
Example:
# generate random data
>>> n = 10
>>> a = np.round(np.random.random((n, n)), 1)
>>> a
array([[0.3, 0.2, 0.2, 0.4, 0.1, 0.6, 0.8, 0.9, 0.8, 0.1],
[0.7, 1. , 0.1, 0.1, 0.4, 1. , 0.7, 0.8, 0.6, 0.5],
[0.1, 0.5, 1. , 0.4, 0.6, 0.8, 0.9, 0.3, 0.2, 0.4],
[0.2, 0.6, 0.2, 0. , 0.7, 0.8, 0.9, 0.6, 0. , 0.1],
[0.4, 0. , 0.8, 0.2, 0.1, 0.8, 0.2, 0.6, 0.1, 0. ],
[0.1, 0.2, 0.4, 0.4, 0. , 0.6, 0.6, 0.9, 0.6, 0.3],
[0.9, 1. , 0.8, 0.8, 0.3, 0.5, 0.5, 0.2, 0.4, 0.7],
[0.5, 0.5, 0.2, 0.8, 0.8, 0.1, 0.7, 0.5, 0.9, 0.5],
[0. , 0.4, 0.5, 0.5, 0.6, 0.2, 0.5, 0.9, 0.6, 0.9],
[0.8, 0.5, 0.1, 0.9, 0.7, 0.1, 0.8, 0. , 0.9, 0.8]])
# randomly pick an index each column
>>> choice = np.random.randint(0, n, (n,))
>>>
# retrieve values at chosen locations
>>> b = a[choice, range(n)]
>>> b
array([0.4, 0.2, 0.8, 0.4, 0.6, 0.6, 0.8, 0.9, 0.6, 0.5])
>>>
# now recover `choice`, or if the same as the chosen value occurs
# earlier in that column return the index of the first occurrence.
>>> recover = np.argmax(a==b, axis=0)
>>> recover
array([4, 0, 4, 0, 2, 0, 0, 0, 1, 1])
>>>
# check result:
>>> recover <= choice
array([ True, True, True, True, True, True, True, True, True,
True])
>>> a[recover, range(n)] == b
array([ True, True, True, True, True, True, True, True, True,
True])
As a nice little bonus, this takes advantage of the fact that max/argmax short-circuits on booleans (a == b is, however, still evaluated everywhere):
>>> timeit('np.argmax(x)', globals={'np': np, 'x': np.ones(1000000, bool)}, number=100000)
0.10291801800121902
>>> timeit('np.argmax(x)', globals={'np': np, 'x': np.zeros(1000000, bool)}, number=100000)
4.172021539001435
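One caveat with argmax on a == b: for a column with no match at all, argmax still returns 0, which is indistinguishable from a match in the first row. A minimal sketch of a guard, using a small made-up array and -1 as an arbitrary "not found" marker:

```python
import numpy as np

a = np.array([[0.3, 0.2, 0.5],
              [0.7, 0.2, 0.1],
              [0.1, 0.9, 0.4]])
b = np.array([0.7, 0.2, 0.8])  # 0.8 does not occur in the last column

matches = a == b
first_match = np.argmax(matches, axis=0)  # 0 for columns without any match
found = matches.any(axis=0)               # True where the column has a match

# Replace the misleading 0 with -1 where nothing was found.
safe_index = np.where(found, first_match, -1)
print(safe_index)  # [ 1  0 -1]
```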

Histogram in matplotlib work incorrect

For example:
import matplotlib.pyplot as plt
data = [0.6, 0.8, 0.4, 0.2, 0.6, 0.8, 0.4, 0.2]
plt.hist(data, bins=20, range=[0.0, 1.0], normed=True)
plt.show()
After this I get a histogram where the frequency of every item is about 5, not 0.25. How can I fix this?

You can check the histogram result by capturing the return value of plt.hist:
out = plt.hist(data, bins=20)
print(out)
which prints:
(array([2, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 2]),
array([0.2 , 0.23, 0.26, 0.29, 0.32, 0.35, 0.38, 0.41, 0.44, 0.47,
0.5 , 0.53, 0.56, 0.59, 0.62, 0.65, 0.68, 0.71, 0.74, 0.77, 0.8 ]),
<a list of 20 Patch objects>)
which is correct.
also:
>>> plt.hist(data, bins=4)
(array([ 2., 2., 2., 2.]), array([ 0.2 , 0.35, 0.5 , 0.65, 0.8 ]),
<a list of 4 Patch objects>)
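The value 5 in the question is what a normalised histogram is expected to give: normed=True (called density=True in later Matplotlib releases) scales the bars so that their total area is 1, and each occupied bin holds a quarter of the data in a bin of width 0.05, so 0.25 / 0.05 = 5. A sketch with np.histogram, which takes the same bins, range and weights arguments, shows both this and one way to get the fraction 0.25 directly:

```python
import numpy as np

data = [0.6, 0.8, 0.4, 0.2, 0.6, 0.8, 0.4, 0.2]

# Density: bar height times bin width sums to 1 over all bins.
density, edges = np.histogram(data, bins=20, range=(0.0, 1.0), density=True)
bin_width = edges[1] - edges[0]
print(density[density > 0])          # heights of the occupied bins
print((density * bin_width).sum())   # total area

# Fractions: weight every sample by 1 / N so the bar heights sum to 1.
weights = np.full(len(data), 1.0 / len(data))
fraction, _ = np.histogram(data, bins=20, range=(0.0, 1.0), weights=weights)
print(fraction[fraction > 0])
```

Passing the same weights array to plt.hist, without normalisation, should draw bars of height 0.25.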
