Vectorizing loops in NumPy

I am trying to vectorize a loop iteration using NumPy but am struggling to achieve the desired results. I have an array of pixel values, so 3 dimensions, say (512,512,3) and need to iterate each x,y and calculate another value using a specific index in the third dimension. An example of this code in a standard loop is as follows:
for i in xrange(width):
    for j in xrange(height):
        temp = math.sqrt((scalar1-array[j,i,1])**2+(scalar2-array[j,i,2])**2)
What I am currently doing is this:
temp = np.sqrt((scalar1-array[:,:,1])**2+(scalar2-array[:,:,2])**2)
The temp array I get from this is the desired dimensions (x,y) but some of the values differ from the loop implementation. How can I eliminate the loop to compute this example efficiently in NumPy?
Thanks in advance!
Edit:
Here is code that is giving me differing results for temp and temp2, obviously temp2 is just the calculation for one cell
temp = np.sqrt((cb_key-fg_cbcr_array[:,:,1])**2+(cr_key-fg_cbcr_array[:,:,2])**2)
temp2 = np.sqrt((cb_key-fg_cbcr_array[500,500,1])**2+(cr_key-fg_cbcr_array[500,500,2])**2)
print temp[500, 500]
print temp2
The output for the above is
12.039
94.069123521
The scalars are definitely initialized and the array is generated from an image using
fg = PIL.Image.open('fg.jpg')
fg_cbcr = fg.convert("YCbCr")
fg_cbcr_array = np.array(fg_cbcr)
Edit2:
Ok so I have tracked it down to a problem with my array. Not sure why yet but it works when the array is generated with np.random.random but not when loading from a file using PIL as above.

Your vectorized solution is correct. A few notes:
In your for loop, temp is a scalar that is overwritten on every iteration, so it ends up holding only the last value.
Use np.sqrt instead of math.sqrt for vectorized inputs; math.sqrt only works on single values.
Avoid array as a variable name, since it shadows NumPy's array function if you have imported it directly (for example with from numpy import *).
I checked with the following code, which may give you a hint about where the error comes from:
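import numpy as np
width = 512
height = 512
scalar1 = 1
scalar2 = 2
a = np.random.random((height, width, 3))
# Loop version: store one result per pixel instead of overwriting a scalar
tmp = np.zeros((height, width))
for i in range(width):
    for j in range(height):
        tmp[j, i] = np.sqrt((scalar1 - a[j, i, 1])**2 + (scalar2 - a[j, i, 2])**2)
# Vectorized version: operate on whole channel slices at once
tmp2 = np.sqrt((scalar1 - a[:, :, 1])**2 + (scalar2 - a[:, :, 2])**2)
print(np.allclose(tmp, tmp2))  # True when both versions agree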
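Since the mismatch only appears with the array loaded through PIL, one thing that may be worth checking is the dtype. np.array applied to a PIL image typically gives a uint8 array, and uint8 arithmetic wraps around modulo 256 instead of going negative. The sketch below is not a confirmed diagnosis of the original problem: img, cb_key and cr_key are made-up stand-ins, and casting to float before subtracting is simply a common remedy.

```python
import math
import numpy as np

# uint8 subtraction wraps around instead of producing a negative number
a = np.array([10], dtype=np.uint8)
b = np.array([20], dtype=np.uint8)
wrapped = a - b  # 10 - 20 wraps to 246 in uint8

# Stand-in for an image array loaded through PIL (assumed uint8, made-up values)
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
cb_key, cr_key = 100.0, 150.0  # hypothetical key values

# Cast once to float so the differences and squares cannot wrap or overflow
img_f = img.astype(np.float64)
dist = np.sqrt((cb_key - img_f[:, :, 1])**2 + (cr_key - img_f[:, :, 2])**2)

# Reference: the explicit double loop, computed with plain Python floats
ref = np.empty(img.shape[:2])
for j in range(img.shape[0]):
    for i in range(img.shape[1]):
        cb = float(img[j, i, 1])
        cr = float(img[j, i, 2])
        ref[j, i] = math.sqrt((cb_key - cb)**2 + (cr_key - cr)**2)

print(wrapped, np.allclose(dist, ref))
```

If the float version agrees with the loop while the uint8 version does not, the dtype is the likely culprit.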

Related

Python: create 3D array using values of another 3D array that meet a condition

I'm basically trying to take the weighted mean of a 3D dataset, but only on a filtered subset of the data, where the filter is based off of another (2D) array. The shape of the 2D data matches the first 2 dimensions of the 3D data, and is thus repeated for each slice in the 3rd dimension.
Something like:
import numpy as np
myarr = np.array([[[4,6,8],[9,3,2]],[[2,7,4],[3,8,6]],[[1,6,7],[7,8,3]]])
myarr2 = np.array([[7,3],[6,7],[2,6]])
weights = np.random.rand(3,2,3)
filtered = []
for k in range(len(myarr[0,0,:])):
    temp1 = myarr[:,:,k]
    temp2 = weights[:,:,k]
    filtered.append(temp1[np.where(myarr2 > 5)]*temp2[np.where(myarr2 > 5)])
average = np.array(np.sum(filtered,1)/len(filtered[0]))
I am concerned about efficiency here. Is it possible to vectorize this so I don't need the loop, or are there other suggestions to make this more efficient?

Setting the loop aside, the most glaring inefficiency is that np.where(...) is called repeatedly inside the loop on the same condition; you can evaluate it once beforehand. Moreover, the loop itself is unnecessary. Your operation is equivalent to:
mask = myarr2 > 5
average = (myarr[mask] * weights[mask]).mean(axis=0)
There is no need for np.where either: a boolean mask can index the arrays directly.
myarr2 is an array of shape (i, j), matching the first two dimensions of myarr and weights, which both have shape (i, j, k).
So if there are n True elements in the boolean mask myarr2 > 5, applying it to the other arrays gives arrays of shape (n, k): wherever the mask is True at a position [i, j], all elements along the third axis are taken.
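As a quick sanity check, the sketch below runs the question's loop and the masked one-liner on the same data and compares them. The seeded random generator is an assumption added here so the weights are reproducible.

```python
import numpy as np

myarr = np.array([[[4, 6, 8], [9, 3, 2]],
                  [[2, 7, 4], [3, 8, 6]],
                  [[1, 6, 7], [7, 8, 3]]])
myarr2 = np.array([[7, 3], [6, 7], [2, 6]])
rng = np.random.default_rng(0)      # seeded only for reproducibility
weights = rng.random((3, 2, 3))

mask = myarr2 > 5                   # shape (3, 2), evaluated once

# Loop version from the question, using the precomputed mask
filtered = []
for k in range(myarr.shape[2]):
    filtered.append(myarr[:, :, k][mask] * weights[:, :, k][mask])
loop_avg = np.sum(filtered, axis=1) / len(filtered[0])

# Vectorized version: boolean indexing yields arrays of shape (n, k)
vec_avg = (myarr[mask] * weights[mask]).mean(axis=0)

print(myarr[mask].shape, np.allclose(loop_avg, vec_avg))
```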

numpy array not broadcastable

This is an example of my error. Say I created a numpy array
X = np.zeros((1000, 50))
where 1000 is the number of features (rows) and 50 is the number of examples (columns).
Since I am adding examples one by one, I have to replace the columns of the array one at a time to get the final feature array. I tried this:
X[:,i] = example
where example is of size (1000, 1), and i is iterated for every example. This does not work because X[:,i] is of shape (1000,), a rank 1 array. How do I code it so that each example replaces a column of the X array without throwing the broadcast error? Thank you.

Reshape your vector before assigning it.
X[:,i] = example.reshape(-1,)
This drops the second dimension and turns example into shape (1000,).
Alternatively, instead of assigning one column at a time in the loop, you can collect all of your arrays in a list, call np.array on the list, and transpose the result so that each array becomes a column. This assumes each array is 1-D, and it works best if you can build the list in a list comprehension.
Example:
arrs = [np.random.randint(10, size=5) for _ in range(5)]
X = np.array(arrs).T
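For examples that really do have shape (1000, 1) as in the question, a minimal sketch of both routes might look like this. The random examples list is a made-up stand-in, and np.hstack is used here in place of np.array(...).T because the pieces are column vectors rather than 1-D arrays.

```python
import numpy as np

n_features, n_examples = 1000, 50
rng = np.random.default_rng(0)

# Stand-in data: each example is a column vector of shape (1000, 1)
examples = [rng.random((n_features, 1)) for _ in range(n_examples)]

# Route 1: flatten each example to shape (1000,) before assigning the column
X = np.zeros((n_features, n_examples))
for i, example in enumerate(examples):
    X[:, i] = example.reshape(-1)

# Route 2: stack the column vectors side by side in one call
X_stacked = np.hstack(examples)

print(X.shape, X_stacked.shape, np.array_equal(X, X_stacked))
```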

Mapping RGB values in an image to a corresponding ID using a dictionary

I am working on a segmentation problem where given an image, each RGB value corresponds to a class label. The problem I have is to efficiently map RGB values from an image (numpy array) to a corresponding class label image.
Let's provide the following simplified example:
color2IdMap
{(100,0,100):0, (0,200,0):2}
labelOld
array([[[100,0,100],
        [0,200,0]],
       [[100,0,100],
        [0,200,0]]], dtype=uint8)
(in a real example color2IdMap will have about 20 entries and labelOld will be an array of shape (1024,512,3))
Now I want the result to be the following mapped array, with shape (1024,512):
labelNew
array([[ 0, 2],
       [ 0, 2]])
I attempted to do this with loops and list comprehensions, but both methods are quite slow (about 10 seconds per image, which adds up over 250K images). I am wondering if there is a faster way of doing it.
Attempted method 1:
labelNew = np.empty((1052,1914), dtype=np.uint8)
for i in range(1052):
    for j in range(1914):
        labelNew[i, j] = color2IdMap[tuple(labelOld[i, j])]
Attempted method 2:
labelNew = [[color2IdMap[tuple(x)] for x in y] for y in labelOld]
So, my question is if there is any faster and more efficient way of doing this?

Here's one approach based on dimensionality-reduction -
# Get keys and values
k = np.array(list(color2IdMap.keys()))
v = np.array(list(color2IdMap.values()))
# Setup scale array for dimensionality reduction
s = 256**np.arange(3)
# Reduce k to 1D
k1D = k.dot(s)
# Get sorted k1D and correspondingly re-arrange the values array
sidx = k1D.argsort()
k1Ds = k1D[sidx]
vs = v[sidx]
# Reduce image to 2D
labelOld2D = np.tensordot(labelOld, s, axes=((-1),(-1)))
# Get the positions of 1D sorted keys and get the corresponding values by
# indexing into re-arranged values array
out = vs[np.searchsorted(k1Ds, labelOld2D)]
Alternatively, we could use sidx as sorter input arg for np.searchsorted to get the final output -
out = v[sidx[np.searchsorted(k1D, labelOld2D, sorter=sidx)]]
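The steps above can be wrapped into a small function and run on the question's sample data, roughly as follows. The names map_colors and missing are hypothetical, and the extra check for colours absent from the map is an addition not present in the answer: np.searchsorted returns an insertion position even for values that are not among the keys, so unmatched pixels would otherwise silently receive some other label.

```python
import numpy as np

def map_colors(label_rgb, color_to_id, missing=-1):
    """Map an (H, W, 3) RGB array to an (H, W) array of class IDs."""
    keys = np.array(list(color_to_id.keys()), dtype=np.int64)
    vals = np.array(list(color_to_id.values()), dtype=np.int64)
    # Collapse each RGB triple into a single integer code
    scale = 256 ** np.arange(3, dtype=np.int64)
    keys_1d = keys.dot(scale)
    # Sort the codes and reorder the IDs to match
    order = keys_1d.argsort()
    keys_sorted = keys_1d[order]
    vals_sorted = vals[order]
    # Collapse the image the same way, then look up each code
    codes = label_rgb.astype(np.int64).dot(scale)
    pos = np.searchsorted(keys_sorted, codes)
    pos = np.clip(pos, 0, len(keys_sorted) - 1)
    out = vals_sorted[pos]
    # Added safeguard: flag pixels whose colour is not in the map
    out[keys_sorted[pos] != codes] = missing
    return out

color2IdMap = {(100, 0, 100): 0, (0, 200, 0): 2}
labelOld = np.array([[[100, 0, 100], [0, 200, 0]],
                     [[100, 0, 100], [0, 200, 0]]], dtype=np.uint8)
labelNew = map_colors(labelOld, color2IdMap)

# A pixel with an unmapped colour gets the `missing` value
odd = labelOld.copy()
odd[0, 0] = (1, 2, 3)
labelOdd = map_colors(odd, color2IdMap)

print(labelNew)
print(labelOdd)
```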

Assume the RGB value map looks like this (stored in a Python dict):
color_dict = {(128,128,128): (255,255,255),
              (128,256,128): (255,128,255),
              }
The RGB remapping can then be done with np.where() and np.all():
cvt_img = np.zeros_like(img)
for rgb_src, rgb_dst in color_dict.items():
    rgb_src = np.array(rgb_src)
    rgb_dst = np.array(rgb_dst)
    # Positions where all three channels match the source colour
    idx = np.where(np.all(img == rgb_src, axis=-1))
    cvt_img[idx] = rgb_dst
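The same per-colour idea can be applied to the original label-ID problem. This is a minimal sketch using the question's sample data, with a boolean mask used directly instead of np.where. It loops once per map entry rather than once per pixel, which should be acceptable for a map of about 20 colours.

```python
import numpy as np

color2IdMap = {(100, 0, 100): 0, (0, 200, 0): 2}
labelOld = np.array([[[100, 0, 100], [0, 200, 0]],
                     [[100, 0, 100], [0, 200, 0]]], dtype=np.uint8)

# One 2D label per pixel; pixels matching no colour keep the initial 0
labelNew = np.zeros(labelOld.shape[:2], dtype=np.uint8)
for rgb, class_id in color2IdMap.items():
    # True where all three channels equal this colour
    mask = np.all(labelOld == np.array(rgb), axis=-1)
    labelNew[mask] = class_id

print(labelNew)
```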

numpy array size vs. speed of concatenation

I am concatenating data to a numpy array like this:
xdata_test = np.concatenate((xdata_test,additional_X))
This is done a thousand times. The arrays have dtype float32, and their sizes are shown below:
xdata_test.shape : (x1,40,24,24) (x1 : [500~10500])
additional_X.shape : (x2,40,24,24) (x2 : [0 ~ 500])
The problem is that when x1 is larger than ~2000-3000, the concatenation takes a lot longer.
The graph below plots the concatenation time versus the size of the x2 dimension:
Is this a memory issue or a basic characteristic of numpy?

As far as I understand numpy, none of the stack and concatenate functions are particularly efficient. There is a good reason for that: numpy tries to keep array memory contiguous for efficiency (see this link about contiguous arrays in numpy).
That means that every concatenate operation has to copy the whole data every time. When I need to concatenate a bunch of elements together I tend to do this:
chunks = []
for additional_X in ...:
    chunks.append(additional_X)
xdata_test = np.concatenate(chunks)
That way, the costly operation of moving the whole data is only done once.
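To illustrate that the two patterns produce the same array, here is a small sketch with made-up batch sizes and a reduced trailing shape (the batches list and its shapes are assumptions, not the asker's data). It checks equality only and makes no timing claim.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical batches; the first axis varies, the rest is fixed
batches = [rng.random((n, 4, 3, 3)).astype(np.float32) for n in (5, 0, 2, 7)]

# Pattern from the question: grow the array on every iteration,
# which copies everything accumulated so far each time
grown = np.empty((0, 4, 3, 3), dtype=np.float32)
for batch in batches:
    grown = np.concatenate((grown, batch))

# Pattern from the answer: collect first, concatenate once at the end
joined = np.concatenate(batches)

print(joined.shape, np.array_equal(grown, joined))
```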
NB : would be interested in the speed improvement that gives you.

If you have all the arrays you want to concatenate in advance, I would suggest creating a new array with the total shape and filling it with the small arrays rather than concatenating repeatedly, as every concatenation operation needs to copy the whole data to a new contiguous block of memory.
First, calculate the total size of the first axis:
max_x = 0
for arr in list_of_arrays:
    max_x += arr.shape[0]
Second, create the end container:
final_data = np.empty((max_x,) + xdata_test.shape[1:], dtype=xdata_test.dtype)
which gives the shape (max_x, 40, 24, 24), with the trailing dimensions and dtype taken from the existing array rather than hard-coded.
Last, fill the numpy array:
curr_x = 0
for arr in list_of_arrays:
    final_data[curr_x:curr_x+arr.shape[0]] = arr
    curr_x += arr.shape[0]
The loop above copies each array into its own slice along the first axis of the larger, preallocated array.
By doing this, each of the N arrays is copied straight to its final destination, rather than creating a temporary array for each concatenation.
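Put together, the three steps might look like the sketch below. The small random list_of_arrays is a made-up stand-in, and the shape and dtype are taken from its first element, which assumes the list is non-empty and all arrays share trailing dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical input: arrays that differ only in the size of the first axis
list_of_arrays = [rng.random((n, 4, 3, 3)).astype(np.float32) for n in (5, 2, 7)]

# Step 1: total length along the first axis
total = sum(arr.shape[0] for arr in list_of_arrays)

# Step 2: allocate the destination once
first = list_of_arrays[0]
final_data = np.empty((total,) + first.shape[1:], dtype=first.dtype)

# Step 3: copy each array into its slice of the destination
offset = 0
for arr in list_of_arrays:
    n = arr.shape[0]
    final_data[offset:offset + n] = arr
    offset += n

print(final_data.shape, np.array_equal(final_data, np.concatenate(list_of_arrays)))
```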

Applying a mask for speeding up various array calculations

I have a np.ndarray with numbers that indicate spots of interest; I am interested in the spots which have the values 1 and 9.
Right now they are being extracted as such:
maskindex.append(np.where(extract.variables['mask'][0] == 1) or np.where(megadatalist[0].variables['mask'][0] == 9))
xval = maskindex[0][1]
yval = maskindex[0][0]
I need to apply these x and y values to the arrays that I am operating on, to speed things up.
I have 140 arrays that are each 734 x 1468, I need the mean, max, min, std calculated for each field. And I was hoping there was an easy way for applying the masked array to speed up the operations, right now I am simply doing it on the entire arrays as such:
Average_List = np.mean([megadatalist[i].variables['analysed_sst'][0] for i in range(0,Numbers_of_datasets)], axis=0)
Average_Error_List = np.mean([megadatalist[i].variables['analysis_error'][0] for i in range(0,Numbers_of_datasets)], axis=0)
Std_List = np.std([megadatalist[i].variables['analysed_sst'][0] for i in range(0,Numbers_of_datasets)], axis=0)
Maximum_List = np.maximum.reduce([megadatalist[i].variables['analysed_sst'][0] for i in range(0,Numbers_of_datasets)])
Minimum_List = np.minimum.reduce([megadatalist[i].variables['analysed_sst'][0] for i in range(0,Numbers_of_datasets)])
Any ideas on how to speed things up would be highly appreciated

I may have solved it partially, depending on what you're aiming for. The following code reduces an array arr to a 1d array with only the relevant indices. You can then do the needed calculations without considering the unwanted locations.
arr = np.array([[0,9,9,0,0,9,9,1],[9,0,1,9,0,0,0,1]])
target = [1,9] # wanted values
index = np.where(np.isin(arr, target))  # np.isin keeps the shape of arr, so no ravel/reshape is needed
no_zeros = arr[index]
At this stage "all you need" is to reinsert the values "no_zeros" into an array of zeros with the appropriate shape, at the indices given in "index". One way is to flatten the index array and recalculate the indices, so that they match a flattened arr array. Then use numpy.insert(np.zeros(arr.shape),new_index,no_zeros) and reshape to the appropriate shape afterwards. Reshaping is usually constant time in numpy, since it returns a view when possible. Admittedly, I have not figured out a fast numpy way to create the new_index array.
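For the reinsertion step, a possibly simpler route is to assign through the boolean mask directly, which avoids building a new_index array. Note that numpy.insert grows the array rather than overwriting positions, so plain indexed assignment may be the better fit. Separately, in the question's extraction line, np.where(...) or np.where(...) evaluates to the first tuple only, because a non-empty tuple is truthy, so np.isin is used here to select both values. The fields stack below is a made-up stand-in for the 140 datasets.

```python
import numpy as np

arr = np.array([[0, 9, 9, 0, 0, 9, 9, 1],
                [9, 0, 1, 9, 0, 0, 0, 1]])
target = [1, 9]                       # wanted mask values
mask = np.isin(arr, target)           # boolean array, same shape as arr

# Hypothetical stack of 5 datasets with the same 2D shape as the mask
rng = np.random.default_rng(0)
fields = rng.random((5,) + arr.shape)

# Keep only the masked locations: shape (n_datasets, n_selected)
selected = fields[:, mask]

# Statistics across datasets, computed only at the selected locations
mean_sel = selected.mean(axis=0)
std_sel = selected.std(axis=0)
max_sel = selected.max(axis=0)
min_sel = selected.min(axis=0)

# Scatter a result back into a full-size array of zeros
mean_full = np.zeros(arr.shape)
mean_full[mask] = mean_sel

print(selected.shape, mean_full.shape)
```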
Hope it helps.
