I've been trying to find a more efficient way to iterate through an image and split its pixel values on a threshold. While searching online and discussing with some programming friends, I was introduced to the concept of vectorizing a function (particularly using NumPy). After much searching and trial and error, I can't seem to get the hang of it. Can someone give me a link or a suggestion on how to make the following code more efficient?
import numpy as np
import matplotlib.pyplot as plt

Im = plt.imread(img)  # img: path to the image file
Imarray = np.array(Im)

bright_sum = bright_counter = 0
dim_sum = dim_counter = 0

for line in Imarray:
    for pixel in line:
        if pixel <= 20000:
            dim_sum += pixel
            dim_counter += 1
        if pixel > 20000:
            bright_sum += pixel
            bright_counter += 1

bright_mean = bright_sum/bright_counter
dim_mean = dim_sum/dim_counter
Basically, each pixel holds a brightness value between 0 and 30000, and I'm trying to average all pixels below 20000 and all pixels above 20000 separately. The only way I know how to do this is with for loops (which are slow in Python), checking each pixel with if statements.
NumPy supports and encourages vectorization through its arrays and ufuncs. In your case, the input image is already a NumPy array, so those comparisons can be done in one go, in a vectorized manner, to give boolean arrays of the same shape as the input array. Using those boolean arrays to index into the input array selects the valid elements from it. This is called boolean indexing and is a key feature of this kind of vectorized selection.
Finally, we use NumPy's ndarray.mean method, which again operates in a vectorized fashion, to get the mean values of the selected elements.
Thus, to put all those into code, we would have -
bright_mean, dim_mean = Im[Im > 20000].mean(), Im[Im <= 20000].mean()
For this particular problem, from a code-efficiency point of view, it would make more sense to perform the comparison only once. The comparison gives us a boolean array that can then be used twice: once as it is and a second time inverted. Thus, alternatively we would have -
mask = Im > 20000
bright_mean, dim_mean = Im[mask].mean(), Im[~mask].mean()
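As a self-contained illustration of the masking approach (using random data as a stand-in for the actual image, so the numbers are only for demonstration) -
import numpy as np

# stand-in for the image array from the question: brightness values 0..30000
Im = np.random.randint(0, 30001, size=(512, 512))

mask = Im > 20000
bright_mean, dim_mean = Im[mask].mean(), Im[~mask].mean()
print(bright_mean, dim_mean)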
What I'm trying to do now is to get a value out of a 1 x 1 tensor, and I have nearly 6000 of them.
I've tried using eval() and session() so far. The best I could come up with was converting the tensor to NumPy to get the value out of it. But the problem is that it's extremely slow, especially when dealing with a huge amount of data. Is there any fast way to retrieve the data from a tensor?
Just for additional information, this is the part of my code where I'm trying to implement.
cross_IF = []
count = 0
for i in range(len(test_IF)):
    if count % 100 == 0:
        print(count)
    count += 1
    c = keras.losses.categorical_crossentropy(test_IF[i], prediction_IF[i])
    element = keras.backend.eval(tf.reduce_sum(c))
    cross_IF.append(element)
cross_IF is the list that I'll use to stack up values from tensor 'tf.reduce_sum(c)'.
test_IF and prediction_IF are test values and prediction values.
Providing the resolution in the answer section for the benefit of the community.
The issue was that categorical_crossentropy returns a tensor, not a NumPy array.
Converting each categorical_crossentropy result to NumPy format and then appending it to a list took more time.
Instead, keeping all of the data's cross entropies in tensor form and converting that into NumPy once at the end made it faster.
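A minimal sketch of that batched idea (assuming test_IF and prediction_IF are arrays of shape (n_samples, n_classes), matching the per-sample indexing in the question's loop) -
from tensorflow import keras

# one vectorized loss call over the whole batch instead of a per-sample Python loop;
# categorical_crossentropy reduces over the class axis, giving a tensor of shape (n_samples,)
c_all = keras.losses.categorical_crossentropy(test_IF, prediction_IF)

# a single tensor-to-NumPy conversion at the very end
cross_IF = keras.backend.eval(c_all)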
I'm trying to use this 1000 dimension wikipedia word2vec model to analyze some documents.
Using introspection I found out that the vector representation of a word is a 1000-dimension numpy.ndarray; however, whenever I try to create an ndarray to find the nearest words I get a value error:
ValueError: maximum supported dimension for an ndarray is 32, found 1000
and from what I can tell by looking around online 32 is indeed the maximum supported number of dimensions for an ndarray - so what gives? How is gensim able to output a 1000 dimension ndarray?
Here is some example code:
doc = [model[word] for word in text if word in model.vocab]
out = []
n = len(doc[0])
print(n)
print(len(model["hello"]))
print(type(doc[0]))
for i in range(n):
    sum = 0
    for d in doc:
        sum += d[i]
    out.append(sum/n)
out = np.ndarray(out)
which outputs:
1000
1000
<class 'numpy.ndarray'>
ValueError: maximum supported dimension for an ndarray is 32, found 1000
The goal here would be to compute the average vector of all words in the corpus in a format that can be used to find nearby words in the model, so any alternative suggestions to that effect are welcome.
You're calling numpy's ndarray() constructor-function with a list that has 1000 numbers in it – your hand-calculated averages of each of the 1000 dimensions.
The ndarray() function expects its argument to be the shape of the matrix being constructed, so it's trying to create a new matrix of shape (out[0], out[1], ..., out[999]) – and then every individual value inside that matrix would be addressed with a 1000-int set of coordinates. And indeed, numpy arrays can only have 32 independent dimensions.
But even if you reduced the list you're supplying to ndarray() to just 32 numbers, you'd still have a problem, because your 32 numbers are floating-point values, and ndarray() is expecting integral counts. (You'd get a TypeError.)
Along the approach you're trying to take – which isn't quite optimal, as we'll get to below – you really want to create a single vector of 1000 floating-point dimensions. That is, 1000 cell-like values – not out[0] * out[1] * ... * out[999] separate cell-like values.
So a crude fix along the lines of your initial approach could be replacing your last line with either:
result = np.ndarray(len(out))
for i in range(len(out)):
    result[i] = out[i]
But there are many ways to incrementally make this more efficient, compact, and idiomatic – a number of which I'll mention below, even though the best approach, at bottom, makes most of these interim steps unnecessary.
For one, instead of that assignment-loop in my code just above, you could use Python's bracket-indexing assignment option:
result = np.ndarray(len(out))
result[:] = out  # same result as previous 3 lines w/ loop
But in fact, numpy's array() function can essentially create the necessary numpy-native ndarray from a given list, so instead of using ndarray() at all, you could just use array():
result = np.array(out)  # same result as previous 2 lines
But further, numpy's many functions for natively working with arrays (and array-like lists) already include things to do averages-of-many-vectors in a single step (where even the looping is hidden inside very-efficient compiled code or CPU bulk-vector operations). For example, there's a mean() function that can average lists of numbers, or multi-dimensional arrays of numbers, or aligned sets of vectors, and so forth.
This allows faster, clearer, one-liner approaches that can replace your entire original code with something like:
# get a list of available word-vectors
doc = [model[word] for word in text if word in model.vocab]
# average all those vectors
out = np.mean(doc, axis=0)
(Without the axis argument, it'd average together all individual dimension-values, in all slots, into just one single final average number.)
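To then find nearby words from that averaged vector, which is the stated goal, gensim's similar_by_vector() can be used; a short sketch, assuming model exposes the same keyed-vectors interface used above -
# average the available word-vectors and query the model for the closest words
doc = [model[word] for word in text if word in model.vocab]
doc_vector = np.mean(doc, axis=0)
print(model.similar_by_vector(doc_vector, topn=10))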
I'm coming from IDL, so I'm most used to for loops with explicit indexing. I have read about how Python does things differently and that you should just be able to say
for thing in things:
What I can't figure out is: if I have a 4-dimensional array and I want to perform an operation along one dimension of the array, how do I save out the result in a 4-dimensional array and do it in the 'python' way?
I have a 4 dimensional array in time, altitude, latitude, longitude. I want to smooth it using a running mean window of N=9.
Here is the code that I am working with:
KMCM_T = g.variables['temperature'][:, :, :, :]  # K
N = 9
T_bar_run = np.zeros_like(KMCM_T)
for idx, lon in enumerate(KMCM_lon):
    for idy, lat in enumerate(KMCM_lat):
        for idz, lev in enumerate(KMCM_levels):
            T_bar_run[:, idz, idy, idx] = np.convolve(KMCM_T[:, idz, idy, idx], np.ones((N,))/N, mode='same')
In this specific case you could probably use scipy.ndimage.convolve1d:
from scipy.ndimage import convolve1d
T_bar_run = convolve1d(KMCM_T, np.ones(N)/N, axis=0, mode='constant')
The "numpy way of doing things" is avoiding loops because in numerical applications often the overhead of an interpreted loop dwarfs the cost of its payload. This is done by relying on vectorized functions, i.e. functions that apply a certain operation to every cell of its array arguments.
Many such functions act naturally along one or a few dimensions which is why you will frequently encounter the axis keyword argument.
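As a small self-contained illustration of the axis idea (the array shape below is made up for the example) -
import numpy as np
from scipy.ndimage import convolve1d

N = 9
T = np.random.rand(40, 10, 20, 30)  # (time, altitude, latitude, longitude), hypothetical sizes
T_smooth = convolve1d(T, np.ones(N)/N, axis=0, mode='constant')
print(T_smooth.shape)  # (40, 10, 20, 30): same shape, smoothed along the time axis only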
I have an image of the sun; I found the center and radius and now I want to process pixels differently depending on whether they are inside or outside the disk. The ideal solution would be to interpolate the parameters of the processing function, in order to smoothly transition from disk to background.
Here is what I'm doing now:
for index, value in np.ndenumerate(sun_img):
    if distance.euclidean(index, center) > radius:
        sun_img[index] = processing_function(index, value)
Like this it works but it takes forever to compute the image. I'm sure there is a more efficient way to do that. How would you solve this?
Image shape is around (1000, 1000)
Processing_function is basically not doing anything right now: value += 1
The function should be something like a non-linear "step function" with value 0.0 up to the radius and 1.0 from 5 px after it, something like _______/''''''''''''''''''''' , multiplied by the value of the pixel. The ramp should be located at the radius. I want to do this in order to enhance the protuberances.
Here's a vectorized way leveraging NumPy broadcasting -
m,n = sun_img.shape
I,J = np.ogrid[:m,:n]
sq_dist = (I - center[0])**2 + (J - center[1])**2
valid_mask = sq_dist > radius**2
Now, for a processing_function that just adds 1 to the valid places, defined by the IF-conditional, do -
sun_img[valid_mask] += 1
If you need to implement a custom operation with processing_function that needs those row, column indices, use np.where to get those indices and then iterate through the valid elements, like so -
r, c = np.where(valid_mask)
for index in zip(r, c):
    sun_img[index] = processing_function(index, sun_img[index])
If you have a lot of such valid places, then computing r,c might make things slow. In that case, directly use the mask, like so -
for index, value in np.ndenumerate(sun_img):
    if valid_mask[index]:
        sun_img[index] = processing_function(index, value)
Compared to the original code, the benefit is that we have the conditional values pre-computed before going into the loop. The best way again would be to vectorize processing_function itself so that it works on a bigger chunk of data, but that would depend on its implementation.
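As a sketch of that last point, the smooth 0-to-1 transition described in the question (0.0 inside the disk, rising to 1.0 over roughly 5 px beyond the radius) can itself be built with array operations; the linear ramp and its 5 px width are assumptions for illustration -
m, n = sun_img.shape
I, J = np.ogrid[:m, :n]
dist = np.sqrt((I - center[0])**2 + (J - center[1])**2)

# 0.0 up to the radius, linear ramp to 1.0 over the next ~5 px, 1.0 beyond that
ramp_width = 5.0  # assumed transition width in pixels
weight = np.clip((dist - radius) / ramp_width, 0.0, 1.0)

# multiply by the pixel values to enhance everything outside the disk
enhanced = sun_img * weight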
I'm still an amateur when it comes to thinking about how to optimize. I have this section of code that takes in a list of found peaks and finds where these peaks, +/- some value, are located in a multidimensional array. It then adds 1 at their indices in a zeros array. The code works well, but it takes a long time to execute. For instance, it takes close to 45 min to run if ind has 270 values and refVals has a shape of (3050, 3130, 80). I understand that it's a lot of data to churn through, but is there a more efficient way of going about this?
maskData = np.zeros_like(refVals).astype(np.int16)
for peak in ind:
    tmpArr = np.ma.masked_outside(refVals, x[peak]-2, x[peak]+2).astype(np.int16)
    maskData[tmpArr.mask == False] += 1
    tmpArr = None
maskData = np.sum(maskData, axis=2)
Approach #1 : Memory permitting, here's a vectorized approach using broadcasting -
# Create +/-2 limits using ind
r = x[ind[:,None]] + [-2,2]
# Use limits to get inside matches and sum over the iterative and last dim
mask = (refVals >= r[:,None,None,None,0]) & (refVals <= r[:,None,None,None,1])
out = mask.sum(axis=(0,3))
Approach #2 : If running out of memory with the previous one, we could use a loop with NumPy boolean arrays, which should be more efficient than masked arrays. Also, we would perform one more level of sum-reduction, so that we drag less data with us across iterations. Thus, the alternative implementation would look something like this -
out = np.zeros(refVals.shape[:2]).astype(np.int16)
x_ind = x[ind]
for i in x_ind:
    out += ((refVals >= i-2) & (refVals <= i+2)).sum(-1)
Approach #3 : Alternatively, we could replace that limit based comparison with np.isclose in approach #2. Thus, the only step inside the loop would become -
out += np.isclose(refVals,i,atol=2).sum(-1)
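For completeness, Approach #3 in full uses the same setup and loop as Approach #2 with only that inner line swapped in, e.g. -
out = np.zeros(refVals.shape[:2]).astype(np.int16)
x_ind = x[ind]
for i in x_ind:
    # note: np.isclose also applies a small default relative tolerance (rtol=1e-05),
    # so the match window is very slightly wider than a strict +/-2 band
    out += np.isclose(refVals, i, atol=2).sum(-1)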