I have an operation that I'm doing commonly which I'm calling a "jagged-slice" because I don't know the real name for it. It's best explained by example:
a = np.random.randn(50, 10)
entries_of_interest = np.random.randint(10, size = 50) # Vector of 50 indices between 0 and 9
# Now I want the values contained in each row of a at the corresponding index in "entries of interest"
jagged_slice_of_a = a[np.arange(a.shape[0]), entries_of_interest]
# jagged_slice_of_a is now a vector with 50 elements. Good.
Only problem is it's a bit cumbersome to do this a[np.arange(a.shape[0]), entries_of_interest] indexing (it seems silly to have to construct the "np.arange(a.shape[0])" just for the sake of this). I'd like something like the : operator for this, but the : does something else. Is there any more succinct way to do this operation?
Best answer:
No, there is no better way with native numpy. You can create a helper function for this if you want.
This is cumbersome only in the sense that it requires more typing for a task that seems so simple to you.
a[np.arange(a.shape[0]), entries_of_interest]
But as you note, the syntactically simpler a[:, entries_of_interest] has another interpretation in numpy. Choosing a subset of the columns of an array is a more common task than choosing one (random) item from each row.
Your case is just a specialized instance of
a[I, J]
where I and J are 2 arrays of the same shape. In the general case entries_of_interest could be smaller than a.shape[0] (not all the rows), or larger (several items from some rows), or even be 2d. It could even select certain elements repeatedly.
I have found in other SO questions that performing this kind of element selection is faster when applied to a.flat. But that requires some math to construct the I*n+J kind of flat index.
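The I*n+J arithmetic mentioned above can be sketched as follows. This is only an illustration: the names n_cols and flat_index are made up for the example, and the seeded generator is there just to make it reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((50, 10))
J = rng.integers(10, size=50)      # one column index per row
I = np.arange(a.shape[0])          # row indices 0..49

n_cols = a.shape[1]
flat_index = I * n_cols + J        # position of a[I, J] in row-major order

via_flat = a.flat[flat_index]      # index through the flat iterator
via_pairs = a[I, J]                # the original "jagged slice"
```

np.ravel_multi_index((I, J), a.shape) computes the same flat_index without the hand-written arithmetic.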
With your special knowledge of J, constructing I seems extra work, but numpy can't make that kind of assumption. If this selection was more common someone could write a function that wraps your expression
def peter_selection(a, I):
    # check that a.shape[0] == I.shape[0]
    return a[np.arange(a.shape[0]), I]
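For what it's worth, NumPy 1.15 and later also provide np.take_along_axis, which expresses the same per-row pick without building the arange by hand (though it is arguably no shorter). A minimal sketch; the helper name jagged_slice is hypothetical:

```python
import numpy as np

def jagged_slice(a, cols):
    """Pick a[i, cols[i]] for every row i of a 2-D array."""
    cols = np.asarray(cols)
    if cols.shape != (a.shape[0],):
        raise ValueError("need exactly one column index per row")
    # take_along_axis wants an index array with as many dimensions as a
    return np.take_along_axis(a, cols[:, None], axis=1)[:, 0]

a = np.arange(12).reshape(3, 4)       # rows: [0..3], [4..7], [8..11]
picked = jagged_slice(a, [3, 0, 2])
```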
I think that your current method is probably the best way.
You can also use choose for this kind of selection. This is syntactically clearer, but is trickier to get right and potentially more limited. The equivalent with this method would be:
entries_of_interest.choose(a.T)
The elements in jagged_slice_of_a are the diagonal elements of a[:,entries_of_interest]
A slightly less cumbersome way of doing this would therefore be to use np.diagonal to extract them.
jagged_slice_of_a = a[:, entries_of_interest].diagonal()
Related
Is there a faster way to populate a 2d numpy array using the same algorithm (pnoise3 with the same input arguments, notably, i/scale j/scale) seen here? self.world is the np array and it is pretty large (2048,1024) to be traversing like this.
for i in range(self.height):
    for j in range(self.width):
        self.world[i][j] = noise.pnoise3(i/self.noise['scale'],
                                         j/self.noise['scale'],
                                         SEED,
                                         octaves = self.noise['octaves'],
                                         persistence = self.noise['persistence'],
                                         lacunarity = self.noise['lacunarity'],
                                         repeatx= self.width,
                                         repeaty= self.height,
                                         base= 0)
After learning about boolean indexing I was able to get rid of this nested for loop elsewhere in my program and was amazed at how much more efficient it was. Is there any room for improvement above?
I thought about doing something like self.world[self.world is not None] = noise.pnoise3(arg, arg, etc...) but that cannot accommodate the incrementing i and j values. And would setting it to a function output mean every value is the same anyway? I also thought about making a separate array and then combining them, but I still cannot figure out how to reproduce the incrementing i and j values in that scenario.
Also, as an aside, I used self.world[self.world is not None] as an example of a boolean index that would return true for everything but I imagine this is not the best way to do what I want. Is there an obvious alternative I am missing?
If pnoise is perlin noise then there are numpy vectorized implementations.
Here is one.
As it is I do not think you can do it faster. Numpy is fast when it can do the inner loop in C. That is the case for built in numpy functions like np.sin.
Here you have a vector operation where the operation is a python function.
However it could be possible to re-implement the noise function so that it internally uses numpy vectorized functions.
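The incrementing i and j values the question asks about can be reproduced as whole arrays with np.indices. The sketch below assumes an array-aware noise function; vectorized_noise is a hypothetical stand-in (a sum of sines), not a Perlin implementation, and height, width and scale are made-up values.

```python
import numpy as np

height, width, scale = 4, 6, 100.0

def vectorized_noise(x, y):
    # Hypothetical stand-in for an array-aware noise function.
    return np.sin(x) + np.cos(y)

# i[r, c] == r and j[r, c] == c: the loop counters, as full arrays
i, j = np.indices((height, width))
world = vectorized_noise(i / scale, j / scale)

# Reference: the same values from the explicit double loop
looped = np.empty((height, width))
for r in range(height):
    for c in range(width):
        looped[r, c] = vectorized_noise(r / scale, c / scale)
```

As for the aside: self.world is not None evaluates to a single Python bool, not an element-wise mask. To assign to every element of an existing array, world[...] = values (or world[:] = values) is the usual spelling.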
Let's suppose we have a matrix and a list of indexes:
adj_mat = np.array([[1,2,3],
[4,5,6],
[7,8,9]])
indexes = [0,2]
What I want is to sum the rows and columns corresponding to the sub matrix we get by the intersection of the rows and columns of the indexes list. In this case it would be:
sub_matrix = [[1,3],
[7,9]]
result_rows = [4,16]
result_columns = [8,12]
However, I do this calculation many times with the same original matrix and different indexes lists, so I am looking for an efficient solution that does not create the sub matrix on each iteration. My solution so far is (and for columns respectively):
def sum_rows(matrix, indexes):
    sum_r = [0]*len(indexes)
    for i in range(len(indexes)):
        for j in indexes:
            sum_r[i] += matrix.item(indexes[i], j)
    return sum_r
I'm looking for a more efficient algorithm as I remember there is a method which looks like this that sums all rows (or columns?) in the indexes:
matrix.sum(:, indexes)
matrix.sum(indexes, indexes)
I assume what I need is the second line, if it exists. I tried to google it, with or without numpy, but couldn't find the right syntax.
Is there a solution as I described here but I'm just using the wrong syntax? Or any other suggestions for improvement?
IIUC:
import numpy as np
adj_mat = np.array([[1,2,3],
[4,5,6],
[7,8,9]])
indexes = np.array([1, 3]) - 1
sub_matrix = adj_mat[np.ix_(indexes, indexes)]
result_rows, result_columns = sub_matrix.sum(axis=1), sub_matrix.sum(axis=0)
Result:
array([ 4, 16]) # result_rows
array([ 8, 12]) # result_columns
So assuming you made a mistake and you meant indexes = [0,2] and sub_matrix = [[1,3], [7,9]], then this should do what you want
def sum_sub(matrix, indices):
    """
    Returns the sum of each row and column (as a tuple)
    for each index in indices (as an array)
    """
    # note that indexing with np.ix_ is advanced ("fancy") indexing,
    # so sub_mat is a new array holding a copy of the selected entries, not a view
    sub_mat = matrix[np.ix_(indices, indices)]
    return sub_mat.sum(axis=1), sub_mat.sum(axis=0)
sum_row, sum_col = sum_sub(np.arange(1,10).reshape((3,3)), [0,2])
The results of this are
sum_col # --> [ 8 12]
sum_row # --> [ 4 16]
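Whether an indexing result shares memory with the original array can be checked with np.shares_memory. A small sketch using the 3x3 matrix from the question; the variable names are illustrative only:

```python
import numpy as np

adj_mat = np.arange(1, 10).reshape(3, 3)
indexes = [0, 2]

sliced = adj_mat[::2, ::2]                    # basic slicing: a view
fancy = adj_mat[np.ix_(indexes, indexes)]     # advanced indexing: a copy

slice_is_view = np.shares_memory(adj_mat, sliced)
fancy_is_view = np.shares_memory(adj_mat, fancy)
```

Both expressions select the same four entries here, but only the slice avoids allocating a new array.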
Since the point of efficiency was brought up in the question, a little further analysis should probably be done.
First and foremost, the code looks like code to find a matrix inverse using the adjoint matrix. Unless that particular method is important to the project, the standard np.linalg.inv() is almost certainly going to be faster than anything we cook up here. Moreover, in many applications you can get away with solving a system of linear equations rather than finding an inverse and multiplying by it, cutting run times in half or more again.
Second, any discussion of efficient numpy code needs to address views as opposed to copies. Memory allocation, writing to memory, and memory deallocation are all extremely expensive operations when compared with standard floating point arithmetic. That's not to say that they're slow, but you can notice an order of magnitude or two of difference in the speed of memory-efficient code vs nearly anything else. That's the entire premise behind the fastest implementation of persistent homology calculations I know of, among other things.
All of the other answers (at the time of writing) create a copy of the data they're working with, explicitly storing that information in a new variable sub_matrix. It isn't possible to create every fancy-indexed matrix as a view, but oftentimes equivalent operations can be performed.
For example, if this really is a set of computations on adjoint matrices so that your indexes variable consists of all but one of the available indices (in your example, all but the middle index), then instead of explicitly summing over all the intended indices, we can sum over all indices and subtract the one we don't care about. The effect is that all the intermediate matrices are views rather than copies, preventing the expensive memory allocations. On my machine, this is twice as fast for the tiny 3x3 example given and 10x as fast for 500x500 matrices.
bad_row = 1
bad_col = 1
result_rows = (np.sum(adj_mat, axis=1)-adj_mat[:,bad_col])[np.arange(adj_mat.shape[0])!=bad_row]
result_cols = (np.sum(adj_mat, axis=0)-adj_mat[bad_row,:])[np.arange(adj_mat.shape[1])!=bad_col]
Of course, it's even faster if you can use slices to represent whatever you're doing and you don't have to work around the problem with extra operations as I did, but the example you gave doesn't easily permit slices.
I need to do a lot of operations on multidimensional numpy arrays and therefore I am experimenting to find the best approach.
So let's say i have an array like this:
A = np.random.uniform(0, 1, size = 100).reshape(20, 5)
My goal is to get the maximum value (numpy.amax()) of each entry and its index. So A[0] may be something like this:
A[0] = [ 0.64570441 0.31781716 0.07268926 0.84183753 0.72194227]
I want to get the maximum and the index of that maximum [0.84183753][0, 3]. No specific representation of the results needed, just an example. I even need the horizontal index only.
I tried using numpy's nditer object:
A_it = np.nditer(A, flags=['multi_index'], op_flags=['readwrite'])
while not A_it.finished:
    print(np.amax(A_it.value))
    print(A_it.multi_index[1])
    A_it.iternext()
I can access every element of the array and its index over the iterations that way, but I don't seem to be able to bring the numpy.amax() function for each element and the index together syntax-wise. Can I even do it using the nditer object?
Also, in Numpy: Beginner nditer I read that using nditer or using iterations in numpy usually means that I am doing something wrong. But I can't find another convenient way to achieve my goal here without any iterations. Obviously I am a total beginner in numpy and python in general, so any keyword to search for or hint is very much appreciated.
A major problem with nditer is that it iterates over each element, not each row. It's best used as a stepping stone toward a Cython or C rewrite of your code.
If you just want the maximum for each row of your array, a simple iteration or list comprehension will do nicely.
for row in A: print(np.amax(row))
or to turn it back into an array:
np.array([np.amax(row) for row in A])
But you can get the same values by giving amax an axis parameter
np.amax(A,axis=1)
np.argmax identifies the location of the maximum.
np.argmax(A,axis=1)
With the argmax values you could then select the max values as well,
ind=np.argmax(A,axis=1)
A[np.arange(A.shape[0]),ind]
(speed's about the same as repeating the np.amax call).
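Put together on a small hand-written array (the values are made up for the illustration), the argmax route looks like this:

```python
import numpy as np

A = np.array([[0.6, 0.3, 0.1, 0.8, 0.7],
              [0.9, 0.2, 0.5, 0.4, 0.1],
              [0.2, 0.2, 0.7, 0.3, 0.6]])

col_of_max = np.argmax(A, axis=1)                 # column of each row's maximum
row_max = A[np.arange(A.shape[0]), col_of_max]    # the maxima themselves

# (maximum, horizontal index) for every row
pairs = list(zip(row_max.tolist(), col_of_max.tolist()))
```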
For an assignment I have to use different combinations of features belonging to some data, to evaluate a classification system. By features I mean measurements, e.g. height, weight, age, income. So for instance I want to see how well a classifier performs when given just the height and weight to work with, and then the height and age say. I not only want to be able to test what two features work best together, but also what 3 features work best together and would like to be able to generalise this to n features.
I've been attempting this using numpy's mgrid, to create n dimensional arrays, flattening them, and then making arrays that use the same elements from each array to create new ones. Tricky to explain so here is some code and pseudo code:
import numpy as np
def test_feature_combos(data, combinations):
    dimensions = combinations.shape[0]
    grid = np.empty(dimensions)
    for i in range(dimensions):
        grid[i] = combinations[i].flatten()
        # The line above throws a "setting an array element with a sequence" error, which I understand, but this shows my approach.
**Pseudo code begin**
For each element of each element of this new array,
create a new array like so:
[[1,1,2,2],[1,2,1,2]] ---> [[1,1],[1,2],[2,1],[2,2]]
Call this new array combo_indices
Then choose the columns (features) from the data in a loop using:
new_data = data[:, combo_indices[j]]
combinations = np.mgrid[1:5,1:5]
test_feature_combos(data, combinations)
I concede that this approach means a lot of unnecessary combinations due to repeats, however I cannot even implement this so beggars can not be choosers.
Please can someone advise me on how I can either a) implement my approach or b) achieve this goal in a much more elegant way.
Thanks in advance, and let me know if any clarification needs to be made, this was tough to explain.
To generate all combinations of k elements drawn without replacement from a set of size n you can use itertools.combinations, e.g.:
idx = np.array(list(itertools.combinations(range(n), k))) # an (n choose k, k) array of indices
For the special case where k=2 it's often faster to use the indices of the upper triangle of an n x n matrix, e.g.:
idx = np.vstack(np.triu_indices(n, 1)).T
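Applied to the feature-selection problem in the question, the combinations can be used directly as column indices. A minimal sketch; feature_subsets is a hypothetical helper and the data array is a toy stand-in for real measurements:

```python
import itertools
import numpy as np

def feature_subsets(data, k):
    """Yield (column_indices, sub_array) for every k-column subset of data."""
    n_features = data.shape[1]
    for cols in itertools.combinations(range(n_features), k):
        yield cols, data[:, list(cols)]

data = np.arange(12).reshape(3, 4)      # 3 samples, 4 features (toy data)
pairs = list(feature_subsets(data, 2))
```

Looping k over range(1, data.shape[1] + 1) would cover every subset size, with the classifier evaluated on each sub_array in turn.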
I have a problem with some numpy stuff. I need a numpy array to behave in an unusual manner by returning a slice as a view of the data I have sliced, not a copy. So here's an example of what I want to do:
Say we have a simple array like this:
a = array([1, 0, 0, 0])
I would like to update consecutive entries in the array (moving left to right) with the previous entry from the array, using syntax like this:
a[1:] = a[0:3]
This would get the following result:
a = array([1, 1, 1, 1])
Or something like this:
a[1:] = 2*a[:3]
# a = [1,2,4,8]
To illustrate further I want the following kind of behaviour:
for i in range(len(a) - 1):
    a[i+1] = a[i]
Except I want the speed of numpy.
The default behavior of numpy is to take a copy of the slice, so what I actually get is this:
a = array([1, 1, 0, 0])
I already have this array as a subclass of the ndarray, so I can make further changes to it if need be, I just need the slice on the right hand side to be continually updated as it updates the slice on the left hand side.
Am I dreaming or is this magic possible?
Update: This is all because I am trying to use Gauss-Seidel iteration to solve a linear algebra problem, more or less. It is a special case involving harmonic functions. I was trying to avoid going into this because it's really not necessary and likely to confuse things further, but here goes.
The algorithm is this:
while not converged:
    for i in range(len(u[:,0])):
        for j in range(len(u[0,:])):
            # skip over boundary entries, i,j == 0 or len(u)
            u[i,j] = 0.25*(u[i-1,j] + u[i+1,j] + u[i, j-1] + u[i,j+1])
Right? But you can do this two ways, Jacobi involves updating each element with its neighbours without considering updates you have already made until the while loop cycles, to do it in loops you would copy the array then update one array from the copied array. However Gauss-Seidel uses information you have already updated for each of the i-1 and j-1 entries, thus no need for a copy, the loop should essentially 'know' since the array has been re-evaluated after each single element update. That is to say, every time we call up an entry like u[i-1,j] or u[i,j-1] the information calculated in the previous loop will be there.
I want to replace this slow and ugly nested loop situation with one nice clean line of code using numpy slicing:
u[1:-1,1:-1] = 0.25*(u[:-2,1:-1] + u[2:,1:-1] + u[1:-1,:-2] + u[1:-1,2:])
But the result is Jacobi iteration because when you take a slice such as u[:-2,1:-1] you copy the data, thus the slice is not aware of any updates made. Now numpy still loops, right? It's not parallel, it's just a faster way to loop that looks like a parallel operation in python. I want to exploit this behaviour by sort of hacking numpy to return a pointer instead of a copy when I take a slice. Right? Then every time numpy loops, that slice will 'update' or really just replicate whatever happened in the update. To do this I need slices on both sides of the assignment to be pointers.
Anyway, if there is some really really clever person out there, that's awesome, but I've pretty much resigned myself to believing the only answer is to loop in C.
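The difference between the two sweeps described above can be made concrete with a small sketch. The function names and the 4x4 test grid are made up for the illustration; the Gauss-Seidel version is the plain (slow) double loop, not a vectorized replacement for it.

```python
import numpy as np

def jacobi_sweep(u):
    """One Jacobi sweep: every interior point uses only old values."""
    new = u.copy()
    new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                              + u[1:-1, :-2] + u[1:-1, 2:])
    return new

def gauss_seidel_sweep(u):
    """One Gauss-Seidel sweep: updated neighbours are reused immediately."""
    u = u.copy()
    for i in range(1, u.shape[0] - 1):
        for j in range(1, u.shape[1] - 1):
            u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j]
                              + u[i, j - 1] + u[i, j + 1])
    return u

u0 = np.zeros((4, 4))
u0[0, :] = 1.0                      # top edge held at 1, everything else 0
after_jacobi = jacobi_sweep(u0)
after_gs = gauss_seidel_sweep(u0)
```

After a single sweep the Jacobi result has only felt the boundary in the first interior row, while the Gauss-Seidel result has already propagated it further into the grid.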
Late answer, but this turned up on Google so I'll point to the doc the OP probably wanted. Your problem is clear: when using NumPy slices, temporaries are created. Wrap your code in a quick call to weave.blitz to get rid of the temporaries and have the behaviour you want.
Read the weave.blitz section of PerformancePython tutorial for full details.
accumulate is designed to do what you seem to want; that is, to propagate an operation along an array. Here's an example:
from numpy import *
a = array([1,0,0,0])
a[1:] = add.accumulate(a[0:3])
# a = [1, 1, 1, 1]
b = array([1,1,1,1])
b[1:] = multiply.accumulate(2*b[0:3])
# b = [1 2 4 8]
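Another way to do this is to explicitly specify the result array as the input array. Note that the result shown relies on the element-by-element overwriting of older NumPy releases; since NumPy 1.13, a ufunc whose input and output overlap behaves as if the input had been copied first, so current versions give [2 4 0 0] instead. Here's an example:
c = array([2,0,0,0])
multiply(c[:3], c[:3], c[1:])
# c = [ 2 4 16 256] (NumPy < 1.13)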
Just use a loop. I can't immediately think of any way to make the slice operator behave the way you're saying you want it to, except maybe by subclassing numpy's array and overriding the appropriate method with some sort of Python voodoo... but more importantly, the idea that a[1:] = a[0:3] should copy the first value of a into the next three slots seems completely nonsensical to me. I imagine that it could easily confuse anyone else who looks at your code (at least the first few times).
It is not the correct logic.
I'll try to use letters to explain it.
Imagine array = abcd with a, b, c, d as elements.
Now, array[1:] means from the element in position 1 (starting from 0) on.
In this case: bcd. And array[0:3] means from the element in position 0 up to the third element (the one in position 3-1), in this case: abc.
Writing something like:
array[1:] = array[0:3]
means: replace bcd with abc
To obtain the output you want, now in python, you should use something like:
a[1:] = a[0]
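It must have something to do with assigning a slice. In-place operators, however, as you may already know, do follow your expected behavior, at least on the (older) NumPy version used here; since NumPy 1.13 overlapping in-place operations behave as if the right-hand side were copied first, so a current version returns array([1, 1, 0, 0]):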
>>> a = numpy.array([1,0,0,0])
>>> a[1:]+=a[:3]
>>> a
array([1, 1, 1, 1])
If you already have zeros in your real-world problem where your example does, then this solves it. Otherwise, at added cost, set them to zero either by multiplying by zero or assigning to zero, (whichever is faster)
edit:
I had another thought. You may prefer this:
numpy.put(a,[1,2,3],a[:3])
Numpy must be checking if the target array is the same as the input array when doing the setkey call. Luckily, there are ways around it. First, I tried using numpy.put instead
In [46]: a = numpy.array([1,0,0,0])
In [47]: numpy.put(a,[1,2,3],a[0:3])
In [48]: a
Out[48]: array([1, 1, 1, 1])
And then from the documentation of that, I gave using flatiters a try (a.flat)
In [49]: a = numpy.array([1,0,0,0])
In [50]: a.flat[1:] = a[0:3]
In [51]: a
Out[51]: array([1, 1, 1, 1])
But this doesn't solve the problem you had in mind
In [55]: a = np.array([1,0,0,0])
In [56]: a.flat[1:] = 2*a[0:3]
In [57]: a
Out[57]: array([1, 2, 0, 0])
This fails because the multiplication is done before the assignment, not in parallel as you would like.
Numpy is designed for repeated application of the exact same operation in parallel across an array. To do something more complicated, unless you can decompose it in terms of functions like numpy.cumsum and numpy.cumprod, you'll have to resort to something like scipy.weave or writing the function in C. (See the PerformancePython page for more details.) (Also, I've never used weave, so I can't guarantee it will do what you want.)
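For the two toy recurrences in the question, such a decomposition does exist. A sketch, assuming the goal is only to reproduce a[i+1] = 2*a[i] and a[i+1] = a[i] (the variable factors is introduced for the example):

```python
import numpy as np

a = np.array([1, 0, 0, 0])

# a[i+1] = 2 * a[i] unrolls to a[i] = a[0] * 2**i, i.e. a running product
factors = np.full(a.size, 2)
factors[0] = a[0]                 # start the product from the seed value
a = np.cumprod(factors)           # [a0, a0*2, a0*2*2, ...]

# a[i+1] = a[i] simply repeats the seed, so plain broadcasting is enough
b = np.array([1, 0, 0, 0])
b[1:] = b[0]
```

This does not carry over to the Gauss-Seidel update itself, where each new value depends on several neighbours at once.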
You could have a look at np.lib.stride_tricks.
There is some information in these excellent slides:
http://mentat.za.net/numpy/numpy_advanced_slides/
with stride_tricks starting at slide 29.
I'm not completely clear on the question though so can't suggest anything more concrete - although I would probably do it in cython or fortran with f2py or with weave. I'm liking fortran more at the moment because by the time you add all the required type annotations in cython I think it ends up looking less clear than the fortran.
There is a comparison of these approaches here:
www. scipy. org/ PerformancePython
(can't post more links as I'm a new user)
with an example that looks similar to your case.
In the end I ran into the same problem as you. I had to resort to using Jacobi iteration and weave:
while (iter_n < max_time_steps):
    expr = "field[1:-1, 1:-1] = (field[2:, 1:-1] "\
           "+ field[:-2, 1:-1]+"\
           "field[1:-1, 2:] +"\
           "field[1:-1, :-2] )/4."
    weave.blitz(expr, check_size=0)
    #Toroidal conditions
    field[:,0] = field[:,self.flow.n_x - 2]
    field[:,self.flow.n_x -1] = field[:,1]
    iter_n = iter_n + 1
It works and is fast, but is not Gauss-Seidel, so convergence can be a bit tricky. The only option for doing Gauss-Seidel is a traditional loop with indexes.
I would suggest cython instead of looping in C. There might be some fancy numpy way of getting your example to work using a lot of intermediate steps... but since you know how to write it in C already, just write that quick little bit as a cython function and let cython's magic make the rest of the work easy for you.