Accessing elements with offsets in Python's for .. in loops

I've been mucking around a bit with Python, and I've gathered that it's usually better (or 'pythonic') to use
for x in SomeArray:
rather than the more C-style
for i in range(0, len(SomeArray)):
I do see the benefits in this, mainly cleaner code, and the ability to use the nice map() and related functions. However, I am quite often faced with the situation where I would like to simultaneously access elements of varying offsets in the array. For example, I might want to add the current element to the element two steps behind it. Is there a way to do this without resorting to explicit indices?

The way to do this in Python is:
for i, x in enumerate(SomeArray):
    print(i, x)
The enumerate generator produces a sequence of 2-tuples, each containing the array index and the element.
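For the offset example from the question, a minimal sketch that keeps the enumerate loop but looks back two positions (SomeArray here is just an assumed sample list) could be:
SomeArray = list(range(10))  # assumed sample data

for i, x in enumerate(SomeArray):
    if i >= 2:
        # current element plus the element two steps behind it
        print(x + SomeArray[i - 2])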

List indexing and zip() are your friends.
Here's my answer for your more specific question:
I might want to add the current element to the element two steps behind it. Is there a way to do this without resorting to explicit indices?
arr = range(10)
[i+j for i,j in zip(arr[:-2], arr[2:])]
You can also use the module numpy if you intend to work on numerical arrays. For example, the above code can be more elegantly written as:
import numpy
narr = numpy.arange(10)
narr[:-2] + narr[2:]
Adding the nth element to the (n-2)th element is equivalent to adding the mth element to the (m+2)th element (for the mathematically inclined, we performed the substitution n -> m+2). The range of n is [2, len(arr)) and the range of m is [0, len(arr)-2). Note the brackets and parentheses. The elements from 0 to len(arr)-3 (excluding the last two elements) are indexed as [:-2], while the elements from 2 to len(arr)-1 (excluding the first two elements) are indexed as [2:].
I assume that you already know list comprehensions.
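If you need offsets other than 2, a small helper in the same spirit might look like this (pairwise_offset_sum is a made-up name for illustration, not a library function):
def pairwise_offset_sum(seq, k=2):
    # pair each element with the element k positions ahead of it and sum the pairs
    head = seq[:-k] if k else seq
    return [a + b for a, b in zip(head, seq[k:])]

arr = list(range(10))
print(pairwise_offset_sum(arr))  # [2, 4, 6, 8, 10, 12, 14, 16]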


I want to iterate over this structure. It is treating everything inside as one element and not as separate numbers

loss=[[ -137.70171527 -81408.95809899 -94508.84395371 -311.81198933 -294.08711874]]
When I print loss it prints the addition of the numbers and not the individual numbers. I want to change this into a list so I can iterate over each individual number, but I don't know how. Please help.
I have tried:
result = map(tuple,loss)
However it prints the addition of the inside. When I try to index it, it says there is only 1 element. It works if I put a comma in between, but this is a matrix that is output from other code, so I can't change or add to it.
You have a list of a list of numbers: the outer list contains exactly one element (namely the inner list). The inner list is your list of numbers, over which you want to iterate. Hence, to iterate over the inner list, you first have to access it from the outer list using an index, for example like this:
for list_item in loss[0]:
    do_something_for_each_element(list_item)
Moreover, I think that you wanted to have separate elements in the inner list and not compute one single number, didn't you? If that is the case, you have to separate the elements with commas.
E.g.:
loss=[[-137.70171527, -81408.95809899, -94508.84395371, -311.81198933, -294.08711874]]
EDIT:
As you clarified in the comments, you want to iterate over a numpy matrix. One way to do so is by converting the matrix into an n-dimensional array (ndarray) and then iterating over that structure. This could, for example, look like this; other options have also been presented in this answer (Iterate over a numpy Matrix rows):
import numpy as np
test_matrix=np.matrix([[1, 2], [3, 4]])
for row in test_matrix.A:
    print(row)
Note that the A attribute of a matrix object is its ndarray representation (https://docs.scipy.org/doc/numpy/reference/generated/numpy.matrix.html).
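If you simply want to loop over every number regardless of the nesting, another option is to flatten first; a rough sketch, assuming loss is the numpy matrix from the comments:
import numpy as np

loss = np.matrix([[-137.70171527, -81408.95809899, -94508.84395371, -311.81198933, -294.08711874]])

# take the ndarray view, flatten it to 1D, and iterate over the individual numbers
for value in np.asarray(loss).ravel():
    print(value)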

python: avoiding nested for-loop [duplicate]

This question already has answers here:
How to get the cartesian product of multiple lists
(17 answers)
Closed 4 years ago.
How can I avoid the nested for-loops in the following code and, for example, use the map function instead?
import numpy as np
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
n = len(A[0])
B = np.zeros((n,n))
for i in range(n):
    for j in range(n):
        B[j,i] = min(A[:,i]/A[:,j])
I don't think map() is a good candidate for this problem because your desired result is nested rather than flattened. It'd be a little more verbose than just using a list comprehension to achieve your desired result. Here's a cleaner way of initializing B.
B = np.array([np.min(A.T/r, axis=1) for r in A.T])
This iterates over every column of A (every row of A.T) and computes the broadcasted division A.T/r. Numpy is able to optimize that far beyond what we can do with raw loops. Then the use of np.min() computes minimums of that matrix we just computed along every row (rows instead of columns because of the parameter axis=1) more quickly than we would be able to with the builtin min() because of internal vectorization.
If you did want to map something, @MartijnPieters points out that you could use itertools with itertools.product(range(len(A[0])), repeat=2) to achieve the same pairs of indices. Otherwise, you could use the generator ((r, s) for r in A.T for s in A.T) to get pairs of rows instead of pairs of indices. In either case you could apply map() across that iterable, but you'd still have to somehow nest the results to initialize B properly.
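For completeness, a rough sketch of that map()/itertools.product() route, under the assumption that you still want the nested B from the question:
import itertools
import numpy as np

A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]], float)
n = A.shape[1]

# product(range(n), repeat=2) yields (row, col) pairs with col varying fastest,
# so reshaping the flat results to (n, n) reproduces B[j, i] = min(A[:, i] / A[:, j])
pairs = itertools.product(range(n), repeat=2)
flat = map(lambda jc: np.min(A[:, jc[1]] / A[:, jc[0]]), pairs)
B = np.array(list(flat)).reshape(n, n)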
Note that this probably isn't the result you would expect. The elements of A are integers by default (notice that you used, e.g., 3 instead of 3.). When you divide the integer elements of A, you get integers again (under Python 2's integer division). In your code, that was partially obscured by the fact that np.zeros() casts those integers to floats, but the math was wrong regardless. To fix that, pass an additional argument when constructing A:
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]], float)

Is there a built-in method for checking individual list indices in Python

Python has a built in functionality for checking the validity of entire slices: slice.indices. Is there something similar that is built-in for individual indices?
Specifically, I have an index, say a = -2 that I wish to normalize with respect to a 4-element list. Is there a method that is equivalent to the following already built in?
def check_index(index, length):
    if index < 0:
        index += length
    if index < 0 or index >= length:
        raise IndexError(...)
My end result is to be able to construct a tuple with a single non-None element. I am currently using list.__getitem__ to do the check for me, but it seems a little awkward/overkill:
items = [None] * 4
items[a] = 'item'
items = tuple(items)
I would like to be able to do
a = check_index(a, 4)
items = tuple('item' if i == a else None for i in range(4))
Everything in this example is pretty negotiable. The only things that are fixed are that I am getting a in a way that can have all of the problems an arbitrary index can have, and that the final result has to be a tuple.
I would be more than happy if the solution used numpy and only really applied to numpy arrays instead of Python sequences. Either one would be perfect for the application I have in mind.
If I understand correctly, you can use range(length)[index], in your example range(4)[-2]. This properly handles negative and out-of-bounds indices. At least in recent versions of Python, range() doesn't literally create a full list so this will have decent performance even for large arguments.
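A quick interactive check, assuming the 4-element case from the question:
>>> range(4)[-2]      # -2 is normalized to 2
2
>>> range(4)[5]       # out-of-range indices raise IndexError
Traceback (most recent call last):
  ...
IndexError: range object index out of range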
If you have a large number of indices to do this with in parallel, you might get better performance doing the calculation with Numpy vectorized arithmetic, but I don't think the technique with range will work in that case. You'd have to manually do the calculation using the implementation in your question.
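For that vectorized case, a minimal sketch of the manual calculation (normalize_indices is a hypothetical helper, not a library function) could be:
import numpy as np

def normalize_indices(idx, length):
    # wrap negative indices, then bounds-check the whole batch at once
    idx = np.asarray(idx)
    norm = np.where(idx < 0, idx + length, idx)
    if np.any((norm < 0) | (norm >= length)):
        raise IndexError("index out of range for length %d" % length)
    return norm

print(normalize_indices([-2, 0, 3], 4))  # [2 0 3]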
There is a function called numpy.core.multiarray.normalize_axis_index which does exactly what I need. It is particularly useful to me because the implementation I had in mind was for numpy array indexing:
>>> from numpy.core.multiarray import normalize_axis_index
>>> normalize_axis_index(3, 4)
3
>>> normalize_axis_index(-3, 4)
1
>>> normalize_axis_index(-5, 4)
...
numpy.core._internal.AxisError: axis -5 is out of bounds for array of dimension 4
The function was added in version 1.13.0. The source for this function is available here, and the documentation source is here.
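Tying that back to the tuple construction from the question (with a = -2 and a 4-element result):
from numpy.core.multiarray import normalize_axis_index

a = normalize_axis_index(-2, 4)                              # a == 2
items = tuple('item' if i == a else None for i in range(4))
print(items)                                                 # (None, None, 'item', None)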

More efficient way to find index of objects in Python array

I have a very large 400x300x60x27 array (let's call it 'A'). I took the maximum values along the last axis, which gives a 400x300x60 array called 'B'. Basically I need to find the index in 'A' of each value in 'B'. I have converted them both to lists and set up a for loop to find the indices, but it takes an absurdly long time to get through it because there are over 7 million values. This is what I have:
B=np.zeros((400,300,60))
C=np.zeros((400*300*60))
B=np.amax(A,axis=3)
A=np.ravel(A)
A=A.tolist()
B=np.ravel(B)
B=B.tolist()
for i in range(0,400*300*60):
    C[i]=A.index(B[i])
Is there a more efficient way to do this? It's taking hours and hours and the program is still stuck on the last line.
You don't need amax, you need argmax. With argmax, the array will contain only the indices rather than the values, and finding the values from the indices is computationally much cheaper than the other way around.
So I would recommend storing only the indices before flattening the array: instead of np.amax, run A.argmax, and the result will contain the indices.
But before you flatten it to 1D, you will need a mapping function that converts the indices to 1D as well. This is probably a trivial problem, since you only need some basic arithmetic to achieve it. It will also consume some time, as it needs to be executed quite a few times, but it won't be a searching problem and it will save you quite some time.
You are getting those argmax indices, and because of the flattening you are basically converting them to their linear index equivalents.
Thus, a solution would be to add the proper offsets into the argmax indices in steps, leveraging broadcasting at each step, like so -
m,n,r,s = A.shape
idx = A.argmax(axis=3)
idx += s*np.arange(r)
idx += r*s*np.arange(n)[:,None]
idx += n*r*s*np.arange(m)[:,None,None] # idx is your C output
Alternatively, a compact way to put it would be like so -
m,n,r,s = A.shape
I,J,K = np.ogrid[:m,:n,:r]
idx = n*r*s*I + r*s*J + s*K + A.argmax(axis=3)
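A quick sanity check of those offsets on a small array (toy dimensions assumed here, not the 400x300x60x27 case from the question) might look like:
import numpy as np

A = np.random.rand(4, 3, 5, 6)
m, n, r, s = A.shape

idx = A.argmax(axis=3)
idx = idx + s*np.arange(r) + r*s*np.arange(n)[:, None] + n*r*s*np.arange(m)[:, None, None]

# each linear index should point at the corresponding per-slice maximum in the flat array
assert np.allclose(A.ravel()[idx], A.max(axis=3))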

What's the best way to iterate over a multidimensional array while tracking/doing operations on the iteration index

I need to do a lot of operations on multidimensional numpy arrays and therefore I am experimenting to find the best approach for this.
So let's say I have an array like this:
A = np.random.uniform(0, 1, size = 100).reshape(20, 5)
My goal is to get the maximum value (numpy.amax()) of each row and its index. So A[0] might be something like this:
A[0] = [ 0.64570441 0.31781716 0.07268926 0.84183753 0.72194227]
I want to get the maximum and the index of that maximum: [0.84183753] at [0, 3]. No specific representation of the results is needed, this is just an example. In fact I only need the horizontal index.
I tried using numpy's nditer object:
A_it = np.nditer(A, flags=['multi_index'], op_flags=['readwrite'])
while not A_it.finished:
    print(np.amax(A_it.value))
    print(A_it.multi_index[1])
    A_it.iternext()
I can access every element of the array and its index over the iterations that way, but I don't seem to be able to bring the numpy.amax() function on each element and the index together syntax-wise. Can I even do it using the nditer object?
Also, in Numpy: Beginner nditer I read that using nditer or using iterations in numpy usually means that I am doing something wrong. But I can't find another convenient way to achieve my goal here without any iteration. Obviously I am a total beginner in numpy and Python in general, so any keyword to search for or hint is very much appreciated.
A major problem with nditer is that it iterates over each element, not each row. It's best used as a stepping stone toward a Cython or C rewrite of your code.
If you just want the maximum for each row of your array, a simple iteration or list comprehension will do nicely.
for row in A: print(np.amax(row))
or to turn it back into an array:
np.array([np.amax(row) for row in A])
But you can get the same values by giving amax an axis parameter
np.amax(A,axis=1)
np.argmax identifies the location of the maximum.
np.argmax(A,axis=1)
With the argmax values you could then select the max values as well,
ind=np.argmax(A,axis=1)
A[np.arange(A.shape[0]),ind]
(speed's about the same as repeating the np.amax call).
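Putting those pieces together for the array in the question, a small sketch that pairs each row's maximum with its horizontal index could be:
import numpy as np

A = np.random.uniform(0, 1, size=100).reshape(20, 5)

col_idx = np.argmax(A, axis=1)               # horizontal index of each row's maximum
row_max = A[np.arange(A.shape[0]), col_idx]  # the maximum values themselves

# e.g. for the first row, the value and its (row, column) position
print(row_max[0], (0, col_idx[0]))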
