Iterating efficiently through indices of arbitrary order array

Say I have an arbitrary array of variable order N. For example:
A, a 2x3x3 array, is an order 3 array with 2, 3, and 3 dimensions along its three indices.
I would like to efficiently loop through each element. If I knew a priori the order then I could do something like (in python),
#for order 3
import numpy as np
shape = np.shape(A)
i = 0
while i < shape[0]:
    j = 0
    while j < shape[1]:
        k = 0
        while k < shape[2]:
            #code using i,j,k
            k += 1
        j += 1
    i += 1
Now suppose I don't know the order of A, i.e. I don't know a priori the length of shape. What is the quickest way to iterate through all elements of the array?

There are many ways to do this, e.g. iterating over a.ravel() or a.flat. However, looping over every single element of an array in a Python loop will never be particularly efficient.
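As a rough illustration of the options mentioned above, here is a minimal sketch that walks an array of unknown order with np.ndindex (which yields the index tuples) and with A.flat (which yields only the values). The array A below is a made-up example, not the asker's data.

```python
import numpy as np

# Hypothetical example array of order 3; any order works the same way.
A = np.arange(18).reshape(2, 3, 3)

# np.ndindex yields every index tuple for the given shape, last index fastest.
visited = []
for idx in np.ndindex(*A.shape):
    visited.append((idx, int(A[idx])))

# A.flat visits the same elements in the same order, without the indices.
flat_values = [int(v) for v in A.flat]
```

Both loops still run once per element in Python, so this is about convenience, not speed.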

I don't think it matters which index you choose to permute over first, which index you choose to permute over second, etc. because your inner-most while statement will always be executed once per combination of i, j, and k.

If you need to keep the results of your operation (and assuming it's a function of A and i, j, k), you'd want to use something like this:
import itertools
import numpy as np
# code(A, position) stands for whatever you compute at each position
results = ((position, code(A, position))
           for position in itertools.product(*(range(n) for n in np.shape(A))))
Then you can iterate the results getting out the position and return value of code for each position. Or convert the generator expression to a list if you need to access the results multiple times.
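A small runnable sketch of this idea follows. The function code here is a hypothetical stand-in (the original answer does not say what it computes), and A is a made-up 2x3 array.

```python
import itertools
import numpy as np

def code(A, position):
    # Hypothetical stand-in: the element's value times the sum of its indices.
    return int(A[position]) * sum(position)

A = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# A list rather than a generator, so the results can be read more than once.
results = [(position, code(A, position))
           for position in itertools.product(*(range(n) for n in A.shape))]
```

Each entry of results pairs an index tuple such as (1, 2) with the value computed at that position.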

If the array is of the format array = [[[1,2,3,4],[1,2]],[[1],[1,2,3]]]
You could use the following structure:
array = [[[1,2,3,4],[1,2]],[[1],[1,2,3]]]
indices = []
def iter_array(array,indices):
    indices.append(0)
    for a in array:
        if isinstance(a[0],list):
            iter_array(a,indices)
        else:
            indices.append(0)
            for nonlist in a:
                #do something using each element in indices
                #print(indices)
                indices.append(indices.pop()+1)
            indices.pop()
        indices.append(indices.pop()+1)
    indices.pop()
iter_array(array,indices)
This should work for the usual nested list "arrays". I don't know if it would be possible to mimic this using numpy's array structure.

Related

Creating data in loop subject to moving condition

I am trying to create a list of data in a for loop then store this list in a list if it satisfies some condition. My code is
import numpy as np
R = 10
lam = 1
proc_length = 100
L = 1
#Empty list to store lists
exponential_procs_lists = []
for procs in range(0,R):
    #Draw exponential random variables
    z_exponential = np.random.exponential(lam,proc_length)
    #Sort values to increase
    z_exponential.sort()
    #Insert 0 at start of list
    z_dat_r = np.insert(z_exponential,0,0)
    sum = np.sum(np.diff(z_dat_r))
    if sum < 5*L:
        exponential_procs_lists.append(z_dat_r)
which stores only those of the R lists that satisfy the sum < 5*L condition. My question is: what is the best way to store R lists where the sum of each list is less than 5*L? The lists can be of different lengths, but they must satisfy the condition that the sum of the increments is less than 5*L. Any help much appreciated.

Okay so based on your comment, I take that you want to generate an exponential_procs_list, inside which every sublist has a sum < 5*L.
Well, I modified your code to chop the sublists as soon as the sum exceeds 5*L.
Edit : See answer history to see my last answer for the approach above.
Well looking closer, notice you don't actually need the discrete difference array. You're finding the difference array, summing it up and checking whether the sum's < 5L and if it is, you append the original array.
But notice this:
if your array is like so: [0, 0.00760541, 0.22281415, 0.60476231], its difference array would be [0.00760541, 0.21520874, 0.38194816].
If you add the first x terms of the difference array, you get the (x+1)th element of the original array (because the array starts at 0). So you really just need to keep the elements which are less than 5*L:
import numpy as np
R = 10
lam = 1
proc_length = 5
L = 1
exponential_procs_lists = []
def chop(nums, target):
    good_list = []
    for num in nums:
        if num >= target:
            break
        good_list.append(num)
    return good_list
for procs in range(0,R):
    z_exponential = np.random.exponential(lam,proc_length)
    z_exponential.sort()
    z_dat_r = np.insert(z_exponential,0,0)
    good_list = chop(z_dat_r, 5*L)
    exponential_procs_lists.append(good_list)
You could probably also just do a binary search (for better time complexity, since the array is already sorted) or use a filter with a lambda; that's up to you.
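One possible way to do the binary search is np.searchsorted, which finds the cut point in a sorted array. The sketch below is only an illustration: chop_sorted and the sample values are made up here, and the input must already be sorted in ascending order.

```python
import numpy as np

def chop_sorted(nums, target):
    # side="left" returns the first index i with nums[i] >= target,
    # so the slice keeps exactly the elements strictly below target.
    cut = np.searchsorted(nums, target, side="left")
    return nums[:cut]

z = np.array([0.0, 0.5, 1.2, 4.9, 5.0, 7.3])
kept = chop_sorted(z, 5.0)
```

This keeps the same elements as the chop loop above (everything before the first value >= target), but returns a NumPy slice instead of a Python list.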

Unexpected output after merging two sorted arrays with Python

I found a partial solution to the problem; however, it seems that I'm getting extra numbers from my array than what it should be. This is the question I'm trying to find out:
Given two sorted integer arrays nums1 and nums2, merge nums2 into
nums1 as one sorted array.
Note:
The number of elements initialized in nums1 and nums2 are m and n
respectively. You may assume that nums1 has enough space (size that is
greater or equal to m + n) to hold additional elements from nums2.
Example:
Input: nums1 = [1,2,3,0,0,0], m = 3 nums2 = [2,5,6], n = 3
Output: [1,2,2,3,5,6]
I'm practicing some coding challenges to get the hang of the Python 3 language and prepare myself for an interview. I have tried a few methods, like using pop when the beginning of the array is 0s. But after a new test case showed up, it seems I should have expected more. I'm pretty new to the language.
def mergeArrays(nums1, m, nums2, n):
    nums1[:] = sorted(nums1 + nums2)
    i = 0
    while (i < len(nums1[:-1])):
        if nums1[i] is 0:
            nums1.pop(i)
        if i > len(nums1):
            break
        i += 1
    print(nums1)
nums1 = [-49,-48,-48,-47,-45,-42,-39,-36,-33,-33,-28,-28,-23,-23,-7,-4,-3,0,0,4,6,21,29,29,31,34,36,38,40,43,45,46,47,0,0,0,0,0,0,0,0]
m = len(nums1)
nums2 = [-16,-5,-3,26,33,35,38,41]
n = len(nums2)
mergeArrays(nums1, m, nums2, n)
My expected output should be of both arrays sorted and go through. Results should be this: [-49,-48,-48,-47,-45,-42,-39,-36,-33,-33,-28,-28,-23,-23,-16,-7,-5,-4,-3,-3,0,0,4,6,21,26,29,29,31,33,34,35,36,38,38,40,41,43,45,46,47]
However, I'm getting a couple of extra zeros, so the actual output looks like this:
[-49,-48,-48,-47,-45,-42,-39,-36,-33,-33,-28,-28,-23,-23,-16,-7,-5,-4,-3,-3,0,0,0,0,0,4,6,21,26,29,29,31,33,34,35,36,38,38,40,41,43,45,46,47]
EDIT: added more information to make the problem clear.

As I understand it, you want to merge the two sorted arrays without keeping any duplicate elements. You can refer to the code below:
first_list = [-49,-48,-48,-47,-45,-42,-39,-36,-33,-33,-28,-28,-23,-23,-7,-4,-3,0,0,4,6,21,29,29,31,34,36,38,40,43,45,46,47,0,0,0,0,0,0,0,0]
second_list = [-16,-5,-3,26,33,35,38,41]
merged_list = list(set(first_list+second_list))
merged_list.sort()
print(merged_list)

One of the old methods that I used was a list comprehension. Basically, what I did was slice the array from beginning to end and do the sort inside the comprehension:
def mergeArrays(nums1, m, nums2, n):
    nums1[0: m + n] = [x for x in sorted(nums1[:m] + nums2[:n])]
If you have a different explanation than what I just did, please feel free :)

After much back-and-forth on the intent of your code and where your unwanted mystery zeros come from, seems you want to do the following: merge-sort your two arrays, preserving duplicates:
your input is arrays nums1, nums2 which are zero-padded, and can be longer than length m,n respectively
But to avoid picking up those padded zeros, you should only reference the entries 0..(m-1), i.e. nums1[:m], and likewise nums2[:n]
Your mistake was to reference all the way up to nums1[:-1]
Your solution is: sorted(nums1[:m] + nums2[:n]). It's a one-liner and you don't need a function.
There is no reason whatsoever that zero entries need special treatment. There's no need for your while-loop.
Also btw even if you wanted to (say) exclude all zeros, you can still use a one-liner list comprehension: [x for x in sorted(nums1[:m] + nums2[:n]) if x != 0]
List comprehensions are a neat idiom and super-powerful! Please read more about them. Often you don't need while-loops in Python; list comprehensions, iterators or generators are typically cleaner, shorter code and more efficient.
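To make the slicing point concrete, here is a minimal sketch using the small example from the problem statement. The name merge_sorted is made up for illustration, and it assumes m and n count only the real entries, not the zero padding.

```python
def merge_sorted(nums1, m, nums2, n):
    # Only the first m entries of nums1 and the first n of nums2 are real data;
    # everything after that in nums1 is padding and is overwritten in place.
    nums1[:m + n] = sorted(nums1[:m] + nums2[:n])

nums1 = [1, 2, 3, 0, 0, 0]
merge_sorted(nums1, 3, [2, 5, 6], 3)

# Genuine zeros inside the first m entries are kept, padding zeros are not.
with_zeros = [-1, 0, 0, 3, 0, 0]
merge_sorted(with_zeros, 4, [0, 2], 2)
```

Because the padding is never read, no zero-removal loop is needed afterwards.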

How to iterate over this n-dimensional dataset?

I have a dataset which has 4 dimensions (for now...) and I need to iterate over it.
To access a value in the dataset, I do this:
value = dataset[i,j,k,l]
Now, I can get the shape for the dataset:
shape = [4,5,2,6]
The values in shape represent the length of the dimension.
How, given the number of dimensions, can I iterate over all the elements in my dataset? Here is an example:
for i in range(shape[0]):
for j in range(shape[1]):
for k in range(shape[2]):
for l in range(shape[3]):
print('BOOM')
value = dataset[i,j,k,l]
In the future, the shape may change. So for example, shape may have 10 elements rather than the current 4.
Is there a nice and clean way to do this with Python 3?

You could use itertools.product to iterate over the cartesian product 1 of some values (in this case the indices):
import itertools
shape = [4,5,2,6]
for idx in itertools.product(*[range(s) for s in shape]):
    value = dataset[idx]
    print(idx, value)
    # i would be "idx[0]", j "idx[1]" and so on...
However if it's a numpy array you want to iterate over, it could be easier to use np.ndenumerate:
import numpy as np
arr = np.random.random([4,5,2,6])
for idx, value in np.ndenumerate(arr):
    print(idx, value)
    # i would be "idx[0]", j "idx[1]" and so on...
1 You asked for clarification what itertools.product(*[range(s) for s in shape]) actually does. So I'll explain it in more details.
For example, if you have this loop:
for i in range(10):
    for j in range(8):
        # do whatever
This can also be written using product as:
for i, j in itertools.product(range(10), range(8)):
#                                        ^^^^^^^^---- the inner for loop
#                             ^^^^^^^^^-------------- the outer for loop
    # do whatever
That means product is just a handy way of reducing the number of independent for-loops.
If you want to convert a variable number of for-loops to a product you essentially need two steps:
# Create the "values" each for-loop iterates over
loopover = [range(s) for s in shape]
# Unpack the list using "*" operator because "product" needs them as
# different positional arguments:
prod = itertools.product(*loopover)
for idx in prod:
    i_1, i_2, ..., i_n = idx # index is a tuple that can be unpacked if you know the number of values.
    # The "..." has to be replaced with the variables in real code!
    # do whatever
That's equivalent to:
for i_1 in range(shape[0]):
    for i_2 in range(shape[1]):
        ... # more loops
            for i_n in range(shape[n-1]): # n is the length of the "shape" object
                # do whatever
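The equivalence described above can be checked directly for a small case. In this sketch the helper iter_indices and the shape [2, 3, 2] are made up for illustration.

```python
import itertools

def iter_indices(shape):
    """Yield every index tuple for the given shape, last index varying fastest."""
    return itertools.product(*(range(s) for s in shape))

shape = [2, 3, 2]

# Explicit nested loops for a shape of length 3 ...
nested = [(i, j, k)
          for i in range(shape[0])
          for j in range(shape[1])
          for k in range(shape[2])]

# ... visit the same indices, in the same order, as the product.
from_product = list(iter_indices(shape))
```

The helper does not care how long shape is, so the same call keeps working if the number of dimensions later grows from 4 to 10.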

How is this 2D array being sized by FOR loops?

Question background:
This is the first piece of Python code I've looked at, so I'm assuming that my thread title correctly describes what this code is actually trying to achieve, i.e. setting up a 2D array.
The code:
The code I'm looking at sets the size of a 2D array based on two for loops:
n = len(sentences)
values = [[0 for x in xrange(n)] for x in xrange(n)]
for i in range(0, n):
    for j in range(0, n):
        values[i][j] = self.sentences_intersection(sentences[i], sentences[j])
I could understand it if each side of the array was set with using the length property of the sentences variable, unless this is in effect what xrange is doing by using the loop size based on the length?
Any help with explaining how the array is being set would be great.

This code is actually a bit redundant.
Firstly you need to realize that values is not an array, it is a list. A list is a dynamically sized one-dimensional structure.
The second line of the code uses a nested list comprehension to create one list of size n, each element of which is itself a list consisting of n zeros.
The second loop goes through this list of lists, and sets each element according to whatever sentences_intersection does.
The reason this is redundant is because lists don't need to be pre-allocated. Rather than doing two separate iterations, really the author should just be building up the lists with the correct values, then appending them.
This would be better:
n = len(sentences)
values = []
for i in range(0, n):
    inner = []
    for j in range(0, n):
        inner.append(self.sentences_intersection(sentences[i], sentences[j]))
    values.append(inner)
but you could actually do the whole thing in the list comprehension if you wanted:
values = [[self.sentences_intersection(sentences[i], sentences[j]) for j in xrange(n)] for i in xrange(n)]
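For reference, a Python 3 sketch of the same comprehension (range instead of xrange). The sentences_intersection function below is a hypothetical stand-in, since the original method is not shown, and the sentences are made up.

```python
def sentences_intersection(s1, s2):
    # Hypothetical stand-in: number of distinct words the two sentences share.
    return len(set(s1.split()) & set(s2.split()))

sentences = ["the cat sat", "the dog sat", "a bird flew"]
n = len(sentences)

# The outer comprehension runs over i (rows), the inner one over j (columns),
# so values[i][j] compares sentences[i] with sentences[j].
values = [[sentences_intersection(sentences[i], sentences[j]) for j in range(n)]
          for i in range(n)]
```

No pre-allocation with zeros is needed, because the comprehension builds each row with its final values.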

How do I get an empty list of any size in Python?

I basically want a Python equivalent of this Array in C:
int a[x];
but in python I declare an array like:
a = []
and the problem is I want to assign random slots with values like:
a[4] = 1
but I can't do that with Python, since the Python list is empty (of length 0).

If by "array" you actually mean a Python list, you can use
a = [0] * 10
or
a = [None] * 10

You can't do exactly what you want in Python (if I read you correctly). You need to put values in for each element of the list (or as you called it, array).
But, try this:
a = [0 for x in range(N)] # N = size of list you want
a[i] = 5 # as long as i < N, you're okay
For lists of other types, use something besides 0. None is often a good choice as well.

You can use numpy:
import numpy as np
Example from Empty Array:
np.empty([2, 2])
array([[ -9.74499359e+001, 6.69583040e-309],
[ 2.13182611e-314, 3.06959433e-309]])

You can also grow the list with the extend method of list:
a= []
a.extend([None]*10)
a.extend([None]*20)

Just declare the list and append each element. For ex:
a = []
a.append('first item')
a.append('second item')

If you (or other searchers of this question) were actually interested in creating a contiguous array to fill with integers, consider bytearray and memoryview:
# cast() is available starting Python 3.3
size = 10**6
ints = memoryview(bytearray(size)).cast('i')
ints.contiguous, ints.itemsize, ints.shape
# (True, 4, (250000,))
ints[0]
# 0
ints[0] = 16
ints[0]
# 16

It is also possible to create an empty array with a certain size:
array = [[] for _ in range(n)] # n equal to your desired size
array[0].append(5) # it appends 5 to an empty list, then array[0] is [5]
If you instead define it as array = [[]] * n, then modifying one item changes all of them in the same way, because every slot refers to the same mutable list.
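The difference between the two constructions can be seen in a short sketch (the variable names here are made up for illustration):

```python
n = 3

# Multiplying a list of one empty list copies the reference, not the list:
# all n slots point at the same inner list.
shared = [[]] * n
shared[0].append(5)

# The comprehension evaluates [] once per slot, giving n separate lists.
independent = [[] for _ in range(n)]
independent[0].append(5)
```

After this, shared is [[5], [5], [5]] while independent is [[5], [], []]. With immutable fill values such as 0 or None the multiplication form is fine, since assigning to a slot rebinds that slot only.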

x=[]
for i in range(0,5):
    x.append(i)
    print(x[i])

If you actually want a C-style array:
import array
x = 10  # desired length, like the x in the C declaration int a[x];
a = array.array('i', x * [0])
a[3] = 5
try:
    a[5] = 'a'
except TypeError:
    print('integers only allowed')
Note that there's no concept of un-initialized variable in python. A variable is a name that is bound to a value, so that value must have something. In the example above the array is initialized with zeros.
However, this is uncommon in python, unless you actually need it for low-level stuff. In most cases, you are better-off using an empty list or empty numpy array, as other answers suggest.

The (I think only) way to assign "random slots" is to use a dictionary, e.g.:
a = {} # initialize empty dictionary
a[4] = 1 # define the entry for index 4 to be equal to 1
a['French','red'] = 'rouge' # the entry for index (French,red) is "rouge".
This can be handy for "quick hacks", and the lookup overhead is irrelevant if you don't have intensive access to the array's elements.
Otherwise, it will be more efficient to work with pre-allocated (e.g., numpy) arrays of fixed size, which you can create with a = np.empty(10) (for a non-initialized vector of length 10) or a = np.zeros([5,5]) (for a 5x5 matrix initialized with zeros).
Remark: in your C example, you also have to allocate the array (your int a[x];) before assigning a (not so) "random slot" (namely, integer index between 0 and x-1).
References:
The dict datatype: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
Function np.empty(): https://numpy.org/doc/stable/reference/generated/numpy.empty.html
