Suggestions for storing outputs of a nested forloop? - python

I am trying to write a function in Python that gives all possible combinations of three variable inputs, each of which can have 1 to 4 elements. I have written a nested for loop that I believe gives all possible combinations, but I am struggling to store the output in a single 1D array. I don't know if I have to store it as a 3D array and then flatten it. Here is the code I have:
import numpy as np
def test(x,y,z):
    len1 = len(x)
    len2 = len(y)
    len3 = len(z)
    lentot = len1*len2*len3
    codons = np.empty((1,lentot))
    for i in range(len1):
        for j in range(len2):
            for k in range(len3):
                codons[] = np.array([x[i],y[j],z[k]])
    return codons
Basically, I cannot figure out what to put in the brackets on the second-to-last line to get my output stored as a 1D array. I don't even know if it is possible. I tried using itertools.product to do this for me, but the output is stored as a single element, not an array (with each line being its own element). For my application it is important that I can pass this output to another function, so I need it to be an array of strings.

You can append them to a list and then convert it to an array at the end.
import numpy as np
def test(x, y, z):
    codon_list = []
    for i in x:
        for j in y:
            for k in z:
                codon_list.append([i, j, k])
    codons = np.array(codon_list)
    return codons

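Using the product function from itertools seems to be the easiest way: list(itertools.product(x, y, z)) gives every combination as a tuple. Note that itertools.combinations([x, y, z], 3) would only return the single tuple (x, y, z).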

You can use list comprehension to make your life easier:
def test(x,y,z):
    codons = np.array([[i,j,k] for i in x for j in y for k in z])
    return codons
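Since the question asks for a 1D array of strings, here is a minimal sketch that joins each combination into one string. It assumes the inputs are strings or lists of single-character strings; the function name codon_combinations is only illustrative and does not come from the original posts.

```python
import itertools

import numpy as np


def codon_combinations(x, y, z):
    """Return a 1D array with one joined string per (x, y, z) combination."""
    # itertools.product yields tuples in the same order as the nested loops.
    return np.array(["".join(combo) for combo in itertools.product(x, y, z)])


codons = codon_combinations("AC", "G", "TU")
print(codons.shape)  # one entry per combination: 2 * 1 * 2 = 4
```

The result has shape (len(x) * len(y) * len(z),), so it can be passed on as a flat array.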

Related

python intersection of lists while not having the same index

I have a curious case, and after some time I have not come up with an adequate solution.
Say you have two lists and you need to find items that have the same index.
x = [1,4,5,7,8]
y = [1,3,8,7,9]
I am able to get a correct intersection of those which appear in both lists with the same index by using the following:
matches = [i for i, (a,b) in enumerate(zip(x,y)) if a==b]
This would return:
[0,3]
I am able to get a simple intersection of both lists with the following (and in many other ways; this is just an example):
intersected = set(x) & set(y)
This would return this set:
{1, 7, 8}
Here's the question: I'm looking for a way to get the items that appear in both lists (as in the second result) but not at the same position, i.e. excluding the matches above.
In other words, I'm looking for items in x that appear in y at a different index.
The desired result would be the index position of "8" in y, or [2]
Thanks in advance
You're so close: iterate through y; look for a value that is in x, but not at the same position:
offset = [i for i, a in enumerate(y) if a in x and a != x[i] ]
Result:
[2]
Including the suggested upgrade from pault, with respect to Martijn's comment ... the pre-processing reduces the complexity, in case of large lists:
>>> both = set(x) & set(y)
>>> offset = [i for i, a in enumerate(y) if a in both and a != x[i] ]
As PaulT pointed out, this is still quite readable at OP's posted level.
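As a quick check, the set-based variant can be wrapped in a function and run on the lists from the question. This is only a sketch: the name shifted_matches is illustrative, and the length guard for a y that is longer than x is an added assumption that the original one-liner does not have.

```python
def shifted_matches(x, y):
    """Indices in y whose value also occurs in x, but not at the same index."""
    both = set(x) & set(y)  # pre-compute the shared values once
    return [
        i
        for i, a in enumerate(y)
        # Assumption: positions past the end of x count as "not the same index".
        if a in both and (i >= len(x) or a != x[i])
    ]


x = [1, 4, 5, 7, 8]
y = [1, 3, 8, 7, 9]
print(shifted_matches(x, y))  # [2]
```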
I'd create a dictionary of indices for the first list, then use that to test if the second value is a) in that dictionary, and b) the current index is not present:
def non_matching_indices(x, y):
    x_indices = {}
    for i, v in enumerate(x):
        x_indices.setdefault(v, set()).add(i)
    return [i for i, v in enumerate(y) if i not in x_indices.get(v, {i})]
The above takes O(len(x) + len(y)) time; a single full scan through the one list, then another full scan through the other, where each test to include i is done in constant time.
You really don't want to use a containment test of the form value in x here, because that requires a scan (a loop) over x to see whether that value is in the list or not. That takes O(len(x)) time, and you do that for each value in y, which means that the function takes O(len(x) * len(y)) time.
You can see the speed differences when you run a time trial with a larger list filled with random data:
>>> import random, timeit
>>> def using_in_x(x, y):
...     return [i for i, a in enumerate(y) if a in x and a != x[i]]
...
>>> x = random.sample(range(10**6), 1000)
>>> y = random.sample(range(10**6), 1000)
>>> for f in (using_in_x, non_matching_indices):
...     timer = timeit.Timer("f(x, y)", f"from __main__ import f, x, y")
...     count, total = timer.autorange()
...     print(f"{f.__name__:>20}: {total / count * 1000:6.3f}ms")
...
          using_in_x: 10.468ms
non_matching_indices:  0.630ms
So with two lists of 1000 numbers each, if you use value in x testing, you easily take 15 times as much time to complete the task.
x = [1,4,5,7,8]
y = [1,3,8,7,9]
result=[]
for e in x:
    if e in y and x.index(e) != y.index(e):
        result.append((x.index(e),y.index(e),e))
print(result) #gives tuple with x_position,y_position,value
This version goes item by item through the first list and checks whether the item is also in the second list. If it is, it compares the indices for the found item in both lists and if they are different then it stores both indices and the item value as a tuple with three values in the result list.

2 dimensions list (array?) in python

I have to create 2 functions that involve a 2-dimensional list in order to make a grid for a basic Python game:
The first function must take an int n as a parameter and return a 2-dimensional list with n columns and n rows, with all values set to 0.
The second one must take a 2-dimensional list as a parameter and print the grid, but return nothing.
Here is what I came up with:
def twoDList(x, y):
    arr = [[x for x in range(6)] for y in range(6)] # x = height and y = width
    return arr
def displayGrid(arr):
    for i in range(0, 5):
        print(arr[i][i])
Could you please help me to improve the code regarding the instructions and help me to understand how to display the whole grid with the code please?
Here are 2 methods using no 3rd party libraries.
One simple way to create a 2D array is to keep appending an array to an array:
arr = []
a = []
for x in range(10): #width
    for y in range(10): #height
        a.append(y) #you can also append other data if you want it to be empty; this just makes it 0-9
    arr.append(a) #add the row 0-9
    a = [] #start a new inner list
Here, I re-created the same array (a) 10 times, so it's kind of inefficient, but the point is that you can use the same structure with custom data input to make your own 2D array.
Another way to get the exact same 2D array is list comprehension
arr = [[x for x in range(10)] for y in range(10)]
This is probably what you were trying to do with the code you provided, which is, as mentioned in the comments, syntactically incorrect.
To print, just tweak the code you have to have 2 loops: one for x and one for y:
for x in range(5):
    for y in range(5):
        print(arr[x][y])
I still see errors in your code:
In your first function, since x and y are your inputs, you want to USE them in your list comprehension. You're not using them in your code:
def twoDList(x, y):
    arr = [[x for x in range(6)] for y in range(6)] # x = height and y = width
    return arr
In your example, no matter what the value of x or y is, you're getting a 6x6 grid. You want to use x and y to replace the fixed values you have there (HINT: change your '6'). I won't do that for you.
In your print function, you might want to use two variables, one per dimension, as indexes.
Also, don't use fixed values here; get them from your input (I'm guessing this is homework, so I won't post the whole code):
def displayGrid(arr):
    for i in range(0, 5):
        for j in range(0, 5):
            print(arr[i][j])
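For reference, here is one way the two functions described in the question could look. This is a sketch rather than the only valid answer: the names make_grid and display_grid are illustrative, and printing one row per line with space-separated values is an assumed layout.

```python
def make_grid(n):
    """Return an n x n list of lists filled with 0."""
    # A fresh inner list is built for every row, so rows are independent.
    return [[0 for _ in range(n)] for _ in range(n)]


def display_grid(grid):
    """Print the grid one row per line; returns nothing."""
    for row in grid:
        print(" ".join(str(value) for value in row))


display_grid(make_grid(3))
```

Sizes are taken from the arguments (n, and the rows of the grid itself) instead of fixed numbers, which is the point made in the hints above.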

Iterating efficiently through indices of arbitrary order array

Say I have an arbitrary array of variable order N. For example:
A is a 2x3x3 array, i.e. an order-3 array with dimensions 2, 3, and 3 along its three indices.
I would like to efficiently loop through each element. If I knew a priori the order then I could do something like (in python),
#for order 3
import numpy as np
shape = np.shape(A)
i = 0
while i < shape[0]:
    j = 0
    while j < shape[1]:
        k = 0
        while k < shape[2]:
            #code using i,j,k
            k += 1
        j += 1
    i += 1
Now suppose I don't know the order of A, i.e. I don't know a priori the length of shape. How can I permute the quickest through all elements of the array?
There are many ways to do this, e.g. iterating over a.ravel() or a.flat. However, looping over every single element of an array in a Python loop will never be particularly efficient.
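If the index tuple is needed alongside each value, NumPy's np.ndenumerate handles arrays of any order. A minimal sketch follows; the wrapper name iterate_with_indices and the sample array are only illustrative.

```python
import numpy as np


def iterate_with_indices(A):
    """Yield (index_tuple, value) pairs for an array of any order."""
    for index, value in np.ndenumerate(A):
        yield index, value


A = np.arange(18).reshape(2, 3, 3)
pairs = list(iterate_with_indices(A))
print(len(pairs))  # one pair per element: 2 * 3 * 3 = 18
```

np.ndindex(*A.shape) yields only the index tuples, which is useful when the loop body indexes several arrays of the same shape.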
I don't think it matters which index you choose to permute over first, which index you choose to permute over second, etc. because your inner-most while statement will always be executed once per combination of i, j, and k.
If you need to keep the results of your operation (and assuming it's a function of A and i,j,k), you'd want to use something like this:
import itertools
import numpy as np
results = ((position, code(A, position))
           for position in itertools.product(*(range(i) for i in np.shape(A))))
Then you can iterate the results getting out the position and return value of code for each position. Or convert the generator expression to a list if you need to access the results multiple times.
If the array is of the format array = [[[1,2,3,4],[1,2]],[[1],[1,2,3]]]
You could use the following structure:
array = [[[1,2,3,4],[1,2]],[[1],[1,2,3]]]
indices = []
def iter_array(array,indices):
    indices.append(0)
    for a in array:
        if isinstance(a[0],list):
            iter_array(a,indices)
        else:
            indices.append(0)
            for nonlist in a:
                #do something using each element in indices
                #print(indices)
                indices.append(indices.pop()+1)
            indices.pop()
        indices.append(indices.pop()+1)
    indices.pop()
iter_array(array,indices)
This should work for the usual nested list "arrays". I don't know if it would be possible to mimic this using numpy's array structure.

How is this 2D array being sized by FOR loops?

Question background:
This is the first piece of Python code I've looked at and as such I'm assuming that my thread title is correct in explaining what this code is actually trying to achieve i.e setting a 2D array.
The code:
The code I'm looking at sets the size of a 2D array based on two for loops:
n = len(sentences)
values = [[0 for x in xrange(n)] for x in xrange(n)]
for i in range(0, n):
    for j in range(0, n):
        values[i][j] = self.sentences_intersection(sentences[i], sentences[j])
I could understand it if each side of the array was set using the length property of the sentences variable, unless this is in effect what xrange is doing by using the loop size based on the length?
Any help with explaining how the array is being set would be great.
This code is actually a bit redundant.
Firstly you need to realize that values is not an array, it is a list. A list is a dynamically sized one-dimensional structure.
The second line of the code uses a nested list comprehension to create one list of size n, each element of which is itself a list consisting of n zeros.
The second loop goes through this list of lists, and sets each element according to whatever sentences_intersection does.
The reason this is redundant is because lists don't need to be pre-allocated. Rather than doing two separate iterations, really the author should just be building up the lists with the correct values, then appending them.
This would be better:
n = len(sentences)
values = []
for i in range(0, n):
    inner = []
    for j in range(0, n):
        inner.append(self.sentences_intersection(sentences[i], sentences[j]))
    values.append(inner)
but you could actually do the whole thing in a list comprehension if you wanted (the outer loop runs over i so that values[i][j] matches the loops above):
values = [[self.sentences_intersection(sentences[i], sentences[j]) for j in xrange(n)] for i in xrange(n)]
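To see how the nested comprehension lays out rows and columns, here is a small self-contained sketch in Python 3 (range instead of xrange). The helper shared_words is a hypothetical stand-in for sentences_intersection, whose real behaviour is not shown in the question, and pairwise_matrix is an illustrative name.

```python
def shared_words(a, b):
    # Hypothetical stand-in for sentences_intersection: counts common words.
    return len(set(a.split()) & set(b.split()))


def pairwise_matrix(items, score):
    """Build an n x n list of lists where entry [i][j] is score(items[i], items[j])."""
    n = len(items)
    # The inner comprehension builds one row (fixed i, varying j).
    return [[score(items[i], items[j]) for j in range(n)] for i in range(n)]


sentences = ["the cat sat", "the dog sat", "a bird flew"]
values = pairwise_matrix(sentences, shared_words)
print(values)  # [[3, 2, 0], [2, 3, 0], [0, 0, 3]]
```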

How to pick the largest number in a matrix of lists in python?

I have a list-of-list-of-lists, where the first two act as a "matrix", where I can access the third list as
list3 = m[x][y]
and the third list contains a mix of strings and numbers, but each list has the same size & structure. Let's call a specific entry in this list The Number of Interest. This number always has the same index in this list!
What's the fastest way to get the 'coordinates' (x,y) for the list that has the largest Number of Interest in Python?
Thank you!
(So really, I'm trying to pick the largest number in m[x][y][k] where k is fixed, for all x & y, and 'know' what its address is)
max((cell[k], x, y)
    for (x, row) in enumerate(m)
    for (y, cell) in enumerate(row))[1:]
Also, you can assign the result directly to a couple of variables:
(_, x, y) = max((cell[k], x, y)
                for (x, row) in enumerate(m)
                for (y, cell) in enumerate(row))
This is O(n^2), btw.
import itertools
indexes = itertools.product(range(len(m)), range(len(m[0])))
print(max(indexes, key = lambda x: m[x[0]][x[1]][k]))
or using numpy
import numpy
data = numpy.array([[cell[k] for cell in row] for row in m])
print(numpy.unravel_index(numpy.argmax(data), data.shape))
If you are interested in speeding up operations in Python, you really need to look at numpy.
Assuming "The Number of Interest" is in a known spot in the list, and there will be a nonzero maximum,
maxCoords = [-1, -1]
maxNumOfInterest = -1
rowIndex = 0
for row in m:
    colIndex = 0
    for entry in row:
        if entry[indexOfNum] > maxNumOfInterest:
            maxNumOfInterest = entry[indexOfNum]
            maxCoords = [rowIndex,colIndex]
        colIndex += 1
    rowIndex += 1
is a naive method that is O(n^2) in the side length of the matrix. Since you have to check every element, no asymptotically faster solution is possible.
@Marcelo's method is more succinct, but perhaps less readable.
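A small worked example may help in comparing the approaches. The sketch below uses max with a key function over coordinate pairs; the name argmax_coords and the sample data are illustrative, and it assumes every row has at least one cell and that the entry at index k is numeric.

```python
def argmax_coords(m, k):
    """Return (x, y) such that m[x][y][k] is the largest entry at index k."""
    coords = (
        (x, y)
        for x, row in enumerate(m)
        for y, _ in enumerate(row)
    )
    # max returns the first maximal pair when several cells tie.
    return max(coords, key=lambda xy: m[xy[0]][xy[1]][k])


m = [
    [["a", 3], ["b", 7]],
    [["c", 5], ["d", 1]],
]
print(argmax_coords(m, 1))  # (0, 1)
```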
