Find highest values in n unequal lists - python

I have a list that contains n lists of different lengths.
data = [
[1, 2, 3, 4, 5, 6, 7, 8],
[2, 6, 3, 5, 9, 1, 1, 1, 2, 4, 5],
[8, 1, 4, 1, 2, 3, 4, 2, 5],
[3, 9, 1, 2, 2, 1, 1, 5, 9, 3]
]
How can I efficiently compare them and generate a list which always contains the highest value at the current position?
I don't know how I can do this since the boundaries for each list are different.
The output for the above example should be a list with these values:
[8,9,4,5,9,6,7,8,9,4,5]

The most idiomatic approach would be to transpose the 2D list and call max on each row of the transposed list. But you're dealing with ragged lists, so zip cannot be applied directly (it stops at the shortest list).
Instead, use itertools.zip_longest (izip_longest in Python 2), and then apply max using map -
from itertools import zip_longest
r = list(map(max, zip_longest(*data, fillvalue=-float('inf'))))
Or, using @Peter DeGlopper's suggestion, with a list comprehension -
r = [max(x) for x in zip_longest(*data, fillvalue=-float('inf'))]
print(r)
[8, 9, 4, 5, 9, 6, 7, 8, 9, 4, 5]
Here, I use a fillvalue parameter to fill missing values with negative infinity. The intermediate result looks something like this -
list(zip_longest(*data, fillvalue=-float('inf')))
[(1, 2, 8, 3),
(2, 6, 1, 9),
(3, 3, 4, 1),
(4, 5, 1, 2),
(5, 9, 2, 2),
(6, 1, 3, 1),
(7, 1, 4, 1),
(8, 1, 2, 5),
(-inf, 2, 5, 9),
(-inf, 4, -inf, 3),
(-inf, 5, -inf, -inf)]
Now, applying max becomes straightforward - just do it over each row and you're done.

zip_longest is your friend in this case.
from itertools import zip_longest
data = [
[1, 2, 3, 4, 5, 6, 7, 8],
[2, 6, 3, 5, 9, 1, 1, 1, 2, 4, 5],
[8, 1, 4, 1, 2, 3, 4, 2, 5],
[3, 9, 1, 2, 2, 1, 1, 5, 9, 3],
]
output = []
# fillvalue=0 is safe here only because every value in data is positive
for x in zip_longest(*data, fillvalue=0):
    output.append(max(x))
print(output)
[8, 9, 4, 5, 9, 6, 7, 8, 9, 4, 5]
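Padding with 0 works for the sample data, but it can give a wrong answer once the lists contain negative numbers. The following is a minimal sketch of that difference; the function name positional_max and the negative_data sample are made up for illustration and are not part of the answer above.

```python
from itertools import zip_longest


def positional_max(lists, fill=float("-inf")):
    """Return the largest value found at each index across lists of unequal length."""
    return [max(column) for column in zip_longest(*lists, fillvalue=fill)]


negative_data = [[-5, -2], [-3, -7, -1]]

# Padding with negative infinity never beats a real value.
print(positional_max(negative_data))          # [-3, -2, -1]

# Padding with 0 wins at index 2, where only -1 is a real value.
print(positional_max(negative_data, fill=0))  # [-3, -2, 0]
```

If the values are known to be non-negative, either fill value gives the same result.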

Adding a pandas solution
import pandas as pd
pd.DataFrame(data).max().astype(int).tolist()
Out[100]: [8, 9, 4, 5, 9, 6, 7, 8, 9, 4, 5]

You don't need any external module; a little logic is enough:
data = [
[1, 2, 3, 4, 5, 6, 7, 8],
[2, 6, 3, 5, 9, 1, 1, 1, 2, 4, 5],
[8, 1, 4, 1, 2, 3, 4, 2, 5],
[3, 9, 1, 2, 2, 1, 1, 5, 9, 3]
]
new_data = {}
for j in data:
    for k, m in enumerate(j):
        if k not in new_data:
            new_data[k] = [m]
        else:
            new_data[k].append(m)
final_data = [0] * len(new_data)
for key, value in new_data.items():
    final_data[key] = max(value)
print(final_data)
output:
[8, 9, 4, 5, 9, 6, 7, 8, 9, 4, 5]
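The same import-free idea can probably be written without the intermediate dictionary, by keeping one running maximum per position. This is only a sketch; positional_max is a hypothetical name chosen here.

```python
def positional_max(lists):
    """Largest value seen at each index, for lists of unequal length."""
    result = []
    for row in lists:
        for i, value in enumerate(row):
            if i == len(result):
                # First time this index is seen: start with the current value.
                result.append(value)
            elif value > result[i]:
                result[i] = value
    return result


data = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2, 6, 3, 5, 9, 1, 1, 1, 2, 4, 5],
    [8, 1, 4, 1, 2, 3, 4, 2, 5],
    [3, 9, 1, 2, 2, 1, 1, 5, 9, 3],
]
print(positional_max(data))  # [8, 9, 4, 5, 9, 6, 7, 8, 9, 4, 5]
```

Because no padding value is involved, this version also behaves correctly for negative numbers.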

You can use itertools.izip_longest (itertools.zip_longest in Python3):
Python2:
import itertools
data = [
[1, 2, 3, 4, 5, 6, 7, 8],
[2, 6, 3, 5, 9, 1, 1, 1, 2, 4, 5],
[8, 1, 4, 1, 2, 3, 4, 2, 5],
[3, 9, 1, 2, 2, 1, 1, 5, 9, 3],
]
# filter drops the None padding (and any other falsy value, such as 0)
new_data = [max(filter(lambda x: x, i)) for i in itertools.izip_longest(*data)]
Output:
[8, 9, 4, 5, 9, 6, 7, 8, 9, 4, 5]
Python3:
import itertools
data = [
[1, 2, 3, 4, 5, 6, 7, 8],
[2, 6, 3, 5, 9, 1, 1, 1, 2, 4, 5],
[8, 1, 4, 1, 2, 3, 4, 2, 5],
[3, 9, 1, 2, 2, 1, 1, 5, 9, 3],
]
# filter(None, ...) drops the None padding (and any other falsy value, such as 0)
new_data = [max(filter(None, i)) for i in itertools.zip_longest(*data)]
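One thing to watch with filter(None, ...): it removes every falsy value, so a real 0 in the data is discarded together with the None padding, and a column that contains only zeros and padding makes max raise a ValueError. A sketch that tests for None explicitly (the function name and the sample data are made up for illustration):

```python
from itertools import zip_longest


def positional_max_skip_missing(lists):
    """Column-wise maximum that ignores only the None padding, not real zeros."""
    return [
        max(value for value in column if value is not None)
        for column in zip_longest(*lists)
    ]


data_with_zeros = [[0, -4], [-1, -9, 0]]
print(positional_max_skip_missing(data_with_zeros))  # [0, -4, 0]
```

For the question's data, which has no zeros, both variants return the same list.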


How to create an array with arrays in one function

I am trying to create an output that will be an array that contains 5 "sub-arrays". Every array should include 10 random numbers between 0 and 10.
I have this code:
def count_tweets():
    big_array = []
    for i in range(5):
        array = []
        for p in range(10):
            array.append(random.randint(0,10))
        big_array.append(array)
        print(big_array)
I get a result like:
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8]]
[[4, 2, 7, 1, 3, 2, 6, 9, 3, 10], [5, 10, 7, 10, 7, 2, 1, 4, 8, 3], [2, 7, 1, 3, 8, 5, 7, 6, 0, 0], [0, 1, 9, 9, 4, 2, 10, 4, 3, 8], [3, 7, 3, 5, 4, 0, 2, 8, 6, 2]]
But instead it should be like:
[[0,2,6,7,9,4,6,1,10,5],[1,3,5,9,8,7,6,9,0,10],[3,5,1,7,9,4,7,2,7,9],[10,2,8,5,6,9,2,3,5,9],[4,5,2,9,8,7,5,1,3,5]]
I cannot seem to get the indentation correct. How do I fix the code?
So what you did was put the print() statement inside a loop, which will print each time it runs.
import random
def count_tweets():
    big_array = []
    for i in range(5):
        array = []
        for p in range(10):
            array.append(random.randint(0,10))
        big_array.append(array)
    print(big_array)
count_tweets()
Hope this helps :)
You almost had it right: just move the print out of the for loop (delete the four spaces before print()).
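For what it's worth, the two nested loops can also be written as a nested list comprehension that returns the result instead of printing it. This is a sketch only; make_random_rows and its parameters are names invented here, not part of the question.

```python
import random


def make_random_rows(n_rows=5, n_cols=10, low=0, high=10):
    """Build n_rows lists, each holding n_cols random integers in [low, high]."""
    return [
        [random.randint(low, high) for _ in range(n_cols)]
        for _ in range(n_rows)
    ]


rows = make_random_rows()
print(rows)       # one list containing 5 sub-lists of 10 numbers each
print(len(rows))  # 5
```

Returning the list leaves it up to the caller whether and when to print it, which avoids the indentation problem from the question.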

Append arrays of different dimensions to get a single array

I have three vectors (numpy arrays), vector1, vector2 and vector3, as follows:
Dimension(vector1)=(200,2048)
Dimension(vector2)=(200,8192)
Dimension(vector3)=(200,32768)
I would like to append these vectors to get vector4:
Dimension(vector4)= (200,2048+8192+32768)= (200, 43008)
That is, vector1 first, then vector2, then vector3.
I tried the following:
vector4=numpy.concatenate((vector1,vector2,vector3),axis=0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
and
vector4=numpy.append(vector4,[vector1,vector2,vectors3],axis=0)
TypeError: append() missing 1 required positional argument: 'values'
I believe you are looking for numpy.hstack.
>>> import numpy as np
>>> a = np.arange(4).reshape(2,2)
>>> b = np.arange(6).reshape(2,3)
>>> c = np.arange(8).reshape(2,4)
>>> a
array([[0, 1],
[2, 3]])
>>> b
array([[0, 1, 2],
[3, 4, 5]])
>>> c
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
>>> np.hstack((a,b,c))
array([[0, 1, 0, 1, 2, 0, 1, 2, 3],
[2, 3, 3, 4, 5, 4, 5, 6, 7]])
The error message is pretty much telling you exactly what the problem is:
ValueError: all the input array dimensions except for the concatenation axis must match exactly
But you are doing the opposite: the sizes along the concatenation axis match exactly, while the sizes along the other axis don't. Consider:
In [3]: arr1 = np.random.randint(0,10,(20, 5))
In [4]: arr2 = np.random.randint(0,10,(20, 3))
In [5]: arr3 = np.random.randint(0,10,(20, 11))
Note the dimensions. Just give it the correct axis. So use the second rather than the first:
In [8]: arr1.shape, arr2.shape, arr3.shape
Out[8]: ((20, 5), (20, 3), (20, 11))
In [9]: np.concatenate((arr1, arr2, arr3), axis=1)
Out[9]:
array([[3, 1, 4, 7, 3, 6, 1, 1, 6, 7, 4, 6, 8, 6, 2, 8, 2, 5, 0],
[4, 2, 2, 1, 7, 8, 0, 7, 2, 2, 3, 9, 8, 0, 7, 3, 5, 9, 6],
[2, 8, 9, 8, 5, 3, 5, 8, 5, 2, 4, 1, 2, 0, 3, 2, 9, 1, 0],
[6, 7, 3, 5, 6, 8, 3, 8, 4, 8, 1, 5, 4, 4, 6, 4, 0, 3, 4],
[3, 5, 8, 8, 7, 7, 4, 8, 7, 3, 8, 7, 0, 2, 8, 9, 1, 9, 0],
[5, 4, 8, 3, 7, 8, 3, 2, 7, 8, 2, 4, 8, 0, 6, 9, 2, 0, 3],
[0, 0, 1, 8, 6, 4, 4, 4, 2, 8, 4, 1, 4, 1, 3, 1, 5, 5, 1],
[1, 6, 3, 3, 9, 2, 3, 4, 9, 2, 6, 1, 4, 1, 5, 6, 0, 1, 9],
[4, 5, 4, 7, 1, 4, 0, 8, 8, 1, 6, 0, 4, 6, 3, 1, 2, 5, 2],
[6, 4, 3, 2, 9, 4, 1, 7, 7, 0, 0, 5, 9, 3, 7, 4, 5, 6, 1],
[7, 7, 0, 4, 1, 9, 9, 1, 0, 1, 8, 3, 6, 0, 5, 1, 4, 0, 7],
[7, 9, 0, 4, 0, 5, 5, 9, 8, 9, 9, 7, 8, 8, 2, 6, 2, 3, 1],
[4, 1, 6, 5, 4, 5, 6, 7, 9, 2, 5, 8, 6, 6, 6, 8, 2, 3, 1],
[7, 7, 8, 5, 0, 8, 5, 6, 4, 4, 3, 5, 9, 8, 7, 9, 8, 8, 1],
[3, 9, 3, 6, 3, 2, 2, 4, 0, 1, 0, 4, 3, 0, 1, 3, 4, 1, 3],
[5, 1, 9, 7, 1, 8, 3, 9, 4, 7, 6, 7, 4, 7, 0, 1, 2, 8, 7],
[6, 3, 8, 0, 6, 2, 1, 8, 1, 0, 0, 3, 7, 2, 1, 5, 7, 0, 7],
[5, 4, 7, 5, 5, 8, 3, 2, 6, 1, 0, 4, 6, 9, 7, 3, 9, 2, 5],
[1, 4, 8, 5, 7, 2, 0, 2, 6, 2, 6, 5, 5, 4, 6, 1, 8, 8, 1],
[4, 4, 5, 6, 2, 6, 0, 5, 1, 8, 4, 5, 8, 9, 2, 1, 0, 4, 2]])
In [10]: np.concatenate((arr1, arr2, arr3), axis=1).shape
Out[10]: (20, 19)
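Applied to the shapes from the question, the same call would look roughly like this. The arrays are filled with placeholder values (zeros, ones, twos) and a small dtype purely for illustration; the real data is not shown in the question.

```python
import numpy as np

# Placeholder arrays with the shapes given in the question.
vector1 = np.zeros((200, 2048), dtype=np.int8)
vector2 = np.ones((200, 8192), dtype=np.int8)
vector3 = np.full((200, 32768), 2, dtype=np.int8)

# Join along axis 1 (columns); the row count (200) must match in all inputs.
vector4 = np.concatenate((vector1, vector2, vector3), axis=1)
print(vector4.shape)  # (200, 43008)

# np.hstack gives the same result for 2D inputs.
print(np.array_equal(vector4, np.hstack((vector1, vector2, vector3))))  # True
```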

Numpy create index/slicing programmatically from array

I can use numpy.mgrid as follows:
a = numpy.mgrid[x0:x1, y0:y1] # 2 dimensional
b = numpy.mgrid[x0:x1, y0:y1, z0:z1] # 3 dimensional
Now, I'd like to create the expression in brackets programmatically, because I do not know whether I have 1, 2, 3 or more dimensions. I'm looking for something like:
shape = np.array([[x0, x1], [y0, y1], ... maybe more dimensions ...])
idx = (s[0]:s[1] for s in shape)
a = numpy.mgrid[idx]
That gives at least a syntax error in the second line. How can I properly generate those indices/slices programmatically? (The mgrid here is rather an example/use case, the question is really about indexing in general.)
Use the slice object. For example:
shape = np.array([[0, 10], [0, 10]])
idx = tuple(slice(s[0],s[1], 1) for s in shape)
#yields the following
#(slice(0, 10, 1), slice(0, 10, 1))
np.mgrid[idx]
yields
array([[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7, 7, 7],
[8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]],
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]])
Alternatively, you could use the NumPy shorthand np.s_, e.g. np.s_[0:10:1], instead of slice(0, 10, 1); the two are equivalent objects.
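Since the question is about indexing in general, it may help to see what the bracket syntax actually passes along. The sketch below uses plain Python only; ShowIndex is a hypothetical helper written for this illustration that hands back whatever the brackets produced, which is the same idea np.s_ is built on.

```python
class ShowIndex:
    """Tiny helper that returns whatever object the brackets produced."""

    def __getitem__(self, index):
        return index


show = ShowIndex()
print(show[0:10])         # slice(0, 10, None)
print(show[0:10, 2:5:1])  # (slice(0, 10, None), slice(2, 5, 1))

# Building the same index programmatically from (start, stop) pairs.
bounds = [(0, 10), (2, 5)]
idx = tuple(slice(start, stop) for start, stop in bounds)
print(idx == show[0:10, 2:5])  # True
```

A tuple of slice objects built this way can be used anywhere the literal a:b, c:d syntax is accepted, for any number of dimensions.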

Adding an array to the end of another Python

I'm very new to Python and have been given the task of collecting several arrays into another array, inside a loop.
So if you had
a = np.array([2,3,4,3,4,4,5,3,2,3,4])
and
b = np.array([1,1,1,1,1,2,23,2,3,3,3])
and
c = np.array([])
and wanted the result
c = [[2,3,4,3,4,4,5,3,2,3,4],
[1,1,1,1,1,2,23,2,3,3,3]]
so that c[0,:] would give [2,3,4,3,4,4,5,3,2,3,4].
I tried c = [c, np.array(a)], and then c = [c, np.array(b)] on the next iteration,
but when I do c[0,:] I get the error message "list indices must be integers, not tuple".
EDIT:
When I print c, it gives [array([2,3,4,3,4,4,5,3,2,3,4], dtype=uint8)]
Do you have any ideas?
In [10]: np.vstack((a,b))
Out[10]:
array([[ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4],
[ 1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]])
EDIT: Here's an example of using it in a loop to gradually build a matrix:
In [14]: c = np.random.randint(0, 10, 10)
In [15]: c
Out[15]: array([9, 5, 9, 7, 3, 0, 1, 9, 2, 0])
In [16]: for _ in range(10):  # xrange in Python 2
   ....:     c = np.vstack((c, np.random.randint(0, 10, 10)))
   ....:
In [17]: c
Out[17]:
array([[9, 5, 9, 7, 3, 0, 1, 9, 2, 0],
[0, 8, 1, 9, 7, 5, 4, 2, 1, 2],
[2, 1, 4, 2, 9, 6, 7, 1, 3, 2],
[6, 0, 7, 9, 1, 9, 8, 5, 9, 8],
[8, 1, 0, 9, 6, 6, 6, 4, 8, 5],
[0, 0, 5, 0, 6, 9, 9, 4, 6, 9],
[4, 0, 9, 8, 6, 0, 2, 2, 7, 0],
[1, 3, 4, 8, 2, 2, 8, 7, 7, 7],
[0, 0, 4, 8, 3, 6, 5, 6, 5, 7],
[7, 1, 3, 8, 6, 0, 0, 3, 9, 0],
[8, 5, 7, 4, 7, 2, 4, 8, 6, 7]])
The most NumPythonic way is to use np.array:
>>> c = np.array((a,b))
>>>
>>> c
array([[ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4],
[ 1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]])
You may try this:
>>> c = [list(a), list(b)]
>>> c
[[2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4], [1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]]
You can concatenate arrays in numpy. For this to work, they must have the same size in all dimensions except the concatenation direction.
If you just say
>>> c = np.concatenate([a,b])
you will get
>>> c
array([ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4, 1, 1, 1, 1, 1, 2,
23, 2, 3, 3, 3])
So in order to achieve what you want you first have to add another dimension to your vectors a and b like so
>>> a[None,:]
array([[2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4]])
or equivalently
>>> a[np.newaxis,:]
array([[2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4]])
So you could do the following:
>>> c = np.concatenate([a[None,:],b[None,:]],axis = 0)
>>> c
array([[ 2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4],
[ 1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3]])
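When the rows are produced inside a loop, as in the question, one option is to collect them in an ordinary Python list and build the 2D array once at the end, because every call to np.vstack or np.concatenate copies all the data gathered so far. A minimal sketch, where the tuple (a, b) stands in for whatever the real loop produces:

```python
import numpy as np

a = np.array([2, 3, 4, 3, 4, 4, 5, 3, 2, 3, 4])
b = np.array([1, 1, 1, 1, 1, 2, 23, 2, 3, 3, 3])

rows = []
for row in (a, b):  # stand-in for the real loop that produces each array
    rows.append(row)

# Build the 2D array once, after the loop has finished.
c = np.vstack(rows)
print(c.shape)  # (2, 11)
print(c[0, :])  # [2 3 4 3 4 4 5 3 2 3 4]
```

This only works if every collected row has the same length.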

Adding the product of an element of a list to an existing list in Python

I am trying to take a list of lists (like shown below)
list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
compute the product of all the elements of each list, and append the result back onto the original list.
So, for example, if I were to take the list I posted above, what I would want it to look like is this:
list_2 = [[5000940,[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3]],
[0,[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0]],
[0,[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6]],
[0,[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2]],
[0,[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]]
The code that I have written so far takes in the list and outputs the products, but I can't seem to get them properly appended to the existing list. I was hoping someone could show me how to do this.
for i in range(len(list)):
    global products
    products = []
    list_prod = reduce(mul, list[i], 1)
    #products.append(list_prod)
print products
Here's one way to do it:
from functools import reduce  # a builtin in Python 2, must be imported in Python 3
from operator import mul
from pprint import pprint
lst = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
lst[:] = map(lambda e: [reduce(mul, e, 1), e], lst)
pprint(lst)
list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
[[reduce(lambda x, y: x * y, line)] + line for line in list]
Gives me
[[5000940, 7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[0, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[0, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[0, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[0, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
If you are like me and find list comprehensions difficult to read (especially when coming back to them later), you may find the code below useful. Also, avoid using "list" as a variable name, since it shadows the built-in list type.
num_list = [[7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
[3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
[1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
[6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
[7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]
num_list_fin = []
for num_item in num_list:
    # reduce is a builtin in Python 2; in Python 3 use: from functools import reduce
    num_item_u = [reduce(lambda x, y: x * y, num_item)]
    num_item_u.append(num_item)
    num_list_fin.append(num_item_u)
print(num_list_fin)
This would give the output:
[[5000940, [7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3]], [0, [3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0]], [0, [1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6]], [0, [6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2]], [0, [7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4]]]
It may make things clearer if you use a helper function.
def listprod(lst):
    return reduce(mul, lst, 1)  # needs functools.reduce and operator.mul
print(list(zip(map(listprod, mylist), mylist)))  # mylist is the list of lists
Change the tuples to lists if you really need that.
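On Python 3.8 or newer, math.prod can stand in for the reduce/mul combination. The sketch below assumes that Python version; rows and with_products are names chosen here for illustration.

```python
from math import prod  # available from Python 3.8

rows = [
    [7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3],
    [3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0],
    [1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6],
    [6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2],
    [7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2, 4],
]

# Pair each row with the product of its elements, product first.
with_products = [[prod(row), row] for row in rows]

print(with_products[0][0])                   # 5000940
print([pair[0] for pair in with_products])   # [5000940, 0, 0, 0, 0]
```

Every row except the first contains a 0, which is why the remaining products are all 0.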
