Get the unique value of the different part of an array - python

I have an array with two rows, where each value in a row is repeated across 4 consecutive columns.
a = np.array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
[ 10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
I want to keep one value per group of 4 columns, for example a single 0 for the first 4 columns of the first row. I cannot use unique(). The desired output is:
b = np.array([[ 0,4, 7, 1],
[ 10,14, 17, 21]])

You can simply take every 4th column like so:
>>> a = np.array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
... [ 10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> a[:,::4]
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
For more info, see numpy slicing.

You can remove consecutive duplicates from a row (this version works on a Python list, since del is not supported on NumPy array elements):
def remove_duplicates(arr):
    """
    Remove consecutive duplicates from a list, in place.
    """
    if len(arr) == 0:
        return arr
    else:
        i = 0
        while i < len(arr) - 1:
            if arr[i] == arr[i + 1]:
                del arr[i]
            else:
                i += 1
        return arr
print(remove_duplicates([0,0,0,0,1,1,1,1,0,0,0,0]))
[0, 1, 0]
print(remove_duplicates([0,0,0,0,4,4,4,4,7,7,7,7,1,1,1,1]))
[0, 4, 7, 1]

Use np.apply_along_axis, which applies a function to 1-D slices along the given axis (here, each row):
>>> np.apply_along_axis(lambda x: x[::4], axis=1, arr=a)
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
Here, the function we pass in just takes every 4th element of the row (this assumes the group size is always 4).
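If the group size is not guaranteed, the assumption can be checked before slicing. The following is only a sketch of one way to do that: the helper name one_per_group and its group parameter are made up for illustration and are not part of NumPy.

```python
import numpy as np

def one_per_group(a, group=4):
    """Return one value per block of `group` columns (hypothetical helper)."""
    rows, cols = a.shape
    if cols % group != 0:
        raise ValueError("number of columns is not a multiple of the group size")
    # View each row as (cols // group) blocks of `group` values.
    blocks = a.reshape(rows, cols // group, group)
    # Every value in a block must equal that block's first value.
    if not (blocks == blocks[:, :, :1]).all():
        raise ValueError("a block contains differing values")
    return blocks[:, :, 0]

a = np.array([[0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
              [10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
result = one_per_group(a)
print(result)
```

When the check passes, the result is the same as a[:, ::4]; the only difference is that malformed input raises an error instead of silently returning a wrong answer.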

You could use itertools.groupby:
>>> import numpy as np
>>> from itertools import groupby
>>> a = np.array([[0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1], [10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> a
array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
[10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> b = np.array([[k for k, _ in groupby(arr)] for arr in a])
>>> b
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])

Related

Filtering nested lists with python conditions

I have a distance matrix and need to filter it based on another list before applying some functions.
The matrix has 10 elements that represent machines and the distances between them. I need to filter it so that only the distances between some chosen machines remain.
matrix = [[0, 1, 3, 17, 24, 12, 18, 16, 17, 15],
[1, 0, 2, 2, 5, 6, 13, 11, 12, 10],
[3, 2, 0, 1, 6, 12, 18, 12, 17, 15],
[17, 2, 1, 0, 3, 12, 17, 15, 16, 14],
[24, 5, 6, 3, 0, 1, 24, 22, 23, 21],
[12, 6, 12, 12, 1, 0, 12, 10, 11, 9],
[18, 13, 18, 17, 24, 12, 0, 3, 4, 5],
[16, 11, 12, 15, 22, 10, 3, 0, 1, 2],
[17, 12, 17, 16, 23, 11, 4, 1, 0, 1],
[15, 10, 15, 14, 21, 9, 5, 2, 1, 0]]
The list used for filtering, for example, is:
filter_list = [1, 2, 7, 10]
The idea is to use this list to filter both the rows and the column indices within each sublist to get the final matrix:
final_matrix = [[0, 1, 18, 15],
[1, 0, 13, 10],
[18, 13, 0, 5],
[15, 10, 5, 0]]
It is worth noting that the filter list elements vary. Can someone please help me?
That's what I tried:
final_matrix = []
for i in range(0, len(filter_list)):
    for j in range(0, len(filter_list[i])):
        a = filter_list[i][j]
        final_matrix.append(matrix[a-1])
print(final_matrix)
I wrote it this way because filter_list can have sublists. This is what I get:
final_matrix = [[0, 1, 3, 17, 24, 12, 18, 16, 17, 15],
[1, 0, 2, 2, 5, 6, 13, 11, 12, 10],
[18, 13, 18, 17, 24, 12, 0, 3, 4, 5],
[15, 10, 15, 14, 21, 9, 5, 2, 1, 0]]
I could not remove the extra elements.
You forgot to filter by column ids. You can do this using nested list comprehensions.
final_matrix = [[matrix[row-1][col-1] for col in filter_list] for row in filter_list]
With explicit loops:
final_matrix = []
for i in filter_list:
    to_append = []
    for j in filter_list:
        to_append.append(matrix[i-1][j-1])
    final_matrix.append(to_append)
or with a list comprehension:
final_matrix = [[matrix[i-1][j-1] for j in filter_list] for i in filter_list]
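If NumPy is available, the same row-and-column selection can probably be expressed with np.ix_, which builds an open mesh from index sequences. This is a sketch that assumes filter_list is a flat list of 1-based machine numbers, as in the question's example; the variable idx is introduced here for illustration.

```python
import numpy as np

matrix = [[0, 1, 3, 17, 24, 12, 18, 16, 17, 15],
          [1, 0, 2, 2, 5, 6, 13, 11, 12, 10],
          [3, 2, 0, 1, 6, 12, 18, 12, 17, 15],
          [17, 2, 1, 0, 3, 12, 17, 15, 16, 14],
          [24, 5, 6, 3, 0, 1, 24, 22, 23, 21],
          [12, 6, 12, 12, 1, 0, 12, 10, 11, 9],
          [18, 13, 18, 17, 24, 12, 0, 3, 4, 5],
          [16, 11, 12, 15, 22, 10, 3, 0, 1, 2],
          [17, 12, 17, 16, 23, 11, 4, 1, 0, 1],
          [15, 10, 15, 14, 21, 9, 5, 2, 1, 0]]
filter_list = [1, 2, 7, 10]

# Convert the 1-based machine numbers to 0-based indices.
idx = np.asarray(filter_list) - 1
# np.ix_ pairs every selected row with every selected column.
final_matrix = np.asarray(matrix)[np.ix_(idx, idx)]
print(final_matrix.tolist())
```

If filter_list contains sublists, each sublist would need to be processed separately in the same way.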

Efficiently iterate over nested lists to find sum

I have an array of arrays and want to check whether the sum of each combination equals 40. The problem is that the product has around 270,000,000 elements, so processing it sequentially is out of the question. I cannot find the sums in a reasonable amount of time: I ran this program overnight and it was still running in the morning. How can I make this program more efficient so that it runs reasonably fast?
Here is my code so far:
import numpy as np
def cartesianProduct(arrays):
    la = arrays.shape[0]
    arr = np.empty([la] + [a.shape[0] for a in arrays], dtype="int32")
    for i, a in enumerate(np.ix_(*arrays)):
        arr[i, ...] = a
    return arr.reshape(la, -1).T
rows = np.array(
    [
        [2, 15, 23, 19, 3, 2, 3, 27, 20, 11, 27, 10, 19, 10, 13, 10],
        [22, 9, 5, 10, 5, 1, 24, 2, 10, 9, 7, 3, 12, 24, 10, 9],
        [16, 0, 17, 0, 2, 0, 2, 0, 10, 0, 15, 0, 6, 0, 9, 0],
        [11, 27, 14, 5, 5, 7, 8, 24, 8, 3, 6, 15, 22, 6, 1, 1],
        [10, 0, 2, 0, 22, 0, 2, 0, 17, 0, 15, 0, 14, 0, 5, 0],
        [1, 6, 10, 6, 10, 2, 6, 10, 4, 1, 5, 5, 4, 8, 6, 3],
        [6, 0, 13, 0, 3, 0, 3, 0, 6, 0, 10, 0, 10, 0, 10, 0],
    ],
    dtype="int32",
)
product = cartesianProduct(rows)
combos = []
for row in product:
    if sum(row) == 40:
        combos.append(row)
print(combos)
I believe what you are trying to do is a variant of the subset sum problem, which is NP-hard in general. Look into "dynamic programming" and "subset sum".
Examples:
https://www.geeksforgeeks.org/subset-sum-problem-dp-25/
https://www.techiedelight.com/subset-sum-problem/
As suggested in the comments, one way to optimize this is to stop extending a partial combination as soon as its sum already exceeds your threshold (40 in this case). As a further optimization, you can sort the arrays incrementally from largest to smallest.
Check heapq.nlargest() for incremental partial sorting.
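The pruning idea can be sketched as a recursive search that picks one value per row and abandons a branch once the remaining budget is exceeded. This is only an illustration, not a tuned solution: the function name combos_with_sum is hypothetical, it assumes all values are non-negative (as in the question's data), and it sorts each row in ascending order so that the inner loop can stop early.

```python
def combos_with_sum(rows, target):
    """Yield tuples with one value per row whose total equals target.

    Hypothetical helper; assumes every value is non-negative.
    """
    sorted_rows = [sorted(int(v) for v in row) for row in rows]

    def search(i, remaining, chosen):
        if i == len(sorted_rows):
            if remaining == 0:
                yield tuple(chosen)
            return
        for v in sorted_rows[i]:
            if v > remaining:
                break  # later values in this row are at least as large
            chosen.append(v)
            yield from search(i + 1, remaining - v, chosen)
            chosen.pop()

    yield from search(0, target, [])

small = [[1, 2], [3, 4]]
print(list(combos_with_sum(small, 5)))  # [(1, 4), (2, 3)]
```

Because it is a generator, matches can be consumed one at a time instead of building the full Cartesian product in memory first. Values that are repeated within a row produce repeated tuples, just as they would in the full product.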

Python - How to add n zeros randomly in an existing matrix?

I have this array that I generated using default_rng:
import numpy as np
from numpy.random import default_rng
rng = default_rng(seed=10)
rng = rng.integers(1,20,(5,10))
rng
>>>array([[15, 19, 6, 4, 16, 16, 10, 3, 16, 10],
[ 3, 3, 8, 14, 8, 16, 1, 9, 10, 19],
[ 5, 16, 2, 7, 15, 11, 18, 15, 18, 16],
[ 3, 18, 17, 3, 19, 15, 6, 3, 8, 18],
[15, 5, 10, 17, 13, 6, 3, 19, 5, 10]], dtype=int64)
I want to place 10 zeros at random positions in this matrix using a generator with seed=5.
I thought of creating a new array with dimensions [5,10] containing 10 zeros and ones everywhere else, and then multiplying the two arrays, but I have to use the generator, so I can't do this.
Try np.random.choice to choose the flat indices, then set the values at those indices to 0 (the example below uses seed 10 and zeroes 5 entries; change size to 10 to get ten zeros):
np.random.seed(10)
idx = np.random.choice(np.arange(5*10), size=5, replace=False)
rng.ravel()[idx] = 0
Output:
array([[15, 19, 6, 4, 16, 16, 10, 3, 16, 10],
[ 3, 3, 8, 14, 8, 16, 1, 9, 10, 19],
[ 5, 16, 2, 0, 15, 11, 18, 15, 18, 16],
[ 3, 18, 17, 3, 19, 15, 6, 0, 8, 18],
[15, 5, 0, 17, 0, 6, 3, 0, 5, 10]])
Of course
idx = np.random.choice(rng.ravel(), 10, replace= False)
print(idx)
rng.ravel()[idx] = 0
rng
Output
[10 17 3 6 15 15 15 16 15 15]
array([[15, 19, 6, 0, 16, 16, 0, 3, 16, 10],
[ 0, 3, 8, 14, 8, 0, 0, 0, 10, 19],
[ 5, 16, 2, 7, 15, 11, 18, 15, 18, 16],
[ 3, 18, 17, 3, 19, 15, 6, 3, 8, 18],
[15, 5, 10, 17, 13, 6, 3, 19, 5, 10]], dtype=int64)
So instead of getting 10 zeros I get only 6, because 15 appears five times in my idx.
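The duplicates come from sampling the array's values instead of its positions. A minimal sketch that samples 10 distinct flat positions with a seeded Generator follows; the names arr and zero_rng are introduced here for illustration, and the array is rebuilt with the same seed as in the question.

```python
import numpy as np
from numpy.random import default_rng

# Same data as in the question (values are in the range 1..19, so no zeros yet).
arr = default_rng(seed=10).integers(1, 20, (5, 10))

# A separate generator with seed=5 picks the positions to zero out.
zero_rng = default_rng(seed=5)
# Draw 10 distinct flat positions (indices, not values) out of arr.size.
idx = zero_rng.choice(arr.size, size=10, replace=False)

# .flat indexes the array as if it were 1-D and writes in place.
arr.flat[idx] = 0
print(arr)
```

With replace=False the positions cannot repeat, so exactly 10 entries become zero.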

Broadcasting 2D array in specific columns in Python

I have an array like this:
A = np.array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20]])
What I want to do is add 1 to each value in the first and last columns. I want to understand broadcasting (to avoid loops) by using it with an appropriate vector, but what I have tried doesn't work. Expected result:
A = np.array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
You can use numpy indexing to do this. Try this:
# 0 is the first and -1 is the last column
A[:,[0,-1]] = A[:,[0,-1]]+1
Or
A[:,(0,-1)] = A[:,(0,-1)]+1
Or
A[:,[0,-1]]+=1
Or
A[:,(0,-1)]+=1
Output in either case:
array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
You can use the vector [1,0,0,0,1] and NumPy will do the broadcasting for you.
b = np.array([1,0,0,0,1])
A + b
array([[ 2, 2, 3, 4, 6],
[ 7, 7, 8, 9, 11],
[12, 12, 13, 14, 16],
[17, 17, 18, 19, 21]])
If you would like to know how broadcasting works, you can simply do the broadcast manually once.
b = np.array([1,0,0,0,1])
B = np.tile(b,(A.shape[0],1))
array([[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[1, 0, 0, 0, 1]])
A + B
Same result.
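To see what NumPy does implicitly, without copying data as np.tile does, np.broadcast_to can be used. This is a small sketch; the names B, result and same are chosen here for illustration, and A is rebuilt with np.arange to match the array in the question.

```python
import numpy as np

A = np.arange(1, 21).reshape(4, 5)   # same values as A in the question
b = np.array([1, 0, 0, 0, 1])

# Shapes are compared from the right: (4, 5) and (5,) are compatible,
# so b is treated as if it were repeated for each of the 4 rows.
B = np.broadcast_to(b, A.shape)      # read-only view, no data is copied
print(B.shape)

result = A + b
same = np.array_equal(result, A + B)
print(same)
```

A vector of length 4 would not broadcast against A directly, because the trailing dimensions (5 and 4) differ; it would have to be reshaped to (4, 1) to act on rows instead.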

Mapping an array into another with zeros at the beginning and the end

I have a numpy array
a = np.arange(30).reshape(5,6)
and I want to map it into
b = np.zeros((a.shape[0],a.shape[1]+2))
but leaving the first and last columns as zeros
i.e.
b =
array([[0, 0, 1, 2, 3, 4, 5, 0],
. . .
[0, 24, 25, 26, 27, 28, 29, 0]])
Thanks
a = np.arange(30).reshape(5, 6)
b = np.zeros((a.shape[0], a.shape[1]+2), dtype=a.dtype)
b[:, 1:-1] = a
>>> b
array([[ 0, 0, 1, 2, 3, 4, 5, 0],
[ 0, 6, 7, 8, 9, 10, 11, 0],
[ 0, 12, 13, 14, 15, 16, 17, 0],
[ 0, 18, 19, 20, 21, 22, 23, 0],
[ 0, 24, 25, 26, 27, 28, 29, 0]])
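An alternative that avoids allocating b by hand is np.pad, shown here as a sketch. The pad widths say: no padding on the row axis, one column before and one after on the column axis.

```python
import numpy as np

a = np.arange(30).reshape(5, 6)
# ((0, 0), (1, 1)): pad nothing on axis 0, one zero column on each side of axis 1.
b = np.pad(a, ((0, 0), (1, 1)), mode="constant", constant_values=0)
print(b)
```

The result keeps the dtype of a, like the np.zeros version with dtype=a.dtype.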
