I have two index arrays like these:
import numpy as np
p_index = np.array([2, 0, 0, 2, 0, 2, 1, 2, 1])
m_index = np.array([0, 1, 1, 2, 1, 1, 2, 0, 0])
two value arrays like these:
p = np.array([17, 13, 16])
m = np.array([15, 14, 19])
and a third array like this:
t = np.array([18, 16, 14, 12, 11, 19, 11, 16, 11])
I need to do a for-loop like this:
for i in range(len(t)):
    newvalue = max(p[p_index[i]], m[m_index[i]]) + t[i]
    p[p_index[i]] = newvalue
    m[m_index[i]] = newvalue
I used 1-D arrays as an example, but p, m, t, and the index arrays are actually all matrices with the same number of rows, and I need to do this for every row. How can I do it without a for-loop?
Update:
With matrices as the example, it looks like this:
p_index = np.array([[2, 0, 0, 2, 0, 2, 1, 2, 1],
[0, 2, 0, 1, 1, 1, 1, 2, 2],
[1, 2, 0, 2, 1, 1, 0, 0, 2],
[0, 2, 1, 1, 1, 1, 0, 1, 1]])
m_index = np.array([[0, 1, 1, 2, 1, 1, 2, 0, 0],
[2, 2, 2, 2, 1, 0, 1, 1, 2],
[2, 2, 2, 0, 1, 0, 2, 2, 2],
[1, 0, 0, 1, 0, 2, 1, 2, 1]])
t=np.array([[18, 16, 14, 12, 11, 19, 11, 16, 11],
[10, 14, 18, 17, 14, 15, 18, 19, 17],
[18, 17, 18, 18, 10, 12, 17, 15, 14],
[15, 15, 16, 15, 19, 12, 13, 19, 17]])
p = np.array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]])
m = np.array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]])
row, col = np.shape(t)
for i in range(row):
    for j in range(col):
        newvalue = max(p[i][p_index[i][j]], m[i][m_index[i][j]]) + t[i][j]
        p[i][p_index[i][j]] = newvalue
        m[i][m_index[i][j]] = newvalue
f = np.max(p, axis=1).reshape(len(p), 1)
In the end, I only need the final f.
If we iterate without actually modifying p and m:
In [247]: for i in range(len(t)):
     ...:     print(max(p[p_index[i]],m[m_index[i]])+t[i])
     ...:
34
33
31
31
28
35
30
32
26
We can get the same values with a single numpy expression:
In [250]: np.maximum(p[p_index], m[m_index])+t
Out[250]: array([34, 33, 31, 31, 28, 35, 30, 32, 26])
But with your full loop, p and m elements are modified several times, each depending on a previous iteration:
In [258]: for i in range(len(t)):
     ...:     newvalue = max(p[p_index[i]],m[m_index[i]])+t[i]
     ...:     print(newvalue)
     ...:     p[p_index[i]] = newvalue
     ...:     m[m_index[i]] = newvalue
     ...:
34
33
47 # versus 31
46
58 # versus 28
77
57
93
104
In [259]: p,m
Out[259]: (array([ 58, 104, 93]), array([104, 77, 57]))
There are some numpy tools for performing things like cumulative sums, and working with duplicate indices, but you can't (readily) apply these to general functions such as your max(...)+t.
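Since the rows of the matrix version are independent of each other, one option is to keep the loop over columns (where each step depends on the previous one) and vectorize across rows instead. The following is only a sketch of that idea, not part of the answer above; the function name sequential_update is made up for illustration.

```python
import numpy as np

def sequential_update(p, m, p_index, m_index, t):
    """Run the sequential max(...)+t update on every row at once.

    Hypothetical helper: the loop over columns stays, because step j
    depends on step j-1, but each step handles all rows in one go.
    """
    p = np.array(p, copy=True)  # work on copies, leave the inputs alone
    m = np.array(m, copy=True)
    rows = np.arange(t.shape[0])
    for j in range(t.shape[1]):
        pj = p_index[:, j]
        mj = m_index[:, j]
        new = np.maximum(p[rows, pj], m[rows, mj]) + t[:, j]
        # Each row appears exactly once in `rows`, so these fancy-index
        # assignments never write to the same element twice.
        p[rows, pj] = new
        m[rows, mj] = new
    return p, m

# The 1-D example from the question, written as one-row matrices.
p1, m1 = sequential_update(
    np.array([[17, 13, 16]]),
    np.array([[15, 14, 19]]),
    np.array([[2, 0, 0, 2, 0, 2, 1, 2, 1]]),
    np.array([[0, 1, 1, 2, 1, 1, 2, 0, 0]]),
    np.array([[18, 16, 14, 12, 11, 19, 11, 16, 11]]),
)
# p1 holds 58, 104, 93 and m1 holds 104, 77, 57, as in the loop above.
f = np.max(p1, axis=1).reshape(len(p1), 1)
```

This only pays off when there are many more rows than columns, since the number of Python-level iterations equals the number of columns.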
I have an array with two rows, where each value in a row is repeated across 4 consecutive columns.
a = np.array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
[ 10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
I want to keep just one value for each group of 4 columns, for example a single 0 for the first 4 columns of the first row. I cannot use unique(). The desired output for a is:
b = np.array([[ 0,4, 7, 1],
[ 10,14, 17, 21]])
You can simply take every 4th column like so:
>>> a = np.array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
... [ 10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> a[:,::4]
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
For more info, see numpy slicing.
You can remove consecutive duplicates from a row (this version works on a Python list, since it uses del):
def remove_duplicates(arr):
    """
    Remove consecutive duplicates from a list, in place.
    """
    if len(arr) == 0:
        return arr
    else:
        i = 0
        while i < len(arr) - 1:
            if arr[i] == arr[i + 1]:
                del arr[i]
            else:
                i += 1
        return arr
print(remove_duplicates([0,0,0,0,1,1,1,1,0,0,0,0]))
[0, 1, 0]
print(remove_duplicates([0,0,0,0,4,4,4,4,7,7,7,7,1,1,1,1]))
[0, 4, 7, 1]
Use np.apply_along_axis, which applies a function to 1-D slices along the given axis (here, to each row):
>>> np.apply_along_axis(lambda x: x[::4], axis=1, arr=a)
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
Here, the function we pass in just takes every 4th element of the row (this assumes 4 is always static).
You could use itertools.groupby:
>>> import numpy as np
>>> from itertools import groupby
>>> a = np.array([[0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1], [10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> a
array([[ 0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
[10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
>>> b = np.array([[k for k, _ in groupby(arr)] for arr in a])
>>> b
array([[ 0, 4, 7, 1],
[10, 14, 17, 21]])
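The same run-collapsing idea can also be written with NumPy comparisons instead of groupby. This is a sketch rather than part of the answers above: collapse_runs is a made-up name, and stacking the results into a 2-D array assumes every row collapses to the same length, as it does for a.

```python
import numpy as np

def collapse_runs(row):
    """Keep the first element of each run of equal consecutive values."""
    row = np.asarray(row)
    if row.size == 0:
        return row
    # An element starts a new run if it differs from its left neighbour;
    # the first element always starts a run.
    keep = np.concatenate(([True], row[1:] != row[:-1]))
    return row[keep]

a = np.array([[0, 0, 0, 0, 4, 4, 4, 4, 7, 7, 7, 7, 1, 1, 1, 1],
              [10, 10, 10, 10, 14, 14, 14, 14, 17, 17, 17, 17, 21, 21, 21, 21]])
b = np.array([collapse_runs(r) for r in a])
```

Unlike a[:, ::4], this does not rely on every run having length 4, but it does merge two adjacent groups that happen to hold the same value.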
I have an array of arrays and want to find every combination (one element from each row) whose sum equals 40. The problem is that the Cartesian product has around 270,000,000 rows, so checking them sequentially is out of the question. I cannot compute the sums in a reasonable amount of time: I ran this program overnight and it was still running in the morning. How can I make this program more efficient so that it runs reasonably fast?
Here is my code so far:
import numpy as np
def cartesianProduct(arrays):
    la = arrays.shape[0]
    arr = np.empty([la] + [a.shape[0] for a in arrays], dtype="int32")
    for i, a in enumerate(np.ix_(*arrays)):
        arr[i, ...] = a
    return arr.reshape(la, -1).T
rows = np.array(
    [
        [2, 15, 23, 19, 3, 2, 3, 27, 20, 11, 27, 10, 19, 10, 13, 10],
        [22, 9, 5, 10, 5, 1, 24, 2, 10, 9, 7, 3, 12, 24, 10, 9],
        [16, 0, 17, 0, 2, 0, 2, 0, 10, 0, 15, 0, 6, 0, 9, 0],
        [11, 27, 14, 5, 5, 7, 8, 24, 8, 3, 6, 15, 22, 6, 1, 1],
        [10, 0, 2, 0, 22, 0, 2, 0, 17, 0, 15, 0, 14, 0, 5, 0],
        [1, 6, 10, 6, 10, 2, 6, 10, 4, 1, 5, 5, 4, 8, 6, 3],
        [6, 0, 13, 0, 3, 0, 3, 0, 6, 0, 10, 0, 10, 0, 10, 0],
    ],
    dtype="int32",
)
product = cartesianProduct(rows)
combos = []
for row in product:
    if sum(row) == 40:
        combos.append(row)
print(combos)
I believe what you are trying to do is a variant of the subset-sum problem, which is NP-hard in general. Look into "dynamic programming" and "subset sum".
Examples:
https://www.geeksforgeeks.org/subset-sum-problem-dp-25/
https://www.techiedelight.com/subset-sum-problem/
As suggested in the comments, one way to optimize this is to stop extending a partial combination as soon as its sum already exceeds your threshold (40 in this case); this is safe here because all values are non-negative. As a further optimization, you can sort the arrays incrementally from largest to smallest.
See heapq.nlargest() for incremental partial sorting.
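The pruning idea can be sketched as a depth-first search that picks one value per row and abandons a branch once the target is out of reach. This is an illustration under my own assumptions, not code from the answers above: combos_with_sum is a made-up name, and the extra lower-bound check (largest reachable sum) goes slightly beyond what was suggested.

```python
import numpy as np

def combos_with_sum(rows, target):
    """Return all picks of one value per row whose sum equals target."""
    rows = np.asarray(rows)
    # min_rest[k] / max_rest[k]: smallest / largest sum obtainable from rows[k:].
    min_rest = np.concatenate([np.cumsum(rows.min(axis=1)[::-1])[::-1], [0]])
    max_rest = np.concatenate([np.cumsum(rows.max(axis=1)[::-1])[::-1], [0]])
    found = []

    def search(k, partial, chosen):
        if k == len(rows):
            if partial == target:
                found.append(tuple(chosen))
            return
        for value in rows[k]:
            total = partial + int(value)
            # Prune: even the best remaining picks cannot hit the target.
            if total + min_rest[k + 1] > target or total + max_rest[k + 1] < target:
                continue
            chosen.append(int(value))
            search(k + 1, total, chosen)
            chosen.pop()

    search(0, 0, [])
    return found

small = [[1, 2], [3, 4], [5, 6]]
print(combos_with_sum(small, 10))  # [(1, 3, 6), (1, 4, 5), (2, 3, 5)]
```

Like the Cartesian-product version, this reports a combination once for every way it can be picked, so repeated values within a row produce repeated results. How much time it saves depends on how many branches the bounds cut off.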
I have a Numpy array that looks like
array([1, 2, 3, 4, 5, 6, 7, 8])
and I want to reshape it to an array
array([[5, 0, 0, 6],
[0, 1, 2, 0],
[0, 3, 4, 0],
[7, 0, 0, 8]])
More specifically, I'm trying to reshape a 2D numpy array to get a 3D Numpy array to go from
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16],
[17, 18, 19, 20, 21, 22, 23, 24],
...
[ 9, 10, 11, 12, 13, 14, 15, 16],
[89, 90, 91, 92, 93, 94, 95, 96]])
to a numpy array that looks like
array([[[ 5, 0, 0, 6],
[ 0, 1, 2, 0],
[ 0, 3, 4, 0],
[ 7, 0, 0, 8]],
[[13, 0, 0, 14],
[ 0, 9, 10, 0],
[ 0, 11, 12, 0],
[15, 0, 0, 16]],
...
[[93, 0, 0, 94],
[ 0, 89, 90, 0],
[ 0, 91, 92, 0],
[95, 0, 0, 96]]])
Is there an efficient way to do this using numpy functionality, particularly vectorized?
We can make use of slicing -
def expand(a): # a is 2D array
    out = np.zeros((len(a),4,4),dtype=a.dtype)
    out[:,1:3,1:3] = a[:,:4].reshape(-1,2,2)
    out[:,::3,::3] = a[:,4:].reshape(-1,2,2)
    return out
The benefit is memory efficiency, and hence performance: only the output occupies new memory. Because both the input and the output are accessed through slices, the intermediate steps work on views rather than copies.
Sample run -
2D input :
In [223]: a
Out[223]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16]])
In [224]: expand(a)
Out[224]:
array([[[ 5, 0, 0, 6],
[ 0, 1, 2, 0],
[ 0, 3, 4, 0],
[ 7, 0, 0, 8]],
[[13, 0, 0, 14],
[ 0, 9, 10, 0],
[ 0, 11, 12, 0],
[15, 0, 0, 16]]])
1D input (feed in 2D extended input with None) :
In [225]: a = np.array([1, 2, 3, 4, 5, 6, 7, 8])
In [226]: expand(a[None])
Out[226]:
array([[[5, 0, 0, 6],
[0, 1, 2, 0],
[0, 3, 4, 0],
[7, 0, 0, 8]]])
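The point about views can be checked with np.shares_memory. The snippet below is a small sketch that assumes a C-contiguous input like the sample above; for other memory layouts reshape may have to copy.

```python
import numpy as np

a = np.arange(1, 17).reshape(2, 8)  # same values as the 2D sample input
out = np.zeros((len(a), 4, 4), dtype=a.dtype)

inner = a[:, :4].reshape(-1, 2, 2)    # values for the central 2x2 block
corners = a[:, 4:].reshape(-1, 2, 2)  # values for the four corners

# True means the reshaped slice still points at a's buffer (no copy made).
inner_is_view = np.shares_memory(inner, a)
corners_is_view = np.shares_memory(corners, a)
# The assignment targets are views into out as well.
target_is_view = np.shares_memory(out[:, ::3, ::3], out)
print(inner_is_view, corners_is_view, target_is_view)
```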
The documentation of numpy.take states that a is indexed according to indices and axis, and that the result is optionally stored in the out parameter. Is there a function that performs the indexing on out instead? Using fancy indexing it would be something like:
out[:, :, indices, :] = a
Here I assume that axis=2 but in my case I don't know the axis in advance.
A solution using 1d boolean masks instead of indices is acceptable as well.
You can use swapaxes like so:
>>> A = np.arange(24).reshape(2,3,4)
>>> out = np.empty_like(A)
>>> I = [2,0,1]
>>> axis = 1
>>> out.swapaxes(0, axis)[I] = A.swapaxes(0, axis)
>>> out
array([[[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[ 0, 1, 2, 3]],
[[16, 17, 18, 19],
[20, 21, 22, 23],
[12, 13, 14, 15]]])
Some of the numpy functions construct an indexing tuple when operating on a specified axis.
The code isn't particularly pretty, but is general and reasonably efficient.
In [700]: out = np.zeros((2,1,4,5),int)
In [701]: out
Out[701]:
array([[[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]],
[[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]]])
In [702]: indices = [3,0,1]
Make an indexing tuple. Start with a list or array for ease of construction, and then convert to tuple when indexing:
In [703]: idx = [slice(None)]*out.ndim
In [704]: idx[2] = indices
In [705]: idx
Out[705]:
[slice(None, None, None),
slice(None, None, None),
[3, 0, 1],
slice(None, None, None)]
In [706]: out[tuple(idx)] = 10
In [707]: out
Out[707]:
array([[[[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10],
[ 0, 0, 0, 0, 0],
[10, 10, 10, 10, 10]]],
[[[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10],
[ 0, 0, 0, 0, 0],
[10, 10, 10, 10, 10]]]])
It matches the take:
In [708]: np.take(out, indices, axis=2)
Out[708]:
array([[[[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10]]],
[[[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10],
[10, 10, 10, 10, 10]]]])
We can set more complex values, as long as we get the broadcasting right:
out[tuple(idx)] = np.array([10,11,12])[...,None]
I have also seen numpy functions that shift the axis of interest to a known location - either beginning or end. Depending on the action it may require swapping back.
There are functions like place, put, copyto that provide other ways of controlling assignment (besides the usual indexing). But none take an axis parameter like np.take.
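The indexing-tuple approach can be wrapped in a small helper so the axis is an ordinary argument. This is only a sketch: assign_along_axis is a made-up name, not a NumPy function.

```python
import numpy as np

def assign_along_axis(out, indices, values, axis):
    """Equivalent of out[:, ..., indices, ..., :] = values for any axis.

    Hypothetical helper; `indices` may be an integer list/array or a
    1-D boolean mask, as with ordinary indexing.
    """
    idx = [slice(None)] * out.ndim
    idx[axis] = indices
    out[tuple(idx)] = values  # values must broadcast to the indexed shape
    return out

A = np.arange(24).reshape(2, 3, 4)
out = np.empty_like(A)
assign_along_axis(out, [2, 0, 1], A, axis=1)

# Taking the same indices back out recovers A.
roundtrip_ok = np.array_equal(np.take(out, [2, 0, 1], axis=1), A)
```

With these inputs, out matches the result of the swapaxes example above.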