I'd like to know how to make a simple data cube (matrix) with three 1D arrays or if there's a simpler way. I want to be able to call a specific value at the end from the cube such as cube[0,2,6].
from numpy import arange, meshgrid

x = arange(10)
y = arange(10, 20, 1)
z = arange(20, 30, 1)
cube = meshgrid(x, y, z)
But this doesn't give the desired result: it returns multiple arrays, and I can't easily look up a specific value. I'd like to be able to use this later on for large data sets that would be laborious to do by hand. Thanks
meshgrid, as its name suggests, creates an orthogonal mesh. If you call it with 3 arguments it will be a 3D mesh. Now, the mesh is a 3D arrangement of points, but each point has 3 coordinates, so meshgrid returns 3 arrays, one for each coordinate.
The standard way of getting one 3D array out of that is to apply a vectorised function of three arguments. Here is a simple example:
>>> from numpy import arange, meshgrid
>>> x = arange(7)
>>> y = arange(0,30,10)
>>> z = arange(0,200,100)
>>> ym, zm, xm = meshgrid(y, z, x)
>>> xm
array([[[0, 1, 2, 3, 4, 5, 6],
        [0, 1, 2, 3, 4, 5, 6],
        [0, 1, 2, 3, 4, 5, 6]],

       [[0, 1, 2, 3, 4, 5, 6],
        [0, 1, 2, 3, 4, 5, 6],
        [0, 1, 2, 3, 4, 5, 6]]])
>>> ym
array([[[ 0,  0,  0,  0,  0,  0,  0],
        [10, 10, 10, 10, 10, 10, 10],
        [20, 20, 20, 20, 20, 20, 20]],

       [[ 0,  0,  0,  0,  0,  0,  0],
        [10, 10, 10, 10, 10, 10, 10],
        [20, 20, 20, 20, 20, 20, 20]]])
>>> zm
array([[[  0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0,   0,   0]],

       [[100, 100, 100, 100, 100, 100, 100],
        [100, 100, 100, 100, 100, 100, 100],
        [100, 100, 100, 100, 100, 100, 100]]])
>>> cube = xm + ym + zm
>>> cube
array([[[  0,   1,   2,   3,   4,   5,   6],
        [ 10,  11,  12,  13,  14,  15,  16],
        [ 20,  21,  22,  23,  24,  25,  26]],

       [[100, 101, 102, 103, 104, 105, 106],
        [110, 111, 112, 113, 114, 115, 116],
        [120, 121, 122, 123, 124, 125, 126]]])
>>> cube[0, 2, 6]
26
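As an aside, the same cube can be built without meshgrid at all, by broadcasting the three 1D arrays along different axes (a sketch using the same x, y, z as above):

```python
import numpy as np

x = np.arange(7)
y = np.arange(0, 30, 10)
z = np.arange(0, 200, 100)

# Reshape each array so its values vary along exactly one axis,
# then let broadcasting expand them to the full (2, 3, 7) cube.
cube = z[:, None, None] + y[None, :, None] + x[None, None, :]
print(cube[0, 2, 6])  # → 26
```

This avoids materialising the three intermediate mesh arrays, which matters for large grids.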
I have two index arrays like these:
import numpy as np
p_index = np.array([2, 0, 0, 2, 0, 2, 1, 2, 1])
m_index = np.array([0, 1, 1, 2, 1, 1, 2, 0, 0])
and two value arrays like these:
p = np.array([17, 13, 16])
m = np.array([15, 14, 19])
and an array like this:
t = np.array([18, 16, 14, 12, 11, 19, 11, 16, 11])
I need to do a for-loop like this:
for i in range(len(t)):
    newvalue = max(p[p_index[i]], m[m_index[i]]) + t[i]
    p[p_index[i]] = newvalue
    m[m_index[i]] = newvalue
I used 1D arrays as an example, but p, m, t, and the index arrays are actually all matrices with the same number of rows, and I need to do this for every row. How can I do it without a for-loop?
Edit: with matrices it looks like this:
p_index = np.array([[2, 0, 0, 2, 0, 2, 1, 2, 1],
                    [0, 2, 0, 1, 1, 1, 1, 2, 2],
                    [1, 2, 0, 2, 1, 1, 0, 0, 2],
                    [0, 2, 1, 1, 1, 1, 0, 1, 1]])
m_index = np.array([[0, 1, 1, 2, 1, 1, 2, 0, 0],
                    [2, 2, 2, 2, 1, 0, 1, 1, 2],
                    [2, 2, 2, 0, 1, 0, 2, 2, 2],
                    [1, 0, 0, 1, 0, 2, 1, 2, 1]])
t = np.array([[18, 16, 14, 12, 11, 19, 11, 16, 11],
              [10, 14, 18, 17, 14, 15, 18, 19, 17],
              [18, 17, 18, 18, 10, 12, 17, 15, 14],
              [15, 15, 16, 15, 19, 12, 13, 19, 17]])
p = np.array([[0, 0, 0],
              [0, 0, 0],
              [0, 0, 0],
              [0, 0, 0]])
m = np.array([[0, 0, 0],
              [0, 0, 0],
              [0, 0, 0],
              [0, 0, 0]])
row, col = np.shape(t)
for i in range(row):
    for j in range(col):
        newvalue = max(p[i][p_index[i][j]], m[i][m_index[i][j]]) + t[i][j]
        p[i][p_index[i][j]] = newvalue
        m[i][m_index[i][j]] = newvalue
f = np.max(p, axis=1).reshape(len(p), 1)
Actually I only need the final f.
If we iterate without actually modifying p and m:
In [247]: for i in range(len(t)):
     ...:     print(max(p[p_index[i]], m[m_index[i]]) + t[i])
     ...:
34
33
31
31
28
35
30
32
26
We can get the same values with a single numpy expression:
In [250]: np.maximum(p[p_index], m[m_index])+t
Out[250]: array([34, 33, 31, 31, 28, 35, 30, 32, 26])
But with your full loop, p and m elements are modified several times, each depending on a previous iteration:
In [258]: for i in range(len(t)):
     ...:     newvalue = max(p[p_index[i]], m[m_index[i]]) + t[i]
     ...:     print(newvalue)
     ...:     p[p_index[i]] = newvalue
     ...:     m[m_index[i]] = newvalue
     ...:
34
33
47 # versus 31
46
58 # versus 28
77
57
93
104
In [259]: p,m
Out[259]: (array([ 58, 104, 93]), array([104, 77, 57]))
There are numpy tools for performing things like cumulative sums and for working with duplicate indices (the ufunc.at methods), but you can't readily apply them to a general update like your max(...) + t, where each step depends on the results of previous iterations.
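To illustrate the duplicate-index tools mentioned above: np.maximum.at performs an unbuffered in-place maximum, so repeated indices each see the previous result. A sketch with made-up values (note this handles only a single array, so it still cannot express the coupling between p and m in the loop):

```python
import numpy as np

p = np.array([17, 13, 16])
idx = np.array([2, 0, 0])    # index 0 appears twice
vals = np.array([5, 20, 30])

# Unbuffered in-place maximum: p[idx[i]] = max(p[idx[i]], vals[i]),
# applied sequentially so duplicate hits accumulate.
np.maximum.at(p, idx, vals)
print(p)  # → [30 13 16]
```

With plain fancy indexing, `p[idx] = np.maximum(p[idx], vals)`, the duplicate at index 0 would keep only the last write instead of the running maximum.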
Say for example I have these 3 arrays:
# Array 1:
array_1 = [[100, 0, 0, 0, 0, 100],
           [0, 100, 0, 0, 0, 100],
           [0, 0, 100, 100, 0, 0]]
# Array 2:
array_2 = [[0, 0, 0, 0, 100, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 100]]
# Array 3:
array_3 = [[0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 100, 0],
           [0, 0, 0, 0, 0, 0]]
How can I combine the 3 arrays into one single array?
This will be the expected output:
[[100   0   0   0 100 100]
 [  0 100   0   0 100 100]
 [  0   0 100 100   0 100]]
As you can see, the 100s from array_1, array_2 and array_3 all appear in the newly created array, each staying in its original row and column.
In this case, you can just add the arrays together:
>>> a = np.arange(18).reshape((3,6))
>>> b = np.arange(18).reshape((3,6))
>>> c = np.arange(18).reshape((3,6))
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]])
>>> a + b + c
array([[ 0,  3,  6,  9, 12, 15],
       [18, 21, 24, 27, 30, 33],
       [36, 39, 42, 45, 48, 51]])
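Applied to the lists from the question (converting them to arrays first, since Python lists don't add elementwise):

```python
import numpy as np

array_1 = [[100, 0, 0, 0, 0, 100],
           [0, 100, 0, 0, 0, 100],
           [0, 0, 100, 100, 0, 0]]
array_2 = [[0, 0, 0, 0, 100, 0],
           [0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 100]]
array_3 = [[0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 100, 0],
           [0, 0, 0, 0, 0, 0]]

# elementwise sum keeps each 100 in its original row and column
combined = np.array(array_1) + np.array(array_2) + np.array(array_3)
print(combined)
```

Since every entry here is either 0 or 100, `np.maximum.reduce([...])` over the three arrays would give the same result, and would not stack values if two arrays ever had a 100 in the same position.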
I have an array of arrays and want to find the rows whose sum equals 40. The problem is that the array has around 270,000,000 rows, so doing it sequentially is out of the picture: I ran this program overnight and it was still running in the morning. How can I make this program more efficient and run decently fast?
Here is my code so far:
import numpy as np

def cartesianProduct(arrays):
    la = arrays.shape[0]
    arr = np.empty([la] + [a.shape[0] for a in arrays], dtype="int32")
    for i, a in enumerate(np.ix_(*arrays)):
        arr[i, ...] = a
    return arr.reshape(la, -1).T
rows = np.array(
    [
        [2, 15, 23, 19, 3, 2, 3, 27, 20, 11, 27, 10, 19, 10, 13, 10],
        [22, 9, 5, 10, 5, 1, 24, 2, 10, 9, 7, 3, 12, 24, 10, 9],
        [16, 0, 17, 0, 2, 0, 2, 0, 10, 0, 15, 0, 6, 0, 9, 0],
        [11, 27, 14, 5, 5, 7, 8, 24, 8, 3, 6, 15, 22, 6, 1, 1],
        [10, 0, 2, 0, 22, 0, 2, 0, 17, 0, 15, 0, 14, 0, 5, 0],
        [1, 6, 10, 6, 10, 2, 6, 10, 4, 1, 5, 5, 4, 8, 6, 3],
        [6, 0, 13, 0, 3, 0, 3, 0, 6, 0, 10, 0, 10, 0, 10, 0],
    ],
    dtype="int32",
)
product = cartesianProduct(rows)
combos = []
for row in product:
    if sum(row) == 40:
        combos.append(row)
print(combos)
I believe what you are trying to do is an instance of the NP-hard "subset sum" problem. Look into "dynamic programming" and "subset sum".
Examples:
https://www.geeksforgeeks.org/subset-sum-problem-dp-25/
https://www.techiedelight.com/subset-sum-problem/
As suggested in the comments, one way to optimize this is to check whether the sum of a partial combination already exceeds your threshold (40 in this case) and prune it early. As a further optimization, you can sort the arrays incrementally from largest to smallest; check heapq.nlargest() for incremental partial sorting.
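Independent of the algorithmic improvements, the per-row Python loop itself can be vectorised: product.sum(axis=1) computes all row sums in one C-level pass, and a boolean mask selects the matching rows. A sketch with a small stand-in product (for the real ~270M-row product you would still generate and filter in chunks to bound memory):

```python
import numpy as np

# toy stand-in for the huge cartesian product
product = np.array([[2, 22, 16],
                    [15, 9, 0],
                    [23, 5, 17],
                    [19, 10, 11]])

# one vectorised pass instead of a Python-level sum() per row
combos = product[product.sum(axis=1) == 40]
print(combos)  # → [[ 2 22 16]
               #    [19 10 11]]
```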
I have a Numpy array that looks like
array([1, 2, 3, 4, 5, 6, 7, 8])
and I want to reshape it to an array
array([[5, 0, 0, 6],
       [0, 1, 2, 0],
       [0, 3, 4, 0],
       [7, 0, 0, 8]])
More specifically, I'm trying to reshape a 2D numpy array to get a 3D Numpy array to go from
array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16],
       [17, 18, 19, 20, 21, 22, 23, 24],
       ...
       [ 9, 10, 11, 12, 13, 14, 15, 16],
       [89, 90, 91, 92, 93, 94, 95, 96]])
to a numpy array that looks like
array([[[ 5,  0,  0,  6],
        [ 0,  1,  2,  0],
        [ 0,  3,  4,  0],
        [ 7,  0,  0,  8]],

       [[13,  0,  0, 14],
        [ 0,  9, 10,  0],
        [ 0, 11, 12,  0],
        [15,  0,  0, 16]],

       ...

       [[93,  0,  0, 94],
        [ 0, 89, 90,  0],
        [ 0, 91, 92,  0],
        [95,  0,  0, 96]]])
Is there an efficient way to do this using numpy functionality, particularly vectorized?
We can make use of slicing -
def expand(a):  # a is a 2D array
    out = np.zeros((len(a), 4, 4), dtype=a.dtype)
    out[:, 1:3, 1:3] = a[:, :4].reshape(-1, 2, 2)
    out[:, ::3, ::3] = a[:, 4:].reshape(-1, 2, 2)
    return out
The benefit is memory, and hence performance, efficiency: only the output occupies new memory space, since the slicing on both the input and the output works with views.
Sample run -
2D input :
In [223]: a
Out[223]:
array([[ 1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16]])
In [224]: expand(a)
Out[224]:
array([[[ 5,  0,  0,  6],
        [ 0,  1,  2,  0],
        [ 0,  3,  4,  0],
        [ 7,  0,  0,  8]],

       [[13,  0,  0, 14],
        [ 0,  9, 10,  0],
        [ 0, 11, 12,  0],
        [15,  0,  0, 16]]])
1D input (feed in a 2D-extended input with None):
In [225]: a = np.array([1, 2, 3, 4, 5, 6, 7, 8])
In [226]: expand(a[None])
Out[226]:
array([[[5, 0, 0, 6],
        [0, 1, 2, 0],
        [0, 3, 4, 0],
        [7, 0, 0, 8]]])
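To check against the full 12x8 input from the question, here is a sketch that rebuilds that input with arange and repeats the function from the answer:

```python
import numpy as np

def expand(a):  # same slicing approach as in the answer above
    out = np.zeros((len(a), 4, 4), dtype=a.dtype)
    out[:, 1:3, 1:3] = a[:, :4].reshape(-1, 2, 2)  # first 4 values -> inner 2x2
    out[:, ::3, ::3] = a[:, 4:].reshape(-1, 2, 2)  # last 4 values -> corners
    return out

a = np.arange(1, 97).reshape(12, 8)  # rows 1..8, 9..16, ..., 89..96
res = expand(a)
print(res[0])   # → [[5, 0, 0, 6], [0, 1, 2, 0], [0, 3, 4, 0], [7, 0, 0, 8]]
print(res[-1])  # the block built from 89..96
```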
In the documentation of numpy.take, it is stated that a is indexed according to indices and axis, and the result is optionally stored in the out parameter. Does a function exist that performs the indexing on out instead? Using fancy indexing it would be something like:
out[:, :, indices, :] = a
Here I assume axis=2, but in my case I don't know the axis in advance. A solution using 1D boolean masks instead of indices is acceptable as well.
You can use swapaxes like so:
>>> A = np.arange(24).reshape(2,3,4)
>>> out = np.empty_like(A)
>>> I = [2,0,1]
>>> axis = 1
>>> out.swapaxes(0, axis)[I] = A.swapaxes(0, axis)
>>> out
array([[[ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [ 0,  1,  2,  3]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [12, 13, 14, 15]]])
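A quick way to convince yourself this is the inverse of np.take (a sketch reusing the same arrays as above):

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)
out = np.empty_like(A)
I = [2, 0, 1]
axis = 1

# scatter: slice k of A (along `axis`) lands at position I[k] in out
out.swapaxes(0, axis)[I] = A.swapaxes(0, axis)

# gather it back: taking the same indices recovers A exactly
print(np.array_equal(np.take(out, I, axis=axis), A))  # → True
```

The trick works because swapaxes returns a view, so assigning into `out.swapaxes(0, axis)[I]` writes straight into `out`.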
Some of the numpy functions construct an indexing tuple when operating on a specified axis.
The code isn't particularly pretty, but is general and reasonably efficient.
In [700]: out = np.zeros((2, 1, 4, 5), int)
In [701]: out
Out[701]:
array([[[[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]],

       [[[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]]])
In [702]: indices = [3,0,1]
Make an indexing tuple. Start with a list or array for ease of construction, and then convert to tuple when indexing:
In [703]: idx = [slice(None)]*out.ndim
In [704]: idx[2] = indices
In [705]: idx
Out[705]:
[slice(None, None, None),
 slice(None, None, None),
 [3, 0, 1],
 slice(None, None, None)]
In [706]: out[tuple(idx)] = 10
In [707]: out
Out[707]:
array([[[[10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10],
         [ 0,  0,  0,  0,  0],
         [10, 10, 10, 10, 10]]],

       [[[10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10],
         [ 0,  0,  0,  0,  0],
         [10, 10, 10, 10, 10]]]])
This matches np.take:
In [708]: np.take(out, indices, axis=2)
Out[708]:
array([[[[10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10]]],

       [[[10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10]]]])
We can set more complex values, as long as we get the broadcasting right:
out[tuple(idx)] = np.array([10,11,12])[...,None]
I have also seen numpy functions that move the axis of interest to a known location, either the beginning or the end; depending on the action, this may require swapping back afterwards. There are functions like place, put, and copyto that provide other ways of controlling assignment (besides the usual indexing), but none of them takes an axis parameter like np.take does.
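The indexing-tuple approach above packages naturally into a small helper; a sketch (scatter_axis is a made-up name for illustration, not a NumPy function):

```python
import numpy as np

def scatter_axis(out, indices, values, axis):
    """Assign values into out at the given indices along axis --
    the assignment counterpart of np.take(out, indices, axis)."""
    idx = [slice(None)] * out.ndim  # full slice on every axis...
    idx[axis] = indices             # ...except the one being indexed
    out[tuple(idx)] = values

out = np.zeros((2, 1, 4, 5), int)
scatter_axis(out, [3, 0, 1], 10, axis=2)
print(out[0, 0])  # rows 0, 1 and 3 are 10, row 2 stays 0
```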