I am using some data in my program where I have a sorting issue that takes a long time to run. I have put together an example situation below, for which I would like a solution.
import numpy as np
A = np.array([[1,2,3],[4,5,6],[7,8,9]])
B = np.array([[4,5,6,7],[7,8,9,4],[1,2,3,2]])
# Need to apply some sort function
C = sort(B[:,0:3] to be sorted with respect to A)
print(C)
I have two numpy arrays, and I would like the rows of B to be reordered so that the first 3 columns of B follow the row order of A.
I want the output C to be
[[1,2,3,2],[4,5,6,7],[7,8,9,4]]
Is there a numpy method, or any other Python library, that could do this?
Looking forward to some answers.
Regards
Aadithya
This approach uses only the first column:
_, indexer, _ = np.intersect1d(B[:,:1], A[:,:1], return_indices=True)
B[indexer]
array([[1, 2, 3, 2],
[4, 5, 6, 7],
[7, 8, 9, 4]])
Apparently, the above solution works only if the values are unique (thanks #AadithyaSaathya).
To compare against all of A, we could use itertools' product function:
from itertools import product
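# Pair every row of A with every row of B[:, :3] and keep the index of each
# matching B row, in the order the rows appear in A.
indexer = [j
           for (i, a_row), (j, b_row)
           in product(enumerate(A), enumerate(B[:, :3]))
           if np.array_equal(a_row, b_row)]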
B[indexer]
array([[1, 2, 3, 2],
[4, 5, 6, 7],
[7, 8, 9, 4]])
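As a possible alternative to the product-based loop, the same row matching can be sketched with broadcasting. This is only a sketch: it assumes every row of A occurs exactly once in B[:, :3], and the names matches and order are illustrative, not from the question.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[4, 5, 6, 7], [7, 8, 9, 4], [1, 2, 3, 2]])

# matches[i, j] is True when row i of A equals the first three columns of row j of B.
matches = (A[:, None, :] == B[None, :, :3]).all(axis=2)

# Assumption: every row of A occurs exactly once in B[:, :3].
# argmax returns the position of the first True in each row of matches.
order = matches.argmax(axis=1)

C = B[order]
print(C)
```

Note that this builds a boolean array of shape (len(A), len(B), 3), so for very large inputs the memory cost may matter more than the time saved.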
I am trying to build a multidimensional array by appending, so that each row holds 4 random numbers. The code below does not work. How can I fix it?
import numpy as np
import random
Array = np.array([[]])
for i in range(3):
    for k in range(4):
        Array[i][k] = np.append(Array[i][k], random.randint(0, 9))
Expected Output:
[[1,3,4,8],
[2,3,6,4],
[7,4,1,5],
[8,3,1,1]]
Don't do this. It is highly inefficient to build an array incrementally with np.append. If you must do something like this, use a list and then convert the resulting list to a numpy.ndarray later.
However, in this case, you simply want:
>>> import numpy as np
>>> np.random.randint(0, 10, (3,4))
array([[0, 3, 7, 4],
[6, 4, 2, 2],
[4, 4, 0, 6]])
Or perhaps:
>>> np.random.randint(0, 10, (4,4))
array([[8, 8, 2, 7],
[3, 7, 2, 1],
[5, 5, 5, 5],
[6, 2, 7, 9]])
Note, np.random.randint has an exclusive end, so if you want to draw from numbers [0, 9] you need to use 9+1 as the end.
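If the rows really do have to be built one at a time, the list-then-convert approach mentioned above might look roughly like this. It is a minimal sketch; the names rows and arr are illustrative.

```python
import random

import numpy as np

rows = []
for i in range(3):
    # random.randint includes both endpoints, so this draws from 0..9.
    rows.append([random.randint(0, 9) for _ in range(4)])

# Convert once at the end instead of calling np.append inside the loop.
arr = np.array(rows)
print(arr.shape)
```

Appending to a Python list is cheap, whereas np.append copies the whole array on every call.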
I am trying to efficiently update some elements of a numpy array A, using another array b to indicate the indexes of the elements of A to be updated. However, b can contain duplicates, which are ignored, whereas I would like them to be taken into account. I would like to avoid a for loop over b. To illustrate:
>>> A = np.arange(10).reshape(2,5)
>>> A[0, np.array([1,1,1,2])] += 1
>>> A
array([[0, 2, 3, 3, 4],
[5, 6, 7, 8, 9]])
whereas I would like the output to be:
array([[0, 3, 3, 3, 4],
[5, 6, 7, 8, 9]])
Any ideas?
To correctly handle the duplicate indices, you'll need to use np.add.at instead of +=. Therefore to update the first row of A, the simplest way would probably be to do the following:
>>> np.add.at(A[0], [1,1,1,2], 1)
>>> A
array([[0, 4, 3, 3, 4],
[5, 6, 7, 8, 9]])
The ufunc.at method is described in the NumPy documentation.
One approach is to use numpy.histogram to find out how many values there are at each index, then add the result to A:
A[0, :] += np.histogram(np.array([1,1,1,2]), bins=np.arange(A.shape[1]+1))[0]
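To make the difference between the approaches concrete, here is a small side-by-side sketch. It starts each variant from a fresh copy of A, and it uses np.bincount as a stand-in for the histogram idea; that substitution is my own assumption, not something the answers above propose.

```python
import numpy as np

A = np.arange(10).reshape(2, 5)
idx = np.array([1, 1, 1, 2])

# Fancy-index += applies each distinct index only once.
buffered = A.copy()
buffered[0, idx] += 1

# np.add.at accumulates once per occurrence of an index.
unbuffered = A.copy()
np.add.at(unbuffered[0], idx, 1)

# Counting occurrences first gives the same result as np.add.at.
counts = np.bincount(idx, minlength=A.shape[1])
via_counts = A.copy()
via_counts[0] += counts

print(buffered[0])    # [0 2 3 3 4]
print(unbuffered[0])  # [0 4 3 3 4]
print(via_counts[0])  # [0 4 3 3 4]
```

Index 1 appears three times, so the accumulating variants raise that element from 1 to 4, while the plain += raises it only to 2.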
I'm pretty new to numpy. I'm working on a project and have run into a problem that I can't seem to solve.
Imagine we had an NDarray in the following format
[4,5,6,1]
[3,5,2,0]
[4,7,3,1]
How would I split it into two parts such that the first part is:
[4,5,6]
[3,5,2]
[4,7,3]
and the second part is
[1,0,1]
I know the solution must be pretty simple, but I can't seem to figure it out.
Thanks in advance!
Try:
a = np.array([[4,5,6,1],
[3,5,2,0],
[4,7,3,1]])
b,c = a[:,:-1], a[:,-1]
This uses numpy's slicing to keep all rows and split the columns on the last one.
>>> import numpy as np
>>> a=np.array([[4,5,6,1],[3,5,2,0],[4,7,3,1]])
>>> a
array([[4, 5, 6, 1],
[3, 5, 2, 0],
[4, 7, 3, 1]])
>>> b=a[:,0:3]
>>> b
array([[4, 5, 6],
[3, 5, 2],
[4, 7, 3]])
>>> c=a[:,3]
>>> c
array([1, 0, 1])
>>>
This is slicing: numpy extends Python's slice notation to multiple dimensions, so most of it is not specific to numpy.
For more details about slicing, see "Explain Python's slice notation".
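One detail that the slicing answers above do not spell out is the shape of each part, and the fact that basic slices are views of the original array. The sketch below illustrates both; the names c_col and b_copy are illustrative.

```python
import numpy as np

a = np.array([[4, 5, 6, 1],
              [3, 5, 2, 0],
              [4, 7, 3, 1]])

b = a[:, :-1]      # all rows, every column except the last: shape (3, 3)
c = a[:, -1]       # an integer index drops the axis: shape (3,)
c_col = a[:, -1:]  # a slice keeps the axis: shape (3, 1)

# Basic slices are views, so writing to b would also change a.
# Take an explicit copy when the parts must be independent.
b_copy = a[:, :-1].copy()

print(b.shape, c.shape, c_col.shape)
print(np.shares_memory(a, b), np.shares_memory(a, b_copy))
```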
I have a 2D list something like
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
and I want to convert it to a 2d numpy array. Can we do it without allocating memory like
numpy.zeros((3,3))
and then storing values to it?
Just pass the list to np.array:
a = np.array(a)
You can also take this opportunity to set the dtype if the default is not what you desire.
a = np.array(a, dtype=...)
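As a concrete illustration of the dtype argument, here is a small sketch. The choice of np.float64 is only an example, and the names as_default and as_float are illustrative.

```python
import numpy as np

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# With no dtype given, numpy infers an integer type from the list contents.
as_default = np.array(a)

# Passing dtype converts the values while the array is being built.
as_float = np.array(a, dtype=np.float64)

print(as_default.dtype, as_float.dtype)
print(as_float.shape)
```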
Just use the following code:
c = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
It will give you:
matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
You can check the shape and number of dimensions of the matrix with the following code:
c.shape
c.ndim
np.array() is even more powerful than what unutbu said above.
You can also use it to convert a list of numpy arrays into a higher-dimensional array. The following is a simple example:
aArray=np.array([1,1,1])
bArray=np.array([2,2,2])
aList=[aArray, bArray]
xArray=np.array(aList)
xArray has shape (2, 3) and is a standard numpy array. This avoids writing an explicit loop.
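For comparison, np.stack does the same job and makes the joining axis explicit. This is a sketch of an alternative, not part of the answer above, and the snake_case names are illustrative.

```python
import numpy as np

a_array = np.array([1, 1, 1])
b_array = np.array([2, 2, 2])

# np.array on a list of equal-length arrays adds a new leading axis.
x_array = np.array([a_array, b_array])

# np.stack gives the same result by default (axis=0).
stacked_rows = np.stack([a_array, b_array])

# With axis=1 the inputs become columns instead of rows.
stacked_cols = np.stack([a_array, b_array], axis=1)

print(x_array.shape, stacked_rows.shape, stacked_cols.shape)
```

Both calls require the input arrays to have the same shape.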
I am using large data sets exported to a Python file in the form
XVals1 = [.........]
XVals2 = [.........]
Each list is of identical length. I use
>>> a1 = np.array(SV.XVals1)
>>> a2 = np.array(SV.XVals2)
Then
>>> A = np.matrix([a1,a2])