I need to create a 5x5 array that looks like this:
[[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]]
I want to do it with numpy using ndarray. Currently I write np.ndarray((5, 5)) and then fill it as needed, but the question is: can I fill it with this pattern during creation?
Use numpy.tile (to repeat an array a specified number of times):
np.tile(np.arange(5), (5,1))
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
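If a read-only result is acceptable, broadcasting is another option. The sketch below is an illustration rather than part of the answer above; it assumes NumPy 1.10 or later (for np.broadcast_to), and the variable names are made up for the example.

```python
import numpy as np

row = np.arange(5)

# np.tile makes a real 5x5 copy of the row.
tiled = np.tile(row, (5, 1))

# np.broadcast_to returns a read-only view that repeats the row without copying.
view = np.broadcast_to(row, (5, 5))

# Call .copy() on the view if a writable, independent array is needed.
writable = view.copy()
```

The view is cheap to create, but it cannot be modified in place, so np.tile (or the copy) is probably the better fit when the array will be edited later.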
How can I get the sorted indices of a numpy array (distance), considering only certain indices from another numpy array (val)?
For example, consider the two numpy arrays val and distance below:
val = np.array([[10, 0, 0, 0, 0],
[0, 0, 10, 0, 10],
[0, 10, 10, 0, 0],
[0, 0, 0, 10, 0],
[0, 0, 0, 0, 0]])
distance = np.array([[4, 3, 2, 3, 4],
[3, 2, 1, 2, 3],
[2, 1, 0, 1, 2],
[3, 2, 1, 2, 3],
[4, 3, 2, 3, 4]])
The distances where val == 10 are 4, 1, 3, 1, 0, 2. I would like to get these sorted as 0, 1, 1, 2, 3, 4 and return the corresponding indices into the distance array.
Returning something like:
(array([2, 1, 2, 3, 1, 0], dtype=int64), array([2, 2, 1, 3, 4, 0], dtype=int64))
or:
(array([2, 2, 1, 3, 1, 0], dtype=int64), array([2, 1, 2, 3, 4, 0], dtype=int64))
The second and third elements both have distance 1, so I guess their indices are interchangeable.
I tried combinations of np.where, np.argsort, np.argpartition and np.unravel_index, but can't seem to get it working right.
Here's one way with masking -
In [20]: mask = val==10
In [21]: np.argwhere(mask)[distance[mask].argsort()]
Out[21]:
array([[2, 2],
[1, 2],
[2, 1],
[3, 3],
[1, 4],
[0, 0]])
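The question asks for a tuple of row and column index arrays rather than an array of pairs. A minimal sketch of that variant, assuming the val and distance arrays from the question, is shown here; the stable sort is my own choice so that ties keep their row-major order, and the names rows, cols, order and result are illustrative.

```python
import numpy as np

val = np.array([[10, 0, 0, 0, 0],
                [0, 0, 10, 0, 10],
                [0, 10, 10, 0, 0],
                [0, 0, 0, 10, 0],
                [0, 0, 0, 0, 0]])
distance = np.array([[4, 3, 2, 3, 4],
                     [3, 2, 1, 2, 3],
                     [2, 1, 0, 1, 2],
                     [3, 2, 1, 2, 3],
                     [4, 3, 2, 3, 4]])

# Row and column indices of the selected cells, in row-major order.
rows, cols = np.nonzero(val == 10)

# Sort the selected distances; a stable sort keeps tied entries in row-major order.
order = np.argsort(distance[rows, cols], kind="stable")

# Tuple of index arrays, usable directly as distance[result].
result = (rows[order], cols[order])
```

With these inputs, distance[result] gives the selected distances in ascending order.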
I have a numpy array of shape (1429,1) where each row itself is a numpy array of shape (3,100) where l may vary from row to row.
How can I reshape this array by flattening each row such that the resulting numpy array will have the shape (1429, 300)?
I guess your initial array's shape is (1429, 3, 100). If that's true, you can change its shape as below:
import numpy as np
a = a.reshape(1429, 300)  # a is the initial numpy array, assumed to have shape (1429, 3, 100)
The dtype of your enclosing structure is probably object. It's just a collection of references to 1429 numpy.ndarrays.
As an example:
a = np.empty((1429, 1), object)
for x in a:
    x[0] = np.random.rand(3, 100)
In [19]: a.shape,a.dtype
Out[19]: ((1429, 1), dtype('O'))
In [20]: a[0,0].shape
Out[20]: (3, 100)
The structure is probably not contiguous. To obtain a single block containing all your data, you must rebuild it with the right layout:
b=np.array([x.ravel() for x in a.ravel()])
In [21]: b.shape
Out[21]: (1429, 300)
ravel discards the unwanted dimensions.
Assuming it is an object dtype array with shape (1429,1), and all elements are 2d of shape (3,100), a good way to 'flatten' is to use concatenate or stack.
np.stack(arr.ravel()).reshape(-1,300)
I use arr.ravel() so the array looks like a (1429) element list to stack. stack then concatenates the elements, creating a (1429, 3, 100) array. The reshape then converts that to (1429, 300).
In [939]: arr = np.empty((5,1),object)
In [940]: arr[:,0] = [np.arange(6).reshape(2,3) for _ in range(5)]
In [941]: arr
Out[941]:
array([[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])],
[array([[0, 1, 2],
[3, 4, 5]])]], dtype=object)
In [942]: np.stack(arr.ravel())
Out[942]:
array([[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]]])
In [943]: np.stack(arr.ravel()).reshape(-1,6)
Out[943]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
np.stack with the default axis=0 is the same as np.array(...).
Or with concatenate:
In [950]: np.concatenate(arr.ravel(),axis=0)
Out[950]:
array([[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5]])
In [951]: np.concatenate(arr.ravel(),axis=0).reshape(5,6)
Out[951]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
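Both approaches rely on the assumption stated above, that every element has the same shape. A small sketch of a helper that checks this before stacking could look like the following; flatten_object_rows is a hypothetical name, not a NumPy function, and the test data mirrors the (5,1) example above.

```python
import numpy as np

def flatten_object_rows(arr):
    """Stack the elements of an object array and flatten each one into a row."""
    items = list(arr.ravel())
    # np.stack needs identical shapes, so fail early with a readable message.
    shapes = {item.shape for item in items}
    if len(shapes) != 1:
        raise ValueError(f"elements have differing shapes: {sorted(shapes)}")
    return np.stack(items).reshape(len(items), -1)

# Same construction as the small example above: five (2, 3) arrays.
arr = np.empty((5, 1), dtype=object)
for i in range(5):
    arr[i, 0] = np.arange(6).reshape(2, 3)

flat = flatten_object_rows(arr)  # shape (5, 6)
```

If the element shapes really do vary from row to row, no rectangular (1429, 300) array exists, and the helper raises instead of producing a misleading result.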
I have an array A:
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
and an array B:
array([[1, 0],
[1, 0],
[0, 1]])
I want to make array B as the last column of array A, so I want the result array (let's call it C) to look like this:
array([[1, 2, 3, [1, 0]],
[1, 1, 1, [1, 0]],
[2, 2, 2, [0, 1]]])
I tried np.insert(a, -1, b, axis=1), but this gave me an error:
ValueError: could not broadcast input array from shape (2,3) into shape (3,3)
Maybe that's what you're looking for:
import numpy as np
a = np.array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
b = np.array([[1, 0],
[1, 0],
[0, 1]])
np.hstack([a,b])
Which results in:
array([[1, 2, 3, 1, 0],
[1, 1, 1, 1, 0],
[2, 2, 2, 0, 1]])
print(list(zip(*(list(zip(*a.tolist())) + [b.tolist()]))))
although it won't be a numpy array afterwards:
>>> a
array([[1, 2, 3],
[1, 1, 1],
[2, 2, 2]])
>>> b
array([[1, 0],
[1, 0],
[0, 1]])
>>> list(zip(*(list(zip(*a.tolist())) + [b.tolist()])))
[(1, 2, 3, [1, 0]), (1, 1, 1, [1, 0]), (2, 2, 2, [0, 1])]
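If the nested layout from the question is really needed as a NumPy array, one possibility is an object-dtype array whose last column holds Python lists. This is only a sketch under that assumption (the name c follows the question's "C"); object arrays lose most of NumPy's speed advantages, so np.hstack is usually preferable.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [1, 1, 1],
              [2, 2, 2]])
b = np.array([[1, 0],
              [1, 0],
              [0, 1]])

# One extra column, with dtype=object so a cell can hold a whole list.
c = np.empty((a.shape[0], a.shape[1] + 1), dtype=object)
c[:, :-1] = a

# Assign cell by cell; assigning the whole column at once would make NumPy
# try to broadcast the (3, 2) data into a (3,) slice.
for i, pair in enumerate(b.tolist()):
    c[i, -1] = pair
```

Each row of c then has three integers followed by one list, for example c[0, 3] is [1, 0].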
In numpy, I would like to be able to input n for rows and m for columns and end with the array that looks like:
[(0,0,0,0),
(1,1,1,1),
(2,2,2,2)]
So that would be a 3x4. Each column is just a copy of the previous one and the row increases by one each time. As an example:
The input would be 4, then 6, and the output would be an array:
[(0,0,0,0,0,0),
(1,1,1,1,1,1),
(2,2,2,2,2,2),
(3,3,3,3,3,3)]
4 rows and 6 columns where the row increases by one each time. Thanks for your time.
So many possibilities...
In [51]: n = 4
In [52]: m = 6
In [53]: np.tile(np.arange(n), (m, 1)).T
Out[53]:
array([[0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3]])
In [54]: np.repeat(np.arange(n).reshape(-1,1), m, axis=1)
Out[54]:
array([[0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3]])
In [55]: np.outer(np.arange(n), np.ones(m, dtype=int))
Out[55]:
array([[0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3]])
Here's one more. The neat trick here is that the values are not duplicated--only memory for the single sequence [0, 1, 2, ..., n-1] is allocated.
In [67]: from numpy.lib.stride_tricks import as_strided
In [68]: seq = np.arange(n)
In [69]: rep = as_strided(seq, shape=(n,m), strides=(seq.strides[0],0))
In [70]: rep
Out[70]:
array([[0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3]])
Be careful with the as_strided function. If you don't get the arguments right, you can crash Python.
To see that seq has not been copied, change seq in place, and then check rep:
In [71]: seq[1] = 99
In [72]: rep
Out[72]:
array([[ 0, 0, 0, 0, 0, 0],
[99, 99, 99, 99, 99, 99],
[ 2, 2, 2, 2, 2, 2],
[ 3, 3, 3, 3, 3, 3]])
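A similar zero-copy result can be had without as_strided. The sketch below uses np.broadcast_to (assuming NumPy 1.10 or later), which computes the strides itself and returns a read-only view, so it avoids the risk of passing bad arguments.

```python
import numpy as np

n, m = 4, 6
seq = np.arange(n)

# Broadcast the (n, 1) column across m columns; no data is copied.
rep = np.broadcast_to(seq[:, None], (n, m))

# The column stride is 0, so every column reads the same memory as seq.
column_stride = rep.strides[1]
```

Because rep is read-only, accidental writes through the view raise an error, while changes made to seq still show up in rep.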
import numpy as np
def foo(n, m):
    return np.array([np.arange(n)] * m).T
Natively (no Python lists):
import numpy
rows, columns = 4, 6
numpy.arange(rows).reshape(-1, 1).repeat(columns, axis=1)
#>>> array([[0, 0, 0, 0, 0, 0],
#>>> [1, 1, 1, 1, 1, 1],
#>>> [2, 2, 2, 2, 2, 2],
#>>> [3, 3, 3, 3, 3, 3]])
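Another way to get the same array is np.indices, which returns one index grid per axis. This is an alternative I am adding for illustration, not something from the answers here; the names row_index and col_index are made up.

```python
import numpy as np

rows, columns = 4, 6

# np.indices returns an array of shape (2, rows, columns):
# the first grid holds each cell's row number, the second its column number.
row_index, col_index = np.indices((rows, columns))
```

Here row_index is the requested array, and col_index is the transposed pattern (0 to columns-1 along each row).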
You can easily do this using built-in Python functions. The program counts from 0 to 3, converts each number to a string, and repeats the string 6 times.
print([6*str(n) for n in range(0,4)])
Here is the output.
ks-MacBook-Pro:~ kyle$ pbpaste | python
['000000', '111111', '222222', '333333']
One more for fun:
np.zeros((n, m), dtype=int) + np.arange(n, dtype=int)[:,None]
As has been mentioned, there are many ways to do this.
Here's what I'd do:
import numpy as np
def makearray(m, n):
    A = np.empty((m,n))
    A.T[:] = np.arange(m)
    return A
Here's an amusing alternative that will work if you aren't going to be changing the contents of the array.
It should save some memory.
Be careful, though: this doesn't allocate a full array, so multiple entries will point to the same memory address.
import numpy as np
from numpy.lib.stride_tricks import as_strided
def makearray(m, n):
    A = np.arange(m)
    return as_strided(A, strides=(A.strides[0],0), shape=(m,n))
In either case, as I have written them, a 3x4 array can be created by makearray(3, 4).
Using count from the built-in module itertools:
>>> from itertools import count
>>> rows = 4
>>> columns = 6
>>> cnt = count()
>>> [[next(cnt)]*columns for i in range(rows)]
[[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1], [2, 2, 2, 2, 2, 2], [3, 3, 3, 3, 3, 3]]
You can simply use a list comprehension:
>>> nc=5
>>> nr=4
>>> [[k]*nc for k in range(nr)]
[[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [2, 2, 2, 2, 2], [3, 3, 3, 3, 3]]
Several other possibilities using an (n,1) array:
a = np.arange(n)[:,None]  # or np.arange(n).reshape(-1,1)
a*np.ones((m),dtype=int)
a[:,np.zeros((m),dtype=int)]
If a is going to be combined with an (m,) array, just leave it as (n,1) and let broadcasting expand it for you.
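To illustrate the broadcasting point, here is a small sketch in which the (n,1) column is combined with an (m,) array without ever being expanded by hand. The array b and its values are assumptions chosen only for the example.

```python
import numpy as np

n, m = 4, 6

# Column of row numbers, shape (n, 1).
a = np.arange(n)[:, None]

# Some (m,) array to combine with; the values are arbitrary.
b = np.arange(m) * 10

# Broadcasting stretches a across the columns and b down the rows.
total = a + b  # shape (n, m), total[i, j] == i + 10 * j
```

No (n, m) array of repeated row numbers is ever built; only the result is allocated at full size.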