Let's say we have an initial array:
test_array = np.array([1, 4, 2, 5, 7, 4, 2, 5, 6, 7, 7, 2, 5])
What is the best way to remap the elements of this array using two other arrays: one holding the elements we want to replace, and a second holding the new values that replace them:
map_from = np.array([2, 4, 5])
map_to = np.array([9, 0, 3])
So the result should be:
remapped_array = [1, 0, 9, 3, 7, 0, 9, 3, 6, 7, 7, 9, 3]
There might be a more succinct way of doing this, but this should work by using a mask.
# Boolean matrix: mask[i, j] is True where test_array[i] == map_from[j]
mask = test_array[:, None] == map_from
# For each element, take the mapped value of the first matching column
val = map_to[mask.argmax(1)]
# Keep the original value wherever no match was found
np.where(mask.any(1), val, test_array)
output:
array([1, 0, 9, 3, 7, 0, 9, 3, 6, 7, 7, 9, 3])
If your original array contains only non-negative integers and its maximum value is not very large, the easiest approach is a lookup (mapping) array:
>>> a = np.array([1, 4, 2, 5, 7, 4, 2, 5, 6, 7, 7, 2, 5])
>>> mapping = np.arange(a.max() + 1)
>>> map_from = np.array([2, 4, 5])
>>> map_to = np.array([9, 0, 3])
>>> mapping[map_from] = map_to
>>> mapping[a]
array([1, 0, 9, 3, 7, 0, 9, 3, 6, 7, 7, 9, 3])
Here is another general method:
>>> vals, inv = np.unique(a, return_inverse=True)
>>> vals[np.searchsorted(vals, map_from)] = map_to
>>> vals[inv]
array([1, 0, 9, 3, 7, 0, 9, 3, 6, 7, 7, 9, 3])
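For instance, here is a hypothetical example showing that this recipe also works for data the lookup-table approach above cannot handle, such as floats or negative values, as long as every value in map_from actually occurs in the array:
>>> a2 = np.array([-1.5, 4.0, 2.0, 4.0, -1.5])
>>> map_from2 = np.array([2.0, 4.0])
>>> map_to2 = np.array([9.0, 0.0])
>>> vals2, inv2 = np.unique(a2, return_inverse=True)
>>> vals2[np.searchsorted(vals2, map_from2)] = map_to2
>>> vals2[inv2]
array([-1.5,  0. ,  9. ,  0. , -1.5])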
I have an array X with shape (2, 5) as follows:
0, 6, 7, 9, 1
2, 4, 6, 2, 7
I'd like to reshape it to repeat each row n times as follows (example uses n = 3):
0, 6, 7, 9, 1
0, 6, 7, 9, 1
0, 6, 7, 9, 1
2, 4, 6, 2, 7
2, 4, 6, 2, 7
2, 4, 6, 2, 7
I have tried np.tile as follows, but it repeats the whole block, interleaving the rows as shown below:
np.tile(X, (3, 1))
0, 6, 7, 9, 1
2, 4, 6, 2, 7
0, 6, 7, 9, 1
2, 4, 6, 2, 7
0, 6, 7, 9, 1
2, 4, 6, 2, 7
How might I efficiently create the desired output?
If a is the main array:
a = np.array([0, 6, 7, 9, 1, 2, 4, 6, 2, 7])
we can do this by first reshaping to the desired shape and then using np.repeat:
b = a.reshape(2, 5)
final = np.repeat(b, 3, axis=0)
It can be done with np.tile too, but it needs unnecessary extra operations, as shown below, so np.repeat is the better choice.
test = np.tile(b, (3, 1))
final = np.concatenate((test[::2], test[1::2]))
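As a quick sanity check, here is a short sketch (reusing the arrays above) confirming that the two routes produce the same result:
import numpy as np

a = np.array([0, 6, 7, 9, 1, 2, 4, 6, 2, 7])
b = a.reshape(2, 5)

via_repeat = np.repeat(b, 3, axis=0)                  # each row repeated 3 times
test = np.tile(b, (3, 1))                             # three stacked copies of b
via_tile = np.concatenate((test[::2], test[1::2]))    # regroup the interleaved rows
print(np.array_equal(via_repeat, via_tile))           # True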
For complex repeats, I'd use np.kron instead:
np.kron(x, np.ones((2, 1), dtype=int))
For something relatively simple,
np.repeat(x, 2, axis=0)
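As a hypothetical check on the question's 2x5 array, both expressions give the same row-wise repetition:
import numpy as np

x = np.array([[0, 6, 7, 9, 1],
              [2, 4, 6, 2, 7]])

print(np.array_equal(np.kron(x, np.ones((2, 1), dtype=int)),
                     np.repeat(x, 2, axis=0)))   # True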
Can someone explain this to me?
import numpy as np
arr = reversed(np.arange(11))
print(list(arr))
print(list(arr))
print(list(arr))
The output of this code is:
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
[]
[]
Why can't I access the arr variable more than once?
reversed returns an iterator. The first time you call list on it, you iterate over it to the end, and it has nothing left to yield. So if you call list again, it immediately signals that iteration has stopped and you get an empty list.
The fix is to store the result of the first list call:
>>> arr = list(reversed(np.arange(11)))
>>> arr
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Though in NumPy's context you wouldn't convert to a list; you would reverse within the NumPy domain:
>>> arr = np.arange(11)[::-1]
or
>>> arr = np.arange(10, -1, -1)
or
>>> arr = np.flip(np.arange(11))
or
>>> arr = np.flipud(np.arange(11))
to get
>>> arr
array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
The reversed built-in returns "a reverse iterator over the values of the given sequence".
Once the iterator is consumed, it's empty.
If you consume it and store the result in a list, you can use it multiple times:
import numpy as np
arr = list(reversed(np.arange(11)))
print(list(arr)) # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(list(arr)) # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(list(arr)) # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
This has to do with the return value of reversed(). You expect it to be a list, but it is an iterator:
>>> import numpy as np
>>> arr = reversed(np.arange(11))
>>> print(list(arr))
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> print(list(arr))
[]
>>> print(list(arr))
[]
After the first run, the iterator is empty. You can prevent this by creating a list from the iterator and then printing it:
>>> arr2 = list(reversed(np.arange(11)))
>>> print(list(arr2))
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> print(list(arr2))
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> print(list(arr2))
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Let a NumPy array have shape (x, y, z). I want it to have shape (x, y), with every element being a list of length z: [a, b, c, ..., z].
Is there any way to do this with NumPy methods?
You can use tolist and assign to a preallocated object array:
import numpy as np
a = np.random.randint(0,10,(100,100,100))
def f():
    A = np.empty(a.shape[:-1], object)
    A[...] = a.tolist()
    return A
f()[99,99]
# [4, 5, 9, 2, 8, 9, 9, 6, 8, 5, 7, 9, 8, 7, 6, 1, 9, 6, 2, 9, 0, 7, 0, 1, 2, 8, 4, 4, 7, 0, 1, 2, 3, 8, 9, 6, 0, 1, 4, 7, 0, 7, 9, 3, 9, 1, 8, 7, 1, 2, 3, 6, 6, 2, 7, 0, 2, 8, 7, 0, 0, 1, 8, 2, 6, 3, 5, 4, 9, 6, 9, 0, 2, 5, 9, 5, 3, 7, 0, 1, 9, 0, 8, 2, 0, 7, 3, 6, 9, 9, 4, 4, 3, 8, 4, 7, 4, 2, 1, 8]
type(f()[99,99])
# <class 'list'>
from timeit import timeit
timeit(f,number=100)*10
# 28.67872992530465
I can't imagine why numpy would need such a method. Here is, more or less, a pythonic solution.
import numpy as np
# an example array with shape [2,3,4]
a = np.random.random([2,3,4])
# create the target array shaped [2,3] with 'object' type (accepting other types than numbers).
b = np.array([[None for row in mat] for mat in a])
for i in range(b.shape[0]):
    for j in range(b.shape[1]):
        b[i, j] = list(a[i, j])
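A quick check, under the example shapes above, that the result has the intended layout:
print(b.shape)        # (2, 3)
print(type(b[0, 0]))  # <class 'list'>
print(len(b[0, 0]))   # 4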
I have a large numpy array of size 100x100. Among these 10000 values, there are only about 50 unique values. So I want to create a second array of length 50, containing these unique values, and then somehow map the large array to the smaller array. Effectively, I want to store just 50 values in my system instead of redundant 10000 values.
Slices of arrays seem to share memory, but as soon as I use specific indexing, memory sharing is lost.
a = np.array([1,2,3,4,5])
b = a[:3]
indices = [0,1,2]
c = a[indices]
print(b,c)
print(np.shares_memory(a,b),np.shares_memory(a,c))
This gives the output:
[1 2 3] [1 2 3]
True False
Even though b and c refer to the same values of a, b (the slice) shares memory with a while c doesn't. If I execute b[0] = 100, a[0] also becomes 100 since they share memory; that is not the case with c.
I want to make c, which is a collection of values which are all from a, share memory with a.
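For reference, a short, hypothetical continuation of the snippet above demonstrating the behaviour described in the question:
b[0] = 100
print(a)   # [100   2   3   4   5]  -> the view b writes through to a
c[0] = 999
print(a)   # unchanged by c, because fancy indexing returned a copy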
In general it is not possible to save memory in this way. The reason is that your data consists of 64-bit integers, and pointers are also 64-bit integers, so if you try to store each value exactly once in some auxiliary array and then point at those values, you will end up using basically the same amount of space.
The answer would be different if, for example, some of your arrays were subsets of other ones, or if you were storing large values like long strings.
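A minimal sketch of why, assuming 64-bit integers: storing the ~50 unique values plus one index per element costs roughly the same as storing the elements themselves.
import numpy as np

a = np.random.randint(0, 50, (100, 100)).astype(np.int64)  # ~50 unique values
vals, inv = np.unique(a, return_inverse=True)               # unique values + one index per element

print(a.nbytes)                   # 80000 bytes (10000 x 8)
print(vals.nbytes + inv.nbytes)   # about the same: the indices are 64-bit too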
So make a random array with a small set of unique values:
In [45]: x = np.random.randint(0,10,(10,10))
In [46]: x
Out[46]:
array([[4, 3, 8, 5, 4, 8, 8, 1, 8, 1],
[9, 2, 7, 2, 9, 5, 3, 9, 3, 3],
[6, 2, 6, 9, 4, 2, 3, 4, 6, 7],
[1, 0, 2, 1, 0, 9, 4, 2, 6, 2],
[8, 1, 6, 8, 3, 9, 5, 0, 8, 5],
[4, 9, 1, 4, 1, 2, 8, 4, 7, 2],
[4, 5, 2, 4, 8, 0, 1, 4, 4, 7],
[2, 2, 0, 5, 3, 0, 3, 3, 3, 9],
[3, 1, 0, 6, 4, 8, 8, 3, 5, 2],
[7, 5, 9, 2, 8, 0, 8, 1, 7, 8]])
Find the unique ones:
In [48]: np.unique(x)
Out[48]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Better yet, get the unique values plus an array that lets us map those values back onto the original:
In [49]: np.unique(x, return_inverse=True)
Out[49]:
(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
array([4, 3, 8, 5, 4, 8, 8, 1, 8, 1, 9, 2, 7, 2, 9, 5, 3, 9, 3, 3, 6, 2,
6, 9, 4, 2, 3, 4, 6, 7, 1, 0, 2, 1, 0, 9, 4, 2, 6, 2, 8, 1, 6, 8,
3, 9, 5, 0, 8, 5, 4, 9, 1, 4, 1, 2, 8, 4, 7, 2, 4, 5, 2, 4, 8, 0,
1, 4, 4, 7, 2, 2, 0, 5, 3, 0, 3, 3, 3, 9, 3, 1, 0, 6, 4, 8, 8, 3,
5, 2, 7, 5, 9, 2, 8, 0, 8, 1, 7, 8]))
There's an entry in the inverse mapping for each element of the original, so the pair (unique values, inverse indices) fully describes the array.
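Continuing the session above, a short sketch showing that the inverse index reconstructs the original array:
In [50]: u, inv = np.unique(x, return_inverse=True)

In [51]: np.array_equal(u[inv].reshape(x.shape), x)
Out[51]: True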