Joining NumPy arrays in Python with a padding column

Analogous to:
"True".join(['False','False'])
I'd like to join numpy arrays, e.g.
arr = np.zeros((15,10), dtype=bool)
joiner = np.ones((15,1), dtype=bool)
result = np.hstack((arr, joiner, arr))
result.shape
(15, 21)
That is, I'd like to join a variable number of arrays with a column of True values in between each pair of them:
arr, joiner, arr, joiner, arr, ...
How can I extend the above to any number of arrays?
We can assume that they all have the same shape.

I came up with a simple, rather naive appending method (I expect it to be slow compared to other solutions out there):
def mergeArrays(*args):
    if args:
        # Column of True values, one entry per row of the input arrays
        joiner = np.ones((args[0].shape[0], 1), dtype=bool)
        new = []
        for x in args[:-1]:
            new.append(x)
            new.append(joiner)
        new.append(args[-1])
        return np.hstack(new)
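A possible generalization, sketched under the assumption that every input is 2-D with the same number of rows: interleave the inputs with one shared joiner column and call np.hstack once. The name join_with_column and its fill parameter are made up for this illustration.

```python
import numpy as np

def join_with_column(arrays, fill=True):
    """Stack 2-D arrays side by side with one `fill` column between neighbours."""
    arrays = list(arrays)
    if not arrays:
        raise ValueError("need at least one array")
    # One column with as many rows as the inputs, in the inputs' dtype
    joiner = np.full((arrays[0].shape[0], 1), fill, dtype=arrays[0].dtype)
    pieces = [arrays[0]]
    for arr in arrays[1:]:
        pieces.extend((joiner, arr))
    return np.hstack(pieces)

arr = np.zeros((15, 10), dtype=bool)
result = join_with_column([arr, arr, arr])
print(result.shape)  # (15, 32): three blocks of 10 columns plus two joiners
```

Building the list first and stacking once avoids repeated copies; the Python loop only touches references, so it is unlikely to be the bottleneck.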

Related

Return True/False for entire array if any value meets mask requirement(s)

I have already looked at other similar posts; however, their solutions do not solve this specific issue. Using the answer from this post, I get the error "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()" because I define my array differently from theirs: their array has shape (n,) while mine has shape (n,m). Moreover, the solution from this post does not work either, because it applies to lists. The only method I could think of was this:
When there is at least 1 True in array, then entire array is considered True:
filt = 4
tracktruth = list()
arraytruth = list()
arr1 = np.array([[1,2,4]])
for track in range(0,arr1.size):
    if filt == arr1[0,track]:
        tracktruth.append(True)
    else:
        tracktruth.append(False)
if any(tracktruth):
    arraytruth.append(True)
else:
    arraytruth.append(False)
When there is not a single True in array, then entire array is considered False:
filt = 5
tracktruth = list()
arraytruth = list()
arr1 = np.array([[1,2,4]])
for track in range(0,arr1.size):
    if filt == arr1[0,track]:
        tracktruth.append(True)
    else:
        tracktruth.append(False)
if any(tracktruth):
    arraytruth.append(True)
else:
    arraytruth.append(False)
The reason the second if-else statement is there is because I wish to apply this mask to multiple arrays and ultimately create a master list that describes which arrays are true and which are false in their entirety. However, with a for loop and two if-else statements, I think this would be very slow with larger arrays. What would be a faster way to do this?
This seems overly complicated; you can use an element-wise comparison to get the result without loops:
arr1=np.array([[1,2,4]])
filt=4
arr1==filt
array([[False, False, True]])
np.sum(arr1==filt).astype(bool)
True
With more than one row, you can index a row or column before calling np.sum, or you can pass the axis parameter to sum along rows or columns.
As pointed out in the comments, you can use np.any() instead of np.sum(...).astype(bool), and it runs in roughly 2/3 the time on the test dataset:
np.any(arr1==filt, axis=1)
array([ True])
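To build the "master list" the question asks for, the same np.any call can be applied to each array in turn. The sketch below uses made-up sample arrays and also shows the per-row form with axis=1 on an array with more than one row.

```python
import numpy as np

filt = 4
# Sample arrays invented for this illustration
arrays = [
    np.array([[1, 2, 4]]),
    np.array([[1, 2, 3], [5, 6, 7]]),
    np.array([[4, 0], [0, 0]]),
]

# One True/False per array: does `filt` occur anywhere in it?
arraytruth = [bool(np.any(arr == filt)) for arr in arrays]
print(arraytruth)  # [True, False, True]

# One True/False per row of a single (n, m) array
per_row = np.any(arrays[2] == filt, axis=1)
print(per_row)  # [ True False]
```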
You can do this with a list comprehension. I've done it here for one array, but it's easily extended to multiple arrays with a for loop:
filt = 4
arr1 = np.array([[1,2,4]])
print(any([part == filt for part in arr1[0]]))
You can get arraytruth more generally with a list comprehension, for arrays of shape (n, m):
import numpy as np
filt = 4
a = np.array([[1, 2, 4]])
b = np.array([[1, 2, 3],
              [5, 6, 7]])
array_lists = [a, b]
# arr[arr == filt] keeps only the matching elements; non-empty means a match exists
arraytruth = [arr[arr == filt].size > 0 for arr in array_lists]
print(arraytruth)
This will give you:
[True, False]
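[edit] Use the numpy hstack method to flatten arrays of different lengths before testing:
filt = 4
# A list of arrays rather than one ragged array, which recent NumPy versions reject
arrs = [np.array([1,2,3,4]), np.array([1,2,3])]
print(any(x == filt for x in np.hstack(arrs)))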

numpy element-wise multiplication of an array and a vector

I want to do something like this:
a = np.random.random((45, 72, 37, 24)) # multi-dimensional numpy array (example data)
ares = np.empty_like(a) # multi-dim array, same shape as a
a.shape
>>> (45, 72, 37, 24) # the relevant point is that all dimensions are different
v = np.random.random(37) # 1D numpy array, i.e. a vector (example data)
v.shape
>>> (37,) # note that v has the same length as the 3rd dimension of a
for i in range(37):
    ares[:,:,i,:] = a[:,:,i,:]*v[i]
I'm thinking there has to be a more compact way to do this with numpy, but I haven't figured it out. I guess I could replicate v and then calculate a*v, but I am guessing there is something better than that too. So I need to do element wise multiplication "over a given axis", so to speak. Anyone know how I can do this? Thanks. (BTW, I did find a close duplicate question, but because of the nature of the OP's particular problem there, the discussion was very short and got tracked into other issues.)
Here is one more:
b = a * v.reshape(-1, 1)
IMHO, this is more readable than transpose, einsum and maybe even v[:, None], but pick the one that suits your style.
NumPy automatically broadcasts a vector against the last axis of an array. So you can transpose the array to move the axis you want to the last position, multiply, then transpose it back:
ares = (a.transpose(0,1,3,2) * v).transpose(0,1,3,2)
You can do this with Einstein summation notation using numpy's einsum function:
ares = np.einsum('ijkl,k->ijkl', a, v)
I tend to do something like
b = a * v[None, None, :, None]
where I think I'm officially supposed to write np.newaxis instead of None.
For example:
>>> import numpy as np
>>> a0 = np.random.random((45,72,37,24))
>>> a = a0.copy()
>>> v = np.random.random(37)
>>> for i in range(len(v)):
...     a[:,:,i,:] *= v[i]
...
>>> b = a0 * v[None,None,:,None]
>>>
>>> np.allclose(a,b)
True
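The indexing and reshape answers above share one idea: give v a shape that is 1 everywhere except along the target axis, then let broadcasting do the rest. A minimal sketch of that idea for an arbitrary axis follows; scale_along_axis is a hypothetical helper name, and smaller shapes are used to keep the example quick.

```python
import numpy as np

def scale_along_axis(a, v, axis):
    """Multiply `a` by the 1-D vector `v` along `axis` using broadcasting."""
    shape = [1] * a.ndim
    shape[axis] = v.size  # e.g. [1, 1, 6, 1] for axis=2 of a 4-D array
    return a * v.reshape(shape)

a = np.random.random((4, 5, 6, 3))
v = np.random.random(6)
b = scale_along_axis(a, v, axis=2)
print(np.allclose(b, np.einsum('ijkl,k->ijkl', a, v)))  # True
```

This also shows why v.reshape(-1, 1) works in the question's case: the target axis is the second-to-last one, so a trailing length-1 axis is all that is needed.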

Index of element in NumPy array [duplicate]

In Python we can get the index of a value in a list by using .index().
But with a NumPy array, when I try to do:
decoding.index(i)
I get:
AttributeError: 'numpy.ndarray' object has no attribute 'index'
How could I do this on a NumPy array?
Use np.where to get the indices where a given condition is True.
Examples:
For a 2D np.ndarray called a:
i, j = np.where(a == value) # when comparing arrays of integers
i, j = np.where(np.isclose(a, value)) # when comparing floating-point arrays
For a 1D array:
i, = np.where(a == value) # integers
i, = np.where(np.isclose(a, value)) # floating-point
Note that this also works for conditions like >=, <=, != and so forth...
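If what you want is a single integer, as list.index returns, one option is to take the first hit and raise when there is none. This is a sketch, not a NumPy built-in: first_index is a hypothetical helper, and for arrays with more than one dimension it returns a position in the flattened array.

```python
import numpy as np

def first_index(a, value):
    """Return the first flat position where a == value, like list.index."""
    hits = np.flatnonzero(np.asarray(a) == value)
    if hits.size == 0:
        raise ValueError(f"{value!r} is not in array")
    return int(hits[0])

decoding = np.array([7, 3, 9, 3])
print(first_index(decoding, 3))  # 1
```

For a multi-dimensional array, np.unravel_index can turn the flat position back into a tuple of indices.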
You can also create a subclass of np.ndarray with an index() method:
class myarray(np.ndarray):
    def __new__(cls, *args, **kwargs):
        return np.array(*args, **kwargs).view(myarray)
    def index(self, value):
        return np.where(self == value)
Testing:
a = myarray([1,2,3,4,4,4,5,6,4,4,4])
a.index(4)
#(array([ 3, 4, 5, 8, 9, 10]),)
You can convert a NumPy array to a list and use its index method.
For example:
tmp = [1,2,3,4,5] # Python list
a = numpy.array(tmp) # NumPy array
i = list(a).index(2) # i is the index of 2, which is 1
This is just what you wanted.
I'm torn between these two ways of implementing an index of a NumPy array:
idx = list(classes).index(var)
idx = np.where(classes == var)
Both take the same number of characters, but the first method returns an int, whereas np.where returns a tuple of numpy.ndarray.
This problem can be solved efficiently using the numpy_indexed library (disclaimer: I am its author); which was created to address problems of this type. npi.indices can be viewed as an n-dimensional generalisation of list.index. It will act on nd-arrays (along a specified axis); and also will look up multiple entries in a vectorized manner as opposed to a single item at a time.
a = np.random.rand(50, 60, 70)
i = np.random.randint(0, len(a), 40)
b = a[i]
import numpy_indexed as npi
assert all(i == npi.indices(a, b))
This solution has better time complexity (n log n at worst) than any of the previously posted answers, and is fully vectorized.
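For readers without that library, a vectorized lookup of many values in a 1-D array can be sketched with plain NumPy by sorting once and then using np.searchsorted. This is not how numpy_indexed is implemented as far as this text knows; indices_of is a hypothetical name, and the sketch assumes a non-empty 1-D haystack.

```python
import numpy as np

def indices_of(haystack, needles):
    """For each needle, return the index of its first occurrence in 1-D `haystack`."""
    haystack = np.asarray(haystack)
    needles = np.asarray(needles)
    order = np.argsort(haystack, kind="stable")   # sort once: O(n log n)
    pos = np.searchsorted(haystack, needles, sorter=order)
    pos = np.clip(pos, 0, haystack.size - 1)      # keep out-of-range hits indexable
    idx = order[pos]
    if not np.array_equal(haystack[idx], needles):
        raise KeyError("some values were not found")
    return idx

print(indices_of([10, 30, 20, 40], [20, 10, 40]))  # [2 0 3]
```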
You can use the function numpy.nonzero(), or the nonzero() method of an array:
import numpy as np
A = np.array([[2,4],
              [6,2]])
index= np.nonzero(A>2)
OR
(A>2).nonzero()
Output:
(array([0, 1]), array([1, 0]))
The first array in the output gives the row indices and the second array gives the corresponding column indices.
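If you would rather have one (row, column) pair per match than two parallel arrays, np.argwhere returns the same hits in that layout. A short sketch using the condition A > 2 on the array above:

```python
import numpy as np

A = np.array([[2, 4],
              [6, 2]])

rows, cols = np.nonzero(A > 2)   # parallel arrays of row and column indices
pairs = np.argwhere(A > 2)       # same hits, one (row, col) pair per line
print(pairs.tolist())            # [[0, 1], [1, 0]]
print(A[rows, cols])             # [4 6]
```

The tuple from np.nonzero is the form to use for indexing back into A; the np.argwhere layout is more convenient for iterating over positions.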
If you are interested in the indices that would sort the array, use np.argsort(a):
a = np.random.randint(0, 100, 10)
sorted_idx = np.argsort(a)

Change the values of a NumPy array that are NOT in a list of indices

I have a NumPy array like:
a = np.arange(30)
I know that I can replace the values located at positions indices=[2,3,4] using for instance fancy indexing:
a[indices] = 999
But how to replace the values at the positions that are not in indices? Would be something like below?
a[ not in indices ] = 888
I don't know of a cleaner way than something like this:
mask = np.ones(a.shape,dtype=bool) #np.ones_like(a,dtype=bool)
mask[indices] = False
a[~mask] = 999
a[mask] = 888
Of course, if you prefer to use the NumPy data type, you could use dtype=np.bool_. There won't be any difference in the output; it's just a matter of preference.
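The mask approach can be wrapped in a small function that leaves the input untouched. This is only a sketch: fill_except is a hypothetical name, and indices may be anything NumPy accepts as an index into a, which also makes it usable for arrays with more than one dimension.

```python
import numpy as np

def fill_except(a, indices, value):
    """Return a copy of `a` with `value` everywhere except at `indices`."""
    keep = np.zeros(a.shape, dtype=bool)
    keep[indices] = True     # positions whose original values survive
    out = a.copy()
    out[~keep] = value
    return out

a = np.arange(30)
print(fill_except(a, [2, 3, 4], 888)[:6])  # [888 888   2   3   4 888]
```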
Only works for 1d arrays:
a = np.arange(30)
indices = [2, 3, 4]
ia = np.indices(a.shape)
not_indices = np.setxor1d(ia, indices)
a[not_indices] = 888
Obviously there is no general not operator for sets. Your choices are:
Subtracting your indices set from a universal set of indices (depends on the shape of a), but that will be a bit difficult to implement and read.
Some kind of iteration (probably the for-loop is your best bet since you definitely want to use the fact that your indices are sorted).
Creating a new array filled with new value, and selectively copying indices from the old one.
b = np.full(a.shape, 888)
b[indices] = a[indices]
I just ran into a similar situation and solved it this way:
a = np.arange(30)
indices=[2,3,4]
a[indices] = 999
not_in_indices = [x for x in range(len(a)) if x not in indices]
a[not_in_indices] = 888

Converting list of 2-D arrays into a 3-D array, adding elements along "fast" axes

I have a list of 2d numpy arrays. As a test, consider the following list:
lst = [np.arange(10).reshape(5,2)]*10
Now I can get at a particular data element by:
lst[k][j,i]
I would like to convert this to a numpy array so that I can index it:
array[k,j,i]
i.e., the shape should be (10, 5, 2).
This seems to work, but seems completely unnecessary:
z = np.empty((10,5,2))
for i,x in enumerate(z):
    x[:,:] = lst[i]
These don't work:
np.hstack(lst)
np.vstack(lst)
np.dstack(lst) #this is closest, but gives wrong shape (5, 2, 10)
I suppose I could pair a np.dstack with a np.rollaxis, but again, that doesn't seem quite right ...
Is there a good way to do this with numpy?
I've looked at this very related post, but I can't quite seem to work it out.
This should work simply by calling the array constructor, i.e. np.array(lst).
>>> l = [np.arange(10).reshape((5,2)) for i in range(10)]
>>> np.array(l).shape
(10, 5, 2)
Do you mean like
>>> lst = [np.arange(10).reshape(5,2)]*10
>>> arr = np.array(lst)
>>> arr.shape
(10, 5, 2)
?
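An explicit alternative to the array constructor is np.stack, which inserts a new axis at a position you choose. The sketch below uses a made-up list whose entries differ, so the indexing can be checked, and also shows the dstack route from the question with the last axis moved to the front.

```python
import numpy as np

# Ten distinct (5, 2) arrays, so that lst[k] depends on k
lst = [np.arange(10).reshape(5, 2) + 100 * k for k in range(10)]

arr = np.stack(lst, axis=0)                # new leading axis: shape (10, 5, 2)
alt = np.moveaxis(np.dstack(lst), -1, 0)   # dstack gives (5, 2, 10); move last axis first

print(arr.shape)                            # (10, 5, 2)
print(arr[3, 4, 1] == lst[3][4, 1])         # True
```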
