Consider numpy arrays arr1 and arr2. They can have any number of dimensions. For example:
arr1=np.zeros([5,8])
arr2=np.ones([4,10])
I would like to put arr2 into arr1 either by cutting off excess lengths in some dimensions, or filling missing length with zeros.
I have tried:
arr1[exec(str(",:"*len([arr1.shape]))[1:])]=arr2[exec(str(",:"*len([arr2.shape]))[1:])]
which is basically the same as
arr1[:,:]=arr2[:,:]
I would like to do this preferably in one line and without "for" loops.
You could use this:
arr1[:min(arr1.shape[0], arr2.shape[0]), :min(arr1.shape[1], arr2.shape[1])]=arr2[:min(arr1.shape[0], arr2.shape[0]), :min(arr1.shape[1], arr2.shape[1])]
This works without any for loop.
It's the same idea as your second attempt, but it uses min() to pick the shorter length along each dimension.
I solved this with the following, using slice() as @hpaulj suggested. Suppose I want to assign ph10 (an array) to ph14 (an array of zeros with shape bound1):
ph14=np.zeros(bound1)
ph10=np.array(list1)
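ind_min=np.min([ph14.shape,ph10.shape],0)
ph24=[]
for n2 in range(len(ind_min)):  # one slice per dimension
    ph24=ph24+[slice(0,ind_min[n2])]
ph24=tuple(ph24)  # index with a tuple of slices; recent NumPy rejects a list here
ph14[ph24]=ph10[ph24]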
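The per-dimension min() idea from the answers above can also be written without an explicit loop body. This is only a sketch: copy_overlap is a hypothetical helper name, and it assumes both arrays have the same number of dimensions.

```python
import numpy as np

def copy_overlap(dst, src):
    """Copy the overlapping corner of src into dst in place (hypothetical helper)."""
    # One slice per axis, each as long as the shorter of the two arrays.
    # Assumes dst.ndim == src.ndim.
    overlap = tuple(slice(0, min(d, s)) for d, s in zip(dst.shape, src.shape))
    dst[overlap] = src[overlap]
    return dst

arr1 = np.zeros([5, 8])
arr2 = np.ones([4, 10])
copy_overlap(arr1, arr2)
print(arr1.sum())  # 32.0, because a 4x8 block of ones was copied
```

Where arr2 is shorter than arr1, the untouched part of arr1 keeps its zeros, which gives the zero filling asked for in the question.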
I have a 3D array with the shape (9, 100, 7200). I want to remove the 2nd half of the 7200 values in every row so the new shape will be (9, 100, 3600).
What can I do to slice the array or delete the 2nd half of the indices? I was thinking np.delete(arr, [3601:7200], axis=2), but I get an invalid syntax error when using the colon.
Why not just slicing?
arr = arr[:,:,:3600]
The syntax error occurs because [3601:7200] is not valid Python: slice notation with a colon is only allowed inside an indexing expression. I assume you are trying to create a sequence of indices to pass as the obj parameter of the delete function. You could do that with something like the range function:
np.delete(arr, range(3600,7200), axis=2)
Keep in mind that this will not modify arr; it returns a new array with the elements deleted. Also, notice that I have used 3600, not 3601, because indexing starts at 0.
However, it's often better practice to use slicing in a problem like this:
arr[:,:,:3600]
This gives your required shape. Let me break this down a little. We are slicing a numpy array with 3 dimensions. A bare colon means we take everything in that dimension, and :3600 means we take the first 3600 elements in that dimension. A better way to think about deleting the last half is to think of it as keeping the first half.
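To see that the two approaches agree, here is a small sketch that uses a (2, 3, 8) array as a stand-in for the (9, 100, 7200) one. The variable names are illustrative only.

```python
import numpy as np

# Small stand-in for the (9, 100, 7200) array from the question.
arr = np.arange(2 * 3 * 8).reshape(2, 3, 8)
half = arr.shape[2] // 2

kept = arr[:, :, :half]       # basic slicing returns a view, no data is copied
also_kept = arr[..., :half]   # Ellipsis form, independent of the number of leading axes
deleted = np.delete(arr, np.arange(half, arr.shape[2]), axis=2)  # builds a new array

print(kept.shape)
print(np.array_equal(kept, deleted))
```

Because slicing returns a view, writing into kept also changes arr; call kept.copy() if an independent array is needed.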
If I have an array like this
arr=np.array([['a','b','c'],
['d','e','f']])
and an array of booleans of the same shape, like this:
boolarr=np.array([[False,True,False],
[True,True,True]])
I want to be able to only select the elements from the first array, that correspond to a True in the boolean array. So the output would be:
out=[['b'],
['d','e','f']]
I managed to solve this with a simple for loop
out=[]
for n, i in enumerate(arr):
    out.append(i[boolarr[n]])
out=np.array(out, dtype=object)  # rows have different lengths, so an object array is needed
but the problem is that this solution is slow for large arrays, and I was wondering if there was an easier solution with NumPy's indexing. Just using the normal notation arr[boolarr] returns a single flat array ['b','d','e','f']. I also tried using a slice with arr[:,[True,False,True]], which keeps the shape but can only use one boolean array.
Thanks for the comments. I misunderstood how an array works. For those curious this is my solution (I'm actually working with numbers):
arr[boolarr]=np.nan
And then I just changed how the rest of the function handles nan values.
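A rough sketch of that nan-based approach, assuming numeric data and assuming the goal is to keep the entries where the mask is True (the names values and keep are made up for illustration). The array stays rectangular, and the nan-aware reductions skip the masked entries.

```python
import numpy as np

values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
keep = np.array([[False, True, False],
                 [True, True, True]])

# Replace the entries we do not want with nan instead of removing them,
# so the result keeps the original 2-D shape.
masked = np.where(keep, values, np.nan)

row_sums = np.nansum(masked, axis=1)   # nan entries are ignored
row_counts = keep.sum(axis=1)          # number of kept entries per row
print(row_sums, row_counts)
```

This avoids ragged rows entirely, which is why it stays fast for large arrays.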
Given a Boolean numpy nd-array, how can I find out whether the total number of ones is greater than the total number of zeros in the array, without traversing the whole array with nested for loops? I mean a function in line with any() and all(), say max_bool(), which works as follows:
def max_bool(array):
    return array.ones >= array.zeros  # pseudocode: compare the count of ones with the count of zeros
Traversing is not an option, as the arrays I intend to work with have diverse, unpredictable dimensions and can be too large. I am not concerned about the exact number of ones and zeros either, just whether the array has more ones or more zeros, even if the number of ones is only one greater than the number of zeros. Any help?
Simplest way I can think of:
def max_bool(array):
    return array.mean() >= .5
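Note that the mean-based version returns True on a tie. If a strict majority is wanted, one possible variant counts the True entries instead. This is a sketch, and more_ones_than_zeros is a hypothetical name.

```python
import numpy as np

def more_ones_than_zeros(array):
    """Return True only if True entries strictly outnumber False entries."""
    array = np.asarray(array, dtype=bool)
    # Integer comparison: ones > zeros is the same as 2 * ones > total size.
    return 2 * np.count_nonzero(array) > array.size

print(more_ones_than_zeros(np.array([[True, False], [True, True]])))  # True
print(more_ones_than_zeros(np.array([True, False])))                  # False (tie)
```

Both versions work for any number of dimensions, since neither depends on the array's shape.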
I have two numpy arrays of the same shape. The elements in the arrays are random integers from [0,N]. I need to check which (if any) of the elements in the same position in the arrays are equal.
The output I need are the positions of the same elements.
mock code:
A=np.array([0,1])
B=np.array([1,0])
C=np.array([1,1])
np.any_elemenwise(A,B)
np.any_elemenwise(A,C)
np.any_elemenwise(A,A)
desired output:
[]
[1]
[0,1]
I can write a loop going through all of the elements one by one, but I assume that the desired output can be achieved much faster.
EDIT: The question changed.
You just want to evaluate np.where(v1==v2)[0], where v1 and v2 are the two arrays.
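Applied to the mock arrays from the question, that expression could be wrapped like this. It is a sketch for 1-D arrays, and equal_positions is a hypothetical name standing in for the question's mock function.

```python
import numpy as np

A = np.array([0, 1])
B = np.array([1, 0])
C = np.array([1, 1])

def equal_positions(v1, v2):
    """Indices at which two 1-D arrays of the same shape hold equal values."""
    return np.where(v1 == v2)[0]

print(equal_positions(A, B))  # []
print(equal_positions(A, C))  # [1]
print(equal_positions(A, A))  # [0 1]
```

For arrays with more than one dimension, np.argwhere(v1 == v2) returns one row of indices per matching position.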
Suppose I have an N*M*X-dimensional array "data", where N and M are fixed, but X is variable for each entry data[n][m].
(Edit: To clarify, I just used np.array() on the 3D python list which I used for reading in the data, so the numpy array is of dimensions N*M and its entries are variable-length lists)
I'd now like to compute the average over the X-dimension, so that I'm left with an N*M-dimensional array. Using np.average/mean with the axis argument doesn't work, so the way I'm doing it right now is just iterating over N and M and appending the manually computed average to a new list, but that just doesn't feel very "Pythonic":
avgData=[]
for n in data:
    temp=[]
    for m in n:
        temp.append(np.average(m))
    avgData.append(temp)
Am I missing something obvious here? I'm trying to freshen up my python skills while I'm at it, so interesting/varied responses are more than welcome! :)
Thanks!
What about using np.vectorize:
do_avg = np.vectorize(np.average)
data_2d = do_avg(data)
data = np.array([[1,2,3],[0,3,2,4],[0,2],[1]], dtype=object).reshape(2,2)  # dtype=object is required for ragged lists in recent NumPy
avg=np.zeros(data.shape)
avg.flat=[np.average(x) for x in data.flat]
print(avg)
#array([[ 2. , 2.25],
# [ 1. , 1. ]])
This still iterates over the elements of data (nothing un-Pythonic about that). But since there's nothing special about the shape or axes of data, I'm just using data.flat. Appending is the natural way to build a Python list, but with numpy it is better to assign values to the elements of an existing array.
There are fast numeric methods to work with numpy arrays, but most (if not all) work with simple numeric dtypes. Here the array elements are objects (either lists or arrays), so numpy has to resort to the usual Python iteration and list operations.
For this small example, this solution is a bit faster than Zwicker's vectorize. For larger data the two solutions take about the same time.
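For comparison, here is a sketch that runs both approaches on the same ragged data. It builds the object array element by element, which avoids relying on how np.array treats ragged input. The names rows, avg_flat and avg_vec are illustrative, and otypes=[float] is passed so that np.vectorize does not have to infer the output type.

```python
import numpy as np

rows = [[1, 2, 3], [0, 3, 2, 4], [0, 2], [1]]

# Build a 1-D object array whose elements are the variable-length lists.
data = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    data[i] = row
data = data.reshape(2, 2)

# Approach 1: iterate over data.flat and reshape the result.
avg_flat = np.array([np.mean(x) for x in data.flat]).reshape(data.shape)

# Approach 2: np.vectorize, which is also a Python-level loop internally.
avg_vec = np.vectorize(np.mean, otypes=[float])(data)

print(avg_flat)
print(np.allclose(avg_flat, avg_vec))
```

Both are Python-level loops underneath, which is consistent with the similar timings reported above for larger data.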