I have three 2-D numpy arrays, each with shape (3,7).
I want to take the (0,0) element from each of the arrays, pass these values to a function and store the returned value at the (0,0) index of a new 2-D array.
Then I want to take the (0,1) element from each of the arrays, pass these values to the same function and store the returned value at the (0,1) index of the same new array.
I want to run this for all the columns and then move on to the next row and continue till the end of the array.
The catch here is that I don't want to use loops, just the numpy methods. Been struggling a lot on this lately. Any ideas would be of great help.
Thanks!
I am running a loop like this for now. It gives me back the result for each element in the 1st row only. Here a, b and c are the three 2-D arrays that I mentioned earlier.
# the function name was missing in the original post; row_sum is a placeholder
def row_sum(a, b, c):
    count = 0
    for i in range(0, 7):
        count += -c[0][i] - ((a[0][i] - b[0][i]) / c[0][i])**2
    return count
Since all three arrays are the same shape, and you're operating on each element in the same way, you can easily translate this to vectorised NumPy operations like so:
# res is a 2-D array of the same shape as a, b and c
res = -c - ((a - b) / c)**2
It looks like in your example code you're trying to sum each row, so you can do this after performing the operations:
count = np.sum(res, axis=1)
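As a quick sanity check, here is a minimal sketch comparing the vectorised expression with an explicit double loop. The random input arrays (and the offset added to c to keep it away from zero) are made up for illustration and are not from the question.

```python
import numpy as np

# Made-up inputs with the shape from the question (3 rows, 7 columns).
rng = np.random.default_rng(0)
a = rng.random((3, 7))
b = rng.random((3, 7))
c = rng.random((3, 7)) + 1.0  # offset is an assumption, avoids division by ~0

# Vectorised: every operation is applied element by element.
res = -c - ((a - b) / c) ** 2
count = np.sum(res, axis=1)  # one sum per row

# Reference implementation with explicit loops, for comparison only.
expected = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        expected[i, j] = -c[i, j] - ((a[i, j] - b[i, j]) / c[i, j]) ** 2

print(np.allclose(res, expected))
print(count.shape)
```

Both versions should agree, with count holding one value per row.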
Related
How can I index the last axis of a Numpy array if I don't know its rank in advance?
Here is what I want to do: Let a be a Numpy array of unknown rank. I want the slice of the last k elements of the last axis.
If a is 1D, I want
b = a[-k:]
If a is 2D, I want
b = a[:, -k:]
If a is 3D, I want
b = a[:, :, -k:]
and so on.
I want this to work regardless of the rank of a (as long as the rank is at least 1).
The fact that I want the last k elements in the example is irrelevant of course, the point is that I want to specify indices for whatever the last axis is when I don't know the rank of an array in advance.
b = a[..., -k:]
This is mentioned in the docs.
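A small sketch of how the Ellipsis behaves for arrays of different rank. The helper name last_k is hypothetical and only used for illustration; it assumes k is at least 1.

```python
import numpy as np

def last_k(a, k):
    """Return the last k elements along the last axis (assumes k >= 1)."""
    # The Ellipsis expands to as many full slices as needed for the leading axes.
    return a[..., -k:]

one_d = np.arange(5)
two_d = np.arange(6).reshape(2, 3)
three_d = np.arange(24).reshape(2, 3, 4)

for arr in (one_d, two_d, three_d):
    print(arr.ndim, last_k(arr, 2).shape)
```

For each rank, only the length of the last axis changes; the leading axes are left untouched.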
Suppose I have the following Numpy nd array:
array([[['a',0,0,0],
[0,'b','c',0],
['e','d',0,0]]])
Now I would like to define 'double connections' of elements as follows:
We consider each column in this array as a time instant, and all elements in this instant are considered to happen at the same time. 0 means nothing happens. For example, a and e happen at the first time instant, b and d happen at the second time instant, and c happens by itself at the third time instant.
If two elements happen at the same time instant, I consider them to have 'double connections', and I would like to print the connections like this (if there is no such pair in a column, just move on to the next column until the end):
('a','e')
('e','a')
('b','d')
('d','b')
I tried to come up with solutions that iterate over all the columns, but they did not work. Can anyone share some tips on this?
You can recreate the original array by the following commands
array = np.array([['a',0,0,0],
[0,'b','c',0],
['e','d',0,0]],dtype=object)
You could count how many non-zero elements each column has, select the columns with exactly two, and collect their non-zero elements as one pair per column. Then repeat each pair and reverse every second row:
# transpose so that each row holds one selected column (time instant)
selected = array[:, (array != 0).sum(axis=0) == 2].T
pairs = np.repeat(selected[selected.nonzero()].reshape((-1, 2)), 2, axis=0)
pairs[1::2] = pairs[1::2, ::-1]
If you want to convert these to tuples like in your desired output you could just do a list comprehension:
output = [tuple(pair) for pair in pairs]
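If readability matters more than avoiding a Python loop, a plainer alternative might look like the sketch below. It is not the approach from the answer above: it walks the columns explicitly and uses itertools.permutations to emit both orderings of each pair.

```python
import numpy as np
from itertools import permutations

array = np.array([['a', 0, 0, 0],
                  [0, 'b', 'c', 0],
                  ['e', 'd', 0, 0]], dtype=object)

output = []
for column in array.T:  # each column is one time instant
    events = [x for x in column if x != 0]
    if len(events) == 2:
        # both orderings of the pair, e.g. ('a', 'e') and ('e', 'a')
        output.extend(permutations(events, 2))

for pair in output:
    print(pair)
```

Columns with zero, one, or more than two events are simply skipped.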
I have a numpy array of shape (20, 3). (So 20 3 by 1 arrays. Correct me if I'm wrong, I am still pretty new to python)
I need to separate it into 3 arrays of shape 20,1 where the first array is 20 elements that are the 0th element of each 3 by 1 array. Second array is also 20 elements that are the 1st element of each 3 by 1 array, etc.
I am not sure if I need to write a function for this. Here is what I have tried:
Essentially I'm trying to create an array of 3 20 by 1 arrays that I can later index to get the separate 20 by 1 arrays.
a = np.load() #loads file
num=20 #the num is if I need to change array size
num_2=3
for j in range(0,num):
    for l in range(0,num_2):
        array_elements = np.zeros(3)
        array_elements[l] = a[j:][l]
This gives the following error:
'''
ValueError: setting an array element with a sequence
'''
I have also tried making it a dictionary and making the dictionary values lists that are appended, but it only gives the first or last value of the 20 that I need.
Your array has shape (20, 3), this means it's a 2-dimensional array with 20 rows and 3 columns in each row.
You can access data in this array by indexing with numbers, or with ':' to indicate ranges. You want to split this into 3 arrays, one per column. To do this you can pick the column with a number and use ':' to mean 'all of the rows'. So, to access the three different columns: a[:, 0], a[:, 1] and a[:, 2]. Note that each of these has shape (20,); if you need shape (20, 1) exactly, slice the column instead, e.g. a[:, 0:1].
You can then assign these to separate variables if you wish, e.g. arr = a[:, 0], but this is just a view of the original data in array a. This means any changes in arr will also be made to the corresponding data in a.
If you want to create a new array so this doesn't happen, you can use the .copy() method. If you set arr = a[:, 0].copy(), arr is completely separate from a and changes made to one will not affect the other.
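The difference between a view and a copy can be seen in a short sketch. The array contents below are made up, since the file loaded in the question is not available.

```python
import numpy as np

# Stand-in for the (20, 3) array loaded in the question.
a = np.arange(60).reshape(20, 3)

view = a[:, 0]                # shape (20,), shares memory with a
column = a[:, 0:1]            # shape (20, 1), also a view
independent = a[:, 0].copy()  # shape (20,), owns its own data

view[0] = -1          # also changes a[0, 0]
independent[1] = -2   # leaves a[1, 0] untouched

print(view.shape, column.shape)
print(a[0, 0], a[1, 0])
print(np.shares_memory(view, a), np.shares_memory(independent, a))
```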
Essentially you want to group your array elements by their column index. There are plenty of ways of doing this. Since numpy does not have a group-by method, one option is to horizontally split the array into a new array and reshape it.
old_length = 3
new_length = 20
a = np.array(np.hsplit(a, old_length)).reshape(old_length, new_length)
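Edit: You can get close to the same effect by rotating the array by -90 degrees. You can do this by using rot90 and setting k=-1 or k=3, telling numpy to rotate by 90 degrees k times. Note, however, that the rotation leaves each row in reversed order, so it has to be flipped back to match the result above; the plain transpose a.T gives the same result directly.
a = np.rot90(a, k=-1)[:, ::-1]  # equivalent to a.T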
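A short check on a made-up (20, 3) array may help when comparing these options. It suggests that the hsplit-and-reshape result matches the plain transpose, while a bare clockwise rotation yields the same shape with each row in reversed order.

```python
import numpy as np

# Stand-in for the (20, 3) array from the question.
a = np.arange(60).reshape(20, 3)

split = np.array(np.hsplit(a, 3)).reshape(3, 20)  # one row per original column
transposed = a.T                                  # same layout, no copying
rotated = np.rot90(a, k=-1)                       # clockwise rotation

print(np.array_equal(split, transposed))
print(np.array_equal(rotated, transposed))
print(np.array_equal(rotated[:, ::-1], transposed))
```

After any of the matching variants, row i holds the i-th element of each original row.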
How do I null certain values in a numpy array based on a condition?
I don't understand why I end up with 0 instead of null or empty values where the condition is not met... b is a numpy array populated with 0 and 1 values, c is another fully populated numpy array. All arrays are 71x71x166
a = np.empty((71,71,166))
d = np.empty((71,71,166))
for indexes, value in np.ndenumerate(b):
    i,j,k = indexes
    a[i,j,k] = np.where(b[i,j,k] == 1, c[i,j,k], d[i,j,k])
I want to end up with an array which only has values where the condition is met and is empty everywhere else, but without changing its shape.
FULL ISSUE FOR CLARIFICATION as asked for:
I start with a float populated array with shape (71,71,166)
I make an int array based on a cutoff applied to the float array basically creating a number of bins, roughly marking out 10 areas within the array with 0 values in between
What I want to end up with is an array with shape (71,71,166) which has the average values in a particular array direction (assuming vertical direction, if you think of a 3D array as a 3D cube) of a certain "bin"...
so I was trying to loop through the "bins" b == 1, b == 2 etc, sampling the float where that condition is met but being null elsewhere so I can take the average, and then recombine into one array at the end of the loop....
Not sure if I'm making myself understood. I'm using the np.where and using the indexing as I keep getting errors when I try and do it without although it feels very inefficient.
Consider this example:
import numpy as np
data = np.random.random((4,3))
mask = np.random.randint(0, 2, (4,3))  # random 0/1 values; random_integers is deprecated
data[mask==0] = np.nan
The data will be set to nan wherever the mask is 0. You can use any kind of condition you want, of course, or do something different for different values in b.
To erase everything except a specific bin, try the following:
c[b!=1] = np.nan
So, to make a copy of everything in a specific bin:
a = np.copy(c)
a[b!=1] = np.nan
To get the average of everything in a bin:
np.mean(c[b==1])
So perhaps this might do what you want (where bins is a list of bin values):
a = np.empty(c.shape)
a[b==0] = np.nan
for bin_value in bins:
    a[b==bin_value] = np.mean(c[b==bin_value])
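To make the per-bin averaging concrete, here is a minimal sketch on a tiny made-up example. The arrays b and c and the list bins are illustrative only, and np.full is used so that cells outside every bin start out as NaN.

```python
import numpy as np

# Tiny stand-ins for the float array c and the integer bin array b.
c = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([[0, 1, 1],
              [2, 2, 0]])
bins = [1, 2]  # hypothetical list of bin labels; 0 means "no bin"

a = np.full(c.shape, np.nan)  # start with NaN everywhere
for bin_value in bins:
    selected = b == bin_value
    a[selected] = np.mean(c[selected])  # fill the bin with its own average

print(a)
```

Each cell in a bin ends up holding that bin's average, and cells with b == 0 stay NaN.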
np.empty does not initialise the array: its contents are undefined, so finding 0's there is perfectly valid. For example, try this instead:
d = np.full((71, 71, 166), np.nan)
But consider using numpy's strength, and don't iterate over the array:
a = np.where(b, c, d)
(since b is 0 or 1, I've excluded the explicit comparison b == 1.)
You may even want to consider using a masked array instead:
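a = np.ma.masked_where(b == 0, c)
which seems to make more sense with respect to your question: "how do I null certain values in a numpy array based on a condition" (replace "null" with "mask" and you're done). Note that masked_where hides the entries where the condition is true, so the condition here selects the values to mask out.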
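The two approaches can be compared side by side in a small sketch. The arrays are made up (a deterministic 0/1 pattern for b), and the shape is much smaller than the 71x71x166 arrays in the question.

```python
import numpy as np

# Small stand-ins for the arrays in the question.
c = np.arange(24, dtype=float).reshape(2, 3, 4)
b = np.arange(24).reshape(2, 3, 4) % 2  # alternating 0/1 pattern

# Option 1: keep c where b == 1, NaN elsewhere (shape is preserved).
nan_filled = np.where(b == 1, c, np.nan)

# Option 2: masked array that hides everything where b is not 1.
masked = np.ma.masked_where(b != 1, c)

# Both give the same average over the selected values.
print(np.nanmean(nan_filled))
print(masked.mean())
print(c[b == 1].mean())
```

With NaN fill, NaN-aware functions such as np.nanmean are needed, whereas the masked array skips hidden entries in its own methods.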
Does anyone know how to combine integer indices in numpy? Specifically, I've got the results of a few np.wheres and I would like to extract the elements that are common between them.
For context, I am trying to populate a large 3d array with the number of elements that are between boundary values of each cell, i.e. I have records of individual events including their time, latitude and longitude. I want to grid this into a 3D frequency matrix, where the dimensions are time, lat and lon.
I could loop round the array elements doing an np.where(timeCondition & latCondition & lonCondition) and populating each cell with the length of the where result, but I figured this would be very inefficient as you would have to repeat a lot of the wheres.
What would be better is to just have a list of wheres for each of the cells in each dimension, and then loop through the logically combining them?
As @ali_m said, using a bitwise and should be much faster, but to answer your question:
Call ravel_multi_index() to convert the multi-dimensional indices into 1-D indices.
Call intersect1d() to get the indices that satisfy both conditions.
Call unravel_index() to convert the 1-D indices back into multi-dimensional indices.
Here is the code:
import numpy as np
a = np.random.rand(10, 20, 30)
idx1 = np.where(a>0.2)
idx2 = np.where(a<0.4)
ridx1 = np.ravel_multi_index(idx1, a.shape)
ridx2 = np.ravel_multi_index(idx2, a.shape)
ridx = np.intersect1d(ridx1, ridx2)
idx = np.unravel_index(ridx, a.shape)
np.allclose(a[idx], a[(a>0.2) & (a<0.4)])
or you can use ridx directly:
a.ravel()[ridx]
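For the gridding problem described in the question, it may be possible to avoid combining np.where results altogether by using np.histogramdd, which counts points per cell in one call. The sketch below is only an illustration: the event arrays and the bin edges are made up, and it cross-checks one cell against combined boolean conditions.

```python
import numpy as np

# Made-up event records: one time, latitude and longitude per event.
rng = np.random.default_rng(0)
n_events = 1000
times = rng.uniform(0.0, 10.0, n_events)
lats = rng.uniform(-90.0, 90.0, n_events)
lons = rng.uniform(-180.0, 180.0, n_events)

# Hypothetical cell boundaries along each dimension.
time_edges = np.linspace(0.0, 10.0, 6)      # 5 time bins
lat_edges = np.linspace(-90.0, 90.0, 10)    # 9 latitude bins
lon_edges = np.linspace(-180.0, 180.0, 19)  # 18 longitude bins

sample = np.column_stack([times, lats, lons])  # shape (n_events, 3)
counts, edges = np.histogramdd(sample, bins=[time_edges, lat_edges, lon_edges])

# Cross-check a single cell using combined boolean conditions.
in_cell = ((times >= time_edges[2]) & (times < time_edges[3]) &
           (lats >= lat_edges[4]) & (lats < lat_edges[5]) &
           (lons >= lon_edges[7]) & (lons < lon_edges[8]))

print(counts.shape)
print(counts[2, 4, 7], in_cell.sum())
```

Here counts[i, j, k] is the number of events whose time, latitude and longitude fall in the corresponding cell.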