Let's say I have:
one = np.array([[2, 3, np.array([[1, 2], [7, 3]])],
                [4, 5, np.array([[11, 12], [14, 15]])]], dtype=object)
two = np.array([[1, 2], [7, 3],
                [11, 12], [14, 15]])
I want to compare the values stored in the inner arrays of one with the values in two.
I am talking about the
[1,2] ,[7, 3],
[11,12] , [14,15]
So I want to check whether they are the same, element by element.
Probably something like:
for idx, x in np.ndenumerate(one):
    for idy, y in np.ndenumerate(two):
        print(y)
which gives all the elements of two.
I can't figure out how to access, at the same time, the elements of one (only the last item of each row) and compare them with two.
The problem is that they don't have the same dimensions.
This works
np.r_[tuple(one[:, 2])] == two
Output:
array([[ True, True],
[ True, True],
[ True, True],
[ True, True]], dtype=bool)
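The same comparison can also be written with np.vstack and reduced to a single answer with np.array_equal. A minimal sketch, assuming every inner array in the last column of one has the same number of columns as two (the names stacked, elementwise and all_equal are only illustrative):

```python
import numpy as np

one = np.array([[2, 3, np.array([[1, 2], [7, 3]])],
                [4, 5, np.array([[11, 12], [14, 15]])]], dtype=object)
two = np.array([[1, 2], [7, 3],
                [11, 12], [14, 15]])

# one[:, 2] is a 1-D object array holding the two inner 2x2 arrays.
# Stacking them row-wise gives a regular (4, 2) integer array.
stacked = np.vstack(list(one[:, 2]))

elementwise = stacked == two               # boolean array, one entry per element
all_equal = np.array_equal(stacked, two)   # a single True/False
```

np.array_equal also returns False (instead of raising or broadcasting) when the shapes differ, which may be convenient if the inner arrays are not guaranteed to line up with two.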
In a linked comment, @George tried to work with:
In [246]: a
Out[246]: array([1, [2, [33, 44, 55, 66]], 11, [22, [77, 88, 99, 100]]], dtype=object)
In [247]: a.shape
Out[247]: (4,)
This is a 4-element array. If we reshape it, we can isolate an inner layer:
In [257]: a.reshape(2,2)
Out[257]:
array([[1, [2, [33, 44, 55, 66]]],
[11, [22, [77, 88, 99, 100]]]], dtype=object)
In [258]: a.reshape(2,2)[:,1]
Out[258]: array([[2, [33, 44, 55, 66]], [22, [77, 88, 99, 100]]], dtype=object)
This last result has shape (2,): it holds 2 lists. We can isolate the 2nd item in each list with a comprehension, and create an array from the resulting lists:
In [260]: a1=a.reshape(2,2)[:,1]
In [261]: [i[1] for i in a1]
Out[261]: [[33, 44, 55, 66], [77, 88, 99, 100]]
In [263]: np.array([i[1] for i in a1])
Out[263]:
array([[ 33, 44, 55, 66],
[ 77, 88, 99, 100]])
Nothing fancy here - just paying attention to array shapes, and using list operations where arrays don't work.
Is there a way to do multiple indexing in a numpy array as described below?
arr=np.array([55, 2, 3, 4, 5, 6, 7, 8, 9])
arr[np.arange(0,2):np.arange(5,7)]
output:
IndexError: too many indices for array
Desired output:
array([[55,2,3,4,5],[2,3,4,5,6]])
This problem might be similar to calculating a moving average over an array (but I want to do it without relying on a ready-made function for that).
Here's an approach using strides -
start_index = np.arange(0,2)
L = 5 # Interval length
n = arr.strides[0]
strided = np.lib.stride_tricks.as_strided
out = strided(arr[start_index[0]:],shape=(len(start_index),L),strides=(n,n))
Sample run -
In [976]: arr
Out[976]: array([55, 52, 13, 64, 25, 76, 47, 18, 69, 88])
In [977]: start_index
Out[977]: array([2, 3, 4])
In [978]: L = 5
In [979]: out
Out[979]:
array([[13, 64, 25, 76, 47],
[64, 25, 76, 47, 18],
[25, 76, 47, 18, 69]])
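On NumPy 1.20 or later, the same windows can be produced without setting strides by hand. A sketch using numpy.lib.stride_tricks.sliding_window_view with the array from the question; the broadcasting variant at the end is an alternative added here, not part of the answer above:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.array([55, 2, 3, 4, 5, 6, 7, 8, 9])
start_index = np.arange(0, 2)
L = 5  # interval length

# All length-L windows as a read-only view, shape (len(arr) - L + 1, L).
windows = sliding_window_view(arr, L)
out = windows[start_index]  # selecting rows with an index array makes a copy

# Alternative: build a 2-D index array by broadcasting and fancy-index once.
idx = start_index[:, None] + np.arange(L)
out_fancy = arr[idx]
```

Both variants raise an IndexError for a start position that would run past the end of arr, whereas a hand-written as_strided call with a wrong shape can silently read memory outside the array.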
I have a numpy array:
import numpy as np
a = np.array([2, 56, 4, 8, 564])
and I want to add two elements: one at the beginning of the array, 88, and one at the end, 77.
I can do this with:
a = np.insert(np.append(a, [77]), 0, 88)
so that a ends up looking like:
array([ 88, 2, 56, 4, 8, 564, 77])
The question: what is the correct way of doing this? I feel like nesting a np.append in a np.insert is quite likely not the pythonic way to do this.
Another way to do that would be to use numpy.concatenate. Example -
np.concatenate([[88],a,[77]])
Demo -
In [62]: a = np.array([2, 56, 4, 8, 564])
In [64]: np.concatenate([[88],a,[77]])
Out[64]: array([ 88, 2, 56, 4, 8, 564, 77])
You can use np.concatenate -
np.concatenate(([88],a,[77]))
You can pass the list of indices to np.insert:
>>> np.insert(a,[0,5],[88,77])
array([ 88, 2, 56, 4, 8, 564, 77])
Or, if you don't know the length of your array, you can use array.size to specify the end of the array:
>>> np.insert(a,[0,a.size],[88,77])
array([ 88, 2, 56, 4, 8, 564, 77])
What about:
a = np.hstack([88, a, 77])
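A quick check that the suggestions above agree, plus np.r_ as one more spelling (np.r_ is not mentioned in the answers; it is added here only for comparison):

```python
import numpy as np

a = np.array([2, 56, 4, 8, 564])

via_concatenate = np.concatenate(([88], a, [77]))
via_insert = np.insert(a, [0, a.size], [88, 77])
via_hstack = np.hstack([88, a, 77])
via_r = np.r_[88, a, 77]
```

All four return a new array and leave a untouched, since NumPy arrays cannot grow in place.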
I have a multiband catalog of radiation sources (from SourceExtractor, if you care to know), which I have read into an astropy table in the following form:
Source # | FLUX_APER_BAND1 | FLUXERR_APER_BAND1 ... FLUX_APER_BANDN | FLUXERR_APER_BANDN
1 np.array(...) np.array(...) ... np.array(...) np.array(...)
...
The arrays in FLUX_APER_BAND1, FLUXERR_APER_BAND1, etc. each have 14 elements, which give the number of photon counts for a given source in a given band, within 14 different distances from the center of the source (aperture photometry). I have the array of apertures (2, 3, 4, 6, 8, 10, 14, 20, 28, 40, 60, 80, 100, and 160 pixels), and I want to interpolate the 14 samples into a single (assumed) count at some other aperture a.
I could iterate over the sources, but the catalog has over 3000 of them, and that's not very pythonic or very efficient (interpolating 3000 objects in 8 bands would take a while). Is there a way of interpolating all the arrays in a single column simultaneously, to the same aperture? I tried simply applying np.interp, but that threw ValueError: object too deep for desired array, as well as np.vectorize(np.interp), but that threw ValueError: object of too small depth for desired array. It seems like aggregation should also be possible over the contents of a single column, but I can't make sense of the documentation.
Can someone shed some light on this? Thanks in advance!
I'm not familiar with the format of an astropy table, but it looks like it could be represented as a three-dimensional numpy array, with axes for source, band and aperture. If that is the case, you can use, for example, scipy.interpolate.interp1d. Here's a simple example.
In [51]: from scipy.interpolate import interp1d
Make some sample data. The "table" y is 3-D, with shape (2, 3, 14). Think of it as the array holding the counts for 2 sources, 3 bands and 14 apertures.
In [52]: x = np.array([2, 3, 4, 6, 8, 10, 14, 20, 28, 40, 60, 80, 100, 160])
In [53]: y = np.array([[x, 2*x, 3*x], [x**2, (x+1)**3//400, (x**1.5).astype(int)]])
In [54]: y
Out[54]:
array([[[ 2, 3, 4, 6, 8, 10, 14, 20, 28,
40, 60, 80, 100, 160],
[ 4, 6, 8, 12, 16, 20, 28, 40, 56,
80, 120, 160, 200, 320],
[ 6, 9, 12, 18, 24, 30, 42, 60, 84,
120, 180, 240, 300, 480]],
[[ 4, 9, 16, 36, 64, 100, 196, 400, 784,
1600, 3600, 6400, 10000, 25600],
[ 0, 0, 0, 0, 1, 3, 8, 23, 60,
172, 567, 1328, 2575, 10433],
[ 2, 5, 8, 14, 22, 31, 52, 89, 148,
252, 464, 715, 1000, 2023]]])
Create the interpolator. interp1d creates a linear interpolator by default. (Check out the docstring for different interpolators. Also, before calling interp1d, you might want to transform your data in such a way that linear interpolation is appropriate.) I use axis=2 to interpolate along the aperture axis. f will be a function that takes an aperture value and returns an array with shape (2,3).
In [55]: f = interp1d(x, y, axis=2)
Take a look at a couple of y slices. These correspond to apertures 2 and 3 (i.e. x[0] and x[1]).
In [56]: y[:,:,0]
Out[56]:
array([[2, 4, 6],
[4, 0, 2]])
In [57]: y[:,:,1]
Out[57]:
array([[3, 6, 9],
[9, 0, 5]])
Use the interpolator to get the values at apertures 2, 2.5 and 3. As expected, the values at 2 and 3 match the values in y.
In [58]: f(2)
Out[58]:
array([[ 2., 4., 6.],
[ 4., 0., 2.]])
In [59]: f(2.5)
Out[59]:
array([[ 2.5, 5. , 7.5],
[ 6.5, 0. , 3.5]])
In [60]: f(3)
Out[60]:
array([[ 3., 6., 9.],
[ 9., 0., 5.]])
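If the per-source arrays of a column are stored separately rather than as one ready-made 3-D array, they can first be stacked into a 2-D array and then interpolated in a single step. A NumPy-only sketch of linear interpolation at one target aperture; the column data flux_col and the helper interp_rows are hypothetical, and extracting the column from the astropy table is not shown:

```python
import numpy as np

apertures = np.array([2, 3, 4, 6, 8, 10, 14, 20, 28, 40, 60, 80, 100, 160],
                     dtype=float)

# Hypothetical column: one 14-element array per source.
flux_col = [apertures, 2 * apertures, apertures ** 2]
flux = np.vstack(flux_col)  # shape (n_sources, 14)


def interp_rows(x, y, x_new):
    """Linearly interpolate every row of y (sampled at sorted x) at scalar x_new."""
    # Index of the left neighbour, clipped so that i + 1 stays in range.
    i = np.clip(np.searchsorted(x, x_new) - 1, 0, len(x) - 2)
    t = (x_new - x[i]) / (x[i + 1] - x[i])
    return y[:, i] + t * (y[:, i + 1] - y[:, i])


counts = interp_rows(apertures, flux, 2.5)  # one value per source
```

Note that for x_new outside the sampled range this helper extrapolates from the nearest segment, whereas np.interp clamps to the end values.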
About being Pythonic, key aspects of that are simplicity, readability, and practicality. If your case is really a one-off (i.e. you'll be doing the 3000 x 8 interpolations a few times rather than a million times), then the fastest and most easily understood solution would be the simple one of just iterating with Python loops. By fastest I mean from the time you know your question until the time you have an answer from your code.
The overhead of looping and calling a function 24000 times is quite small in human / astronomer time scales, and definitely much lower than writing a Stack Overflow post. :-)
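For reference, the plain-loop version is only a few lines. A sketch assuming each row of a hypothetical 2-D array flux holds the 14 counts of one source in one band (the sample values are made up):

```python
import numpy as np

apertures = np.array([2, 3, 4, 6, 8, 10, 14, 20, 28, 40, 60, 80, 100, 160],
                     dtype=float)
# Hypothetical data: 3 sources, 14 aperture samples each.
flux = np.vstack([apertures, 2 * apertures, apertures ** 2])

target = 2.5  # aperture to interpolate to

# np.interp works on one 1-D sample array at a time, so loop over the sources.
counts = np.array([np.interp(target, apertures, row) for row in flux])
```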
When I was trying to solve a scientific problem with Python (Numpy), a 'shape mismatch' error came up: "shape mismatch: objects cannot be broadcast to a single shape". I managed to reproduce the same error in a simpler form, as shown below:
import numpy as np
nx = 3; ny = 5
ff = np.ones([nx,ny,7])
def test(x, y):
    z = 0.0
    for i in range(7):
        z = z + ff[x, y, i]
    return z
print(test(np.arange(nx), np.arange(ny)))
When I call test(x, y) with x=1, y=np.arange(ny), everything works fine. So what's going on here? Why can't both parameters be numpy arrays?
UPDATE
I have worked out the problem with some hints from @Saullo Castro. Here's some updated info for those of you who tried to help but were unclear about my intention:
Basically I created a mesh grid with dimension nx*ny and another array ff that stores some value for each node. In the above code, ff has 7 values for each node and I was trying to sum up the 7 values to get a new nx*ny array.
However, the "shape mismatch" error is not caused by the summation, as many of you might have guessed by now. I had misunderstood how functions handle ndarray objects passed as input parameters. Passing np.arange(nx), np.arange(ny) to test() is not going to give me what I wanted, even if nx==ny.
Back to my original intention: I solved the problem by creating another function that uses np.fromfunction to create the array:
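def tt(x, y):
    # dtype=int: fromfunction passes float index grids by default,
    # which cannot be used to index ff
    return np.fromfunction(lambda a, b: test(a, b), (x, y), dtype=int)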
which is not perfect but it works. (In this example there seems to be no need to create a new function, but in my actual code I modified it a bit so it can be used for a slice of the grid.)
Anyway, I do believe there's a much better way compared to my kind of dirty solution. So if you have any idea about that, please share with us :).
Let's look into an array similar to your ff array:
nx = 3; ny = 4
ff = np.arange(nx*ny*5).reshape(nx,ny,5)
#array([[[ 0, 1, 2, 3, 4],
# [ 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14],
# [15, 16, 17, 18, 19]],
#
# [[20, 21, 22, 23, 24],
# [25, 26, 27, 28, 29],
# [30, 31, 32, 33, 34],
# [35, 36, 37, 38, 39]],
#
# [[40, 41, 42, 43, 44],
# [45, 46, 47, 48, 49],
# [50, 51, 52, 53, 54],
# [55, 56, 57, 58, 59]]])
When you index with arrays of indices a, b, c, as in ff[a, b, c], the index arrays must have the same shape (or be broadcastable to a common shape), and numpy will build a new array from the elements they select. For example:
ff[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 2, 3], [0, 0, 0, 1, 1, 1]]
#array([ 0, 5, 20, 26, 51, 56])
This is called fancy indexing, which is like building an array with:
np.array([ff[0, 0, 0], ff[0, 1, 0], ff[1, 0, 0], ..., ff[2, 3, 1]])
In your case ff[x, y, i] produces a shape mismatch error because x and y have different lengths (nx and ny) and cannot be broadcast to a common shape.
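A short check of the equivalence described above, using the same example array. The try/except at the end only records that mismatched index arrays raise; the exact exception type depends on the NumPy version, so both IndexError and ValueError are caught:

```python
import numpy as np

nx, ny = 3, 4
ff = np.arange(nx * ny * 5).reshape(nx, ny, 5)

a = [0, 0, 1, 1, 2, 2]
b = [0, 1, 0, 1, 2, 3]
c = [0, 0, 0, 1, 1, 1]

# Fancy indexing picks ff[a[k], b[k], c[k]] for every position k.
picked = ff[a, b, c]
explicit = np.array([ff[i, j, k] for i, j, k in zip(a, b, c)])

# Index arrays of lengths 3 and 4 cannot be broadcast together.
try:
    ff[np.arange(nx), np.arange(ny), 0]
    mismatch_raised = False
except (IndexError, ValueError):
    mismatch_raised = True
```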
It looks like you want to sum ff over the last dimension, with the first two dimensions covering their whole range. : is used to denote the whole range of a dimension:
def test():
    z = 0.0
    for i in range(7):
        z = z + ff[:, :, i]
    return z
print(test())
But you can get the same result without looping, by using the sum method.
print(ff.sum(axis=-1))
: is shorthand for 0:n
ff[0:nx, 0:ny, 0]==ff[:,:,0]
It is possible to index a block of ff with ranges, but you have to be much more careful about the shapes of the indexing arrays. For a beginner it is better to focus on getting slicing and broadcasting correct.
edit -
You can index an array like ff with arrays generated by meshgrid:
I,J = np.meshgrid(np.arange(nx),np.arange(ny),indexing='ij',sparse=False)
I.shape # (nx,ny)
ff[I,J,:]
also works with
I,J = np.meshgrid(np.arange(nx),np.arange(ny),indexing='ij',sparse=True)
I.shape # (nx,1)
J.shape # (1, ny)
ogrid and mgrid are alternatives to meshgrid.
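A small sketch of the ogrid variant, using the same example ff as above. The sub-block selection at the end is an illustration added here, not something from the question:

```python
import numpy as np

nx, ny = 3, 4
ff = np.arange(nx * ny * 5).reshape(nx, ny, 5)

# ogrid returns "open" (sparse) index grids: I is (nx, 1), J is (1, ny).
I, J = np.ogrid[0:nx, 0:ny]

block = ff[I, J, :]            # broadcasts to the full (nx, ny, 5) array

# Slicing the index grids selects a sub-block: rows 0-1, columns 1-2, layer 0.
sub = ff[I[:2], J[:, 1:3], 0]
```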
Let's reproduce your problem in the 2D case, so it is easier to see:
import numpy as np
a = np.arange(15).reshape(3,5)
x = np.arange(3)
y = np.arange(5)
Demo:
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
>>> a[x, y] # <- This is the error that you are getting
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
# You are getting the error because x and y have different lengths.
# If x and y had the same length, the code would work:
>>> a[x, x]
array([ 0, 6, 12])
# mixing arrays and scalars is not a problem
>>> a[x, 2]
array([ 2, 7, 12])
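If the goal is every combination of the row indices in x with the column indices in y, rather than element-wise pairs, np.ix_ builds index arrays that broadcast against each other. A sketch with the same a, x and y (whether this is what the question needs is an assumption):

```python
import numpy as np

a = np.arange(15).reshape(3, 5)
x = np.arange(3)
y = np.arange(5)

# np.ix_ turns x into a (3, 1) array and y into a (1, 5) array.
grid = a[np.ix_(x, y)]

# Equivalent spelling with explicit reshaping of x.
same = a[x[:, None], y]

# Works for any subset of rows and columns.
rows_cols = a[np.ix_([0, 2], [1, 4])]
```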
It is not clear from your question what you are trying to do or what result you are expecting. It seems, though, that you are trying to calculate a total with your variable z.
Check if the sum method produces the result that you need:
import numpy as np
nx = 3; ny = 5
ff = np.arange(nx*ny*7).reshape(nx,ny,7)
print(ff.sum()) # 5460
print(ff.sum(axis=0)) # array([[105, 108, 111, 114, 117, 120, 123],
# [126, 129, 132, 135, 138, 141, 144],
# [147, 150, 153, 156, 159, 162, 165],
# [168, 171, 174, 177, 180, 183, 186],
# [189, 192, 195, 198, 201, 204, 207]]) shape (5,7)
print(ff.sum(axis=1)) # array([[ 70, 75, 80, 85, 90, 95, 100],
# [245, 250, 255, 260, 265, 270, 275],
# [420, 425, 430, 435, 440, 445, 450]]) shape (3,7)
print(ff.sum(axis=2)) # array([[ 21, 70, 119, 168, 217],
# [266, 315, 364, 413, 462],
# [511, 560, 609, 658, 707]]) shape (3,5)