numpy.resize() rearranging instead of resizing? - python

I'm trying to resize a numpy array, but resize seems to work by first flattening the array, then taking the first X*Y elements and putting them into the new shape. What I want instead is to cut the array at coordinate (3, 3) without rearranging it. A similar thing happens when I try to upsize it, say to (7, 7): instead of "rearranging", I want to fill the new columns and rows with zeros and keep the data as it is.
Is there a way to do that?
> a = np.zeros((5,5))
> a.flat = range(25)
> a
array(
[[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[ 10., 11., 12., 13., 14.],
[ 15., 16., 17., 18., 19.],
[ 20., 21., 22., 23., 24.]])
> a.resize((3,3),refcheck=False)
> a
array(
[[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
thank you ...
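The behaviour described in the question can be reproduced explicitly. The following is a minimal sketch (the names `b` and `expected` are illustrative and not from the original post) showing that an in-place resize to a smaller shape keeps the first elements of the row-major flattened data:

```python
import numpy as np

a = np.arange(25, dtype=float).reshape(5, 5)

# First 9 elements of the flattened (row-major) data, reshaped to 3x3.
expected = a.ravel()[:9].reshape(3, 3)

# In-place resize on an independent copy so that `a` itself is left untouched.
b = a.copy()
b.resize((3, 3), refcheck=False)

print(np.array_equal(b, expected))
```

This is why the result contains 0..8 rather than the top-left 3x3 block of the original array.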

Upsizing to 7x7 goes like this (with a being the original 5x5 array):
upsized = np.zeros([7, 7])
upsized[:5, :5] = a
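As an alternative sketch, the same zero padding can probably be written with np.pad, assuming the new rows and columns should be appended at the bottom and right only:

```python
import numpy as np

a = np.arange(25, dtype=float).reshape(5, 5)

# Pad widths per axis: (before, after). Here: 0 rows before / 2 after,
# and 0 columns before / 2 after, all filled with zeros.
upsized = np.pad(a, ((0, 2), (0, 2)), mode="constant", constant_values=0)

print(upsized.shape)
```

Like the zeros-then-assign version, this returns a new array and leaves a unchanged.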

I believe you want to use numpy's slicing syntax instead of resize. resize works by first raveling the array and working with a 1D view.
>>> a = np.arange(25).reshape(5,5)
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> a[:3,:3]
array([[ 0, 1, 2],
[ 5, 6, 7],
[10, 11, 12]])
Slicing like this returns a view of the numpy array rather than a copy, so assigning to the slice updates the original array:
>>> a[:3,:3] = 0
>>> a
array([[ 0, 0, 0, 3, 4],
[ 0, 0, 0, 8, 9],
[ 0, 0, 0, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
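If the cropped block should be independent of the original, calling .copy() on the slice is one option. A small sketch of the difference (the names `view` and `cropped` are illustrative):

```python
import numpy as np

a = np.arange(25).reshape(5, 5)

view = a[:3, :3]            # shares memory with `a`
cropped = a[:3, :3].copy()  # owns its own data

view[0, 0] = -1             # also changes a[0, 0]
cropped[1, 1] = -2          # leaves a[1, 1] untouched

print(np.shares_memory(a, view), np.shares_memory(a, cropped))
```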
An excellent guide on numpy's slicing syntax can be found here.
Upsizing (or padding) only works by making a copy of the data. You start with an array of zeros and fill in the existing values:
upsized = np.zeros([7, 7])
upsized[:5, :5] = a
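Cropping and zero-padding can be combined into one helper. The sketch below uses a hypothetical function name, crop_or_pad, which is not part of numpy; it copies only the region where the old and new shapes overlap:

```python
import numpy as np

def crop_or_pad(a, new_shape, fill=0):
    """Hypothetical helper: return a copy of `a` cut or filled to `new_shape`."""
    out = np.full(new_shape, fill, dtype=a.dtype)
    # Along each axis, keep as many leading entries as fit in both shapes.
    region = tuple(slice(0, min(old, new)) for old, new in zip(a.shape, new_shape))
    out[region] = a[region]
    return out

a = np.arange(25, dtype=float).reshape(5, 5)
small = crop_or_pad(a, (3, 3))  # top-left 3x3 block
big = crop_or_pad(a, (7, 7))    # original data plus two zero rows and columns
print(small)
```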

Related

Clean way to return default when taking minimum of empty NumPy array

I have two arrays, one holding a series of years and another holding some quantities. I want to study for each year how long it takes for the quantity to double.
For this I wrote this code:
years = np.arange(2020, 2060)
qma = np.array([8.00000000e+13, 8.14928049e+13, 8.30370113e+13, 8.46353044e+13,
8.62905581e+13, 8.80058517e+13, 8.97844887e+13, 9.16300175e+13,
9.35462542e+13, 9.55373083e+13, 9.76076116e+13, 9.97619497e+13,
1.02005499e+14, 1.04343864e+14, 1.06783128e+14, 1.09329900e+14,
1.11991375e+14, 1.14775397e+14, 1.17690539e+14, 1.20746183e+14,
1.23952624e+14, 1.27321176e+14, 1.30864305e+14, 1.34595778e+14,
1.38530838e+14, 1.74048570e+14, 1.92205500e+14, 2.14405932e+14,
2.42128686e+14, 2.77655470e+14, 3.24688168e+14, 3.89624819e+14,
4.84468500e+14, 6.34373436e+14, 9.74364148e+14, 2.33901669e+15,
1.78934647e+16, 4.85081278e+20, 8.63469750e+21, 2.08204297e+22])
def doubling_year(idx):
    try:
        return years[qma >= 2*qma[idx]].min()
    except ValueError:
        return np.nan
years_until_doubling = [doubling_year(idx) - years[idx]
                        for idx in range(len(years))]
This works as I expect, but having to define a named function for what is essentially a one-liner feels wrong. Is there a cleaner and more succinct way of replicating this behaviour?
For each year in the series, the number of years in which the quantity is at least double the original quantity can be computed through broadcasting. Because qma increases monotonically here, you then simply have to subtract that number from the number of years remaining in the series and use np.nan wherever the count is 0.
In [426]: n_doubled = np.sum(qma[None, :] >= 2*qma[:, None], axis=1)
In [427]: n_doubled
Out[427]:
array([15, 15, 15, 15, 15, 14, 14, 14, 14, 14, 13, 13, 13, 13, 13, 12, 12,
12, 12, 12, 11, 11, 11, 11, 11, 9, 9, 8, 8, 7, 6, 6, 6, 5,
5, 4, 3, 2, 1, 0])
In [428]: np.where(n_doubled, np.arange(len(years), 0, -1) - n_doubled, np.nan)
Out[428]:
array([25., 24., 23., 22., 21., 21., 20., 19., 18., 17., 17., 16., 15.,
14., 13., 13., 12., 11., 10., 9., 9., 8., 7., 6., 5., 6.,
5., 5., 4., 4., 4., 3., 2., 2., 1., 1., 1., 1., 1.,
nan])
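A variant of the broadcasting idea that does not rely on qma being sorted is to look up the first position where the doubling condition holds. This is only a sketch on made-up toy values (not the data from the question), and the names `first` and `has_double` are illustrative:

```python
import numpy as np

years = np.arange(2020, 2026)
qma = np.array([1.0, 1.5, 2.0, 3.5, 4.0, 9.0])  # toy values for illustration

# mask[i, j] is True where qma[j] is at least double qma[i].
mask = qma[None, :] >= 2 * qma[:, None]

first = mask.argmax(axis=1)     # index of the first True in each row (0 if none)
has_double = mask.any(axis=1)   # rows that contain at least one True

years_until_doubling = np.where(has_double, years[first] - years, np.nan)
print(years_until_doubling)
```

Since years is increasing, the first matching index corresponds to the minimum matching year, which is what the original doubling_year function returns.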

find infinity values and replace with maximum per vector in a numpy array

Suppose I have the following array with shape (3, 5) :
array = np.array([[1, 2, 3, inf, 5],
[10, 9, 8, 7, 6],
[4, inf, 2, 6, inf]])
Now I want to find the infinity values per vector and replace them with the maximum of that vector, with a lower limit of 1.
So the output for this example should be:
array_solved = np.array([[1, 2, 3, 5, 5],
[10, 9, 8, 7, 6],
[4, 6, 2, 6, 6]])
I could do this by looping over every vector of the array and applying:
idx_inf = np.isinf(array_vector)
max_value = np.max(np.append(array_vector[~idx_inf], 1.0))
array_vector[idx_inf] = max_value
But I guess there is a faster way.
Does anyone have an idea?
One way is to first convert infs to NaNs with np.isinf masking and then replace the NaNs with the per-row maximum from np.nanmax. The row maxima need keepdims=True so that they broadcast across each row:
array[np.isinf(array)] = np.nan
array = np.where(np.isnan(array), np.nanmax(array, axis=1, keepdims=True), array)
to get
>>> array
array([[ 1., 2., 3., 5., 5.],
[10., 9., 8., 7., 6.],
[ 4., 6., 2., 6., 6.]])
import numpy as np
array = np.array([[1, 2, 3, np.inf, 5],
[10, 9, 8, 7, 6],
[4, np.inf, 2, 6, np.inf]])
n, m = array.shape
array[np.isinf(array)] = -np.inf
mx_array = np.repeat(np.max(array, axis=1), m).reshape(n, m)
ind = np.where(np.isinf(array))
array[ind] = mx_array[ind]
Output array:
array([[ 1., 2., 3., 5., 5.],
[10., 9., 8., 7., 6.],
[ 4., 6., 2., 6., 6.]])
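Neither version above applies the lower limit of 1 that the question asks for. A possible sketch that includes it is shown below; the fourth row is an extra made-up row added only to exercise the limit, and the names `inf_mask`, `row_fill` and `solved` are illustrative:

```python
import numpy as np

array = np.array([[1, 2, 3, np.inf, 5],
                  [10, 9, 8, 7, 6],
                  [4, np.inf, 2, 6, np.inf],
                  [np.inf, 0.5, np.inf, 0.2, 0.1]])  # extra row: finite max below 1

inf_mask = np.isinf(array)

# Treat infs as -inf so they never win the row maximum.
finite_max = np.where(inf_mask, -np.inf, array).max(axis=1, keepdims=True)

# Apply the lower limit of 1 to the fill value of each row.
row_fill = np.maximum(finite_max, 1.0)

solved = np.where(inf_mask, row_fill, array)
print(solved)
```

This builds a new array instead of modifying the input in place, and a row consisting only of infs would be filled with 1.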

Unique entries in columns of a 2D numpy array

I have an array of integers:
import numpy as np
demo = np.array([[1, 2, 3],
[1, 5, 3],
[4, 5, 6],
[7, 8, 9],
[4, 2, 3],
[4, 2, 12],
[10, 11, 13]])
And I want an array of unique values in the columns, padded with something if necessary (e.g. nan):
[[1, 4, 7, 10, nan],
[2, 5, 8, 11, nan],
[3, 6, 9, 12, 13]]
It does work when I iterate over the transposed array and use a boolean_indexing solution from a previous question. But I was hoping there would be a built-in method:
solution = []
for row in np.unique(demo.T, axis=1):
    solution.append(np.unique(row))
def boolean_indexing(v, fillval=np.nan):
    lens = np.array([len(item) for item in v])
    mask = lens[:,None] > np.arange(lens.max())
    out = np.full(mask.shape,fillval)
    out[mask] = np.concatenate(v)
    return out
print(boolean_indexing(solution))
AFAIK, there is no built-in solution for that. That being said, your solution seems a bit complex to me. You could create an array with initialized values and fill it with a simple loop (since you already use loops anyway).
solution = [np.unique(row) for row in np.unique(demo.T, axis=1)]
result = np.full((len(solution), max(map(len, solution))), np.nan)
for i,arr in enumerate(solution):
    result[i][:len(arr)] = arr
If you want to avoid the loop you could do:
demo = demo.astype(np.float32) # nan only works on floats
sort = np.sort(demo, axis=0)
diff = np.diff(sort, axis=0)
np.place(sort[1:], diff == 0, np.nan)
sort.sort(axis=0)
edge = np.argmax(sort, axis=0).max()
result = sort[:edge]
print(result.T)
Output:
array([[ 1., 4., 7., 10., nan],
[ 2., 5., 8., 11., nan],
[ 3., 6., 9., 12., 13.]], dtype=float32)
Not sure if this is any faster than the solution given by Jérôme.
EDIT
A slightly better solution
demo = demo.astype(np.float32)
sort = np.sort(demo, axis=0)
mask = np.full(sort.shape, False, dtype=bool)
np.equal(sort[1:], sort[:-1], out=mask[1:])
np.place(sort, mask, np.nan)
edge = (~mask).sum(0).max()
result = np.sort(sort, axis=0)[:edge]
print(result.T)
Output:
array([[ 1., 4., 7., 10., nan],
[ 2., 5., 8., 11., nan],
[ 3., 6., 9., 12., 13.]], dtype=float32)
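For comparison, the padding step can also be left to the standard library. This is a sketch using itertools.zip_longest, not a numpy built-in for the task; the name `uniques` is illustrative:

```python
from itertools import zip_longest

import numpy as np

demo = np.array([[1, 2, 3],
                 [1, 5, 3],
                 [4, 5, 6],
                 [7, 8, 9],
                 [4, 2, 3],
                 [4, 2, 12],
                 [10, 11, 13]])

# Sorted unique values of each column (the lists may differ in length).
uniques = [np.unique(col) for col in demo.T]

# zip_longest pads the shorter lists with nan; transpose back to one row per column.
result = np.array(list(zip_longest(*uniques, fillvalue=np.nan))).T
print(result)
```

It still loops in Python over the columns, so it is a readability option rather than a speed-up.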

Classifying dots in matrix (Python)

I have a big matrix, around 600x600, with 9 dots, one in each of 9 equal sectors (laid out like a tic-tac-toe # grid).
I need to turn it into a 3x3 array with the IDs of the dots in these sectors, like:
[[id2,id1,id5],[id4,id6,id7],[id3,id8,id9]]
Dividing the plane into 9 smaller planes works really badly. I need something like relative positions, and I don't even know the words I need to google.
def classificator(val):
    global A
    global closed
    height, width = map(int, closed.shape)
    h1 = height // 3
    w1 = width // 3
    h2 = height // 3 * 2
    w2 = width // 3 * 2
    for x in range(len(val)):
        xcoord = val[x][0]
        ycoord = val[x][1]
        if 0 <= val[x][0] < h1 and 0 <= val[x][1] < w1 and A[0, 0] == '_': #top left X
            A[0, 0] = val[x][2]
Following from the comments above. This is still asking for clarification but shows a way of interpreting your question.
In [1]: import numpy as np
In [2]: data_in=np.fromfunction(lambda r, c: 10*r+c, (6, 6))
# Create an array where the values give an indication of where they are in the array.
In [3]: data_in
Out[3]:
array([[ 0., 1., 2., 3., 4., 5.],
[10., 11., 12., 13., 14., 15.],
[20., 21., 22., 23., 24., 25.],
[30., 31., 32., 33., 34., 35.],
[40., 41., 42., 43., 44., 45.],
[50., 51., 52., 53., 54., 55.]])
In [4]: slices=[np.s_[0:3], np.s_[3:6] ]
In [5]: slices
Out[5]: [slice(0, 3, None), slice(3, 6, None)]
In [8]: result=np.zeros((4,3,3), dtype=np.int32)
In [9]: ix=0
In [12]: for rows in slices:
    ...:     for columns in slices:
    ...:         result[ix,:,:]=data_in[rows, columns]
    ...:         ix+=1
    ...:
In [13]: result
Out[13]:
array([[[ 0, 1, 2],
[10, 11, 12], # Top Left in data_in
[20, 21, 22]],
[[ 3, 4, 5],
[13, 14, 15], # Top Right in data_in
[23, 24, 25]],
[[30, 31, 32],
[40, 41, 42], # Bottom Left in data_in
[50, 51, 52]],
[[33, 34, 35],
[43, 44, 45], # Bottom Right in data_in
[53, 54, 55]]], dtype=int32)
Can you use it as a basis to explain what you expect to see?
If your input data was only 6 by 6 what would it look like and what would you expect to see coming out?
Edits: Two typos corrected.
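One possible reading of "relative positions" in the question is to rank the dots against each other instead of against fixed sector borders. The sketch below assumes exactly nine dots given as (row, col, id) triples with integer ids; the coordinates are made up for illustration:

```python
import numpy as np

# Hypothetical input: one (row, col, id) triple per dot, nine dots in total.
dots = np.array([
    (310,  95, 4), ( 90, 290, 1), (520, 110, 3),
    (100, 480, 5), (300, 305, 6), (505, 300, 8),
    ( 80,  70, 2), (295, 510, 7), (515, 495, 9),
])

by_row = dots[np.argsort(dots[:, 0])]       # order the dots from top to bottom
bands = by_row.reshape(3, 3, 3)             # three horizontal bands of three dots
order = np.argsort(bands[:, :, 1], axis=1)  # left-to-right order inside each band
ids = np.take_along_axis(bands[:, :, 2], order, axis=1)
print(ids)
```

Because only the ordering of the coordinates matters, this does not depend on the dots sitting exactly inside equal thirds of the image. It would give wrong results if a dot were missing or if the grid were strongly rotated.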

An elegant way of inserting a numpy matrix into another

I have a requirement where I have 2 2D numpy arrays, and I would like to combine them in a specific manner:
x = [[0, 1, 2],
[3, 4, 5],
[6, 7, 8]]
| | |
0 1 2
y = [[10, 11, 12],
[13, 14, 15],
[16, 17, 18]]
| | |
3 4 5
x op y = [ 0 3 1 4 2 5 ] (in terms of the columns)
In other words,
The combination of x and y should look something like this:
[[ 0., 10., 1., 11., 2., 12.],
[ 3., 13., 4., 14., 5., 15.],
[ 6., 16., 7., 17., 8., 18.]]
Where I alternately combine the columns of each individual array to form the final 2D array. I have come up with one way of doing so, but it is rather ugly. Here's my code:
x = np.arange(9).reshape(3, 3)
y = np.arange(start=10, stop=19).reshape(3, 3)
>>> a = np.zeros((6, 3)) # create a 2D array where num_rows(a) = num_cols(x) + num_cols(y)
>>> a[: : 2] = x.T
>>> a[1: : 2] = y.T
>>> a.T
array([[ 0., 10., 1., 11., 2., 12.],
[ 3., 13., 4., 14., 5., 15.],
[ 6., 16., 7., 17., 8., 18.]])
As you can see, this is a very ugly sequence of operations. Furthermore, things become even more cumbersome in higher dimensions. For example, if you have x and y to be [3 x 3 x 3], then this operation has to be repeated in each dimension. So I'd probably have to tackle this with a loop.
Is there a simpler way around this?
Thanks.
In [524]: x=np.arange(9).reshape(3,3)
In [525]: y=np.arange(10,19).reshape(3,3)
This doesn't look at all ugly to me (one-liners are overrated):
In [526]: a = np.zeros((3,6),int)
....
In [528]: a[:,::2]=x
In [529]: a[:,1::2]=y
In [530]: a
Out[530]:
array([[ 0, 10, 1, 11, 2, 12],
[ 3, 13, 4, 14, 5, 15],
[ 6, 16, 7, 17, 8, 18]])
Still, if you want a one-liner, this might do:
In [535]: np.stack((x.T,y.T),axis=1).reshape(6,3).T
Out[535]:
array([[ 0, 10, 1, 11, 2, 12],
[ 3, 13, 4, 14, 5, 15],
[ 6, 16, 7, 17, 8, 18]])
The idea behind this last one was to combine the arrays along a new dimension and then reshape them in some other way. I found it by trial and error.
And with another trial:
In [539]: np.stack((x,y),2).reshape(3,6)
Out[539]:
array([[ 0, 10, 1, 11, 2, 12],
[ 3, 13, 4, 14, 5, 15],
[ 6, 16, 7, 17, 8, 18]])
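The stack-then-reshape idea seems to carry over to higher dimensions when the interleaving is along the last axis. A minimal sketch, assuming x and y have the same shape (the helper name interleave_last_axis is hypothetical):

```python
import numpy as np

def interleave_last_axis(x, y):
    """Hypothetical helper: alternate entries of x and y along the last axis."""
    stacked = np.stack((x, y), axis=-1)  # shape (..., n, 2)
    # Merging the last two axes yields x0, y0, x1, y1, ... along the last axis.
    return stacked.reshape(x.shape[:-1] + (2 * x.shape[-1],))

x = np.arange(9).reshape(3, 3)
y = np.arange(10, 19).reshape(3, 3)
out2d = interleave_last_axis(x, y)

x3 = np.arange(27).reshape(3, 3, 3)
y3 = x3 + 100
out3d = interleave_last_axis(x3, y3)
print(out2d)
```

For the 3x3x3 case the result has shape (3, 3, 6), with x3 at the even positions and y3 at the odd positions of the last axis.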
Here is a compact way to write it with a loop; it might be generalizable to higher-dimensional arrays with a little work:
import numpy as np
x = np.array([[0,1,2], [3,4,5], [6,7,8]])
y = np.array([[10,11,12], [13,14,15], [16,17,18]])
z = np.zeros((3,6))
for i in range(3):
    # Stack row i of x over row i of y, then read column-wise to interleave them
    z[i] = np.vstack((x[i],y[i])).reshape((-1,),order='F')
