Python Create vertical Numpy array - python

I have written code that creates an array from my lists. The array must be vertical, like a column vector; the problem is that calling the reshape method does not seem to do anything.
import numpy as np
data = [[ 28, 29, 30, 19, 20, 21],
[ 31, 32, 33, 22, 23, 24],
[ 1, 34, 35, 36, 25, 26],
[ 2, 19, 20, 21, 10, 11],
[ 3, 4, 5, 6, 7, 8 ]]
index = []
for i in range(len(data)):
    index.append([data[i][0], data[i][1], data[i][2],
                  data[i][3], data[i][4], data[i][5]])
y = np.array([index[i]])
# y.reshape(6,1)
Is there any solution for these cases? Thank you.
I'm looking for something like this to remain:

If you want to view each row as a column, convert data to an array (for example data = np.array(data)) and transpose it in any one of the following ways:
index = data.T
index = np.transpose(data)
index = data.transpose()
index = np.swapaxes(data, 0, 1)
index = np.moveaxis(data, 1, 0)
...
Each column of index will be a row of data. If you just want to access one column at a time, you can do that too. For example, to get row 3 (4th row) of the original array, any of the following would work:
y = data[3, :]
y = data[3]
y = index[:, 3]
You can get a column vector from the result by explicitly reshaping it to one:
y = y.reshape(-1, 1)
y = np.reshape(y, (-1, 1))
y = np.expand_dims(y, 1)
Remember that reshaping returns a new array object which, whenever possible, views the same data as the original. The only way I know to reshape an array in-place is to assign to its shape attribute:
y.shape = (y.size, 1)
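As a minimal sketch of the steps above, the following converts the nested list from the question to an array, picks one row, and turns it into a column vector. The names arr, row and col are illustrative and not from the original post.

```python
import numpy as np

data = [[28, 29, 30, 19, 20, 21],
        [31, 32, 33, 22, 23, 24],
        [1, 34, 35, 36, 25, 26],
        [2, 19, 20, 21, 10, 11],
        [3, 4, 5, 6, 7, 8]]

arr = np.array(data)      # shape (5, 6); a plain list has no .T or .reshape
row = arr[3]              # 1-D array, shape (6,)
col = row.reshape(-1, 1)  # column vector, shape (6, 1)

print(col.shape)          # (6, 1)
print(col[:, 0].tolist()) # [2, 19, 20, 21, 10, 11]
```

Note that reshape returns the reshaped array rather than modifying row, so the result has to be assigned to a name.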

You can use flatten() from numpy https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html
(if you want a copy of the original array without modifying the original)
import numpy as np
data = [[ 28, 29, 30, 19, 20, 21],
[ 31, 32, 33, 22, 23, 24],
[ 1, 34, 35, 36, 25, 26],
[ 2, 19, 20, 21, 10, 11],
[ 3, 4, 5, 6, 7, 8 ]]
data = np.array(data).flatten()
print(data.shape)
(30,)
You can also use ravel()
(if you don't want a copy)
data = np.array(data).ravel()
If your array is always 2-D, this also works:
data = data.reshape(-1)
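The copy-versus-view distinction between flatten() and ravel() can be checked directly. This is a small sketch on a hypothetical 2 x 3 array; ravel() returns a view here because the array is C-contiguous, and it may return a copy for other memory layouts.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

flat_copy = arr.flatten()  # always a new array
flat_view = arr.ravel()    # a view here, because arr is C-contiguous

print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True

flat_view[0] = 99          # writes through to arr
print(arr[0, 0])           # 99
print(flat_copy[0])        # 0, the copy is unaffected
```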

Related

How can I extract a set of 2D slices from a larger 2D numpy array?

If I have a large 2D numpy array and 2 arrays which correspond to the x and y indices I want to extract, it's easy enough:
h = np.arange(49).reshape(7,7)
# h = [[0, 1, 2, 3, 4, 5, 6],
# [7, 8, 9, 10, 11, 12, 13],
# [14, 15, 16, 17, 18, 19, 20],
# [21, 22, 23, 24, 25, 26, 27],
# [28, 29, 30, 31, 32, 33, 34],
# [35, 36, 37, 38, 39, 40, 41],
# [42, 43, 44, 45, 46, 47, 48]]
x_indices = np.array([1,3,4])
y_indices = np.array([2,3,5])
reduced_h = h[x_indices, y_indices]
#reduced_h = [ 9, 24, 33]
However, I would like to, for each x,y pair cut out a square (denoted by 'a' - the number of indices in each direction from the centre) surrounding this 'coordinate' and return an array of these little 2D arrays.
For example, for h, x,y_indices as above and a=1:
reduced_h = [[[1,2,3],[8,9,10],[15,16,17]], [[16,17,18],[23,24,25],[30,31,32]], [[25,26,27],[32,33,34],[39,40,41]]]
i.e one 3x3 array for each x-y index pair corresponding to the 3x3 square of elements centred on the x-y index. In general, this should return a numpy array which has shape (len(x_indices),2a+1, 2a+1)
By analogy to reduced_h[0] = h[x_indices[0]-1:x_indices[0]+2 , y_indices[0]-1:y_indices[0]+2] = h[1-1:1+2 , 2-1:2+2] = h[0:3, 1:4] (the stop index of a slice is exclusive), my first try was the following:
h[x_indices-a : x_indices+a+1, y_indices-a : y_indices+a+1]
However, perhaps unsurprisingly, slicing between the arrays fails.
So the obvious next thing to try is to create this slice manually. np.arange seems to struggle with this but linspace works:
a=1
xrange = np.linspace(x_indices-a, x_indices+a, 2*a+1, dtype=int)
# xrange = [ [0, 2, 3], [1, 3, 4], [2, 4, 5] ]
yrange = np.linspace(y_indices-a, y_indices+a, 2*a+1, dtype=int)
Now I can try h[xrange,yrange], but this unsurprisingly works element-wise, meaning I get only one (2a+1)x(2a+1) array (the same dimensions as xrange and yrange). Is there a way to, for every index, take the right slices from these ranges (without loops)? Or is there a way to make the broadcast work initially without having to set up linspace explicitly? Thanks
You can index np.lib.stride_tricks.sliding_window_view using your x and y indices:
import numpy as np
h = np.arange(49).reshape(7,7)
x_indices = np.array([1,3,4])
y_indices = np.array([2,3,5])
a = 1
window = (2*a+1, 2*a+1)
out = np.lib.stride_tricks.sliding_window_view(h, window)[x_indices-a, y_indices-a]
out:
array([[[ 1, 2, 3],
[ 8, 9, 10],
[15, 16, 17]],
[[16, 17, 18],
[23, 24, 25],
[30, 31, 32]],
[[25, 26, 27],
[32, 33, 34],
[39, 40, 41]]])
Note that you may need to pad h first to handle windows around your coordinates that reach "outside" h.
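One way the padding could look is sketched below, assuming NumPy 1.20 or later (for sliding_window_view) and a hypothetical fill value of -1 to mark cells outside h. After padding a cells on every side, the window centred on (x, y) in h starts at (x, y) in the padded array, so the indices no longer need the -a offset.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

h = np.arange(49).reshape(7, 7)
x_indices = np.array([0, 3, 6])  # includes coordinates on the edge of h
y_indices = np.array([0, 3, 6])
a = 1
fill = -1                        # hypothetical marker for "outside h"

# Pad a cells on every side of h with the fill value.
padded = np.pad(h, a, mode="constant", constant_values=fill)
windows = sliding_window_view(padded, (2 * a + 1, 2 * a + 1))
out = windows[x_indices, y_indices]

print(out.shape)         # (3, 3, 3)
print(out[0].tolist())   # [[-1, -1, -1], [-1, 0, 1], [-1, 7, 8]]
```

Whether a constant fill, edge replication (mode="edge") or something else is appropriate depends on what the windows are used for.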

define a 3d numpy array with a range in column-wise format

I want to define a 3d numpy array with shape = [3, 3, 5] and also with values as a range starting with 11 and step = 3 in column-wise manner. I mean:
B[:,:,0] = [[ 11, 20, 29],
[ 14, 23, 32],
[17, 26, 35]]
B[:,:,1] = [[ 38, ...],
[ 41, ...],
[ 44, ...]]
...
I am new to numpy and suspect this can be done with np.arange or np.mgrid, but I don't know how.
How can this be done with minimal lines of code?
You can calculate the end of the range by multiplying the total number of elements (the product of the shape) by the step and adding the start. Then it's just a reshape and a transpose to move the axes around:
import numpy as np
start = 11
step = 3
shape = [5, 3, 3]
end = np.prod(shape) * step + start
B = np.arange(start, end, step).reshape(shape).transpose(2, 1, 0)
B[:, :, 0]
# array([[11, 20, 29],
# [14, 23, 32],
# [17, 26, 35]])
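As an alternative sketch (not part of the original answer), the same layout can be obtained by reshaping in Fortran order, where the first axis varies fastest. The names count, values, B_f and B_t are illustrative.

```python
import numpy as np

start, step = 11, 3
shape = (3, 3, 5)
count = int(np.prod(shape))  # 45 elements in total

values = np.arange(start, start + count * step, step)
B_f = values.reshape(shape, order="F")                # column-wise fill
B_t = values.reshape(shape[::-1]).transpose(2, 1, 0)  # reshape + transpose

print(np.array_equal(B_f, B_t))   # True
print(B_f[:, 0, 1].tolist())      # [38, 41, 44]
```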

Assign a new column that uses exponential function on the index of the numpy array dynamically

Let's say I have an array of the below nature:
x = np.arange(30).reshape((10,3))
x
Out[52]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26],
[27, 28, 29]])
How do I add a fourth column to each row such that this column is an exponential function of the row index, ending up with something like this:
array([[ 0, 1, 2, 2.718281828],
[ 3, 4, 5, 7.389056099],
[ 6, 7, 8, 20.08553692],
[ 9, 10, 11, 54.59815003 ],
[12, 13, 14, 148.4131591],
[15, 16, 17, 403.4287935],
[18, 19, 20, 1096.633158 ],
[21, 22, 23, 2980.957987],
[24, 25, 26, 8103.083928],
[27, 28, 29, 22026.46579]])
Computing the exponential is easy:
ex = np.exp(np.arange(x.shape[0]) + 1)
What you want to do with it is a whole different story. Numpy doesn't allow heterogeneous arrays, unlike say pandas. So with the simple answer, your result will be float64 (x is most likely int64 or int32):
x = np.concatenate((x, ex[:, None]), axis=1)
An alternative is using structured arrays, which will let you preserve the input types:
d = [('', x.dtype)] * x.shape[1] + [('', ex.dtype)]
out = np.empty(ex.shape, dtype=d)
Bulk assignment is a bit tricky, but can be done with a view obtained from the raw ndarray constructor:
view = np.ndarray(buffer=out, dtype=x.dtype, shape=x.shape, strides=(out.dtype.itemsize, x.dtype.itemsize))
view[...] = x
np.ndarray(buffer=out, dtype=ex.dtype, shape=ex.shape, strides=(out.dtype.itemsize,), offset=x.strides[0])[:] = ex
A simpler approach would be to use recarray, as @PaulPanzer suggests:
out = np.rec.fromarrays([*x.T, ex])
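To illustrate what the record array gives you, here is a small sketch that uses np.rec.fromarrays, the same function reached through the public np.rec namespace. Since no field names are passed, NumPy assigns the default names f0, f1, f2, ...

```python
import numpy as np

x = np.arange(30).reshape(10, 3)
ex = np.exp(np.arange(x.shape[0]) + 1)

# One input array per field: the three columns of x, then ex.
out = np.rec.fromarrays([*x.T, ex])

print(out.dtype.names)            # ('f0', 'f1', 'f2', 'f3')
print(out.f0.dtype == x.dtype)    # True, integer columns stay integer
print(out.f3.dtype)               # float64
```

Each field keeps its own dtype, which is the point of using a structured or record array here instead of a plain float64 array.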
Try this:
import numpy as np
a = np.arange(30).reshape((10,3))
b = np.zeros((a.shape[0], a.shape[1] + 1))
b[:, :-1] = a
b[:, 3] = np.exp(np.arange(1, len(b) + 1))  # exponents 1..10, as in the expected output
To create a single array of powers of e starting at one, you can use
powers = np.power(np.e, np.arange(10) + 1)
Which basically takes the number e and raises it to the powers given by the array np.arange(10) + 1, i.e. the numbers [1...10].
You can then add this as an additional column by first reshaping it and then adding it using np.hstack.
powers = powers.reshape(-1, 1)
x = np.hstack((x, powers))
You can construct such a column with:
>>> np.exp(np.arange(1, 11))
array([2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 5.45981500e+01,
1.48413159e+02, 4.03428793e+02, 1.09663316e+03, 2.98095799e+03,
8.10308393e+03, 2.20264658e+04])
So we can first obtain the number of rows, and then use np.hstack:
rows = x.shape[0]
result = np.hstack((x, np.exp(np.arange(1, rows+1)).reshape(-1, 1)))
We then obtain:
>>> np.hstack((x, np.exp(np.arange(1, 11)).reshape(-1, 1)))
array([[0.00000000e+00, 1.00000000e+00, 2.00000000e+00, 2.71828183e+00],
[3.00000000e+00, 4.00000000e+00, 5.00000000e+00, 7.38905610e+00],
[6.00000000e+00, 7.00000000e+00, 8.00000000e+00, 2.00855369e+01],
[9.00000000e+00, 1.00000000e+01, 1.10000000e+01, 5.45981500e+01],
[1.20000000e+01, 1.30000000e+01, 1.40000000e+01, 1.48413159e+02],
[1.50000000e+01, 1.60000000e+01, 1.70000000e+01, 4.03428793e+02],
[1.80000000e+01, 1.90000000e+01, 2.00000000e+01, 1.09663316e+03],
[2.10000000e+01, 2.20000000e+01, 2.30000000e+01, 2.98095799e+03],
[2.40000000e+01, 2.50000000e+01, 2.60000000e+01, 8.10308393e+03],
[2.70000000e+01, 2.80000000e+01, 2.90000000e+01, 2.20264658e+04]])
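A slightly shorter variant, offered as a sketch rather than as part of the answers above, is np.column_stack, which accepts the 1-D array directly so no reshape is needed:

```python
import numpy as np

x = np.arange(30).reshape(10, 3)
ex = np.exp(np.arange(1, x.shape[0] + 1))

result = np.column_stack((x, ex))  # the 1-D ex is appended as a column

print(result.shape)   # (10, 4)
print(result.dtype)   # float64
```

As with np.hstack and np.concatenate, the integer columns are upcast to float64 because a plain ndarray has a single dtype.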

python slicing multi-dimensional array

Could anyone explain how the command marked (<---) below works in Python NumPy?
r = np.arange(36)
r.resize(6,6)
r.reshape(36)[::7] # <---
You just have to run the commands one by one and analyse their output:
Create an array of the integers from 0 to 35:
>>> r = np.arange(36)
>>> r
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35])
Reshape the array in place to a 6 x 6 array:
>>> r.resize(6,6) # equivalent to r = r.reshape(6,6)
>>> r
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
Reshape the array r to a 1-dimensional vector:
>>> tmp = r.reshape(36)
tmp above is exactly the same as r in the first step.
Take every 7th element:
>>> tmp[::7]
array([ 0, 7, 14, 21, 28, 35])
Slicing/indexing is written as i:j:k, where i = start, j = stop (exclusive) and k = step. Thus, 5:10:2 means: from index 5 up to, but not including, index 10, take every 2nd element. If i is omitted, the slice starts at the beginning of the array. If j is omitted, it runs to the end of the array. If k is omitted, the step is 1 (all the elements in the range).
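A few concrete slices on a hypothetical 12-element array make the start/stop/step rules easier to check:

```python
import numpy as np

arr = np.arange(12)            # 0, 1, ..., 11

print(arr[5:10:2].tolist())    # [5, 7, 9]       start 5, stop before 10, step 2
print(arr[:4].tolist())        # [0, 1, 2, 3]    start omitted: begin at index 0
print(arr[8:].tolist())        # [8, 9, 10, 11]  stop omitted: run to the end
print(arr[::3].tolist())       # [0, 3, 6, 9]    only the step given
```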
With all the above, you could rewrite your example in a single line as:
>>> np.arange(36)[::7]
Or if you already have r, which is N-Dimensional:
>>> r.ravel()[::7]
Here ravel will return a 1-dimensional view of r (preferred to reshape(36)).
If you want to know more about slicing, please refer to the numpy documentation.
First, you are using NumPy's ndarray.reshape, which returns the given array with the specified shape. In your case, you are converting it to a 1-dimensional array with 36 elements.
Second, with the numbers between brackets, you are slicing the array. A slice consists of up to 3 values per dimension, in the form [number1:number2:number3]. If you leave values blank (as you do for numbers 1 and 2), they take their defaults: number1 will be 0, number2 will be the length of the array (so the slice runs through the last element) and number3 will be 1:
The first number indicates the array index where you will begin taking values.
The second number indicates the array index where you will stop taking values (this index itself is excluded).
Finally, the last number is the step: how many positions to advance from one value to the next. In your case the step is 7, so you are reading every 7th element.
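A side observation, added here as a sketch: in a row-major 6 x 6 array, element (i, i) sits at flat position 6*i + i = 7*i, so taking every 7th element of the flattened array yields the main diagonal.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

every_seventh = r.reshape(36)[::7]

print(every_seventh.tolist())                          # [0, 7, 14, 21, 28, 35]
print(np.array_equal(every_seventh, np.diagonal(r)))   # True
```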
One point to add: when the new shape has the same number of elements, the reshape() and resize() methods produce the same layout. The main difference between them is how they affect the calling array object r:
r.resize() returns None. It directly changes the shape of the calling array object r.
r.reshape() returns a new reshaped array object and leaves the shape of the original r unchanged.
>>> import numpy as np
>>> r = np.arange(36)
>>> r.shape
(36,)
>>> # 1. --- `reshape()` returns a new object and keep the `r` ---
>>> new = r.reshape(6,6)
>>> new.shape
(6, 6)
>>>
>>> # 2. --- resize changes `r` directly and returns `None` ---
>>> nothing = r.resize(6,6)
>>> type(nothing)
<class 'NoneType'>
>>> r.shape
(6, 6)

Numpy: find index of the elements within range

I have a numpy array of numbers, for example,
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
I would like to find all the indexes of the elements within a specific range. For instance, if the range is (6, 10), the answer should be (3, 4, 5). Is there a built-in function to do this?
You can use np.where to get indices and np.logical_and to set two conditions:
import numpy as np
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
np.where(np.logical_and(a>=6, a<=10))
# returns (array([3, 4, 5]),)
As in @deinonychusaur's reply, but even more compact:
In [7]: np.where((a >= 6) & (a <=10))
Out[7]: (array([3, 4, 5]),)
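Note that np.where with a single argument returns a tuple containing one index array per dimension, which is why the output above is wrapped in parentheses. The sketch below shows how to get a plain index array, either by taking element 0 of the tuple or with np.flatnonzero; the names mask, idx_tuple and idx are illustrative.

```python
import numpy as np

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
mask = (a >= 6) & (a <= 10)     # boolean array, one entry per element

idx_tuple = np.where(mask)      # tuple with one index array per dimension
idx = np.flatnonzero(mask)      # plain 1-D index array

print(idx_tuple[0].tolist())    # [3, 4, 5]
print(idx.tolist())             # [3, 4, 5]
print(a[mask].tolist())         # [6, 9, 10]
```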
Summary of the answers
To find out which answer is best, we can time the different solutions.
Unfortunately, the question was not well posed, so there are answers to different questions; here I try to compare answers to the same question. Given the array:
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
The answer should be the indexes of the elements within a certain range, which we assume to be inclusive; in this case, 6 and 10.
answer = (3, 4, 5)
Corresponding to the values 6, 9, 10.
To find the fastest answer we can use this code.
import timeit
setup = """
import numpy as np
import numexpr as ne
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
# or test it with an array of a similar size
# a = np.random.rand(100)*23 # change 100 to an estimate of your array size
# we define the left and right limits
ll = 6
rl = 10
def sorted_slice(a,l,r):
    start = np.searchsorted(a, l, 'left')
    end = np.searchsorted(a, r, 'right')
    return np.arange(start,end)
"""
functions = ['sorted_slice(a,ll,rl)', # works only for sorted values
'np.where(np.logical_and(a>=ll, a<=rl))[0]',
'np.where((a >= ll) & (a <=rl))[0]',
'np.where((a>=ll)*(a<=rl))[0]',
'np.where(np.vectorize(lambda x: ll <= x <= rl)(a))[0]',
'np.argwhere((a>=ll) & (a<=rl)).T[0]', # we transpose to get a single row
'np.where(ne.evaluate("(ll <= a) & (a <= rl)"))[0]',]
functions2 = [
'a[np.logical_and(a>=ll, a<=rl)]',
'a[(a>=ll) & (a<=rl)]',
'a[(a>=ll)*(a<=rl)]',
'a[np.vectorize(lambda x: ll <= x <= rl)(a)]',
'a[ne.evaluate("(ll <= a) & (a <= rl)")]',
]
rdict = {}
for i in functions:
    rdict[i] = timeit.timeit(i,setup=setup,number=1000)
    print("%s -> %s s" %(i,rdict[i]))
print("Sorted:")
for w in sorted(rdict, key=rdict.get):
    print(w, rdict[w])
Results
The results are reported in the following plot for a small array (the fastest solution on top). As noted by @EZLearner, they may vary depending on the size of the array. sorted_slice could be faster for larger arrays, but it requires your array to be sorted; for arrays with over 10 M entries, ne.evaluate could be an option. It is hence always better to perform this test with an array of the same size as yours:
If, instead of the indexes, you want to extract the values, you can perform the tests using functions2, but the results are almost the same.
I thought I would add this because the a in the example you gave is sorted:
import numpy as np
a = [1, 3, 5, 6, 9, 10, 14, 15, 56]
start = np.searchsorted(a, 6, 'left')
end = np.searchsorted(a, 10, 'right')
rng = np.arange(start, end)
rng
# array([3, 4, 5])
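If the array is not sorted, the same idea can still be applied by sorting through np.argsort first, although the sort then dominates the cost. This is a sketch on a hypothetical shuffled copy of the example values; order, sorted_a and idx are illustrative names.

```python
import numpy as np

a = np.array([14, 6, 56, 1, 10, 3, 9, 15, 5])  # hypothetical unsorted input
low, high = 6, 10

order = np.argsort(a)                  # positions that would sort a
sorted_a = a[order]
start = np.searchsorted(sorted_a, low, "left")
end = np.searchsorted(sorted_a, high, "right")
idx = np.sort(order[start:end])        # indices into the original a

print(idx.tolist())                    # [1, 4, 6]
print(a[idx].tolist())                 # [6, 10, 9]
```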
a = np.array([1,2,3,4,5,6,7,8,9])
b = a[(a>2) & (a<8)]
Another way is with:
np.vectorize(lambda x: 6 <= x <= 10)(a)
which returns:
array([False, False, False, True, True, True, False, False, False])
It is sometimes useful for masking time series, vectors, etc.
This code snippet returns all the numbers in a numpy array between two values:
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56] )
a[(a>6)*(a<10)]
It works as follows:
(a>6) returns a numpy array with True (1) and False (0), and so does (a<10). By multiplying these two together you get an array with True where both statements are True (because 1x1 = 1) and False otherwise (because 0x0 = 0 and 1x0 = 0).
The part a[...] returns all values of array a where the array between brackets is True.
Of course you can make this more complicated by saying for instance
...*(~(a<10))
which is similar to an "and not" statement (~ inverts the boolean array).
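The following sketch compares the multiplication trick with the more common & operator and shows an "and not" mask built with ~. The names by_product, by_and and and_not are illustrative.

```python
import numpy as np

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])

by_product = (a > 6) * (a < 10)   # multiplying boolean arrays acts as AND
by_and = (a > 6) & (a < 10)       # the same mask with the & operator
and_not = (a > 6) & ~(a < 10)     # greater than 6 and not below 10

print(np.array_equal(by_product, by_and))  # True
print(a[by_and].tolist())                  # [9]
print(a[and_not].tolist())                 # [10, 14, 15, 56]
```

Keeping the mask boolean matters: an integer array of 0s and 1s inside a[...] would be treated as positions to pick, not as a mask.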
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
np.argwhere((a>=6) & (a<=10))
Wanted to add numexpr into the mix:
import numpy as np
import numexpr as ne
a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])
np.where(ne.evaluate("(6 <= a) & (a <= 10)"))[0]
# array([3, 4, 5], dtype=int64)
This would only make sense for larger arrays with millions of elements, or if you are hitting memory limits.
This may not be the prettiest, but works for any dimension
import numpy as np
a = np.array([[-1,2], [1,5], [6,7], [5,2], [3,4], [0, 0], [-1,-1]])
ranges = (0,4), (0,4)
def conditionRange(X : np.ndarray, ranges : list) -> np.ndarray:
    idx = None  # row indices that satisfy every range checked so far
    for column, r in enumerate(ranges):
        tmp = np.where(np.logical_and(X[:, column] >= r[0], X[:, column] <= r[1]))[0]
        # None means "no column checked yet", so an empty intersection
        # is not mistaken for the initial state
        idx = set(tmp) if idx is None else idx & set(tmp)
    idx = np.array(sorted(idx), dtype=int)
    return X[idx, :]
b = conditionRange(a, ranges)
print(b)
s=[52, 33, 70, 39, 57, 59, 7, 2, 46, 69, 11, 74, 58, 60, 63, 43, 75, 92, 65, 19, 1, 79, 22, 38, 26, 3, 66, 88, 9, 15, 28, 44, 67, 87, 21, 49, 85, 32, 89, 77, 47, 93, 35, 12, 73, 76, 50, 45, 5, 29, 97, 94, 95, 56, 48, 71, 54, 55, 51, 23, 84, 80, 62, 30, 13, 34]
dic={}
for i in range(0,len(s),10):
    dic[i,i+10]=list(filter(lambda x:((x>=i)&(x<i+10)),s))
print(dic)
for keys,values in dic.items():
    print(keys)
    print(values)
Output:
(0, 10)
[7, 2, 1, 3, 9, 5]
(20, 30)
[22, 26, 28, 21, 29, 23]
(30, 40)
[33, 39, 38, 32, 35, 30, 34]
(10, 20)
[11, 19, 15, 12, 13]
(40, 50)
[46, 43, 44, 49, 47, 45, 48]
(60, 70)
[69, 60, 63, 65, 66, 67, 62]
(50, 60)
[52, 57, 59, 58, 50, 56, 54, 55, 51]
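You can use np.clip() for a related operation:
a = [1, 3, 5, 6, 9, 10, 14, 15, 56]
np.clip(a,6,10)
However, it keeps every element and replaces the values less than 6 with 6 and the values greater than 10 with 10, so it returns neither the indexes nor only the values inside the range.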
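To make the difference concrete, here is a short sketch comparing np.clip with boolean-mask selection on the example array:

```python
import numpy as np

a = np.array([1, 3, 5, 6, 9, 10, 14, 15, 56])

clipped = np.clip(a, 6, 10)           # same length as a, values saturated
selected = a[(a >= 6) & (a <= 10)]    # only the values inside the range

print(clipped.tolist())    # [6, 6, 6, 6, 9, 10, 10, 10, 10]
print(selected.tolist())   # [6, 9, 10]
```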
