Create a 2-D numpy array with list comprehension - python

I need to create a 2-D numpy array using only list comprehension, but it has to match this format:
[[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7]]
So far, all I've managed to figure out is:
two_d_array = np.array([[x+1 for x in range(3)] for y in range(5)])
Giving:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
I'm just not sure how to make each row start one higher than the previous one. Any help would be appreciated, thanks!
EDIT: Accidentally left out [3, 4, 5] in example. Included it now.

Here's a quick one-liner that will do the job:
np.array([np.arange(i, i+3) for i in range(1, 6)])
Here 3 is the number of columns (the elements in each row). range(1, 6) supplies the first value of each row; its stop value 6 is excluded, so it yields 1 through 5, which is why there are 5 rows in the output.
Output:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5],
[4, 5, 6],
[5, 6, 7]])
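For reference, the same one-liner can be written with the sizes pulled out into variables. This is only a sketch: `n_rows`, `n_cols` and `start` are illustrative names that do not appear in the answer above.

```python
import numpy as np

# Illustrative names, not part of the original answer.
n_rows, n_cols, start = 5, 3, 1

# Row i begins at start + i and holds n_cols consecutive integers.
two_d_array = np.array(
    [np.arange(first, first + n_cols) for first in range(start, start + n_rows)]
)
print(two_d_array)
```

Changing `n_rows` or `n_cols` changes the shape without touching the comprehension itself.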

Change the expression inside the comprehension; something like this works:
two_d_array = np.array([[(y*3)+x+1 for x in range(3)] for y in range(5)])
# rows: [[1, 2, 3], [4, 5, 6], ...]
two_d_array = np.array([[y+x+1 for x in range(3)] for y in range(5)])
# rows: [[1, 2, 3], [2, 3, 4], ...]

You've got a couple of good comprehension answers, so here are a couple of numpy solutions.
Simple addition:
np.arange(1, 6)[:, None] + np.arange(3)
Crazy stride tricks:
base = np.arange(1, 8)
np.lib.stride_tricks.as_strided(base, shape=(5, 3), strides=base.strides * 2).copy()
Reshaped cumulative sum:
base = np.ones(15)
base[3::3] = -1
np.cumsum(base).reshape(5, 3)
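To see that the three approaches agree, here is a small self-contained check. It is a sketch, not a benchmark; the variable names `by_addition`, `by_strides`, `by_cumsum` and `steps` are made up for this example.

```python
import numpy as np

# Broadcasting: a (5, 1) column plus a (3,) row gives a (5, 3) result.
by_addition = np.arange(1, 6)[:, None] + np.arange(3)

# Strided view: every row starts one element further into `base`.
base = np.arange(1, 8)
by_strides = np.lib.stride_tricks.as_strided(
    base, shape=(5, 3), strides=base.strides * 2
).copy()  # copy so the result no longer aliases `base`

# Cumulative sum: steps of +1, with a -1 at the start of rows 1 to 4.
steps = np.ones(15)
steps[3::3] = -1
by_cumsum = np.cumsum(steps).reshape(5, 3)

print(np.array_equal(by_addition, by_strides))
print(np.array_equal(by_addition, by_cumsum))
print(by_cumsum.dtype)  # np.ones defaults to floating point
```

Note that the cumulative-sum version produces floats because `np.ones` defaults to a floating-point dtype; pass `dtype=int` to `np.ones` if integers are needed.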

Related

Finding the intersection

I want the intersection of each row of x with y. Is there any way I can get the output in the format below?
I do not want to use a for loop, since x can be very large.
x=np.array([[1, 3, 4, 3], [3, 1, 2, 1], [6, 3, 4, 2]])
y=np.array([1,2,0,9,9])
I want the output in this format:
np.array([[1],[1,2],[2]])
The output can also be a list of lists.
Also consider the case where y is 2D as well (np.array([[1,2,0,9,9],[1,5,6,8,9]])).
You can use NumPy's intersect1d:
arr = [np.intersect1d(z, y).tolist() for z in x]
print(arr) # [[1], [1, 2], [2]]
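As a variation, the membership test itself can be done in one vectorized call with np.isin, leaving only the ragged regrouping to Python. This is a sketch of an alternative, not the answer's method; `mask` and `result` are names chosen for the example.

```python
import numpy as np

x = np.array([[1, 3, 4, 3], [3, 1, 2, 1], [6, 3, 4, 2]])
y = np.array([1, 2, 0, 9, 9])

# np.isin tests every element of x against y in one vectorized call.
mask = np.isin(x, y)

# Rows have different numbers of matches, so the result is a ragged
# list of lists; np.unique sorts and drops duplicates, like intersect1d.
result = [np.unique(row[m]).tolist() for row, m in zip(x, mask)]
print(result)  # [[1], [1, 2], [2]]
```

np.isin flattens its second argument, so a 2-D y would be treated as one pool of values. Whether that is the intended meaning of the 2-D case is not clear from the question.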

ordering a 2xN matrix in Python [duplicate]

I have a 2D numpy array of shape (N,2) which is holding N points (x and y coordinates). For example:
array([[3, 2],
[6, 2],
[3, 6],
[3, 4],
[5, 3]])
I'd like to sort it such that my points are ordered by x-coordinate, and then by y in cases where the x coordinate is the same. So the array above should look like this:
array([[3, 2],
[3, 4],
[3, 6],
[5, 3],
[6, 2]])
If this was a normal Python list, I would simply define a comparator to do what I want, but as far as I can tell, numpy's sort function doesn't accept user-defined comparators. Any ideas?
EDIT: Thanks for the ideas! I set up a quick test case with 1000000 random integer points, and benchmarked the ones that I could run (sorry, can't upgrade numpy at the moment).
Mine: 4.078 secs
mtrw: 7.046 secs
unutbu: 0.453 secs
Using lexsort:
import numpy as np
a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])
ind = np.lexsort((a[:,1],a[:,0]))
a[ind]
# array([[3, 2],
# [3, 4],
# [3, 6],
# [5, 3],
# [6, 2]])
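The order of the keys matters: np.lexsort treats the last key as the primary one, which is why column 0 is passed last above. A short sketch that makes the difference visible (the names `by_x_then_y` and `by_y_then_x` are made up here):

```python
import numpy as np

a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])

# The LAST key passed to lexsort is the primary sort key.
by_x_then_y = a[np.lexsort((a[:, 1], a[:, 0]))]
by_y_then_x = a[np.lexsort((a[:, 0], a[:, 1]))]

print(by_x_then_y)
print(by_y_then_x)
```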
a.ravel() returns a view if a is C_CONTIGUOUS. If that is true,
@ars's method, slightly modified by using ravel instead of flatten, yields a nice way to sort a in-place:
a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])
dt = [('col1', a.dtype),('col2', a.dtype)]
assert a.flags['C_CONTIGUOUS']
b = a.ravel().view(dt)
b.sort(order=['col1','col2'])
Since b is a view of a, sorting b sorts a as well:
print(a)
# [[3 2]
# [3 4]
# [3 6]
# [5 3]
# [6 2]]
The title says "sorting 2D arrays". Although the questioner uses an (N,2)-shaped array, it's possible to generalize unutbu's solution to work with any (N,M) array, as that's what people might actually be looking for.
One could transpose the array and use slice notation with negative step to pass all the columns to lexsort in reversed order:
>>> import numpy as np
>>> a = np.random.randint(1, 6, (10, 3))
>>> a
array([[4, 2, 3],
[4, 2, 5],
[3, 5, 5],
[1, 5, 5],
[3, 2, 1],
[5, 2, 2],
[3, 2, 3],
[4, 3, 4],
[3, 4, 1],
[5, 3, 4]])
>>> a[np.lexsort(np.transpose(a)[::-1])]
array([[1, 5, 5],
[3, 2, 1],
[3, 2, 3],
[3, 4, 1],
[3, 5, 5],
[4, 2, 3],
[4, 2, 5],
[4, 3, 4],
[5, 2, 2],
[5, 3, 4]])
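Wrapped in a function, the same trick might look like the sketch below. `sort_rows` is a hypothetical helper name, and the check against Python's own list sorting is only there to confirm the ordering.

```python
import numpy as np

def sort_rows(a):
    """Return the rows of a 2-D array in lexicographic order (hypothetical helper)."""
    a = np.asarray(a)
    # a.T[::-1] yields the columns last-to-first, so column 0 ends up
    # as the last key, which lexsort treats as the primary one.
    return a[np.lexsort(a.T[::-1])]

rng = np.random.default_rng(0)
sample = rng.integers(1, 6, size=(10, 3))
result = sort_rows(sample)

# Python compares lists element by element, giving the same ordering.
print(result.tolist() == sorted(sample.tolist()))
```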
The numpy_indexed package (disclaimer: I am its author) can be used to solve this kind of processing-on-nd-array problem in an efficient, fully vectorized manner:
import numpy_indexed as npi
npi.sort(a) # by default along axis=0, but configurable
You can use np.sort_complex. This has the side effect of changing your data to floating point; I hope that's not a problem:
>>> a = np.array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]])
>>> atmp = np.sort_complex(a[:,0] + a[:,1]*1j)
>>> b = np.array([[np.real(x), np.imag(x)] for x in atmp])
>>> b
array([[ 3., 2.],
[ 3., 4.],
[ 3., 6.],
[ 5., 3.],
[ 6., 2.]])
I was struggling with the same thing, got some help, and solved the problem. It works smoothly if your array has named columns (a structured array), and I think it is a very simple way to sort, using the same logic that Excel does:
array_name[array_name[['colname1','colname2']].argsort()]
Note the double brackets enclosing the sorting criteria. And of course, you can use more than 2 columns as sorting criteria.
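A closely related spelling passes the field names through the `order` argument of np.argsort instead of indexing with a list of fields. The sketch below uses that variant; the field names 'x' and 'y' and the variable names are invented for the example.

```python
import numpy as np

# A small structured array; the field names are made up for this example.
points = np.array(
    [(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)],
    dtype=[('x', 'i8'), ('y', 'i8')],
)

# `order` lists the fields to compare, most significant first.
idx = np.argsort(points, order=['x', 'y'])
sorted_points = points[idx]
print(sorted_points)
```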
EDIT: removed bad answer.
Here's one way to do it using an intermediate structured array:
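from numpy import array
# Use an explicit 32-bit dtype so the '<i4' field reinterpretation below matches the data.
a = array([[3, 2], [6, 2], [3, 6], [3, 4], [5, 3]], dtype='<i4')
b = a.flatten()
b.dtype = [('x', '<i4'), ('y', '<i4')]
b.sort()
b.dtype = '<i4'
b.shape = a.shape
print(b)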
which gives the desired output:
[[3 2]
[3 4]
[3 6]
[5 3]
[6 2]]
Not sure if this is quite the best way to go about it though.
I found one way to do it:
from numpy import array
a = array([(3,2),(6,2),(3,6),(3,4),(5,3)])
array(sorted(sorted(a,key=lambda e:e[1]),key=lambda e:e[0]))
It's pretty terrible to have to sort twice (and use the plain python sorted function instead of a faster numpy sort), but it does fit nicely on one line.
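If the double sort is the main objection, a single pass with a tuple key gives the same order, because Python compares tuples element by element. A sketch for comparison (`one_pass` and `two_pass` are names chosen here); it still uses plain Python sorting, so it is not expected to be fast on large arrays.

```python
import numpy as np

a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])

# One pass: the tuple key orders by x first and by y on ties.
one_pass = np.array(sorted(a.tolist(), key=lambda p: (p[0], p[1])))

# The two-pass version from above; it relies on sorted() being stable.
two_pass = np.array(sorted(sorted(a, key=lambda e: e[1]), key=lambda e: e[0]))

print(np.array_equal(one_pass, two_pass))
```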

How to loop back to beginning of the array for out of bounds index in numpy?

I have a 2D numpy array that I want to extract a submatrix from.
I get the submatrix by slicing the array as below.
Here I want a 3*3 submatrix around an item at the index of (2,3).
>>> import numpy as np
>>> a = np.array([[0, 1, 2, 3],
... [4, 5, 6, 7],
... [8, 9, 0, 1],
... [2, 3, 4, 5]])
>>> a[1:4, 2:5]
array([[6, 7],
[0, 1],
[4, 5]])
But what I want is that for indexes that are out of range, it goes back to the beginning of array and continues from there. This is the result I want:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
I know that I can do things like taking the index modulo the width of the array, but I'm looking for a numpy function that does that.
Also, for a one-dimensional array this causes an index out of range error, which is not really useful...
This is one way using np.pad with wraparound mode.
>>> a = np.array([[0, 1, 2, 3],
... [4, 5, 6, 7],
... [8, 9, 0, 1],
... [2, 3, 4, 5]])
>>> pad_width = 1
>>> i, j = 2, 3
>>> startrow, endrow = i-1+pad_width, i+2+pad_width # for 3 x 3 submatrix
>>> startcol, endcol = j-1+pad_width, j+2+pad_width
>>> np.pad(a, (pad_width, pad_width), 'wrap')[startrow:endrow, startcol:endcol]
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
Depending on the shape of your patch (eg. 5 x 5 instead of 3 x 3) you can increase the pad_width and start and end row and column indices accordingly.
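One way to parametrize this is sketched below. `wrapped_patch` and `half` are hypothetical names; the sketch assumes the centre index lies inside the array and that `half` is no larger than the smallest array dimension.

```python
import numpy as np

def wrapped_patch(a, i, j, half=1):
    """Return the (2*half+1)-square patch centred on (i, j), wrapping at the edges.

    Hypothetical helper; assumes 0 <= i < a.shape[0] and 0 <= j < a.shape[1].
    """
    padded = np.pad(a, half, mode='wrap')
    # After padding, original index (i, j) sits at (i + half, j + half),
    # so the patch starts at (i, j) in padded coordinates.
    return padded[i:i + 2 * half + 1, j:j + 2 * half + 1]

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 0, 1],
              [2, 3, 4, 5]])
print(wrapped_patch(a, 2, 3))
```

np.pad builds a new, larger array on every call, so for many lookups on a big array it is probably worth padding once and reusing the result.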
np.take does have a mode parameter which can wrap-around out of bound indices. But it's a bit hacky to use np.take for multidimensional arrays since the axis must be a scalar.
However, in your particular case you could do this:
a = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 0, 1],
[2, 3, 4, 5]])
np.take(a, np.r_[2:5], axis=1, mode='wrap')[1:4]
Output:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
EDIT
This function might be what you are looking for (?)
def select3x3(a, idx):
    x, y = idx
    return np.take(np.take(a, np.r_[x-1:x+2], axis=0, mode='wrap'), np.r_[y-1:y+2], axis=1, mode='wrap')
But in retrospect, I recommend using modulo and fancy indexing for this kind of operation (it's basically what mode='wrap' is doing internally anyway):
def select3x3(a, idx):
    x, y = idx
    return a[np.r_[x-1:x+2][:,None] % a.shape[0], np.r_[y-1:y+2][None,:] % a.shape[1]]
The above solution is also generalized for any 2d shape on a.
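The question also mentions the one-dimensional case. Both ideas carry over directly, as this sketch shows; the array `v` and the names `centre`, `window` and `window_mod` are made up for the example.

```python
import numpy as np

v = np.array([10, 20, 30, 40])
centre = 3  # index of the last element

# Indices 2, 3, 4: the 4 is out of range and wraps around to 0.
window = np.take(v, np.arange(centre - 1, centre + 2), mode='wrap')

# The same thing with an explicit modulo and fancy indexing.
window_mod = v[np.arange(centre - 1, centre + 2) % len(v)]

print(window)  # [30 40 10]
```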

Array into multiple arrays

(Python) I have the following array
x=[1,2,3,4,5,6]
and I want this one
x=[[1,2],[2,3],[3,4],[4,5],[5,6]]
I used this function
def split(arr, size):
    arrs = []
    while len(arr) > size:
        pice = arr[:size]
        arrs.append(pice)
        arr = arr[size:]
    arrs.append(arr)
    return arrs
But it only generates this
x=[[1,2],[3,4],[5,6]]
I suppose you want to develop your own code and not use libraries or built-in functions.
Your code is fine.
There's just one simple mistake: in the line arr = arr[size:], change the slice start from size to 1, so that the window advances one element at a time instead of a whole chunk.
def split(arr, size):
    arrs = []
    while len(arr) > size:
        pice = arr[:size]
        arrs.append(pice)
        arr = arr[1:]  # step forward by one element, not by size
    arrs.append(arr)
    return arrs
output:
[[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
It also works for other sizes:
print(split(x, 3))
print(split(x, 4))
[[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
[[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
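The difference between the two versions is just how far the window moves each time. A sketch that makes the step an explicit parameter (`windows` and `step` are hypothetical names, not from the answer):

```python
def windows(seq, size, step=1):
    """Return slices of length `size`, each starting `step` items after the last.

    Hypothetical helper: step=1 gives the overlapping result asked for,
    step=size gives the non-overlapping chunks the original code produced.
    """
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]

x = [1, 2, 3, 4, 5, 6]
print(windows(x, 2))          # [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
print(windows(x, 2, step=2))  # [[1, 2], [3, 4], [5, 6]]
```

Unlike the while-loop version, this sketch returns an empty list when the input is shorter than `size`, and with `step` greater than 1 it drops any leftover items that do not fill a whole window.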
Loop over the array by index, and use the slice [index : index+size] as one element of the new array.
The code looks like this:
x=[1,2,3,4,5,6]
size = 2
print([x[index:index+size] for index in range(0, len(x)-size+1)])
size = 4
print([x[index:index+size] for index in range(0, len(x)-size+1)])
Output:
[[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
[[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
Or use zip(), which works for any size > 1:
x=[1,2,3,4,5,6]
size = 2
print( [item for item in zip( *[x[index:len(x)+index-1] for index in range(0, size)])])
size = 4
print( [item for item in zip( *[x[index:len(x)+index-1] for index in range(0, size)])])
Output:
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
[(1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 6)]
You can use the windowed() function from the more_itertools package.
from more_itertools import windowed
x = list(windowed(x, 2))
You can install it using pip
pip install more-itertools
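If installing a package is not an option, the same sliding-window idea can be sketched with the standard library only. `sliding` is a hypothetical helper written for this example, not the more_itertools implementation, and it may differ from windowed() in edge cases.

```python
from collections import deque
from itertools import islice

def sliding(iterable, n):
    """Yield overlapping tuples of length n (hypothetical stdlib-only helper)."""
    it = iter(iterable)
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for item in it:
        window.append(item)  # maxlen drops the oldest item automatically
        yield tuple(window)

x = [1, 2, 3, 4, 5, 6]
print(list(sliding(x, 2)))  # [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
```

Because it only iterates once, this sketch also works on generators and other inputs that cannot be sliced. It yields nothing when the input has fewer than n items.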
Try this:
def split(arr, size):
    return [arr[i:i+size] for i in range(0, len(arr)-size+1)]
a = [1,2,3,4,5]
answer = list(map(list, zip(a, a[1:])))
print(answer)
You can use fancy indexing when using NumPy. This requires x to be converted to a numpy array. You will also need to create an array of indices, labeled ind below:
x=[1,2,3,4,5,6]
x = np.array(x)
ind = np.arange(len(x)-1)[:,None] + np.arange(2)
x[ind]
array([[1, 2],
[2, 3],
[3, 4],
[4, 5],
[5, 6]])
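Recent NumPy versions (1.20 and later, as far as I know) also ship a helper that builds the same windows without an explicit index array. A minimal sketch, with `pairs` as a made-up name:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])

# Each row is a length-2 window into x; the result is a read-only view, not a copy.
pairs = np.lib.stride_tricks.sliding_window_view(x, 2)
print(pairs)
```

Call `.copy()` on the result if the windows need to be modified independently of x.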

Put numpy arrays split with np.split() back together

I have split a numpy array like so:
x = np.random.randn(10,3)
x_split = np.split(x,5)
which splits x equally into five numpy arrays each with shape (2,3) and puts them in a list. What is the best way to combine a subset of these back together (e.g. x_split[:k] and x_split[k+1:]) so that the resulting shape is similar to the original x i.e. (something,3)?
I found that for k > 0 this is possible if you do:
np.vstack((np.vstack(x_split[:k]),np.vstack(x_split[k+1:])))
but this does not work when k = 0 as x_split[:0] = [] so there must be a better and cleaner way. The error message I get when k = 0 is:
ValueError: need at least one array to concatenate
The comment by Paul Panzer is right on target, but since NumPy now gently discourages vstack, here is the concatenate version:
x = np.random.randn(10, 3)
x_split = np.split(x, 5, axis=0)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=0)
Note the explicit axis argument passed both times (it has to be the same); this makes it easy to adapt the code to work for other axes if needed. E.g.,
x_split = np.split(x, 3, axis=1)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=1)
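The list concatenation is what makes k = 0 work: x_split[:0] is an empty list, and adding it to the remaining chunks leaves a non-empty list for np.concatenate. The sketch below wraps this in a hypothetical helper, `without_chunk`, and cross-checks it against np.delete on the unsplit array; the deterministic `x` is a stand-in for the random data.

```python
import numpy as np

x = np.arange(30).reshape(10, 3)   # deterministic stand-in for the random data
x_split = np.split(x, 5, axis=0)

def without_chunk(chunks, k, axis=0):
    """Concatenate every chunk except chunks[k] (hypothetical helper)."""
    return np.concatenate(chunks[:k] + chunks[k + 1:], axis=axis)

rows_per_chunk = x.shape[0] // len(x_split)
for k in range(len(x_split)):
    # Equivalent result taken straight from x, without splitting first.
    expected = np.delete(
        x, np.s_[k * rows_per_chunk:(k + 1) * rows_per_chunk], axis=0
    )
    assert np.array_equal(without_chunk(x_split, k), expected)

print(without_chunk(x_split, 0).shape)  # (8, 3)
```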
np.r_ can turn several slices into a list of indices.
In [20]: np.r_[0:3, 4:5]
Out[20]: array([0, 1, 2, 4])
In [21]: np.vstack([xsp[i] for i in _])
Out[21]:
array([[9, 7, 5],
[6, 4, 3],
[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[2, 2, 5],
[4, 4, 5]])
In [22]: np.r_[0:0, 1:5]
Out[22]: array([1, 2, 3, 4])
In [23]: np.vstack([xsp[i] for i in _])
Out[23]:
array([[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[3, 2, 0],
[0, 3, 8],
[2, 2, 5],
[4, 4, 5]])
Internally np.r_ has a lot of ifs and loops to handle the slices and their boundaries, but it hides it all from us.
If the xsp (your x_split) was an array, we could do xsp[np.r_[...]], but since it is a list we have to iterate. Well we could also hide that iteration with an operator.itemgetter object.
In [26]: operator.itemgetter(*Out[22])
Out[26]: operator.itemgetter(1, 2, 3, 4)
In [27]: np.vstack(operator.itemgetter(*Out[22])(xsp))
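Outside an interactive session, the same steps might look like the sketch below. The data is a deterministic stand-in, and `keep`, `picked` and `stacked` are names chosen for the example.

```python
import operator
import numpy as np

xsp = np.split(np.arange(30).reshape(10, 3), 5)  # stand-in for x_split

# The empty slice 0:0 contributes nothing, so this is array([1, 2, 3, 4]).
keep = np.r_[0:0, 1:5]

# itemgetter hides the loop over the list and returns a tuple of chunks.
picked = operator.itemgetter(*keep)(xsp)
stacked = np.vstack(picked)
print(stacked.shape)  # (8, 3)
```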
