How to append arrays to another numpy array? - python

I am trying to loop through a set of coordinates and 'stack' these coordinate arrays into another array (so in essence I want an array of arrays) using numpy.
This is my attempt:
import numpy as np
all_coordinates = np.array([[]])
for y in range(2):
    for x in range(2):
        coordinate = np.array([[x,y]])
        # append
        all_coordinates = np.append(all_coordinates,[coordinate])
print(all_coordinates)
But it's not working. It's just concatenating the individual numbers and not appending the array.
Instead of giving me (the output that I want to achieve):
[[0 0] [1 0] [0 1] [1 1]]
The output I get instead is:
[0 0 1 0 0 1 1 1]
Why? What am I doing wrong here?

The stack functions fail here because they require every added row to have the same length as the rows already present. np.array([[]]) has shape (1, 0): it contains one row of length zero, which means that you can only add rows that also have length zero.
To solve this, we need to tell NumPy that the rows have length two and not zero. The array thus needs to have shape (0, 2) and not (1, 0). This can be done with one of the array-initializing functions that accept a shape argument, like empty, zeros or ones. Which function you pick does not matter, as there are no elements to fill.
Then you can use one of the stacking functions mentioned in the comments, such as vstack. The code thus becomes:
import numpy as np
all_coordinates = np.zeros((0, 2))
for y in range(2):
    for x in range(2):
        coordinate = np.array([[x,y]])
        # append
        all_coordinates = np.vstack((all_coordinates, coordinate))
print(all_coordinates)
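As for the "why" in the question: when np.append is called without an axis argument, it flattens both inputs before joining them, which is where the 1D result comes from. The sketch below contrasts that with passing axis=0 on an array that starts with shape (0, 2). The names flat and rows are only illustrative, and dtype=int is an assumption made here to keep the values integers.

```python
import numpy as np

# Without axis, np.append flattens both arguments before joining them.
flat = np.append(np.array([[]]), [[0, 0]])
print(flat.shape)  # (2,)

# Starting from shape (0, 2) and passing axis=0 keeps the rows separate.
rows = np.zeros((0, 2), dtype=int)
for y in range(2):
    for x in range(2):
        rows = np.append(rows, [[x, y]], axis=0)
print(rows.shape)  # (4, 2)
```

Note that every call still copies the whole array, so this is only reasonable for small loops.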

In such a case, I would use a list and only convert it into an array once you have appended all the elements you want.
Here is a suggested improvement:
import numpy as np
all_coordinates = []
for y in range(2):
    for x in range(2):
        coordinate = np.array([x,y])
        # append
        all_coordinates.append(coordinate)
all_coordinates = np.array(all_coordinates)
print(all_coordinates)
The output of this code is indeed
[[0 0]
 [1 0]
 [0 1]
 [1 1]]
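If the coordinates form a regular grid, as in the question, the loop can probably be dropped altogether. This is a minimal sketch, assuming the grid bounds are known up front. It uses np.mgrid and np.column_stack, and the names ys and xs are only illustrative.

```python
import numpy as np

# ys varies along rows and xs along columns, matching the loop order
# in the question (y in the outer loop, x in the inner loop).
ys, xs = np.mgrid[0:2, 0:2]
all_coordinates = np.column_stack((xs.ravel(), ys.ravel()))
print(all_coordinates)
```

This should produce the same four (x, y) pairs in the same order as the list-based version.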

Related

Neighbors in a 2D array python

I have a 2D numpy array as follows:
start = np.array([
    [1,1,0,1],
    [1,0,0,1],
    [0,1,0,0]
])
I need to get a matrix of the same shape, but with each value replaced by the number of neighbors I could reach by moving one step at a time in any direction, walking only along cells that contain 1.
As a result, I should get the following:
finish = np.array([
    [4,4,0,2],
    [4,0,0,2],
    [0,4,0,0]
])
It seems to me that this is a well-known problem, but I have not even figured out how to phrase it in a search, since everything I found was a bit different.
What's the best way to do this?
You can use the scipy.ndimage labeling function with a customized structure array s:
import numpy as np
from scipy.ndimage import label
start = np.asarray([[1,1,0,1],
                    [1,0,0,1],
                    [0,1,0,0]])
# structure array: what to consider as "neighbors"
s = [[1,1,1],
     [1,1,1],
     [1,1,1]]
# label blobs in array
labeledarr, _ = label(start, structure=s)
# retrieve blobs and the number of elements within each blob
blobnr, blobval = np.unique(labeledarr.ravel(), return_counts=True)
# substitute blob label with the number of elements
finish = np.zeros_like(labeledarr)
for k, v in zip(blobnr[1:], blobval[1:]):
    finish[labeledarr==k] = v
print(finish)
Output:
[[4 4 0 2]
 [4 0 0 2]
 [0 4 0 0]]
I am sure the final step of substituting the label number with the value of its occurrence can be optimized in terms of speed.
And @mad-physicist rightly mentioned that the initially used labeledarr.flat should be replaced by labeledarr.ravel(). The reasons for this are explained here.
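One possible way to speed up the final substitution step is to count the labels with np.bincount and then use the labeled array itself as an index into the counts. This is only a sketch of that idea, not the answerer's code. The names labeled and counts are illustrative.

```python
import numpy as np
from scipy.ndimage import label

start = np.asarray([[1, 1, 0, 1],
                    [1, 0, 0, 1],
                    [0, 1, 0, 0]])

# 8-connectivity: diagonal cells count as neighbors
labeled, _ = label(start, structure=np.ones((3, 3)))

# counts[k] is the number of cells carrying label k
counts = np.bincount(labeled.ravel())
counts[0] = 0  # label 0 is the background, keep it at zero

# Fancy indexing replaces every label with the size of its blob
finish = counts[labeled]
print(finish)
```

This avoids the Python-level loop over labels, which may matter when there are many blobs.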
You can use scipy.ndimage.label to label connected regions and return the number of regions, as @Mr.T points out. This can then be used to create a boolean mask for indexing and counting.
Credit should go to @Mr.T, as he came up with a similar solution first. This answer is still posted because the second part is different: I find it more readable, and it's 40% faster on my machine.
import numpy as np
from scipy.ndimage import label
a = [[1,1,0,1],
     [1,0,0,1],
     [0,1,0,0]]
# Label connected regions, the second arg defines the connection structure
labeled, n_labels = label(a, np.ones((3,3)))
# Replace label value with the size of the connected region
b = np.zeros_like(labeled)
for i in range(1, n_labels+1):
    target = (labeled==i)
    b[target] = np.count_nonzero(target)
print(b)
Output:
[[4 4 0 2]
 [4 0 0 2]
 [0 4 0 0]]

Compiling certain columns into a new array

Let's say I have a 2D numpy array 'x' and I want to create a new array 'y' with only certain columns from x.
Is there an easy solution for this?
I was trying to write a function that iterates through each column of an array, and then appends every 3rd column to a new array.
def grab_features(x, starting = 0, every = 3, rowlength = 16):
    import numpy as np
    import pandas as pd
    y = np.empty([rowlength,1])
    for i in range(starting, np.size(x, 1), every):
        y = np.append(y, np.reshape(x[:, i], (rowlength, 1)), axis=0)
    return y
I didn't get any errors, but instead the function returned a long 1 dimensional array of float numbers. I was hoping for an array of the same type as x, just with 1/3 of the columns.
You can use the slice syntax i:j:k, where i is the starting index, j is the stopping index, and k is the step size:
import numpy as np
array = np.array([[1,2],
                  [3,4],
                  [5,6],
                  [7,8],
                  [9,10],
                  [11,12]])
print(array[::3])
Output:
[[1 2]
 [7 8]]
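Note that array[::3] steps over rows. Since the question asks for every 3rd column, the slice would go in the second index position instead. Here is a minimal sketch that reuses the grab_features name and the starting/every parameters from the question. The 16x9 test array is made up for illustration.

```python
import numpy as np

def grab_features(x, starting=0, every=3):
    """Return every `every`-th column of x, beginning at column `starting`."""
    return x[:, starting::every]

x = np.arange(16 * 9).reshape(16, 9)  # made-up 16x9 input
y = grab_features(x)
print(y.shape)  # (16, 3)
```

The result keeps the dtype of x and is a view of it, so no data is copied and no rowlength argument is needed.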

Iteratively appending ndarray arrays using numpy in Python

I am trying to figure out how to iteratively append 2D arrays to generate a single larger array. On each iteration a new 16x200 ndarray is generated, as seen in the code below.
I would like to 'append' each new array to the previously generated one, for a total of N iterations. For example, with two iterations the first generated array would be 16x200, and the second 16x200 array would be appended to it, creating a 16x400 array.
train = np.array([])
for i in [1, 2, 1, 2]:
    spike_count = [0, 0, 0, 0]
    img = cv2.imread("images/" + str(i) + ".png", 0) # Read the associated image to be classified
    k = np.array(temporallyEncode(img, 200, 4))
    # Somehow append k to train on each iteration
In the case of the above embedded code the loop iterates 4 times so the final train array is expected to be 16x800 in size. Any help would be greatly appreciated, I have drawn a blank on how to successfully accomplish this. The code below is a general case:
import numpy as np
totalArray = np.array([])
for i in range(1,3):
    arrayToAppend = np.zeros((4, 200))
    # Append arrayToAppend to totalArray somehow
While it is possible to perform a concatenate (or one of the 'stack' variants) at each iteration, it is generally faster to accumulate the arrays in a list, and perform the concatenate once. List append is simpler and faster.
import numpy as np
alist = []
for i in range(0,3):
    arrayToAppend = np.zeros((4, 200))
    alist.append(arrayToAppend)
arr = np.concatenate(alist, axis=1) # to get (4,600)
# hstack does the same thing
# vstack is the same, but with axis=0 # (12,200)
# stack creates new dimension, # (3,4,200), (4,3,200) etc
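When the number of iterations and the block shape are known in advance, another option is to allocate the full array once and fill it slice by slice. This is a sketch under that assumption. The names n_iterations and block are illustrative, and np.full stands in for whatever actually produces each 16x200 array.

```python
import numpy as np

n_iterations = 4       # assumed to be known up front
rows, cols = 16, 200   # block shape taken from the question

train = np.empty((rows, cols * n_iterations))
for i in range(n_iterations):
    # stand-in for the real 16x200 array generated on each iteration
    block = np.full((rows, cols), i, dtype=float)
    train[:, i * cols:(i + 1) * cols] = block

print(train.shape)  # (16, 800)
```

This avoids both the intermediate list and the final concatenate, at the cost of needing the final size before the loop starts.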
Try using numpy hstack. From the documentation, hstack takes a sequence of arrays and stacks them horizontally to make a single array.
For example:
import numpy as np
x = np.zeros((16, 200))
y = x.copy()
for i in range(5):
    y = np.hstack([y, x])
    print(y.shape)
Gives:
(16, 400)
(16, 600)
(16, 800)
(16, 1000)
(16, 1200)

A particular way of resizing a matrix

Given an n x n (6x6 in the example below) matrix filled only with 0 and 1:
old_matrix=[[0,0,0,1,1,0],
            [1,1,1,1,0,0],
            [0,0,1,0,0,0],
            [1,0,0,0,0,1],
            [0,1,1,1,1,0],
            [1,0,0,1,1,0]]
I want to resize it in a particular way: take each (2x2) sub-matrix and check whether it contains more ones or zeros. This means the new matrix will be (3x3). If there are more 1s than 0s in the sub-matrix, a 1 is assigned in the new matrix. Otherwise (if there are fewer or equally many), the new value is 0.
new_matrix=[[0,1,0],
            [0,0,0],
            [0,1,0]]
I've tried to achieve this by using lots of while loops. However, it doesn't seem to work. Here's what I got so far:
def convert_track(a):
    #converts original map to a 8x8 tile Track
    NEW_TRACK=[]
    w=0 #matrix width
    h=0 #matrix height
    t_w=0 #submatrix width
    t_h=0 #submatrix height
    BLACK=0 #number of ones in submatrix
    WHITE=0 #number of zeros in submatrix
    while h<=6:
        while w<=6:
            l=[]
            while t_h<=2 and h<=6:
                t_w=0
                while t_w<=2 and w<=6:
                    if a[h][w]==1:
                        BLACK+=1
                    else:
                        WHITE+=1
                    t_w+=1
                    w+=1
                h+=1
                t_h+=1
            t_w=0
            t_h+=1
            if BLACK<=WHITE:
                l.append(0)
            else:
                l.append(1)
            BLACK=0
            WHITE=0
            t_h=0
        NEW_TRACK.append(l)
    return NEW_TRACK
This raises the error list index out of range or returns the list
[[0]]
Is there an easier way to achieve this? What am I doing wrong?
If you are willing/able to use NumPy (and SciPy) you can do something like this. If you're working with anything like the data you've shown, it's well worth your time to learn them, as operations like these can be done very efficiently and with very little code.
import numpy as np
from scipy.signal import convolve2d
old_matrix=[[0,0,0,1,1,0],
            [1,1,1,1,0,0],
            [0,0,1,0,0,0],
            [1,0,0,0,0,1],
            [0,1,1,1,1,0],
            [1,0,0,1,1,0]]
a = np.array(old_matrix)
k = np.ones((2,2))
# compute sums at each submatrix
local_sums = convolve2d(a, k, mode='valid')
# restrict to sums corresponding to non-overlapping
# sub-matrices with local_sums[::2, ::2] and check if
# there are more 1 than 0 elements
result = local_sums[::2, ::2] > 2
# convert back to Python list if needed
# (np.int was removed from recent NumPy releases; the builtin int works)
new_matrix = result.astype(int).tolist()
Result:
>>> result.astype(int).tolist()
[[0, 1, 0], [0, 0, 0], [0, 1, 0]]
Here I've used convolve2d to compute the sums at each submatrix. From what I can tell you are only interested in non-overlapping sub-matrices, so the part local_sums[::2, ::2] chops out only the sums corresponding to those.
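Because only the non-overlapping blocks are needed, the same sums can also be obtained without SciPy by reshaping the array so that each 2x2 block gets its own pair of axes. This is an alternative sketch, not the approach used above, and it assumes the matrix side length is even. The names n and block_sums are illustrative.

```python
import numpy as np

old_matrix = [[0, 0, 0, 1, 1, 0],
              [1, 1, 1, 1, 0, 0],
              [0, 0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0, 1],
              [0, 1, 1, 1, 1, 0],
              [1, 0, 0, 1, 1, 0]]
a = np.array(old_matrix)
n = a.shape[0] // 2  # number of 2x2 blocks per side (assumes an even size)

# Axes after reshape: (block row, row in block, block column, column in block)
block_sums = a.reshape(n, 2, n, 2).sum(axis=(1, 3))

# A block has more ones than zeros only when its sum exceeds 2
new_matrix = (block_sums > 2).astype(int).tolist()
print(new_matrix)
```

Only the block sums that are actually needed get computed here, whereas the convolution computes every overlapping window and then discards most of them.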

Iterate across arbitrary dimension in numpy

I have a multidimensional numpy array, and I need to iterate across a given dimension. Problem is, I won't know which dimension until runtime. In other words, given an array m, I could want
m[:,:,:,i] for i in range(n)
or I could want
m[:,:,i,:] for i in range(n)
etc.
I imagine that there must be a straightforward feature in numpy to write this, but I can't figure out what it is/what it might be called. Any thoughts?
There are many ways to do this. You could build the right index with a list of slices, or perhaps alter m's strides. However, the simplest way may be to use np.swapaxes:
import numpy as np
m=np.arange(24).reshape(2,3,4)
print(m.shape)
# (2, 3, 4)
Let axis be the axis you wish to loop over. m_swapped is the same as m except the axis=1 axis is swapped with the last (axis=-1) axis.
axis=1
m_swapped=m.swapaxes(axis,-1)
print(m_swapped.shape)
# (2, 4, 3)
Now you can just loop over the last axis:
for i in range(m_swapped.shape[-1]):
    assert np.all(m[:,i,:] == m_swapped[...,i])
Note that m_swapped is a view, not a copy, of m. Altering m_swapped will alter m.
m_swapped[1,2,0]=100
print(m)
assert(m[1,0,2]==100)
You can use slice(None) in place of the :. For example,
import numpy as np
d = 2 # the dimension to iterate
x = np.arange(5*5*5).reshape((5,5,5))
s = slice(None) # :
for i in range(5):
    slicer = [s]*3 # [:, :, :]
    slicer[d] = i # [:, :, i]
    # index with a tuple; recent NumPy no longer accepts a list of slices here
    print(x[tuple(slicer)]) # x[:, :, i]
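Another option that may read more directly is np.moveaxis: moving the chosen axis to the front lets a plain for-loop walk along it, since iterating over an array yields slices along its first axis. This is a small sketch using the same 2x3x4 example shape as the swapaxes answer. The variable name sub is illustrative.

```python
import numpy as np

m = np.arange(24).reshape(2, 3, 4)
axis = 1  # the axis to iterate over, chosen at runtime

# np.moveaxis returns a view with `axis` moved to position 0
for i, sub in enumerate(np.moveaxis(m, axis, 0)):
    assert np.array_equal(sub, m[:, i, :])
    print(i, sub.shape)  # each slice has shape (2, 4)
```

Like swapaxes, moveaxis returns a view, so writing into sub would also modify m.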
