Append zeros to array - python

I want to create an array that initially has these values:
[[[0,0,0,0]], [[0,0,0,0]]]
Then I want to append a row of zeros to each block:
[[[0,0,0,0], [0,0,0,0]], [[0,0,0,0], [0,0,0,0]]]
and then keep appending rows of zeros in the same way in a loop:
[[[0,0,0,0], [0,0,0,0], ...], [[0,0,0,0], [0,0,0,0], ...]]
How can I do this with python?
I tried
import numpy, pandas
x = numpy.array([[[0, 0, 0, 0]], [[0, 0, 0, 0]]])
numpy.append(x[0], [[0, 0, 0, 0]], axis=0)
print(x)

What you're looking for might be numpy.insert(arr, obj, values, axis=None), which will "Insert values along the given axis before the given indices". In your case, you want to insert zeros (values) into the array A (arr) along axis 1 (axis=1). To insert at the beginning, pass position 0 (obj); to insert at the end, pass the current length of that axis, A.shape[1] (which is 1 here). Note that -1 would insert before the last element, not after it. For example:
>>> A = np.array([[[0,0,0,0]], [[0,0,0,0]]])
>>> A.shape
(2, 1, 4)
>>> B = np.insert(A, 1, 0, axis=1)
>>> B.shape
(2, 2, 4)
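To make the effect of the position argument concrete, here is a minimal sketch. The array contents and the names front, end and before_last are made up for illustration; only np.insert itself comes from the answer above.

```python
import numpy as np

# Illustrative data: shape (2, 2, 4), values 1..16 so inserted zeros are easy to spot.
A = np.arange(1, 17).reshape(2, 2, 4)

front = np.insert(A, 0, 0, axis=1)           # zeros become the first row of each block
end = np.insert(A, A.shape[1], 0, axis=1)    # zeros become the last row of each block
before_last = np.insert(A, -1, 0, axis=1)    # zeros land just before the last row

print(front.shape, end.shape, before_last.shape)
```

Each call returns a new array; A itself is left unchanged.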

In [160]: x = numpy.array([[[0, 0, 0, 0]], [[0, 0, 0, 0]]])
In [161]: x
Out[161]:
array([[[0, 0, 0, 0]],
[[0, 0, 0, 0]]])
In [162]: x.shape
Out[162]: (2, 1, 4)
Let's be clear what you start with - note the shape!
Your append attempt:
In [163]: np.append(x[0], [[0, 0, 0, 0]], axis=0)
Out[163]:
array([[0, 0, 0, 0],
[0, 0, 0, 0]])
In [164]: _.shape
Out[164]: (2, 4)
It makes a NEW array - x[0] is (1,4) array, and you join another (1,4) to make a (2,4). np.append when used like this is just np.concatenate. Read, and reread, the docs if necessary.
It did not change x. np.append is NOT a list append clone.
In [165]: x
Out[165]:
array([[[0, 0, 0, 0]],
[[0, 0, 0, 0]]])
If you must work incrementally, use list and list append. Arrays have a fixed size, so 'growing' them requires making a new array.
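A minimal sketch of the list-based approach, assuming the two blocks grow in lockstep (the names block_a and block_b are hypothetical): grow plain Python lists in the loop and convert to an array once at the end.

```python
import numpy as np

# Start with one row of zeros in each of the two blocks.
block_a = [[0, 0, 0, 0]]
block_b = [[0, 0, 0, 0]]

# list.append mutates the list in place, unlike np.append.
for _ in range(3):
    block_a.append([0, 0, 0, 0])
    block_b.append([0, 0, 0, 0])

# Build the array once, after the loop.
x = np.array([block_a, block_b])
print(x.shape)
```

If the final number of rows is known in advance, allocating np.zeros((2, n_rows, 4)) directly avoids the loop entirely.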

Related

Repeat Array while Maintaining Order within group

I have the below array and would like to repeat each array n times.
x_array
[array([14.91488012, 1.2986064 , 4.98965322]),
array([2.39389187e+02, 1.04442059e-01, 3.06391338e-01]),
array([ 48.19437348, 201.09951372, 0.35223001]),
array([ 19.96978171, 367.52578786, 0.68676553]),
array([0.55120466, 0.27133609, 0.75646697]),
array([8.21287360e+02, 1.76495077e+02, 4.87263691e-01]),
array([184.03439377, 1.24823107, 5.33109884]),
array([575.59800297, 186.4650814 , 2.21028258]),
array([0.50308552, 3.09976082, 0.10537899]),
array([1.02259912e+00, 1.52282513e+02, 1.15085308e-01])]
I've tried np.repeat(x_array, 2) but this doesn't preserve the order of the matrix/array. I've also tried x_array*2, but this seems to just put the new array at the bottom. I was hoping to repeat x_array[0] n times and do the same for the next set of arrays, so that I have n total of each in order.
Thanks in advance.
Building off of the last example from https://numpy.org/doc/stable/reference/generated/numpy.repeat.html,
x_array = np.array(x_array) # Or a similar operation to convert x_array to an ndarray vs. a list of arrays.
expanded_x_array = np.repeat(x_array, n, axis=0)
print(expanded_x_array)
should produce what you are looking for.
You just need to specify the axis:
>>> np.repeat(x_array, 2, axis=0)
array([[1.49149e+01, 1.29861e+00, 4.98965e+00],
[1.49149e+01, 1.29861e+00, 4.98965e+00],
[2.39389e+02, 1.04442e-01, 3.06391e-01],
[2.39389e+02, 1.04442e-01, 3.06391e-01],
...,
[5.03086e-01, 3.09976e+00, 1.05379e-01],
[5.03086e-01, 3.09976e+00, 1.05379e-01],
[1.02260e+00, 1.52283e+02, 1.15085e-01],
[1.02260e+00, 1.52283e+02, 1.15085e-01]])
From the docs:
numpy.repeat(a, repeats, axis=None)
...
axis int, optional
The axis along which to repeat values. By default, use the flattened input array, and return a flat output array.
(added bold)
You could use a list comprehension:
n = 2
repeated_list = [row for row in x_array for _ in range(n)]
print(repeated_list)
Your terminology is confusing. You say it's an "array", but the display looks more like a list, and the fact that x_array*2 puts a "new array" at the bottom confirms that: that's the list use of *.
np.repeat(x_array) first makes an array (a real one!)
np.array(x_array)
is a (n,3) float dtype array. Without axis np.repeat flattens - as documented!
Specifying the axis=0 works because it's repeating on that first n dimension. The result is a (2*n,3) float dtype array (not a list).
It is possible to make a 1d object dtype array containing those arrays. With that repeat will work without the axis parameter.
Knowing what you have, and describing it accurately, can make this kind of task much easier - and the questions clearer.
illustration
Make a list of arrays:
In [21]: alist = [np.ones(3,int),np.zeros(3,int),np.arange(3)]
In [22]: alist
Out[22]: [array([1, 1, 1]), array([0, 0, 0]), array([0, 1, 2])]
List repeat:
In [23]: alist*2
Out[23]:
[array([1, 1, 1]),
array([0, 0, 0]),
array([0, 1, 2]),
array([1, 1, 1]),
array([0, 0, 0]),
array([0, 1, 2])]
Make a 2d array from the list:
In [24]: np.array(alist)
Out[24]:
array([[1, 1, 1],
[0, 0, 0],
[0, 1, 2]])
repeat without axis repeats elements in a flattened way:
In [25]: np.repeat(alist,2)
Out[25]: array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2])
repeat this 2d array on 0 axis:
In [26]: np.repeat(alist,2,axis=0)
Out[26]:
array([[1, 1, 1],
[1, 1, 1],
[0, 0, 0],
[0, 0, 0],
[0, 1, 2],
[0, 1, 2]])
Object dtype array from list:
In [27]: arr = np.empty(3,object); arr[:]=alist
In [28]: arr
Out[28]: array([array([1, 1, 1]), array([0, 0, 0]), array([0, 1, 2])], dtype=object)
Since the arrays have the same size we have to use this special construct. Otherwise we get the 2d array [24].
This array has a repeat method, and with only one dimension we don't need to specify the axis. It's repeating the object elements, arrays, not the numbers in the 2d [24] array.
In [29]: arr.repeat(2)
Out[29]:
array([array([1, 1, 1]), array([1, 1, 1]), array([0, 0, 0]),
array([0, 0, 0]), array([0, 1, 2]), array([0, 1, 2])], dtype=object)

Matrix subtraction in Python error or confusion

I have this code to help understand the problem
hypotesisResult = (hypotesis(dataset, theta_init))
print(dataset.T.shape)
print(hypotesisResult.shape)
print(y.shape)
print((hypotesisResult - y).shape)
print(np.subtract(hypotesisResult, y).shape)
And the output of this:
(7, 1329)
(1329, 1)
(1329,1)
(1329, 1329)
(1329, 1329)
My question is: why, if I have the matrix "hypotesisResult" with shape (1329, 1) and I subtract "y" (1329, 1) from it, does the result have shape (1329, 1329)? What am I doing wrong? I want a new matrix of shape (1329, 1), like a scalar subtraction.
I don't trust your displayed shapes. It might be good to rerun this script, and also check dtype.
Here's the behavior I expect, involving (n,1) and (n,) or (1,n) arrays:
In [605]: x=np.ones((4,1),int); y=np.ones((4,),int)
In [606]: x,y
Out[606]:
(array([[1],
[1],
[1],
[1]]),
array([1, 1, 1, 1]))
In [607]: x-y
Out[607]:
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
In [608]: x-y[None,:]
Out[608]:
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
In [609]: x-y[:,None]
Out[609]:
array([[0],
[0],
[0],
[0]])
In [610]: x.T-y
Out[610]: array([[0, 0, 0, 0]])
The two basic broadcasting rules are:
add leading size 1 dimensions if needed to match the number of dimensions
adjust size 1 dimensions to match the other arrays.
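As a hedged sketch of how these two rules can produce a square result, assume y is actually 1-D with shape (n,) rather than (n, 1). The names h and y and the size 5 are made up for illustration.

```python
import numpy as np

n = 5
h = np.ones((n, 1))    # column, shape (5, 1)
y = np.arange(n, dtype=float)  # 1-D, shape (5,)

# Rule 1: y is treated as (1, 5). Rule 2: both stretch to (5, 5).
outer_diff = h - y

# Giving y an explicit trailing axis keeps the subtraction element-wise.
column_diff = h - y.reshape(-1, 1)

print(outer_diff.shape, column_diff.shape)
```

Printing y.shape and y.ndim right before the subtraction is a quick way to check which case applies.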
Hey, this isn't your answer, but I have something cool for you.
Consider a matrix saved in a list:
matx = [[1,54,2],[5,2,3],[2,23,4]]
If we need its transpose, we can use the zip() function with argument unpacking:
print(list(zip(*matx)))
Result:
[(1, 5, 2), (54, 2, 23), (2, 3, 4)]
Hope you like this

Efficient way to substitute repeating np.vstack in python?

I am trying to implement this post in python.
import numpy as np
x = np.array([0,0,0])
for r in range(3):
    x = np.vstack((x, np.array([-r, r, -r])))
x gets this value
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
I am concerned about the runtime efficiency of the repeated np.vstack calls. Is there a more efficient way to do this?
Build a list of arrays or lists, and apply np.array (or vstack) to that once:
In [598]: np.array([[-r,r,-r] for r in [0,0,1,2]])
Out[598]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
But if the column pattern is consistent, broadcasting two arrays against each other will be faster
In [599]: np.array([-1,1,-1])*np.array([0,0,1,2])[:,None]
Out[599]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
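If the rows really do have to come out of a loop, the list-based suggestion can be applied to the original code roughly as follows. This is a sketch, with rows as a hypothetical name: each np.vstack call inside a loop copies everything accumulated so far, whereas collecting rows in a list and stacking once copies the data a single time.

```python
import numpy as np

# Collect the rows in a plain list; appending to a list is cheap.
rows = [np.array([0, 0, 0])]
for r in range(3):
    rows.append(np.array([-r, r, -r]))

# One stacking call at the end instead of one per iteration.
x = np.vstack(rows)
print(x)
```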
Would it be useful to use numpy.tile?
N = 3
A = np.array([[0, *range(0, -N, -1)]]).T
B = np.tile(A, (1, N))
B[:,1] = -B[:,1]
The first line sets the expected number of rows after the first row of zeros. The second creates a NumPy array from an initial value of 0 followed by the linear sequence 0, -1, -2, down to -N + 1. Note the use of the splat operator, which unpacks the range object so that its elements are placed individually in the list after the leading 0; the result is then turned into a 2D NumPy array and transposed into a column vector. The third line tiles this vector N times horizontally. Finally, the fourth line negates the second column to get your desired output.
Example Run
In [175]: N = 3
In [176]: A = np.array([[0, *range(0, -N, -1)]]).T
In [177]: B = np.tile(A, (1, N))
In [178]: B[:,1] = -B[:,1]
In [179]: B
Out[179]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[-1, 1, -1],
[-2, 2, -2]])
You can use np.block as follows:
First create a block which you are currently doing inside the for loop
Finally, vertically stack a row of zeros using np.vstack to get the final desired answer
import numpy as np
size = 3
sign = (-1) ** np.arange(1, size + 1) # General integer sign array of alternating -1, 1
A = np.ones((size, size), int)
B = np.arange(0, size) * A
B = sign * np.block([B.T])
# array([[ 0, 0, 0],
# [ -1, 1, -1],
# [ -2, 2, -2]])
answer = np.vstack([B[0], B])
# array([[ 0, 0, 0],
# [ 0, 0, 0],
# [ -1, 1, -1],
# [ -2, 2, -2]])

Numpy: increment elements of an array given the indices required to increment

I am trying to turn a second-order tensor into a binary third-order tensor. Given a second-order tensor as an m x n numpy array A, I need to take each element value x in A and replace it with a vector v whose length is the maximum value of A, with a 1 at the index of v corresponding to the value x (i.e. v[x] = 1). I have been following this question: Increment given indices in a matrix, which addresses producing an array with increments at indices given by two-dimensional coordinates. I have been reading the answers and trying to use np.ravel_multi_index() and np.bincount() to do the same with three-dimensional coordinates, but I keep getting a ValueError: "invalid entry in coordinates array". This is what I have been using:
def expand_to_tensor_3(array):
    (x, y) = array.shape
    (a, b) = np.indices((x, y))
    a = a.reshape(x*y)
    b = b.reshape(x*y)
    tensor_3 = np.bincount(np.ravel_multi_index((a, b, array.reshape(x*y)), (x, y, np.amax(array))))
    return tensor_3
If you know what is wrong here or know an even better method to accomplish my goal, both would be really helpful, thanks.
You can use (A[:,:,np.newaxis] == np.arange(A.max()+1)).astype(int).
Here's a demonstration:
In [52]: A
Out[52]:
array([[2, 0, 0, 2],
[3, 1, 2, 3],
[3, 2, 1, 0]])
In [53]: B = (A[:,:,np.newaxis] == np.arange(A.max()+1)).astype(int)
In [54]: B
Out[54]:
array([[[0, 0, 1, 0],
[1, 0, 0, 0],
[1, 0, 0, 0],
[0, 0, 1, 0]],
[[0, 0, 0, 1],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]],
[[0, 0, 0, 1],
[0, 0, 1, 0],
[0, 1, 0, 0],
[1, 0, 0, 0]]])
Check a few individual elements of A:
In [55]: A[0,0]
Out[55]: 2
In [56]: B[0,0,:]
Out[56]: array([0, 0, 1, 0])
In [57]: A[1,3]
Out[57]: 3
In [58]: B[1,3,:]
Out[58]: array([0, 0, 0, 1])
The expression A[:,:,np.newaxis] == np.arange(A.max()+1) uses broadcasting to compare each element of A to np.arange(A.max()+1). For a single value, this looks like:
In [63]: 3 == np.arange(A.max()+1)
Out[63]: array([False, False, False, True], dtype=bool)
In [64]: (3 == np.arange(A.max()+1)).astype(int)
Out[64]: array([0, 0, 0, 1])
A[:,:,np.newaxis] is a three-dimensional view of A with shape (3,4,1). The extra dimension is added so that the comparison to np.arange(A.max()+1) will broadcast to each element, giving a result with shape (3, 4, A.max()+1).
With a trivial change, this will work for an n-dimensional array. Indexing a numpy array with the ellipsis ... means "all the other dimensions". So
(A[..., np.newaxis] == np.arange(A.max()+1)).astype(int)
converts an n-dimensional array to an (n+1)-dimensional array, where the last dimension is the binary indicator of the integer in A. Here's an example with a one-dimensional array:
In [6]: a = np.array([3, 4, 0, 1])
In [7]: (a[...,np.newaxis] == np.arange(a.max()+1)).astype(int)
Out[7]:
array([[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0]])
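The expression can be wrapped in a small function. This is only a sketch: to_one_hot is a hypothetical name, and it assumes the input holds non-negative integers.

```python
import numpy as np

def to_one_hot(a):
    """Hypothetical helper: append a binary indicator axis of length a.max() + 1.

    Assumes `a` contains non-negative integers.
    """
    a = np.asarray(a)
    # Broadcasting compares every element of a against 0, 1, ..., a.max().
    return (a[..., np.newaxis] == np.arange(a.max() + 1)).astype(int)

A = np.array([[2, 0, 0, 2],
              [3, 1, 2, 3],
              [3, 2, 1, 0]])
B = to_one_hot(A)
print(B.shape)
```

Taking B.argmax(axis=-1) recovers the original values, which is a convenient sanity check.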
You can make it work this way:
tensor_3 = np.bincount(np.ravel_multi_index((a, b, array.reshape(x*y)),
(x, y, np.amax(array) + 1)))
The difference is that I add 1 to the amax() result, because ravel_multi_index() expects that the indexes are all strictly less than the dimensions, not less-or-equal.
I'm not 100% sure if this is what you wanted; another way to make the code run is to specify mode='clip' or mode='wrap' in ravel_multi_index(), which does something a bit different and I'm guessing is less correct. But you can try it.
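Note that np.bincount returns a flat array, so a further step is needed to get back a three-dimensional result. The following sketch of a completed function is one possible way to do that, not necessarily what the answer intended: the minlength argument and the final reshape are additions, and depth is a name introduced here.

```python
import numpy as np

def expand_to_tensor_3(array):
    x, y = array.shape
    depth = int(array.max()) + 1          # valid indices are 0 .. max, hence + 1
    a, b = np.indices((x, y))
    flat = np.ravel_multi_index(
        (a.ravel(), b.ravel(), array.ravel()), (x, y, depth)
    )
    # minlength guarantees one bin per cell of the (x, y, depth) tensor.
    counts = np.bincount(flat, minlength=x * y * depth)
    return counts.reshape(x, y, depth)

A = np.array([[2, 0, 0, 2],
              [3, 1, 2, 3],
              [3, 2, 1, 0]])
T = expand_to_tensor_3(A)
print(T.shape)
```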

matlab find() for nonzero element in python

I have a sparse matrix (numpy.array) and I would like to have the index of the nonzero elements in it.
In Matlab I would write:
[i, j] = find(CM)
and in Python what should I do?
I have tried numpy.nonzero (but I don't know how to take the indices from that) and flatnonzero (but it's not convenient for me, I need both the row and column index).
Thanks in advance!
Assuming that by "sparse matrix" you don't actually mean a scipy.sparse matrix, but merely a numpy.ndarray with relatively few nonzero entries, then I think nonzero is exactly what you're looking for. Starting from an array:
>>> a = (np.random.random((5,5)) < 0.10)*1
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 1, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
nonzero returns the indices (here x and y) where the nonzero entries live:
>>> a.nonzero()
(array([1, 2, 3]), array([4, 2, 0]))
We can assign these to i and j:
>>> i, j = a.nonzero()
We can also use them to index back into a, which should give us only 1s:
>>> a[i,j]
array([1, 1, 1])
We can even modify a using these indices:
>>> a[i,j] = 2
>>> a
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 2],
[0, 0, 2, 0, 0],
[2, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
If you want a combined array from the indices, you can do that too:
>>> np.array(a.nonzero()).T
array([[1, 4],
[2, 2],
[3, 0]])
(there are lots of ways to do this reshaping; I chose one almost at random.)
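One of those other ways is np.argwhere, which returns the (row, column) pairs directly. The sketch below rebuilds the example array by hand (the random draw above is not reproducible) and also shows a column-by-column ordering, which is how MATLAB's find orders its output; MATLAB indices are additionally 1-based. The names pairs, i_cm and j_cm are made up for illustration.

```python
import numpy as np

a = np.array([[0, 0, 0, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 1, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]])

# Row-major order, as separate index arrays.
i, j = np.nonzero(a)

# Same information as an (n_nonzero, 2) array of (row, column) pairs.
pairs = np.argwhere(a)

# Column-major order: scan the transpose, then swap the roles of the results.
j_cm, i_cm = np.nonzero(a.T)

print(pairs)
print(i_cm, j_cm)
```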
This goes slightly beyond what you asked, and I only mention it since I once faced a similar problem. If you want the indices in order to access some other array, there is some very simple syntax:
import numpy as np
array = np.random.randint(0, 2, size=(3, 3))
data = np.random.random(size=(3, 3))
Now array looks something like
>>> array
array([[0, 1, 0],
[1, 0, 1],
[1, 1, 0]])
while data could be
>>> data
array([[ 0.92824816, 0.43605604, 0.16627849],
[ 0.00301434, 0.94342538, 0.95297402],
[ 0.32665135, 0.03504204, 0.86902492]])
Then if we want the elements of data where array is zero:
>>> data[array==0]
array([ 0.92824816, 0.16627849, 0.94342538, 0.86902492])
Which is nice and simple.
