Merging/appending 2 3D arrays dynamically - python

I am trying to append two 3D arrays in Python. I thought it would be plain easy, but despite trying multiple things it is not working.
import numpy as np
import pandas as pd
Arr1 = np.empty((0,6,2))
print(Arr1.shape)
>> (0,6,2)
df = pd.DataFrame({'var1': np.arange(10), 'var2': np.arange(10),
                   'prob': np.random.randint(0,10,10)})
xs = []
ys = []
for i in range(6,10):
    xs.append(df[i-6:i][['var1', 'var2']].values)
    ys.append(df.iloc[i]['prob'])
Arr2 = np.array(xs).reshape(-1,6,2)
print(Arr2.shape)
>> (4,6,2)
I am trying to append/merge Arr2 into Arr1 on axis=0. I tried the following things, but I keep getting errors.
try 1: x = np.append(Arr2 ,Arr1 ) -> no error but it gives one dimensional array as output
try 2: x = np.append(Arr2 ,Arr1 ,axis=0) -> does not work. gives error
try 3: x = np.append(Arr2 ,Arr1 ,axis=1) -> does not work. gives error
try 4: x = np.append(Arr2 ,Arr1 ,axis=2) -> does not work. gives error
try 5: x = np.stack([Arr2 ,Arr1 ]) -> does not work. gives error
I know I am missing the axis logic but would appreciate any help.

In [145]: Arr1 = np.empty((0,6,2))
In [146]: Arr2 = np.ones((4,6,2))
Joining on axis 0 works; the result has the same shape as Arr2:
In [148]: np.append(Arr2 ,Arr1 ,axis=0).shape
Out[148]: (4, 6, 2)
What's the point of doing this? I suspect you are intending to repeat this in a loop. :(
np.append is poorly named. It is not a list append clone. About the only place it's useful is adding a scalar to a 1d array. It saves the effort of making the scalar an array! That's all. All other uses are better done with concatenate - it lets you join a whole list of arrays with one call. np.append only lets you specify 2 at a time :(
Error:
In [149]: np.append(Arr2 ,Arr1 ,axis=1).shape
Traceback (most recent call last):
Input In [149] in <cell line: 1>
np.append(Arr2 ,Arr1 ,axis=1).shape
File <__array_function__ internals>:180 in append
File /usr/local/lib/python3.8/dist-packages/numpy/lib/function_base.py:5392 in append
return concatenate((arr, values), axis=axis)
File <__array_function__ internals>:180 in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 4 and the array at index 1 has size 0
np.append with axis is just a front end to concatenate. It should be obvious that you can't join a (0,6,2) and a (4,6,2) on axis 1 (the 6): the 4 and 0 on axis 0 don't match!
In [150]: np.append(Arr2 ,Arr1 ,axis=2).shape
....
Same error - the 0 and 4 don't match
In [151]: np.stack((Arr2 ,Arr1)).shape
....
ValueError: all input arrays must have the same shape
stack is meant to join identically shaped arrays on a new axis.
In [152]: np.stack((Arr2 ,Arr2)).shape
Out[152]: (2, 4, 6, 2)
In [153]: np.stack((Arr2 ,Arr2),axis=2).shape
Out[153]: (4, 6, 2, 2)
edit
It may help to join several identical arrays on different axes. Note where the shape matches the source, and where it's a multiple:
In [154]: arr = np.ones((2,3,4))
In [155]: np.concatenate((arr, arr, arr), axis=0).shape
Out[155]: (6, 3, 4) # 2*3
In [156]: np.concatenate((arr, arr, arr), axis=1).shape
Out[156]: (2, 9, 4) # 3*3
In [157]: np.concatenate((arr, arr, arr), axis=2).shape
Out[157]: (2, 3, 12) # 3*4
In [158]: np.stack((arr, arr, arr), axis=2).shape
Out[158]: (2, 3, 3, 4) # new size 3 axis
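If the goal is to build the array up inside a loop, a minimal sketch of the "collect in a list, concatenate once" pattern might look like the following. The chunk sizes and the names chunks, result and with_empty are made up for illustration; only the trailing (6,2) dimensions come from the question.

```python
import numpy as np

# Hypothetical chunks, each shaped (k, 6, 2) with a different k.
chunks = []
for k in (4, 1, 3):
    chunks.append(np.ones((k, 6, 2)))

# One concatenate call at the end joins them all on axis 0.
result = np.concatenate(chunks, axis=0)
print(result.shape)  # (8, 6, 2)

# A (0, 6, 2) array is a valid, if pointless, participant on axis 0.
with_empty = np.concatenate([np.empty((0, 6, 2)), result], axis=0)
print(with_empty.shape)  # (8, 6, 2)
```

Only the axis being joined may differ between the inputs; all other dimensions have to match.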


Writing the transpose of a vector in python

I have to write a Python function that computes the vector r_n, where A is n by n and xn is n by 1:
r_n = A xn - (xn^T A xn) xn
I'm using numpy, but .T doesn't work on vectors, and when I just do
r_n = A@xn - (xn@A@xn)@xn
then xn@A@xn gives me a scalar.
I've tried swapping A with xn, but nothing seems to work.
Making a 3x1 numpy array like this...
import numpy as np
a = np.array([1, 2, 3])
...and then attempting to take its transpose like this...
a_transpose = a.T
...will, confusingly, return this:
# [1 2 3]
If you want to define a (column) vector whose transpose you can meaningfully take, and get a row vector in return, you need to define it like this:
a = np.reshape(np.array([1, 2, 3]), (3, 1))
print(a)
# [[1]
# [2]
# [3]]
a_transpose = a.T
print(a_transpose)
# [[1 2 3]]
If you want to define a 1 x n array whose transpose you can take to get an n x 1 array, you can do it like this:
a = np.array([[1, 2, 3]])
and then get its transpose by calling a.T.
If A is (n,n) and xn is (n,1):
A@xn - (xn@A@xn)@xn
(n,n)@(n,1) - ((n,1)@(n,n)@(n,1)) @ (n,1)
(n,1) - error (1 does not match n)
If xn@A@xn gives a scalar, that's because xn has shape (n,); as per the np.matmul docs, the product of two 1d arrays is their inner product:
(n,)@(n,n)@(n,) => (n,)@(n,) -> scalar
I think you want
(1,n) @ (n,n) @ (n,1) => (1,1)
Come to think of it, that (1,1) array should hold the same single value as the scalar.
Sample calculation; 1st with the (n,) shape:
In [6]: A = np.arange(1,10).reshape(3,3); x = np.arange(1,4)
In [7]: A@x
Out[7]: array([14, 32, 50]) # (3,3)@(3,)=>(3,)
In [8]: x@A@x # scalar
Out[8]: 228
In [9]: (x@A@x)@x
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[9], line 1
----> 1 (x@A@x)@x
ValueError: matmul: Input operand 0 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)
matmul does not like to work with scalars. But we can use np.dot instead, or simply multiply:
In [10]: (x@A@x)*x
Out[10]: array([228, 456, 684]) # (3,)
In [11]: A@x - (x@A@x)*x
Out[11]: array([-214, -424, -634])
Change the array to (3,1):
In [12]: xn = x[:,None]; xn.shape
Out[12]: (3, 1)
In [13]: A@xn - (xn.T@A@xn)*xn
Out[13]:
array([[-214],
       [-424],
       [-634]]) # same numbers but in (3,1) shape
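The calculation above could be wrapped in one function that accepts either shape. This is only a sketch: the name residual is hypothetical, and flattening x to get the scalar is one choice among several.

```python
import numpy as np

def residual(A, x):
    """Return A@x - (x^T A x) * x for x of shape (n,) or (n, 1)."""
    x = np.asarray(x)
    # Flatten so that v @ A @ v is a plain scalar regardless of input shape.
    v = x.reshape(-1)
    quad = v @ A @ v
    # Scalar * array broadcasts, so the result keeps x's original shape.
    return A @ x - quad * x

A = np.arange(1, 10).reshape(3, 3)
x = np.arange(1, 4)
r1 = residual(A, x)            # shape (3,)
r2 = residual(A, x[:, None])   # shape (3, 1)
print(r1)
print(r2.shape)
```

Using * rather than @ for the last product sidesteps the matmul restriction on scalars shown above.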

how to append row vectors to an empty matrix without knowing the size of the matrix using numpy?

rows is a 343x30 matrix of real numbers. I'm trying to append row vectors from rows to true_rows and false_rows, but it only adds the first row and doesn't do anything afterwards. I've tried vstack and also tried passing example as a 2D array ([example]), but it crashed my PyCharm. What can I do?
true_rows = []
true_labels = []
false_rows = []
false_labels = []
i = 0
for example in rows:
    if question.match(example):
        true_rows = np.append(true_rows , example , axis=0)
        true_labels.append(labels[i])
    else:
        #false_rows = np.vstack(false_rows, example_t)
        false_rows = np.append(false_rows, example, axis=0)
        false_labels.append(labels[i])
    i += 1
You can append your rows to a plain list and then convert that list to a numpy array, for example:
exemple1 = np.array([1,2,3,4,5])
exemple2 = np.array([6,7,8,9,10])
exemple3 = np.array([11,12,13,14,15])
true_rows = []
true_rows.append(exemple1)
true_rows.append(exemple2)
true_rows.append(exemple3)
true_rows = np.array(true_rows)
You will get this result:
true_rows = array([[ 1,  2,  3,  4,  5],
                   [ 6,  7,  8,  9, 10],
                   [11, 12, 13, 14, 15]])
You can also use np.concatenate if you want a one-dimensional array instead:
true_rows = np.concatenate(true_rows , axis =0)
You will get this result:
true_rows = array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
Your use of [] and np.append suggests you are trying to imitate the common list append model with arrays. You at least read enough of the np.append docs to know you need to use axis, and that it returns a new array (the docs are quite clear this is a copy).
But did you test this idea with a small example, and actually look at the results (step by step)?
In [326]: rows = []
In [327]: rows = np.append(rows, np.arange(3), axis=0)
In [328]: rows
Out[328]: array([0., 1., 2.])
In [329]: rows.shape
Out[329]: (3,)
The first append adds nothing new - the result has the same values as arange(3) (as floats, because [] becomes a float array).
In [330]: rows = np.append(rows, np.arange(3), axis=0)
In [331]: rows
Out[331]: array([0., 1., 2., 0., 1., 2.])
In [332]: rows.shape
Out[332]: (6,)
Do you understand why? We join two 1d arrays on axis 0, making a 1d array.
Using [] as a starting point is the same as starting with this array:
In [333]: np.array([])
Out[333]: array([], dtype=float64)
In [334]: np.array([]).shape
Out[334]: (0,)
And with axis, np.append is just a call to concatenate:
In [335]: np.concatenate(( [], np.arange(3)), axis=0)
Out[335]: array([0., 1., 2.])
np.append sort of looks like list append, but it is not a clone. It's really just a poorly named way to use concatenate. And you can't use it properly without actually understanding dimensions. The np.append docs have an example with an error much like what you got with concatenate.
Repeated use of these array concatenates in a loop is not a good idea. It's hard to get the dimensions right, as you found. And even when it works, it is slow, since each step makes a copy (which grows with the iteration).
That's why the other answer sticks with list append.
vstack is like concatenate with axis 0, but it makes sure all arguments are 2d. But if the number of columns differs, it raises an error:
In [336]: np.vstack(( [],np.arange(3)))
Traceback (most recent call last):
File "<ipython-input-336-22038d6ef0f7>", line 1, in <module>
np.vstack(( [],np.arange(3)))
File "<__array_function__ internals>", line 180, in vstack
File "/usr/local/lib/python3.8/dist-packages/numpy/core/shape_base.py", line 282, in vstack
return _nx.concatenate(arrs, 0)
File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 0 and the array at index 1 has size 3
In [337]: np.vstack(( [0,0,0],np.arange(3)))
Out[337]:
array([[0, 0, 0],
[0, 1, 2]])
If all you are joining are rows of a (n,30) array, then you do know the column size of the result.
In [338]: res = np.zeros((0,3))
In [339]: np.vstack(( res, np.arange(3)))
Out[339]: array([[0., 1., 2.]])
If you pay attention to the shape details, it is possible to create an array iteratively.
But instead of collecting rows one by one, why not create a mask and do the collection once?
Roughly do
mask = np.array([question.match(example) for example in rows])
true_rows = rows[mask]
false_rows = rows[~mask]
This still requires an iteration to build the mask, but overall it should be faster.
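A runnable sketch of the mask approach, with made-up data: the matches function below is a hypothetical stand-in for question.match, and rows and labels are random placeholders with the shapes from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
rows = rng.standard_normal((343, 30))   # placeholder for the real data
labels = np.arange(343)                 # placeholder labels, one per row

def matches(example):
    # Hypothetical stand-in for question.match(example).
    return example[0] > 0

# One Python-level pass to build the boolean mask ...
mask = np.array([matches(example) for example in rows])

# ... then boolean indexing splits rows and labels in one step each.
true_rows, false_rows = rows[mask], rows[~mask]
true_labels, false_labels = labels[mask], labels[~mask]
print(true_rows.shape, false_rows.shape)
```

The same mask indexes both rows and labels, so the two stay aligned without a manual counter.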

How to input a 1-D array of dimensions into numpy.random.randn?

Say I have a 1-D array dims:
dims = np.array((1,2,3,4))
I want to create a n-th order normally distributed tensor where n is the size of the dims and dims[i] is the size of the i-th dimension.
I tried to do
A = np.random.randn(dims)
But this doesn't work. I could do
A = np.random.randn(1,2,3,4)
which would work, but n can be large and n can be random in itself. How can I pass in an array holding the sizes of the dimensions in this case?
Use unpacking with an asterisk:
np.random.randn(*dims)
Unpacking is standard Python when the signature is randn(d0, d1, ..., dn)
In [174]: A = np.random.randn(*dims)
In [175]: A.shape
Out[175]: (1, 2, 3, 4)
The randn docs suggest standard_normal, which takes a tuple (or an array, which can be treated as a tuple):
In [176]: B = np.random.standard_normal(dims)
In [177]: B.shape
Out[177]: (1, 2, 3, 4)
In fact, the docs say new code should use a Generator from default_rng, which has standard_normal but no randn:
In [180]: rgn = np.random.default_rng()
In [181]: rgn.randn
Traceback (most recent call last):
File "<ipython-input-181-b8e8c46209d0>", line 1, in <module>
rgn.randn
AttributeError: 'numpy.random._generator.Generator' object has no attribute 'randn'
In [182]: rgn.standard_normal(dims).shape
Out[182]: (1, 2, 3, 4)
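For the case where the number of dimensions is itself random, a small sketch using the Generator API could look like this. The seed and the size ranges are arbitrary choices for illustration, not something from the question.

```python
import numpy as np

rng = np.random.default_rng(seed=0)    # arbitrary seed, for repeatability

n = int(rng.integers(1, 5))            # random number of dimensions, 1..4
dims = rng.integers(1, 4, size=n)      # each dimension has size 1..3

# standard_normal takes the whole shape as one argument; no unpacking needed.
A = rng.standard_normal(tuple(int(d) for d in dims))
print(dims, A.shape)
```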

Dot product of a vector with each vector in another matrix

weight = np.array([[[ 0.38932115, -0.27430567]],
[[-0.04543304, -0.05643598]],
[[ 0.46912688, -0.07695298]]])
data = np.array([[-0.2056065, 0.7889058]])
like,
data = np.array([[1, 2, 3], [4, 5, 6]])
I want to take the dot product of the row in data with each row in weight, how could I accomplish this? I tried tensordot but it seems a bit convoluted / non-obvious the way axes works. Is there an easier way?
Your use of row and vector is a bit ambiguous:
In [7]: weight = np.array([[[ 0.38932115, -0.27430567]],
...:
...: [[-0.04543304, -0.05643598]],
...:
...: [[ 0.46912688, -0.07695298]]])
...:
...: data = np.array([[-0.2056065, 0.7889058]])
In [8]: weight.shape
Out[8]: (3, 1, 2)
In [9]: data.shape
Out[9]: (1, 2)
Are your rows of shape (2,) or (1,2)?
dot is a 'sum of products' function, but sum on which axis?
With einsum we can control the sum axis.
Sum on both the 1 and 2's:
In [11]: np.einsum('ijk,jk',weight, data)
Out[11]: array([-0.29644829, -0.03518134, -0.15716419]) # shape (3,)
or just the 1's:
In [12]: np.einsum('ijk,jm',weight, data)
Out[12]:
array([[[-0.08004696, 0.30713771],
[ 0.05639903, -0.21640133]],
[[ 0.00934133, -0.03584239],
[ 0.0116036 , -0.04452267]],
[[-0.09645554, 0.37009692],
[ 0.01582203, -0.06070865]]])
In [13]: _.shape
Out[13]: (3, 2, 2)
Or just the 2's:
In [14]: np.einsum('ijk,mk',weight, data)
Out[14]:
array([[[-0.29644829]],
[[-0.03518134]],
[[-0.15716419]]])
In [16]: _.shape
Out[16]: (3, 1, 1)
matmul/@ also does this sum - data.T changes the (1,2) array to a (2,1). This pairs the (3,1,2) with a (2,1) to fit the "last of A with the second to last of B" rule for dot/@.
In [17]: weight @ data.T
Out[17]:
array([[[-0.29644829]],
[[-0.03518134]],
[[-0.15716419]]])
You ask about multidimensional data. Just what do you mean by that? It already is 2d. Do you mean a (n,2) array, or a (n,1,2)? What's the relation between this n dimension and the 3 dimension of weight? No hand waving please :)
You can also use np.apply_along_axis
np.apply_along_axis(lambda x:np.dot(x,data.T),2,weight)
which gives
array([[[-0.29644829]],
[[-0.03518134]],
[[-0.15716419]]])
If data contains more than one row, this will also work, for example
weight = np.array([[[ 0.38932115, -0.27430567]],
[[-0.04543304, -0.05643598]],
[[ 0.46912688, -0.07695298]]])
data = np.array([[-0.2056065, 0.7889058],[-0.2056065, 0.7889058]])
np.apply_along_axis(lambda x:np.dot(x,data.T),2,weight)
gives you
array([[[-0.29644829, -0.29644829]],
[[-0.03518134, -0.03518134]],
[[-0.15716419, -0.15716419]]])
First transpose data before taking the dot product.
>>> weight.dot(data.T)
array([[[-0.29644829]],
[[-0.03518134]],
[[-0.15716419]]])
# Multiple rows of data.
data = np.array([[-0.2056065, 0.7889058],
[0.7889058, -.2056065]])
>>> weight.dot(data.T)
array([[[-0.29644829, 0.36353674]],
[[-0.03518134, -0.02423878]],
[[-0.15716419, 0.38591895]]])
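As a quick cross-check, the einsum, matmul and dot routes from the answers above can be compared on the same inputs. This is only a sketch: the explicit '->ijm' output spec and the names via_einsum, via_matmul and via_dot are illustrative choices.

```python
import numpy as np

weight = np.array([[[ 0.38932115, -0.27430567]],
                   [[-0.04543304, -0.05643598]],
                   [[ 0.46912688, -0.07695298]]])   # shape (3, 1, 2)
data = np.array([[-0.2056065, 0.7889058],
                 [0.7889058, -0.2056065]])          # shape (2, 2)

# Sum over the last axis of both arrays (the size-2 axis).
via_einsum = np.einsum('ijk,mk->ijm', weight, data)
via_matmul = weight @ data.T
via_dot = weight.dot(data.T)

print(via_einsum.shape)                      # (3, 1, 2)
print(np.allclose(via_einsum, via_matmul))   # True
print(np.allclose(via_matmul, via_dot))      # True
```

All three produce one dot product per (weight row, data row) pair; they differ only in how the summed axis is specified.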

How to create an empty array and append it?

I am new to programming.
I am trying to run this code
from numpy import *
x = empty((2, 2), int)
x = append(x, array([1, 2]), axis=0)
x = append(x, array([3, 5]), axis=0)
print(x)
But I get this error
Traceback (most recent call last):
File "/home/samip/PycharmProjects/MyCode/test.py", line 3, in <module>
x = append(x, array([1, 2]), axis=0)
File "<__array_function__ internals>", line 5, in append
File "/usr/lib/python3/dist-packages/numpy/lib/function_base.py", line 4700, in append
return concatenate((arr, values), axis=axis)
File "<__array_function__ internals>", line 5, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
I suspect you are trying to replicate this working list code:
In [56]: x = []
In [57]: x.append([1,2])
In [58]: x
Out[58]: [[1, 2]]
In [59]: np.array(x)
Out[59]: array([[1, 2]])
But with arrays:
In [53]: x = np.empty((2,2),int)
In [54]: x
Out[54]:
array([[73096208, 10273248],
[ 2, -1]])
Despite the name, the np.empty array is NOT a clone of the empty list. It has 4 elements, in the shape that you specified, and their values are arbitrary.
In [55]: np.append(x, np.array([1,2]), axis=0)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-55-64dd8e7900e3> in <module>
----> 1 np.append(x, np.array([1,2]), axis=0)
<__array_function__ internals> in append(*args, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py in append(arr, values, axis)
4691 values = ravel(values)
4692 axis = arr.ndim-1
-> 4693 return concatenate((arr, values), axis=axis)
4694
4695
<__array_function__ internals> in concatenate(*args, **kwargs)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
Note that np.append has passed the task on to np.concatenate. With the axis parameter, that's all this append does. It is NOT a list append clone.
np.concatenate demands consistency in the dimensions of its inputs. One is (2,2), the other (2,). Mismatched dimensions.
np.append is a dangerous function, and not that useful even when used correctly. np.concatenate (and the various stack functions) are useful. But you need to pay attention to shapes. And don't use them iteratively. List append is more efficient for that.
When you got this error, did you look up the np.append, np.empty (and np.concatenate) functions? Read and understand the docs? In the long run SO questions aren't a substitute for reading the documentation.
You can create an empty list with []. To add a new item, use append. To add the items of another list, use extend.
x = [1, 2, 3]
x.append(4)
x.extend([5, 6])
print(x)
# [1, 2, 3, 4, 5, 6]
The issue is that the line x = empty((2, 2), int) is creating a 2D array.
Later when you try to append array([1, 2]) you get an error because it is a 1D array.
You can try the code below. Note that without axis, append flattens both inputs, so x becomes a 1D array of 6 elements.
from numpy import *
x = empty((2, 2), int)
x = append(x,[1,2])
print(x)
As you can see in the error, your two arrays must have the same number of dimensions: x.shape returns (2,2) and array([1,2]).shape returns (2,), so what you have to do is
x = np.append(x, np.array([1,2]).reshape((1,2)), axis=0)
Printing x returns something like this (the first two rows are whatever uninitialized values np.empty left there):
array([[1.966937e-316, 4.031792e-313],
[0.000000e+000, 4.940656e-324],
[1.000000e+000, 2.000000e+000]])
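To end up with only the appended rows, two options are sketched below: start from an array that really has zero rows, or collect in a list and convert once. The names x, rows and y are just for illustration.

```python
import numpy as np

# Option 1: start from a (0, 2) array, which really has no rows.
x = np.empty((0, 2), int)
x = np.append(x, np.array([[1, 2]]), axis=0)   # note the 2D (1, 2) value
x = np.append(x, np.array([[3, 5]]), axis=0)
print(x)

# Option 2: collect in a list and convert once at the end.
rows = []
rows.append([1, 2])
rows.append([3, 5])
y = np.array(rows)
print(y)
```

Both give a (2, 2) array; the list version avoids copying the whole array on every step.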
