Elegant solution to appending vector to matrix in Numpy? - python

I've seen others post on this, but it's not clear to me if there's a better solution. I've got a 2D NumPy array, and I'd like to append a column to it. For example:
import numpy as np
A = np.array([[2., 3.],[-1., -2.]])
e = np.ones(2)
print(A)
print(e)
B = np.hstack((A,e.reshape((2,1))))
print(B)
does exactly what I want. But is there a way to avoid this clunky use of reshape?

If you want to avoid using reshape, then the column you append has to have the right shape to begin with:
e = np.ones((2, 1))
B = np.hstack((A,e))
Note the modification to the call to ones. The reason you have to use reshape at the moment is that NumPy does not treat an array of shape (2,) as the same thing as an array of shape (2, 1). The first is a 1D array; the second is a 2D array in which one of the dimensions has size 1.
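To make the shape distinction concrete, here is a minimal sketch using the same A as in the question. The names e_1d and e_2d are just illustrative labels, not anything from the original post.

```python
import numpy as np

A = np.array([[2., 3.], [-1., -2.]])
e_1d = np.ones(2)        # shape (2,): a single axis
e_2d = np.ones((2, 1))   # shape (2, 1): two axes, the second of length 1

print(e_1d.shape, e_1d.ndim)
print(e_2d.shape, e_2d.ndim)

# hstack joins 2D arrays along axis 1, so the 2D column works directly.
B = np.hstack((A, e_2d))
print(B.shape)

# Passing the 1D array fails because the numbers of dimensions differ.
try:
    np.hstack((A, e_1d))
except ValueError as err:
    print("hstack failed:", err)
```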

My nomination for a direct solution is
np.concatenate((A, e[:, None]), axis=1)
The [:,None] turns e into a (2,1) which can be joined to the (2,2) to produce a (2,3). Reshape does the same, but isn't as syntactically pretty.
Solutions using hstack, vstack, and c_ do the same thing but hide one or more details.
In this case I think column_stack hides the most details.
np.column_stack((A, e))
Under the covers this does:
np.concatenate((A, np.array(e, copy=False, ndmin=2).T), axis=1)
That np.array(... ndmin=2).T is yet another way of doing the reshape.
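As a quick sanity check, the sketch below runs several of these spellings side by side and compares them against the expected (2,3) result. The use of reshape(-1, 1) and np.atleast_2d here is my own choice for illustration; reshape(-1, 1) simply avoids hard-coding the length of e.

```python
import numpy as np

A = np.array([[2., 3.], [-1., -2.]])
e = np.ones(2)

expected = np.array([[2., 3., 1.], [-1., -2., 1.]])

results = {
    "concatenate": np.concatenate((A, e[:, None]), axis=1),
    "hstack+reshape": np.hstack((A, e.reshape(-1, 1))),
    "column_stack": np.column_stack((A, e)),
    "atleast_2d": np.concatenate((A, np.atleast_2d(e).T), axis=1),
}

# Each variant should print shape (2, 3) and True.
for name, B in results.items():
    print(name, B.shape, np.array_equal(B, expected))
```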

There are many solutions. I like np.c_ which treats 1d inputs as columns (hence c) resulting in a concise, clutter-free, easy to read:
np.c_[A, e]
# array([[ 2., 3., 1.],
# [-1., -2., 1.]])

As Tim B says, to hstack you need a (2,1) array. Alternatively (keeping your e as a one-dimensional array), vstack to the transpose, and take the transpose:
In [11]: np.vstack((A.T, e)).T
Out[11]:
array([[ 2., 3., 1.],
[-1., -2., 1.]])

Related

Add a level to Numpy array

I have a problem with a numpy array.
In particular, suppose I have a matrix
x = np.array([[1., 2., 3.], [4., 5., 6.]])
with shape (2,3). I want to wrap each float in its own list, so as to obtain the array [[[1.], [2.], [3.]], [[4.], [5.], [6.]]] with shape (2,3,1).
I tried to convert each float number to a list (i.e., x[0][0] = [x[0][0]]) but it does not work.
Can anyone help me? Thanks
What you want is adding another dimension to your numpy array. One way of doing it is using reshape:
x = x.reshape(2,3,1)
output:
[[[1.]
[2.]
[3.]]
[[4.]
[5.]
[6.]]]
There is a function in NumPy that does exactly what @Valdi_Bo mentions. You can use np.expand_dims to add a new dimension along axis 2, as follows:
x = np.expand_dims(x, axis=2)
See the documentation for np.expand_dims.
Actually, you want to add a dimension (not level).
To do it, run:
result = x[...,np.newaxis]
Its shape is just (2, 3, 1).
Or save the result back under x.
You are trying to add a new dimension to the NumPy array. There are multiple ways of doing this, as other answers mentioned: np.expand_dims, np.newaxis, np.reshape, etc. But I usually use the following, as I find it the most readable, especially when vectorizing operations over multiple tensors or doing complex operations involving broadcasting (check this Bounty question that I solved with this method).
x[:,:,None].shape
(2,3,1)
x[None,:,None,:,None].shape
(1,2,1,3,1)
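Well, maybe this is overkill for the array you have, but another option is np.lib.stride_tricks.as_strided, which builds the new array as a view, so no data is copied. (For a contiguous array like this one, reshape, np.expand_dims and indexing with np.newaxis also return views rather than copies, so the saving is not unique to as_strided.)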
import numpy as np
x = np.array([[1., 2., 3.], [4., 5., 6.]])
newshape = x.shape[:-1] + (x.shape[-1], 1)
newstrides = x.strides + x.strides[-1:]
a = np.lib.stride_tricks.as_strided(x, shape=newshape, strides=newstrides)
results in:
array([[[1.],
[2.],
[3.]],
[[4.],
[5.],
[6.]]])
>>> a.shape
(2, 3, 1)
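For comparison, here is a small sketch that applies the simpler approaches from the answers above to the same x and reports, via np.shares_memory, whether each result is a view onto the original data. The dictionary and its labels are only there for illustration.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])

candidates = {
    "reshape": x.reshape(2, 3, 1),
    "expand_dims": np.expand_dims(x, axis=2),
    "newaxis": x[..., np.newaxis],
    "None index": x[:, :, None],
}

for name, y in candidates.items():
    # shares_memory is True when y is a view onto x's buffer
    print(name, y.shape, np.shares_memory(x, y))
```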

python why use numpy.r_ instead of concatenate

In which cases is using objects like numpy.r_ or numpy.c_ better (more efficient, more suitable) than using functions like concatenate or vstack, for example?
I am trying to understand some code where the programmer wrote something like:
return np.r_[0.0, array_1d, 0.0] == 2
where array_1d is an array whose values can be 0, 1 or 2.
Why not use np.concatenate (for example) instead? Like:
return np.concatenate([[0.0], array_1d, [0.0]]) == 2
It is more readable and apparently it does the same thing.
np.r_ is implemented in the numpy/lib/index_tricks.py file. This is pure Python code, with no special compiled stuff. So it is not going to be any faster than the equivalent written with concatenate, arange and linspace. It's useful only if the notation fits your way of thinking and your needs.
In your example it just saves converting the scalars to lists or arrays:
In [452]: np.r_[0.0, np.array([1,2,3,4]), 0.0]
Out[452]: array([ 0., 1., 2., 3., 4., 0.])
concatenate raises an error with the same arguments:
In [453]: np.concatenate([0.0, np.array([1,2,3,4]), 0.0])
...
ValueError: zero-dimensional arrays cannot be concatenated
and works once the scalars are wrapped in []:
In [454]: np.concatenate([[0.0], np.array([1,2,3,4]), [0.0]])
Out[454]: array([ 0., 1., 2., 3., 4., 0.])
hstack takes care of that by passing all arguments through [atleast_1d(_m) for _m in tup]:
In [455]: np.hstack([0.0, np.array([1,2,3,4]), 0.0])
Out[455]: array([ 0., 1., 2., 3., 4., 0.])
So at least in simple cases it is most similar to hstack.
But the real usefulness of r_ comes when you want to use ranges
np.r_[0.0, 1:5, 0.0]
np.hstack([0.0, np.arange(1,5), 0.0])
np.r_[0.0, slice(1,5), 0.0]
r_ lets you use the : syntax that is used in indexing. That's because it is actually an instance of a class that has a __getitem__ method. index_tricks uses this programming trick several times.
They've thrown in other bells and whistles. With an imaginary step, r_ uses np.linspace to expand the slice rather than np.arange:
np.r_[-1:1:6j, [0]*3, 5, 6]
produces:
array([-1. , -0.6, -0.2, 0.2, 0.6, 1. , 0. , 0. , 0. , 5. , 6. ])
There are more details in the documentation.
I did some time tests for many slices in https://stackoverflow.com/a/37625115/901925
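The slice behaviour described above can be checked directly. The following is a minimal sketch comparing r_ with the explicit arange and linspace spellings; the variable names are arbitrary.

```python
import numpy as np

# Integer-step slice: expanded like np.arange(1, 5)
a = np.r_[0.0, 1:5, 0.0]
b = np.concatenate(([0.0], np.arange(1, 5), [0.0]))

# Imaginary step: 6j means "6 points, endpoint included", like np.linspace
c = np.r_[-1:1:6j]
d = np.linspace(-1, 1, 6)

print(np.array_equal(a, b))
print(np.allclose(c, d))
```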
I was also interested in this question and compared the speed of
numpy.c_[a, a]
numpy.stack([a, a]).T
numpy.vstack([a, a]).T
numpy.column_stack([a, a])
numpy.concatenate([a[:,None], a[:,None]], axis=1)
which all do the same thing for any input vector a. Here's what I found (using perfplot):
For small arrays, numpy.concatenate is the winner; for larger ones, stack/vstack.
The plot was created with
import numpy as np
import perfplot
b = perfplot.bench(
    setup=np.random.rand,
    kernels=[
        lambda a: np.c_[a, a],
        lambda a: np.stack([a, a]).T,
        lambda a: np.vstack([a, a]).T,
        lambda a: np.column_stack([a, a]),
        lambda a: np.concatenate([a[:, None], a[:, None]], axis=1),
    ],
    labels=["c_", "stack", "vstack", "column_stack", "concat"],
    n_range=[2**k for k in range(22)],
    xlabel="len(a)",
)
b.save("out.png")
b.show()
All the explanation you need:
https://sourceforge.net/p/numpy/mailman/message/13869535/
I found the most relevant part to be:
"""
For r_ and c_ I'm summarizing, but effectively they seem to be doing
something like:
r_[args]:
concatenate( map(atleast_1d,args),axis=0 )
c_[args]:
concatenate( map(atleast_1d,args),axis=1 )
c_ behaves almost exactly like hstack -- with the addition of range
literals being allowed.
r_ is most like vstack, but a little different since it effectively
uses atleast_1d, instead of atleast_2d. So you have
>>> numpy.vstack((1,2,3,4))
array([[1],
[2],
[3],
[4]])
but
>>> numpy.r_[1,2,3,4]
array([1, 2, 3, 4])
"""
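The quoted message is a summary from an old mailing-list thread, so it may not match current NumPy behaviour in every detail. A quick way to see what each helper does with 1D inputs today is to compare the resulting shapes; this sketch uses two made-up arrays a and b.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

shape_r = np.r_[a, b].shape            # 1D inputs joined end to end
shape_h = np.hstack((a, b)).shape      # same as r_ for 1D inputs
shape_c = np.c_[a, b].shape            # 1D inputs become columns
shape_v = np.vstack((a, b)).shape      # 1D inputs become rows

print(shape_r, shape_h, shape_c, shape_v)
```

For 1D inputs, then, c_ lines up with column_stack rather than hstack, which is why it was suggested for appending a column earlier on this page.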

fill off diagonal of numpy array fails

I'm trying to fill the offset diagonals of a matrix:
loss_matrix = np.zeros((125,125))
np.diagonal(loss_matrix, 3).fill(4)
ValueError: assignment destination is read-only
Two questions:
1) Without iterating over indexes, how can I set the offset diagonals of a numpy array?
2) Why is the result of np.diagonal read only? The documentation for numpy.diagonal reads: "In NumPy 1.10, it will return a read/write view and writing to the returned array will alter your original array."
np.__version__
'1.10.1'
Judging by the discussion on the NumPy issue tracker, it looks like the feature is stuck in limbo and they never got around to fixing the documentation to say it was delayed.
If you need writability, you can force it. This will only work on NumPy 1.9 and up, since np.diagonal makes a copy on lower versions:
diag = np.diagonal(loss_matrix, 3)
# It's not writable. MAKE it writable.
diag.setflags(write=True)
diag.fill(4)
In an older version, diagflat constructs an array from a diagonal.
In [180]: M=np.diagflat(np.ones(125-3)*4,3)
In [181]: M.shape
Out[181]: (125, 125)
In [182]: M.diagonal(3)
Out[182]:
array([ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.,... 4.])
In [183]: np.__version__
Out[183]: '1.8.2'
Effectively it does this (working from its Python code)
res = np.zeros((125, 125))
i = np.arange(122)
fi = i+3+i*125
res.flat[fi] = 4
That is, it finds the indices into the flattened array that correspond to the diagonal.
I can also get fi with:
In [205]: i=np.arange(0,122)
In [206]: np.ravel_multi_index((i,i+3),(125,125))
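If you would rather not touch the writeable flag at all, the same index idea works with ordinary row/column indexing. This is a sketch under the question's setup (a 125x125 matrix, offset 3); the names n, k, rows and cols are mine.

```python
import numpy as np

n, k = 125, 3
loss_matrix = np.zeros((n, n))

# Row and column indices of the k-th superdiagonal
rows = np.arange(n - k)
cols = rows + k
loss_matrix[rows, cols] = 4

# The same positions expressed as indices into the flattened array
flat = np.ravel_multi_index((rows, cols), (n, n))
print(np.array_equal(flat, rows * n + cols))

# Every entry on the offset diagonal is now 4; everything else is still 0
print(np.all(np.diagonal(loss_matrix, k) == 4), loss_matrix.sum())
```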

Problems on how to transpose single column data in python

I created a text file called 'column.txt' containing the following data:
1
2
3
4
9
8
Then I wrote the code below to transpose my data to a single-row text file.
import numpy as np
x=np.loadtxt('column.txt')
z=x.T
y=x.transpose()
np.savetxt('row.txt',y, fmt='%i')
I tried two different ways - using matrix multiplication (the commented line in my code) and using transpose command. The problem was the output was exactly the same as the input!
Afterwards, I added another column to the input file, ran the code and surprisingly this time the output was completely fine (The output contained two rows!)
So my question is:
Is there any way to transpose a single-column file to a single-row one? If yes, could you please describe how?
You can use numpy.reshape to transpose data and change the shape of your array like the following:
>>> import numpy as np
>>> arr=np.loadtxt('column.txt')
>>> arr
array([ 1., 2., 3., 4., 9., 8.])
>>> arr.shape
(6,)
>>> arr=arr.reshape(6,1)
>>> arr
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 9.],
[ 8.]])
or you can just pass the minimum number of dimensions to numpy.loadtxt via its ndmin argument
>>> np.loadtxt('column.txt', ndmin=2)
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 9.],
[ 8.]])
But if you want to convert a single column to a single row and write it to a file, you just need to do the following:
>>> parr = arr.reshape(1, len(arr))
>>> np.savetxt('row.txt', parr, fmt='%i')
If your input data consists of only a single column, np.loadtxt() will return a one-dimensional array. Transposing basically means reversing the order of the axes. For a one-dimensional array with only a single axis, this is a no-op. You can convert the array into a two-dimensional array in many different ways, and transposing will then work as expected, e.g.
x = np.atleast_2d(np.loadtxt('column.txt'))
It is because the transpose of a 1D array is the same as itself, as there is no other dimension to transpose to.
You could try adding a 2nd dimension by doing this,
>>> import numpy as np
>>> x = np.array([[1], [2], [3], [4], [9], [8]])
>>> x.T
array([[1, 2, 3, 4, 9, 8]])
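The points made in these answers can be reproduced without a file on disk. The sketch below feeds the same six numbers to np.loadtxt through io.StringIO, which is my stand-in for column.txt, and prints the shapes before and after transposing.

```python
import io
import numpy as np

text = "1\n2\n3\n4\n9\n8\n"   # stand-in for the contents of column.txt

x = np.loadtxt(io.StringIO(text))
print(x.shape, x.T.shape)        # transposing a 1D array changes nothing

col = np.loadtxt(io.StringIO(text), ndmin=2)
print(col.shape, col.T.shape)    # a 2D column transposes to a row

row = x.reshape(1, -1)           # or go straight from 1D to a single row
print(row.shape)
```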

Assigning values to two dimensional array from two one dimensional ones

Most probably somebody else already asked this but I couldn't find it. The question is how can I assign values to a 2D array from two 1D arrays. For example:
import numpy as np
#a is the 2D array. b is the 1D array and should be assigned
#to second coordinate. In this example the first coordinate is 1.
a=np.zeros((3,2))
b=np.asarray([1,2,3])
c=np.ones(3)
a=np.vstack((c,b)).T
output:
[[ 1. 1.]
[ 1. 2.]
[ 1. 3.]]
I know the way I am doing it is naive, and I am sure there should be a one-line way of doing this.
P.S. In the real case that I am dealing with, this is a subarray of an array, and therefore I cannot set the first coordinate to one from the beginning. The first coordinates across the whole array differ, but after applying np.where they become constant.
How about 2 lines?
>>> c = np.ones((3, 2))
>>> c[:, 1] = [1, 2, 3]
And the proof it works:
>>> c
array([[ 1., 1.],
[ 1., 2.],
[ 1., 3.]])
Or, perhaps you want np.column_stack:
>>> np.column_stack(([1.,1,1],[1,2,3]))
array([[ 1., 1.],
[ 1., 2.],
[ 1., 3.]])
First, there's absolutely no reason to create the original zeros array that you stick in a, never reference, and replace with a completely different array with the same name.
Second, if you want to create an array the same shape and dtype as b but with all ones, use ones_like.
So:
b = np.array([1,2,3])
c = np.ones_like(b)
d = np.vstack((c, b)).T
You could of course expand b to a 3x1-array instead of a 3-array, in which case you can use hstack instead of needing to vstack then transpose… but I don't think that's any simpler:
b = np.array([1,2,3])
b = np.expand_dims(b, 1)
c = np.ones_like(b)
d = np.hstack((c, b))
If you insist on 1 line, use tuple assignment to the two column slices:
>>> a[:,0],a[:,1]=[1,1,1],[1,2,3]
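Putting the suggestions from this thread together, here is a minimal sketch of the two main routes: building a new (3, 2) array from the 1D pieces, or filling the columns of an array that already exists (which is closer to the subarray case mentioned in the question). Variable names follow the question where possible.

```python
import numpy as np

b = np.array([1, 2, 3])

# Build the (3, 2) array directly from two 1D arrays
d = np.column_stack((np.ones_like(b), b))
print(d)

# Or fill the columns of an existing array (or a slice of a larger one)
a = np.zeros((3, 2))
a[:, 0] = 1      # the scalar is broadcast down the column
a[:, 1] = b
print(a)
```

Note that ones_like keeps the integer dtype of b, so d is an integer array here, while a stays float because it was created with np.zeros.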
