Problems on how to transpose single column data in python - python

I created a text file called 'column.txt' containing the following data:
1
2
3
4
9
8
Then I wrote the code below to transpose my data to a single-row text file.
import numpy as np
x=np.loadtxt('column.txt')
z=x.T
y=x.transpose()
np.savetxt('row.txt',y, fmt='%i')
I tried two different ways: matrix multiplication (the commented line in my code) and the transpose command. The problem was that the output was exactly the same as the input!
Afterwards, I added another column to the input file, ran the code and surprisingly this time the output was completely fine (The output contained two rows!)
So my question is:
Is there any way to transpose a single-column file into a single-row one? If yes, could you please describe how?

You can use numpy.reshape to transpose data and change the shape of your array like the following:
>>> import numpy as np
>>> arr=np.loadtxt('column.txt')
>>> arr
array([ 1., 2., 3., 4., 9., 8.])
>>> arr.shape
(6,)
>>> arr=arr.reshape(6,1)
>>> arr
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 9.],
[ 8.]])
Or you can pass the minimum number of array dimensions (ndmin) to the numpy.loadtxt function:
>>> np.loadtxt('column.txt', ndmin=2)
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 9.],
[ 8.]])
But if you want to convert a single column to a single row and write it to a file, you just need to do the following:
>>> parr=arr.reshape(1,len(arr))
>>> np.savetxt('row.txt',parr, fmt='%i')

If your input data consists of only a single column, np.loadtxt() will return a one-dimensional array. Transposing basically means reversing the order of the axes. For a one-dimensional array with only a single axis, this is a no-op. You can convert the array into a two-dimensional array in many different ways, and transposing will work as expected for the two-dimensional array, e.g.
x = np.atleast_2d(np.loadtxt('column.txt'))
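To make the no-op concrete, here is a minimal sketch. It reads the question's six values from an in-memory io.StringIO buffer rather than from 'column.txt'; that substitution is an assumption made only so the snippet runs without files on disk. Note that np.atleast_2d prepends the new axis, so the result already has shape (1, 6) and needs no further transpose to become a single row.

```python
import io
import numpy as np

# Stand-in for 'column.txt': one value per line (assumed to match the question's data).
column_text = "1\n2\n3\n4\n9\n8\n"

x = np.loadtxt(io.StringIO(column_text))
print(x.shape)    # (6,)  -> one-dimensional
print(x.T.shape)  # (6,)  -> transposing a 1-D array changes nothing

# atleast_2d prepends an axis, so the result is already a single row.
row = np.atleast_2d(x)
print(row.shape)  # (1, 6)

# Write the row to an in-memory buffer; a filename such as 'row.txt' works the same way.
out = io.StringIO()
np.savetxt(out, row, fmt='%i')
print(out.getvalue())  # 1 2 3 4 9 8
```

If a column is needed instead, row.T has shape (6, 1), because the array now has two axes to swap.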

It is because the transpose of a 1D array is the same as itself, as there is no other dimension to transpose to.
You could try adding a 2nd dimension by doing this,
>>> import numpy as np
>>> x = np.array([[1], [2], [3], [4], [9], [8]])
>>> x.T
array([[1, 2, 3, 4, 9, 8]])

Related

Add a level to Numpy array

I have a problem with a numpy array.
In particular, suppose I have a matrix
x = np.array([[1., 2., 3.], [4., 5., 6.]])
with shape (2,3). I want to wrap each float in its own list, so as to obtain the array [[[1.], [2.], [3.]], [[4.], [5.], [6.]]] with shape (2,3,1).
I tried to convert each float to a list (i.e., x[0][0] = [x[0][0]]), but it does not work.
Can anyone help me? Thanks
What you want is adding another dimension to your numpy array. One way of doing it is using reshape:
x = x.reshape(2,3,1)
output:
[[[1.]
[2.]
[3.]]
[[4.]
[5.]
[6.]]]
There is a function in NumPy to perform exactly what @Valdi_Bo mentions. You can use np.expand_dims and add a new dimension along axis 2, as follows:
x = np.expand_dims(x, axis=2)
Refer:
np.expand_dims
Actually, you want to add a dimension (not level).
To do it, run:
result = x[...,np.newaxis]
Its shape is just (2, 3, 1).
Or save the result back under x.
You are trying to add a new dimension to the numpy array. There are multiple ways of doing this, as other answers mentioned: np.expand_dims, np.newaxis, np.reshape etc. But I usually use the following, as I find it the most readable, especially when you are vectorizing operations over multiple tensors with complex broadcasting (check this Bounty question that I solved with this method).
x[:,:,None].shape
(2,3,1)
x[None,:,None,:,None].shape
(1,2,1,3,1)
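Well, maybe this is overkill for the array you have, but another zero-copy option is np.lib.stride_tricks.as_strided. No data is copied this way, though note that reshape and np.newaxis indexing also return views of x here, so the advantage over them is not in avoiding a copy.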
import numpy as np
x = np.array([[1., 2., 3.], [4., 5., 6.]])
newshape = x.shape[:-1] + (x.shape[-1], 1)
newstrides = x.strides + x.strides[-1:]
a = np.lib.stride_tricks.as_strided(x, shape=newshape, strides=newstrides)
results in:
array([[[1.],
[2.],
[3.]],
[[4.],
[5.],
[6.]]])
>>> a.shape
(2, 3, 1)
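As a rough check on the approaches in this thread, the sketch below builds the (2, 3, 1) array four ways and uses np.shares_memory to see whether each result is a view of x rather than a copy. The variable names a, b, c and d are chosen here for illustration only.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])

a = x.reshape(2, 3, 1)          # explicit target shape
b = np.expand_dims(x, axis=2)   # insert a new axis at position 2
c = x[..., np.newaxis]          # append a trailing axis
d = x[:, :, None]               # None is an alias for np.newaxis

for name, arr in [("reshape", a), ("expand_dims", b), ("newaxis", c), ("None", d)]:
    # Each result has shape (2, 3, 1) and shares its buffer with x.
    print(name, arr.shape, np.shares_memory(x, arr))

# Because these are views, writing through one changes the original array.
c[0, 0, 0] = 100.0
print(x[0, 0])  # 100.0
```

For a contiguous array like this one, all four are views, so the choice between them is mostly a matter of readability.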

Elegant solution to appending vector to matrix in Numpy?

I've seen others post on this, but it's not clear to me if there's a better solution. I've got a 2D NumPy array, and I'd like to append a column to it. For example:
import numpy as np
A = np.array([[2., 3.],[-1., -2.]])
e = np.ones(2)
print(A)
print(e)
B = np.hstack((A,e.reshape((2,1))))
print(B)
does exactly what I want. But is there a way to avoid this clunky use of reshape?
If you want to avoid using reshape then you have to be appending a column of the right dimensions:
e = np.ones((2, 1))
B = np.hstack((A,e))
Note the modification to the call to ones. The reason you have to use reshape at the moment is that numpy does not regard an array of shape (2,) as the same as an array of shape (2, 1). The first is a 1D array; the second is a 2D array where the size of one of the dimensions is 1.
My nomination for a direct solution is
np.concatenate((A, e[:, None]), axis=1)
The [:,None] turns e into a (2,1) which can be joined to the (2,2) to produce a (2,3). Reshape does the same, but isn't as syntactically pretty.
Solutions using hstack, vstack, and c_ do the same thing but hide one or more details.
In this case I think column_stack hides the most details.
np.column_stack((A, e))
Under the covers this does:
np.concatenate((A, np.array(e, copy=False, ndmin=2).T), axis=1)
That np.array(... ndmin=2).T is yet another way of doing the reshape.
There are many solutions. I like np.c_ which treats 1d inputs as columns (hence c) resulting in a concise, clutter-free, easy to read:
np.c_[A, e]
# array([[ 2., 3., 1.],
# [-1., -2., 1.]])
As Tim B says, to hstack you need a (2,1) array. Alternatively (keeping your e as a one-dimensional array), vstack to the transpose, and take the transpose:
In [11]: np.vstack((A.T, e)).T
Out[11]:
array([[ 2., 3., 1.],
[-1., -2., 1.]])
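The approaches from these answers can be compared side by side. The following sketch reuses A and e from the question and checks that each variant yields the same (2, 3) result; the dictionary name candidates and the array expected are introduced here for illustration only.

```python
import numpy as np

A = np.array([[2., 3.], [-1., -2.]])
e = np.ones(2)

# Each entry appends e to A as a new last column.
candidates = {
    "hstack + reshape": np.hstack((A, e.reshape((2, 1)))),
    "concatenate + None": np.concatenate((A, e[:, None]), axis=1),
    "column_stack": np.column_stack((A, e)),
    "c_": np.c_[A, e],
    "vstack + T": np.vstack((A.T, e)).T,
}

expected = np.array([[2., 3., 1.], [-1., -2., 1.]])
for name, B in candidates.items():
    # Prints the method name, the result shape, and whether it matches.
    print(name, B.shape, np.array_equal(B, expected))
```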

List/Array of strings to numpy float array

I am new to scikit learn and numpy. How can I represent my dataset made of list/array of strings eg
[["aa bb","a","bbb","à"], ["bb cc","c","ddd","à"], ["kkk","a","","a"]]
to a numpy array of dtype float?
I think what you're looking for is a numeric representation of your words. You can use gensim to map each word to a token id and from that create your numpy arrays as follows:
import numpy as np
from gensim import corpora
toconvert = [["aa bb","a","bbb","à"], ["bb", "cc","c","ddd","à"], ["kkk","a","","a"]]
# convert your list of lists into token id's. For example, 'aa bb' could be represented as a 2, a as a 1, etc.
tdict = corpora.Dictionary(toconvert)
# given nested structure, you can append nested numpy arrays
newlist = []
for l in toconvert:
    tmplist = []
    for word in l:
        # append to intermediate list the id for the given word under observation
        tmplist.append(tdict.token2id[word])
    # convert to numpy array and append to main list
    newlist.append(np.array(tmplist).astype(float)) # type float
print(newlist) # desired output: [array([ 2., 0., 1., 0.]), array([ 5., 3., 4., 6., 0.]), array([ 7., 0., 8., 0.])]
# and to see what id's represent which strings:
tdict[0] # 'a'
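If gensim is not available, the same idea can be sketched with a plain dictionary. This is not gensim's implementation and the ids it assigns will generally differ from gensim's; here each string simply gets the next unused integer in order of first appearance. The name token2id mirrors the attribute used above but is an ordinary dict in this sketch.

```python
import numpy as np

toconvert = [["aa bb", "a", "bbb", "à"],
             ["bb", "cc", "c", "ddd", "à"],
             ["kkk", "a", "", "a"]]

# Assign ids in order of first appearance (an arbitrary but reproducible choice).
token2id = {}
for row in toconvert:
    for word in row:
        token2id.setdefault(word, len(token2id))

# The rows have different lengths, so keep a list of 1-D float arrays.
newlist = [np.array([token2id[w] for w in row], dtype=float) for row in toconvert]

for arr in newlist:
    print(arr)
```

Such ids are only labels; whether they are meaningful as float features depends on the model they are fed into.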

fill off diagonal of numpy array fails

I'm trying to fill the offset diagonals of a matrix:
loss_matrix = np.zeros((125,125))
np.diagonal(loss_matrix, 3).fill(4)
ValueError: assignment destination is read-only
Two questions:
1) Without iterating over indexes, how can I set the offset diagonals of a numpy array?
2) Why is the result of np.diagonal read only? The documentation for numpy.diagonal reads: "In NumPy 1.10, it will return a read/write view and writing to the returned array will alter your original array."
np.__version__
'1.10.1'
Judging by the discussion on the NumPy issue tracker, it looks like the feature is stuck in limbo and they never got around to fixing the documentation to say it was delayed.
If you need writability, you can force it. This will only work on NumPy 1.9 and up, since np.diagonal makes a copy on lower versions:
diag = np.diagonal(loss_matrix, 3)
# It's not writable. MAKE it writable.
diag.setflags(write=True)
diag.fill(4)
In an older version, diagflat constructs an array from a diagonal.
In [180]: M=np.diagflat(np.ones(125-3)*4,3)
In [181]: M.shape
Out[181]: (125, 125)
In [182]: M.diagonal(3)
Out[182]:
array([ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.,... 4.])
In [183]: np.__version__
Out[183]: '1.8.2'
Effectively it does this (working from its Python code)
res = np.zeros((125, 125))
i = np.arange(122)
fi = i+3+i*125
res.flat[fi] = 4
That is, it finds the flatten array equivalent indices of the diagonal.
I can also get fi with:
In [205]: i=np.arange(0,122)
In [206]: np.ravel_multi_index((i,i+3),(125,125))
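The index arithmetic above also gives a way to write to an offset diagonal in place, without relying on the writability of np.diagonal's result. The sketch below is one possible approach, not the one the library documentation prescribes; the names n, k, value and other are introduced here for illustration.

```python
import numpy as np

n, k, value = 125, 3, 4
loss_matrix = np.zeros((n, n))

# Row indices 0..n-k-1 paired with column indices k..n-1 address the k-th superdiagonal.
i = np.arange(n - k)
loss_matrix[i, i + k] = value

# The same positions expressed as flat indices, as in the diagflat walkthrough.
flat_idx = np.ravel_multi_index((i, i + k), (n, n))
other = np.zeros((n, n))
other.flat[flat_idx] = value

print(np.array_equal(loss_matrix, other))   # both methods fill the same entries
print(np.diagonal(loss_matrix, k)[:5])      # first few values on the offset diagonal
```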

Assigning values to two dimensional array from two one dimensional ones

Most probably somebody else already asked this but I couldn't find it. The question is how can I assign values to a 2D array from two 1D arrays. For example:
import numpy as np
#a is the 2D array. b is the 1D array and should be assigned
#to second coordinate. In this example the first coordinate is 1.
a=np.zeros((3,2))
b=np.asarray([1,2,3])
c=np.ones(3)
a=np.vstack((c,b)).T
output:
[[ 1. 1.]
[ 1. 2.]
[ 1. 3.]]
I know the way I am doing it is naive, but I am sure there should be a one-line way of doing this.
P.S. In real case that I am dealing with, this is a subarray of an array, and therefore I cannot set the first coordinate from the beginning to one. The whole array's first coordinate are different, but after applying np.where they become constant.
How about 2 lines?
>>> c = np.ones((3, 2))
>>> c[:, 1] = [1, 2, 3]
And the proof it works:
>>> c
array([[ 1., 1.],
[ 1., 2.],
[ 1., 3.]])
Or, perhaps you want np.column_stack:
>>> np.column_stack(([1.,1,1],[1,2,3]))
array([[ 1., 1.],
[ 1., 2.],
[ 1., 3.]])
First, there's absolutely no reason to create the original zeros array that you stick in a, never reference, and replace with a completely different array with the same name.
Second, if you want to create an array the same shape and dtype as b but with all ones, use ones_like.
So:
b = np.array([1,2,3])
c = np.ones_like(b)
d = np.vstack((c, b)).T
You could of course expand b to a 3x1-array instead of a 3-array, in which case you can use hstack instead of needing to vstack then transpose… but I don't think that's any simpler:
b = np.array([1,2,3])
b = np.expand_dims(b, 1)
c = np.ones_like(b)
d = np.hstack((c, b))
If you insist on 1 line, assign both columns at once with tuple unpacking (a must already exist with shape (3, 2)):
>>> a[:,0],a[:,1]=[1,1,1],[1,2,3]
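As a compact summary of the answers above, the following sketch builds the (3, 2) array with np.column_stack and, separately, fills the columns of an existing array in place. It assumes the first coordinate is the constant 1, as in the question; the names pairs and a are illustrative.

```python
import numpy as np

b = np.array([1, 2, 3])

# One line: stack a column of ones next to b.
pairs = np.column_stack((np.ones_like(b), b))
print(pairs)

# In place: write into the columns of an array that already exists.
a = np.zeros((3, 2))
a[:, 0], a[:, 1] = 1, b   # the scalar 1 is broadcast down the first column
print(a)
```

The in-place form is the one that applies when the target is a slice of a larger array, as described in the question's P.S.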
