About NumPy array in Python [closed] - python

What is the difference between:
import numpy as np
A = np.zeros((3,))
and
import numpy as np
B = np.zeros((1,3))
Thanks for your answer!

Hope these illustrate the difference in practice.
>>> A = np.zeros((3,))
>>> B = np.zeros((1,3))
>>> A #no column, just 1D
array([ 0., 0., 0.])
>>> B # 2D: one row, three columns
array([[ 0., 0., 0.]])
>>> A.shape
(3,)
>>> B.shape
(1, 3)
>>> A[1]
0.0
>>> B[1] # can't do this: it asks for the 2nd row, but there is only one row.
Traceback (most recent call last):
File "<pyshell#89>", line 1, in <module>
B[1]
IndexError: index 1 is out of bounds for axis 0 with size 1
>>> B[0] # but you can take the 1st row
array([ 0., 0., 0.])
>>> B[:,1] # take the 2nd column, giving one value per row
array([ 0.])
>>> B[0,1] # how to get the same result as A[1]: take the 2nd cell of the 1st row.
0.0

The first one creates a 1D numpy.array of zeros:
>>> import numpy as np
>>> A = np.zeros((3,))
>>> A
array([ 0., 0., 0.])
>>> A[0]
0.0
>>>
The second creates a 2D numpy.array of 1 row and 3 columns, filled with zeros:
>>> import numpy as np
>>> B = np.zeros((1,3))
>>> B
array([[ 0., 0., 0.]])
>>> B[0]
array([ 0., 0., 0.])
>>>
Here is a reference on numpy.zeros and one on numpy.array if you want further details.

A is a one-dimensional array with three elements.
B is a two-dimensional array with one row and three columns.
You could also use C = np.zeros((3,1)) which would create a two-dimensional array with three rows and one column.
A, B, and C hold the same elements; the difference is in how later calls interpret them. Some NumPy functions operate on a specific dimension, or can be told to operate on one. Take sum, shown here on arrays of ones with the same shapes as A and B so that the results are visible:
>>> np.sum(np.ones((3,)), 0)
3.0
>>> np.sum(np.ones((1,3)), 0)
array([ 1., 1., 1.])
They also have different behavior with matrix/tensor operations like dot, and also operations like hstack and vstack.
If all you are going to use is vectors, form A will usually do what you want. The extra 'singleton' dimension (i.e., a dimension of size 1) is just extra cruft you have to keep track of. However, if you need to interact with 2d arrays it is likely that you will have to distinguish between row vectors and column vectors. In that case forms B and C will be useful.
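To make the point about later calls concrete, here is a minimal sketch comparing the three shapes under sum and dot. It uses arrays of ones rather than zeros (an assumption made purely so the results are not all zero); the variable names are illustrative.

```python
import numpy as np

A = np.ones((3,))     # 1D array, shape (3,)
B = np.ones((1, 3))   # 2D row vector, shape (1, 3)
C = np.ones((3, 1))   # 2D column vector, shape (3, 1)

# Summing along axis 0 removes that axis from the result.
sum_A = np.sum(A, axis=0)   # 0-d result (a scalar)
sum_B = np.sum(B, axis=0)   # shape (3,)
sum_C = np.sum(C, axis=0)   # shape (1,)

# dot is an inner product for 1D inputs and a matrix product for 2D inputs.
inner = np.dot(A, A)           # scalar
row_times_col = np.dot(B, C)   # shape (1, 1)
col_times_row = np.dot(C, B)   # shape (3, 3)

print(sum_A, sum_B.shape, sum_C.shape)
print(np.shape(inner), row_times_col.shape, col_times_row.shape)
```

The same data gives a scalar, a 1x1 matrix, or a 3x3 matrix depending only on which singleton dimensions are present.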

Related

Elegant solution to appending vector to matrix in Numpy?

I've seen others post on this, but it's not clear to me if there's a better solution. I've got a 2D NumPy array, and I'd like to append a column to it. For example:
import numpy as np
A = np.array([[2., 3.],[-1., -2.]])
e = np.ones(2)
print(A)
print(e)
B = np.hstack((A,e.reshape((2,1))))
print(B)
does exactly what I want. But is there a way to avoid this clunky use of reshape?
If you want to avoid using reshape then you have to be appending a column of the right dimensions:
e = np.ones((2, 1))
B = np.hstack((A,e))
Note the modification to the call to ones. The reason you have to use reshape at the moment is that numpy does not regard an array of shape (2,) to be the same as an array of shape (2, 1). The second is a 2D array in which one of the dimensions has size 1.
My nomination for a direct solution is
np.concatenate((A, e[:, None]), axis=1)
The [:,None] turns e into a (2,1) which can be joined to the (2,2) to produce a (2,3). Reshape does the same, but isn't as syntactically pretty.
Solutions using hstack, vstack, and c_ do the same thing but hide one or more details.
In this case I think column_stack hides the most details.
np.column_stack((A, e))
Under the covers this is roughly equivalent to:
np.concatenate((A, np.array(e, copy=False, ndmin=2).T), axis=1)
That np.array(... ndmin=2).T is yet another way of doing the reshape.
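As a quick check that these approaches really are interchangeable, the sketch below builds the result three ways using the A and e from the question and compares each against the expected array. The `candidates` dictionary and `expected` array are illustrative names, not part of any answer above.

```python
import numpy as np

A = np.array([[2., 3.], [-1., -2.]])
e = np.ones(2)  # shape (2,), a plain 1D vector

expected = np.array([[2., 3., 1.], [-1., -2., 1.]])

# Each entry appends e as a new last column of A.
candidates = {
    "hstack + reshape": np.hstack((A, e.reshape((2, 1)))),
    "concatenate + None": np.concatenate((A, e[:, None]), axis=1),
    "column_stack": np.column_stack((A, e)),
}

for name, result in candidates.items():
    print(name, result.shape, np.array_equal(result, expected))
```

Which one to prefer is mostly a matter of taste: the explicit versions show the (2, 1) shape being created, while column_stack hides it.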
There are many solutions. I like np.c_ which treats 1d inputs as columns (hence c) resulting in a concise, clutter-free, easy to read:
np.c_[A, e]
# array([[ 2., 3., 1.],
# [-1., -2., 1.]])
As Tim B says, to hstack you need a (2,1) array. Alternatively (keeping your e as a one-dimensional array), vstack to the transpose, and take the transpose:
In [11]: np.vstack((A.T, e)).T
Out[11]:
array([[ 2., 3., 1.],
[-1., -2., 1.]])

Problems on how to transpose single column data in python

I created a text file called 'column.txt' containing the following data:
1
2
3
4
9
8
Then I wrote the code below to transpose my data to a single-row text file.
import numpy as np
x=np.loadtxt('column.txt')
z=x.T
y=x.transpose()
np.savetxt('row.txt',y, fmt='%i')
I tried two different ways - using matrix multiplication (the commented line in my code) and using transpose command. The problem was the output was exactly the same as the input!
Afterwards, I added another column to the input file, ran the code and surprisingly this time the output was completely fine (The output contained two rows!)
So my question is:
Is there any way to transpose a single-column file to a single-row one? If yes, could you please describe how?
You can use reshape to give the array an explicit second dimension, for example:
>>> import numpy as np
>>> arr=np.loadtxt('column.txt')
>>> arr
array([ 1., 2., 3., 4., 9., 8.])
>>> arr.shape
(6,)
>>> arr=arr.reshape(6,1)
>>> arr
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 9.],
[ 8.]])
Or you can pass the minimum number of dimensions to numpy.loadtxt through its ndmin argument:
>>> np.loadtxt('column.txt', ndmin=2)
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 9.],
[ 8.]])
But if you want to convert the single column to a single row and write it to a file, you only need the following:
>>> parr=arr.reshape(1,len(arr))
>>> np.savetxt('row.txt',parr, fmt='%i')
If your input data only consists of a single column, np.loadtxt() will return a one-dimensional array. Transposing basically means reversing the order of the axes. For a one-dimensional array with only a single axis, this is a no-op. You can convert the array into a two-dimensional array in many different ways, and transposing will work as expected for the two-dimensional array, e.g.
x = np.atleast_2d(np.loadtxt('column.txt'))
It is because the transpose of a 1D array is the same as itself, as there is no other dimension to transpose to.
You could try adding a 2nd dimension by doing this,
>>> import numpy as np
>>> x = np.array([[1], [2], [3], [4], [9], [8]])
>>> x.T
array([[1, 2, 3, 4, 9, 8]])
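The answers above can be summarised in a short sketch. To keep it self-contained, it uses an in-memory array as a stand-in for the result of np.loadtxt('column.txt') (an assumption; the file itself is not read here).

```python
import numpy as np

# Stand-in for np.loadtxt('column.txt'): a 1D array with a single axis.
x = np.array([1., 2., 3., 4., 9., 8.])

# Transposing a 1D array changes nothing, because there is only one axis.
print(x.T.shape)

# Add a second axis first; then the transpose has something to swap.
column = x[:, np.newaxis]     # shape (6, 1)
row = column.T                # shape (1, 6)

# np.atleast_2d goes straight to the single-row form.
also_row = np.atleast_2d(x)   # shape (1, 6)

print(column.shape, row.shape, also_row.shape)
```

Passing `row` to np.savetxt would then write all six values on one line.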

Normalise 2D Numpy Array: Zero Mean Unit Variance

I have a 2D NumPy array in which I want to normalise each column to zero mean and unit variance. Since I'm primarily used to C++, my current approach is to loop over the elements of a column, do the necessary operations, and then repeat this for every column. I would like to know a Pythonic way to do this.
Let class_input_data be my 2D array. I can get the column mean as:
column_mean = numpy.sum(class_input_data, axis = 0)/class_input_data.shape[0]
I then subtract the mean from all columns by:
class_input_data = class_input_data - column_mean
By now, the data should be zero mean. However, the value of:
numpy.sum(class_input_data, axis = 0)
isn't equal to 0, implying that I have done something wrong in my normalisation. By isn't equal to 0, I don't mean very small numbers which can be attributed to floating point inaccuracies.
Something like:
import numpy as np
eg_array = 5 + (np.random.randn(10, 10) * 2)
normed = (eg_array - eg_array.mean(axis=0)) / eg_array.std(axis=0)
normed.mean(axis=0)
Out[14]:
array([ 1.16573418e-16, -7.77156117e-17, -1.77635684e-16,
9.43689571e-17, -2.22044605e-17, -6.09234885e-16,
-2.22044605e-16, -4.44089210e-17, -7.10542736e-16,
4.21884749e-16])
normed.std(axis=0)
Out[15]: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
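The same idea can be wrapped in a small helper. This is only a sketch: `standardize_columns` is a hypothetical name, it uses the population standard deviation (NumPy's default, ddof=0), and it assumes no column is constant, since a zero standard deviation would lead to division by zero.

```python
import numpy as np

def standardize_columns(data):
    """Return a copy of data with each column shifted to zero mean
    and scaled to unit variance (assumes no column is constant)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)   # one mean per column, shape (n_cols,)
    std = data.std(axis=0)     # one std per column, shape (n_cols,)
    # Broadcasting applies the per-column values to every row.
    return (data - mean) / std

rng = np.random.default_rng(0)
sample = 5 + 2 * rng.standard_normal((10, 4))
normed = standardize_columns(sample)

# Means are zero and standard deviations are one, up to rounding error.
print(np.allclose(normed.mean(axis=0), 0.0))
print(np.allclose(normed.std(axis=0), 1.0))
```

Comparing with np.allclose rather than `== 0` matters here, because the column means come out as tiny nonzero values on the order of 1e-16.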

Assigning values to two dimensional array from two one dimensional ones

Most probably somebody else already asked this but I couldn't find it. The question is how can I assign values to a 2D array from two 1D arrays. For example:
import numpy as np
#a is the 2D array. b is the 1D array and should be assigned
#to second coordinate. In this example the first coordinate is 1.
a=np.zeros((3,2))
b=np.asarray([1,2,3])
c=np.ones(3)
a=np.vstack((c,b)).T
output:
[[ 1. 1.]
[ 1. 2.]
[ 1. 3.]]
I know the way I am doing it is naive, but I am sure there should be a one-line way of doing this.
P.S. In the real case I am dealing with, this is a subarray of a larger array, so I cannot set the first coordinate to one from the beginning. The first coordinates of the whole array differ, but after applying np.where they become constant.
How about 2 lines?
>>> c = np.ones((3, 2))
>>> c[:, 1] = [1, 2, 3]
And the proof it works:
>>> c
array([[ 1., 1.],
[ 1., 2.],
[ 1., 3.]])
Or, perhaps you want np.column_stack:
>>> np.column_stack(([1.,1,1],[1,2,3]))
array([[ 1., 1.],
[ 1., 2.],
[ 1., 3.]])
First, there's absolutely no reason to create the original zeros array that you stick in a, never reference, and replace with a completely different array with the same name.
Second, if you want to create an array the same shape and dtype as b but with all ones, use ones_like.
So:
b = np.array([1,2,3])
c = np.ones_like(b)
d = np.vstack((c, b)).T
You could of course expand b to a 3x1-array instead of a 3-array, in which case you can use hstack instead of needing to vstack then transpose… but I don't think that's any simpler:
b = np.array([1,2,3])
b = np.expand_dims(b, 1)
c = np.ones_like(b)
d = np.hstack((c, b))
If you insist on 1 line, assign both columns at once with tuple unpacking (a must already exist with shape (3, 2)):
>>> a[:,0],a[:,1]=[1,1,1],[1,2,3]

Add a vector to array

A really stupid question, but I could not figure the right way..
A is a 2 by 2 matrix, and B is a 2 by 1 matrix.
In a 10 iterations loop, B_new=A*B. B_new is 2 by 1.
Save B_new to an output matrix B_final after each iteration. So in the end, B_final is 2 by 10.
However, I have a problem adding B_new to B_final in the loop. Below is my code; can anyone give me some suggestions?
import numpy as np
a=np.ones(shape=(2,2))
b=np.ones(shape=(2,1))
c_final=np.zeros(shape=(2,10))
for i in range(0,10):
    c=np.dot(a,b)
    b=c
    c_final[:,i]=c
Here is the error message:
c_final[:,i]=c
ValueError: output operand requires a reduction, but reduction is not enabled
The error you're seeing is because when numpy broadcasts c_final[:,i] and np.dot(a,b) together it produces an array with shape (2,2), which can't be assigned to c_final[:,i] since that has a shape of (2,). I think it's much clearer if you just play around with it in the interpreter:
>>> import numpy as np
>>> a = np.ones((2,2))
>>> b = np.ones((2,1))
>>> c_final = np.zeros((2,10))
>>> np.dot(a,b)
array([[ 2.],
[ 2.]])
>>> np.dot(a,b).shape
(2, 1)
>>> c_final[:,0]
array([ 0., 0.])
>>> c_final[:,0].shape
(2,)
>>> np.broadcast(c_final[:,0],np.dot(a,b)).shape
(2, 2)
The way around this is to flatten np.dot(a,b) by using np.squeeze or something similar so that when they are broadcast together they produce a 2 element array. For example:
>>> c_final[:,0] = np.dot(a,b).squeeze()
You're not alone in finding the error message unhelpful. Someone filed a ticket about this about a year ago.
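Putting the fix back into the original loop gives something like the sketch below. It takes the single column of the (2, 1) result with `[:, 0]` instead of squeeze; that is a substitution on my part, and either should work here. The exact wording of the error raised by the unfixed code may also differ between NumPy versions.

```python
import numpy as np

a = np.ones((2, 2))
b = np.ones((2, 1))
c_final = np.zeros((2, 10))

for i in range(10):
    b = np.dot(a, b)          # matrix product, shape (2, 1)
    # Take the single column as a 1D array of shape (2,),
    # which matches the shape of the slice c_final[:, i].
    c_final[:, i] = b[:, 0]

print(c_final[:, :3])
```

With these all-ones inputs every entry doubles on each iteration, so column i holds 2 ** (i + 1).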
