Matrix-like printing of 2D arrays in Python - python

Say I have a matrix in a numpy array in Python
In [3]: my_matrix
Out[3]:
array([[ 2., 2., 2., 2., 2., 2., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 2., 2., 2., 2., 0., 0., 0.,
0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 2., 2., 2.,
2., 2., 2., 2., 2.]])
Is there a way to have Python/IPython print my array like this (similar to the way MATLAB does it)?
[ 2 2 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 2 2 2 2 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 2 ]
Also, I have noticed that IPython does not use the full width of my terminal when printing numpy arrays. Other functions do (e.g. pprint.pprint). How can I change that?

Use numpy.set_printoptions. To increase the line width:
np.set_printoptions(linewidth=150)
Replace 150 with whatever you need. Now, to print as you asked (I guess that means without the decimal point):
print(my_matrix.astype('i'))
If you have floating point values, you can also control the precision of printouts with the precision option:
np.set_printoptions(precision=3)
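As a minimal sketch of the options above, the following rebuilds the matrix from the question and prints it without decimal points. It assumes NumPy 1.15 or later for the np.printoptions context manager, which applies the settings temporarily instead of globally. The matlab_style function is a hypothetical helper written for this illustration, not part of NumPy.

```python
import numpy as np

# Rebuild the 3x18 matrix from the question.
my_matrix = np.zeros((3, 18))
my_matrix[0, :6] = 2
my_matrix[1, 6:10] = 2
my_matrix[2, 10:] = 2

# Temporarily widen the line width, then format as integers (no decimal points).
with np.printoptions(linewidth=150):
    text = np.array2string(my_matrix.astype(int))
print(text)


def matlab_style(a):
    """Hypothetical helper: format a 2D array with MATLAB-like row separators."""
    rows = [" ".join(str(v) for v in row) for row in a.astype(int)]
    return "[ " + ";\n  ".join(rows) + " ]"


print(matlab_style(my_matrix))
```

If the wider line width should apply to the whole session, np.set_printoptions(linewidth=150) as in the answer is the simpler choice.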

Related

Expanding a matrix [duplicate]

Given a matrix, such as:
1 0 0
0 1 1
1 1 0
I would like to expand each element to a "sub-matrix" of size AxA, e.g., 3x3, the result will be:
1 1 1 0 0 0 0 0 0
1 1 1 0 0 0 0 0 0
1 1 1 0 0 0 0 0 0
0 0 0 1 1 1 1 1 1
0 0 0 1 1 1 1 1 1
0 0 0 1 1 1 1 1 1
1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0 0
What is the fastest way of doing it in Python using numpy (or PyTorch)?
What you're describing is the Kronecker product, so use np.kron:
Computes the Kronecker product, a composite array made of blocks of the second array scaled by the first.
x = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0]])
np.kron(x, np.ones((3, 3)))
array([[1., 1., 1., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 1., 1., 1., 1., 1.],
[0., 0., 0., 1., 1., 1., 1., 1., 1.],
[0., 0., 0., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 0., 0., 0.]])
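For comparison, here is a small sketch of two equivalent ways to do the expansion. The variable name A for the block size is taken from the question; passing dtype to np.ones is an optional tweak that keeps the result as integers instead of floats. Which variant is faster likely depends on the array size, so it is worth timing both on your own data.

```python
import numpy as np

x = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0]])
A = 3  # side length of each sub-matrix

# Kronecker product with a block of ones; dtype=x.dtype keeps integers.
via_kron = np.kron(x, np.ones((A, A), dtype=x.dtype))

# Alternative: repeat every element A times along rows, then along columns.
via_repeat = np.repeat(np.repeat(x, A, axis=0), A, axis=1)

print(via_kron)
```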

Editing Large Matrix Python

I want to make a 34x34 Matrix consisting of entirely zeroes and ones. I have an array that lists the coordinates where all of the ones should go but don't know how to use it. The array looks like this:
0 1 1
0 2 1
0 3 1
1 1 1
where the first number in each row is the x coordinate, the second number in each row is the y coordinate, and the third number is the desired value (always 1).
I tried to create a blank matrix using Matrix=numpy.zeros((34,34)) but I don't know how to change the desired values all at once.
Any idea how to take a matrix and change multiple values at once?
This works:
import numpy as np
a = np.array([[0,1,1],[0,2,1],[0,3,1],[1,1,1]])
m = np.zeros([5,5])
for i in range(len(a)):
    # Row i of a holds (x, y, value)
    m[a[i][0],a[i][1]] = a[i][2] # Or = 1 if that's always the case
And the m matrix is:
array([[ 0., 1., 1., 1., 0.],
[ 0., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])
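Since the question asks about changing multiple values at once, the loop can also be replaced by a single indexed assignment. This is a sketch using NumPy integer-array indexing on the same small example; the 5x5 size is just the one used above and would be (34, 34) in the original problem.

```python
import numpy as np

a = np.array([[0, 1, 1], [0, 2, 1], [0, 3, 1], [1, 1, 1]])
m = np.zeros((5, 5))

# Column 0 holds the x coordinates, column 1 the y coordinates,
# column 2 the values; all four cells are assigned in one statement.
m[a[:, 0], a[:, 1]] = a[:, 2]

print(m)
```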

Numpy: What is the correct way to upsample an array?

octave:1> a=[1 2 3]
a =
1 2 3
octave:2> k=[a;zeros(9,length(a))]
k =
1 2 3
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
Is the method below the correct way to achieve this in Python?
>>> a=[1, 2, 3]
>>> np.append(a,np.zeros((9,len(a))))
array([ 1., 2., 3., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0.])
The Octave solution results in a 10x3 matrix, while your solution results in a 1-dimensional array with 30 elements.
I am assuming you want a matrix with the dimensions 10x3, right?
>>> a = np.array((1, 2, 3))
>>> k = np.vstack((a, np.zeros((9, len(a)))))
>>> k
array([[ 1., 2., 3.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
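An alternative sketch that gives the same 10x3 result is to allocate the zero matrix first and then write the data into its first row. This avoids building an intermediate array, though whether that matters depends on the sizes involved.

```python
import numpy as np

a = np.array([1, 2, 3])

# Allocate the full 10x3 result, then fill the first row with a.
k = np.zeros((10, len(a)))
k[0] = a

print(k.shape)
```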

Vectorizing / Contrasting a Dataframe with Categorical Variables

Say I have a dataframe like the following:
A B
0 bar one
1 bar three
2 flux six
3 bar three
4 foo five
5 flux one
6 foo two
I would like to apply dummy-coding contrasting on it so that I get:
A B
0 0 0
1 0 2
2 1 1
3 0 2
4 2 3
5 1 0
6 2 4
(i.e. mapping every unique value to a different integer, per column).
I have tried using scikit-learn's DictVectorizer, but I get:
> from sklearn.feature_extraction import DictVectorizer as DV
> vectorizer = DV( sparse = False )
> dict_to_vectorize = df.T.to_dict().values()
> df_vec = vectorizer.fit_transform(dict_to_vectorize )
> df_vec
array([[ 1., 0., 0., 0., 1., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 1., 0.],
[ 0., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 1., 0.],
[ 0., 0., 1., 1., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 1., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0., 0., 1.]])
This is because scikit-learn's DictVectorizer is designed to output one-of-K encoding. What I want is a simple-encoding instead (one column per variable).
How can I do this with scikit-learn and/or pandas? Aside from that, are there any other Python packages that help with general contrasting methods?
You could use pd.factorize:
In [124]: df.apply(lambda x: pd.factorize(x)[0])
Out[124]:
A B
0 0 0
1 0 1
2 1 2
3 0 1
4 2 3
5 1 0
6 2 4
The patsy package provides all the contrasts you'd need (and the ability to make more). [1] AFAIK, statsmodels is the only stats package that currently uses patsy's formula framework. [2, 3].
[1] https://patsy.readthedocs.org/en/latest/API-reference.html#handling-categorical-data
[2] http://statsmodels.sourceforge.net/devel/contrasts.html
[3] http://statsmodels.sourceforge.net/devel/example_formulas.html
Dummy encoding is what DictVectorizer already gives you. The integer encoding you are asking for is a different thing:
sklearn.preprocessing.LabelBinarizer or DictVectorizer gives dummy encoding (like pandas.get_dummies)
sklearn.preprocessing.LabelEncoder gives integer categorical encoding (like pandas.factorize)
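To make the distinction concrete, here is a pandas-only sketch on the dataframe from the question. Note one detail that is easy to miss: pandas.factorize numbers values in order of first appearance, while category codes follow the sorted order of the values (which, as far as I know, is also how LabelEncoder numbers its classes), so the integers can differ between the two.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["bar", "bar", "flux", "bar", "foo", "flux", "foo"],
    "B": ["one", "three", "six", "three", "five", "one", "two"],
})

# Integer encoding: codes assigned in order of first appearance.
by_appearance = df.apply(lambda col: pd.factorize(col)[0])

# Integer encoding: codes assigned by sorted category order.
by_sorted = df.apply(lambda col: col.astype("category").cat.codes)

# Dummy (one-of-K) encoding: one indicator column per distinct value.
dummies = pd.get_dummies(df)

print(by_appearance)
print(by_sorted)
print(dummies.shape)
```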

Convert matlab code into python for matrix creation

I'm struggling to create the following matrix in python:
| 1 -2 1 0 ... 0 |
| 0 1 -2 1 ... ... |
|... ... ... ... 0 |
| 0 ... 0 1 -2 1 |
I have the MATLAB code below, which seems to create this matrix, but I cannot convert it to Python.
Matlab code:
D2 = spdiags(ones(T-2,1)*[1 -2 1],[0:2],T-2,T);
T is the number of columns.
The code in python looks like this:
from numpy import ones, array, arange
from scipy.sparse import spdiags
D2 = spdiags( (ones((T-2,1))*array([1,-2,1])),arange(0,3),T-2,T)
This produces the following error:
ValueError: number of diagonals (327) does not match the number of offsets (3)
But if I transpose the matrix like that:
D2 = spdiags( (ones((T-2,1))*array([1,-2,1])).T,arange(0,3),T-2,T)
I get the following result:
matrix([[ 1., -2., 1., ..., 0., 0., 0.],
[ 0., 1., -2., ..., 0., 0., 0.],
[ 0., 0., 1., ..., 0., 0., 0.],
...,
[ 0., 0., 0., ..., 1., 0., 0.],
[ 0., 0., 0., ..., -2., 0., 0.],
[ 0., 0., 0., ..., 1., 0., 0.]])
Can anybody help me? Where am I going wrong?
Change this:
D2 = spdiags( (ones((T-2,1))*array([1,-2,1])).T,arange(0,3),T-2,T)
to this:
D2 = spdiags( (ones((T,1))*array([1,-2,1])).T,arange(0,3),T-2,T)
That is, you want the length of the rows in the first argument, which is the array containing the diagonals, to equal the number of columns in the result.
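The fix can be checked on a small case. The sketch below uses T = 6 (an arbitrary value chosen for illustration) and also shows scipy.sparse.diags as an alternative, which accepts scalar diagonal values when an explicit shape is given and so avoids building the data array by hand.

```python
import numpy as np
from scipy.sparse import diags, spdiags

T = 6  # number of columns; small value chosen for illustration

# Corrected call: each diagonal (row of data) has length T.
data = (np.ones((T, 1)) * np.array([1, -2, 1])).T  # shape (3, T)
D2 = spdiags(data, np.arange(0, 3), T - 2, T)

# Alternative: scalars are broadcast along each diagonal when shape is given.
D2_alt = diags([1, -2, 1], [0, 1, 2], shape=(T - 2, T))

print(D2.toarray())
```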
