Say I have a matrix in a numpy array in Python
In [3]: my_matrix
Out[3]:
array([[ 2., 2., 2., 2., 2., 2., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 2., 2., 2., 2., 0., 0., 0.,
0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 2., 2., 2.,
2., 2., 2., 2., 2.]])
Is there a way to have Python/IPython print my array as:
[ 2 2 2 2 2 2 0 0 0 0 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 2 2 2 2 0 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 2 ]
? (~ similar to the way MATLAB does it)
Also, I have noticed that IPython does not use the full width of my terminal when printing numpy arrays. Other functions do (e.g. pprint.pprint). How can I change that?
Use numpy.set_printoptions. For increasing the line width:
np.set_printoptions(linewidth=150)
Replace 150 by whatever you need. Now, to print as you asked (I guess it means without the decimal point):
print(my_matrix.astype(int))
If you have floating point values you can also control the precision for printouts with the option precision:
np.set_printoptions(precision=3)
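As a minimal sketch of the options above: the matrix is rebuilt here so the snippet runs on its own, and np.printoptions is the context-manager form of set_printoptions (NumPy 1.15 or later). The semicolon-separated layout is assembled by hand for illustration; it is not a built-in NumPy print format.

```python
import numpy as np

# Rebuild the 3x18 example matrix from the question.
my_matrix = np.zeros((3, 18))
my_matrix[0, :6] = 2
my_matrix[1, 6:10] = 2
my_matrix[2, 10:] = 2

# Temporarily widen the output and drop the decimal points.
with np.printoptions(linewidth=150):
    text = str(my_matrix.astype(int))
print(text)

# Hand-rolled MATLAB-like layout: rows separated by semicolons.
rows = [" ".join(str(v) for v in row) for row in my_matrix.astype(int)]
matlab_like = "[ " + ";\n  ".join(rows) + " ]"
print(matlab_like)
```

The context manager restores the previous print settings on exit, which may be preferable to changing them globally.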
Given a matrix, such as:
1 0 0
0 1 1
1 1 0
I would like to expand each element to a "sub-matrix" of size AxA, e.g., 3x3, the result will be:
1 1 1 0 0 0 0 0 0
1 1 1 0 0 0 0 0 0
1 1 1 0 0 0 0 0 0
0 0 0 1 1 1 1 1 1
0 0 0 1 1 1 1 1 1
0 0 0 1 1 1 1 1 1
1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0 0
What is the fastest way of doing it in Python using numpy (or PyTorch)?
Since what you're describing is the Kronecker product:
Use np.kron
Computes the Kronecker product, a composite array made of blocks of the second array scaled by the first.
x = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0]])
np.kron(x, np.ones((3, 3)))
array([[1., 1., 1., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 1., 1., 1., 1., 1.],
[0., 0., 0., 1., 1., 1., 1., 1., 1.],
[0., 0., 0., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 0., 0., 0.]])
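If an integer result is wanted, one possible variation is to pass integer ones to np.kron, or to repeat the array along both axes with np.repeat. This is only a sketch (the name A for the block size is taken from the question), and which variant is faster would need to be timed on your own data.

```python
import numpy as np

x = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0]])
A = 3  # side length of each sub-matrix

# Integer ones keep the integer dtype of x.
via_kron = np.kron(x, np.ones((A, A), dtype=x.dtype))

# Same result by repeating along rows, then along columns.
via_repeat = np.repeat(np.repeat(x, A, axis=0), A, axis=1)
```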
I want to make a 34x34 Matrix consisting of entirely zeroes and ones. I have an array that lists the coordinates where all of the ones should go but don't know how to use it. The array looks like this:
0 1 1
0 2 1
0 3 1
1 1 1
where the first number in each row is the x coordinate, the second number in each row is the y coordinate, and the third number is the desired value (always 1).
I tried to create a blank matrix using Matrix = numpy.zeros((34, 34)), but I don't know how to set the desired values all at once.
Any idea how to take a matrix and change multiple values at once?
This works:
import numpy as np
a = np.array([[0,1,1],[0,2,1],[0,3,1],[1,1,1]])
m = np.zeros([5,5])  # use [34,34] for the full-size matrix
for i in range(len(a)):
    m[a[i][0], a[i][1]] = a[i][2]  # Or = 1 if that's always the case
And the m matrix is:
array([[ 0., 1., 1., 1., 0.],
[ 0., 1., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])
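The same assignment can probably be done without the loop by using NumPy's integer array indexing, which sets all the listed positions in one statement. A minimal sketch, assuming the coordinate array has exactly the three columns described in the question:

```python
import numpy as np

a = np.array([[0, 1, 1], [0, 2, 1], [0, 3, 1], [1, 1, 1]])
m = np.zeros((5, 5))  # use (34, 34) for the full-size matrix

# Column 0 holds row indices, column 1 column indices, column 2 the values.
m[a[:, 0], a[:, 1]] = a[:, 2]
```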
octave:1> a=[1 2 3]
a =
1 2 3
octave:2> k=[a;zeros(9,length(a))]
k =
1 2 3
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
Is the below method the correct way to achieve it in Python:
>>> a=[1, 2, 3]
>>> np.append(a,np.zeros((9,len(a))))
array([ 1., 2., 3., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0.])
The Octave solution results in a 10x3 matrix, while your solution results in a 1-dimensional array with 30 elements.
I am assuming you want a matrix with dimensions 10x3, right?
>>> a = np.array((1, 2, 3))
>>> k = np.vstack((a, np.zeros((9, len(a)))))
>>> k
array([[ 1., 2., 3.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]])
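An alternative, sketched here only for comparison, is to allocate the full 10x3 array of zeros first and then overwrite its first row; it should give the same result as the vstack call.

```python
import numpy as np

a = np.array((1, 2, 3))

# Allocate the 10x3 result up front, then fill in the first row.
k = np.zeros((10, len(a)))
k[0] = a

# The vstack version from the answer, for comparison.
stacked = np.vstack((a, np.zeros((9, len(a)))))
```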
Say I have a dataframe like the following:
A B
0 bar one
1 bar three
2 flux six
3 bar three
4 foo five
5 flux one
6 foo two
I would like to apply dummy-coding contrasting on it so that I get:
A B
0 0 0
1 0 2
2 1 1
3 0 2
4 2 3
5 1 0
6 2 4
(i.e. mapping every unique value to a different integer, per column).
I have tried using scikit-learn's DictVectorizer, but I get:
> from sklearn.feature_extraction import DictVectorizer as DV
> vectorizer = DV( sparse = False )
> dict_to_vectorize = df.T.to_dict().values()
> df_vec = vectorizer.fit_transform(dict_to_vectorize )
> df_vec
array([[ 1., 0., 0., 0., 1., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 1., 0.],
[ 0., 1., 0., 0., 0., 1., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 1., 0.],
[ 0., 0., 1., 1., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 1., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0., 0., 1.]])
This is because scikit-learn's DictVectorizer is designed to output one-of-K encoding. What I want is a simple-encoding instead (one column per variable).
How can I do this with scikit-learn and/or pandas? Aside from that, are there any other Python packages that help with general contrasting methods?
You could use pd.factorize:
In [124]: df.apply(lambda x: pd.factorize(x)[0])
Out[124]:
A B
0 0 0
1 0 1
2 1 2
3 0 1
4 2 3
5 1 0
6 2 4
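For reference, a self-contained sketch of the same call, with the dataframe from the question rebuilt by hand. The labels dictionary is an addition for illustration: it keeps the unique values that pd.factorize returns, so the integer codes can be mapped back to the original strings.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["bar", "bar", "flux", "bar", "foo", "flux", "foo"],
    "B": ["one", "three", "six", "three", "five", "one", "two"],
})

# Integer codes per column, numbered in order of first appearance.
encoded = df.apply(lambda col: pd.factorize(col)[0])

# Keep the code -> label lookup for each column to decode later.
labels = {name: pd.factorize(df[name])[1] for name in df.columns}
```

Note that the codes follow order of first appearance, so they need not match any particular numbering chosen by hand.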
The patsy package provides all the contrasts you'd need (and the ability to make more) [1]. AFAIK, statsmodels is the only stats package that currently uses patsy's formula framework [2, 3].
[1] https://patsy.readthedocs.org/en/latest/API-reference.html#handling-categorical-data
[2] http://statsmodels.sourceforge.net/devel/contrasts.html
[3] http://statsmodels.sourceforge.net/devel/example_formulas.html
Dummy encoding is what you get when you call DictVectorizer. The integer encoding you are asking for is a different thing:
sklearn.preprocessing.LabelBinarizer or DictVectorizer gives dummy encoding (like pandas.get_dummies)
sklearn.preprocessing.LabelEncoder gives integer categorical encoding (like pandas.factorize)
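The difference between the two encodings can be seen with the pandas functions named above. This is a rough sketch on column A from the question only; the scikit-learn classes are not used here.

```python
import pandas as pd

col = pd.Series(["bar", "bar", "flux", "bar", "foo", "flux", "foo"], name="A")

# Dummy (one-of-K) encoding: one indicator column per category.
dummies = pd.get_dummies(col)

# Integer categorical encoding: a single column of codes.
codes, uniques = pd.factorize(col)
```

Here dummies has one column per distinct value, while codes is a single array with one integer per row.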
I'm struggling to create the following matrix in python:
| 1 -2 1 0 ... 0 |
| 0 1 -2 1 ... ... |
|... ... ... ... 0 |
| 0 ... 0 1 -2 1 |
I have the MATLAB code below, which seems to create this matrix (article), but I cannot convert it to Python code.
Matlab code:
D2 = spdiags(ones(T-2,1)*[1 -2 1],[0:2],T-2,T);
T is the number of columns.
The code in python looks like this:
from numpy import ones, array, arange
from scipy.sparse import spdiags
D2 = spdiags( (ones((T-2,1))*array([1,-2,1])),arange(0,3),T-2,T)
This produces the following error:
ValueError: number of diagonals (327) does not match the number of offsets (3)
But if I transpose the matrix like this:
D2 = spdiags( (ones((T-2,1))*array([1,-2,1])).T,arange(0,3),T-2,T)
I get the following result:
matrix([[ 1., -2., 1., ..., 0., 0., 0.],
[ 0., 1., -2., ..., 0., 0., 0.],
[ 0., 0., 1., ..., 0., 0., 0.],
...,
[ 0., 0., 0., ..., 1., 0., 0.],
[ 0., 0., 0., ..., -2., 0., 0.],
[ 0., 0., 0., ..., 1., 0., 0.]])
Can anybody help me? Where am I going wrong?
Change this:
D2 = spdiags( (ones((T-2,1))*array([1,-2,1])).T,arange(0,3),T-2,T)
to this:
D2 = spdiags( (ones((T,1))*array([1,-2,1])).T,arange(0,3),T-2,T)
That is, you want the length of the rows in the first argument, which is the array containing the diagonals, to equal the number of columns in the result.
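A small runnable check of this fix, assuming T = 6 as a stand-in for the real number of columns. The dense matrix built from np.eye is added here only as a cross-check and is not part of the original code.

```python
import numpy as np
from scipy.sparse import spdiags

T = 6  # small stand-in for the number of columns

# One row per diagonal, each of length T (the number of columns).
data = (np.ones((T, 1)) * np.array([1, -2, 1])).T
D2 = spdiags(data, np.arange(0, 3), T - 2, T)

# Dense cross-check built from shifted identity matrices.
dense = np.eye(T - 2, T, k=0) - 2 * np.eye(T - 2, T, k=1) + np.eye(T - 2, T, k=2)
print(D2.toarray())
```

With rows of length T, the last row of the result ends in 1, -2, 1 instead of being cut off as in the output shown in the question.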