Linear Regression with 3 input vectors and 4 output vectors?

Task:
As an example, we have 3 input vectors:
foo = [1, 2, 3, 4, 5, 6]
bar = [50, 60, 70, 80, 90, 100]
spam = [-10, -20, -30, -40, -50, -60]
We also have 4 output vectors that depend linearly on the input vectors:
foofoo = [1, 1, 2, 2, 3, 3]
barbar = [4, 4, 5, 5, 6, 6]
spamspam = [7, 7, 8, 8, 9, 9]
hamham = [10, 10, 11, 11, 12, 12]
How can I apply linear regression to this data in Python?

You can use OLS (Ordinary Least Squares model) as done here:
#imports
import numpy as np
import statsmodels.api as sm
#generate the input matrix
X=[foo,bar,spam]
#turn it into a numpy array
X = np.array(X).T
#add a constant column
X=sm.add_constant(X)
This gives the input matrix X:
array([[ 1., 1., 50., -10.],
[ 1., 2., 60., -20.],
[ 1., 3., 70., -30.],
[ 1., 4., 80., -40.],
[ 1., 5., 90., -50.],
[ 1., 6., 100., -60.]])
And now you can fit each desired output vector:
resFoo = sm.OLS(endog=foofoo, exog=X).fit()
resBar = sm.OLS(endog=barbar, exog=X).fit()
resSpam = sm.OLS(endog=spamspam, exog=X).fit()
resham = sm.OLS(endog=hamham, exog=X).fit()
The result gives you the coefficients (for the constant, and the three columns foo, bar, and spam):
>>> resFoo.params
array([-0.00063323, 0.0035345 , 0.01001583, -0.035345 ])
You can now check it with the input:
>>> np.matrix(X)*np.matrix(resFoo.params).T
matrix([[ 0.85714286],
[ 1.31428571],
[ 1.77142857],
[ 2.22857143],
[ 2.68571429],
[ 3.14285714]])
Which is close to the desired output of foofoo.
See this question for different ways to do the regression: Multiple linear regression in Python
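As a complement to the per-output OLS fits above, here is a minimal sketch that fits all four outputs in one call with plain NumPy. It swaps statsmodels for np.linalg.lstsq, which is an assumption of this example and not part of the original answer; the names Y, coef and fitted are illustrative.

```python
import numpy as np

foo = [1, 2, 3, 4, 5, 6]
bar = [50, 60, 70, 80, 90, 100]
spam = [-10, -20, -30, -40, -50, -60]

foofoo = [1, 1, 2, 2, 3, 3]
barbar = [4, 4, 5, 5, 6, 6]
spamspam = [7, 7, 8, 8, 9, 9]
hamham = [10, 10, 11, 11, 12, 12]

# Design matrix: a constant column followed by the three inputs.
# Note that X is rank-deficient here: bar = 10*foo + 40 and spam = -10*foo,
# so the coefficients are not unique, but the fitted values are.
X = np.column_stack([np.ones(len(foo)), foo, bar, spam])

# One column per output vector, so a single call fits all four regressions.
Y = np.column_stack([foofoo, barbar, spamspam, hamham])

# lstsq returns the minimum-norm least-squares solution for each column of Y.
coef, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)

# Fitted values; column 0 corresponds to foofoo, column 1 to barbar, etc.
fitted = X @ coef
print(fitted[:, 0])
```

Each column of coef holds the constant and the three slopes for one output, in the same order as the columns of Y.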

Related

find infinity values and replace with maximum per vector in a numpy array

Suppose I have the following array with shape (3, 5) :
array = np.array([[1, 2, 3, np.inf, 5],
                  [10, 9, 8, 7, 6],
                  [4, np.inf, 2, 6, np.inf]])
Now I want to find the infinity values in each row vector and replace them with the maximum of the finite values in that row, with a lower limit of 1.
So the output for this example should be:
array_solved = np.array([[1, 2, 3, 5, 5],
[10, 9, 8, 7, 6],
[4, 6, 2, 6, 6]])
I could do this by looping over every vector of the array and applying:
idx_inf = np.isinf(array_vector)
max_value = np.max(np.append(array_vector[~idx_inf], 1.0))
array_vector[idx_inf] = max_value
But I guess there is a faster way.
Does anyone have an idea?

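One way is to first convert infs to NaNs with np.isinf masking, and then replace the NaNs with the per-row maximum from np.nanmax. The row maxima have to be broadcast against the mask (keepdims=True plus np.where); assigning the 1-D result of np.nanmax directly to array[np.isnan(array)] only pairs values by position and can put another row's maximum into a row:
array[np.isinf(array)] = np.nan
row_max = np.maximum(np.nanmax(array, axis=1, keepdims=True), 1.0)  # lower limit of 1
array = np.where(np.isnan(array), row_max, array)
to get
>>> array
array([[ 1., 2., 3., 5., 5.],
[10., 9., 8., 7., 6.],
[ 4., 6., 2., 6., 6.]])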

import numpy as np
array = np.array([[1, 2, 3, np.inf, 5],
[10, 9, 8, 7, 6],
[4, np.inf, 2, 6, np.inf]])
n, m = array.shape
array[np.isinf(array)] = -np.inf
mx_array = np.repeat(np.max(array, axis=1), m).reshape(n, m)
ind = np.where(np.isinf(array))
array[ind] = mx_array[ind]
Output array:
array([[ 1., 2., 3., 5., 5.],
[10., 9., 8., 7., 6.],
[ 4., 6., 2., 6., 6.]])
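The answers above can be folded into one small helper that also applies the lower limit of 1 from the question. This is only a sketch: the function name replace_inf_with_row_max and the lower_limit parameter are made up for illustration, and it assumes a 2-D float array.

```python
import numpy as np

def replace_inf_with_row_max(arr, lower_limit=1.0):
    """Return a copy of arr where every +/-inf is replaced by the row maximum.

    The row maximum is taken over the finite entries only and is never
    smaller than lower_limit (hypothetical parameter, default 1.0).
    """
    arr = np.asarray(arr, dtype=float)
    mask = np.isinf(arr)
    # Hide the infinities behind -inf so they cannot win the max.
    finite_only = np.where(mask, -np.inf, arr)
    # keepdims=True keeps shape (n, 1) so the maxima broadcast row by row.
    row_max = np.maximum(finite_only.max(axis=1, keepdims=True), lower_limit)
    return np.where(mask, row_max, arr)

array = np.array([[1, 2, 3, np.inf, 5],
                  [10, 9, 8, 7, 6],
                  [4, np.inf, 2, 6, np.inf]])
solved = replace_inf_with_row_max(array)
print(solved)
```

A row that contains only infinities falls back to lower_limit, because its hidden maximum is -inf before clamping.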

Classifying dots in matrix (Python)

I have a big matrix, around 600x600, containing 9 dots, one in each of 9 roughly equal sectors (laid out like a tic-tac-toe # grid).
I need to turn it into a 3x3 array holding the IDs of the dots in these sectors, like:
[[id2,id1,id5],[id4,id6,id7],[id3,id8,id9]]
Dividing the plane into 9 smaller planes works really badly. I need something like relative positions, and I don't even know the words I need to google.
def classificator(val):
    global A
    global closed
    height, width = map(int, closed.shape)
    h1 = height // 3
    w1 = width // 3
    h2 = height // 3 * 2
    w2 = width // 3 * 2
    for x in range(len(val)):
        xcoord = val[x][0]
        ycoord = val[x][1]
        if 0 <= val[x][0] < h1 and 0 <= val[x][1] < w1 and A[0, 0] == '_': #top left X
            A[0, 0] = val[x][2]

Following from the comments above. This is still asking for clarification but shows a way of interpreting your question.
In [1]: import numpy as np
In [2]: data_in=np.fromfunction(lambda r, c: 10*r+c, (6, 6))
# Create an array where the values give an indication of where they are in the array.
In [3]: data_in
Out[3]:
array([[ 0., 1., 2., 3., 4., 5.],
[10., 11., 12., 13., 14., 15.],
[20., 21., 22., 23., 24., 25.],
[30., 31., 32., 33., 34., 35.],
[40., 41., 42., 43., 44., 45.],
[50., 51., 52., 53., 54., 55.]])
In [4]: slices=[np.s_[0:3], np.s_[3:6] ]
In [5]: slices
Out[5]: [slice(0, 3, None), slice(3, 6, None)]
In [8]: result=np.zeros((4,3,3), dtype=np.int32)
In [9]: ix=0
In [12]: for rows in slices:
    ...:     for columns in slices:
    ...:         result[ix,:,:]=data_in[rows, columns]
    ...:         ix+=1
    ...:
In [13]: result
Out[13]:
array([[[ 0, 1, 2],
[10, 11, 12], # Top Left in data_in
[20, 21, 22]],
[[ 3, 4, 5],
[13, 14, 15], # Top Right in data_in
[23, 24, 25]],
[[30, 31, 32],
[40, 41, 42], # Bottom Left in data_in
[50, 51, 52]],
[[33, 34, 35],
[43, 44, 45], # Bottom Right in data_in
[53, 54, 55]]], dtype=int32)
Can you use it as a basis to explain what you expect to see?
If your input data was only 6 by 6 what would it look like and what would you expect to see coming out?
Edits: Two typos corrected.
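One possible reading of "relative positions" in the question is to rank the dots against each other instead of cutting the plane at fixed thirds. The sketch below assumes exactly 9 dots given as (row, col, id) tuples that form a rough 3x3 layout; the function name and the sample coordinates are invented for illustration.

```python
def classify_by_relative_position(dots, n=3):
    """Arrange n*n dots into an n x n grid of ids using only their ordering.

    dots: iterable of (row, col, dot_id) tuples (assumed format).
    """
    # Sort by vertical position, then cut into n bands of n dots each.
    by_row = sorted(dots, key=lambda d: d[0])
    grid = []
    for start in range(0, n * n, n):
        band = by_row[start:start + n]
        # Within a band, order the dots from left to right.
        band = sorted(band, key=lambda d: d[1])
        grid.append([dot_id for _, _, dot_id in band])
    return grid

# Made-up coordinates: the whole pattern is shifted away from the
# fixed one-third boundaries of a 600x600 image.
dots = [(250, 260, "id2"), (255, 400, "id1"), (248, 560, "id5"),
        (400, 258, "id4"), (398, 405, "id6"), (402, 555, "id7"),
        (550, 262, "id3"), (548, 402, "id8"), (552, 558, "id9")]
grid = classify_by_relative_position(dots)
print(grid)
```

Because only the ordering of the coordinates matters, this tolerates a shifted or slightly uneven pattern, but it would give wrong results if a dot is missing or if the pattern is rotated strongly.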

An elegant way of inserting a numpy matrix into another

I have a requirement where I have 2 2D numpy arrays, and I would like to combine them in a specific manner:
x = [[0, 1, 2],
[3, 4, 5],
[6, 7, 8]]
| | |
0 1 2
y = [[10, 11, 12],
[13, 14, 15],
[16, 17, 18]]
| | |
3 4 5
x op y = [ 0 3 1 4 2 5 ] (in terms of the columns)
In other words,
The combination of x and y should look something like this:
[[ 0., 10., 1., 11., 2., 12.],
[ 3., 13., 4., 14., 5., 15.],
[ 6., 16., 7., 17., 8., 18.]]
Where I alternately combine the columns of each individual array to form the final 2D array. I have come up with one way of doing so, but it is rather ugly. Here's my code:
x = np.arange(9).reshape(3, 3)
y = np.arange(start=10, stop=19).reshape(3, 3)
>>> a = np.zeros((6, 3)) # create a 2D array where num_rows(a) = num_cols(x) + num_cols(y)
>>> a[: : 2] = x.T
>>> a[1: : 2] = y.T
>>> a.T
array([[ 0., 10., 1., 11., 2., 12.],
[ 3., 13., 4., 14., 5., 15.],
[ 6., 16., 7., 17., 8., 18.]])
As you can see, this is a very ugly sequence of operations. Furthermore, things become even more cumbersome in higher dimensions. For example, if you have x and y to be [3 x 3 x 3], then this operation has to be repeated in each dimension. So I'd probably have to tackle this with a loop.
Is there a simpler way around this?
Thanks.

In [524]: x=np.arange(9).reshape(3,3)
In [525]: y=np.arange(10,19).reshape(3,3)
This doesn't look at all ugly to me (one liners are over rated):
In [526]: a = np.zeros((3,6),int)
....
In [528]: a[:,::2]=x
In [529]: a[:,1::2]=y
In [530]: a
Out[530]:
array([[ 0, 10, 1, 11, 2, 12],
[ 3, 13, 4, 14, 5, 15],
[ 6, 16, 7, 17, 8, 18]])
Still, if you want a one-liner, this might do:
In [535]: np.stack((x.T,y.T),axis=1).reshape(6,3).T
Out[535]:
array([[ 0, 10, 1, 11, 2, 12],
[ 3, 13, 4, 14, 5, 15],
[ 6, 16, 7, 17, 8, 18]])
The idea in this last one was to combine the arrays along a new dimension and then reshape the result in some way or other. I found it by trial and error.
And with another trial:
In [539]: np.stack((x,y),2).reshape(3,6)
Out[539]:
array([[ 0, 10, 1, 11, 2, 12],
[ 3, 13, 4, 14, 5, 15],
[ 6, 16, 7, 17, 8, 18]])
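The np.stack((x,y),2).reshape(3,6) form can be written so that it does not hard-code the shape, which may help with the higher-dimensional case raised in the question. This is a sketch under the assumption that "interleaving" always means alternating entries along the last axis; the helper name interleave_last_axis is made up.

```python
import numpy as np

def interleave_last_axis(x, y):
    """Alternate the entries of x and y along the last axis."""
    x = np.asarray(x)
    y = np.asarray(y)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    # Stacking on a new trailing axis puts x[..., j] next to y[..., j];
    # merging the last two axes then yields x0, y0, x1, y1, ...
    stacked = np.stack((x, y), axis=-1)
    return stacked.reshape(*x.shape[:-1], -1)

x = np.arange(9).reshape(3, 3)
y = np.arange(10, 19).reshape(3, 3)
combined = interleave_last_axis(x, y)
print(combined)

# The same call works unchanged for 3 x 3 x 3 inputs.
x3 = np.arange(27).reshape(3, 3, 3)
combined3 = interleave_last_axis(x3, x3 + 100)
print(combined3.shape)
```

Interleaving along a different axis could be done by moving that axis to the end first (for example with np.moveaxis) and moving it back afterwards.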

Here is a compact way to write it with a loop, it might be generalizable to higher dimension arrays with a little work:
x = np.array([[0,1,2], [3,4,5], [6,7,8]])
y = np.array([[10,11,12], [13,14,15], [16,17,18]])
z = np.zeros((3,6))
for i in range(3):  # range replaces the Python 2-only xrange
    z[i] = np.vstack((x.T[i],y.T[i])).reshape((-1,),order='F')

Create a new sparse matrix from the operations of different rows of a given large sparse matrix in python

Let's say S is a large SciPy CSR (sparse) matrix and D is a dictionary whose keys are the indices (positions) of row vectors A in S and whose values are lists l of the indices of other row vectors in S. Each row vector in l is subtracted from A, and the result becomes a new row of the new sparse matrix.
The dictionary has the form { 1 : [4 , 5 ,... ,63] }
and the new sparse matrix should then contain:
new_row_vector_1 -> S_vec1 - S_vec4
new_row_vector_2 -> S_vec1 - S_vec5
.
new_row_vector_n -> S_vec1 - S_vec63
where S_vecX is the Xth row vector of matrix S
Check out the pictorial explanation of the above statements
Numpy Example:
>>> import numpy as np
>>> s = np.array([[1,5,3,4],[3,0,12,7],[5,6,2,4],[4,6,6,4],[7,12,5,67]])
>>> s
array([[ 1, 5, 3, 4],
[ 3, 0, 12, 7],
[ 5, 6, 2, 4],
[ 4, 6, 6, 4],
[ 7, 12, 5, 67]])
>>> index_dictionary = {0: [2, 4], 1: [3, 4], 2: [1], 3: [1, 2], 4: [1, 3, 2]}
>>> n = np.zeros((10,4)) #sum of all lengths of values in index_dictionary would be the number of rows for the new array(n) and columns remain the same as s.
>>> n
array([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]])
>>> idx = 0
>>> for index in index_dictionary:
...     for k in index_dictionary[index]:
...         n[idx] = s[index]-s[k]
...         idx += 1
...
>>> n
array([[ -4., -1., 1., 0.],
[ -6., -7., -2., -63.],
[ -1., -6., 6., 3.],
[ -4., -12., 7., -60.],
[ 2., 6., -10., -3.],
[ 1., 6., -6., -3.],
[ -1., 0., 4., 0.],
[ 4., 12., -7., 60.],
[ 3., 6., -1., 63.],
[ 2., 6., 3., 63.]])
n is what I want.

Here's a simple demonstration of what I think you are trying to do:
First the numpy array version:
In [619]: arr=np.arange(12).reshape(4,3)
In [620]: arr[[1,0,2,3]]-arr[0]
Out[620]:
array([[3, 3, 3],
[0, 0, 0],
[6, 6, 6],
[9, 9, 9]])
Now the sparse equivalent (assuming from scipy import sparse):
In [622]: M=sparse.csr_matrix(arr)
csr implements row indexing:
In [624]: M[[1,0,2,3]]
Out[624]:
<4x3 sparse matrix of type '<class 'numpy.int32'>'
with 11 stored elements in Compressed Sparse Row format>
In [625]: M[[1,0,2,3]].A
Out[625]:
array([[ 3, 4, 5],
[ 0, 1, 2],
[ 6, 7, 8],
[ 9, 10, 11]], dtype=int32)
But not broadcasting:
In [626]: M[[1,0,2,3]]-M[0]
....
ValueError: inconsistent shapes
So we can use an explicit form of broadcasting:
In [627]: M[[1,0,2,3]]-M[[0,0,0,0]] # or M[[0]*4]
Out[627]:
<4x3 sparse matrix of type '<class 'numpy.int32'>'
with 9 stored elements in Compressed Sparse Row format>
In [628]: _.A
Out[628]:
array([[3, 3, 3],
[0, 0, 0],
[6, 6, 6],
[9, 9, 9]], dtype=int32)
This may not be the fastest or most efficient, but it's a start.
I found in a previous SO question (Sparse matrix slicing using list of int) that M[[1,0,2,3]] indexing is performed with a matrix multiplication, in this case the equivalent of:
idxM = sparse.csr_matrix(([1,1,1,1],([0,1,2,3],[1,0,2,3])),(4,4))
M1 = idxM * M
So my difference expression requires two such multiplications along with the subtraction.
We could try a row by row iteration, and building a new matrix from the result, but there's no guarantee that it will be faster. Depending on the arrays, converting to dense and back might even be faster.
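Building on the selection-matrix idea above, the two selections and the subtraction can in principle be folded into a single sparse product: a matrix with +1 in the column of the row to keep and -1 in the column of the row to subtract. The following is a sketch of that variation, not something the original answer does; difference_operator is a made-up helper name, and whether it is faster than indexing would need to be measured.

```python
import numpy as np
from scipy import sparse

def difference_operator(pairs, n_rows):
    """Build D such that (D @ M)[i] == M[a_i] - M[b_i] for pairs[i] == (a_i, b_i)."""
    pairs = np.asarray(pairs)
    n_pairs = len(pairs)
    # Each output row i gets two entries: +1 at column a_i and -1 at column b_i.
    rows = np.repeat(np.arange(n_pairs), 2)
    cols = pairs.ravel()
    data = np.tile([1, -1], n_pairs)
    # Duplicate (row, col) entries are summed, so a pair (a, a) gives a zero row.
    return sparse.csr_matrix((data, (rows, cols)), shape=(n_pairs, n_rows))

arr = np.arange(12).reshape(4, 3)
M = sparse.csr_matrix(arr)

# Same computation as M[[1,0,2,3]] - M[[0,0,0,0]] above.
pairs = [(1, 0), (0, 0), (2, 0), (3, 0)]
D = difference_operator(pairs, M.shape[0])
result = (D @ M).toarray()
print(result)
```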
=================
I can imagine 2 ways of applying this to the dictionary.
One is to iterate through the dictionary (what order?), perform this difference for each key, collect the results in a list (list of sparse matrices), and use sparse.bmat to join them into one matrix.
Another is to collect two lists of indexes, and apply the above indexed difference just once.
In [8]: index_dictionary = {0: [2, 4], 1: [3, 4], 2: [1], 3: [1, 2], 4: [1, 3, 2]}
In [10]: alist=[]
    ...: for index in index_dictionary:
    ...:     for k in index_dictionary[index]:
    ...:         alist.append((index, k))
In [11]: idx = np.array(alist)
In [12]: idx
Out[12]:
array([[0, 2],
[0, 4],
[1, 3],
[1, 4],
[2, 1],
[3, 1],
[3, 2],
[4, 1],
[4, 3],
[4, 2]])
applied to your dense s:
In [15]: s = np.array([[1,5,3,4],[3,0,12,7],[5,6,2,4],[4,6,6,4],[7,12,5,67]])
In [16]: s[idx[:,0]]-s[idx[:,1]]
Out[16]:
array([[ -4, -1, 1, 0],
[ -6, -7, -2, -63],
[ -1, -6, 6, 3],
[ -4, -12, 7, -60],
[ 2, 6, -10, -3],
[ 1, 6, -6, -3],
[ -1, 0, 4, 0],
[ 4, 12, -7, 60],
[ 3, 6, -1, 63],
[ 2, 6, 3, 63]])
and to the sparse equivalent
In [19]: arr= sparse.csr_matrix(s)
In [20]: arr
Out[20]:
<5x4 sparse matrix of type '<class 'numpy.int32'>'
with 19 stored elements in Compressed Sparse Row format>
In [21]: res=arr[idx[:,0]]-arr[idx[:,1]]
In [22]: res
Out[22]:
<10x4 sparse matrix of type '<class 'numpy.int32'>'
with 37 stored elements in Compressed Sparse Row format>
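For comparison, the first approach mentioned above (iterate over the dictionary, build one sparse block per key, then join the blocks) could look roughly like this. It is a sketch that uses sparse.vstack for the join in place of sparse.bmat; the function name differences_per_key is made up, and the row order follows the dictionary's insertion order.

```python
import numpy as np
from scipy import sparse

def differences_per_key(S, index_dictionary):
    """Stack S[key] - S[k] for every k in index_dictionary[key], key by key."""
    blocks = []
    for key, others in index_dictionary.items():
        if not others:
            continue  # nothing to subtract for this key
        # Repeat the key row explicitly, since sparse matrices do not broadcast.
        blocks.append(S[[key] * len(others)] - S[others])
    return sparse.vstack(blocks, format="csr")

s = np.array([[1, 5, 3, 4], [3, 0, 12, 7], [5, 6, 2, 4],
              [4, 6, 6, 4], [7, 12, 5, 67]])
index_dictionary = {0: [2, 4], 1: [3, 4], 2: [1], 3: [1, 2], 4: [1, 3, 2]}

res = differences_per_key(sparse.csr_matrix(s), index_dictionary)
print(res.toarray())
```

The single indexed difference shown above avoids the Python-level loop over keys, so this block-wise variant is mainly useful when the blocks need to be processed separately.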

numpy.resize() rearranging instead of resizing?

I'm trying to resize a numpy array, but it seems that resize works by first flattening the array, then taking the first X*Y elements and putting them into the new shape. What I want instead is to cut the array at coordinate 3,3, not rearrange it. A similar thing happens when I try to upsize it, say to 7,7: instead of "rearranging", I want to fill the new columns and rows with zeros and keep the data as it is.
Is there a way to do that ?
> a = np.zeros((5,5))
> a.flat = range(25)
> a
array(
[[ 0., 1., 2., 3., 4.],
[ 5., 6., 7., 8., 9.],
[ 10., 11., 12., 13., 14.],
[ 15., 16., 17., 18., 19.],
[ 20., 21., 22., 23., 24.]])
> a.resize((3,3),refcheck=False)
> a
array(
[[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
thank you ...

Upsizing to 7x7 goes like this
upsized = np.zeros([7, 7])
upsized[:5, :5] = a

I believe you want to use numpy's slicing syntax instead of resize. resize works by first raveling the array and working with a 1D view.
>>> a = np.arange(25).reshape(5,5)
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> a[:3,:3]
array([[ 0, 1, 2],
[ 5, 6, 7],
[10, 11, 12]])
What you are doing here is taking a view of the numpy array. For example to update the original array by slicing:
>>> a[:3,:3] = 0
>>> a
array([[ 0, 0, 0, 3, 4],
[ 0, 0, 0, 8, 9],
[ 0, 0, 0, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
An excellent guide on numpy's slicing syntax can be found here.
Upsizing (or padding) only works by making a copy of the data. You start with an array of zeros and fill in appropriately
upsized = np.zeros([7, 7])
upsized[:5, :5] = a
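The slicing (downsizing) and zero-filling (upsizing) steps above can be combined into one helper that keeps every element at its original coordinates. This is a sketch; crop_or_pad and its fill_value parameter are made-up names, and it always returns a copy, never a view.

```python
import numpy as np

def crop_or_pad(a, new_shape, fill_value=0):
    """Return a copy of a with shape new_shape, anchored at the origin.

    Elements outside new_shape are dropped; new positions get fill_value.
    """
    a = np.asarray(a)
    if a.ndim != len(new_shape):
        raise ValueError("new_shape must have one entry per dimension of a")
    out = np.full(new_shape, fill_value, dtype=a.dtype)
    # Region that exists in both the old and the new array.
    overlap = tuple(slice(0, min(old, new)) for old, new in zip(a.shape, new_shape))
    out[overlap] = a[overlap]
    return out

a = np.arange(25).reshape(5, 5)
cropped = crop_or_pad(a, (3, 3))
padded = crop_or_pad(a, (7, 7))
print(cropped)
print(padded)
```

The same function also handles mixed cases such as (3, 7), where one axis shrinks and the other grows.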
