Python numpy broadcasting 3 dimensions (multiple weighted sums)

I've become sort of used to broadcasting with 2 dimensional arrays, but I can't get my head around this 3-dimensional thing I want to do.
I have two 2-dimensional arrays:
>>> a = np.array([[0.01,.2,.3,.4],[.2,.03,.4,.5],[.9,.8,.7,.06]])
>>> b= np.array([[1,2,3],[3.,4,5]])
>>> a
array([[ 0.01, 0.2 , 0.3 , 0.4 ],
[ 0.2 , 0.03, 0.4 , 0.5 ],
[ 0.9 , 0.8 , 0.7 , 0.06]])
>>> b
array([[ 1., 2., 3.],
[ 3., 4., 5.]])
Now, what I want is the sum of all rows in a, where each row is weighted by the column values in b.
So, I want 1. * a[0,:] + 2. * a[1,:] + 3. * a[2,:] and the same for the second row of b.
So, I know how to do this step-by-step:
>>> (np.array([b[0]]).T * a).sum(0)
array([ 3.11, 2.66, 3.2 , 1.58])
>>> (np.array([b[1]]).T * a).sum(0)
array([ 5.33, 4.72, 6. , 3.5 ])
But I have the feeling that if I knew how to broadcast the two correctly as 3-dimensional arrays I could get the result I want in one go.
The result being:
array([[ 3.11, 2.66, 3.2 , 1.58],
[ 5.33, 4.72, 6. , 3.5 ]])
I guess this shouldn't be too hard..?!?

You want to do matrix multiplication:
>>> b.dot(a)
array([[ 3.11, 2.66, 3.2 , 1.58],
[ 5.33, 4.72, 6. , 3.5 ]])
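For comparison, the explicit 3-dimensional broadcast the question was reaching for can be sketched as below. This is only an illustration, not part of the original answer; the names weighted, via_broadcast and via_einsum are made up here. It builds the full (2, 3, 4) intermediate array, so b.dot(a) is normally the better choice.

```python
import numpy as np

a = np.array([[0.01, .2, .3, .4], [.2, .03, .4, .5], [.9, .8, .7, .06]])
b = np.array([[1, 2, 3], [3., 4, 5]])

# b[:, :, np.newaxis] has shape (2, 3, 1); a[np.newaxis, :, :] has shape (1, 3, 4).
# Broadcasting yields shape (2, 3, 4): for each row of b, every row of a scaled by its weight.
weighted = b[:, :, np.newaxis] * a[np.newaxis, :, :]

# Summing over axis 1 (the rows of a) collapses to shape (2, 4).
via_broadcast = weighted.sum(axis=1)

# The same contraction written with einsum, which avoids the large intermediate.
via_einsum = np.einsum('ij,jk->ik', b, a)

print(np.allclose(via_broadcast, b.dot(a)))
print(np.allclose(via_einsum, b.dot(a)))
```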


How to divide an array by an other array element wise in numpy?

I have two arrays, and I want all the elements of one to be divided by the second. For example,
In [24]: a = np.array([1,2,3])
In [25]: b = np.array([1,2,3])
In [26]: a/b
Out[26]: array([1., 1., 1.])
In [27]: 1/b
Out[27]: array([1. , 0.5 , 0.33333333])
This is not the result I want. I want every element of a to be divided by the whole of b, like this:
In [28]: c = []
In [29]: for i in a:
    ...:     c.append(i/b)
    ...:
In [30]: c
Out[30]:
[array([1. , 0.5 , 0.33333333]),
 array([2. , 1. , 0.66666667]),
 array([3. , 1.5 , 1. ])]
In [34]: np.array(c)
Out[34]:
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
But I don't like the for loop; it's too slow for big data. Is there a function in the numpy package, or any other faster way, to solve this problem?
It is simple to do in pure numpy: you can use broadcasting to calculate the outer product (or any other outer operation) of two vectors:
import numpy as np
a = np.arange(1, 4)
b = np.arange(1, 4)
c = a[:,np.newaxis] / b
# array([[1. , 0.5 , 0.33333333],
# [2. , 1. , 0.66666667],
# [3. , 1.5 , 1. ]])
This works, since a[:,np.newaxis] increases the dimension of the (3,) shaped array a into a (3, 1) shaped array, which can be used for the desired broadcasting operation.
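As a cross-check, the same "outer" division can also be written with the outer method that NumPy ufuncs provide. The sketch below is an addition to the answer; the variable names by_newaxis and by_outer are illustrative.

```python
import numpy as np

a = np.arange(1, 4)
b = np.arange(1, 4)

# Broadcasting: (3, 1) divided by (3,) gives a (3, 3) result.
by_newaxis = a[:, np.newaxis] / b

# Every binary ufunc has an .outer method that applies it to all pairs (a[i], b[j]).
by_outer = np.divide.outer(a, b)

print(np.allclose(by_newaxis, by_outer))
```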
First you need to cast a into a 2D array (same shape as the output), then repeat for the dimension you want to loop over. Then vectorized division will work.
>>> a.reshape(-1,1)
array([[1],
[2],
[3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1)
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1) / b
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
# Transpose will let you do it the other way around, but then you just get 1 for everything
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1).T
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1).T / b
array([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
This should do the job:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
print(a.reshape(-1, 1) / b)
Output:
[[ 1. 0.5 0.33333333]
[ 2. 1. 0.66666667]
[ 3. 1.5 1. ]]

python convert 2d array to 1d array [duplicate]

I am new to Python and need to do the following:
I've been given a 1D array of vectors (so effectively a 2D array).
My task is to create a 1D array that contains the length of each vector.
array([[0. , 0. ],
[1. , 0. ],
[1. , 1. ],
[1. , 0.75],
[0.75, 1. ],
[0.5 , 1. ]
...
should be converted to
array([0,
1,
1.4142,
...
I could easily do this in theory, but I am not familiar with Python's built-in functions, and I would be very happy if someone could point me to ones that can do this.
Using norm from np.linalg.norm:
import numpy as np
a = np.array([[0., 0.],
[1., 0.],
[1., 1.],
[1., 0.75],
[0.75, 1.],
[0.5, 1.]])
print(np.linalg.norm(a, axis=1))
Output
[0. 1. 1.41421356 1.25 1.25 1.11803399]
With NumPy you can use vectorised operations:
A = np.array([[0. , 0. ],
[1. , 0. ],
[1. , 1. ],
[1. , 0.75],
[0.75, 1. ],
[0.5 , 1. ]])
res = np.sqrt(np.square(A).sum(1))
array([ 0. , 1. , 1.41421356, 1.25 , 1.25 ,
1.11803399])
Alternatively, if you prefer a less functional solution:
res = (A**2).sum(1)**0.5
You can use a list comprehension:
print([(x[0]*x[0]+x[1]*x[1])**0.5 for x in arr])
where arr is your input.
You can iterate over your array to find each vector's length:
array = [[0, 0], [0, 1], [1, 0], [1, 1]]
lengths = []
for x, y in array:
    # Euclidean length of the 2D vector (x, y)
    lengths.append((x**2 + y**2)**0.5)
print(lengths)
You can do this with np.hypot, which computes the hypotenuse elementwise (array must be a NumPy array here):
np.hypot(array[:, 0], array[:, 1])
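The vectorised answers above should all agree. Here is a small sketch that checks this on the question's data; it is an addition, and the names by_norm, by_sqrt and by_hypot are illustrative only.

```python
import numpy as np

arr = np.array([[0., 0.],
                [1., 0.],
                [1., 1.],
                [1., 0.75],
                [0.75, 1.],
                [0.5, 1.]])

by_norm = np.linalg.norm(arr, axis=1)        # Euclidean norm of each row
by_sqrt = np.sqrt((arr ** 2).sum(axis=1))    # same thing, spelled out
by_hypot = np.hypot(arr[:, 0], arr[:, 1])    # only applicable to 2D vectors

print(np.allclose(by_norm, by_sqrt) and np.allclose(by_norm, by_hypot))
```

np.linalg.norm and the sqrt/sum form work for vectors of any dimension, while np.hypot takes exactly two components.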
You can try this:
import math
b = []
for el in arr:
    b.append(math.sqrt(el[0]**2 + el[1]**2))
print(b)
or, even shorter:
b = [math.sqrt(el[0]**2 + el[1]**2) for el in arr]
where arr is your array.
Here is one more example, using a lambda:
b = list(map(lambda el: (el[0]**2 + el[1]**2)**0.5, arr))

How to interpolate/extrapolate within partly empty regular grid?

I would like to create a python function to linearly interpolate within a partly empty grid and get a nearest extrapolation out of bounds.
Let's say I have the following data stored in pandas DataFrame:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: x = [0,1,2,3,4]
In [4]: y = [0.5,1.5,2.5,3.5,4.5,5.5]
In [5]: z = np.array([[np.nan,np.nan,1.5,2.0,5.5,3.5],[np.nan,1.0,4.0,2.5,4.5,3.0],[2.0,0.5,6.0,1.5,3.5,np.nan],[np.nan,1.5,4.0,2.0,np.nan,np.nan],[np.nan,np.nan,2.0,np.nan,np.nan,np.nan]])
In [6]: df = pd.DataFrame(z,index=x,columns=y)
In [7]: df
Out[7]:
0.5 1.5 2.5 3.5 4.5 5.5
0 NaN NaN 1.5 2.0 5.5 3.5
1 NaN 1.0 4.0 2.5 4.5 3.0
2 2.0 0.5 6.0 1.5 3.5 NaN
3 NaN 1.5 4.0 2.0 NaN NaN
4 NaN NaN 2.0 NaN NaN NaN
I would like a function myInterp that returns a linear interpolation within the data boundaries (i.e. where values are not NaN) and a nearest-neighbour extrapolation outside them (i.e. where values are NaN or missing), such as:
In [1]: myInterp([1.5,2.5]) #linear interpolation
Out[1]: 5.0
In [2]: myInterp([1.5,4.0]) #bi-linear interpolation
Out[2]: 3.0
In [3]: myInterp([0.0,2.0]) #nearest extrapolation (inside grid)
Out[3]: 1.5
In [4]: myInterp([5.0,2.5]) #nearest extrapolation (outside grid)
Out[4]: 2.0
I tried many combinations of functions from the scipy.interpolate package with no success. Does anyone have a suggestion for how to do it?
Yes, unfortunately scipy doesn't deal with NaNs.
From the docs:
Note that calling interp2d with NaNs present in input values results in undefined behaviour.
Even masking the NaNs in a np.ma.masked_array was not successful.
So my advice is to remove all the NaN entries from z and pass sp.interpolate.interp2d flat lists of the x- and y-coordinates of only the valid data, with z flattened to 1D as well:
import scipy as sp
import scipy.interpolate  # makes sp.interpolate available
X = []; Y = []; Z = []  # initialize new 1D lists for interp2d
for i, xi in enumerate(x):  # iterate through x
    for k, yk in enumerate(y):  # iterate through y
        if not np.isnan(z[i, k]):  # check if z-value is valid...
            X.append(xi)  # ...and if so, append coordinates and value to prepared lists
            Y.append(yk)
            Z.append(z[i, k])
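The double loop above can also be written with a boolean mask. The following is a sketch of that idea rather than part of the original answer; the names xx, yy and valid are introduced here for illustration.

```python
import numpy as np

x = [0, 1, 2, 3, 4]
y = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
z = np.array([[np.nan, np.nan, 1.5, 2.0, 5.5, 3.5],
              [np.nan, 1.0, 4.0, 2.5, 4.5, 3.0],
              [2.0, 0.5, 6.0, 1.5, 3.5, np.nan],
              [np.nan, 1.5, 4.0, 2.0, np.nan, np.nan],
              [np.nan, np.nan, 2.0, np.nan, np.nan, np.nan]])

# indexing='ij' makes xx and yy the same shape as z, i.e. (len(x), len(y)).
xx, yy = np.meshgrid(x, y, indexing='ij')

# True wherever z holds a real value.
valid = ~np.isnan(z)

# Boolean indexing flattens in row-major order, the same order as the nested loops.
X = xx[valid]
Y = yy[valid]
Z = z[valid]

print(len(Z))
```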
This way at least sp.interp2d works and gives a result:
ip = sp.interpolate.interp2d(X,Y,Z)
However, the values in the result won't please you:
In: ip(x,y)
Out:
array([[ 18.03583061, -0.44933642, 0.83333333, -1. , -1.46105542],
[ 9.76791531, 1.3014037 , 2.83333333, 1.5 , 0.26947229],
[ 1.5 , 3.05214381, 4.83333333, 4. , 2. ],
[ 2. , 3.78378051, 1.5 , 2. , 0.8364618 ],
[ 5.5 , 3.57039277, 3.5 , -0.83019815, -0.7967441 ],
[ 3.5 , 3.29227922, 17.29607177, 0. , 0. ]])
compared to the input data:
In:z
Out:
array([[ nan, nan, 1.5, 2. , 5.5, 3.5],
[ nan, 1. , 4. , 2.5, 4.5, 3. ],
[ 2. , 0.5, 6. , 1.5, 3.5, nan],
[ nan, 1.5, 4. , 2. , nan, nan],
[ nan, nan, 2. , nan, nan, nan]])
But IMHO this is because the gradients in your data change far too sharply, especially given the low number of data samples.
I hope this is just a test data set and your real application has smoother gradients and some more samples. Then I'd be glad to hear if it works...
However, a trivial test with a zero-gradient array, broken up only by a few NaNs, hints that interpolation should work, while extrapolation is only partly correct:
In:ip(x,y)
Out:
array([[ 3. , 3. , 3. , 3. , 0. ],
[ 3. , 3. , 3. , 3. , 1.94701008],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 3. , 1.54973345],
[ 3. , 3. , 3. , 3. , 0.37706713],
[ 3. , 3. , 2.32108317, 0.75435203, 0. ]])
resulting from the trivial test input
In:z
Out:
array([[ nan, nan, 3., 3., 3., 3.],
[ nan, 3., 3., nan, 3., 3.],
[ 3., 3., 3., 3., 3., nan],
[ nan, 3., 3., 3., nan, nan],
[ nan, nan, 3., nan, nan, nan]])
PS: Looking closer at the right-hand side, even some valid entries have been changed, i.e. made wrong, which would introduce errors in any subsequent analysis.
But surprise: the cubic version performs much better here:
In:ip = sp.interpolate.interp2d(X,Y,Z, kind='cubic')
In:ip(x,y)
Out:
array([[ 3. , 3. , 3. , 3.02397028, 3.0958811 ],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 2.97602972, 2.9041189 ],
[ 3. , 3. , 3. , 2.9041189 , 2.61647559]])
In:z
Out:
array([[ nan, nan, 3., 3., 3., 3.],
[ nan, 3., 3., nan, 3., 3.],
[ 3., 3., 3., 3., 3., nan],
[ nan, 3., 3., 3., nan, nan],
[ nan, nan, 3., nan, nan, nan]])
Since scipy's interp2d doesn't deal with NaNs, the solution is to fill the NaNs in the DataFrame before using interp2d. This can be done with the pandas DataFrame.interpolate method.
In the previous example, the following provides the desired output:
In [1]: from scipy.interpolate import interp2d
In [2]: df = df.interpolate(limit_direction='both', axis=1)
In [3]: myInterp = interp2d(df.index,df.columns,df.values.T)
In [4]: myInterp(1.5,2.5)
Out[4]: array([5.])
In [5]: myInterp(1.5,4.0)
Out[5]: array([3.])
In [6]: myInterp(0.0,2.0)
Out[6]: array([1.5])
In [7]: myInterp(5.0,2.5)
Out[7]: array([2.])
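Note that interp2d was deprecated in SciPy 1.10 and removed in 1.14, so the snippets above need an older SciPy. A possible alternative, sketched below under stated assumptions, treats the valid samples as scattered points: LinearNDInterpolator handles queries inside the convex hull of the valid points, and NearestNDInterpolator is the fallback wherever the linear result is NaN. The function my_interp and the other names are hypothetical. The triangulation-based result is not guaranteed to reproduce every value the question asks for, in particular for NaN cells that lie inside the convex hull.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator, NearestNDInterpolator

x = [0, 1, 2, 3, 4]
y = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
z = np.array([[np.nan, np.nan, 1.5, 2.0, 5.5, 3.5],
              [np.nan, 1.0, 4.0, 2.5, 4.5, 3.0],
              [2.0, 0.5, 6.0, 1.5, 3.5, np.nan],
              [np.nan, 1.5, 4.0, 2.0, np.nan, np.nan],
              [np.nan, np.nan, 2.0, np.nan, np.nan, np.nan]])

# Keep only the grid nodes that carry a real value.
xx, yy = np.meshgrid(x, y, indexing='ij')
valid = ~np.isnan(z)
points = np.column_stack((xx[valid], yy[valid]))
values = z[valid]

linear = LinearNDInterpolator(points, values)    # NaN outside the convex hull
nearest = NearestNDInterpolator(points, values)  # always returns a value


def my_interp(xq, yq):
    """Linear interpolation where possible, nearest valid sample otherwise."""
    query = np.array([[xq, yq]], dtype=float)
    value = linear(query)[0]
    if np.isnan(value):
        value = nearest(query)[0]
    return float(value)


print(my_interp(5.0, 2.5))  # outside the grid, so the nearest valid sample is used
```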

numpy array divide column by vector

I have a 3x3 numpy array and I want to divide each column of it by a 3x1 vector. I know how to divide each row by the elements of the vector, but I am unable to find a way to divide each column.
You can transpose your array to divide each column:
(arr_3x3.T/arr_3x1).T
Let's try several things:
In [347]: A=np.arange(9.).reshape(3,3)
In [348]: A
Out[348]:
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
In [349]: x=10**np.arange(3).reshape(3,1)
In [350]: A/x
Out[350]:
array([[ 0. , 1. , 2. ],
[ 0.3 , 0.4 , 0.5 ],
[ 0.06, 0.07, 0.08]])
So this has divided each row by a different value
In [351]: A/x.T
Out[351]:
array([[ 0. , 0.1 , 0.02],
[ 3. , 0.4 , 0.05],
[ 6. , 0.7 , 0.08]])
And this has divided each column by a different value
Dividing a (3,3) array by a (3,1) array replicates x across columns.
With the transpose, the (1,3) array is replicated across rows.
It's important that x be 2d when using .T (transpose). A (3,) array transposes to a (3,) array - that is, no change.
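The shape rules above can be sketched with a plain 1-D vector as well. This example is an addition; v, per_column and per_row are illustrative names.

```python
import numpy as np

A = np.arange(9.).reshape(3, 3)
v = np.array([1., 10., 100.])      # shape (3,)

# A (3,) vector lines up with the last axis: column j is divided by v[j].
per_column = A / v

# Adding a trailing axis gives shape (3, 1): row i is divided by v[i].
per_row = A / v[:, np.newaxis]

# The transpose trick gives the same per-column result.
via_transpose = (A.T / v[:, np.newaxis]).T

print(per_column)
print(per_row)
```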
The simplest seems to be
A = np.arange(1,10).reshape(3,3)
b=np.arange(1,4)
A/b
A will be
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
and b will be
array([1, 2, 3])
and the division will produce
array([[1. , 1. , 1. ],
[4. , 2.5, 2. ],
[7. , 4. , 3. ]])
The first column is divided by 1, the second column by 2, and the third by 3.
If I've misinterpreted your columns for rows, simply transform with .T - as C_Z_ answered above.

Curve curvature in numpy

I am measuring x,y coordinates (in cm) of an object with a special camera in fixed time intervals of 1s. I have the data in a numpy array:
a = np.array([ [ 0. , 0. ],[ 0.3 , 0. ],[ 1.25, -0.1 ],[ 2.1 , -0.9 ],[ 2.85, -2.3 ],[ 3.8 , -3.95],[ 5. , -5.75],[ 6.4 , -7.8 ],[ 8.05, -9.9 ],[ 9.9 , -11.6 ],[ 12.05, -12.85],[ 14.25, -13.7 ],[ 16.5 , -13.8 ],[ 19.25, -13.35],[ 21.3 , -12.2 ],[ 22.8 , -10.5 ],[ 23.55, -8.15],[ 22.95, -6.1 ],[ 21.35, -3.95],[ 19.1 , -1.9 ]])
And the curve looks like this:
plt.scatter(a[:,0], a[:,1])
Question:
How can I calculate the tangential and the radial acceleration vectors at each point? I found some formulas that might be relevant.
I am able to easily calculate the vx and the vy projections with np.diff(a, axis=0), but I am a numpy/python noob and it is way over my head to continue. If I could calculate the curvature at each point, my problem would also be solved. Can somebody help?
EDIT: I put together this answer off and on over a couple of hours, so I missed your latest edits indicating that you only needed curvature. Hopefully, this answer will be helpful regardless.
Other than doing some curve-fitting, our method of approximating derivatives is via finite differences. Thankfully, numpy has a gradient method that does these difference calculations for us, taking care of the details of averaging previous and next slopes for each interior point and leaving each endpoint alone, etc.
import numpy as np
a = np.array([ [ 0. , 0. ],[ 0.3 , 0. ],[ 1.25, -0.1 ],
[ 2.1 , -0.9 ],[ 2.85, -2.3 ],[ 3.8 , -3.95],
[ 5. , -5.75],[ 6.4 , -7.8 ],[ 8.05, -9.9 ],
[ 9.9 , -11.6 ],[ 12.05, -12.85],[ 14.25, -13.7 ],
[ 16.5 , -13.8 ],[ 19.25, -13.35],[ 21.3 , -12.2 ],
[ 22.8 , -10.5 ],[ 23.55, -8.15],[ 22.95, -6.1 ],
[ 21.35, -3.95],[ 19.1 , -1.9 ]])
Now, we compute the derivatives of each variable and put them together (if we just call np.gradient(a) on the 2D array, we get a list of arrays, one per axis, because np.gradient differentiates along every axis by default; here I simply differentiate each column separately):
dx_dt = np.gradient(a[:, 0])
dy_dt = np.gradient(a[:, 1])
velocity = np.array([ [dx_dt[i], dy_dt[i]] for i in range(dx_dt.size)])
This gives us the following vector for velocity:
array([[ 0.3 , 0. ],
[ 0.625, -0.05 ],
[ 0.9 , -0.45 ],
[ 0.8 , -1.1 ],
[ 0.85 , -1.525],
[ 1.075, -1.725],
[ 1.3 , -1.925],
[ 1.525, -2.075],
[ 1.75 , -1.9 ],
[ 2. , -1.475],
[ 2.175, -1.05 ],
[ 2.225, -0.475],
[ 2.5 , 0.175],
[ 2.4 , 0.8 ],
[ 1.775, 1.425],
[ 1.125, 2.025],
[ 0.075, 2.2 ],
[-1.1 , 2.1 ],
[-1.925, 2.1 ],
[-2.25 , 2.05 ]])
which makes sense when glancing at the scatterplot of a.
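The per-column gradient and the list comprehension can be collapsed into one call by passing axis=0 to np.gradient. The sketch below is an addition; to stay short it uses only the first six points of a, under the made-up name points.

```python
import numpy as np

# First six measurements from the question (x, y in cm, sampled every 1 s).
points = np.array([[0., 0.], [0.3, 0.], [1.25, -0.1],
                   [2.1, -0.9], [2.85, -2.3], [3.8, -3.95]])

# Column-by-column version, as in the answer.
dx_dt = np.gradient(points[:, 0])
dy_dt = np.gradient(points[:, 1])
velocity_columns = np.column_stack((dx_dt, dy_dt))

# Differentiate along the time axis only; the result keeps the (n, 2) shape.
velocity = np.gradient(points, axis=0)

print(np.allclose(velocity, velocity_columns))
```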
Now, for speed, we take the length of the velocity vector. However, there's one thing that we haven't really kept in mind here: everything is a function of t. Thus, ds/dt is really a scalar function of t (as opposed to a vector function of t), just like dx/dt and dy/dt. Thus, we will represent ds_dt as a numpy array of values at each of the one second time intervals, each value corresponding to an approximation of the speed at each second:
ds_dt = np.sqrt(dx_dt * dx_dt + dy_dt * dy_dt)
This yields the following array:
array([ 0.3 , 0.62699681, 1.00623059, 1.36014705, 1.74588803,
2.03254766, 2.32284847, 2.57512136, 2.58311827, 2.48508048,
2.41518633, 2.27513736, 2.50611752, 2.52982213, 2.27623593,
2.31651678, 2.20127804, 2.37065392, 2.8487936 , 3.04384625])
which, again, makes some sense as you look at the gaps between the dots on the scatterplot of a: the object picks up speed, slowing down a bit as it takes the corner, and then speeds back up even more.
Now, in order to find the unit tangent vector, we need to make a small transformation to ds_dt so that its size is the same as that of velocity (this effectively allows us to divide the vector-valued function velocity by the (representation of) the scalar function ds_dt):
tangent = np.array([1/ds_dt] * 2).transpose() * velocity
This yields the following numpy array:
array([[ 1. , 0. ],
[ 0.99681528, -0.07974522],
[ 0.89442719, -0.4472136 ],
[ 0.5881717 , -0.80873608],
[ 0.48685826, -0.87348099],
[ 0.52889289, -0.84868859],
[ 0.55965769, -0.82872388],
[ 0.5922051 , -0.80578727],
[ 0.67747575, -0.73554511],
[ 0.80480291, -0.59354215],
[ 0.90055164, -0.43474907],
[ 0.97796293, -0.2087786 ],
[ 0.99755897, 0.06982913],
[ 0.9486833 , 0.31622777],
[ 0.77979614, 0.62603352],
[ 0.48564293, 0.87415728],
[ 0.03407112, 0.99941941],
[-0.46400699, 0.88583154],
[-0.67572463, 0.73715414],
[-0.73919634, 0.67349 ]])
Note two things: 1. At each value of t, tangent is pointing in the same direction as velocity, and 2. At each value of t, tangent is a unit vector. Indeed:
In [12]: np.sqrt(tangent[:,0] * tangent[:,0] + tangent[:,1] * tangent[:,1])
Out[12]:
array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1.])
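The "make ds_dt the same size as velocity" step can also be left to broadcasting. This is a sketch of that variant, not the answer's own code; it reuses the first six points of a under the illustrative name points, and assumes the speed is never zero.

```python
import numpy as np

points = np.array([[0., 0.], [0.3, 0.], [1.25, -0.1],
                   [2.1, -0.9], [2.85, -2.3], [3.8, -3.95]])

velocity = np.gradient(points, axis=0)       # shape (n, 2)
ds_dt = np.linalg.norm(velocity, axis=1)     # speed, shape (n,)

# ds_dt[:, np.newaxis] has shape (n, 1) and broadcasts across both components.
tangent = velocity / ds_dt[:, np.newaxis]

# The construction used in the answer, for comparison.
tangent_original = np.array([1 / ds_dt] * 2).transpose() * velocity

print(np.allclose(tangent, tangent_original))
print(np.linalg.norm(tangent, axis=1))
```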
Now, since we take the derivative of the tangent vector and divide by its length to get the unit normal vector, we do the same trick (isolating the components of tangent for convenience):
tangent_x = tangent[:, 0]
tangent_y = tangent[:, 1]
deriv_tangent_x = np.gradient(tangent_x)
deriv_tangent_y = np.gradient(tangent_y)
dT_dt = np.array([ [deriv_tangent_x[i], deriv_tangent_y[i]] for i in range(deriv_tangent_x.size)])
length_dT_dt = np.sqrt(deriv_tangent_x * deriv_tangent_x + deriv_tangent_y * deriv_tangent_y)
normal = np.array([1/length_dT_dt] * 2).transpose() * dT_dt
This gives us the following vector for normal:
array([[-0.03990439, -0.9992035 ],
[-0.22975292, -0.97324899],
[-0.48897562, -0.87229745],
[-0.69107645, -0.72278167],
[-0.8292422 , -0.55888941],
[ 0.85188045, 0.52373629],
[ 0.8278434 , 0.56095927],
[ 0.78434982, 0.62031876],
[ 0.70769355, 0.70651953],
[ 0.59568265, 0.80321988],
[ 0.41039706, 0.91190693],
[ 0.18879684, 0.98201617],
[-0.05568352, 0.99844847],
[-0.36457012, 0.93117594],
[-0.63863584, 0.76950911],
[-0.89417603, 0.44771557],
[-0.99992445, 0.0122923 ],
[-0.93801622, -0.34659137],
[-0.79170904, -0.61089835],
[-0.70603568, -0.70817626]])
Note that the normal vector represents the direction in which the curve is turning. The vector above then makes sense when viewed in conjunction with the scatterplot for a. In particular, we go from turning down to turning up after the fifth point, and we start turning to the left (with respect to the x axis) after the 12th point.
Finally, to get the tangential and normal components of acceleration, we need the second derivatives of s, x, and y with respect to t, and then we can get the curvature and the rest of our components (keeping in mind that they are all scalar functions of t):
d2s_dt2 = np.gradient(ds_dt)
d2x_dt2 = np.gradient(dx_dt)
d2y_dt2 = np.gradient(dy_dt)
curvature = np.abs(d2x_dt2 * dy_dt - dx_dt * d2y_dt2) / (dx_dt * dx_dt + dy_dt * dy_dt)**1.5
t_component = np.array([d2s_dt2] * 2).transpose()
n_component = np.array([curvature * ds_dt * ds_dt] * 2).transpose()
acceleration = t_component * tangent + n_component * normal
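The scalar quantities from this last step can be gathered into one helper and sanity-checked on a curve whose curvature is known. The sketch below is an addition under stated assumptions: the helper name curvature_and_components is hypothetical, samples are assumed to be equally spaced in time with nonzero speed, and the check uses a circle of radius 2, whose curvature should be 1/2 away from the endpoints (where np.gradient falls back to one-sided differences).

```python
import numpy as np


def curvature_and_components(points):
    """Return (curvature, tangential accel, normal accel) for an (n, 2) array
    of positions sampled at unit time steps. Hypothetical helper."""
    velocity = np.gradient(points, axis=0)
    accel = np.gradient(velocity, axis=0)
    ds_dt = np.linalg.norm(velocity, axis=1)

    # |x'' y' - x' y''| / (x'^2 + y'^2)^(3/2), the same formula as in the answer.
    cross = accel[:, 0] * velocity[:, 1] - velocity[:, 0] * accel[:, 1]
    curvature = np.abs(cross) / ds_dt ** 3

    a_tangential = np.gradient(ds_dt)        # d2s/dt2
    a_normal = curvature * ds_dt ** 2        # kappa * (ds/dt)^2
    return curvature, a_tangential, a_normal


# Sanity check on a circle of radius 2 traversed at constant angular speed.
t = np.arange(50) * 0.1
circle = np.column_stack((2.0 * np.cos(t), 2.0 * np.sin(t)))
curvature, a_t, a_n = curvature_and_components(circle)

print(curvature[2:-2])
```

On the circle, the interior tangential component comes out as zero and the curvature as 0.5, which is what uniform circular motion should give.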
