I have a 3d array that has shape (2, 10, 3) and a 2d array that has shape (2, 3) like this:
print(t) #2d array
Output:
[[1.003 2.32 3.11 ]
[1.214 5.32 2.13241]]
print(normal) #3d array
Output:
[[[0.69908573 0.0826756 0.84485978]
[0.51058213 0.4052637 0.5068118 ]
[0.45974276 0.25819549 0.10780089]
[0.27484999 0.33367648 0.128262 ]
[0.35963389 0.77600065 0.89393939]
[0.46937506 0.59291623 0.06620307]
[0.87603987 0.44414505 0.83394174]
[0.83186093 0.62491876 0.38160734]
[0.96819897 0.80183442 0.75102768]
[0.54182908 0.19403844 0.07925769]]
[[2.82248573 3.2341756 0.96825978]
[2.63398213 3.5567637 0.6302118 ]
[2.58314276 3.40969549 0.23120089]
[2.39824999 3.48517648 0.251662 ]
[2.48303389 3.92750065 1.01733939]
[2.59277506 3.74441623 0.18960307]
[2.99943987 3.59564505 0.95734174]
[2.95526093 3.77641876 0.50500734]
[3.09159897 3.95333442 0.87442768]
[2.66522908 3.34553844 0.20265769]]]
How can I get each row in the 2d array t to get the corresponding dot product in the 3d array normal such that the array I end up with a shape (2, 10) where each contains all 10 dot products between the nth row in 2d array and nth matrix in 3d array?
[0.62096458 0.62618459 0.37528887 0.5728386 1.19634398 0.79620507
1.997884 0.75229492 1.2236496 0.4210626 ]
[2.96347746 3.30738892 3.50596579 4.93082295 5.33811805 4.44872493
7.33480393 4.19173472 4.7406248 7.83229689]
You can use numpy.einsum (np.einsum('ijk,ik->ij', t, normal)) to get this result:
import numpy as np
normal = np.array([
[1.003,2.32,3.11],
[1.214,5.32,2.13241]
])
t = np.array([
[
[0.69908573, 0.0826756, 0.84485978],
[0.51058213, 0.4052637, 0.5068118 ],
[0.45974276, 0.25819549, 0.10780089],
[0.27484999, 0.33367648, 0.128262 ],
[0.35963389, 0.77600065, 0.89393939],
[0.46937506, 0.59291623, 0.06620307],
[0.87603987, 0.44414505, 0.83394174],
[0.83186093, 0.62491876, 0.38160734],
[0.96819897, 0.80183442, 0.75102768],
[0.54182908, 0.19403844, 0.07925769]
],
[
[2.82248573, 3.2341756, 0.96825978],
[2.63398213, 3.5567637, 0.6302118 ],
[2.58314276, 3.40969549, 0.23120089],
[2.39824999, 3.48517648, 0.251662 ],
[2.48303389, 3.92750065, 1.01733939],
[2.59277506, 3.74441623, 0.18960307],
[2.99943987, 3.59564505, 0.95734174],
[2.95526093, 3.77641876, 0.50500734],
[3.09159897, 3.95333442, 0.87442768],
[2.66522908, 3.34553844, 0.20265769]
]
])
np.einsum('ijk,ik->ij', t, normal)
This results in
array([[ 3.52050429, 3.02851036, 1.39539629, 1.44869879, 4.9411858 ,
2.05224039, 4.50264332, 3.47096686, 5.16705551, 1.24011516],
[22.69703871, 23.46350713, 21.76853041, 21.98926093, 26.07809129,
23.47223475, 24.81159677, 24.75511727, 26.64957859, 21.46600189]])
Which is the same as doing the two multiplications in order:
t[0] # normal[0]
t[1] # normal[1]
Gives the two:
array([3.52050429, 3.02851036, 1.39539629, 1.44869879, 4.9411858 ,
2.05224039, 4.50264332, 3.47096686, 5.16705551, 1.24011516])
array([22.69703871, 23.46350713, 21.76853041, 21.98926093, 26.07809129,
23.47223475, 24.81159677, 24.75511727, 26.64957859, 21.46600189])
I want a tensorflow function, which accepts a 3D matrix and an array ( shape of the array is similar to the first dimension of a 3D matrix ) and I want to slice the elements from each 2D matrix inside the 3D matrix based on the given array. The equivalent numpy looks like as follows. The basic idea is to picking all hidden states of each input in a batch ( avoid the padded ) in a dynamic rnn
import numpy as np
a = np.random.uniform(-1,1,(3,5,7))
a_length = np.random.randint(5,size=(3))
a_tf = tf.convert_to_tensor(a)
a_length_tf = tf.convert_to_tensor(a_length)
res = []
for index, length_ in enumerate(a_length):
res.extend(a[index,:length_,:])
res = np.array(res)
Output
print(a_length)
array([1, 4, 4])
print(res)
array([[-0.060161 , 0.36000953, 0.46160677, -0.66576281, 0.28562044,
-0.60026872, 0.08034777],
[ 0.04776443, 0.38018207, -0.73352382, 0.61847258, -0.89731857,
0.57264147, -0.88192537],
[ 0.92657628, 0.6236141 , 0.41977008, 0.88720247, 0.44639323,
0.26165976, 0.2678753 ],
[-0.78125831, 0.76756136, -0.05716537, -0.64696257, 0.48918477,
0.15376225, -0.41974593],
[-0.625326 , 0.3509537 , -0.7884495 , 0.11773297, 0.23713942,
0.30296786, 0.12932378],
[ 0.88413986, -0.10958306, 0.9745586 , 0.8975006 , 0.23023047,
-0.89991669, -0.60032688],
[ 0.33462775, 0.62883724, -0.81839566, -0.70312966, -0.00246936,
-0.95542994, -0.33035891],
[-0.26355579, -0.58104982, -0.54748412, -0.30236209, -0.74270132,
0.46329941, 0.34277915],
[ 0.92837516, -0.06748299, 0.32837354, -0.62863672, 0.86226447,
0.63604586, 0.0905248 ]])
print(a)
array([[[-0.060161 , 0.36000953, 0.46160677, -0.66576281,
0.28562044, -0.60026872, 0.08034777],
[ 0.26379226, 0.67066755, -0.90139221, -0.86862163,
0.36405595, 0.71342926, -0.1265208 ],
[ 0.15007877, 0.82065234, 0.03984378, -0.20038364,
-0.09945102, 0.71605241, -0.55865999],
[ 0.27132257, -0.84289149, -0.15493576, 0.74683429,
-0.71159896, 0.50397217, -0.99025404],
[ 0.51546368, 0.45460343, 0.87519031, 0.0332339 ,
-0.53474897, -0.01733648, -0.02886814]],
[[ 0.04776443, 0.38018207, -0.73352382, 0.61847258,
-0.89731857, 0.57264147, -0.88192537],
[ 0.92657628, 0.6236141 , 0.41977008, 0.88720247,
0.44639323, 0.26165976, 0.2678753 ],
[-0.78125831, 0.76756136, -0.05716537, -0.64696257,
0.48918477, 0.15376225, -0.41974593],
[-0.625326 , 0.3509537 , -0.7884495 , 0.11773297,
0.23713942, 0.30296786, 0.12932378],
[ 0.44550219, -0.38828221, 0.35684203, 0.789946 ,
-0.8763921 , 0.90155917, -0.75549455]],
[[ 0.88413986, -0.10958306, 0.9745586 , 0.8975006 ,
0.23023047, -0.89991669, -0.60032688],
[ 0.33462775, 0.62883724, -0.81839566, -0.70312966,
-0.00246936, -0.95542994, -0.33035891],
[-0.26355579, -0.58104982, -0.54748412, -0.30236209,
-0.74270132, 0.46329941, 0.34277915],
[ 0.92837516, -0.06748299, 0.32837354, -0.62863672,
0.86226447, 0.63604586, 0.0905248 ],
[ 0.70272633, 0.17122912, -0.58209965, 0.55557024,
-0.46295566, -0.33845157, -0.62254313]]])
Here is a way to do that using tf.boolean_mask:
import tensorflow as tf
import numpy as np
# NumPy/Python implementation
a = np.random.uniform(-1,1,(3,5,7)).astype(np.float32)
a_length = np.random.randint(5,size=(3)).astype(np.int32)
res = []
for index, length_ in enumerate(a_length):
res.extend(a[index,:length_,:])
res = np.array(res)
# TensorFlow implementation
a_tf = tf.convert_to_tensor(a)
a_length_tf = tf.convert_to_tensor(a_length)
# Make a mask for all wanted elements
mask = tf.range(tf.shape(a)[1]) < a_length_tf[:, tf.newaxis]
# Apply mask
res_tf = tf.boolean_mask(a_tf, mask)
# Test
with tf.Session() as sess:
print(np.allclose(sess.run(res_tf), res))
Output:
True
I want to interpolate from x onto z. But there's a caveat:
Depending on a state y, I have a different xGrid - which i need to interpolate.
I have a grid for y, yGrid. Say yGrid=[0,1]. And xGrid is given by
1 10
2 20
3 30
The corresponding zGrid, is
100 1000
200 2000
300 3000
This means that for y=0, [1,2,3] is the proper grid for x, and for y=1, [10,20,30] is the proper grid. And similar for z.
Everything is linear and even-spaced for demonstration of the problem, but it is not in the actual data.
In words,
if y=0, x=1.5, z is the interpolation of [1,2,3] onto [100, 200, 300] at 1.5 - which is 150.
If y=1, x=10, z=1000
Here's the problem: What if is y=0.5? In this simple case, I want the interpolated grids to be [5.5, 11, 33/2] and [550, 1100, 1650], so x=10 would be something close to 1000.
It appears to me, that I need to interpolate 3 times:
twice to get the correct xGrid, and zGrid, and
once to interpolate xGrid-> xGrid
This is part of a bottleneck and efficiency is vital. How do I code this most efficiently?
Here is how I can code it quite inefficiently:
xGrid = np.array([[1, 10], [2, 20], [3, 30]])
zGrid = np.array([[100, 1000], [200, 2000], [300, 3000]])
yGrid = np.array([0, 1])
yValue = 0.5
xInterpolated = np.zeros(xGrid.shape[0])
zInterpolated = np.zeros(zGrid.shape[0])
for i in np.arange(xGrid.shape[0]):
f1 = interpolate.interp1d(pGrid, xGrid[i,:])
f2 = interpolate.interp1d(pGrid, zGrid[i,:])
xInterpolated[i] = f1(yValue)
zInterpolated[i] = f2(yValue)
f3 = interpolate.interp1d(xInterpolated, zInterpolated)
And the output is
In[73]: xInterpolated, zInterpolated
Out[73]: (array([ 5.5, 11. , 16.5]), array([ 550., 1100., 1650.]))
In[75]: f3(10)
Out[75]: array(1000.0)
Actual use-case data
xGrid:
array([[ 0.30213582, 0.42091889, 0.48596506, 0.55045007,
0.61479495, 0.67906768, 0.74328653, 0.8074641 ,
0.8716093 , 0.93572867, 0.99982708, 1.06390825,
1.12797508, 1.19202984, 1.25607435, 1.32011008,
1.38413823, 1.44815978, 1.51217558, 1.57618631],
[ 1.09945362, 1.17100971, 1.23588956, 1.30034354,
1.36467675, 1.42894086, 1.49315319, 1.55732567,
1.62146685, 1.68558297, 1.74967873, 1.8137577 ,
1.87782269, 1.94187589, 2.00591907, 2.06995365,
2.1339808 , 2.1980015 , 2.26201653, 2.32602659],
[ 1.96474476, 2.03281806, 2.09757883, 2.16200519,
2.22632562, 2.29058026, 2.35478537, 2.41895223,
2.48308893, 2.54720144, 2.61129424, 2.67537076,
2.73943368, 2.80348513, 2.86752681, 2.93156011,
2.99558615, 3.05960586, 3.12362004, 3.18762935],
[ 2.97271432, 3.03917779, 3.10382629, 3.16822546,
3.23253177, 3.29677589, 3.36097295, 3.42513351,
3.48926519, 3.55337363, 3.61746308, 3.68153682,
3.74559741, 3.80964688, 3.87368686, 3.93771869,
4.00174345, 4.06576206, 4.12977526, 4.1937837 ],
[ 4.17324037, 4.23880534, 4.30336811, 4.36773934,
4.43202986, 4.49626215, 4.56045011, 4.62460351,
4.68872947, 4.75283326, 4.81691888, 4.88098942,
4.94504732, 5.0090945 , 5.07313252, 5.13716266,
5.20118595, 5.26520326, 5.32921533, 5.39322276],
[ 5.64337535, 5.70841895, 5.77290336, 5.83724805,
5.90152063, 5.96573939, 6.02991687, 6.094062 ,
6.15818132, 6.22227969, 6.28636083, 6.35042763,
6.41448236, 6.47852685, 6.54256256, 6.60659069,
6.67061223, 6.73462802, 6.79863874, 6.86264497],
[ 7.51378714, 7.57851747, 7.6429358 , 7.70725236,
7.77150412, 7.83570702, 7.89987216, 7.9640075 ,
8.0281189 , 8.09221078, 8.15628654, 8.22034883,
8.28439974, 8.34844097, 8.41247386, 8.47649955,
8.54051897, 8.60453289, 8.66854195, 8.73254673],
[ 10.03324294, 10.09777483, 10.162134 , 10.22641722,
10.29064401, 10.35482771, 10.41897777, 10.48310105,
10.54720264, 10.61128646, 10.67535549, 10.73941211,
10.80345821, 10.8674953 , 10.93152463, 10.99554722,
11.05956392, 11.12357544, 11.1875824 , 11.25158529],
[ 13.77079831, 13.83519161, 13.89949459, 13.96373623,
14.02793138, 14.09209044, 14.15622093, 14.2203284 ,
14.28441705, 14.34849012, 14.41255015, 14.47659914,
14.54063872, 14.6046702 , 14.66869465, 14.73271299,
14.79672596, 14.86073419, 14.92473821, 14.9887385 ],
[ 20.60440125, 20.66868421, 20.7329108 , 20.79709436,
20.8612443 , 20.92536747, 20.98946899, 21.05355274,
21.11762172, 21.1816783 , 21.24572435, 21.30976141,
21.37379071, 21.43781328, 21.50182995, 21.56584146,
21.6298484 , 21.69385127, 21.75785053, 21.82184654]])
zGrid:
array([[ 0.30213582, 0.42091889, 0.48596506, 0.55045007, 0.61479495,
0.67906768, 0.74328653, 0.8074641 , 0.8716093 , 0.93572867,
0.99982708, 1.06390825, 1.12797508, 1.19202984, 1.25607435,
1.32011008, 1.38413823, 1.44815978, 1.51217558, 1.57618631],
[ 0.35871288, 0.43026897, 0.49514882, 0.5596028 , 0.62393601,
0.68820012, 0.75241245, 0.81658493, 0.88072611, 0.94484223,
1.00893799, 1.07301696, 1.13708195, 1.20113515, 1.26517833,
1.32921291, 1.39324006, 1.45726076, 1.52127579, 1.58528585],
[ 0.37285697, 0.44093027, 0.50569104, 0.5701174 , 0.63443782,
0.69869247, 0.76289758, 0.82706444, 0.89120114, 0.95531365,
1.01940644, 1.08348296, 1.14754589, 1.21159734, 1.27563902,
1.33967232, 1.40369835, 1.46771807, 1.53173225, 1.59574155],
[ 0.38688189, 0.45334537, 0.51799386, 0.58239303, 0.64669934,
0.71094347, 0.77514053, 0.83930108, 0.90343277, 0.96754121,
1.03163066, 1.0957044 , 1.15976498, 1.22381445, 1.28785443,
1.35188626, 1.41591103, 1.47992963, 1.54394284, 1.60795127],
[ 0.40252392, 0.46808889, 0.53265166, 0.59702289, 0.66131341,
0.7255457 , 0.78973366, 0.85388706, 0.91801302, 0.98211681,
1.04620243, 1.11027297, 1.17433087, 1.23837805, 1.30241607,
1.36644621, 1.4304695 , 1.49448681, 1.55849888, 1.62250631],
[ 0.42106765, 0.48611125, 0.55059566, 0.61494035, 0.67921293,
0.74343169, 0.80760917, 0.87175431, 0.93587362, 0.99997199,
1.06405313, 1.12811993, 1.19217466, 1.25621915, 1.32025486,
1.38428299, 1.44830454, 1.51232032, 1.57633104, 1.64033728],
[ 0.4442679 , 0.50899823, 0.57341657, 0.63773312, 0.70198488,
0.76618779, 0.83035293, 0.89448826, 0.95859966, 1.02269154,
1.08676731, 1.15082959, 1.21488051, 1.27892173, 1.34295463,
1.40698032, 1.47099973, 1.53501365, 1.59902272, 1.66302749],
[ 0.47525152, 0.53978341, 0.60414258, 0.6684258 , 0.73265259,
0.79683629, 0.86098635, 0.92510963, 0.98921122, 1.05329504,
1.11736407, 1.18142069, 1.24546679, 1.30950388, 1.37353321,
1.4375558 , 1.5015725 , 1.56558403, 1.62959098, 1.69359387],
[ 0.52099935, 0.58539265, 0.64969564, 0.71393728, 0.77813242,
0.84229149, 0.90642197, 0.97052944, 1.03461809, 1.09869116,
1.16275119, 1.22680018, 1.29083976, 1.35487124, 1.4188957 ,
1.48291403, 1.546927 , 1.61093523, 1.67493926, 1.73893954],
[ 0.60440125, 0.66868421, 0.7329108 , 0.79709436, 0.8612443 ,
0.92536747, 0.98946899, 1.05355274, 1.11762172, 1.1816783 ,
1.24572435, 1.30976141, 1.37379071, 1.43781328, 1.50182995,
1.56584146, 1.6298484 , 1.69385127, 1.75785053, 1.82184654]])
yGrid:
array([ 1. , 6.21052632, 11.42105263, 16.63157895,
21.84210526, 27.05263158, 32.26315789, 37.47368421,
42.68421053, 47.89473684, 53.10526316, 58.31578947,
63.52631579, 68.73684211, 73.94736842, 79.15789474,
84.36842105, 89.57894737, 94.78947368, 100. ])
I've created the interpolater following the given answer, and then interpolated some points:
yGrid = yGrid + np.zeros(xGrid.shape)
f3 = interpolate.interp2d(xGrid,yGrid,zGrid,kind='linear')
import matplotlib.pyplot as plt
plt.plot(np.linspace(0.001, 5, 100), [f3(y, 2) for y in np.linspace(0.001, 5, 100)])
plt.plot(xGrid[:, 1], zGrid[:, 1])
plt.plot(xGrid[:, 0], zGrid[:, 0])
And here's the output:
The blue line is the interpolated one. I am worried that for very small values of x, it should be tilted downwards a bit (following the weighted average of the two functions), but it is not at all.
You're actually looking at 2d interpolation: you need z(x,y) with interpolated values of x and y. The only subtlety is that you need to broadcast yGrid to have the same shape as the x and z data:
import scipy.interpolate as interpolate
xGrid = np.array([[1, 10], [2, 20], [3, 30]])
zGrid = np.array([[100, 1000], [200, 2000], [300, 3000]])
yGrid = np.array([0, 1]) + np.zeros(xGrid.shape)
yValue = 0.5
f3 = interpolate.interp2d(xGrid,yGrid,zGrid,kind='linear')
This is a bivariate function, you can call it as
In [372]: f3(10,yValue)
Out[372]: array([ 1000.])
You can turn it into a univariate function returning a scalar by using a lambda:
f4 = lambda x,y=yValue: f3(x,y)[0]
this will return a single value for your (assumedly) single y value, which is set to be yValue at the moment of the lambda definition. Use it like so:
In [376]: f4(10)
Out[376]: 1000.0
However, the general f3 function might be more suited to your problem, as you can dynamically change the value of y according to your needs, and can use array input to obtain array output for z.
Update
For oddly shaped x,y data, interp2d might give unsatisfactory results, especially at the borders of the grid. So another approach is using interpolate.LinearNDInterpolator instead, which is based on a triangulation of your input data, inherently giving a local piecewise linear approximation
f4 = interpolate.LinearNDInterpolator((xGrid.flatten(),yGrid.flatten()),zGrid.flatten())
With your update data set:
plt.figure()
plt.plot(np.linspace(0.001, 5, 100), f4(np.linspace(0.001, 5, 100), 2))
plt.plot(xGrid[:, 0], zGrid[:, 0])
plt.plot(xGrid[:, 1], zGrid[:, 1])
Note that this interpolation also has its drawbacks. I suggest plotting both interpolated functions as a surface and looking at how they are distorted compared to your original data:
from mpl_toolkits.mplot3d import Axes3D
xx,yy=(np.linspace(0,10,20),np.linspace(0,20,40))
xxs,yys=np.meshgrid(xx,yy)
zz3=f3(xx,yy) #from interp2d
zz4=f4(xxs,yys) #from LinearNDInterpolator
#plot raw data
hf=plt.figure()
ax=hf.add_subplot(111,projection='3d')
ax.plot_surface(xGrid,yGrid,zGrid,rstride=1,cstride=1)
plt.draw()
#plot interp2d case
hf=plt.figure()
ax=hf.add_subplot(111,projection='3d')
ax.plot_surface(xxs,yys,zz3,rstride=1,cstride=1)
plt.draw()
#plot LinearNDInterpolator case
hf=plt.figure()
ax=hf.add_subplot(111,projection='3d')
ax.plot_surface(xxs,yys,f4(xxs,yys),rstride=1,cstride=1)
plt.draw()
This will allow you to rotate the surfaces around and see what kind of artifacts they contain (with an appropriate backend).