Iterated Interpolation: First interpolate grids, then interpolate value - python

I want to interpolate from x onto z. But there's a caveat:
Depending on a state y, I have a different xGrid, which I need to interpolate.
I have a grid for y, yGrid. Say yGrid=[0,1]. And xGrid is given by
1 10
2 20
3 30
The corresponding zGrid, is
100 1000
200 2000
300 3000
This means that for y=0, [1,2,3] is the proper grid for x, and for y=1, [10,20,30] is the proper grid. And similar for z.
Everything is linear and even-spaced for demonstration of the problem, but it is not in the actual data.
In words,
if y=0, x=1.5, z is the interpolation of [1,2,3] onto [100, 200, 300] at 1.5 - which is 150.
If y=1, x=10, z=1000
Here's the problem: what if y=0.5? In this simple case, I want the interpolated grids to be [5.5, 11, 16.5] and [550, 1100, 1650], so x=10 would give something close to 1000.
It appears to me that I need to interpolate 3 times:
twice to get the correct xGrid and zGrid, and
once to interpolate xGrid -> zGrid
This is part of a bottleneck and efficiency is vital. How do I code this most efficiently?
Here is how I can code it quite inefficiently:
import numpy as np
from scipy import interpolate
xGrid = np.array([[1, 10], [2, 20], [3, 30]])
zGrid = np.array([[100, 1000], [200, 2000], [300, 3000]])
yGrid = np.array([0, 1])
yValue = 0.5
xInterpolated = np.zeros(xGrid.shape[0])
zInterpolated = np.zeros(zGrid.shape[0])
for i in np.arange(xGrid.shape[0]):
    # interpolate row i of each grid along y
    f1 = interpolate.interp1d(yGrid, xGrid[i,:])
    f2 = interpolate.interp1d(yGrid, zGrid[i,:])
    xInterpolated[i] = f1(yValue)
    zInterpolated[i] = f2(yValue)
# interpolate z over x on the y-interpolated grids
f3 = interpolate.interp1d(xInterpolated, zInterpolated)
And the output is
In[73]: xInterpolated, zInterpolated
Out[73]: (array([ 5.5, 11. , 16.5]), array([ 550., 1100., 1650.]))
In[75]: f3(10)
Out[75]: array(1000.0)
Actual use-case data
xGrid:
array([[ 0.30213582, 0.42091889, 0.48596506, 0.55045007,
0.61479495, 0.67906768, 0.74328653, 0.8074641 ,
0.8716093 , 0.93572867, 0.99982708, 1.06390825,
1.12797508, 1.19202984, 1.25607435, 1.32011008,
1.38413823, 1.44815978, 1.51217558, 1.57618631],
[ 1.09945362, 1.17100971, 1.23588956, 1.30034354,
1.36467675, 1.42894086, 1.49315319, 1.55732567,
1.62146685, 1.68558297, 1.74967873, 1.8137577 ,
1.87782269, 1.94187589, 2.00591907, 2.06995365,
2.1339808 , 2.1980015 , 2.26201653, 2.32602659],
[ 1.96474476, 2.03281806, 2.09757883, 2.16200519,
2.22632562, 2.29058026, 2.35478537, 2.41895223,
2.48308893, 2.54720144, 2.61129424, 2.67537076,
2.73943368, 2.80348513, 2.86752681, 2.93156011,
2.99558615, 3.05960586, 3.12362004, 3.18762935],
[ 2.97271432, 3.03917779, 3.10382629, 3.16822546,
3.23253177, 3.29677589, 3.36097295, 3.42513351,
3.48926519, 3.55337363, 3.61746308, 3.68153682,
3.74559741, 3.80964688, 3.87368686, 3.93771869,
4.00174345, 4.06576206, 4.12977526, 4.1937837 ],
[ 4.17324037, 4.23880534, 4.30336811, 4.36773934,
4.43202986, 4.49626215, 4.56045011, 4.62460351,
4.68872947, 4.75283326, 4.81691888, 4.88098942,
4.94504732, 5.0090945 , 5.07313252, 5.13716266,
5.20118595, 5.26520326, 5.32921533, 5.39322276],
[ 5.64337535, 5.70841895, 5.77290336, 5.83724805,
5.90152063, 5.96573939, 6.02991687, 6.094062 ,
6.15818132, 6.22227969, 6.28636083, 6.35042763,
6.41448236, 6.47852685, 6.54256256, 6.60659069,
6.67061223, 6.73462802, 6.79863874, 6.86264497],
[ 7.51378714, 7.57851747, 7.6429358 , 7.70725236,
7.77150412, 7.83570702, 7.89987216, 7.9640075 ,
8.0281189 , 8.09221078, 8.15628654, 8.22034883,
8.28439974, 8.34844097, 8.41247386, 8.47649955,
8.54051897, 8.60453289, 8.66854195, 8.73254673],
[ 10.03324294, 10.09777483, 10.162134 , 10.22641722,
10.29064401, 10.35482771, 10.41897777, 10.48310105,
10.54720264, 10.61128646, 10.67535549, 10.73941211,
10.80345821, 10.8674953 , 10.93152463, 10.99554722,
11.05956392, 11.12357544, 11.1875824 , 11.25158529],
[ 13.77079831, 13.83519161, 13.89949459, 13.96373623,
14.02793138, 14.09209044, 14.15622093, 14.2203284 ,
14.28441705, 14.34849012, 14.41255015, 14.47659914,
14.54063872, 14.6046702 , 14.66869465, 14.73271299,
14.79672596, 14.86073419, 14.92473821, 14.9887385 ],
[ 20.60440125, 20.66868421, 20.7329108 , 20.79709436,
20.8612443 , 20.92536747, 20.98946899, 21.05355274,
21.11762172, 21.1816783 , 21.24572435, 21.30976141,
21.37379071, 21.43781328, 21.50182995, 21.56584146,
21.6298484 , 21.69385127, 21.75785053, 21.82184654]])
zGrid:
array([[ 0.30213582, 0.42091889, 0.48596506, 0.55045007, 0.61479495,
0.67906768, 0.74328653, 0.8074641 , 0.8716093 , 0.93572867,
0.99982708, 1.06390825, 1.12797508, 1.19202984, 1.25607435,
1.32011008, 1.38413823, 1.44815978, 1.51217558, 1.57618631],
[ 0.35871288, 0.43026897, 0.49514882, 0.5596028 , 0.62393601,
0.68820012, 0.75241245, 0.81658493, 0.88072611, 0.94484223,
1.00893799, 1.07301696, 1.13708195, 1.20113515, 1.26517833,
1.32921291, 1.39324006, 1.45726076, 1.52127579, 1.58528585],
[ 0.37285697, 0.44093027, 0.50569104, 0.5701174 , 0.63443782,
0.69869247, 0.76289758, 0.82706444, 0.89120114, 0.95531365,
1.01940644, 1.08348296, 1.14754589, 1.21159734, 1.27563902,
1.33967232, 1.40369835, 1.46771807, 1.53173225, 1.59574155],
[ 0.38688189, 0.45334537, 0.51799386, 0.58239303, 0.64669934,
0.71094347, 0.77514053, 0.83930108, 0.90343277, 0.96754121,
1.03163066, 1.0957044 , 1.15976498, 1.22381445, 1.28785443,
1.35188626, 1.41591103, 1.47992963, 1.54394284, 1.60795127],
[ 0.40252392, 0.46808889, 0.53265166, 0.59702289, 0.66131341,
0.7255457 , 0.78973366, 0.85388706, 0.91801302, 0.98211681,
1.04620243, 1.11027297, 1.17433087, 1.23837805, 1.30241607,
1.36644621, 1.4304695 , 1.49448681, 1.55849888, 1.62250631],
[ 0.42106765, 0.48611125, 0.55059566, 0.61494035, 0.67921293,
0.74343169, 0.80760917, 0.87175431, 0.93587362, 0.99997199,
1.06405313, 1.12811993, 1.19217466, 1.25621915, 1.32025486,
1.38428299, 1.44830454, 1.51232032, 1.57633104, 1.64033728],
[ 0.4442679 , 0.50899823, 0.57341657, 0.63773312, 0.70198488,
0.76618779, 0.83035293, 0.89448826, 0.95859966, 1.02269154,
1.08676731, 1.15082959, 1.21488051, 1.27892173, 1.34295463,
1.40698032, 1.47099973, 1.53501365, 1.59902272, 1.66302749],
[ 0.47525152, 0.53978341, 0.60414258, 0.6684258 , 0.73265259,
0.79683629, 0.86098635, 0.92510963, 0.98921122, 1.05329504,
1.11736407, 1.18142069, 1.24546679, 1.30950388, 1.37353321,
1.4375558 , 1.5015725 , 1.56558403, 1.62959098, 1.69359387],
[ 0.52099935, 0.58539265, 0.64969564, 0.71393728, 0.77813242,
0.84229149, 0.90642197, 0.97052944, 1.03461809, 1.09869116,
1.16275119, 1.22680018, 1.29083976, 1.35487124, 1.4188957 ,
1.48291403, 1.546927 , 1.61093523, 1.67493926, 1.73893954],
[ 0.60440125, 0.66868421, 0.7329108 , 0.79709436, 0.8612443 ,
0.92536747, 0.98946899, 1.05355274, 1.11762172, 1.1816783 ,
1.24572435, 1.30976141, 1.37379071, 1.43781328, 1.50182995,
1.56584146, 1.6298484 , 1.69385127, 1.75785053, 1.82184654]])
yGrid:
array([ 1. , 6.21052632, 11.42105263, 16.63157895,
21.84210526, 27.05263158, 32.26315789, 37.47368421,
42.68421053, 47.89473684, 53.10526316, 58.31578947,
63.52631579, 68.73684211, 73.94736842, 79.15789474,
84.36842105, 89.57894737, 94.78947368, 100. ])
I've created the interpolator following the given answer, and then interpolated some points:
yGrid = yGrid + np.zeros(xGrid.shape)
f3 = interpolate.interp2d(xGrid,yGrid,zGrid,kind='linear')
import matplotlib.pyplot as plt
plt.plot(np.linspace(0.001, 5, 100), [f3(y, 2) for y in np.linspace(0.001, 5, 100)])
plt.plot(xGrid[:, 1], zGrid[:, 1])
plt.plot(xGrid[:, 0], zGrid[:, 0])
And here's the output:
The blue line is the interpolated one. I am worried that for very small values of x, it should be tilted downwards a bit (following the weighted average of the two functions), but it is not at all.

You're actually looking at 2d interpolation: you need z(x,y) with interpolated values of x and y. The only subtlety is that you need to broadcast yGrid to have the same shape as the x and z data:
import numpy as np
import scipy.interpolate as interpolate
xGrid = np.array([[1, 10], [2, 20], [3, 30]])
zGrid = np.array([[100, 1000], [200, 2000], [300, 3000]])
yGrid = np.array([0, 1]) + np.zeros(xGrid.shape)
yValue = 0.5
f3 = interpolate.interp2d(xGrid,yGrid,zGrid,kind='linear')
This is a bivariate function, you can call it as
In [372]: f3(10,yValue)
Out[372]: array([ 1000.])
You can turn it into a univariate function returning a scalar by using a lambda:
f4 = lambda x,y=yValue: f3(x,y)[0]
This will return a single value for your (presumably) single y value, which is fixed to yValue at the moment the lambda is defined. Use it like so:
In [376]: f4(10)
Out[376]: 1000.0
However, the general f3 function might be more suited to your problem, as you can dynamically change the value of y according to your needs, and can use array input to obtain array output for z.
Update
For oddly shaped x,y data, interp2d might give unsatisfactory results, especially at the borders of the grid. Another approach is to use interpolate.LinearNDInterpolator instead, which is based on a triangulation of your input data and inherently gives a local piecewise linear approximation:
f4 = interpolate.LinearNDInterpolator((xGrid.flatten(),yGrid.flatten()),zGrid.flatten())
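A minimal sketch of how this interpolator behaves on the toy grids from the question (the name f_tri and the two query points are chosen here purely for illustration): inside the convex hull of the data it returns the piecewise linear value, and outside the hull it returns NaN unless a fill_value is given.

```python
import numpy as np
from scipy import interpolate

# Toy grids from the question; columns correspond to y = 0 and y = 1.
xGrid = np.array([[1, 10], [2, 20], [3, 30]], dtype=float)
zGrid = np.array([[100, 1000], [200, 2000], [300, 3000]], dtype=float)
yGrid = np.array([0.0, 1.0]) + np.zeros(xGrid.shape)

# One (x, y) row per data point, with the matching z value.
points = np.column_stack([xGrid.ravel(), yGrid.ravel()])
f_tri = interpolate.LinearNDInterpolator(points, zGrid.ravel())

inside = f_tri(10.0, 0.5)   # query inside the convex hull of the data
outside = f_tri(0.0, 0.5)   # query outside the hull: fill_value defaults to NaN
print(inside, outside)
```

If values outside the hull are needed, a fill_value can be passed to the constructor, but that is a constant and not an extrapolation.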
With your updated data set:
plt.figure()
plt.plot(np.linspace(0.001, 5, 100), f4(np.linspace(0.001, 5, 100), 2))
plt.plot(xGrid[:, 0], zGrid[:, 0])
plt.plot(xGrid[:, 1], zGrid[:, 1])
Note that this interpolation also has its drawbacks. I suggest plotting both interpolated functions as a surface and looking at how they are distorted compared to your original data:
from mpl_toolkits.mplot3d import Axes3D
xx,yy=(np.linspace(0,10,20),np.linspace(0,20,40))
xxs,yys=np.meshgrid(xx,yy)
zz3=f3(xx,yy) #from interp2d
zz4=f4(xxs,yys) #from LinearNDInterpolator
#plot raw data
hf=plt.figure()
ax=hf.add_subplot(111,projection='3d')
ax.plot_surface(xGrid,yGrid,zGrid,rstride=1,cstride=1)
plt.draw()
#plot interp2d case
hf=plt.figure()
ax=hf.add_subplot(111,projection='3d')
ax.plot_surface(xxs,yys,zz3,rstride=1,cstride=1)
plt.draw()
#plot LinearNDInterpolator case
hf=plt.figure()
ax=hf.add_subplot(111,projection='3d')
ax.plot_surface(xxs,yys,f4(xxs,yys),rstride=1,cstride=1)
plt.draw()
This will allow you to rotate the surfaces around and see what kind of artifacts they contain (with an appropriate backend).
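Note that interp2d has since been deprecated and was removed in SciPy 1.14, so on recent versions the calls above will not run as written. If what you want is exactly the three-step scheme from the question (interpolate both grids in y, then interpolate z over x), a vectorized NumPy sketch could look like the following. The function name interp_z is hypothetical, and it assumes yGrid is sorted and that the y-interpolated x grid is increasing, as in the data shown.

```python
import numpy as np

def interp_z(x_value, y_value, x_grid, z_grid, y_grid):
    """Interpolate the x and z grids linearly in y, then interpolate z over x."""
    # Index of the y interval that brackets y_value (clipped to the grid).
    j = np.searchsorted(y_grid, y_value) - 1
    j = int(np.clip(j, 0, len(y_grid) - 2))
    # Linear weight of the right-hand column within that interval.
    w = (y_value - y_grid[j]) / (y_grid[j + 1] - y_grid[j])
    # Steps 1 and 2: blend the two neighbouring columns of each grid.
    x_interp = (1 - w) * x_grid[:, j] + w * x_grid[:, j + 1]
    z_interp = (1 - w) * z_grid[:, j] + w * z_grid[:, j + 1]
    # Step 3: one-dimensional interpolation of z over the blended x grid.
    return np.interp(x_value, x_interp, z_interp)

xGrid = np.array([[1, 10], [2, 20], [3, 30]], dtype=float)
zGrid = np.array([[100, 1000], [200, 2000], [300, 3000]], dtype=float)
yGrid = np.array([0.0, 1.0])

print(interp_z(10, 0.5, xGrid, zGrid, yGrid))
print(interp_z(1.5, 0.0, xGrid, zGrid, yGrid))
```

This avoids building interp1d objects in a loop: each call only touches two columns of each grid, and np.interp accepts an array for x_value if many x values share the same y.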

Related

How to create a loop to curve-fit different data sets of y for same x? in Python

I have 3 data sets for y axis values as follows.
y = [0.2535 0.3552 0.456 0.489 0.5265 0.58384 1.87616 2.87328 2.55184 2.66992 2.8208 3.09632 3.51616]
[0.116112 0.425088 0.582528 0.70192 1.07584 2.41408 3.75232 4.61824 2.55184 2.66992 2.8208 3.09632 3.51616 ]
[0.389664 1.166368 1.60392 2.05984 2.788 4.02784 5.0184 5.60224 2.55184 2.66992 2.8208 3.09632 3.51616 ]
and one data set for x values
x = [ 0. 8.75 17.5 26.25 35. 43.75 52.5 61.25 70. 78.75
87.5 96.25 105. ]
I am using the following command to curve fit
curve = np.polyfit(x, y, 4)
poly = np.poly1d(curve)
This works fine for one data set of y and x. What kind of loop should I use if I want 3 different curve-fit equations for the different y data sets with the same x set? I am new to Python, which is why I struggle with such a basic loop.
My expected output is an equation that represents a curve for the given data sets (x and y). I managed to get the equations one by one, but I have tons of different data sets for y and I don't want to find the equivalent equations one at a time. I know it can be done in a loop over the y values, but I don't know how.
This is a working example for one set of y values. In reality I have 3 data sets for y. I can change y and get 3 different results, but I want to do it in a single loop over all y values.
import numpy as np
import matplotlib.pyplot as plt
x = [0, 5.25, 10.5, 21, 31.5, 42, 52.5, 63, 73.5, 84, 94.5, 99.75, 105]
y = [0, 0.116112, 0.389664, 1.739712, 3.566016, 4.860304, 5.05776, 5.04792,
4.197744, 2.210064, 0.505776, 0.1312, 0]
curve = np.polyfit(x, y, 4)
poly = np.poly1d(curve)
new_x = []
new_y= []
for a in range(105):
    new_x.append(a+1)
    calc = poly(a+1)
    new_y.append(calc)
plt.plot(new_x, new_y)
plt.scatter(x, y)
print(poly)
Here are some improvements to your code. I've written a function to automate the processing of your data. Keep in mind that numpy provides vectorized operations, so there is no need to iterate over new_x to get new_y. Vectorized operations are more readable and much faster.
I'll let you call the function inside a loop over your data sets.
import numpy as np
import matplotlib.pyplot as plt
def my_function(x,y):
    # fit a 4th-degree polynomial and plot it with the data
    curve = np.polyfit(x, y, 4)
    poly = np.poly1d(curve)
    new_x = np.arange(x[0],x[-1],1)
    new_y= poly(new_x)
    plt.plot(new_x, new_y)
    plt.scatter(x, y)
    print(poly)
x = [0, 5.25, 10.5, 21, 31.5, 42, 52.5, 63, 73.5, 84, 94.5, 99.75, 105]
y1 = [0, 0.116112, 0.389664, 1.739712, 3.566016, 4.860304, 5.05776, 5.04792,
4.197744, 2.210064, 0.505776, 0.1312, 0]
y2 = [0.116112, 0.425088, 0.582528, 0.70192, 1.07584, 2.41408, 3.75232, 4.61824, 2.55184, 2.66992, 2.8208, 3.09632, 3.51616]
y3 = [0.389664, 1.166368, 1.60392, 2.05984, 2.788, 4.02784, 5.0184, 5.60224, 2.55184, 2.66992, 2.8208, 3.09632, 3.51616]
ylist = [ y1, y2, y3]
for y in ylist:
    my_function(x,y)
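If only the coefficients are needed, the loop can also be avoided: np.polyfit accepts a 2D y with one data set per column, as long as all sets share the same x. A small sketch using the data above (the names ys, coeffs and polys are just for illustration):

```python
import numpy as np

x = np.array([0, 5.25, 10.5, 21, 31.5, 42, 52.5, 63, 73.5, 84, 94.5, 99.75, 105])
# One row per data set; all rows share the same x values.
ys = np.array([
    [0, 0.116112, 0.389664, 1.739712, 3.566016, 4.860304, 5.05776, 5.04792,
     4.197744, 2.210064, 0.505776, 0.1312, 0],
    [0.116112, 0.425088, 0.582528, 0.70192, 1.07584, 2.41408, 3.75232,
     4.61824, 2.55184, 2.66992, 2.8208, 3.09632, 3.51616],
    [0.389664, 1.166368, 1.60392, 2.05984, 2.788, 4.02784, 5.0184,
     5.60224, 2.55184, 2.66992, 2.8208, 3.09632, 3.51616],
])

# polyfit expects one data set per column, hence the transpose.
coeffs = np.polyfit(x, ys.T, 4)          # shape (5, 3): degree + 1 rows, one column per set
polys = [np.poly1d(coeffs[:, k]) for k in range(ys.shape[0])]

for p in polys:
    print(p)
```

Each entry of polys can then be evaluated on an array, for example polys[0](np.arange(0, 105)).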

Question about plotting a 2D array in python

I have a question regarding the plotting of a 2D array with matplotlib. In my code, I have a 2D array named z of len(z) = 20 , and z has the values :
[[ 642.3774486 662.59980588 706.80142179 764.78786911 831.67963477
904.67872269 982.01426528 1062.49208551 1145.27029231 1229.73549967
1315.42936618 1402.00251422 1489.18433714 1576.7625077 1664.56866033
1752.46813939 1840.35250424 1928.13395024 2015.74109019 2103.11572013]
[ 554.60565024 560.31827232 591.87923587 638.51633542 695.03697015
758.44479983 826.83191468 898.90395242 973.74278531 1050.67523901
1129.19496311 1208.91328775 1289.52693752 1370.79606051 1452.52883572
1534.57042218 1616.79485775 1699.09901217 1781.39800199 1863.6216653 ]
[ 484.80770831 476.01059519 494.93090638 530.21865818 576.36816197
630.18473341 689.62342052 753.28967576 820.18913475 889.58883479
960.93441647 1033.79791772 1107.84339435 1182.80346976 1258.46286755
1334.64656142 1411.2110677 1488.03793055 1565.02877024 1642.1014669 ]
[ 432.98362283 409.67677451 415.95643334 439.89483737 475.67321023
519.89852343 570.3887828 625.64925554 684.60934062 746.47628701
810.64772628 876.65640413 944.13370762 1012.78473545 1082.37075581
1152.6965571 1223.60113409 1294.95070536 1366.63339492 1438.55512495]
[ 399.13339379 361.31681026 354.95581673 367.54487301 392.95211493
427.58616989 469.12800152 515.98269176 567.00340294 621.33759567
678.33489253 737.48874699 798.39787733 860.73985757 924.25250052
988.72040921 1053.96505692 1119.83733661 1186.21187604 1252.98263943]
[ 383.25702119 330.93070245 311.92905657 313.16876508 328.20487607
353.24767279 385.84107667 424.28998442 467.37132169 514.17276077
563.99591521 616.29494628 670.63590348 726.66883614 784.10810167
842.71811777 902.30283619 962.6978243 1023.76421361 1085.38401036]
[ 385.35450503 318.51845109 286.87615284 276.7665136 281.43149365
296.88303213 320.52800827 350.57113352 385.71309689 424.98178231
467.63079434 513.07500201 560.84778607 610.57167115 661.93755925
714.68968276 768.6144719 823.53216843 879.29040761 935.75923772]
[ 405.4258453 324.08005616 279.79710556 258.33811855 252.63196767
258.49224791 273.18879631 294.82613906 322.02872853 353.76466029
389.23952991 427.82891418 469.0335251 512.44836259 557.74087328
604.6351042 652.89996405 702.340369 752.79045805 804.10832153]
[ 443.47104202 347.61551768 290.69191471 257.88357994 241.80629812
238.07532013 243.82344079 257.05500104 276.3182166 300.52139471
328.82212191 360.55668279 395.19312056 432.29891048 471.51804375
512.55438207 555.15931264 599.12242601 644.26436494 690.43126177]
[ 499.49009518 389.12483563 319.56058031 275.40289778 248.95448502
235.63224878 232.43194171 237.25771947 248.58156112 265.25198557
286.37857036 311.25830784 339.32657247 370.12331481 403.26907065
438.44751639 475.39251767 513.87833946 553.71212826 594.72805845]
[ 573.48300477 448.60801002 366.40310234 310.89607205 274.07652836
251.16303388 239.01429907 235.43429433 238.81876207 247.95643287
261.90887525 279.93378933 301.43388082 325.92157557 352.993954
382.31450714 413.59957914 446.60810935 481.13374802 516.99871158]
[ 665.44977081 526.06504086 431.21948081 364.36310276 317.17242814
284.66767542 263.57051287 251.58472563 247.02981947 248.63473661
255.41303657 266.58312726 281.51504561 299.69369278 320.69269378
344.15535434 369.78049705 397.31173568 426.52922422 457.24322114]
[ 775.39039329 621.49592813 514.00971573 435.80398992 378.24218436
336.1461734 306.10058311 285.70901337 273.2147333 267.28689679
266.89105434 271.20632163 279.57006684 291.43966643 306.36529001
323.97005797 343.9352714 365.98921845 389.89855687 415.46158715]
[ 903.3048722 734.90067184 614.77380708 525.21873351 457.28579702
405.59852782 366.60450978 337.80715755 317.37350358 303.91291341
296.34292854 293.80337244 295.5989445 301.15949651 310.01174267
321.75861805 336.06390219 352.64055766 371.24174595 391.65380959]
[1049.19320756 866.27927199 733.51175488 632.60733354 554.30326611
493.02473868 445.0822929 407.87915817 379.50613029 358.51278647
343.76865919 334.37427969 329.60167861 328.85318304 331.63205178
337.52103456 346.16638942 357.26575331 370.55879147 385.81988847]
[1213.05539936 1015.63172859 870.22355911 757.96979001 669.29459165
598.42480597 541.53393246 495.92501523 459.61261345 431.08651597
409.16824628 392.91904338 381.57826916 374.520726 371.22621733
371.25730752 374.24273309 379.8648054 387.84969343 397.9598238 ]
[1394.89144759 1182.95804162 1024.90921978 901.30610293 802.25977363
721.79872971 655.95942846 601.94472873 557.69295304 521.63410191
492.5416898 469.43766351 451.52871614 438.16212541 428.79423931
422.96743691 420.2929332 420.43771393 423.11445184 428.07361556]
[1594.70135227 1368.25821109 1197.5687369 1062.61627228 953.19881205
863.14650989 788.3587809 725.93829867 673.74714907 630.15554429
593.88898977 563.93014008 539.45301957 519.77738125 504.33611774
492.65142275 484.31698975 478.9844789 476.35306668 476.16126376]
[1812.48511338 1571.532237 1388.20211045 1241.90029807 1122.11170691
1022.46814651 938.73198977 867.90572504 807.77520155 756.65084311
713.21014617 676.39647309 645.35117944 619.36649354 597.8518526
580.30926502 566.31490274 555.50510031 547.56553796 542.22276841]
[2048.24273094 1792.78011936 1596.80934044 1439.1581803 1308.9984582
1199.76363956 1107.07905509 1027.84700786 959.77711046 901.11999837
850.50515902 806.83666254 769.22319574 736.92946227 709.34144391
685.94096373 666.28667217 649.99957816 636.75186568 626.25812949]]
I wanted to plot the first set of 20 values of z, i.e. z[0], against my other variable M. I did the following:
M = np.arange(15.5,16.5, 0.05)
plt.plot(M, Z[0], label = r'$\chi^2$ for $\Omega_m[0] $ ')
and it gave me the following plot (ignore the label in blue; the same data was plotted twice with only one label):
Then I tried the following code, which gave me the other pic.
plt.plot(M, Z[0:20], label = r'$\chi^2$ for $\Omega_m = 0$ ')
But I don't understand why, with the same data, the shape of the function is obviously different between the two pictures. Could anyone explain why the second image differs from the first one, and what exactly it plots? How does matplotlib plot a 2D array?
To explain the background of z a bit: it is a function of two parameters, M and Omega_m, with Omega_m = np.arange(0.0, 1.0, 0.05) (len(Omega_m) = 20). z[0] holds the 20 values of the function for each value of Omega_m at M[0], z[1] holds the 20 values for each value of Omega_m at M[1], and so on, until the function has been evaluated for every combination of the two parameters.
First, let's explain why the two graphs differ. In the first graph, you're plotting the first row of Z against M. In the second graph, you're passing a 2D array, and matplotlib draws each column of Z against M. This became clear when I plotted the first three columns of Z:
plt.plot(M, Z[:, 0], label = r'$\chi^2$ for $\Omega_m[0] $ ')
plt.plot(M, Z[:, 1], label = r'$\chi^2$ for $\Omega_m[0] $ ')
plt.plot(M, Z[:, 2], label = r'$\chi^2$ for $\Omega_m[0] $ ')
plt.show()
Which produced this graph:
This is consistent with the error raised when I pass Z with one row fewer:
plt.plot(M, Z[0:19], label = r'$\chi^2$ for $\Omega_m[0] $ ')
ValueError: x and y must have same first dimension, but have shapes (20,) and (19, 20)
So, to produce 20 curves that match the rows of Z, not the columns, you need to transpose your Z array using Z.T notation like so:
plt.plot(M, Z.T, label = r'$\chi^2$ for $\Omega_m[0] $ ')
plt.show()
Which will get this graph:
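To check the column behaviour without looking at a figure, here is a small sketch on a hypothetical 4x4 array (the Agg backend is only selected so that it runs without a display). It inspects the Line2D objects returned by plot:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

M = np.arange(4)
Z = np.arange(16.0).reshape(4, 4)   # small stand-in for the 20x20 array

fig, ax = plt.subplots()
lines_cols = ax.plot(M, Z)      # 2D y: one line per column of Z
lines_rows = ax.plot(M, Z.T)    # transposed: one line per row of Z

# Line k of the first call carries column k; line k of the second carries row k.
print(lines_cols[1].get_ydata(), Z[:, 1])
print(lines_rows[1].get_ydata(), Z[1])
```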

Python : Reduce grid according to the values of a function

I have a regularly spaced grid, of let's say 200*200*200 = 8,000,000 points. I also have a list of values of some function f (which takes positive and negative values and which varies a lot over space) on every point of this grid, as follows :
import numpy as np
from itertools import product
x = np.linspace(0, 200*0.05, 200)
y = np.linspace(0, 200*0.05, 200)
z = np.linspace(0, 200*0.05, 200)
coordinates = np.array(list(product(x, y, z)))
and
In [1]: print(coordinates, coordinates.shape)
[[ 0. 0. 0. ]
[ 0. 0. 0.05025126]
[ 0. 0. 0.10050251]
...,
[ 10. 10. 9.89949749]
[ 10. 10. 9.94974874]
[ 10. 10. 10. ]]
(8000000, 3)
In [2]: print(f,"\n",f.shape)
[ 2.46143000e-08 3.01043000e-08 3.64817000e-08 ..., 6.79642000e-08
5.83957000e-08 4.95127000e-08]
(8000000,)
In [3]: print(np.max(f), np.min(f), np.min(np.absolute(f)))
6.21966 -271.035 1.10296e-09
How can I get a new grid with fewer points (~250,000 points) that is very precise in regions of high f values, and much less precise in regions of low f values?
This new grid can be regular, but can also be much more sophisticated, as long as I can still integrate the function over space afterwards.
Thank you in advance for your help !
EDIT : I have just discovered the scipy.interpolate.griddata function which will be very useful if I find someway to make a new grid, even if this grid is not regular. Is there any python library that generates grids ?
I ended up using the following code, inspired by a Stack Overflow question, which defines a probability density from f:
n = 250000
g = 2 #the higher g, the more precise the grid will be in regions of high f, and vice-versa
x = np.linspace(0, 200*0.05, 200)
y = np.linspace(0, 200*0.05, 200)
z = np.linspace(0, 200*0.05, 200)
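#indexing='ij' so that the flattened mesh follows the same order as product(x, y, z), i.e. the order of f
[x_grid,y_grid,z_grid] = np.meshgrid(x,y,z,indexing='ij')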
xi,yi,zi = x_grid.ravel(),y_grid.ravel(),z_grid.ravel()
#create normalized pdf
pdf = np.log10(np.absolute(f))
pdf = pdf - pdf.min() + 1
pdf = pdf**g
pdf = pdf/np.sum(pdf)
#obtain indices of randomly selected points, as specified by pdf:
randices = np.random.choice(np.arange(x_grid.size), n, replace = False,p = pdf.ravel())
#random positions:
x_rand = xi[randices]
y_rand = yi[randices]
z_rand = zi[randices]
#coordinates
grid_coord = np.array([x_rand, y_rand, z_rand]).swapaxes(0,1)
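The values in f follow the ordering of product(x, y, z) from the question, so the flattened mesh needs the same ordering for pdf and the coordinates to line up. A small check on a hypothetical 2x3x4 grid (sizes chosen only for illustration, helper name flat_mesh made up) suggests that np.meshgrid matches that ordering with indexing='ij' but not with its default indexing='xy':

```python
from itertools import product
import numpy as np

# Deliberately different sizes per axis so that any axis swap is visible.
x = np.linspace(0.0, 1.0, 2)
y = np.linspace(0.0, 1.0, 3)
z = np.linspace(0.0, 1.0, 4)

# Ordering used for f in the question: x varies slowest, z fastest.
coordinates = np.array(list(product(x, y, z)))

def flat_mesh(indexing):
    """Return the flattened mesh as an (N, 3) array of (x, y, z) rows."""
    gx, gy, gz = np.meshgrid(x, y, z, indexing=indexing)
    return np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])

matches_ij = np.array_equal(flat_mesh("ij"), coordinates)
matches_xy = np.array_equal(flat_mesh("xy"), coordinates)
print(matches_ij, matches_xy)   # prints: True False
```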

How to uniformly resample a non-uniform signal using SciPy?

I have an (x, y) signal with non-uniform sample rate in x. (The sample rate is roughly proportional to 1/x). I attempted to uniformly re-sample it using scipy.signal's resample function. From what I understand from the documentation, I could pass it the following arguments:
scipy.signal.resample(array_of_y_values, number_of_sample_points, array_of_x_values)
and it would return the array of
[[resampled_y_values],[new_sample_points]]
I'd expect it to return uniformly sampled data with roughly the same shape as the original, and with the same minimal and maximal x value. But it doesn't:
# nu_data = [[x1, x2, ..., xn], [y1, y2, ..., yn]]
# with x values in ascending order
length = len(nu_data[0])
resampled = sg.resample(nu_data[1], length, nu_data[0])
uniform_data = np.array([resampled[1], resampled[0]])
plt.plot(nu_data[0], nu_data[1], uniform_data[0], uniform_data[1])
plt.show()
blue: nu_data, orange: uniform_data
The shape is clearly altered, and the x scale has been resized too. If I try to fix the range by constructing the desired uniform x values myself and using them instead, the distortion remains:
length = len(nu_data[0])
resampled = sg.resample(nu_data[1], length, nu_data[0])
delta = (nu_data[0,-1] - nu_data[0,0]) / length
new_samplepoints = np.arange(nu_data[0,0], nu_data[0,-1], delta)
uniform_data = np.array([new_samplepoints, resampled[0]])
plt.plot(nu_data[0], nu_data[1], uniform_data[0], uniform_data[1])
plt.show()
What is the proper way to re-sample my data uniformly, if not this?
Please look at this rough solution:
import matplotlib.pyplot as plt
from scipy import interpolate
import numpy as np
x = np.array([0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20])
y = np.exp(-x/3.0)
flinear = interpolate.interp1d(x, y)
fcubic = interpolate.interp1d(x, y, kind='cubic')
xnew = np.arange(0.001, 20, 1)
ylinear = flinear(xnew)
ycubic = fcubic(xnew)
plt.plot(x, y, 'X', xnew, ylinear, 'x', xnew, ycubic, 'o')
plt.show()
This is a slightly modified example from the SciPy documentation. If you execute it, you should see something like this:
The blue crosses are the initial function, i.e. your signal with a non-uniform sampling distribution. There are two results: the orange x markers show the linear interpolation, and the green dots show the cubic interpolation. The question is which option you prefer. Personally I don't like either of them, which is why I usually take 4 points and interpolate between them, then the next points, and so on, to get a cubic interpolation without those strange bumps. That is much more work, and I can't see a way of doing it with scipy, so it will be slow. That is why I asked about the size of the data.
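As far as I understand it, scipy.signal.resample uses an FFT method that assumes equally spaced input samples, and its third argument is only used to compute the returned sample positions, which would explain the distortion in the question. A minimal sketch of the linear alternative using np.interp, reusing the illustrative signal from above and (as an assumption) keeping the number of samples unchanged:

```python
import numpy as np

# Non-uniformly sampled signal (same illustrative data as above).
x = np.array([0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20])
y = np.exp(-x / 3.0)

# Uniform grid spanning the same range, with the same number of samples.
x_uniform = np.linspace(x[0], x[-1], len(x))
# Piecewise linear interpolation of y onto the uniform grid.
y_uniform = np.interp(x_uniform, x, y)

print(x_uniform)
print(y_uniform)
```

With sampling this uneven, a uniform grid with the same number of points loses most of the detail near x = 0, so the number of points in x_uniform usually has to be chosen larger.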

python: plot unevenly distributed axis

I am using python and have a plot which looks like this:
Now the problem is that, as most bins are in the range 0-500 on x-axis, so I want to make the x-axis like [0, 100, 200, 300, 400, 500, 1000, 1500, 2000, 2500] and each interval has the same length.
I don't know how to do this in python. Any idea?
Perhaps there's a simpler way to do this, but it's certainly possible to do so in pyplot using these two steps:
Plot a different function, namely one with the same y values but different x values
Manipulate the x-ticks so that it appears like you've plotted your original function (but with a different axis).
I'll start with 2. Note the existence of the xticks function, which allows you to do things like this:
ticks = [0, 100, 200, 300, 400, 500, 1000, 1500, 2000, 2500]
xticks(range(10), ticks)
This lets you set both the locations of the x-ticks and their labels.
Now, for 1., you just need to translate your original x array to a new_x array, which is spread out over arange(10), but non-linearly, according to your labels. If your points are in the array x, then using scipy's interpolate.interp1d:
from scipy import interpolate
new_x = interpolate.interp1d(ticks, arange(10))(x)
In conclusion, use plot(new_x, y) with the xticks above.
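The same mapping can be sketched with plain np.interp, which avoids the scipy import. The sample x values below are made up for illustration:

```python
import numpy as np

ticks = [0, 100, 200, 300, 400, 500, 1000, 1500, 2000, 2500]
positions = np.arange(len(ticks))        # equally spaced positions 0..9

# Hypothetical data values on the original, unevenly labelled axis.
x = np.array([50, 500, 750, 2500])

# Piecewise linear map: each tick value lands on its integer position.
new_x = np.interp(x, ticks, positions)
print(new_x)   # 50 -> 0.5, 500 -> 5.0, 750 -> 5.5, 2500 -> 9.0
```

Plotting new_x against y and then calling xticks(positions, ticks) gives the axis with equal-length intervals.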
As already said, you have to map the original abscissae to a new range, and then draw the xticks accordingly. The first part is the toughest, of course, and can be done in different ways. My take is a vectorized approach using numpy that builds the function body at runtime and evaluates it with eval.
def make_xmap(l):
    from numpy import array
    ll = len(l)
    dy = 1.0 / (ll-1)
    def f(l, i):
        if i == 0 : return "0.0"
        y0 = i*dy-dy
        x0, x1 = l[i-1:i+1]
        return '%r+%r*(x-%r)/%r'%(y0,dy,x0,x1-x0)
    fmt = 'numpy.where(x<%f,%s%s'
    body = ' '.join(fmt%(j,f(l,i),"," if i<(ll-1) else ", 1.0") for i, j in enumerate(l))
    tail = ')'*ll
    def xm(x):
        x = array(x)
        return eval(body+tail)
    return xm
import numpy
xm = make_xmap([0.,200.,1000.])
x = (-10.,0.,100.,200.,600.,1000.,1010)
print(xm(x))
# [0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0]
Note that you have to import numpy in your code, because we have used numpy.where to construct the function body... If you prefer to import numpy as np modify the fmt string in the factory function...
The second part is easier, if you have an x and an y array to plot, with the subdivision from your example, you can do
import numpy # I touched this point before...
...
intervals = [0., 100., 200., 300., 400., 500., 1000., 1500., 2000., 2500.]
xm = make_xmap(intervals)
plt.plot(xm(x),y)
plt.xticks(xm(intervals), [str(xi) for xi in intervals])
plt.show()
A small optimization
You may want to change
...
    tail = ')'*ll
    def xm(x):
        x = array(x)
        return eval(body+tail)
...
to
...
    tail = ')'*ll
    code = compile(body+tail,'','eval')
    def xm(x):
        x = array(x)
        return eval(code)
...
This small optimization avoids the compilation of the code string every time you call the mapping function, and is of course more relevant if the mapping is used many times on short inputs.
