Python - Clipping out data to fit profiles
I have several sets of data to which I'm trying to fit different profiles. In the centre of one of the minima there is contamination that prevents me from doing a good fit as you can see in this image:
How can I clip out those spikes in the bottom of my data taking into account that the spike is not always in the same position? Or how would you deal with data like this? I'm using lmfit to fit the profiles, in this case a Lorentzian and a Gaussian. Here is a minimal working example where I have played with the initial values to fit the data more closely:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model
from lmfit.models import GaussianModel, ConstantModel, LorentzianModel
x = np.array([4085.18084467, 4085.38084374, 4085.5808428 , 4085.78084186, 4085.98084092, 4086.18083999, 4086.38083905, 4086.58083811, 4086.78083717, 4086.98083623, 4087.1808353 , 4087.38083436, 4087.58083342, 4087.78083248, 4087.98083155, 4088.18083061, 4088.38082967, 4088.58082873, 4088.78082779, 4088.98082686, 4089.18082592, 4089.38082498, 4089.58082404, 4089.78082311, 4089.98082217, 4090.18082123, 4090.38082029, 4090.58081935, 4090.78081842, 4090.98081748, 4091.18081654, 4091.3808156 , 4091.58081466, 4091.78081373, 4091.98081279, 4092.18081185, 4092.38081091, 4092.58080998, 4092.78080904, 4092.9808081 , 4093.18080716, 4093.38080622, 4093.58080529, 4093.78080435, 4093.98080341, 4094.18080247, 4094.38080154, 4094.5808006 , 4094.78079966, 4094.98079872, 4095.18079778, 4095.38079685, 4095.58079591, 4095.78079497, 4095.98079403, 4096.1807931 , 4096.38079216, 4096.58079122, 4096.78079028, 4096.98078934, 4097.18078841, 4097.38078747, 4097.58078653, 4097.78078559,4097.98078466, 4098.18078372, 4098.38078278, 4098.58078184, 4098.7807809 , 4098.98077997, 4099.18077903, 4099.38077809, 4099.58077715, 4099.78077622, 4099.98077528, 4100.18077434, 4100.3807734 , 4100.58077246, 4100.78077153, 4100.98077059, 4101.18076965, 4101.38076871, 4101.58076778, 4101.78076684, 4101.9807659 , 4102.18076496, 4102.38076402, 4102.58076309, 4102.78076215, 4102.98076121, 4103.18076027, 4103.38075934, 4103.5807584 , 4103.78075746, 4103.98075652, 4104.18075558, 4104.38075465, 4104.58075371, 4104.78075277, 4104.98075183, 4105.1807509 , 4105.38074996, 4105.58074902, 4105.78074808, 4105.98074714, 4106.18074621, 4106.38074527, 4106.58074433, 4106.78074339, 4106.98074246, 4107.18074152, 4107.38074058, 4107.58073964, 4107.7807387 , 4107.98073777, 4108.18073683, 4108.38073589, 4108.58073495, 4108.78073401, 4108.98073308, 4109.18073214, 4109.3807312 , 4109.58073026, 4109.78072933, 4109.98072839, 4110.18072745, 4110.38072651, 4110.58072557, 4110.78072464, 4110.9807237 , 4111.18072276, 4111.38072182, 
4111.58072089, 4111.78071995, 4111.98071901, 4112.18071807, 4112.38071713, 4112.5807162 , 4112.78071526, 4112.98071432, 4113.18071338, 4113.38071245, 4113.58071151, 4113.78071057, 4113.98070963, 4114.18070869, 4114.38070776, 4114.58070682, 4114.78070588, 4114.98070494, 4115.18070401, 4115.38070307, 4115.58070213, 4115.78070119, 4115.98070025, 4116.18069932, 4116.38069838, 4116.58069744, 4116.7806965 , 4116.98069557, 4117.18069463, 4117.38069369, 4117.58069275, 4117.78069181, 4117.98069088, 4118.18068994, 4118.380689 , 4118.58068806, 4118.78068713, 4118.98068619, 4119.18068525, 4119.38068431, 4119.58068337, 4119.78068244, 4119.9806815 , 4120.18068056, 4120.38067962, 4120.58067869, 4120.78067775, 4120.98067681, 4121.18067587, 4121.38067493, 4121.580674 , 4121.78067306, 4121.98067212, 4122.18067118, 4122.38067025, 4122.58066931, 4122.78066837, 4122.98066743, 4123.18066649, 4123.38066556, 4123.58066462, 4123.78066368, 4123.98066274, 4124.1806618 , 4124.38066087, 4124.58065993, 4124.78065899, 4124.98065805, 4125.18065712, 4125.38065618, 4125.58065524, 4125.7806543 , 4125.98065336, 4126.18065243, 4126.38065149, 4126.58065055, 4126.78064961, 4126.98064868, 4127.18064774, 4127.3806468 , 4127.58064586, 4127.78064492, 4127.98064399, 4128.18064305, 4128.38064211, 4128.58064117, 4128.78064024, 4128.9806393 , 4129.18063836, 4129.38063742, 4129.58063648, 4129.78063555, 4129.98063461, 4130.18063367, 4130.38063273, 4130.5806318 , 4130.78063086, 4130.98062992, 4131.18062898, 4131.38062804, 4131.58062711, 4131.78062617, 4131.98062523, 4132.18062429, 4132.38062336, 4132.58062242, 4132.78062148, 4132.98062054, 4133.1806196 , 4133.38061867, 4133.58061773, 4133.78061679, 4133.98061585, 4134.18061492, 4134.38061398, 4134.58061304, 4134.7806121 , 4134.98061116])
y = np.array([0.90312759, 1.00923175, 0.94618369, 0.98284045, 0.91510612, 0.96737804, 0.97690214, 0.94363369, 1.00887784, 1.00110387, 0.91647096, 0.97943202, 1.00672907, 1.01552094, 1.01089407, 0.96914584, 0.9908419 , 1.0176613 , 0.97032148, 0.96003562, 0.9702355 , 0.93684173, 0.94652734, 0.94895018, 1.01214356, 0.85777678, 0.89308203, 0.9789272 , 0.93901884, 0.9684622 , 0.96969321, 0.86326307, 0.89607392, 0.92459571, 1.00454429, 1.06019733, 0.97291196, 0.95646497, 0.95899707, 1.02830351, 0.94938178, 0.91481128, 0.92606219, 0.97085631, 0.93597434, 0.91316857, 0.90644542, 0.91726926, 0.91686184, 0.96445563, 0.92166362, 0.95831572, 0.93859066, 0.85285273, 0.89944073, 0.91812428, 0.94265677, 0.88281406, 0.9470601 , 0.94921529, 0.97289222, 0.94632251, 0.96633195, 0.94096512, 0.95324803, 0.90920845, 0.92100257, 0.91181745, 0.95715298, 0.91715382, 0.90219214, 0.87585035, 0.86592191, 0.89335902, 0.85536392, 0.89619274, 0.9450366 , 0.82780137, 0.81214176, 0.83461329, 0.82858317, 0.80851704, 0.79253546, 0.85440086, 0.81679169, 0.80579976, 0.72312218, 0.75583125, 0.75204599, 0.84519188, 0.68686821, 0.71472154, 0.71706318, 0.72640234, 0.70526356, 0.68295282, 0.66795774, 0.65004383, 0.68096834, 0.72697547, 0.72436393, 0.77128385, 0.79666758, 0.67349101, 0.61479406, 0.57046337, 0.51614312, 0.52945366, 0.53112169, 0.53757761, 0.56680358, 0.63839684, 0.60704329, 0.62377533, 0.67862515, 0.64587581, 0.71316115, 0.76309798, 0.72217569, 0.7477785 , 0.79731849, 0.76934137, 0.77063868, 0.77871584, 0.77688526, 0.84342722, 0.85382332, 0.88700466, 0.85837992, 0.79589266, 0.83798993, 0.79835529, 0.84612746, 0.83214907, 0.86373676, 0.90729115, 0.82111605, 0.86165685, 0.84090099, 0.90389133, 0.89554032, 0.90792356, 0.92798016, 0.95588479, 0.95019718, 0.95447497, 0.89845759, 0.91638311, 0.99263342, 0.97477606, 0.95482538, 0.94489498, 0.94344967, 0.90526465, 0.92538486, 0.96279787, 0.94005143, 0.96842454, 0.92296494, 0.89954172, 0.8684367 , 0.95039002, 0.95229769, 0.93752274, 0.94741173, 
0.96704449, 1.01130839, 0.95499414, 0.99596569, 0.95130622, 1.00014723, 1.00252218, 0.95130331, 1.0022896 , 0.99851989, 0.94405282, 0.95814021, 0.94851972, 1.01302067, 1.01400272, 0.97960083, 0.97070283, 1.01312797, 0.9842154 , 1.01147273, 0.97331853, 0.91403182, 0.96813051, 0.92319169, 0.9294103 , 0.96960715, 0.94811518, 0.97115083, 0.84687543, 0.90725159, 0.88061293, 0.87319615, 0.85331661, 0.89775082, 0.90956716, 0.83174505, 0.89753388, 0.89554364, 0.95329739, 0.87687031, 0.93883127, 0.97433899, 0.99515225, 0.97519981, 0.91956466, 0.97977674, 0.93582089, 1.00662722, 0.90157277, 1.02887754, 0.9777419 , 0.94257094, 1.02359615, 0.98968414, 1.00075502, 1.03230265, 1.05904074, 1.00488442, 1.05507886, 1.05085518, 1.02561781, 1.05896008, 0.98024381, 1.08005691, 0.94528977, 1.03853637, 1.02064405, 1.0467137 , 1.05375156, 1.12907949, 0.99295611, 1.06601022, 1.02846374, 0.98006807, 0.96446772, 0.97702428, 0.97788589, 0.93889781, 0.96366778, 0.96645265, 0.95857242, 1.05796304, 0.99441763, 1.00573183, 1.05001927])
e = np.array([0.0647344 , 0.04583914, 0.05665552, 0.04447208, 0.05644753, 0.03968611, 0.05985188, 0.04252311, 0.03366922, 0.04237672, 0.03765898, 0.03290132, 0.04626836, 0.05106203, 0.03619188, 0.03944098, 0.08115469, 0.05859644, 0.06091101, 0.05170821, 0.0427244 , 0.06804469, 0.06708318, 0.03369381, 0.04160575, 0.08007032, 0.09292148, 0.04378329, 0.08216214, 0.06087074, 0.05375458, 0.06185891, 0.06385766, 0.08084546, 0.04864063, 0.06400878, 0.04988693, 0.06689165, 0.05989534, 0.08010138, 0.0681177 , 0.04478208, 0.03876582, 0.05977015, 0.06610619, 0.05020086, 0.07244604, 0.0445143 , 0.06970626, 0.04423994, 0.0414573 , 0.06892836, 0.05715395, 0.04014724, 0.07908425, 0.06082051, 0.08380691, 0.08576757, 0.06571406, 0.04842625, 0.05298355, 0.05271857, 0.06340425, 0.10849621, 0.0811072 , 0.03642638, 0.10614094, 0.09865099, 0.06711037, 0.10244762, 0.11843505, 0.1092357 , 0.09748241, 0.09657009, 0.09970179, 0.10203563, 0.18494082, 0.14097796, 0.1151294 , 0.16172895, 0.17611204, 0.16226913, 0.2295418 , 0.17795924, 0.1253298 , 0.1771586 , 0.15139061, 0.14739618, 0.1620105 , 0.19158538, 0.21431605, 0.19292715, 0.23308884, 0.30519423, 0.31401994, 0.30569885, 0.31216375, 0.35147676, 0.25016472, 0.16232236, 0.09058787, 0.0604483 , 0.05168302, 0.21432774, 0.38149791, 0.5061975 , 0.44281541, 0.50646427, 0.43761581, 0.44989111, 0.47778238, 0.39944325, 0.32462726, 0.34560857, 0.3175776 , 0.30253441, 0.23059451, 0.24516185, 0.20708065, 0.26429751, 0.1830661 , 0.15155041, 0.16497299, 0.15794139, 0.13626666, 0.17839823, 0.13502886, 0.14148522, 0.10869864, 0.11723602, 0.09074029, 0.06922157, 0.07719777, 0.13181317, 0.11441895, 0.10655855, 0.12073767, 0.0846133 , 0.07974657, 0.06538693, 0.0573741 , 0.07864047, 0.08351471, 0.08130351, 0.0768824 , 0.07951992, 0.04478989, 0.0765122 , 0.04842814, 0.04355571, 0.05138656, 0.07215294, 0.04681987, 0.05790133, 0.06163808, 0.082449 , 0.06127927, 0.04971221, 0.05107901, 0.04493687, 0.06072161, 0.06094332, 0.03630467, 0.04162285, 0.04058228, 
0.04526251, 0.06191432, 0.04901982, 0.0454908 , 0.06186274, 0.0407017 , 0.03865571, 0.04353665, 0.03898987, 0.04666321, 0.05856035, 0.04225933, 0.04797901, 0.03523971, 0.04728414, 0.05494382, 0.04773011, 0.03210954, 0.05651663, 0.03625933, 0.03596701, 0.03800191, 0.06267668, 0.06431192, 0.0602614 , 0.05139896, 0.04571979, 0.04375182, 0.0576867 , 0.07491418, 0.05339972, 0.07619115, 0.11569378, 0.07087871, 0.09076518, 0.13554717, 0.07811761, 0.07180695, 0.05831886, 0.06042863, 0.08759576, 0.06650081, 0.08420164, 0.08185432, 0.04338836, 0.04970979, 0.04008252, 0.03605485, 0.03456321, 0.05594584, 0.03856822, 0.03576337, 0.03118799, 0.0441686 , 0.0469118 , 0.03591666, 0.03562582, 0.04934832, 0.03280972, 0.03201576, 0.04338048, 0.07443531, 0.04121059, 0.03774147, 0.03717577, 0.03354207, 0.03806978, 0.0319364 , 0.03715712, 0.0379478 , 0.04867626, 0.0304592 , 0.03393844, 0.034518 , 0.04293514, 0.05177898, 0.05332907, 0.0352937 , 0.03359781, 0.04625272, 0.03733088, 0.03501259, 0.03346308, 0.04333749, 0.05741173])
cont = ConstantModel(prefix='cte_')
pars = cont.guess(y, x=x)
gauss = GaussianModel(prefix='g_')
pars.update( gauss.make_params())
pars['cte_c'].set(1)
pars['g_center'].set(4125, min=4120, max=4130)
pars['g_sigma'].set(1, min=0.5)
pars['g_amplitude'].set(-0.2, min=-0.5)
loren = LorentzianModel(prefix='l_')
pars.update( loren.make_params())
pars['l_center'].set(4106, min=4095, max=4115)
pars['l_sigma'].set(4, max=6)
pars['l_amplitude'].set(-6., max=-4.)
model = gauss + loren + cont
init = model.eval(pars, x=x)
result = model.fit(y, pars, x=x, weights=1/e)
#print(result.fit_report(min_correl=0.5))
fig, ax = plt.subplots(figsize=(8,6))
ax.plot(x, y, 'k-', lw=2) # data in black
ax.plot(x, init, 'g--', lw=2) # initial guess
ax.plot(x, result.best_fit, 'r-', lw=2) # best fit
ax.set(xlim=(4085,4135), ylim=(0.4,1.14))
plt.show()
If the bad point is always at the same x value, you could remove that point from the data, perhaps with something like:
import numpy as np
def index_nearest(array, value):
    """index of array nearest to value"""
    return np.abs(array-value).argmin()
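ybad = index_nearest(x, 4150)  # replace 4150 with the x value of the bad point
y[ybad] = np.nan
good = np.isfinite(y)
# apply the same mask to the uncertainties so that weights=1/e still matches x and y
x, y, e = x[good], y[good], e[good]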
and then fit your model to those data with the bad point removed.
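Since the question says the spike is not always at the same position, the bad points could also be located automatically instead of by a fixed x value. The following is only a minimal sketch under stated assumptions: the helper `spike_mask`, its `kernel_size` and `n_sigma` parameters, and the synthetic `x_demo`/`y_demo` arrays are all hypothetical and not part of the original answer. It flags points that deviate strongly from a running median.

```python
import numpy as np
from scipy.signal import medfilt

def spike_mask(y, kernel_size=11, n_sigma=5.0):
    """Return a boolean mask that is True for the points to keep.

    A point is flagged as a spike when it deviates from a running median
    by more than n_sigma robust standard deviations (estimated from the MAD).
    """
    y = np.asarray(y, dtype=float)
    baseline = medfilt(y, kernel_size=kernel_size)  # kernel_size must be odd
    resid = y - baseline
    mad = np.median(np.abs(resid - np.median(resid)))
    robust_sigma = 1.4826 * mad  # converts MAD to sigma for Gaussian noise
    return np.abs(resid) <= n_sigma * robust_sigma

# Synthetic demonstration (hypothetical data): a smooth dip with one spike.
x_demo = np.linspace(0.0, 10.0, 101)
y_demo = 1.0 - 0.5 * np.exp(-0.5 * ((x_demo - 5.0) / 1.0) ** 2)
y_demo += 0.005 * np.sin(37.0 * x_demo)  # small deterministic "noise"
y_demo[50] += 0.3                        # contamination at the line centre

keep = spike_mask(y_demo)
x_clean, y_clean = x_demo[keep], y_demo[keep]
```

With the arrays from the question, the same mask would have to be applied to `x`, `y` and `e` together before calling `model.fit`. Whether a threshold like this separates contamination from real signal depends on the data, so the flagged points are worth inspecting before they are dropped.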
But, also: if there is not an obviously errant point and the data are "just" noisy, there is probably no advantage to removing what look like bad points. Your data look noisy to me, but it's hard to see that there is a systematically bad point. If you are going to remove a point, remember that you are asserting that this measurement was not merely affected by normal noise, but was wrong.
Finally: another approach to treating noisy data might be to try to smooth the data, say with a Savitzky-Golay filter. There is always some danger of smoothing out features with such an approach, but a modest S-G filter is often good for cleaning up noisy data enough to detect features. Of course, if fits to filtered data give significantly different results from fits to unfiltered data, you will probably need to understand why that is.
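As a rough illustration of the smoothing idea, here is a sketch on synthetic data. The arrays `clean` and `noisy`, the noise level, and the choice `window_length=11`, `polyorder=3` are assumptions made for the example, not values recommended for the data in the question.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical absorption-like profile with Gaussian noise added.
x_demo = np.linspace(0.0, 10.0, 201)
clean = 1.0 - 0.4 * np.exp(-0.5 * ((x_demo - 5.0) / 0.8) ** 2)
rng = np.random.default_rng(0)
noisy = clean + rng.normal(0.0, 0.03, x_demo.size)

# A modest filter: short window, low polynomial order.
smooth = savgol_filter(noisy, window_length=11, polyorder=3)

# Compare the scatter around the known profile before and after smoothing.
rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((smooth - clean) ** 2))
```

A wider window smooths more but starts to flatten narrow features, which is why it is worth comparing fits to the filtered and unfiltered data.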
Related
How to interpolate heatmaps (with nonuniform pixels) to draw contour plots in python
I have data that I have been able to plot using heatmaps with nonuniform pixel sizes, using the answer here. I'm now wondering what would be the best way to go about interpolating the heatmap and drawing a contour plot at a given value. Essentially, imagine if I wanted to draw a smooth curve on the plot generated in the linked question, corresponding to a value of 0.5. One way of going about this could be to fit the data to a 3d spline. Each pixel in the heatmap also has an error estimate. It would also be great if I could use this information in drawing the contour map.
In terms of interpolating pcolormesh, the answer here gives a couple of options. I chose to go with passing shading='gouraud' as an argument to pcolormesh. When it comes to plotting a contour plot at 0.5, I found the answer here useful. It pretty much uses contour the same way you would with imshow. See the code from the SO answer linked in your question, adapted to my understanding of what you are trying to achieve:
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import numpy as np
bounds1 = [ 0. , 3. , 27.25 , 51.5 , 75.75 , 100. ]
bounds2 = [ 0. , 127., 165., 334. , 522. , 837., 1036., 1316., 1396., 3000]
matrix = [[0.3 , 0.5 , 0.7 , 0.9 , 1. , 0.9 , 0.7 , 0.4 , 0.3 , 0.3 ],
[0.22725, 0.37875, 0.53025, 0.68175, 0.7575, 0.68175, 0.53025, 0.303, 0.22725, 0.22725],
[0.1545 , 0.2575 , 0.3605 , 0.4635 , 0.515 , 0.4635 , 0.3605 , 0.206, 0.1545 , 0.1545 ],
[0.08175, 0.13625, 0.19075, 0.24525, 0.2725, 0.24525, 0.19075, 0.109, 0.08175, 0.08175],
[0.009 , 0.015 , 0.021 , 0.027 , 0.03 , 0.027 , 0.021 , 0.012, 0.009 , 0.009 ],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. 
]] x2 = np.array([1.7765000e+00, 3.9435000e+00, 4.5005002e+00, 4.5005002e+00, 5.0325003e+00, 6.0124998e+00, 7.0035005e+00, 8.5289993e+00, 1.0150000e+01, 1.1111500e+01, 1.2193500e+01, 1.2193500e+01, 1.2193500e+01, 1.3665500e+01, 1.4780001e+01, 1.5908000e+01, 1.7007000e+01, 1.8597000e+01, 2.0439001e+01, 2.2047001e+01, 2.4724501e+01, 2.7719501e+01, 3.0307501e+01, 3.3042500e+01, 3.6326000e+01, 3.8622997e+01, 4.1292500e+01, 4.4293495e+01, 4.7881500e+01, 5.1105499e+01, 5.3708996e+01, 5.6908497e+01, 5.9103497e+01, 6.1926003e+01, 6.6175499e+01, 6.9841499e+01, 7.3534996e+01, 7.8712997e+01, 8.3992500e+01, 8.7227493e+01, 9.1489487e+01, 9.6500992e+01, 1.0068549e+02, 1.0625399e+02, 1.1245149e+02, 1.1828050e+02, 1.2343950e+02, 1.2875299e+02, 1.3531699e+02, 1.4146500e+02, 1.4726399e+02, 1.5307101e+02, 1.5917000e+02, 1.6554350e+02, 1.7167050e+02, 1.7897350e+02, 1.8766650e+02, 1.9705751e+02, 2.0610300e+02, 2.1421350e+02, 2.2146150e+02, 2.2975949e+02, 2.3886848e+02, 2.4766153e+02, 2.5618802e+02, 2.6506250e+02, 2.7528250e+02, 2.8465201e+02, 2.9246451e+02, 3.0088300e+02, 3.1069800e+02, 3.2031000e+02, 3.2950650e+02, 3.3929001e+02, 3.4919598e+02, 3.5904755e+02, 3.6873303e+02, 3.7849451e+02, 3.8831549e+02, 3.9915201e+02, 4.1044501e+02, 4.2201651e+02, 4.3467300e+02, 4.4735904e+02, 4.5926651e+02, 4.7117001e+02, 4.8231406e+02, 4.9426105e+02, 5.0784149e+02, 5.2100049e+02, 5.3492249e+02, 5.4818701e+02, 5.6144202e+02, 5.7350153e+02, 5.8634998e+02, 5.9905096e+02, 6.1240802e+02, 6.2555353e+02, 6.3893542e+02, 6.5263202e+02, 6.6708154e+02, 6.8029950e+02, 6.9236456e+02, 7.0441150e+02, 7.1579163e+02, 7.2795203e+02, 7.4106995e+02, 7.5507953e+02, 7.6881946e+02, 7.8363702e+02, 7.9864905e+02, 8.1473901e+02, 8.3018762e+02, 8.4492249e+02, 8.6007306e+02, 8.7455353e+02, 8.8938556e+02, 9.0509601e+02, 9.2196307e+02, 9.3774091e+02, 9.5391345e+02, 9.7015198e+02, 9.8671466e+02, 1.0042726e+03, 1.0209606e+03, 1.0379355e+03, 1.0547625e+03, 1.0726985e+03, 1.0912705e+03, 1.1100559e+03, 1.1288949e+03, 1.1476450e+03, 
1.1654260e+03, 1.1823262e+03, 1.1997356e+03, 1.2171041e+03, 1.2353951e+03, 1.2535184e+03, 1.2718250e+03, 1.2903676e+03, 1.3086545e+03, 1.3270005e+03, 1.3444775e+03, 1.3612805e+03, 1.3784171e+03, 1.3958615e+03, 1.4131825e+03, 1.4311034e+03, 1.4489685e+03, 1.4677334e+03, 1.4869026e+03, 1.5062087e+03, 1.5258719e+03, 1.5452015e+03, 1.5653271e+03, 1.5853635e+03, 1.6053860e+03, 1.6247255e+03, 1.6436824e+03, 1.6632330e+03, 1.6819221e+03, 1.7011276e+03, 1.7198782e+03, 1.7383060e+03, 1.7565670e+03, 1.7749023e+03, 1.7950280e+03, 1.8149988e+03, 1.8360586e+03, 1.8572985e+03, 1.8782219e+03, 1.8991390e+03, 1.9200371e+03, 1.9395586e+03, 1.9595035e+03, 1.9790668e+03, 1.9995455e+03, 2.0203715e+03, 2.0416791e+03, 2.0616587e+03, 2.0819294e+03, 2.1032202e+03, 2.1253989e+03, 2.1470112e+03, 2.1686660e+03, 2.1908926e+03, 2.2129436e+03, 2.2349995e+03, 2.2567026e+03, 2.2784224e+03, 2.2997925e+03, 2.3198750e+03, 2.3393770e+03, 2.3588149e+03, 2.3783970e+03, 2.3988135e+03, 2.4175618e+03, 2.4363840e+03, 2.4572385e+03, 2.4773455e+03, 2.4965142e+03, 2.5157107e+03, 2.5354666e+03, 2.5554331e+03, 2.5757551e+03, 2.5955181e+03, 2.6157085e+03, 2.6348906e+03, 2.6535190e+03, 2.6727512e+03, 2.6923147e+03, 2.7118843e+03]) x1 = np.array([28.427988, 28.891748, 30.134018, 29.833858, 30.540195, 31.762226, 32.163025, 31.623648, 31.964993, 32.73733, 32.562325, 32.89953, 33.064743, 32.76882, 32.1024, 32.171394, 33.363426, 34.328148, 36.24527, 35.877434, 35.29762, 35.193832, 35.61119, 36.50994, 35.615444, 35.2758, 34.447975, 34.183205, 35.781815, 35.510662, 35.277668, 35.26543, 34.944313, 35.301414, 34.63578, 34.36223, 35.496872, 35.488243, 35.494583, 35.21087, 34.275524, 33.945126, 33.63986, 33.904293, 33.553017, 34.348408, 33.84105, 32.8437, 32.19287, 31.688663, 32.035015, 31.641226, 31.138266, 30.629492, 30.111526, 29.571909, 29.244211, 28.42031, 27.908197, 27.316568, 26.909412, 25.928982, 25.03047, 24.354822, 23.54626, 22.88031, 23.000391, 22.300774, 21.988918, 21.467094, 21.730871, 23.060678, 22.910374, 
24.45383, 23.610855, 24.594006, 24.263508, 25.077124, 23.9773, 22.611958, 21.88306, 21.014484, 19.674965, 18.745205, 20.225956, 19.433172, 19.451014, 18.264421, 17.588757, 16.837574, 17.252535, 18.967127, 19.111462, 19.90994, 19.15653, 18.49522, 17.376019, 17.35794, 16.200405, 17.9445, 18.545986, 17.69698, 20.665318, 20.90071, 20.32658, 21.27805, 21.145922, 19.32898, 19.160307, 18.60541, 18.902897, 18.843922, 17.890692, 18.197395, 17.662706, 18.578962, 18.898802, 18.435923, 17.644451, 16.393314, 15.570944, 16.779602, 15.74104, 15.041967, 14.544464, 15.014386, 14.156769, 13.591232, 12.386208, 11.133551, 10.472783, 9.7923355, 10.571391, 11.245247, 10.063455, 10.742685, 8.819294, 8.141182, 6.9487176, 6.3410373, 7.033326, 6.5856943, 6.0214376, 6.6087174, 9.583405, 9.4608135, 9.183213, 10.673293, 9.477165, 8.667246, 7.3392615, 6.2609572, 5.5752296, 4.4312773, 4.0997415, 4.127005, 4.072541, 3.5704772, 2.7370691, 2.3750854, 2.0708292, 3.4086852, 3.8237891, 3.9072614, 3.1760776, 2.4963813, 1.5232614, 0.931248, 0.49159998, 0.21676798, 0.874704, 2.0560641, 1.5494559, 3.0944476, 2.6151357, 2.7285278, 3.4450078, 3.4614875, 5.779072, 8.063728, 7.7077436, 7.8576636, 7.4494233, 6.5933595, 6.1667037, 4.9452477, 5.6894236, 6.0578876, 5.9922714, 5.060448, 6.074832, 6.7870073, 5.7388477, 5.8681116, 4.7604475, 4.2740316, 3.785328, 4.060576, 4.9203672, 5.355184, 4.793792, 3.8007674, 3.6115997, 2.7794237, 2.5385118, 5.1410074, 5.5506234, 7.638063, 7.512544, 6.617264, 6.5637918, 6.452815]) # define colormap N = 5 # number of desired color bins cmap = plt.cm.get_cmap('RdYlGn_r', N) # define the bins and normalize bounds = np.linspace(0, 1, N + 1) norm = matplotlib.colors.BoundaryNorm(bounds, cmap.N) fig, ax = plt.subplots(figsize=(15, 10)) colormesh = ax.pcolormesh(bounds2, bounds1, matrix, cmap=cmap, norm=norm, linewidths=0.1,shading='gouraud') cs = plt.contour(bounds2,bounds1, matrix, [0.5], colors='k') ax.clabel(cs, cs.levels) ax.tick_params(axis='x', which='major', rotation=50) 
ax.set_xticks(bounds2)
ax.set_yticks(bounds1)
cbar = fig.colorbar(colormesh, ax=ax)
cbar.set_ticks(bounds)
ax.plot(x2, x1, color='black', marker='o')
plt.show()
And the output gives:
How to properly plot the pdf of a beta function in scipy.stats
I am trying to fit a beta distribution to some data, and then plot how well the beta distribution fits the data. But the output looks really weird and incorrect.
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
x = np.array([0.9999999 , 0.9602287 , 0.8823198 , 0.83825594, 0.92847216, 0.9632976 , 0.90275735, 0.8383094 , 0.9826664 , 0.9141795 , 0.88799196, 0.9272752 , 0.94456017, 0.90466917, 0.8905505 , 0.95424247, 0.781545 , 0.9489085 , 0.9578988 , 0.8644015 ])
beta_params = stats.beta.fit(x)
print(beta_params)
#(3.243900357315478, 1.5909897101396109, 0.7270083219563888, 0.27811444901271615)
beta_pdf = stats.beta.pdf(x, beta_params[0], beta_params[1], beta_params[2], beta_params[3])
print(beta_pdf)
#[2.70181543 6.8442073 4.98204632 2.82445508 6.76055614 6.75910611
#5.90419012 2.82696622 5.58521916 6.34096675 5.2508072 6.73212694
#6.98854653 5.98225724 5.36937625 6.9519977 0.67812362 6.99116729
#6.89484982 4.10113147]
plt.plot(x, beta_pdf)
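I'm not a statistician, but looking at your code I see that x is unordered. plt.plot connects the points in the order they are given, so an unsorted x makes the line jump back and forth. Does sorting x before the fit help?
x = np.sort(x)
beta_params = stats.beta.fit(x)
Doing so, you'd get this: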
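An alternative to sorting the data is to evaluate the fitted pdf on an evenly spaced, already ordered grid. The sketch below reuses the four fitted parameters printed in the question; the names `grid`, `pdf` and `area`, and the choice of 200 grid points, are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

# Fitted parameters (a, b, loc, scale) as printed in the question.
a, b, loc, scale = (3.243900357315478, 1.5909897101396109,
                    0.7270083219563888, 0.27811444901271615)

# Evenly spaced, increasing x values spanning the support [loc, loc + scale].
grid = np.linspace(loc, loc + scale, 200)
pdf = stats.beta.pdf(grid, a, b, loc=loc, scale=scale)

# Sanity check: a pdf should integrate to roughly 1 (trapezoidal rule).
dx = grid[1] - grid[0]
area = np.sum(0.5 * (pdf[:-1] + pdf[1:])) * dx
```

Passing `grid` and `pdf` to `plt.plot` then draws a smooth curve that can be overlaid on a normalised histogram of the data.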
Finding all points on a slope of a signal
I have a 1d signal that is given as following x = np.array([34.69936612, 34.70083619, 37.38802174, 39.67141565, 49.05662135, 63.87593075, 67.70815746, 72.06562117, 79.31063707, 85.13125285, 83.34185985, 72.74589905, 57.34778159, 58.63283664, 64.92526896, 65.89153823, 66.07273386, 59.68722257, 59.6801125 , 59.41456929, 58.19250575, 59.92192524, 58.42078866, 55.45131784, 55.09849914, 54.95270916, 49.60804717, 43.05198366, 36.10104167, 26.88848229, 25.38550393, 28.71305461, 30.03802157, 31.3520023 , 32.59509437, 32.67600055, 32.68801666, 32.61500098, 32.65303828, 32.72752018, 32.84099458, 31.46154937, 32.70809456, 27.67842221, 25.65302641, 30.08500957, 31.41003082, 32.91935844, 32.92452782, 35.56587345, 30.09272452, 35.60898454, 49.12005244, 85.79396522, 71.81950127, 63.91915245, 69.14879246, 70.43600086, 71.71703424, 71.74830965, 70.51400086, 70.50201501, 70.50202228, 67.91157904, 66.62396413, 67.90736076, 66.5410636 , 67.96748026, 67.94177515, 65.30929726, 65.29901863, 66.60282538, 66.60666811, 66.55100589, 65.33825435, 66.55222626, 65.29656691, 66.56003543, 65.30964145, 64.07556963, 63.99339626, 62.86668124, 60.43549001, 61.68116229, 61.61140279, 62.65181523, 62.70844205, 62.77783077, 64.03882299, 65.39701193, 65.40123835, 65.41845477, 65.42941287, 65.38851043, 65.36201151, 73.33102635, 73.84443755, 70.94806114, 68.18793023, 69.20003749, 66.61045573, 65.38106858, 64.05484531, 63.88684974, 62.64420529, 62.69196131, 62.74418993, 62.72175294, 64.01210311, 64.1590297 , 63.0284751 , 64.27265024, 64.24984689, 62.90213438, 62.68704697, 62.65233151, 63.09040365, 63.10330994, 62.72787413, 63.95427977, 63.89707325, 61.38203635, 57.48587612, 60.05764178, 62.70293674, 61.38484666, 60.07995823, 61.34569129, 62.66307354, 61.38549663, 61.34835356, 61.3888718 , 61.48381576, 62.74226583, 62.83945058, 41.78731982, 38.06452548, 40.57553545, 43.10410628, 43.17965777, 44.41576623, 45.67422069, 44.44681128, 44.52855717, 45.69118569, 45.7559632 , 49.9019806 , 50.90898633, 52.2603325 , 
36.83061979, 48.36714502, 53.60110239, 53.58750501, 51.03745637, 52.15201941, 50.94600264, 48.50758345, 51.03154956, 51.32249134, 51.49705585, 53.46467209, 51.708078 , 48.1404585 , 46.32157084, 53.20416229, 60.52216104, 67.14976382, 66.6844348 , 63.99400013, 63.89292312, 63.94972283, 65.33551293, 66.54723199, 65.29004129, 67.87224117, 69.3810433 , 69.28915977, 65.32064534, 64.07644938, 64.59988251, 65.55365125, 64.3440046 , 64.4526091 , 63.38977665, 64.61810574, 63.52989024, 63.55126155, 64.4263114 , 64.43874937, 64.78594756, 66.03974204, 67.34958445, 70.07248445, 67.40968741, 66.56554542, 67.59965865, 67.85658168, 67.62022101, 67.87089721, 61.22552792, 54.07823817, 47.96332512, 53.22944931, 54.77573267, 59.55033053, 62.24247612, 62.24529416, 63.9429676 , 63.13145527, 63.29764489, 63.2723988 , 62.96359318, 63.3025575 , 63.47790181, 63.29642863, 63.50702402, 63.71413853, 63.71470992, 62.25079434, 63.46787461, 63.73497156, 63.77631175, 63.69024723, 63.55254533, 63.97794376, 64.05815662, 63.57687055, 66.80917018, 66.82863683, 66.27964922, 65.04852024, 65.29135318, 65.57783886, 65.52090561, 65.29656225, 65.32543578, 66.52825603, 67.1314033 , 50.03567181, 53.53803024, 53.56862071, 55.10515723, 55.14010716, 63.30760687, 62.7114906 , 62.95237442, 62.75869066, 64.19585539, 62.70371169, 62.65204241, 62.69394807, 62.94844878, 58.36397143, 59.68285611, 60.89452752, 60.97356663, 60.72068974, 59.62036073, 60.52789377, 59.27245489, 58.82200393, 60.10430588, 60.90874661, 61.51060014, 61.74838059, 63.28503148, 61.12237542, 60.87046418, 61.23634728, 60.99214796, 60.18921274, 60.07774571, 61.20623845, 61.65825197, 60.11025633, 60.52832382, 61.18188688, 61.31380433, 58.80528487, 57.84584698, 58.73805752, 54.85645345, 58.79988199, 60.07737149, 56.20096342, 60.3929374 , 36.77761826, 49.22568866, 55.10930206, 65.24736292, 57.08641006, 54.08806036, 53.89556268, 53.5613321 , 53.51515767, 52.30442805, 52.24562597, 53.50311397, 53.49561038, 53.53878528, 49.66610081, 52.35633014, 
55.17584864, 53.945292 , 53.79353353, 54.8626422 , 54.87102507, 56.14098197, 57.38968051, 55.1146169 , 54.92290752, 54.87858275, 54.86639486, 56.34316676, 56.16200014, 69.90905494, 68.20948497, 68.51263756, 65.64670149, 65.53992678, 67.07185321, 67.0542345 , 66.79344433, 66.75400526, 66.76640135, 66.76742739, 65.53052634, 67.01174217, 67.98329773, 69.18915578, 66.69019707, 69.61506484, 67.94096632, 67.91401491, 66.84415179, 67.88935229, 67.89356226, 69.1984958 , 52.24244378, 52.37211419, 50.95591909, 51.07641848, 50.91919022, 52.13500015, 52.26717303, 60.03109894, 65.23341727, 72.11099746, 75.02859632, 81.93540828, 81.20708335, 80.86208705, 81.04817673, 71.74669785, 73.05200134, 74.34519255, 75.72326992, 78.55812705, 76.95800509, 77.08696036, 79.61302675, 79.68123466, 78.31207499, 77.08036041, 77.18815309, 77.11523959, 75.74423094, 75.73143868, 74.48319908, 73.17138546, 66.80804931, 53.88772644, 53.87714358, 53.6088119 , 53.65411471, 54.86536613, 53.49300076, 53.52447811, 53.52000034, 56.83649529, 57.43503283, 82.38440921, 83.83190983, 83.9128805 , 83.94305425, 83.06892508, 82.91998964, 82.29555463, 82.30635577, 82.23464297, 82.20709065, 80.98821075, 83.93336979, 81.32873456, 82.46698736, 82.70592498, 83.93335761, 83.80821766, 83.84313602, 82.59867874, 82.62361191, 83.94865746, 83.83137976, 83.46075784, 82.14902814, 82.18902896, 83.83722778, 83.60064452, 83.63187976, 85.04806926, 84.87213079, 84.92473511, 84.90790341, 83.55500539, 83.59501005, 84.81195299, 84.86952928, 84.85600059, 84.81955391, 82.33120262, 78.56908599, 73.14783901, 64.99883861, 66.78701764, 64.5916058 , 64.77055337, 64.56918786, 65.02605783, 65.01019955, 64.78145201, 64.77581828, 64.55221044, 64.34285288, 62.8764752 , 64.57949744, 63.17957281, 61.89857751, 63.48365778, 55.62801456, 43.17986365]) I want to find all the slopes for this signal. I have tried first order difference and second order difference (np.diff and taking the difference of the difference). 
But the points on a slope have very small differences, in contrast to the points at the beginning or the end, where the difference is bigger. Here is what I have tried:
def detect_slope(signal, window_size = 3, threshold = 5):
    list_ = []
    for i in range(window_size, len(signal)-window_size):
        diff_ = np.mean(signal[i-window_size:i]) - np.mean(signal[i:window_size+i])
        list_.append(diff_)
    first_order_diff = np.array(list_)
    d = np.where(np.abs(first_order_diff) < threshold, 0 , first_order_diff)
    idx = np.where(np.abs(d) != 0)
    # might need some offset because we are doing some smoothing, but just use raw idx for now
    # second-order difference
    diff_list = np.array(list_).copy()
    dd = np.diff(diff_list)
    print(dd.shape)
    dd_idx = np.where(np.abs(dd) > 0.5)
    return diff_list, dd, idx, dd_idx
I have played around with the 1st and 2nd order differences but nothing seems to work. I'm trying to find all the peaks and troughs and exclude them, along with neighbors that have close enough values. Attached is my desired output. Sorry for the crappy pic.
It is not clear what you want here, but if your issue is that the diff is picking up local changes and you want to focus on global changes, smooth the signal first.
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import savgol_filter
plt.plot(x)
x = savgol_filter(x, 21, 3)
plt.plot(x)
diff_list, dd, idx, dd_idx = detect_slope(x)
plt.plot(diff_list)
plt.show()
This gives a plot where blue is your original signal, orange is your smoothed signal and green is your new diff. You can set it to pick up changes at various levels by playing around with the two parameters of savgol_filter (the window length and the polynomial order). The more aggressively you smooth your signal, the more global changes (and the fewer local changes) the derivative picks up.
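A related option, sketched here as an assumption and not as what the answer above does, is to let `savgol_filter` return the smoothed first derivative directly through its `deriv` argument and then threshold it. The synthetic `signal`, the window settings and the threshold of 1.0 are hypothetical values chosen for the demonstration.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical signal: flat at 10, a ramp rising by 2 per sample, flat at 60.
signal = np.concatenate([
    np.full(50, 10.0),
    np.linspace(10.0, 60.0, 26),
    np.full(50, 60.0),
])

# deriv=1 returns the derivative of the locally fitted polynomial
# (per sample, because delta defaults to 1.0).
slope = savgol_filter(signal, window_length=11, polyorder=2, deriv=1)

# Points whose smoothed slope exceeds the threshold are treated as "on a slope".
on_slope = np.abs(slope) > 1.0
slope_indices = np.flatnonzero(on_slope)
```

On real data the threshold and window length would need tuning, and the indices near the ends of each ramp depend on the window size.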
You can try the find_peaks function from scipy. It finds the peaks, and its parameters control how sensitive the detection is. The most useful one in your case is prominence ("how much you have to go down before finding another peak"). I apply it to your signal as-is to get the maxima and to the negated signal to get the minima.
import numpy as np
from matplotlib import pyplot as plt
from scipy.ndimage import median_filter
from scipy.signal import find_peaks
# Find peaks (maxima) and troughs (minima)
yhat = x
max_peaks,_ = find_peaks(yhat, prominence=10 )
min_peaks,_ = find_peaks(-yhat, prominence=10 )
# Plot data with the detected peaks and troughs
fig, ax2 = plt.subplots(figsize=(20, 10))
ax2.plot(max_peaks, yhat[max_peaks], "xr",markersize=20)
ax2.plot(min_peaks, yhat[min_peaks], "xr",markersize=20)
ax2.plot(yhat,'-')
ax2.plot(yhat,'o',markersize=4)
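To make the effect of prominence concrete, here is a tiny sketch on a made-up array (the values in `sig` are hypothetical and chosen by hand so the result is easy to follow).

```python
import numpy as np
from scipy.signal import find_peaks

# Hand-made signal with two large peaks and three small bumps.
sig = np.array([0, 1, 0, 5, 0, 1.5, 1, 6, 0, 0.5, 0], dtype=float)

# Without constraints every local maximum is reported.
all_peaks, _ = find_peaks(sig)

# With a prominence threshold only peaks that stand out by at least 3 remain.
big_peaks, props = find_peaks(sig, prominence=3)
```

Here `all_peaks` contains the indices 1, 3, 5, 7 and 9, while `big_peaks` keeps only 3 and 7; `props["prominences"]` holds the prominence of each retained peak.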
Problem with 2D Surface Polar Plot in Python
I am trying to replicate this 2D Surface Polar Plot (it's the thickness distribution of a wafer): Here is my code (the data is included): import numpy as np from scipy.interpolate import griddata import matplotlib.pyplot as plt x_pos = np.array([ 0. , 11.748, 0. , -11.748, 0. , 21.705, 21.705, 8.988, -8.988, -21.705, -21.705, -8.988, 8.988, 35.245, 30.517, 17.623, 0. , -17.623, -30.517, -35.245, -30.517, -17.623, 0. , 17.623, 30.517, 46.098, 46.098, 39.078, 26.111, 9.164, -9.164, -26.111, -39.078, -46.098, -46.098, -39.078, -26.111, -9.164, 9.164, 26.111, 39.078]) y_pos = np.array([ 0. , 0. , 11.748, 0. , -11.748, -8.988, 8.988, 21.705, 21.705, 8.988, -8.988, -21.705, -21.705, 0. , 17.623, 30.517, 35.245, 30.517, 17.623, 0. , -17.623, -30.517, -35.245, -30.517, -17.623, -9.164, 9.164, 26.111, 39.078, 46.098, 46.098, 39.078, 26.111, 9.164, -9.164, -26.111, -39.078, -44.29 , -44.29 , -39.078, -26.111]) values = np.array([721.0099, 679.8029, 708.8115, 687.4061, 682.9654, 593.4934, 614.5019, 605.3102, 600.0777, 588.2717, 580.5319, 584.1863, 598.9501, 584.5857, 565.1545, 588.9718, 570.4216, 553.165 , 540.6561, 555.0057, 533.8918, 552.6648, 567.4707, 590.8452, 574.8677, 530.336 , 556.7502, 562.9214, 598.5813, 616.5076, 620.0647, 612.7661, 600.2197, 541.4696, 510.0406, 531.339 , 509.6992, 540.1819, 539.2797, 493.9553, 514.0744]) # Making the contour plot # CONVERTING TO Polar Coordinates def cart2pol(x, y): r = np.sqrt(x**2 + y**2) theta = np.arctan2(y, x) return(r, theta) r, theta = cart2pol(x_pos, y_pos) r_grid=np.linspace(0,50,50) theta_grid=np.linspace(-np.pi,np.pi,50) r_matrix, theta_matrix = np.meshgrid(r_grid,theta_grid) # Interpolate onto polar grid values_grid_interp = griddata((r, theta), values, (r_matrix,theta_matrix),method='linear') # #-- Plot... 
fig, ax = plt.subplots(subplot_kw=dict(projection='polar'))
ax.contourf(theta_grid, r_grid, values_grid_interp)
What I get is this: As you can see, it does not match the original plot at all, but I am having difficulties seeing what I did wrong.
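One possible cause, which the post itself does not confirm, is the array layout: `np.meshgrid(r_grid, theta_grid)` returns arrays of shape `(len(theta_grid), len(r_grid))`, whereas `contourf(theta_grid, r_grid, Z)` expects `Z` with shape `(len(r_grid), len(theta_grid))`. Because both grids have 50 points, no error is raised. Interpolating in (r, theta) space also ignores that theta wraps around at ±π. The sketch below uses synthetic stand-in data (`px`, `py`, `vals` are hypothetical, not the wafer measurements) and interpolates in Cartesian coordinates onto a polar grid with the layout `contourf` expects.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical measurement sites on a regular Cartesian grid.
gx, gy = np.meshgrid(np.linspace(-50, 50, 11), np.linspace(-50, 50, 11))
px, py = gx.ravel(), gy.ravel()
vals = px + 2.0 * py  # synthetic "thickness": a plane, so linear interpolation is exact

r_grid = np.linspace(0, 40, 30)
theta_grid = np.linspace(-np.pi, np.pi, 60)

# meshgrid(theta, r) yields arrays of shape (len(r_grid), len(theta_grid)),
# the layout contourf(theta_grid, r_grid, Z) expects for Z.
theta_m, r_m = np.meshgrid(theta_grid, r_grid)

# Interpolate in Cartesian space to avoid an artificial seam at theta = +/- pi.
xq = r_m * np.cos(theta_m)
yq = r_m * np.sin(theta_m)
z = griddata((px, py), vals, (xq, yq), method="linear")
```

On a polar axes, `ax.contourf(theta_grid, r_grid, z)` would then draw the result. With the real data, query points outside the convex hull of the measurement sites come back as NaN under linear interpolation, so the outer radius of the grid may need to be reduced.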
Trying to make a histogram within python for number of magnitudes from a text file
I need help making a histogram of the number of times a magnitude falls within a range. I have a histogram made with the galaxy number, but I realized that doesn't really give any information. I have tried making a bin of the galaxy numbers but realized that didn't really matter, nor did it work.
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import csv
import math
from collections import Counter
import numpy as np
from numpy.polynomial.polynomial import polyfit
histflux = []
galnum = []
with open('/home/jacob/PHOTOMETRY/PHOTOM_CATS/SpARCS-0035_totalall_HAWKIKs.cat', 'r') as magfile:
    magplots = csv.reader(magfile)
    firstmagline = magfile.readline()
    for line in magfile:
        id , ra , dec , x , y , hawkiks_tot , k_flag , k_star , k_fluxrad , totmask , hawkiks , ehawkiks , vimosu , evimosu , vimosb , \
        evimosb , vimosv , evimosv , vimosr , evimosr , vimosi , evimosi , decamz , edecamz , fourstarj1 , efourstarj1 , hawkij , ehawkij , \
        irac1 , eirac1 , irac2, eirac2 , irac3 , eirac3 , irac4 , eirac4 = line.split()
        goodflag = float(k_flag)
        goodhawki = float(hawkiks)
        if goodflag != 0.0:
            continue
        try:
            histfluxk = -2.5 * math.log10(goodhawki) +25
        except ValueError:
            print(histfluxk)
        histflux.append(histfluxk)
        galnum.append(float(id))
plt.hist([galnum, histflux])
plt.xlabel('Galaxy Number')
plt.ylabel('K-Band Magnitude')
plt.title('K-Band Magnitudes of Galaxies')
plt.legend()
plt.show()
What I want to see is a histogram with the x axis ranging from 0 to 20 flux magnitudes in intervals of 2. The y-axis should be the number of times that the flux magnitudes were within those ranges. I am stumped on how to do this because I am new to Python and especially to making graphs in Python.
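The binning described above (0 to 20 in steps of 2, with counts on the y-axis) can be sketched with explicit bin edges. The flux values below are made up for illustration and only the magnitude formula is taken from the question.

```python
import numpy as np

# Hypothetical K-band fluxes standing in for the catalogue column.
flux = np.array([1.0e4, 2.5e3, 4.0e2, 1.0e5, 3.0e3, 8.0e2, 5.0e5])

# Same magnitude conversion as in the question.
mags = -2.5 * np.log10(flux) + 25

# Bin edges 0, 2, 4, ..., 20 give ten bins of width 2.
bins = np.arange(0, 22, 2)
counts, edges = np.histogram(mags, bins=bins)
```

For the plot itself, passing only the magnitudes and the same edges, as in `plt.hist(histflux, bins=np.arange(0, 22, 2))`, puts magnitude on the x-axis and the number of galaxies per bin on the y-axis; the galaxy numbers are not needed for this.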