Scipy interpolate.splprep error "Invalid Inputs" - python
I am trying to fit an interpolating curve through a set of (x, y) points using SciPy's interpolate.splprep method, following the procedure in this StackOverflow answer. My code (with the data) is given below. Please excuse the large dataset; I include it because the code works perfectly fine on a different dataset. Kindly scroll to the bottom to see the implementation.
#!/usr/bin/env python3
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
# -----------------------------------------------------------------------------
# Data
xp=np.array([ -1.19824526e-01, -1.19795807e-01, -1.22298912e-01,
-1.24784611e-01, -1.27233423e-01, -1.27048456e-01,
-1.29424259e-01, -1.31781573e-01, -1.34102825e-01,
-1.36386619e-01, -1.41324999e-01, -1.43569618e-01,
-1.48471481e-01, -1.53300646e-01, -1.55387133e-01,
-1.57436481e-01, -1.53938796e-01, -1.58562951e-01,
-1.53139517e-01, -1.50456275e-01, -1.49637920e-01,
-1.48774455e-01, -1.47843528e-01, -1.44278335e-01,
-1.43299274e-01, -1.39716798e-01, -1.36111285e-01,
-1.32534352e-01, -1.28982866e-01, -1.25433151e-01,
-1.21912263e-01, -1.16106245e-01, -1.12701128e-01,
-1.09303316e-01, -1.05947571e-01, -1.00467194e-01,
-9.72083398e-02, -9.39822094e-02, -9.08033710e-02,
-8.96420533e-02, -8.65053261e-02, -8.34162875e-02,
-8.03788778e-02, -7.73929193e-02, -7.62032638e-02,
-7.32655732e-02, -7.03760465e-02, -6.91826390e-02,
-6.63378816e-02, -6.35537275e-02, -6.08302060e-02,
-5.96426925e-02, -5.69864087e-02, -5.43931715e-02,
-5.18641746e-02, -4.93958173e-02, -4.82415854e-02,
-4.58486281e-02, -4.35196817e-02, -4.01162919e-02,
-3.79466513e-02, -3.48161871e-02, -3.18596693e-02,
-2.90650417e-02, -2.64251761e-02, -2.31429101e-02,
-1.94312163e-02, -1.73997964e-02, -1.55068323e-02,
-1.43163160e-02, -1.31800087e-02, -1.20987991e-02,
-1.10708190e-02, -1.05380016e-02, -9.58116017e-03,
-9.06399242e-03, -8.54450012e-03, -7.67847396e-03,
-7.17608354e-03, -6.67181154e-03, -5.89474349e-03,
-5.40878144e-03, -4.92121197e-03, -4.43202070e-03,
-3.94148294e-03, -3.44986011e-03, -2.82410814e-03,
-2.35269319e-03, -1.88058008e-03, -1.47393691e-03,
-9.78376399e-04, -4.82633521e-04, 1.33099164e-05,
5.09212801e-04, 1.05098855e-03, 1.56929991e-03,
2.08706303e-03, 2.72055571e-03, 3.26012954e-03,
3.79870854e-03, 4.33573131e-03, 4.87172652e-03,
5.40640816e-03, 5.93914581e-03, 6.47004490e-03,
6.99921852e-03, 7.52610639e-03, 7.70592714e-03,
8.20559501e-03, 8.70268809e-03, 9.19766855e-03,
9.68963219e-03, 1.01781695e-02, 1.01960805e-02,
1.06577199e-02, 1.11156340e-02, 1.15703286e-02,
1.20215921e-02, 1.24693015e-02, 1.29129042e-02,
1.33526781e-02, 1.37884367e-02, 1.42204360e-02,
1.46473802e-02, 1.50699789e-02, 1.54884533e-02,
1.59020551e-02, 1.63103362e-02, 8.12110387e-02,
7.80794051e-02, 1.67140103e-02, 8.31537241e-02,
7.99472912e-02, 7.99472912e-02, 7.67983984e-02,
1.71128723e-02, 8.50656342e-02, 8.17851028e-02,
7.85638577e-02, 7.53861405e-02, 1.75061328e-02,
8.19411806e-02, 7.38391281e-02, 1.78939640e-02,
8.70866930e-02, 8.36940292e-02, 8.03586974e-02,
7.70534244e-02, 7.70534244e-02, 7.38013540e-02,
7.38013540e-02, 7.06147796e-02, 1.82766038e-02,
8.54279559e-02, 8.20231372e-02, 7.53294330e-02,
7.20765174e-02, 1.86539411e-02, 8.36524496e-02,
7.85095832e-02, 7.51592888e-02, 7.18792721e-02,
1.90250409e-02, 7.82997201e-02, 7.49183992e-02,
7.49183992e-02, 7.16144248e-02, 7.16144248e-02,
6.83771846e-02, 1.93904576e-02, 7.46192919e-02,
7.12865685e-02, 7.12865685e-02, 6.80175748e-02,
1.97501330e-02, 7.42568965e-02, 7.08996495e-02,
7.08996495e-02, 6.75887344e-02, 2.01042729e-02,
7.38173451e-02, 6.70923613e-02, 2.13903228e-02,
7.50479910e-02, 6.82108239e-02, 5.69753762e-02,
5.24303656e-02, 5.24303656e-02, 4.52683211e-02,
4.52683211e-02, 4.25493203e-02, 2.17470907e-02,
7.45062992e-02, 6.76173090e-02, 6.76173090e-02,
6.42925100e-02, 6.42925100e-02, 5.94649095e-02,
5.94649095e-02, 3.92303424e-02, 2.20977481e-02,
7.21341379e-02, 3.72338037e-02, 2.24415025e-02,
7.14448972e-02, 3.40025442e-02, 2.27777176e-02,
7.07064856e-02, 3.57533680e-02, 2.41421550e-02,
6.81719132e-02, 3.62534788e-02, 2.44798556e-02,
6.56110398e-02, 3.80586628e-02, 3.29287629e-02,
2.93070471e-02, 2.48093588e-02, 6.13326924e-02,
3.85518913e-02, 3.46206958e-02, 2.85091877e-02,
2.51312268e-02, 5.38330011e-02, 3.76841669e-02,
3.50540735e-02, 2.77018960e-02, 2.65615352e-02,
5.28838088e-02, 3.81396763e-02, 3.54777506e-02,
2.80364970e-02, 2.68822682e-02, 5.03377702e-02,
3.85814254e-02, 3.58887890e-02, 4.93316503e-02,
4.04098395e-02, 3.62892096e-02, 4.67615526e-02,
4.22828625e-02, 3.80435955e-02, 3.84376145e-02,
4.02332775e-02, 4.06156847e-02, 4.24553741e-02,
4.43352031e-02, 4.47040511e-02, 4.66233682e-02,
4.69790035e-02, 4.89341212e-02, 5.09256192e-02,
5.12584867e-02, 5.32790231e-02, 5.35890744e-02,
5.38831411e-02, 5.41625645e-02, 5.44267004e-02,
5.46700348e-02, 5.48984863e-02, 5.51117932e-02,
5.53082440e-02, 5.54849716e-02, 5.56464539e-02,
5.57928396e-02, 5.59201893e-02, 5.60294455e-02,
5.61233441e-02, 5.62020138e-02, 5.62604489e-02,
5.63017253e-02, 5.63275468e-02, 5.63341408e-02,
5.63226424e-02, 5.62957310e-02, 5.62533699e-02,
5.61937444e-02, 5.61140110e-02, 5.60191106e-02,
5.59087917e-02, 5.57801898e-02, 5.56328560e-02,
5.54704141e-02, 5.70775198e-02, 5.68728844e-02,
5.66515897e-02, 5.64149230e-02, 5.61622287e-02,
5.76630266e-02, 5.73643873e-02, 5.70502787e-02,
5.67190716e-02, 5.63668473e-02, 5.59997391e-02,
5.73489998e-02, 5.69355151e-02, 5.65029189e-02,
5.77751241e-02, 5.72977910e-02, 5.67990710e-02,
5.79863269e-02, 5.74393835e-02, 5.68773454e-02,
5.62926261e-02, 5.56922722e-02, 5.50771272e-02,
5.44454686e-02, 5.37935810e-02, 5.31273003e-02,
5.24468411e-02, 5.17483760e-02, 5.10330229e-02,
5.03036776e-02, 4.95607328e-02, 4.87997085e-02,
4.80238054e-02, 4.72347342e-02, 4.64331616e-02,
4.56132865e-02, 4.47805574e-02, 4.39358955e-02,
4.30782240e-02, 4.22044750e-02, 4.01052073e-02,
3.92354976e-02, 3.83523540e-02, 3.74567873e-02,
3.65508593e-02, 3.45751478e-02, 3.36740998e-02,
3.27625023e-02, 3.18417381e-02, 3.09129121e-02,
2.90665673e-02, 2.81454989e-02, 2.72171846e-02,
2.62807950e-02, 2.53342284e-02, 2.43816409e-02,
2.34221736e-02, 2.24541496e-02, 2.08179757e-02,
1.98678098e-02, 1.89113740e-02, 1.79488243e-02,
1.69806146e-02, 1.65158032e-02, 1.55075714e-02,
1.44932106e-02, 1.34746855e-02, 1.24525920e-02,
1.14268067e-02, 1.03968750e-02, 9.36414487e-03,
8.58823755e-03, 7.51804527e-03, 6.44485601e-03,
5.37002690e-03, 4.29398700e-03, 3.31511044e-03,
2.20302298e-03, 1.09069996e-03, -2.27320426e-05,
-1.16892664e-03, -2.31490869e-03, -3.46060569e-03,
-4.74178052e-03, -5.91852523e-03, -7.09360822e-03,
-8.26683115e-03, -9.43736653e-03, -1.06042682e-02,
-1.17686419e-02, -1.33107457e-02, -1.45010352e-02,
-1.56869180e-02, -1.68693838e-02, -1.80464175e-02,
-1.97732638e-02, -2.09722818e-02, -2.21650612e-02,
-2.40185758e-02, -2.52303300e-02, -2.71803154e-02,
-2.84115598e-02, -3.04489552e-02, -3.16936647e-02,
-3.29299358e-02, -3.50861051e-02, -3.63332401e-02,
-3.85745058e-02, -3.98348648e-02, -4.21660006e-02,
-4.34302610e-02, -4.46836493e-02, -4.59254575e-02,
-4.71530952e-02, -4.96209305e-02, -4.95594200e-02,
-5.07435074e-02, -5.19101301e-02, -5.16977894e-02,
-5.14280802e-02, -5.11057669e-02, -5.07251169e-02,
-5.16985297e-02, -5.12126585e-02, -5.06852098e-02,
-5.15589749e-02, -5.09397027e-02, -5.17615499e-02,
-5.10672514e-02, -5.18313966e-02, -5.25816754e-02,
-5.33179227e-02, -5.40360028e-02, -5.47358953e-02,
-5.54213064e-02, -5.77400978e-02, -5.84092053e-02,
-5.90603644e-02, -6.14284845e-02, -6.38379284e-02,
-6.62872262e-02, -6.69166162e-02, -6.93865431e-02,
-7.18947674e-02, -7.44284962e-02, -7.69969804e-02,
-7.96063191e-02, -8.01834105e-02, -8.28053535e-02,
-8.54623715e-02, -8.59961071e-02, -8.86660185e-02,
-8.91520913e-02, -9.18335218e-02, -9.45402708e-02,
-9.49610563e-02, -9.76401856e-02, -1.00332460e-01,
-1.03032191e-01, -1.03358935e-01, -1.06040606e-01,
-1.06322470e-01, -1.08984284e-01, -1.09195131e-01,
-1.11833426e-01, -1.11994247e-01, -1.14596404e-01,
-1.17192554e-01, -1.17248317e-01])
yp = np.array([ -3.90948536e-05, -2.12984775e-03, -4.31095583e-03,
-6.58019633e-03, -8.93758156e-03, -1.11568100e-02,
-1.36444162e-02, -1.62222092e-02, -1.88895170e-02,
-2.16446498e-02, -2.49629308e-02, -2.79508857e-02,
-3.16029501e-02, -3.54376380e-02, -3.87881494e-02,
-4.22310942e-02, -4.41873802e-02, -4.85246067e-02,
-4.68663315e-02, -4.60459599e-02, -4.86676408e-02,
-5.12750434e-02, -5.38586293e-02, -5.54310799e-02,
-5.79452426e-02, -5.93547929e-02, -6.06497762e-02,
-6.18505946e-02, -6.29584706e-02, -6.39609234e-02,
-6.48713094e-02, -6.44090476e-02, -6.51181556e-02,
-6.57260659e-02, -6.62541381e-02, -6.52943568e-02,
-6.56184758e-02, -6.58578685e-02, -6.60229010e-02,
-6.76012689e-02, -6.76366183e-02, -6.76004442e-02,
-6.74972483e-02, -6.73282385e-02, -6.86657097e-02,
-6.83738036e-02, -6.80140059e-02, -6.92366190e-02,
-6.87491258e-02, -6.82071471e-02, -6.76134579e-02,
-6.86669494e-02, -6.79695621e-02, -6.72259327e-02,
-6.64391135e-02, -6.56069234e-02, -6.64563885e-02,
-6.55361171e-02, -6.45783892e-02, -6.18312378e-02,
-6.07850085e-02, -5.80009440e-02, -5.52383021e-02,
-5.24888121e-02, -4.97523554e-02, -4.54714570e-02,
-3.98863362e-02, -3.73592876e-02, -3.48720213e-02,
-3.37707235e-02, -3.26655171e-02, -3.15625118e-02,
-3.04616664e-02, -3.06508019e-02, -2.95344258e-02,
-2.96968330e-02, -2.98505905e-02, -2.87101259e-02,
-2.88391064e-02, -2.89597166e-02, -2.77967360e-02,
-2.78958771e-02, -2.79854740e-02, -2.80670276e-02,
-2.81405467e-02, -2.82051366e-02, -2.69913041e-02,
-2.70365186e-02, -2.70739448e-02, -2.83768113e-02,
-2.83979671e-02, -2.84108899e-02, -2.84155794e-02,
-2.84104617e-02, -2.96993141e-02, -2.96767995e-02,
-2.96453017e-02, -3.09305120e-02, -3.08782748e-02,
-3.08172540e-02, -3.07460634e-02, -3.06652277e-02,
-3.05756546e-02, -3.04773301e-02, -3.03684498e-02,
-3.02505329e-02, -3.01240628e-02, -2.87032761e-02,
-2.85638294e-02, -2.84161924e-02, -2.82602014e-02,
-2.80957411e-02, -2.79220043e-02, -2.65224371e-02,
-2.63408455e-02, -2.61506690e-02, -2.59523304e-02,
-2.57465736e-02, -2.55333569e-02, -2.53114227e-02,
-2.50819674e-02, -2.48453976e-02, -2.46014650e-02,
-2.43490672e-02, -2.40896946e-02, -2.38232320e-02,
-2.35495727e-02, -2.32681400e-02, -1.11708561e-01,
-1.07398522e-01, -2.29799277e-02, -1.10281290e-01,
-1.06025945e-01, -1.06025945e-01, -1.01847844e-01,
-2.26850806e-02, -1.08812919e-01, -1.04614895e-01,
-1.00492396e-01, -9.64256156e-02, -2.23830803e-02,
-1.01124594e-01, -9.11212826e-02, -2.20738630e-02,
-1.03723227e-01, -9.96804013e-02, -9.57062055e-02,
-9.17682599e-02, -9.17682599e-02, -8.78935733e-02,
-8.78935733e-02, -8.40962884e-02, -2.17583603e-02,
-9.82127298e-02, -9.42965108e-02, -8.65980524e-02,
-8.28570139e-02, -2.14365508e-02, -9.28460674e-02,
-8.71354106e-02, -8.34157663e-02, -7.97743543e-02,
-2.11075333e-02, -8.39100274e-02, -8.02849723e-02,
-8.02849723e-02, -7.67428202e-02, -7.67428202e-02,
-7.32724167e-02, -2.07721464e-02, -7.72159766e-02,
-7.37663681e-02, -7.37663681e-02, -7.03828404e-02,
-2.04308432e-02, -7.42042591e-02, -7.08482147e-02,
-7.08482147e-02, -6.75385453e-02, -2.00834820e-02,
-7.12338454e-02, -6.47417418e-02, -2.06352744e-02,
-6.99333169e-02, -6.35600774e-02, -5.30876202e-02,
-4.88515872e-02, -4.88515872e-02, -4.21763073e-02,
-4.21763073e-02, -3.96425097e-02, -2.02588101e-02,
-6.70368116e-02, -6.08364913e-02, -6.08364913e-02,
-5.78440553e-02, -5.78440553e-02, -5.34994049e-02,
-5.34994049e-02, -3.52908904e-02, -1.98763502e-02,
-6.26583213e-02, -3.23368135e-02, -1.94880238e-02,
-5.99037138e-02, -2.85040222e-02, -1.90931928e-02,
-5.72132575e-02, -2.89247783e-02, -1.95297821e-02,
-5.32198482e-02, -2.82971986e-02, -1.91058177e-02,
-4.94013681e-02, -2.86515116e-02, -2.47888430e-02,
-2.20618305e-02, -1.86758942e-02, -4.45232330e-02,
-2.79827472e-02, -2.51286391e-02, -2.06919011e-02,
-1.82397645e-02, -3.76607947e-02, -2.63609122e-02,
-2.45208701e-02, -1.93767971e-02, -1.85788804e-02,
-3.56379420e-02, -2.56998805e-02, -2.39058698e-02,
-1.88908564e-02, -1.81130913e-02, -3.26595065e-02,
-2.50304222e-02, -2.32829732e-02, -3.07966353e-02,
-2.52257065e-02, -2.26527986e-02, -2.80693713e-02,
-2.53799880e-02, -2.28350066e-02, -2.21686432e-02,
-2.22782703e-02, -2.15723084e-02, -2.16081542e-02,
-2.15998200e-02, -2.08220272e-02, -2.07341864e-02,
-1.99180705e-02, -1.97463091e-02, -1.95241512e-02,
-1.86330762e-02, -1.83210810e-02, -1.73881714e-02,
-1.64501676e-02, -1.55073488e-02, -1.45603397e-02,
-1.36076891e-02, -1.26514336e-02, -1.16918550e-02,
-1.07281971e-02, -9.76103257e-03, -8.79150351e-03,
-7.81935696e-03, -6.84417527e-03, -5.86703766e-03,
-4.88857954e-03, -3.90851347e-03, -2.92690669e-03,
-1.94445885e-03, -9.62077293e-04, 2.10973681e-05,
1.00443470e-03, 1.98670872e-03, 2.96920518e-03,
3.95065293e-03, 4.93054490e-03, 5.90896238e-03,
6.88594418e-03, 7.86095305e-03, 8.83291761e-03,
9.80227952e-03, 1.11168744e-02, 1.21109612e-02,
1.31014370e-02, 1.40884671e-02, 1.50714343e-02,
1.65579859e-02, 1.75619959e-02, 1.85609524e-02,
1.95539892e-02, 2.05406220e-02, 2.15208623e-02,
2.31958067e-02, 2.41936890e-02, 2.51825785e-02,
2.69676402e-02, 2.79735240e-02, 2.89676199e-02,
3.08600313e-02, 3.18685118e-02, 3.28673845e-02,
3.38531747e-02, 3.48305552e-02, 3.57981735e-02,
3.67545357e-02, 3.76978426e-02, 3.86308181e-02,
3.95533112e-02, 4.04626970e-02, 4.13583593e-02,
4.22429533e-02, 4.31163338e-02, 4.39732984e-02,
4.48174616e-02, 4.56497573e-02, 4.64690781e-02,
4.72699006e-02, 4.80584575e-02, 4.88339015e-02,
4.95941309e-02, 5.03364921e-02, 4.95646923e-02,
5.02584615e-02, 5.09357803e-02, 5.15956682e-02,
5.22416815e-02, 5.13017754e-02, 5.18954788e-02,
5.24741267e-02, 5.30389590e-02, 5.35886852e-02,
5.24828002e-02, 5.29815950e-02, 5.34658214e-02,
5.39333680e-02, 5.43819816e-02, 5.48154596e-02,
5.52339801e-02, 5.56342267e-02, 5.42908141e-02,
5.46458325e-02, 5.49864909e-02, 5.53070690e-02,
5.56106186e-02, 5.76746395e-02, 5.79561804e-02,
5.82151857e-02, 5.84584712e-02, 5.86858866e-02,
5.88950787e-02, 5.90831689e-02, 5.92552324e-02,
6.12619766e-02, 6.14026109e-02, 6.15224608e-02,
6.16256880e-02, 6.17123394e-02, 6.36720486e-02,
6.37186812e-02, 6.37481408e-02, 6.56861133e-02,
6.56722393e-02, 6.56407664e-02, 6.55917721e-02,
6.74743714e-02, 6.73786194e-02, 6.72646677e-02,
6.71325153e-02, 6.69773741e-02, 6.68003322e-02,
6.66053297e-02, 6.83473413e-02, 6.81027202e-02,
6.78376164e-02, 6.75542903e-02, 6.72524243e-02,
6.88634515e-02, 6.85066915e-02, 6.81318613e-02,
6.96716731e-02, 6.92387957e-02, 7.07292209e-02,
7.02467261e-02, 7.16584645e-02, 7.11134550e-02,
7.05499862e-02, 7.18681370e-02, 7.12419450e-02,
7.24822144e-02, 7.17989089e-02, 7.29694293e-02,
7.22180480e-02, 7.14476517e-02, 7.06588875e-02,
6.98486903e-02, 7.08078412e-02, 6.81567317e-02,
6.72843393e-02, 6.63881936e-02, 6.37899997e-02,
6.12404950e-02, 5.87433383e-02, 5.62902969e-02,
5.53950962e-02, 5.29895052e-02, 5.06437557e-02,
4.97490264e-02, 4.74631181e-02, 4.65678359e-02,
4.43551128e-02, 4.34554011e-02, 4.25440351e-02,
4.16213883e-02, 4.06842153e-02, 3.97338457e-02,
3.87727819e-02, 3.89122376e-02, 3.78978623e-02,
3.68719281e-02, 3.68766567e-02, 3.68230044e-02,
3.67095055e-02, 3.55465346e-02, 3.53200609e-02,
3.50311849e-02, 3.46717730e-02, 3.42461153e-02,
3.37555022e-02, 3.23610029e-02, 3.17505933e-02,
3.10701527e-02, 2.95754797e-02, 2.87735213e-02,
2.72210019e-02, 2.62970023e-02, 2.52956273e-02,
2.36404433e-02, 2.25053642e-02, 2.12889860e-02,
1.99902757e-02, 1.81872330e-02, 1.67574555e-02,
1.49054892e-02, 1.33429656e-02, 1.14391250e-02,
9.74643800e-03, 7.79267351e-03, 5.96714375e-03,
4.05355227e-03, 2.00672241e-03])
# -----------------------------------------------------------------------------
# Use scipy to interpolate.
xp = np.r_[xp, xp[0]]
yp = np.r_[yp, yp[0]]
tck, u = interpolate.splprep([xp, yp], s=0, k=1, per=True)
xi, yi = interpolate.splev(np.linspace(0, 1, 1000), tck)
# -----------------------------------------------------------------------------
# Plot result
fig = plt.figure()
ax = plt.subplot(111)
ax.plot(xp, yp, '.', markersize=2)
ax.plot(xi, yi, alpha=0.5)
plt.show()
I get the following error on one machine (macOS):
---> tck, u = interpolate.splprep([xp, yp], s=0, k=1, per=True)
SystemError: <built-in function _parcur> returned NULL without setting an error
And this error on another machine (Ubuntu):
----> tck, u = interpolate.splprep([xp, yp], s=0, k=1, per=True)
ValueError: Invalid inputs.
interpolate.splprep uses the FORTRAN parcur routine from FITPACK (from the documentation).
My questions are -
Why does the code work for different datasets, e.g.
xp = np.array([0.1, 0.2, 0.3, 0.4])
yp = np.array([-0.1, -0.3, -0.4, 0.2])
and not for this particular one? What does the error mean?
How can I get this to work? (Using this method or any other method) i.e. either interpolate a curve or filter the outliers ...
Out of curiosity, why is the error machine (and OS) dependent?
This is how the data looks when plotted, I think you can guess which curve I'd like to interpolate to (and which outliers I'd like to remove, if possible)
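FITPACK fails if two consecutive input points are identical. With its default parameterization, splprep derives the curve parameter u from the cumulative distance between successive points, so a repeated point produces two equal parameter values, which the Fortran routine rejects. The error happens deep enough in the compiled code that the way it surfaces depends on how the libraries were compiled and linked, hence the assortment of errors.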
For example, xp[147:149], yp[147:149] (and several others):
(array([ 0.07705342, 0.07705342]), array([-0.09176826, -0.09176826]))
Keep only the points that differ from the next one:
okay = np.where(np.abs(np.diff(xp)) + np.abs(np.diff(yp)) > 0)
xp = np.r_[xp[okay], xp[-1], xp[0]]
yp = np.r_[yp[okay], yp[-1], yp[0]]
# the rest of your code
I add the last point back because the output of diff is always one element shorter, so the last one needs to be included manually. (And then of course, you put the 0th point again for periodicity)
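A self-contained sketch of the duplicate-removal idea above, using hypothetical toy data (a circle with one repeated point) rather than the dataset from the question. The helper name drop_consecutive_duplicates is made up for illustration; unlike the snippet above, it keeps the first point of each run of repeats instead of the last, which should make no difference to the fitted curve.

```python
import numpy as np
from scipy import interpolate


def drop_consecutive_duplicates(x, y):
    """Return x, y without the points that are identical to their predecessor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.ones(len(x), dtype=bool)
    # A point is kept when it differs from the previous one in x or in y.
    keep[1:] = (np.diff(x) != 0) | (np.diff(y) != 0)
    return x[keep], y[keep]


# Hypothetical toy data: 20 points on a unit circle, with point 5 repeated once.
t = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
x = np.insert(np.cos(t), 5, np.cos(t[5]))
y = np.insert(np.sin(t), 5, np.sin(t[5]))

xc, yc = drop_consecutive_duplicates(x, y)

# Close the curve by repeating the first point, as in the question.
xc = np.r_[xc, xc[0]]
yc = np.r_[yc, yc[0]]

tck, u = interpolate.splprep([xc, yc], s=0, k=1, per=True)
xi, yi = interpolate.splev(np.linspace(0, 1, 200), tck)
print(len(x), "input points,", len(xc), "points passed to splprep")
```

The closing point is appended only after the duplicates are removed, so the first and last points are the only intentional repeat in the arrays passed to splprep.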
Cutting off the weird part
This is my attempt to cut off the weird extruding part of the dataset. It uses a Gaussian filter from ndimage. The original points xp, yp are kept this time; the filtered ones are xn, yn.
from scipy import ndimage
jump = np.sqrt(np.diff(xp)**2 + np.diff(yp)**2)
smooth_jump = ndimage.gaussian_filter1d(jump, 5, mode='wrap') # sigma of 5 samples is arbitrary
limit = 2*np.median(smooth_jump) # factor 2 is arbitrary
xn, yn = xp[:-1], yp[:-1]
xn = xn[(jump > 0) & (smooth_jump < limit)]
yn = yn[(jump > 0) & (smooth_jump < limit)]
So, we remove not only duplicate points but also the points where the values jump around too much. The rest goes as before, except that the interpolation is now built from xn, yn. I plot the original points for comparison with the new (red) curve:
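The filtering step can be tried in isolation on hypothetical toy data: a unit circle on which every other point in one short stretch is pushed out to radius 2, as a rough imitation of the extruding part. This is only a sketch under that assumption; the values sigma=5 and the factor 2 are as arbitrary here as in the answer.

```python
import numpy as np
from scipy import ndimage

# Hypothetical toy data: a unit circle where every other point in one
# stretch is moved out to radius 2, mimicking an extruding spur.
n = 200
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
radius = np.ones(n)
radius[100:110:2] = 2.0
x = radius * np.cos(theta)
y = radius * np.sin(theta)

# Distance from each point to its successor, treating the curve as closed.
jump = np.hypot(np.diff(np.r_[x, x[0]]), np.diff(np.r_[y, y[0]]))

# Smooth the jump sizes along the curve; sigma of 5 samples is arbitrary.
smooth_jump = ndimage.gaussian_filter1d(jump, 5, mode='wrap')

# Drop stretches whose smoothed jump is well above the median; factor 2 is arbitrary.
limit = 2 * np.median(smooth_jump)
keep = (jump > 0) & (smooth_jump < limit)
xn, yn = x[keep], y[keep]
print(n - keep.sum(), "points removed")
```

Because the jump sizes are smoothed, the filter also discards some well-behaved points next to the spur. That is the price of not having to decide point by point which side of a large jump is the outlier.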
ax.plot(xp, yp, 'o', markersize=2)
ax.plot(xi, yi, 'r', alpha=0.5)
Related
.fill_between returns ValueError: 'y1' is not 1-dimensional
I am programming a GPR (Gaussian Process Regression) and would like to visualize it. I imported data from an Excel file, and now I would like to fill the area under and above the graph in a certain interval. This is the code I wrote:
X_ = np.linspace(X.min()-5, X.max() + 15, 1000)[:, np.newaxis]
y_pred, y_std = gpr.predict(X_, return_std = True)
fig = plt.figure(figsize = (15,10))
plt.scatter(X, y, c = 'k', alpha = 0.55)
plt.plot(X_, y_pred)
plt.fill_between(X_[:,0], y_pred-y_std, y_pred+y_std, alpha = 0.5, color = 'k')
plt.xlim(X_.min(), X_.max())
plt.xlabel('Temperature [°C]')
plt.ylabel('fd [-]')
plt.title('fd depending on the Temperature')
plt.show()
Every time I execute the program I get a ValueError ('y1' is not 1-dimensional) for this part of the code:
plt.fill_between(X_[:,0], y_pred-y_std, y_pred+y_std, alpha = 0.5, color = 'k')
There seems to be a problem with the "y_pred" values. When I substitute a number for "y_pred", it works just fine. I would really appreciate any help I can get. Thank you in advance.
I cannot run your code, but I had a similar problem when I was passing an (N, 1) array to fill_between instead of an (N,) array. I solved it by converting my data to a 1D list and then building an (N,) array from it:
fill_between(np.array(myList_x), np.array(myList_y))
It also works if you copy your data into DataFrame columns:
fill_between(df['col_x'], df['col_y'])
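The shape problem can be reproduced with NumPy alone. The sketch below assumes, without being able to confirm it from the question, that y_pred came back as an (N, 1) column while y_std was a flat (N,) array; the arrays are hypothetical stand-ins, not output from the asker's model.

```python
import numpy as np

# Hypothetical stand-ins for the prediction output: a column-vector mean
# and a flat standard deviation.
n = 5
y_pred = np.linspace(0.0, 1.0, n)[:, np.newaxis]  # shape (n, 1)
y_std = np.full(n, 0.1)                           # shape (n,)

# Broadcasting (n, 1) against (n,) silently produces an (n, n) matrix,
# which is not a valid 1-D boundary for fill_between.
lower_2d = y_pred - y_std
print(lower_2d.shape)

# Flattening the column vector first keeps everything one-dimensional.
y_flat = y_pred.ravel()                           # shape (n,)
lower = y_flat - y_std
upper = y_flat + y_std
print(lower.shape, upper.shape)
```

If this is indeed the cause, passing the flattened lower and upper arrays to plt.fill_between should avoid the error.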
Fitting a stretch exponential using python scipy.curve_fit()
I am trying to fit some data using a stretched exponential function of the form c*exp(-(x/tau)^beta). The value I am interested in is tau. The data I am trying to fit passes through zero and is also negative sometimes (for example, the value goes from -1 to 1).
def st_exp(x,c,tau,beta):
    return c*(np.exp(-(x/tau)**beta))
When I try to fit, I get a runtime warning:
RuntimeWarning: invalid value encountered in power
return c*(np.exp(-(x/tau)**beta))
I want to fit the data as is; however, this shows a runtime warning, and the fit either does not converge or fits only until zero is encountered. For fitting I used:
def get_index(x0,x):
    return np.argmin(abs(x-x0))
init_vals = [max(y)-min(y),-1*x[get_index(np.mean(y),y)]/np.log(0.5),0.5]
best_vals, covar = curve_fit(st_exp, x,y, p0=init_vals)
The data I am trying to fit:
x = np.arange(0,400000,1000)
y = np.array([-45819., -37322., -34006., -28906., -26565., -13311., -10992., -11233., -3313., -2421., -1687., 9665., 11951., 12796., 22440., 20331., 24732., 26594., 25464., 30668., 37412., 33261., 34365., 39359., 39105., 40260., 48946., 48351., 49872., 44422., 49969., 54536., 54248., 57340., 61403., 61843., 63386., 61182., 64080., 64052., 68232., 68167., 76288., 71786., 74485., 76070., 76540., 70167., 82014., 79459., 80499., 80073., 80697., 88209., 80099., 83415., 93613., 86038., 89498., 86073., 86999., 94242., 91823., 91162., 93277., 94834., 89088., 92613., 97663., 95948., 92840., 105920., 98487., 100951., 88721., 95078., 99831., 94738., 102520., 98576., 99038., 103921., 102951., 103186., 100755., 103631., 107259., 107376., 105404., 109739., 110135., 107829., 103196., 110798., 104497., 107074., 111857., 110816., 111853., 111890., 107932., 111878., 109776., 112154., 112769., 113155., 114862., 109560., 111112., 111516., 110314., 115911., 115820., 118418., 113124., 114579., 118102., 115259., 112640., 121617., 118125., 114923., 115210., 121919., 115841., 111980., 117730., 112565., 120893., 113758., 121129., 110559., 118674., 122867., 118574., 
118022., 118656., 117656., 116813., 118591., 119722., 110845., 126545., 119452., 121438., 118271., 125652., 121025., 119663., 119917., 121405., 124934., 117835., 121760., 123870., 126825., 120996., 116165., 119473., 120996., 120530., 122197., 119907., 123786., 116293., 118625., 123068., 123951., 123443., 120781., 126291., 119316., 119401., 125871., 120863., 117013., 125037., 124775., 117822., 123755., 121240., 122696., 117997., 124865., 123457., 124229., 117705., 126550., 121866., 123070., 123585., 126033., 126355., 124475., 121325., 125392., 125882., 126755., 128013., 123610., 123611., 123853., 124819., 125464., 123897., 128276., 120328., 125569., 128821., 128039., 126223., 123052., 121924., 121932., 122968., 129473., 124053., 122576., 124538., 127567., 129659., 126090., 130546., 131749., 118672., 130372., 125783., 126413., 126283., 125898., 124901., 130037., 123192., 122977., 125806., 125544., 131714., 130757., 128980., 130233., 129140., 127372., 118302., 126342., 126046., 127595., 129635., 121161., 123841., 124058., 124156., 131894., 124745., 129556., 127832., 126236., 130072., 121877., 121383., 136089., 123984., 127407., 128703., 127597., 126220., 124028., 122716., 127398., 129724., 128971., 124488., 127229., 130337., 132997., 126681., 127312., 123270., 123822., 127458., 127653., 122740., 132875., 124466., 132315., 129569., 128041., 127525., 124972., 123646., 122957., 130239., 126285., 127734., 131409., 128138., 133744., 131438., 130377., 130763., 127868., 129223., 130644., 131814., 132781., 127419., 124382., 127924., 129190., 127443., 132475., 130202., 128066., 130360., 130282., 125531., 130259., 123453., 126989., 129615., 132047., 129424., 126729., 127324., 128756., 121690., 132176., 126250., 127830., 128985., 133258., 125664., 123530., 130123., 126947., 123108., 125562., 126388., 131747., 128793., 121865., 121705., 127039., 132701., 128835., 133300., 125677., 134063., 136207., 128572., 127731., 130304., 129674., 126436., 132357., 128154., 129400., 126893., 
132012., 129471., 124752., 127925., 123735., 125801., 126371., 128554., 126691., 126970., 129754., 130953., 125113., 133345., 127633., 128070., 127592., 125389., 127235., 125677., 131191., 130972., 124687., 132342., 130269., 133340., 127084., 132171., 131521., 133572., 124134., 132673., 131440., 122008., 129178., 133775., 126584., 131278., 133229., 128349., 139349., 127294., 133538.])
Your initial values are likely preventing you from finding a good fit. Try this:
best_vals, covar = curve_fit(st_exp, x, y, p0=[10000.0, 10000.0, 1.0])
print(best_vals)
# result: array([ 1.36046194e+05, 2.83889616e+04, -1.21296047e+00])
fig, ax = plt.subplots(1, 1)
ax.plot(x, y, label="data")
ax.plot(x, st_exp(x,*best_vals), label="fit")
ax.legend(loc="best")
The error I was making was that I was not providing an offset for the fitting function. Either correct the offset before fitting, or modify the fitting function as:
def st_exp(x,c,tau,beta,y_offset):
    return c*(np.exp(-(x/tau)**beta))+y_offset
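A minimal sketch of fitting the offset version, on hypothetical synthetic data in rescaled units rather than the measurements from the question. The function name st_exp_offset, the starting guess and the bounds are assumptions made for illustration; the bounds keep tau and beta positive so that the power never receives a negative base.

```python
import numpy as np
from scipy.optimize import curve_fit


def st_exp_offset(x, c, tau, beta, y_offset):
    """Stretched exponential plus a constant offset."""
    return c * np.exp(-(x / tau) ** beta) + y_offset


# Hypothetical synthetic data (not the asker's measurements), in rescaled units.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 400.0, 400)
true_params = (-170.0, 50.0, 0.9, 128.0)
y = st_exp_offset(x, *true_params) + rng.normal(scale=2.0, size=x.size)

# Rough starting guess read off the data: amplitude, time scale, exponent, plateau.
p0 = [y[0] - y[-1], 0.25 * x[-1], 1.0, y[-1]]

# Keep tau and beta positive so (x / tau) ** beta never sees a negative base.
lower = [-np.inf, 1e-9, 1e-9, -np.inf]
upper = [np.inf, np.inf, np.inf, np.inf]
popt, pcov = curve_fit(st_exp_offset, x, y, p0=p0, bounds=(lower, upper))

rss_init = np.sum((st_exp_offset(x, *p0) - y) ** 2)
rss_fit = np.sum((st_exp_offset(x, *popt) - y) ** 2)
print("fitted parameters:", popt)
print("residual sum of squares: start", rss_init, "fit", rss_fit)
```

Whether the same starting guess works on the real data is untested. With x running up to 400000 there, a starting tau on that scale would be the natural equivalent.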
Contour of scattered data via interpolation or QHull in python
I'm trying to plot a contour at z = .95 from my data; however, I couldn't manage to interpolate the way I want. I tried to use griddata as follows:
from scipy.interpolate import griddata
N = 1000
xi = np.linspace(min(x), max(y), N)
yi = np.linspace(min(x), max(y), N)
c = griddata((np.array(x),np.array(y)), np.array(z), (xi[None,:], yi[:,None]), method='linear')
fig, sys = plt.subplots()
sys.contour(xi, yi, c, levels = [.95], colors=('darkred',),linestyles=('solid',),linewidths=(2,))
As can be seen in the graph below, I also tried to use qhull by cutting the z-axis at 0.95:
a = genfromtxt('data.txt')[:,[0,1]] #data where z <= .95
hull = ConvexHull(a)
sys.plot(a[hull.vertices,0], a[hull.vertices,1], color='red', linestyle='--', lw=2.5, zorder=90, label=r"QHUL")
Below I tried to illustrate both methods and also how the result should essentially look (it's different data, just for illustration purposes). However, due to the dip in my data around (1.7, 420), I am getting zigzags in the interpolation for that region, which I couldn't fix even by treating pieces of the data separately, and the QHULL method misses the accuracy of the data, so I cannot use it. Is there any way to interpolate the data to get a curve similar to the one shown below? Thanks!
My data is as follows (x,y,z); 1.950e+00 1.500e+02 9.557e-01 1.950e+00 4.800e+02 9.302e-01 1.950e+00 3.100e+02 9.467e-01 1.900e+00 5.500e+02 9.493e-01 1.700e+00 6.000e+02 9.359e-01 1.700e+00 5.500e+02 9.447e-01 8.430e-01 7.800e+02 9.906e-01 1.300e+00 9.000e+02 9.349e-01 1.655e+00 8.132e+02 9.406e-01 1.138e+00 8.453e+02 9.542e-01 1.728e+00 4.895e+02 9.335e-01 1.953e+00 2.254e+02 9.507e-01 1.932e+00 4.706e+01 9.552e-01 1.661e+00 8.081e+02 9.287e-01 1.956e+00 9.931e+00 9.320e-01 1.947e+00 4.457e+01 9.396e-01 1.949e+00 9.769e+01 9.575e-01 1.912e+00 4.441e+02 9.616e-01 1.956e+00 3.739e+01 9.344e-01 1.953e+00 1.042e+02 9.277e-01 1.957e+00 0.000e+00 9.329e-01 1.938e+00 3.455e+01 9.411e-01 1.946e+00 6.045e+01 9.381e-01 1.951e+00 8.227e+01 9.571e-01 1.962e+00 2.500e+01 9.478e-01 1.951e+00 2.778e+01 9.559e-01 1.949e+00 6.736e+01 9.630e-01 1.949e+00 1.097e+02 9.331e-01 1.708e+00 4.998e+02 9.526e-01 1.951e+00 1.250e+02 9.516e-01 1.730e+00 4.642e+02 9.332e-01 1.912e+00 4.780e+02 9.558e-01 1.927e+00 5.145e+02 9.401e-01 1.712e+00 5.203e+02 9.519e-01 1.722e+00 5.470e+02 9.396e-01 1.962e+00 1.117e+02 9.519e-01 1.962e+00 2.195e+01 9.269e-01 1.962e+00 3.366e+01 9.514e-01 1.959e+00 9.610e+01 9.270e-01 1.959e+00 4.537e+01 9.281e-01 1.959e+00 6.488e+01 9.277e-01 1.959e+00 7.659e+01 9.346e-01 1.953e+00 4.537e+01 9.615e-01 1.950e+00 1.820e+02 9.552e-01 1.950e+00 1.702e+02 9.547e-01 1.950e+00 1.415e+01 9.389e-01 1.947e+00 2.639e+02 9.517e-01 1.947e+00 2.015e+02 9.533e-01 1.941e+00 3.029e+02 9.533e-01 1.935e+00 2.873e+02 9.573e-01 1.959e+00 1.415e+01 9.314e-01 1.959e+00 2.439e+00 9.335e-01 1.899e+00 5.137e+02 9.549e-01 1.896e+00 5.371e+02 9.563e-01 1.888e+00 5.839e+02 9.531e-01 1.870e+00 5.917e+02 9.553e-01 1.722e+00 4.746e+02 9.468e-01 1.716e+00 4.278e+02 9.604e-01 1.704e+00 5.644e+02 9.482e-01 1.683e+00 5.800e+02 9.574e-01 1.609e+00 6.854e+02 9.477e-01 1.263e+00 8.766e+02 9.417e-01 1.198e+00 8.532e+02 9.524e-01 1.172e+00 8.532e+02 9.394e-01 1.927e+00 3.807e+02 9.540e-01 1.582e+00 
8.424e+02 9.569e-01 1.000e+00 8.415e+02 9.526e-01 8.817e-01 7.985e+02 9.348e-01 1.954e+00 3.139e+00 9.364e-01 1.932e+00 3.583e+02 9.585e-01 1.910e+00 5.018e+02 9.500e-01 1.891e+00 5.628e+02 9.505e-01 1.858e+00 5.987e+02 9.470e-01 1.752e+00 4.874e+02 9.974e-01 1.711e+00 4.803e+02 9.477e-01 1.698e+00 5.341e+02 9.545e-01 1.687e+00 5.628e+02 9.570e-01 1.638e+00 6.596e+02 9.525e-01 1.624e+00 7.996e+02 9.559e-01 1.624e+00 8.211e+02 9.523e-01 1.619e+00 6.632e+02 9.550e-01 1.611e+00 8.283e+02 9.510e-01 1.605e+00 8.354e+02 9.537e-01 1.597e+00 6.776e+02 9.566e-01 1.592e+00 8.426e+02 9.445e-01 1.956e+00 7.908e+01 9.259e-01
It turns out that the span of the data and the grid spacing used for the interpolation are important:
N = 40
x = linspace(0.5,2.4,N)
y = linspace(0.,1100.,N)
mean_CL = griddata((Mgo,Mn1), mean_CLs, (x[None,:], y[:,None]), method='linear')
sc.contour(x,y,mean_CL,levels = [.95],colors=('darkred',),linestyles=('solid',),linewidths=(2,))
did the job. However, instead of having the data clustered in one region, one might need it to span the entire x-y plane; the points don't need to be too close together. I gathered data on a 25x0.025 grid and it worked perfectly.
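The grid construction can be checked on a surface whose values are known. The sketch below uses hypothetical random samples of a plane on axes scaled like those in the question, and builds each grid axis from the range of its own coordinate. The rescale=True option of griddata is an addition not used in the answer; it normalises the axes before triangulating, which may help when x and y differ by orders of magnitude.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered samples of a known plane, on axes with very
# different scales (x around 1, y in the hundreds).
rng = np.random.default_rng(1)
x = rng.uniform(0.5, 2.4, 300)
y = rng.uniform(0.0, 1100.0, 300)
z = 0.2 * x + 5e-4 * y  # stand-in for the measured z values

# Each grid axis spans the range of its own coordinate.
N = 40
xi = np.linspace(x.min(), x.max(), N)
yi = np.linspace(y.min(), y.max(), N)

# rescale=True normalises both axes before the triangulation is built.
zi = griddata((x, y), z, (xi[None, :], yi[:, None]),
              method='linear', rescale=True)

# Grid points outside the convex hull of the samples come back as NaN.
inside = np.isfinite(zi)
print(zi.shape, inside.mean())
```

The resulting zi has one row per yi value and one column per xi value, which is the layout contour(xi, yi, zi, levels=[.95]) expects.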
How to uniformly resample a non-uniform signal using SciPy?
I have an (x, y) signal with a non-uniform sample rate in x. (The sample rate is roughly proportional to 1/x.) I attempted to uniformly re-sample it using scipy.signal's resample function. From what I understand from the documentation, I could pass it the following arguments:
scipy.signal.resample(array_of_y_values, number_of_sample_points, array_of_x_values)
and it would return the array of [[resampled_y_values],[new_sample_points]]. I'd expect it to return uniformly sampled data with roughly the same form as the original, and with the same minimal and maximal x values. But it doesn't:
# nu_data = [[x1, x2, ..., xn], [y1, y2, ..., yn]]
# with x values in ascending order
length = len(nu_data[0])
resampled = sg.resample(nu_data[1], length, nu_data[0])
uniform_data = np.array([resampled[1], resampled[0]])
plt.plot(nu_data[0], nu_data[1], uniform_data[0], uniform_data[1])
plt.show()
blue: nu_data, orange: uniform_data
The result doesn't look unaltered, and the x scale has been resized too. If I try to fix the range by constructing the desired uniform x values myself and using them instead, the distortion remains:
length = len(nu_data[0])
resampled = sg.resample(nu_data[1], length, nu_data[0])
delta = (nu_data[0,-1] - nu_data[0,0]) / length
new_samplepoints = np.arange(nu_data[0,0], nu_data[0,-1], delta)
uniform_data = np.array([new_samplepoints, resampled[0]])
plt.plot(nu_data[0], nu_data[1], uniform_data[0], uniform_data[1])
plt.show()
What is the proper way to re-sample my data uniformly, if not this?
Please look at this rough solution:
import matplotlib.pyplot as plt
from scipy import interpolate
import numpy as np
x = np.array([0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20])
y = np.exp(-x/3.0)
flinear = interpolate.interp1d(x, y)
fcubic = interpolate.interp1d(x, y, kind='cubic')
xnew = np.arange(0.001, 20, 1)
ylinear = flinear(xnew)
ycubic = fcubic(xnew)
plt.plot(x, y, 'X', xnew, ylinear, 'x', xnew, ycubic, 'o')
plt.show()
This is a slightly updated example from the SciPy documentation. If you execute it, you should see something like this: the blue crosses are the initial function, i.e. your signal with a non-uniform sampling distribution. There are two results: the orange x markers represent linear interpolation, and the green dots cubic interpolation. The question is which option you prefer. Personally, I don't like either of them, which is why I usually take 4 points and interpolate between them, then the next points, and so on, to get cubic interpolation without those strange bumps. That is much more work, and I can't see a way of doing it with SciPy, so it will be slow. That is why I asked about the size of the data.
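For context, scipy.signal.resample is a Fourier-based method that assumes its input is already equally spaced, which is why it distorts a non-uniform signal. A minimal sketch of the interpolation route follows, on a hypothetical signal whose samples thin out as x grows; the uniform grid is built to cover exactly the original x range with the same number of points.

```python
import numpy as np
from scipy import interpolate

# Hypothetical non-uniform signal: samples get sparser as x grows.
x = np.geomspace(0.001, 20.0, 60)
y = np.exp(-x / 3.0)

# Uniform grid covering exactly the same x range, with the same number of points.
x_uniform = np.linspace(x[0], x[-1], x.size)

# Linear interpolation never overshoots the neighbouring samples.
f_linear = interpolate.interp1d(x, y, kind='linear')
y_uniform = f_linear(x_uniform)

print(np.ptp(np.diff(x_uniform)))   # spread of the spacing: zero up to rounding error
print(y_uniform[0], y_uniform[-1])  # the end points match the original signal
```

Swapping kind='linear' for kind='cubic' gives a smoother result, at the risk of the overshoot the answer complains about.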
Python boxplot showing means and confidence intervals
How can I create a boxplot like the one below, in Python? I want to depict means and confidence bounds only (rather than proportions of IQRs, as in matplotlib boxplot). I don't have any version constraints, and if your answer has some package dependency that's OK too. Thanks!
Use errorbar instead. Here is a minimal example:
import matplotlib.pyplot as plt
x = [2, 4, 3]
y = [1, 3, 5]
errors = [0.5, 0.25, 0.75]
plt.figure()
plt.errorbar(x, y, xerr=errors, fmt = 'o', color = 'k')
plt.yticks((0, 1, 3, 5, 6), ('', 'x3', 'x2', 'x1',''))
Note that boxplot is not the right approach; the conf_intervals parameter only controls the placement of the notches on the boxes (and we don't want boxes anyway, let alone notched boxes). There is no way to customize the whiskers except as a function of IQR.
Thanks to America, I propose a way to automate this kind of graph a little. Below is example code that generates 20 arrays from a normal distribution with mean=0.25 and std=0.1. I used the formula W = t * s / sqrt(n) to calculate the margin of error of the confidence interval, with t the constant from the t distribution (see scipy.stats.t), s the standard deviation and n the number of values in an array.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
list_samples = list() # making a list of arrays
for i in range(20):
    list_samples.append(np.random.normal(loc=0.25, scale=0.1, size=20))
def W_array(array, conf=0.95): # function that returns W based on the array provided
    t = stats.t(df = len(array) - 1).ppf((1 + conf) /2)
    W = t * np.std(array, ddof=1) / np.sqrt(len(array))
    return W # the error
W_list = list()
mean_list = list()
for i in range(len(list_samples)):
    W_list.append(W_array(list_samples[i])) # makes a list of W for each array
    mean_list.append(np.mean(list_samples[i])) # same for the means to plot
plt.errorbar(x=mean_list, y=range(len(list_samples)), xerr=W_list, fmt='o', color='k')
plt.axvline(.25, ls='--') # this is only to demonstrate that 95%
# of the 95% CI contain the actual mean
plt.yticks([])
plt.show()
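The margin-of-error formula W = t * s / sqrt(n) can be cross-checked against the interval method of scipy.stats.t. The sketch below does this for a single hypothetical sample drawn with the same parameters as above; the seed and the variable names are arbitrary.

```python
import numpy as np
from scipy import stats

# Hypothetical sample, drawn with the same parameters as the arrays above.
rng = np.random.default_rng(42)
sample = rng.normal(loc=0.25, scale=0.1, size=20)

conf = 0.95
n = sample.size
mean = sample.mean()
s = sample.std(ddof=1)  # sample standard deviation

# Margin of error W = t * s / sqrt(n), with t the two-sided critical value.
t_crit = stats.t.ppf((1 + conf) / 2, df=n - 1)
W = t_crit * s / np.sqrt(n)

# Cross-check against SciPy's built-in interval for the t distribution.
low, high = stats.t.interval(conf, n - 1, loc=mean, scale=s / np.sqrt(n))
print(mean - W, mean + W)
print(low, high)
```

Both printed pairs should agree, since the interval is symmetric about the mean with half-width W.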