matplotlib discrete data versus continuous function - python

I need to plot the ratio between a function introduced through a discrete data set, imported from a text file, for example:
x, y, z = np.loadtxt('example.txt', usecols=(0, 1, 2), unpack=True)
and a continuous function defined using the np.arange command, for example:
w=np.arange(0,0.5,0.01)
exfunct=w**4.
Clearly, attempts such as
plt.plot(w, 1-(x/w), 'k--', color='blue', lw=2)
as well as
plt.plot(y, 1-(x/w), 'k--', color='blue', lw=2)
do not work. Despite having looked for the answer on this site (and elsewhere), I cannot find a solution to my problem. Should I fit the discrete data set to obtain a continuous function, and then define it on the same interval as exfunct? Any suggestion? Thank you a lot.

In the end the solution was easier than I thought. I simply had to define the continuous variable through the discrete data, for example:
w = x/y
then define the function as before:
exfunct = w**4
and finally plot the "continuous-discrete" function:
plt.plot(x, x/exfunct, 'k-', color='red', lw=2)
I hope this can be useful.
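Another option, if the discrete data and the analytic function really need to be compared on the same grid, is to interpolate the data onto w with np.interp and divide. This is only a sketch, assuming the same example.txt layout as above and an x column sorted in increasing order:
import numpy as np
import matplotlib.pyplot as plt

# discrete data: first two columns of the text file, as in the question
x, y = np.loadtxt('example.txt', usecols=(0, 1), unpack=True)

# continuous grid and analytic function (starting at 0.01 to avoid dividing by zero)
w = np.arange(0.01, 0.5, 0.01)
exfunct = w**4

# evaluate the discrete data on the same grid by linear interpolation
y_on_w = np.interp(w, x, y)

plt.plot(w, y_on_w / exfunct, 'r-', lw=2)
plt.show()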

Related

How to fit a Gaussian model in the Scherrer equation in Python?

I'm new to Python and I'm trying to fit a Gaussian function for the Scherrer equation using Python, and the problem is that I don't know how to do it. The same goes for the Lorentzian model. Can someone explain to me how to do it? Thanks.
More explanation: I want the x and y values to be read from a text file and then used in the fitting process.
If you want a more specific solution you should probably provide an example.
In general, scipy.optimize.curve_fit is a great solution for most fitting problems.
You can find a tutorial about it in the SciPy Cookbook; in particular, there is an example of how to fit Gaussian-shaped data: https://scipy-cookbook.readthedocs.io/items/FittingData.html#Fitting-gaussian-shaped-data
You might want to take a look here:
Gaussian fit for Python
I have no idea how you get your data, but if you have just the function, try generating values from it so that you have something to which you can actually fit the Gaussian curve.
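As a concrete illustration of the curve_fit route, a minimal sketch might look like this; the file name data.txt and the column layout are placeholders, and the fitted width is converted to a FWHM, which is what the Scherrer equation uses:
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, mu, sigma):
    return amp * np.exp(-(x - mu)**2 / (2 * sigma**2))

# x and y read from a text file, as asked (file name and columns are placeholders)
x, y = np.loadtxt('data.txt', usecols=(0, 1), unpack=True)

# initial guesses: peak height, peak position, rough width
p0 = [y.max(), x[np.argmax(y)], (x.max() - x.min()) / 10]
popt, pcov = curve_fit(gaussian, x, y, p0=p0)
amp, mu, sigma = popt

fwhm = 2 * np.sqrt(2 * np.log(2)) * abs(sigma)  # full width at half maximum
print(amp, mu, sigma, fwhm)
A Lorentzian fit works the same way; only the model function changes.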

Scipy randint vs numpy randint

I have a simple yet broad question regarding two methods:
scipy.stats.randint
and
numpy.random.randint
After reading the API for both methods I'm a bit confused as to when it is best to use each method; therefore, I was wondering if someone could outline the differences between the two and possibly offer some examples of when one method would be preferable to use over the other. Thanks!
Edit: Links to each method's documentation -> numpy.random.randint, scipy.stats.randint
The major difference seems to be that scipy.stats.randint is a full distribution object: besides drawing samples, it exposes methods for the lower and upper tail probabilities, the percent point function, moments, and so on (see the methods section of the scipy.stats.randint documentation). It is therefore much more useful if you want to treat the random integers as coming from a distribution you can query.
If you really just want to draw a random integer that falls within a certain range, with no further requirements, then numpy.random.randint is more straightforward. Both draw from a discrete uniform distribution; numpy's version simply offers nothing beyond sampling.
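A short sketch of the difference (the numbers are arbitrary):
import numpy as np
from scipy import stats

# numpy: just draw samples from a discrete uniform distribution on [low, high)
samples = np.random.randint(1, 7, size=10)   # ten rolls of a die

# scipy: a full distribution object for the same discrete uniform distribution
die = stats.randint(low=1, high=7)
print(die.rvs(size=10))   # draw samples, like numpy.random.randint
print(die.pmf(3))         # probability of rolling a 3 (1/6)
print(die.cdf(4))         # lower tail probability P(X <= 4)
print(die.ppf(0.5))       # inverse of the cdf (median)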

Continuous Interpolation in MATLAB?

I have a set of data that I would like to get an interpolating function for. MATLAB's interpolating functions seem to only return values at a finer set of discrete points. However, for my purposes, I need to be able to look up the function value for any input. What I'm looking for is something like SciPy's "interp1d."
That appears to be what ppval is for. It looks like many of the 1D interpolation functions have a pp variant that plugs into this.
Disclaimer: I haven't actually tried this.
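For comparison, the SciPy behaviour the question mentions looks like this: interp1d returns a callable that can be evaluated at any point inside the data range (the data values below are made up), which is the role the pp objects evaluated with ppval play in MATLAB.
import numpy as np
from scipy.interpolate import interp1d

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = x**2

f = interp1d(x, y, kind='cubic')   # a callable interpolating function
print(f(2.37))                     # evaluate anywhere inside [0, 4]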

Adjusted Boxplot in Python

For my thesis, I am trying to identify outliers in my data set. The data set consists of 160,000 measurements of one variable from a real process environment. In this environment, however, there can be measurements that are not actual data from the process itself but simply junk data. I would like to filter them out with a little help from the literature instead of relying only on "expert opinion".
I have read about the IQR method for spotting possible outliers when dealing with a symmetric distribution like the normal distribution. However, my data set is right-skewed, and by distribution fitting, the inverse gamma and lognormal were the best fits.
So, during my search for methods for non-symmetric distributions, I found this topic on crossvalidated where user603's answer is interesting in particular: Is there a boxplot variant for Poisson distributed data?
In user603's answer, he states that an adjusted boxplot helps to identify possible outliers in your dataset, and that R and MATLAB have functions for this (there is an R implementation, robustbase::adjbox(), as well as a MATLAB one in a library called LIBRA).
I was wondering if there is such a function in Python, or whether there is a way to calculate the medcouple (see the paper in user603's answer) with Python. I really would like to see what comes out of the adjusted boxplot for my data.
In the module statsmodels.stats.stattools there is a function medcouple(), which is the measure of the skewness used in the Adjusted Boxplot.
With this variable you can calculate the interval beyond which outliers are defined.
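A minimal sketch of putting that together, assuming the data are in a 1-D NumPy array (the file name below is a placeholder) and using the adjusted-boxplot fences of Hubert and Vandervieren (2008):
import numpy as np
from statsmodels.stats.stattools import medcouple

data = np.loadtxt('process_data.txt')   # placeholder for the 160,000 measurements

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
mc = medcouple(data)                    # robust measure of skewness

# adjusted-boxplot fences: for right-skewed data the upper fence is stretched
if mc >= 0:
    lower = q1 - 1.5 * np.exp(-4.0 * mc) * iqr
    upper = q3 + 1.5 * np.exp(3.0 * mc) * iqr
else:
    lower = q1 - 1.5 * np.exp(-3.0 * mc) * iqr
    upper = q3 + 1.5 * np.exp(4.0 * mc) * iqr

outliers = data[(data < lower) | (data > upper)]
print(len(outliers), 'possible outliers')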

Scipy.optimize.minimize only iterates some variables.

I have written python (2.7.3) code wherein I aim to create a weighted sum of 16 data sets, and compare the result to some expected value. My problem is to find the weighting coefficients which will produce the best fit to the model. To do this, I have been experimenting with scipy's optimize.minimize routines, but have had mixed results.
Each of my individual data sets is stored as a 15x15 ndarray, so their weighted sum is also a 15x15 array. I define my own 'model' of what the sum should look like (also a 15x15 array), and quantify the goodness of fit between my result and the model using a basic least squares calculation.
R=np.sum(np.abs(model/np.max(model)-myresult)**2)
'myresult' is produced as a function of some set of parameters 'wts'. I want to find the set of parameters 'wts' which will minimise R.
To do so, I have been trying this:
res = minimize(get_best_weightings,wts,bounds=bnds,method='SLSQP',options={'disp':True,'eps':100})
Where my objective function is:
def get_best_weightings(wts):
    wts_tr = wts[0:16]
    wts_ti = wts[16:32]
    for i, j in enumerate(portlist):
        originalwtsr[j] = wts_tr[i]
        originalwtsi[j] = wts_ti[i]
    realwts = originalwtsr
    imagwts = originalwtsi
    myresult = make_weighted_beam(realwts, imagwts, 1)
    R = np.sum((np.abs(modelbeam/np.max(modelbeam) - myresult))**2)
    return R
The input (wts) is an ndarray of shape (32,), and the output, R, is just some scalar, which should get smaller as my fit gets better. By my understanding, this is exactly the sort of problem ("Minimization of scalar function of one or more variables.") which scipy.optimize.minimize is designed to optimize (http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.optimize.minimize.html ).
However, when I run the code, although the optimization routine seems to iterate over different values of all the elements of wts, only a few of them seem to 'stick'. That is, all but four of the values are returned with the same values as my initial guess. To illustrate, I plot the values of my initial guess for wts (in blue) and the optimized values (in red). You can see that for most elements the two lines overlap.
Image:
http://imgur.com/p1hQuz7
Changing just these few parameters is not enough to get a good answer, and I can't understand why the other parameters aren't also being optimised. I suspect that maybe I'm not understanding the nature of my minimization problem, so I'm hoping someone here can point out where I'm going wrong.
I have experimented with a variety of minimize's built-in methods (I am by no means committed to SLSQP, or certain that it's the most appropriate choice), and with a variety of step sizes eps. The bounds I am using for all my parameters are (-4000, 4000). I only have SciPy 0.11, so I haven't tried the basinhopping routine to look for a global minimum (that needs 0.12). I have looked at scipy.optimize.brute, but haven't tried implementing it yet; I thought I'd check whether anyone can steer me in a better direction first.
Any advice appreciated! Sorry for the wall of text and the possibly (probably?) idiotic question. I can post more of my code if necessary, but it's pretty long and unpolished.
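For reference, a stripped-down, self-contained version of the pattern I am describing (with synthetic data and placeholder names in place of my real code, and the default step size) would look like this:
import numpy as np
from scipy.optimize import minimize

rng = np.random.RandomState(0)
datasets = rng.rand(16, 15, 15)   # 16 synthetic 15x15 data sets
model = rng.rand(15, 15)          # synthetic target

def objective(wts):
    # weighted sum of the data sets, compared to the model by least squares
    weighted = np.tensordot(wts, datasets, axes=1)
    return np.sum(np.abs(model / np.max(model) - weighted)**2)

wts0 = np.ones(16)
bnds = [(-4000, 4000)] * 16
res = minimize(objective, wts0, bounds=bnds, method='SLSQP', options={'disp': True})
print(res.x)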
