I'd like to calculate the time complexity of some algorithms that use Python's SciPy library. I found a table for general Python functions in https://wiki.python.org/moin/TimeComplexity that I plan to use. Is there a similar table for the SciPy library?
I am using one of the new MacBooks with the M1 chip. I cannot use SciPy natively unless I go through Rosetta, but for other reasons I cannot do that now.
The ONLY thing I need from SciPy is scipy.linalg.expm. Is there an alternative library where there is an implementation of expm that is equally (or almost equally) fast?
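Two other libraries that, to my knowledge, ship a matrix exponential are PyTorch (torch.matrix_exp) and TensorFlow (tf.linalg.expm), so they may be worth trying on the M1. Failing that, here is a minimal numpy-only sketch via eigendecomposition; it assumes the matrix is diagonalizable and is less robust than SciPy's Padé-based scaling-and-squaring, so treat it as an illustration rather than a drop-in replacement:

```python
# Minimal numpy-only sketch of a matrix exponential via eigendecomposition.
# Assumes A is diagonalizable; accuracy degrades if the eigenvector matrix
# is ill-conditioned, where a Pade/scaling-and-squaring method is preferred.
import numpy as np

def expm_eig(a):
    """exp(A) = V diag(exp(w)) V^{-1} for diagonalizable A."""
    w, v = np.linalg.eig(a)
    return (v * np.exp(w)) @ np.linalg.inv(v)

a = np.array([[0.0, 1.0], [-1.0, 0.0]])   # rotation generator
print(expm_eig(a).real)                    # ~[[cos 1, sin 1], [-sin 1, cos 1]]
```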
I am working on a time-series project with a lot of series, and I want to handle it with an automatic ARIMA/SARIMA fitting function. I did this in R with the auto.arima function, which is very fast; now I'm in Python, and the auto_arima function (from the pmdarima package) I'm using is really slow.
I suggest you try a more canonical Python package for ARIMA, such as statsmodels:
http://www.statsmodels.org/stable/generated/statsmodels.tsa.arima_model.ARIMA.html
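A minimal sketch of fitting one series with statsmodels follows. It uses the newer statsmodels.tsa.arima.model.ARIMA interface (the arima_model module linked above has been deprecated in recent releases), and the (1, 1, 1) order is only an illustrative choice; unlike auto.arima / auto_arima you have to pick or search over orders yourself, e.g. by AIC.

```python
# Minimal sketch: fit one ARIMA model per series with statsmodels.
# The order (1, 1, 1) is an arbitrary illustration, not an auto-selected fit.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

series = pd.Series(np.random.default_rng(0).normal(size=200).cumsum())

model = ARIMA(series, order=(1, 1, 1))
result = model.fit()
print(result.aic)                   # compare across candidate orders
print(result.forecast(steps=10))    # 10-step-ahead forecast
```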
I am writing a python script that needs to make a distribution fit against some generated data.
I found that this is possible using SciPy, or other packages having SciPy as a dependency; however, due to administrative constraints, I am unable to install SciPy's dependencies (such as BLAS) on the machine where the script will run.
Is there a way to perform distribution fitting in Python without using SciPy or packages depending on it?
EDIT: as asked in a comment, what I want to do is perform an Anderson-Darling test for normality.
The alternatives I found so far (but had to disregard):
statsmodels: has SciPy as a dependency
R and MATLAB Python APIs: need setup of external software; same problem for me as SciPy
Fitting the normal distribution only requires calculating mean and standard deviation.
The Anderson-Darling test only requires numpy or alternatively could be rewritten using list comprehension. The critical values for the AD-test are tabulated or based on a simple approximation formula. It does not use any difficult parts of scipy like optimize or special.
So, I think it should not be too difficult to translate either the scipy.stats or the statsmodels version to pure Python, or with only numpy as a dependency.
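As a rough illustration, here is a numpy-only sketch of the A² statistic for normality with the usual small-sample correction. The correction factor and the ~0.752 critical value at the 5% level are the values commonly tabulated for the "mean and variance estimated" case (D'Agostino & Stephens); verify them against your own reference before relying on the result.

```python
# Numpy-only sketch of the Anderson-Darling test for normality.
# A^2 = -n - (1/n) * sum_i (2i-1) * [ln F(y_i) + ln(1 - F(y_{n+1-i}))]
# with F the normal CDF evaluated at the standardized, sorted sample.
import math
import numpy as np

def anderson_darling_normal(x):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)
    # Standard normal CDF via math.erf -- no scipy.special needed
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    i = np.arange(1, n + 1)
    a2 = -n - np.sum((2 * i - 1) * (np.log(cdf) + np.log(1.0 - cdf[::-1]))) / n
    a2_star = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)  # small-sample correction
    return a2, a2_star

data = np.random.default_rng(0).normal(size=200)
a2, a2_star = anderson_darling_normal(data)
print(a2_star)   # compare against tabulated critical values, e.g. ~0.752 at 5%
```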
Both SciPy and Numpy have built in functions for singular value decomposition (SVD). The commands are basically scipy.linalg.svd and numpy.linalg.svd. What is the difference between these two? Is any of them better than the other one?
The FAQ page says the scipy.linalg submodule provides a more complete wrapper for the Fortran LAPACK library, whereas numpy.linalg tries to remain buildable independently of LAPACK.
I did some benchmarks for the different implementations of the svd function and found scipy.linalg.svd is faster than the numpy counterpart:
However, jax wrapped numpy, aka jax.numpy.linalg.svd is even faster:
The full notebook for the benchmarks is available here.
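For reference, a minimal sketch of how such a benchmark can be reproduced locally; matrix size, dtype, and the BLAS/LAPACK build each library is linked against all affect which implementation wins, so results on your machine may differ:

```python
# Minimal benchmark sketch: time numpy.linalg.svd vs scipy.linalg.svd
# on the same random matrix.
import timeit
import numpy as np
import scipy.linalg

a = np.random.default_rng(0).standard_normal((1000, 1000))

t_np = timeit.timeit(lambda: np.linalg.svd(a), number=5)
t_sp = timeit.timeit(lambda: scipy.linalg.svd(a), number=5)

print(f"numpy.linalg.svd: {t_np / 5:.3f} s per call")
print(f"scipy.linalg.svd: {t_sp / 5:.3f} s per call")
```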
Apart from the error checking, the actual work seems to be done within LAPACK with both numpy and scipy.
Without having done any benchmarking, I guess the performance should be identical.
I understand that GMPY2 supports the GMP library and numpy has fast numerical libraries. I want to know how the speed compares to actually writing C (or C++) code with GMP. Since Python is a scripting language, I don't think it will ever be as fast as a compiled language, however I have been wrong about these generalizations before.
I can't get GMP to work on my computer, so I can't run any tests. If I could, I would just test general math like addition and maybe some trig functions. I'll figure out GMP later.
numpy and GMPY2 have different purposes.
numpy has fast numerical libraries, but to achieve high performance it is effectively restricted to working with vectors or arrays of low-level types: 16, 32, or 64 bit integers, or 32 or 64 bit floating point values. For example, numpy accesses highly optimized routines written in C (or Fortran) for performing matrix multiplication.
GMPY2 uses the GMP, MPFR, and MPC libraries for multiple-precision calculations. It isn't targeted towards vector or matrix operations.
The Python interpreter adds overhead to each call to an external library. Whether or not the slowdown is significant depends on how much time is spent inside the external library. If the running time of the external call is very short, say 10e-8 seconds, then Python's overhead is significant. If the running time of the external call is relatively long, several seconds or longer, then Python's overhead is probably insignificant.
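A rough way to see this on your own machine is to time a tiny operation against a long-running one. In the sketch below the per-call Python/gmpy2 overhead dominates the small multiplication but is negligible for the huge one; the operand sizes are arbitrary illustrations.

```python
# Sketch: when the work done inside GMP is tiny, Python call overhead
# dominates; when the operands are huge, GMP's own runtime dominates.
import timeit
from gmpy2 import mpz

small = mpz(12345)
big = mpz(10) ** 100000          # ~100,000-digit integer

t_small = timeit.timeit(lambda: small * small, number=100000)
t_big = timeit.timeit(lambda: big * big, number=100)

print(f"small*small: {t_small / 100000:.2e} s per call (mostly overhead)")
print(f"big*big:     {t_big / 100:.2e} s per call (mostly GMP work)")
```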
Since you haven't said what you are trying to accomplish, I can't give a better answer.
Disclaimer: I maintain GMPY2.