Both SciPy and NumPy have built-in functions for singular value decomposition (SVD): scipy.linalg.svd and numpy.linalg.svd, respectively. What is the difference between these two? Is either better than the other?
According to the FAQ page, the scipy.linalg submodule provides a more complete wrapper for the Fortran LAPACK library, whereas numpy.linalg is designed to be buildable independently of LAPACK.
I did some benchmarks of the different implementations of the SVD functions and found that scipy.linalg.svd is faster than its NumPy counterpart:
However, JAX's wrapped NumPy, i.e. jax.numpy.linalg.svd, is even faster:
The full notebook for the benchmarks is available here.
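For reference, a minimal version of such a benchmark can be sketched as follows. The matrix size and repeat count are arbitrary choices of mine, and the SciPy timing is simply skipped when SciPy is not installed:

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((200, 200))  # arbitrary test matrix

# Time NumPy's SVD.
t_np = timeit.timeit(lambda: np.linalg.svd(a), number=20)
print(f"numpy.linalg.svd: {t_np:.3f} s for 20 runs")

# Time SciPy's SVD, if SciPy is available.
try:
    from scipy import linalg as sla
    t_sp = timeit.timeit(lambda: sla.svd(a), number=20)
    print(f"scipy.linalg.svd: {t_sp:.3f} s for 20 runs")
except ImportError:
    pass  # compare only when SciPy is installed

# Sanity check: the factors reconstruct the input.
u, s, vt = np.linalg.svd(a)
assert np.allclose(u @ np.diag(s) @ vt, a)
```

Timings like these are sensitive to the BLAS/LAPACK build each library links against, so results vary between machines.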
Apart from the error checking, the actual work seems to be done within LAPACK, for both NumPy and SciPy.
Without having done any benchmarking, I guess the performance should be identical.
I am using one of the new MacBooks with the M1 chip. I cannot use SciPy natively; it only runs through Rosetta, but for other reasons I cannot do that right now.
The ONLY thing I need from SciPy is scipy.linalg.expm. Is there an alternative library with an implementation of expm that is equally (or almost equally) fast?
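If a rough stand-in is acceptable, expm can be approximated in pure NumPy with scaling-and-squaring plus a truncated Taylor series. This is a simplified sketch of my own, not the Padé-based algorithm SciPy uses, and the function name expm_naive and the fixed series order are arbitrary choices; it is fine for well-behaved matrices but is not a production replacement:

```python
import numpy as np

def expm_naive(a, order=18):
    """Matrix exponential via scaling-and-squaring with a truncated
    Taylor series. Rough stand-in for scipy.linalg.expm."""
    a = np.asarray(a, dtype=float)
    # Scale the matrix so its norm is at most ~1, then square back.
    norm = np.linalg.norm(a, ord=np.inf)
    s = max(0, int(np.ceil(np.log2(norm)))) if norm > 0 else 0
    a_scaled = a / (2 ** s)
    # Truncated Taylor series: I + A + A^2/2! + ... + A^order/order!
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, order + 1):
        term = term @ a_scaled / k
        result = result + term
    # Undo the scaling: exp(A) = exp(A / 2^s)^(2^s).
    for _ in range(s):
        result = result @ result
    return result

# Check against a known case: exp of a rotation generator.
a = np.array([[0.0, -1.0], [1.0, 0.0]])
r = expm_naive(a)
expected = np.array([[np.cos(1), -np.sin(1)], [np.sin(1), np.cos(1)]])
assert np.allclose(r, expected)
```

If PyTorch or TensorFlow run natively on the M1, torch.linalg.matrix_exp or tf.linalg.expm may also be worth trying as fast alternatives.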
Has anybody managed to run SciPy functions on CUDA kernels? As far as I can tell from the literature, SciPy has no explicit support for GPU computing.
I would like to speed up code that uses some scipy.optimize functions by running it on CUDA kernels. If somebody knows of any other library for nonlinear optimization that can run on CUDA GPUs, I would very much appreciate hearing about it here.
Numba might be what you are looking for; it allows you to run Python code on GPUs. Another good resource might be this tutorial.
I have a similar problem right now and came across CuPy. It supports CUDA and has an interface designed as a drop-in replacement for NumPy and SciPy.
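The drop-in pattern can be sketched like this; the try/except fallback is my own convention so that the same code runs on machines without a GPU, which also means CuPy itself is an optional dependency here:

```python
# Import CuPy if present, otherwise fall back to NumPy. The rest of
# the code is written against the common array API, so it runs on
# either backend unchanged.
try:
    import cupy as xp          # GPU arrays via CUDA, if available
    on_gpu = True
except ImportError:
    import numpy as xp         # transparent CPU fallback
    on_gpu = False

a = xp.arange(12.0).reshape(3, 4)
col_means = a.mean(axis=0)
u, s, vt = xp.linalg.svd(a, full_matrices=False)
print("running on GPU:", on_gpu)
```

Note that CuPy's SciPy coverage (under the cupyx.scipy namespace) is a subset of SciPy, so check its documentation for whether the specific scipy.optimize routines you need are available.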
I have run into similar problems and am now using tensorflow_probability's optimization capabilities. The biggest problem is that TensorFlow ops often cannot be written as direct translations of NumPy functions, so there is a pretty steep learning curve. But it is very fast once your code is written.
I am writing a Python script that needs to fit a distribution to some generated data.
I found that this is possible using SciPy, or other packages that have SciPy as a dependency; however, due to administrative constraints, I am unable to install SciPy's dependencies (such as BLAS) on the machine where the script will run.
Is there a way to perform distribution fitting in Python without using SciPy or packages that depend on it?
EDIT: as asked in a comment, what I want to do is perform an Anderson-Darling test for normality.
The alternatives I found so far (but had to disregard):
statsmodels: has SciPy as a dependency
R and MATLAB Python APIs: require setting up external software, which poses the same problem for me as SciPy
Fitting the normal distribution only requires calculating mean and standard deviation.
The Anderson-Darling test only requires numpy, or could alternatively be rewritten using list comprehensions. The critical values for the AD test are tabulated or come from a simple approximation formula. It does not use any of the difficult parts of scipy, like optimize or special.
So I think it should not be too difficult to translate either the scipy.stats or the statsmodels version into pure Python, or into code with only numpy as a dependency.
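As an illustration, such a translation can be sketched with NumPy and the standard library alone. The critical value below is the commonly tabulated 5% value for the case where mean and variance are estimated from the data; treat both it and the small-sample adjustment as approximations and double-check them against scipy.stats.anderson before relying on them:

```python
import math
import numpy as np

def anderson_darling_normal(x):
    """Adjusted A^2 statistic for a test of normality, with mean and
    variance estimated from the sample. Larger values mean stronger
    evidence against normality."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)
    # Standard normal CDF via math.erf -- no scipy.special needed.
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    i = np.arange(1, n + 1)
    # A^2 = -n - (1/n) * sum (2i-1) [ln F(x_i) + ln(1 - F(x_{n+1-i}))]
    a2 = -n - np.mean((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1])))
    # Small-sample adjustment for the estimated-parameters case.
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)

# Approximate 5% critical value: reject normality if A^2* > ~0.787.
rng = np.random.default_rng(42)
print(anderson_darling_normal(rng.standard_normal(500)))   # small
print(anderson_darling_normal(rng.exponential(size=500)))  # large
```

The np.vectorize call is only a convenience wrapper around math.erf; for very large samples it could be replaced with a vectorized erf approximation.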
Testing against NumPy/SciPy means testing against several versions of each, since I need to support every version since NumPy 1.6 and SciPy 0.11.
Testing all combinations would explode the build matrix in continuous integration (e.g. travis-ci). I've searched the SciPy homepage for notes about version compatibility or sane combinations, but have not found anything useful.
So my question is how to safely reduce the number of combinations while maintaining maximum test coverage.
Do all combinations actually occur in the wild? Or are there known compatibility constraints between SciPy and NumPy?
This doesn't completely answer your question, but I think the policy of SciPy release management since 0.11 or earlier has been to support all NumPy versions from 1.5.1 up to the NumPy version in development at the time of the SciPy release.
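One pragmatic way to prune the matrix, consistent with that policy, is to pair every SciPy release with only the oldest and newest NumPy versions you support, rather than building the full cross product. A small sketch; the version lists are placeholders of mine, not an official compatibility table:

```python
def reduced_matrix(numpy_versions, scipy_versions):
    """Pair each scipy release with the oldest and newest supported
    numpy instead of the full cross product -- the extremes tend to
    catch both removed-API and new-deprecation breakage."""
    oldest, newest = numpy_versions[0], numpy_versions[-1]
    combos = set()
    for sp in scipy_versions:
        combos.add((oldest, sp))
        combos.add((newest, sp))
    return sorted(combos)

# Placeholder version lists, ordered oldest to newest.
numpy_versions = ["1.6", "1.7", "1.8"]
scipy_versions = ["0.11", "0.12", "0.13"]
matrix = reduced_matrix(numpy_versions, scipy_versions)
print(len(matrix), "combinations instead of",
      len(numpy_versions) * len(scipy_versions))
```

Each tuple in the result would become one CI job, so the matrix grows linearly in the number of SciPy versions instead of quadratically.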
I understand that GMPY2 wraps the GMP library and that numpy has fast numerical libraries. I want to know how their speed compares to actually writing C (or C++) code with GMP. Since Python is a scripting language, I don't think it will ever be as fast as a compiled language; however, I have been wrong about such generalizations before.
I can't get GMP to work on my computer, so I can't run any tests. If I could, I would test general math like addition and maybe some trig functions. I'll figure out GMP later.
numpy and GMPY2 have different purposes.
numpy has fast numerical libraries, but to achieve high performance it is effectively restricted to working with vectors or arrays of low-level types: 16-, 32-, or 64-bit integers, or 32- or 64-bit floating-point values. For example, numpy accesses highly optimized routines written in C (or Fortran) for performing matrix multiplication.
GMPY2 uses the GMP, MPFR, and MPC libraries for multiple-precision calculations. It isn't targeted towards vector or matrix operations.
The Python interpreter adds overhead to each call to an external library. Whether or not the slowdown is significant depends on how much time is spent by the external library. If the running time of the external library is very short, say 10e-8 seconds, then Python's overhead is significant. If the running time of the external library is relatively long, several seconds or longer, then Python's overhead is probably insignificant.
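That fixed per-call overhead is easy to see with NumPy itself, by timing the same operation on a tiny array versus a large one. The sizes and repeat counts below are arbitrary choices of mine:

```python
import timeit
import numpy as np

small = np.arange(4.0)            # 4 elements: call overhead dominates
large = np.arange(1_000_000.0)    # 1M elements: real work dominates

t_small = timeit.timeit(lambda: small.sum(), number=100_000)
t_large = timeit.timeit(lambda: large.sum(), number=100)

# Normalize to cost per element processed.
per_elem_small = t_small / (100_000 * small.size)
per_elem_large = t_large / (100 * large.size)
print(f"per-element cost, small array: {per_elem_small:.2e} s")
print(f"per-element cost, large array: {per_elem_large:.2e} s")
```

The tiny array's per-element cost is orders of magnitude higher, because every call pays the interpreter and dispatch overhead regardless of how little work the library actually does.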
Since you haven't said what you are trying to accomplish, I can't give a better answer.
Disclaimer: I maintain GMPY2.