While searching for some numpy stuff, I came across a question discussing the rounding accuracy of numpy.dot():
Numpy: Difference between dot(a,b) and (a*b).sum()
Since I happen to have two (different) computers with Haswell CPUs sitting on my desk, which should provide FMA and everything, I thought I'd test the example given by Ophion in the first answer, and I got a result that somewhat surprised me:
After updating/installing/fixing lapack/blas/atlas/numpy, I get the following on both machines:
>>> a = np.ones(1000, dtype=np.float128)+1e-14
>>> (a*a).sum()
1000.0000000000199999
>>> np.dot(a,a)
1000.0000000000199948
>>> a = np.ones(1000, dtype=np.float64)+1e-14
>>> (a*a).sum()
1000.0000000000198
>>> np.dot(a,a)
1000.0000000000176
So the standard multiplication + sum() is more precise than np.dot(). timeit, however, confirmed that the .dot() version is faster (but not by much) for both float64 and float128.
Can anyone provide an explanation for this?
edit: I accidentally deleted the info on numpy versions: same results for 1.9.0 and 1.9.3 with python 3.4.0 and 3.4.1.
It looks like they recently added a special Pairwise Summation to ndarray.sum for improved numerical stability.
From PR 3685, this affects:
all add.reduce calls that go over float_add with IS_BINARY_REDUCE true
so this also improves mean/std/var and anything else that uses sum.
See here for code changes.
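To illustrate why the order of summation matters, here is a minimal sketch that compares naive left-to-right accumulation with a recursive pairwise scheme in float32. It is not NumPy's actual implementation (which is written in C); the function names and the block size of 8 are assumptions chosen purely for illustration.

```python
import math
import numpy as np

def naive_sum(values):
    """Accumulate left to right, staying in float32 throughout."""
    total = np.float32(0.0)
    for v in values:
        total = total + v  # float32 + float32 stays float32
    return total

def pairwise_sum(values, block=8):
    """Split the input in halves recursively; sum small blocks naively."""
    if len(values) <= block:
        return naive_sum(values)
    mid = len(values) // 2
    return pairwise_sum(values[:mid], block) + pairwise_sum(values[mid:], block)

x = np.full(100000, 0.1, dtype=np.float32)

# Accurate double-precision sum of the stored float32 values, used as reference.
reference = math.fsum(float(v) for v in x)

naive_error = abs(float(naive_sum(x)) - reference)
pairwise_error = abs(float(pairwise_sum(x)) - reference)
print(naive_error, pairwise_error)
```

In the naive loop the rounding error grows with the size of the running total, while the pairwise scheme only ever adds partial sums of similar magnitude, so its error stays much smaller.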
I have a codebase I've been developing on a Mac (and running on Linux machines) based largely on pandas (and therefore numpy). Very commonly I type-cast with astype(int).
Recently a Windows-based developer joined our team. In an effort to make the code base more platform-independent, we're trying to gracefully tackle this tricky issue whereby numpy uses a 32-bit type instead of the 64-bit type, which breaks longer integers.
On a Mac, we see:
ipdb> ids.astype(int)
id
1818726176 1818726176
1881879486 1881879486
2590366906 2590366906
284399109 284399109
299981685 299981685
370708200 370708200
387277023371 387277023371
387343898032 387343898032
406885699892 406885699892
5262665206 5262665206
544687374 544687374
6978317806 6978317806
Whereas on a Windows machine (in PowerShell), we see:
ipdb> ids.astype(int)
id
1818726176 1818726176
1881879486 1881879486
2590366906 -1704600390
284399109 284399109
299981685 299981685
370708200 370708200
387277023371 729966731
387343898032 796841392
406885699892 -1136193228
5262665206 967697910
544687374 544687374
6978317806 -1611616786
Other than using a sed call to change every astype(int) to astype(np.int64) (which would also require an import numpy as np at the top of every module where currently that doesn't exist), is there a way to do this?
In particular, I was hoping to map int to numpy.int64 somehow in a pandas option or something.
Thank you!
I'm not saying that this is a really good idea, but you can simply redefine int to whatever you want:
import numpy as np
x = 2384351503.0
print(np.array(x).astype(int))
#-2147483648
old_int = int
int = np.int64
print(np.array(x).astype(int))
#2384351503
int = old_int
print(np.array(x).astype(int))
#-2147483648
In the case you described I'd, however, strongly prefer to fix the source code instead of redefining standard data types. It's a one-time effort and any IDE can do it easily.
Numpy is already implicitly imported by pandas, so importing it doesn't cost any additional time or resources. If you really want to avoid it (for whatever reason), you can use pd.Int64Dtype.type instead of np.int64 (see source).
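As a sketch of the "fix the source" route: spelling out the width makes the cast behave the same on every platform. The sample IDs are taken from the output in the question; the variable names are illustrative. The last cast forces a 32-bit type to reproduce the wrapped values from the Windows output, assuming NumPy's default (unchecked) astype casting.

```python
import numpy as np

ids = np.array([2590366906, 387277023371, 6978317806], dtype=np.int64)

# Explicit 64-bit cast: the result no longer depends on what plain `int` maps to.
ids64 = ids.astype(np.int64)

# The dtype can also be given as a string.
ids64_from_str = ids.astype("int64")

# Forcing 32 bits wraps modulo 2**32, matching the Windows output above.
ids32 = ids.astype(np.int32)

print(ids64.tolist())
print(ids32.tolist())
```

The string form is convenient in modules that use pandas but do not import numpy, since pandas' astype accepts "int64" as well.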
I have tried to run the stokesCavity example, which uses lid-driven boundary conditions for the flow. At the end of the code, the values in the top-right cell are compared with some reference values.
>>> print(numerix.allclose(pressure.globalValue[..., -1], 162.790867927)) #doctest: +NOT_PYAMGX_SOLVER
1
>>> print(numerix.allclose(xVelocity.globalValue[..., -1], 0.265072740929)) #doctest: +NOT_PYAMGX_SOLVER
1
>>> print(numerix.allclose(yVelocity.globalValue[..., -1], -0.150290488304)) #doctest: +NOT_PYAMGX_SOLVER
1
When I tried running this example, my output was
False
False
False
The actual values I get for the top-right cell are 129.235, 0.278627 and -0.166620 (instead of 162.790867927, 0.265072740929 and -0.150290488304). Does anyone know why I am getting different values? I've tried changing the solver (scipy, Trilinos and pysparse), but the results do not change up to the 12th digit. The velocity profile looks similar to the one shown in their manual, but I am still worried something is off.
I run it on Linux (python 2.7.14, fipy 3.2, pysparse 1.2.dev0, Trilinos 12.12, scipy 1.2.1) and on Windows (python 2.7.15, fipy 3.1.3, scipy 1.1.0).
When run as part of the test suite, this example only does 5 sweeps, and the numerical check is hard-wired for this. When you run the example in isolation, it does 300 sweeps and the solution is better (or at least differently) converged. There's nothing wrong with the example, other than it's not written in a very robust way. Thanks for asking about this; we'll try to clean up the example.
I have a piece of code which computes the Helmholtz-Hodge Decomposition.
I've been running on my Mac OS Yosemite and it was working just fine. A month ago, however, my Mac got pretty slow (it was really old), and I opted to buy a new notebook (Windows 8.1, Dell).
After installing all Python libs and so on, I continued my work running this same code (versioned in Git). And then the result was pretty weird, completely different from the one obtained in the old notebook.
For instance, what I do is construct two matrices a and b (a really long calculation) and then call the solver:
s = numpy.linalg.solve(a, b)
This returned a result that was wrong, and different from the result obtained on my Mac, which was right.
Then, I tried to use:
s = scipy.linalg.solve(a, b)
And the program exits with code 0, but in the middle of the run.
Then, I just made a simple test of:
print 'here1'
s = scipy.linalg.solve(a, b)
print 'here2'
And here2 is never printed.
I tried:
print 'here1'
x, info = numpy.linalg.cg(a, b)
print 'here2'
And the same happens.
I also tried to check the solution after using numpy.linalg.solve:
print numpy.allclose(numpy.dot(a, s), b)
And I got a False (?!).
I don't know what is happening or how to find a solution. I just know that the same code runs on my Mac, and it would be very good if I could run it on other platforms. Now I'm stuck on this problem (I don't have a Mac anymore) with no clue about the cause.
The weirdest thing is that I don't receive any error or runtime warning, no feedback at all.
Thank you for any help.
EDIT:
Numpy Test Suite Results:
Scipy Test Suite Results:
Download Anaconda package manager
http://continuum.io/downloads
When you download this it will already have all the dependencies for numpy worked out for you. It installs locally and will work on most platforms.
This is not really an answer, but this blog discusses at length the problems of having a numpy ecosystem that evolves fast, at the expense of reproducibility.
By the way, which version of numpy are you using? The documentation for the latest 1.9 does not report any method called cg as the one you use...
I suggest the use of this example so that you (and others) can check the results.
>>> import numpy as np
>>> import scipy.linalg
>>> np.random.seed(123)
>>> a = np.random.random(size=(10000, 10000))
>>> b = np.random.random(size=(10000,))
>>> s_np = np.linalg.solve(a, b)
>>> s_sc = scipy.linalg.solve(a, b)
>>> np.allclose(s_np,s_sc)
>>> s_np
array([-15.59186559, 7.08345804, 4.48174646, ..., -16.43310046,
-8.81301553, -10.77509242])
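A smaller variant of the same idea, as a hedged sketch: besides comparing two solvers, it can help to check the residual of the solution directly and to look at the condition number, since an ill-conditioned matrix can make results differ between BLAS/LAPACK builds. The matrix here is synthetic (random plus a scaled identity, so it is well conditioned), and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(123)
n = 200

# Random entries in [0, 1) plus n on the diagonal: strictly diagonally dominant.
a = rng.random((n, n)) + n * np.eye(n)
b = rng.random(n)

s = np.linalg.solve(a, b)

residual = np.linalg.norm(a @ s - b)  # should be close to machine precision
condition = np.linalg.cond(a)         # large values signal an ill-conditioned system

print(residual, condition, np.allclose(a @ s, b))
```

If the same check on the real matrices shows a very large condition number, a failing allclose may say more about the problem than about the installation.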
I hope you can find the answer - one option in the future is to create a virtual machine for each of your projects, using Docker. This allows easy portability.
See a great article here discussing Docker for research.
In a slightly contrived experiment I wanted to compare some of Python's built-in functions to those of numpy. When I started timing these though, I found something bizarre.
When I wrote the following:
import timeit
timeit.timeit('import math; math.e**2', number=1000000)
I would get two different results in almost random alternation in a very statistically significant way.
This alternates between 2 seconds, and 0.5 seconds.
This confused me so I ran some experiments to figure out what was going on and I was only more confused. So I tried the following experiments:
[timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100)]
which led entirely to the 0.5 number. I then tried seeding this with a generator:
test = (timeit.timeit('import math; math.e**2', number=1000000) for i in xrange(100))
[item for item in test]
which led to a list entirely full of the 2.0 number.
On the suggestion of alecxe I changed my timeit statement to:
timeit.timeit('math.e**2', 'import math', number=1000000)
which similarly alternated between about 0.1 and 0.4 seconds. However, when I reran the experiment comparing generators and list comprehensions, the results were flipped: the generator expression regularly came up with the 0.1 second number, while the list comprehension returned a full list of the 0.4 second number.
Direct console output:
>>> test = (timeit.timeit('math.e**2', 'import math', number=1000000) for i in xrange(100))
>>> test.next()
0.15114784240722656
>>> timeit.timeit('math.e**2', 'import math', number=1000000)
0.44176197052001953
>>>
Edit: I'm using Ubuntu 12.04 running dwm, and I've seen these results both in xterm and a gnome-terminal. I'm using python 2.7.3
Does anybody know what's going on here? This seems really bizarre to me.
Turns out there were a couple of things happening here. Apparently some of these quirks may be specific to my machine, but I figure it's worth posting them in case someone is puzzled by the same thing.
Firstly, there's a difference between the two timeit calls, in that with:
timeit.timeit('math.e**2', 'import math', number=1000000)
the import statements are lazily loaded. This becomes obvious if you try the following experiment:
timeit.timeit('1+1', 'import math', number=1000000)
versus:
timeit.timeit('1+1', number=1000000)
So when it was directly run in the list comprehension it looks like this import statement was being loaded for every entry. (Exact reasons for this are probably related to my configuration).
Past that, going back to the original question, it looks like 3/4 of the time was actually spent on import math, so I'm guessing that with the generator expression there was no caching between iterations, while there was import caching within the list comprehension (again, the exact reason for this is probably configuration specific).
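For reference, the timeit documentation describes setup as executed once per timing run and stmt as executed number times, so an import placed inside stmt is re-executed on every iteration (after the first time it amounts to a lookup in sys.modules). A minimal sketch of the two variants from this thread, written in Python 3 syntax and with a smaller number, both chosen here for illustration:

```python
import timeit

n = 100000

# Import inside stmt: the import statement runs on each of the n iterations.
import_in_stmt = timeit.timeit('import math; math.e**2', number=n)

# Import in setup: it runs once, and only the expression is timed repeatedly.
import_in_setup = timeit.timeit('math.e**2', setup='import math', number=n)

print(import_in_stmt, import_in_setup)
```

Running each variant several times (for example with timeit.repeat) and taking the minimum gives a more stable comparison than a single call.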
I recently reinstalled my python environment, and code that used to run very quickly now crawls at best (it usually just hangs, taking up more and more memory).
The point at which the code hangs is:
solve(exp(-alpha * x**2) - 0.01, alpha)
I've been able to reproduce this problem with a fresh IPython 0.13.1 session:
In [1]: from sympy import solve, Symbol, exp
In [2]: x = 14.7296138519
In [3]: alpha = Symbol('alpha', real=True)
In [4]: solve(exp(-alpha * x**2) - 0.01, alpha)
This works for integers, but is also quite slow. In the original code I looped over this, looking for hundreds of different alphas for different values of x (other than 14.7296138519), and it didn't take more than a second.
Any thoughts?
The rational=False flag was introduced for such cases as this.
>>> q=14.7296138519
>>> solve(exp(-alpha * q**2) - 0.01, alpha, rational=False)
[0.0212257459123917]
(The explanation is given in the issue cited above.)
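As a cross-check that does not depend on the solver at all: this particular equation can be solved by hand. From exp(-alpha * x**2) = 0.01 it follows that alpha = -ln(0.01) / x**2. A minimal sketch using only the standard library (variable names are illustrative):

```python
import math

x = 14.7296138519

# Closed-form solution of exp(-alpha * x**2) = 0.01 for alpha.
alpha = -math.log(0.01) / x**2

# Substitute back to confirm that alpha satisfies the equation.
check = math.exp(-alpha * x**2)

print(alpha, check)
```

The value should agree closely with the solve(..., rational=False) result above, and in a loop over many values of x this avoids the symbolic solver entirely.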
Rolling back from version 0.7.2 to 0.7.1 solved this problem.
easy_install sympy==0.7.1
I've reported this as a bug to sympy's google code.
I am currently running sympy version 1.11.1.
As far as I know, this is the latest version.
However, the hang described above persists when solving a set of 3 differential equations for 3 angular second derivatives.
E.g.:
sols = solve([LE1, LE2, LE3], (the_dd, phi_dd, psi_dd),
simplify=False, rational=False)