When an np.ndarray is pickled, the dumps function increments its reference count, but the count is never decremented.
Python 3.6.4 Anaconda
Ubuntu 16.04.5 LTS
numpy 1.16.0
I have already tried converting to a list using numpy.ndarray.tolist(), but that method is far too slow.
import numpy as np
import pickle
import sys
a = np.ndarray((10, 10), dtype=np.uint8)
print(sys.getrefcount(a)) # 2
pickle.dumps(a)
print(sys.getrefcount(a)) # 3
I would expect the output to be 2, 2, because of the Py_DECREF that occurs in the pickler's dumps function, but the extra reference remains.
The output is 2, 3 and I cannot fix it. I am leaking memory like crazy.
Currently digging into _pickle.c.
You've run into a specific known bug: a regression present only in NumPy 1.16.0. New code added to support the new pickle protocol 5 leaked a reference to the array's bound __reduce__ method in the fallback case.
You can either wait for that bug to be fixed and 1.16.1 to be released, or go back to NumPy 1.15.4.
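If you cannot wait or downgrade, a possible stopgap (my own sketch, not part of the original answer) is to sidestep ndarray's __reduce_ex__ path entirely by pickling the raw buffer plus the metadata needed to rebuild the array. This simple version assumes a C-contiguous array:

import pickle
import numpy as np

a = np.zeros((10, 10), dtype=np.uint8)

# Pickle the raw bytes and metadata instead of the array object itself,
# avoiding the leaking fallback path in 1.16.0.
payload = pickle.dumps((a.tobytes(), str(a.dtype), a.shape))

# Rebuild: frombuffer returns a read-only view, so copy() before writing to it.
raw, dtype, shape = pickle.loads(payload)
b = np.frombuffer(raw, dtype=dtype).reshape(shape).copy()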
Related
I have a codebase I've been developing on a Mac (and running on Linux machines) that is based largely on pandas (and therefore numpy). Very commonly I type-cast with astype(int).
Recently a Windows-based developer joined our team. In an effort to make the code base more platform-independent, we're trying to gracefully tackle this tricky issue: on Windows, numpy maps the builtin int to the platform's 32-bit C long rather than a 64-bit type, which breaks longer integers.
On a Mac, we see:
ipdb> ids.astype(int)
id
1818726176 1818726176
1881879486 1881879486
2590366906 2590366906
284399109 284399109
299981685 299981685
370708200 370708200
387277023371 387277023371
387343898032 387343898032
406885699892 406885699892
5262665206 5262665206
544687374 544687374
6978317806 6978317806
Whereas on a Windows machine (in PowerShell), we see:
ipdb> ids.astype(int)
id
1818726176 1818726176
1881879486 1881879486
2590366906 -1704600390
284399109 284399109
299981685 299981685
370708200 370708200
387277023371 729966731
387343898032 796841392
406885699892 -1136193228
5262665206 967697910
544687374 544687374
6978317806 -1611616786
Other than using a sed call to change every astype(int) to astype(np.int64) (which would also require adding import numpy as np at the top of every module where it doesn't already exist), is there a way to do this?
In particular, I was hoping to map int to numpy.int64 somehow in a pandas option or something.
Thank you!
I'm not saying that this is a really good idea, but you can simply redefine int to whatever you want:
import numpy as np
x = 2384351503.0
print(np.array(x).astype(int))
#-2147483648
old_int = int
int = np.int64
print(np.array(x).astype(int))
#2384351503
int = old_int
print(np.array(x).astype(int))
#-2147483648
In the case you described, however, I'd strongly prefer to fix the source code instead of redefining standard data types. It's a one-time effort and any IDE can do it easily.
Numpy is already implicitly imported by pandas, so it doesn't cost any additional time or resources. If you really want to avoid it (for whatever reason), you can use pd.Int64Dtype.type instead of np.int64 (see source).
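A further option (my suggestion, building on the answer above): astype also accepts dtype names as strings, and 'int64' resolves to a 64-bit integer on every platform, so you get a platform-independent cast without adding import numpy as np at the call site:

import pandas as pd

ids = pd.Series([387277023371, 5262665206, 544687374])

# 'int64' is 64-bit everywhere, unlike the builtin int,
# which numpy maps to the platform C long (32-bit on Windows).
print(ids.astype('int64'))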
I am trying PyPy for the first time because I need serializable continuations. Specifically, this is what I am attempting:
from _continuation import continulet
import pickle
def f(cont):
    cont.switch(111)
    cont.switch(222)
    cont.switch(333)
c = continulet(f)
print(c.switch())
print(c.switch())
saved = pickle.dumps(c)
When I try to pickle c I get this error, though: NotImplementedError: continulet's pickle support is currently disabled.
So, is there some way to enable pickling of continuations? The message suggests this, but so far I couldn't find out how.
Edit: I am using "PyPy 7.3.1 with GCC 9.3.0" (Python 3.6.9) on Linux.
I'm currently using a python module called petsc4py (https://pypi.org/project/petsc4py/). My main issue is that none of the typical intellisense features seem to work with this module.
I'm guessing it might have something to do with it being a C extension module, but I am not sure exactly why this happens. I initially thought that intellisense was unable to look inside ".so" files, but it seems that numpy manages this for its array object, which in my case lives in a file called multiarray.cpython-37m-x86_64-linux-gnu.so (see the example below).
Does anyone know why I see this behaviour with the petsc4py module? Is there anything that I (or the developers of petsc4py) can do to get intellisense to work?
Example:
import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

# petsc4py object: no completions are offered for x_p or u_p
x_p = PETSc.Vec().create()
x_p.setSizes(10)
x_p.setFromOptions()
u_p = x_p.duplicate()

# numpy object: completions work as expected
import numpy as np
x_n = np.array([1, 2, 3])
u_n = x_n.copy()
In this example, when working with a Vec object from petsc4py, typing u_p.duplicate() cannot find the function, and the suggestion is simply a repetition of the identifier typed immediately before. However, with the numpy array, u_n.copy() works perfectly.
If you're compiling in-place then you're bumping up against https://github.com/microsoft/python-language-server/issues/197.
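A workaround sometimes suggested for compiled extension modules (my suggestion, not from the linked issue) is to generate .pyi stubs with mypy's stubgen and point your editor's stub path (e.g. python.analysis.stubPath) at the output directory:

$ pip install mypy
$ stubgen -p petsc4py -o typings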
Consider the following code:
from collections import OrderedDict
import gc
gc.set_debug(gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_SAVEALL)
def main():
    leaking = OrderedDict()
    leaking[('x', 'y')] = 4
    leaking[('z', 'w')] = 4
    return
main()
gc.collect()
print(gc.garbage)
Running this, one can see that memory is leaking.
Using objgraph, the leak seems to be due to a circular reference in OrderedDict, which appears to be confirmed by this old Python bug, http://bugs.python.org/issue9825 (but that one is marked as closed).
Using leaking.clear() does not seem to help.
Is it a known bug? Is there a way to work around it on my side (the OrderedDict is returned by a third-party library)?
I'm using Python 2.7.10.
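An aside that is not part of the thread: gc.DEBUG_SAVEALL itself causes every object the collector would free to be appended to gc.garbage, so a populated gc.garbage under that flag does not by itself prove anything is uncollectable. A minimal check without the flag:

import gc
from collections import OrderedDict

gc.set_debug(gc.DEBUG_UNCOLLECTABLE)  # note: no DEBUG_SAVEALL here

def main():
    d = OrderedDict()
    d[('x', 'y')] = 4

main()
gc.collect()
print(gc.garbage)  # [] -- the OrderedDict's internal cycle was collected normally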
I am using numpy and my model involves intensive matrix-matrix multiplication.
To speed things up, I use the multi-threaded OpenBLAS library to parallelize numpy.dot.
My setup is as follows:
OS: CentOS 6.2 server, #CPUs = 12, MEM = 96GB
Python version: 2.7.6
numpy: 1.8.0
OpenBLAS + IntelMKL
$ OMP_NUM_THREADS=8 python test_mul.py
The code, which I took from https://gist.github.com/osdf/, is:
test_mul.py:
import numpy
import sys
import timeit
try:
import numpy.core._dotblas
print 'FAST BLAS'
except ImportError:
print 'slow blas'
print "version:", numpy.__version__
print "maxint:", sys.maxint
print
x = numpy.random.random((1000,1000))
setup = "import numpy; x = numpy.random.random((1000,1000))"
count = 5
t = timeit.Timer("numpy.dot(x, x.T)", setup=setup)
print "dot:", t.timeit(count)/count, "sec"
when I use OMP_NUM_THREADS=1 python test_mul.py, the result is
dot: 0.200172233582 sec
OMP_NUM_THREADS=2
dot: 0.103047609329 sec
OMP_NUM_THREADS=4
dot: 0.0533880233765 sec
things go well.
However, when I set OMP_NUM_THREADS=8, the code only occasionally works: sometimes it runs, sometimes it does not even start and gives me core dumps.
When OMP_NUM_THREADS > 10, the code seems to break every time.
I am wondering what is happening here. Is there some maximum number of threads that each process can use? Can I raise that limit, given that I have 12 CPUs in my machine?
Thanks
Firstly, I don't really understand what you mean by 'OpenBLAS + IntelMKL'. Both of those are BLAS libraries, and numpy should only link to one of them at runtime. You should probably check which of these two numpy is actually using. You can do this by calling:
$ ldd <path-to-site-packages>/numpy/core/_dotblas.so
Update: numpy/core/_dotblas.so was removed in numpy v1.10, but you can check the linkage of numpy/core/multiarray.so instead.
For example, I link against OpenBLAS:
...
libopenblas.so.0 => /opt/OpenBLAS/lib/libopenblas.so.0 (0x00007f788c934000)
...
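As an aside (not part of the original answer), you can also inspect the BLAS setup from within Python: numpy.__config__.show() prints the libraries numpy was built against, though note this reflects build-time configuration rather than runtime linkage.

import numpy
numpy.__config__.show()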
If you are indeed linking against OpenBLAS, did you build it from source? If you did, you should see that in the Makefile.rule there is a commented option:
...
# You can define maximum number of threads. Basically it should be
# less than actual number of cores. If you don't specify one, it's
# automatically detected by the script.
# NUM_THREADS = 24
...
By default OpenBLAS will try to set the maximum number of threads to use automatically, but you could try uncommenting and editing this line yourself if it is not detecting this correctly.
Also, bear in mind that you will probably see diminishing returns from using more threads. Unless your arrays are very large, it is unlikely that using more than 6 threads will give much of a performance boost, because of the increased overhead involved in thread creation and management.
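Finally, an aside that is not in the original answer: if rebuilding OpenBLAS is not an option, you can cap the thread count from the calling code by setting the environment variables before numpy is first imported, since OpenBLAS reads them once at load time. A minimal sketch:

import os

# Must run before the first `import numpy`;
# OpenBLAS reads these variables once when the library is loaded.
os.environ['OMP_NUM_THREADS'] = '4'
os.environ['OPENBLAS_NUM_THREADS'] = '4'

import numpy
x = numpy.random.random((1000, 1000))
numpy.dot(x, x.T)  # now uses at most 4 threads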