Can anyone explain why importing cv and numpy would change the behaviour of python's struct.unpack? Here's what I observe:
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from struct import pack, unpack
>>> unpack("f",pack("I",31))[0]
4.344025239406933e-44
This is correct
>>> import cv
libdc1394 error: Failed to initialize libdc1394
>>> unpack("f",pack("I",31))[0]
4.344025239406933e-44
Still ok, after importing cv
>>> import numpy
>>> unpack("f",pack("I",31))[0]
4.344025239406933e-44
And OK after importing cv and then numpy
Now I restart python:
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from struct import pack, unpack
>>> unpack("f",pack("I",31))[0]
4.344025239406933e-44
>>> import numpy
>>> unpack("f",pack("I",31))[0]
4.344025239406933e-44
So far so good, but now I import cv AFTER importing numpy:
>>> import cv
libdc1394 error: Failed to initialize libdc1394
>>> unpack("f",pack("I",31))[0]
0.0
I've repeated this a number of times, including on multiple servers, and it always goes the same way. I've also tried calling the functions as struct.unpack and struct.pack instead of importing the names directly, which makes no difference either.
I can't understand how importing numpy and cv could have any impact at all on the output of struct.unpack (pack remains the same, btw).
The "libdc1394" thing is, I believe, a red herring (see the question "ctypes error: libdc1394 error: Failed to initialize libdc1394").
Any ideas?
tl;dr: importing numpy and then opencv changes the behaviour of struct.unpack.
UPDATE: Paulo's answer below shows that this is reproducible. Seborg's comment suggests that it's something to do with the way python handles subnormals, which sounds plausible. I looked into Contexts but that didn't seem to be the problem, as the context was the same after the imports as it had been before them.
This isn't an answer, but it's too big for a comment. I played with the values a bit to find the limits.
Without loading numpy and cv:
>>> unpack("f", pack("i", 8388608))
(1.1754943508222875e-38,)
>>> unpack("f", pack("i", 8388607))
(1.1754942106924411e-38,)
After loading numpy and cv, the first line is the same, but the second:
>>> unpack("f", pack("i", 8388607))
(0.0,)
You'll notice that the first result is the smallest positive normal 32-bit float (2**-126); anything below it is a subnormal. I then tried the same with d.
Without loading the libraries:
>>> unpack("d", pack("xi", 1048576))
(2.2250738585072014e-308,)
>>> unpack("d", pack("xi", 1048575))
(2.2250717365114104e-308,)
And after loading the libraries:
>>> unpack("d",pack("xi", 1048575))
(0.0,)
Now the first result is the smallest positive normal 64-bit float (2**-1022).
It seems that, for some reason, loading the numpy and cv libraries, in that order, makes unpack return 0 for any value below the smallest normal 32-bit or 64-bit float, i.e. for subnormals.
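The thresholds found above line up with the IEEE 754 layout: bit patterns below 0x00800000 (float) or 0x0010000000000000 (double) have an all-zero exponent field and encode subnormal numbers. Here is a small sketch that decodes the boundary patterns using only the standard library. The helper names and the explicit little-endian formats are my own choices, not what the sessions above used. On an interpreter where subnormals are not being flushed to zero, each decoded value should equal the exact power-of-two expression shown in the comment.

```python
import math
import sys
from struct import pack, unpack

def float32_from_bits(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 single."""
    return unpack("<f", pack("<I", bits))[0]

def float64_from_bits(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return unpack("<d", pack("<Q", bits))[0]

# 32-bit boundary: 8388608 == 0x00800000 is the first normal pattern.
smallest_normal_f32 = float32_from_bits(8388608)      # 2**-126
largest_subnormal_f32 = float32_from_bits(8388607)    # 8388607 * 2**-149
value_from_question = float32_from_bits(31)           # 31 * 2**-149

# 64-bit boundary: the tests above put 1048576 / 1048575 in the high word.
smallest_normal_f64 = float64_from_bits(1048576 << 32)   # 2**-1022
subnormal_f64 = float64_from_bits(1048575 << 32)         # (1048575 << 32) * 2**-1074

print(smallest_normal_f32, largest_subnormal_f32, value_from_question)
print(smallest_normal_f64 == sys.float_info.min, subnormal_f64)
```

If the second and third float32 values, or the last double, print as 0.0, then subnormals are being flushed to zero in that process, which is the symptom described in the question.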
Related
Bug entered at https://github.com/sympy/sympy/issues/14877
Is this a known issue? Is this a new bug? Will report if new.
What could cause it?
>which python
/opt/anaconda/bin/python
>pip list | grep sympy
sympy 1.1.1
>python
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
from sympy import *
x=symbols('x');
integrate(exp(1-exp(x**2)*x+2*x**2)*(2*x**3+x)/(1-exp(x**2)*x)**2,x)
gives
.....
File "/opt/anaconda/lib/python3.6/site-packages/sympy/core/mul.py", line 1067, in <genexpr>
a.is_commutative for a in self.args)
RecursionError: maximum recursion depth exceeded
>>>
btw, the anti derivative should be
-exp(1-exp(x^2)*x)/(-1+exp(x^2)*x)
It is a known issue that SymPy fails to integrate many functions. This particular function probably wasn't reported yet, so by all means, add it to the ever-growing list.
SymPy tries several integration approaches. One of them, called "manual integration", is highly recursive: a substitution or integration by parts is attempted, and then the process is repeated for the resulting integral.
In this specific case, the expression has a lot of functions that look like candidates for substitution: x**2, the denominator, the content of another exponential function. And SymPy goes into an infinite chain of substitution that leads not to a solution but to a stack overflow... There is no pattern implemented in integrate that would tell SymPy to make the crucial substitution u = 1 - x*exp(x**2).
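With u = 1 - x*exp(x**2), the integrand becomes exp(u)*(u - 1)/u**2 du, whose antiderivative is exp(u)/u. That is the same expression the question gives. As a sanity check that does not depend on SymPy, here is a minimal numerical sketch using only the math module. The function names, sample points and step size are my own choices. It compares a central finite difference of the claimed antiderivative with the integrand at a few points away from the singularity at u = 0.

```python
import math

def integrand(x):
    u = 1 - x * math.exp(x**2)
    return math.exp(u + 2 * x**2) * (2 * x**3 + x) / u**2

def antiderivative(x):
    # Same as -exp(1 - exp(x^2)*x) / (-1 + exp(x^2)*x) from the question.
    u = 1 - x * math.exp(x**2)
    return math.exp(u) / u

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

def max_relative_error(points):
    """Largest relative gap between F'(x) and the integrand over the points."""
    worst = 0.0
    for x in points:
        exact = integrand(x)
        approx = central_difference(antiderivative, x)
        worst = max(worst, abs(approx - exact) / abs(exact))
    return worst

# Points chosen to avoid x = 0 (integrand is 0) and u = 0 (near x = 0.65).
sample_points = [-1.0, -0.5, 0.1, 0.3, 1.0]
print(max_relative_error(sample_points))
```

A tiny relative error at every point is consistent with the stated antiderivative being correct. It is not a proof.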
There is a separate, experimental, integrator called RUBI which could be used with
from sympy.integrals.rubi.rubi import rubi_integrate
rubi_integrate(exp(1-exp(x**2)*x+2*x**2)*(2*x**3+x)/(1-exp(x**2)*x)**2, x)
but it relies on MatchPy which I don't have installed, so I can't tell if it would help here.
I wrote some simple code in the Python interpreter and ran it.
Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> x=np.array([0,1])
>>> w=np.array([0.5,0.5])
>>> b=-0.7
>>> np.sum(w*x)+b
-0.19999999999999996
The result -0.19999999999999996 looks weird. I think it is caused by IEEE 754 floating-point rounding. But when I run almost the same code from a file, the result is different.
import numpy as np
x = np.array([0,1])
w = np.array([0.5,0.5])
b = -0.7
print(np.sum(w * x) + b)
The result is "-0.2". IEEE 754 rounding does not seem to affect the result.
What is the difference between running the code from a file and running it in the interpreter?
The difference is due to how the interpreter displays output.
The print function will try to use an object's __str__ method, but the interactive interpreter echoes the value of a bare expression using the object's __repr__.
If, in the interpreter you wrote:
...
z = np.sum(w*x)+b
print(z)
(which is what you're doing in your code) you'd see -0.2.
Similarly, if in your code you wrote:
print(repr(np.sum(w * x) + b))
(which is what you're doing in the interpreter) you'd see -0.19999999999999996
I think the difference lies in the fact that you use print() for your file based code, which converts the number, while in the interpreter's case, you don't use print(), but rather ask the interpreter to show the result.
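To illustrate the two displays without NumPy, here is a small sketch with plain Python floats. One assumption is labelled loudly: as far as I know, the NumPy version in the question formatted float64 with about 12 significant digits in __str__, so I imitate that with format(z, ".12g"). For built-in Python 3 floats, str() and repr() give the same text, and newer NumPy releases may print the long form in both cases.

```python
# Same arithmetic as np.sum(w * x) + b with x = [0, 1], w = [0.5, 0.5], b = -0.7
z = 0.5 * 0 + 0.5 * 1 + (-0.7)

# What the interactive prompt echoes: the shortest string that round-trips.
as_repr = repr(z)

# Assumption: stand-in for the older NumPy __str__, which rounded the display.
as_rounded_str = format(z, ".12g")

print(as_repr)         # long form
print(as_rounded_str)  # rounded form
```

The stored value is identical in both cases. Only the number of digits shown differs.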
I am new to Python. I am adapting someone else's code from Python 2.X to 3.5. The code loads a file via cPickle. I changed all "cPickle" occurrences to "pickle" as I understand pickle superseded cPickle in 3.5. I get this execution error:
NameError: name 'cPickle' is not defined
Pertinent code:
import pickle
import gzip
...
def load_data():
    f = gzip.open('../data/mnist.pkl.gz', 'rb')
    training_data, validation_data, test_data = pickle.load(f, fix_imports=True)
    f.close()
    return (training_data, validation_data, test_data)
The error occurs in the pickle.load line when load_data() is called by another function. However, a) neither cPickle nor cpickle appears in any source file anywhere in the project (searched globally) and b) the error does not occur if I run the lines within load_data() individually in the Python shell (however, I do get another data format error). Is pickle calling cPickle, and if so how do I stop it?
Shell:
Python 3.5.0 |Anaconda 2.4.0 (x86_64)| (default, Oct 20 2015, 14:39:26)
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
IDE: IntelliJ 15.0.1, Python 3.5.0, anaconda
Unclear how to proceed. Any help appreciated. Thanks.
Actually, if you have pickled objects from python2.x, then they generally can be read by python3.x. Also, if you have pickled objects from python3.x, they generally can be read by python2.x, but only if they were dumped with a protocol set to 2 or less.
Python 2.7.10 (default, Sep 2 2015, 17:36:25)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> x = [1,2,3,4,5]
>>> import math
>>> y = math.sin
>>>
>>> import pickle
>>> f = open('foo.pik', 'w')
>>> pickle.dump(x, f)
>>> pickle.dump(y, f)
>>> f.close()
>>>
dude#hilbert>$ python3.5
Python 3.5.0 (default, Sep 15 2015, 23:57:10)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> with open('foo.pik', 'rb') as f:
...     x = pickle.load(f)
...     y = pickle.load(f)
...
>>> x
[1, 2, 3, 4, 5]
>>> y
<built-in function sin>
Also, if you are looking for cPickle, it's now _pickle, not pickle.
>>> import _pickle
>>> _pickle
<module '_pickle' from '/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload/_pickle.cpython-35m-darwin.so'>
>>>
You also asked how to stop pickle from using the built-in (C) version. You can do this by using _dump and _load, or the _Pickler class if you like to work with the class objects. Confused? The old cPickle is now _pickle; dump, load, dumps, and loads all point to _pickle, while _dump, _load, _dumps, and _loads point to the pure-Python version. For instance:
>>> import pickle
>>> # _dumps is a python function
>>> pickle._dumps
<function _dumps at 0x109c836a8>
>>> # dumps is a built-in (C)
>>> pickle.dumps
<built-in function dumps>
>>> # the Pickler points to _pickle (C)
>>> pickle.Pickler
<class '_pickle.Pickler'>
>>> # the _Pickler points to pickle (pure python)
>>> pickle._Pickler
<class 'pickle._Pickler'>
>>>
So if you don't want to use the built-in version, then you can use pickle._loads and the like.
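A minimal sketch of mixing the two implementations follows. The sample data is made up for illustration. Note that the underscore-prefixed names are private helpers of CPython's pickle module, so treat their availability as an assumption about your interpreter rather than a documented API.

```python
import pickle

data = {"x": [1, 2, 3, 4, 5], "name": "example"}  # hypothetical sample object

# Serialize with the pure-Python pickler, read back with the default loader.
payload_from_python = pickle._dumps(data)
restored_by_default = pickle.loads(payload_from_python)

# Serialize with the default pickler, read back with the pure-Python loader.
payload_from_default = pickle.dumps(data)
restored_by_python = pickle._loads(payload_from_default)

print(restored_by_default == data, restored_by_python == data)
```

Both directions should round-trip, because the two implementations read and write the same pickle protocols.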
It's looking like the pickled data that you're trying to load was generated by a version of the program that was running on Python 2.7. The data is what contains the references to cPickle.
The problem is that Pickle, as a serialization format, assumes that your standard library (and to a lesser extent your code) won't change layout between serialization and deserialization. Which it did -- a lot -- between Python 2 and 3. And when that happens, Pickle has no path for migration.
Do you have access to the program that generated mnist.pkl.gz? If so, port it to Python 3 and re-run it to regenerate a Python 3-compatible version of the file.
If not, you'll have to write a Python 2 program that loads that file and exports it to a format that can be loaded from Python 3 (depending on the shape of your data, JSON and CSV are popular choices), then write a Python 3 program that loads that format then dumps it as Python 3 pickle. You can then load that Pickle file from your original program.
Of course, what you should really do is stop at the point where you have the ability to load the exported format from Python 3, and use that format as your actual, long-term storage format.
Using Pickle for anything other than short-term serialization between trusted programs (loading Pickle is equivalent to running arbitrary code in your Python VM) is something you should actively avoid, among other things because of the exact case you find yourself in.
In Anaconda Python3.5 :
one can access cPickle as
import _pickle as cPickle
credits to Mike McKerns
This bypasses the technical issues, but there might be a py3 version of that file named mnist_py3k.pkl.gz. If so, try opening that file instead.
There is code on GitHub that does it: https://gist.github.com/rebeccabilbro/2c7bb4d1acfbcdcf9156e7b9b7577cba
I have tried it and it worked. You just need to specify the encoding, in this case 'latin1':
pickle.load(open('mnist.pkl','rb'), encoding = 'latin1')
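Why the encoding matters: Python 2 str objects are pickled as raw bytes, and Python 3 tries to decode them as ASCII by default. Here is a small self-contained sketch. The byte string below is a hand-written protocol-0 pickle of the Python 2 string 'caf\xe9', standing in for the real MNIST file, which I have not inspected.

```python
import pickle

# What Python 2's pickle.dumps('caf\xe9') produces with the default protocol 0.
py2_pickle = b"S'caf\\xe9'\np0\n."

# latin1 maps every byte 0-255 to a code point, so decoding cannot fail.
as_text = pickle.loads(py2_pickle, encoding="latin1")

# encoding="bytes" keeps the original Python 2 str as a bytes object.
as_bytes = pickle.loads(py2_pickle, encoding="bytes")

# The default encoding is ASCII, which rejects the byte 0xe9.
try:
    pickle.loads(py2_pickle)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

print(repr(as_text), as_bytes, default_failed)
```

For pickles that contain binary data stored in Python 2 strings, 'latin1' preserves the bytes one-to-one, which is presumably why it works for this data set.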
My understanding is that QString has been dropped in PySide. One can write a Python string into a QLineEdit, and when the QLineEdit is read, it is returned as a unicode string (16 bits per character).
Trying to write this string from my GUI process to a sub-process started using QProcess does not seem to work and just returns 0L (see below). If one changes the unicode string back to a Python string using the str() function, then self.myprocess.write(str(u'test')) returns 4L. This behaviour does not seem correct to me.
Would it be possible for someone to explain why QProcess.write() does not seem to work on unicode strings?
(Pdb) PySide.QtCore.QString()
*** AttributeError: 'module' object has no attribute 'QString'
(Pdb) self.myprocess.write(u'test')
0L
(Pdb) self.myprocess.write(str(u'test'))
4L
(Pdb)
PySide has never provided classes like QString, QStringList, QVariant, etc. It has always done implicit conversion to and from the equivalent python types - that is, in PyQt terminology, it only implements the v2 API (see PSEP 101 for more details).
However, the behaviour of QProcess when attempting to write unicode strings seems somewhat broken in PySide compared with PyQt4. Here's a simple test in PyQt4:
Python 2.7.8 (default, Sep 24 2014, 18:26:21)
[GCC 4.9.1 20140903 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from PyQt4 import QtCore
>>> QtCore.PYQT_VERSION_STR
'4.11.2'
>>> p = QtCore.QProcess()
>>> p.start('cat'); p.waitForStarted()
True
>>> p.write(u'fóó'); p.waitForReadyRead()
3L
True
>>> p.readAll()
PyQt4.QtCore.QByteArray('f\xf3\xf3')
So it seems that PyQt will implicitly encode unicode strings as 'latin-1' before passing them to QProcess.write() (which of course expects either const char * or a QByteArray). If you want a different encoding, it must be done explicitly:
>>> p.write(u'fóó'.encode('utf-8')); p.waitForReadyRead()
5L
True
>>> p.readAll()
PyQt4.QtCore.QByteArray('f\xc3\xb3\xc3\xb3')
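The byte counts returned by write() above (3L and 5L) can be reproduced without Qt. This is a small sketch of the two encodings applied to the same text, using plain Python string methods only:

```python
text = u"f\u00f3\u00f3"  # the same string as u'fóó', written with escapes

latin1_bytes = text.encode("latin-1")  # one byte per character here
utf8_bytes = text.encode("utf-8")      # each 'ó' takes two bytes

print(len(latin1_bytes), latin1_bytes)
print(len(utf8_bytes), utf8_bytes)
```

Encoding explicitly before calling write() makes the choice visible and avoids relying on whatever implicit conversion the binding does or does not perform.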
Now let's see what happens with PySide:
Python 2.7.8 (default, Sep 24 2014, 18:26:21)
[GCC 4.9.1 20140903 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from PySide import QtCore, __version__
>>> __version__
'1.2.2'
>>> p = QtCore.QProcess()
>>> p.start('cat'); p.waitForStarted()
True
>>> p.write(u'fóó'); p.waitForReadyRead()
0L
^C
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyboardInterrupt
So: no implicit encoding, and the process just blocks instead of raising an error (which would seem to be a bug). However, re-trying with explicit encoding works as expected:
>>> p.start('cat'); p.waitForStarted()
True
>>> p.write(u'fóó'.encode('utf-8')); p.waitForReadyRead()
5L
True
>>> p.readAll()
PySide.QtCore.QByteArray('fóó')
I'm trying to sample 1e7 items from 1e5 strings but getting a memory error. It's fine sampling 1e6 items from 1e4 strings. I'm on a 64bit machine with 4GB RAM and don't think I should be reaching any memory limit at 1e7. Any ideas?
$ python3
Python 3.3.3 (default, Nov 27 2013, 17:12:35)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> K = 100
Works fine with 1e6 :
>>> N = int(1e6)
>>> np.random.choice(["id%010d"%x for x in range(N//K)], N)
array(['id0000005473', 'id0000005694', 'id0000004115', ..., 'id0000006958',
'id0000009972', 'id0000003009'],
dtype='<U12')
Error with N=1e7 :
>>> N = int(1e7)
>>> np.random.choice(["id%010d"%x for x in range(N//K)], N)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "mtrand.pyx", line 1092, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:8229)
MemoryError
>>>
I found the question "Python not catching MemoryError", but it seems to be about catching an error like this rather than solving it.
I'd be happy with either a solution still using random.choice or a different method to do this. Thanks.
You can work round this using a generator function:
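def item():
    # Yield one random ID at a time instead of building the whole array.
    for _ in range(N):  # range, not xrange, on Python 3
        yield "id%010d" % np.random.randint(N // K)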
This avoids needing all the items in memory at once.
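For a rough sense of scale, here is a back-of-the-envelope sketch of the size of the result array alone, assuming the '<U12' dtype shown in the question (NumPy stores 4 bytes per character for unicode dtypes). The variable names are mine. It does not count any temporaries that choice may allocate internally, so it is a lower bound and not a diagnosis of the MemoryError.

```python
N = 10**7                               # number of samples requested
chars_per_id = len("id%010d" % 0)       # "id" plus 10 digits
bytes_per_char = 4                      # '<U' dtypes use UCS-4

# Final array of fixed-width unicode strings.
string_result_bytes = N * chars_per_id * bytes_per_char

# Alternative: keep only 64-bit integer indices and format IDs on demand.
index_result_bytes = N * 8

print(string_result_bytes / 1e6, "MB as '<U12' strings")
print(index_result_bytes / 1e6, "MB as int64 indices")
```

Storing integer indices, or generating IDs lazily as in the generator above, keeps the footprint well below that of the full string array.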