Pickle a GdkPixbuf.Pixbuf object in Python 3

I want to pickle and unpickle a GdkPixbuf.Pixbuf in Python 3. To be more specific, the multiprocessing package of Python 3 needs to do it, because I share such objects between processes via a Queue.
The problem is that the object changes from
<GdkPixbuf.Pixbuf object at 0x7f8b9e9cfb88 (GdkPixbuf at 0x563b61725c60)>
to
<GdkPixbuf.Pixbuf object at 0x7f8b9e9eaea0 (uninitialized at 0x(nil))>
Here is a minimal working example.
>>> import gi
>>> from gi.repository import GdkPixbuf
__main__:1: PyGIWarning: GdkPixbuf was imported without specifying a version first. Use gi.require_version('GdkPixbuf', '2.0') before import to ensure that the right version gets loaded.
>>> pf = GdkPixbuf.Pixbuf.new_from_file('_icon.png')
>>> pf
<GdkPixbuf.Pixbuf object at 0x7f8b9e9cfb88 (GdkPixbuf at 0x563b61725c60)>
>>> import pickle
>>> pickle.dump(pf, open('p', 'wb'))
>>> pb2 = pickle.load(open('p', 'rb'))
>>> pb2
<GdkPixbuf.Pixbuf object at 0x7f8b9e9eaea0 (uninitialized at 0x(nil))>
I see no other way than to pickle. The icon needs to be loaded in a separate process (on a different CPU core than the application's main/first process) and then transferred to the main process. This is done via a Queue, which pickles all data.

My solution is to hold the "icon" in memory not as a Pixbuf object but as the raw bytes read from the file.
After unpickling, these bytes are converted back into a Pixbuf.
>>> import gi
>>> from gi.repository import GdkPixbuf, Gio, GLib
__main__:1: PyGIWarning: GdkPixbuf was imported without specifying a version first. Use gi.require_version('GdkPixbuf', '2.0') before import to ensure that the right version gets loaded.
>>> with open('_icon.png', 'rb') as f:
...     icon_bytes = f.read()
...
>>>
>>> import pickle
>>> pickle.dump(icon_bytes, open('p', 'wb'))
>>>
>>> pb = pickle.load(open('p', 'rb'))
>>> pb = GLib.Bytes(pb)
>>> pb = GdkPixbuf.Pixbuf.new_from_stream(Gio.MemoryInputStream.new_from_bytes(pb))
>>> pb
<GdkPixbuf.Pixbuf object at 0x7fc0858ac5e8 (GdkPixbuf at 0x55e0d8d08b60)>
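For completeness, here is a minimal sketch (not part of the original post) of the same round trip when the bytes travel through a multiprocessing.Queue instead of a file on disk; the file name '_icon.png' and the worker/main split are assumptions:
import multiprocessing
import gi
gi.require_version('GdkPixbuf', '2.0')
from gi.repository import GdkPixbuf, Gio, GLib

def load_icon(queue, path):
    # Worker process: only the raw file bytes go onto the Queue; plain bytes pickle fine.
    with open(path, 'rb') as f:
        queue.put(f.read())

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    worker = multiprocessing.Process(target=load_icon, args=(queue, '_icon.png'))
    worker.start()
    raw = queue.get()  # unpickled transparently by the Queue
    worker.join()
    # Main process: rebuild the Pixbuf from the bytes, as shown above.
    stream = Gio.MemoryInputStream.new_from_bytes(GLib.Bytes(raw))
    pixbuf = GdkPixbuf.Pixbuf.new_from_stream(stream, None)
    print(pixbuf)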

Related

`pickle`: yet another `ImportError: No module named my_module`

I have a class MyClass defined in my_module. MyClass has a method pickle_myself which pickles the instance of the class in question:
def pickle_myself(self, pkl_file_path):
    with open(pkl_file_path, 'w+') as f:
        pkl.dump(self, f, protocol=2)
I have made sure that my_module is in PYTHONPATH. In the interpreter, executing __import__('my_module') works fine:
>>> __import__('my_module')
<module 'my_module' from 'A:\my_stuff\my_module.pyc'>
However, when eventually loading the file, I get:
File "A:\Anaconda\lib\pickle.py", line 1128, in find_class
__import__(module)
ImportError: No module named my_module
Some things I have made sure of:
I have not changed the location of my_module.py (Python pickling after changing a module's directory)
I have tried to use dill instead, but still get the same error (More on python ImportError No module named)
EDIT -- A toy example that reproduces the error:
The example itself is spread over a bunch of files.
First, we have the module ball (stored in a file called ball.py):
class Ball():
    def __init__(self, ball_radius):
        self.ball_radius = ball_radius

    def say_hello(self):
        print "Hi, I'm a ball with radius {}!".format(self.ball_radius)
Then, we have the module test_environment:
import os
import ball
#import dill as pkl
import pickle as pkl

class Environment():
    def __init__(self, store_dir, num_balls, default_ball_radius):
        self.store_dir = store_dir
        self.balls_in_environment = [ball.Ball(default_ball_radius) for x in range(num_balls)]

    def persist(self):
        pkl_file_path = os.path.join(self.store_dir, "test_stored_env.pkl")
        with open(pkl_file_path, 'w+') as f:
            pkl.dump(self, f, protocol=2)
Then, we have a module that has functions to make environments, persist them, and load them, called make_persist_load:
import os
import test_environment
#import pickle as pkl
import dill as pkl

def make_env_and_persist():
    cwd = os.getcwd()
    my_env = test_environment.Environment(cwd, 5, 5)
    my_env.persist()

def load_env(store_path):
    stored_env = None
    with open(store_path, 'rb') as pkl_f:
        stored_env = pkl.load(pkl_f)
    return stored_env
Then we have a script to put it all together, in test_serialization.py:
import os
import make_persist_load

MAKE_AND_PERSIST = True
LOAD = (not MAKE_AND_PERSIST)

cwd = os.getcwd()
store_path = os.path.join(cwd, "test_stored_env.pkl")

if MAKE_AND_PERSIST == True:
    make_persist_load.make_env_and_persist()

if LOAD == True:
    loaded_env = make_persist_load.load_env(store_path)
In order to make it easy to use this toy example, I have put it all up in a GitHub repository that simply needs to be cloned into your directory of choice. Please see the README containing instructions, which I also reproduce here:
Instructions:
1) Clone repository into a directory.
2) Add repository directory to PYTHONPATH.
3) Open up test_serialization.py, and set the variable MAKE_AND_PERSIST to True. Run the script in an interpreter.
4) Close the previous interpreter instance, and start up a new one. In test_serialization.py, change MAKE_AND_PERSIST to False, and this will programmatically set LOAD to True. Run the script in an interpreter, causing ImportError: No module named test_environment.
5) By default, the test is set to use dill, instead of pickle. In order to change this, go into test_environment.py and make_persist_load.py, to change imports as required.
EDIT: after switching to dill '0.2.5.dev0', dill.detect.trace(True) output
C2: test_environment.Environment
# C2
D2: <dict object at 0x000000000A9BDAE8>
C2: ball.Ball
# C2
D2: <dict object at 0x000000000AA25048>
# D2
D2: <dict object at 0x000000000AA25268>
# D2
D2: <dict object at 0x000000000A9BD598>
# D2
D2: <dict object at 0x000000000A9BD9D8>
# D2
D2: <dict object at 0x000000000A9B0BF8>
# D2
# D2
EDIT: the toy example works perfectly well when run on Mac/Ubuntu (i.e. Unix-like systems?). It only fails on Windows.
I can tell from your question that you are probably doing something like this, with a class method that attempts to pickle the instance of the class. It's ill-advised to do that; it's much saner to call pkl.dump external to the class instead (where pkl is pickle or dill, etc.). However, it can still work with this design; see below:
>>> class Thing(object):
...     def pickle_myself(self, pkl_file_path):
...         with open(pkl_file_path, 'w+') as f:
...             pkl.dump(self, f, protocol=2)
...
>>> import dill as pkl
>>>
>>> t = Thing()
>>> t.pickle_myself('foo.pkl')
Then restarting...
Python 2.7.10 (default, Sep 2 2015, 17:36:25)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> f = open('foo.pkl', 'r')
>>> t = dill.load(f)
>>> t
<__main__.Thing object at 0x1060ff410>
If you have a much more complicated class, which I'm sure you do, then you are likely to run into trouble, especially if that class uses another file that is sitting in the same directory.
>>> import dill
>>> from bar import Zap
>>> print dill.source.getsource(Zap)
class Zap(object):
    x = 1
    def __init__(self, y):
        self.y = y
>>>
>>> class Thing2(Zap):
...     def pickle_myself(self, pkl_file_path):
...         with open(pkl_file_path, 'w+') as f:
...             dill.dump(self, f, protocol=2)
...
>>> t = Thing2(2)
>>> t.pickle_myself('foo2.pkl')
Then restarting…
Python 2.7.10 (default, Sep 2 2015, 17:36:25)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> f = open('foo2.pkl', 'r')
>>> t = dill.load(f)
>>> t
<__main__.Thing2 object at 0x10eca8090>
>>> t.y
2
>>>
Well… shoot, that works too. You'll have to post your code, so we can see what pattern you are using that dill (and pickle) fails for. I know that having one module import another that is not "installed" (i.e. that lives in some local directory) and expecting the serialization to "just work" doesn't hold for all cases.
See dill issues:
https://github.com/uqfoundation/dill/issues/128
https://github.com/uqfoundation/dill/issues/129
and this SO question:
Why dill dumps external classes by reference, no matter what?
for some examples of failure and potential workarounds.
EDIT with regard to updated question:
I don't see your issue. Running from the command line, importing from the interpreter (import test_serialization), and running the script in the interpreter (as below, and indicated in your steps 3-5) all work. That leads me to think you might be using an older version of dill?
>>> import os
>>> import make_persist_load
>>>
>>> MAKE_AND_PERSIST = False #True
>>> LOAD = (not MAKE_AND_PERSIST)
>>>
>>> cwd = os.getcwd()
>>> store_path = os.path.join(cwd, "test_stored_env.pkl")
>>>
>>> if MAKE_AND_PERSIST == True:
...     make_persist_load.make_env_and_persist()
...
>>> if LOAD == True:
...     loaded_env = make_persist_load.load_env(store_path)
...
>>>
EDIT based on discussion in comments:
Looks like it's probably an issue with Windows, as that seems to be the only OS the error appears.
EDIT after some work (see: https://github.com/uqfoundation/dill/issues/140):
Using this minimal example, I can reproduce the same error on Windows, while on MacOSX it still works…
# test.py
class Environment():
    def __init__(self):
        pass
and
# doit.py
import test
import dill
env = test.Environment()
path = "test.pkl"
with open(path, 'w+') as f:
    dill.dump(env, f)
with open(path, 'rb') as _f:
    _env = dill.load(_f)
print _env
However, if you use open(path, 'r') as _f, it works on both Windows and MacOSX. So it looks like the __import__ on Windows is more sensitive to file type than on non-Windows systems. Still, throwing an ImportError is weird… but this one small change should make it work.
In case someone is having the same problem: I had the same problem running Python 2.7, and the cause was that the pickle file had been created on Windows while I was running Linux. What I had to do was run dos2unix, which has to be installed first using
sudo yum install dos2unix
And then you need to convert the pickle file, for example:
dos2unix data.p

Python Module issues

I'm pretty new to Python, and have been having a rough time learning it. I have a main file
import Tests
from Tests import CashAMDSale
CashAMDSale.AMDSale()
and CashAMDSale
import pyodbc
import DataFunctions
import automa
from DataFunctions import *
from automa.api import *
def AMDSale():
    AMDInstance = DataFunctions.GetValidAMD()
And here is GetValidAMD
import pyodbc
def ValidAMD(GetValidAMD):
    (short method talking to a database)
My error comes up on the line that has AMDInstance = DataFunctions.GetValidAMD()
I get the error AttributeError: 'module' object has no attribute 'GetValidAMD'
I have looked and looked for an answer, and nothing has worked. Any ideas? Thanks!
DataFunctions is a folder, which means it is a package, and it must contain an __init__.py file for Python to recognise it as such.
When you import * from a package, you do not automatically import all of its modules. This is documented in the docs.
So for your code to work, you either need to explicitly import the modules you need:
import DataFunctions.GetValidAMD
or you need to add the following to the __init__.py of DataFunctions:
__all__ = ["GetValidAMD"]
Then you can import * from the package, and everything listed in __all__ will be imported.
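As a small sketch (not the asker's actual files), the __init__.py variant could look like this; note that after from DataFunctions import *, GetValidAMD is still a module, so the function inside it must be called through it:
# DataFunctions/__init__.py
__all__ = ["GetValidAMD", "CashAMDSale"]

# main.py
from DataFunctions import *  # imports the submodules listed in __all__
amd = GetValidAMD.ValidAMD(None)  # placeholder argument; GetValidAMD is a module, not a function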
When you create the file foo.py, you create a python module. When you do import foo, Python evaluates that file and places any variables, functions and classes it defines into a module object, which it assigns to the name foo.
# foo.py
x = 1
def foo():
    print 'foo'
 
>>> import foo
>>> type(foo)
<type 'module'>
>>> foo.x
1
>>> foo.foo()
foo
When you create the directory bar with an __init__.py file, you create a python package. When you do import bar, Python evaluates the __init__.py file and places any variables, functions and classes it defines into a module object, which it assigns to the name bar.
# bar/__init__.py
y = 2
def bar():
    print 'bar'
 
>>> import bar
>>> type(bar)
<type 'module'>
>>> bar.y
2
>>> bar.bar()
bar
When you create Python modules inside a Python package (that is, files ending with .py inside a directory containing __init__.py), you must import these modules via this package:
>>> # place foo.py in bar/
>>> import foo
Traceback (most recent call last):
...
ImportError: No module named foo
>>> import bar.foo
>>> bar.foo.x
1
>>> bar.foo.foo()
foo
Now, assuming your project structure is:
main.py
DataFunctions/
    __init__.py
    CashAMDSale.py
        def AMDSale(): ...
    GetValidAMD.py
        def ValidAMD(GetValidAMD): ...
your main script can import DataFunctions.CashAMDSale and use DataFunctions.CashAMDSale.AMDSale(), and import DataFunctions.GetValidAMD and use DataFunctions.GetValidAMD.ValidAMD().
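Spelled out as a sketch (module and function names taken from the question; the ValidAMD argument is a placeholder since its real signature isn't shown):
# main.py
import DataFunctions.CashAMDSale
import DataFunctions.GetValidAMD

DataFunctions.CashAMDSale.AMDSale()
amd = DataFunctions.GetValidAMD.ValidAMD(None)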
It's the same problem. You are importing DataFunctions, which is a module. I expect there to be a class called DataFunctions in that module, which needs to be imported with
from DataFunctions import DataFunctions
...
AMDInstance = DataFunctions.GetValidAMD()

How can I get module "remote_tag" in Python?

There is an example on page 203 of Python Text Processing with NLTK 2.0 Cookbook that imports the module remote_tag. But I didn't find any site where the module can be downloaded. How can I get the module remote_tag in Python?
>>> import execnet, remote_tag, nltk.tag, nltk.data
>>> from nltk.corpus import treebank
>>> import cPickle as pickle
>>> tagger = pickle.dumps(nltk.data.load(nltk.tag._POS_TAGGER))
>>> gw = execnet.makegateway()
>>> channel = gw.remote_exec(remote_tag)
>>> channel.send(tagger)
>>> channel.send(treebank.sents()[0])
>>> tagged_sentence = channel.receive()
>>> tagged_sentence == treebank.tagged_sents()[0]
True
>>> gw.exit()
Create your own remote_tag.py module from the following five lines of code (from page 204 of Python Text Processing with NLTK 2.0 Cookbook) and place it in the same directory as the program that imports it.
import cPickle as pickle
if __name__ == '__channelexec__':
    tagger = pickle.loads(channel.receive())
    for sentence in channel:
        channel.send(tagger.tag(sentence))

Python finding stdin filepath on Linux

How can I tell which file (or tty) is attached to my standard I/O streams?
Something like:
>>> import sys
>>> print sys.stdin.__path__
'/dev/tty1'
>>>
I could look in proc:
import os, sys
os.readlink('/proc/self/fd/%s' % sys.stdin.fileno())
But it seems like there should be a built-in way?
The sys.std* objects are standard Python file objects, so they have a name attribute and an isatty method:
>>> import sys
>>> sys.stdout.name
'<stdout>'
>>> sys.stdout.isatty()
True
>>> anotherfile = open('/etc/hosts', 'r')
>>> anotherfile.name
'/etc/hosts'
>>> anotherfile.isatty()
False
Short of telling you exactly what TTY device you got, that's the extent of the API offered by Python.
Got it!
>>> import os
>>> import sys
>>> print os.ttyname(sys.stdin.fileno())
'/dev/pts/0'
>>>
It raises OSError: [Errno 22] Invalid argument if stdin isn't a TTY, but that's easy enough to test for with isatty().
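Putting the two together, a small sketch (Python 3 syntax) that falls back to the /proc lookup from the question when stdin is not a terminal:
import os
import sys

if sys.stdin.isatty():
    # stdin is a terminal: os.ttyname() gives the device path, e.g. /dev/pts/0
    print(os.ttyname(sys.stdin.fileno()))
else:
    # stdin is a pipe or a regular file: fall back to /proc (Linux only)
    print(os.readlink('/proc/self/fd/%d' % sys.stdin.fileno()))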

How to pass file descriptors from parent to child in python?

I am using the multiprocessing module and using pools to start multiple workers. But the file descriptors which are opened in the parent process are closed in the worker processes. I want them to stay open! Is there any way to pass file descriptors so they are shared between parent and children?
On Python 2 and Python 3, functions for sending and receiving file descriptors exist in the multiprocessing.reduction module.
Example code (Python 2 and Python 3):
import multiprocessing
import os

# Before fork
child_pipe, parent_pipe = multiprocessing.Pipe(duplex=True)
child_pid = os.fork()

if child_pid:
    # Inside parent process
    import multiprocessing.reduction
    import socket
    # socket_to_pass is the socket object we want to pass to the child
    socket_to_pass = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    socket_to_pass.connect("/dev/log")
    # The child_pid argument to send_handle() can be arbitrary on Unix;
    # on Windows it has to be the child's PID
    multiprocessing.reduction.send_handle(parent_pipe, socket_to_pass.fileno(), child_pid)
    socket_to_pass.send("hello from the parent process\n".encode())
else:
    # Inside child process
    import multiprocessing.reduction
    import socket
    import os
    fd = multiprocessing.reduction.recv_handle(child_pipe)
    # Rebuild the socket object from fd, matching the original socket's family and type
    received_socket = socket.fromfd(fd, socket.AF_UNIX, socket.SOCK_DGRAM)
    # socket.fromfd() duplicates fd, so we can close the received one
    os.close(fd)
    # And now you can communicate using the received socket
    received_socket.send("hello from the child process\n".encode())
There is also a fork of multiprocessing called multiprocess, which replaces pickle with dill. dill can pickle file descriptors, and thus multiprocess can easily pass them between processes.
>>> f = open('test.txt', 'w')
>>> _ = map(f.write, 'hello world')
>>> f.close()
>>> import multiprocess as mp
>>> p = mp.Pool()
>>> f = open('test.txt', 'r')
>>> p.apply(lambda x:x, f)
'hello world'
>>> f.read()
'hello world'
>>> f.close()
multiprocessing itself has helper functions in multiprocessing.reduction for transferring file descriptors between processes: send_handle and recv_handle. They work on Windows and on Unix platforms that support sending file descriptors over Unix domain sockets. These are not documented, but they are in the module's __all__, so it may be safe to assume they are part of the public API. From the source it looks like they have been available since at least Python 2.6 and 3.3 (a short usage sketch follows the parameter list below).
All platforms have the same interface:
send_handle(conn, handle, destination_pid)
recv_handle(conn)
Where:
conn (multiprocessing.Connection): connection over which to send the file descriptor
handle (int): integer referring to file descriptor/handle
destination_pid (int): integer pid of the process that is receiving the file descriptor - this is currently only used on Windows
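For illustration, a minimal Unix-oriented sketch (not from the answer above) that passes the descriptor of an ordinary file from the parent to a child process; the file name test.txt is an assumption:
import multiprocessing
import os
from multiprocessing import reduction

def child(conn):
    # Receive the descriptor over the connection and wrap it in a file object.
    fd = reduction.recv_handle(conn)
    with os.fdopen(fd, 'r') as f:
        print('child read:', f.read())

if __name__ == '__main__':
    # The default duplex Pipe is backed by a Unix domain socket on Unix,
    # which is what send_handle() needs in order to pass descriptors.
    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=child, args=(child_conn,))
    p.start()
    with open('test.txt', 'r') as f:  # assumes test.txt already exists
        reduction.send_handle(parent_conn, f.fileno(), p.pid)
    p.join()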
There isn't a way that I know of to share file descriptors between processes.
If a way exists, it is most likely OS specific.
My guess is that you need to share data on another level.
