Is embedded Python Py_Finalize() blocking?

I'm seeing intermittent crashes when I run large embedded Python programs. My question is: does the Py_Finalize() call block until the Python interpreter is in a safe state before returning? If it doesn't, how do I know when the interpreter has destroyed everything?
My current code looks like this:
Py_Initialize();
...
...
Py_Finalize(); // Unsure if this returns immediately or returns after completing all Finalizing actions

I don't think this totally answers the question I originally asked, but I have found a way to make the garbage collector do a better job when I call Py_Finalize(): stop using static class variables in Python.
Old code:
class MyClass(object):
    a = {}
    def __init__(self):
        ...
        ...
New code:
class MyClass(object):
    def __init__(self):
        self.a = {}
        ...
        ...

If I'm right, calling Py_Finalize() tears down the Python interpreter synchronously: the call does not return until finalization is complete (with the exceptions noted in [1]).
I would suggest creating a class around the Python interpreter and manually checking that all your tasks are finished before calling Py_Finalize(). In the projects where I have worked with the embedded Python interpreter, this approach suited best.
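A minimal sketch of that idea, assuming an application-specific completion check (the class and waitForTasks() are illustrative names, not any real API):

#include <Python.h>

class PythonInterpreter {
public:
    PythonInterpreter() { Py_Initialize(); }
    ~PythonInterpreter() {
        // Drain the application's own Python work first
        // (join worker threads, empty task queues, ...).
        waitForTasks();
        // Py_Finalize() is an ordinary synchronous call: it only
        // returns after the interpreter has been torn down.
        Py_Finalize();
    }
private:
    void waitForTasks() { /* application-specific */ }
};

Tying the interpreter's lifetime to a single object like this also makes it harder to accidentally call Py_Initialize()/Py_Finalize() more than once.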
Hope it helps!
[1] Python Doc: https://docs.python.org/2/c-api/init.html
== EDIT ==
For Py_Finalize():
Bugs and caveats: The destruction of modules and objects in modules is done in random order; this may cause destructors (__del__() methods) to fail when they depend on other objects (even functions) or modules. Dynamically loaded extension modules loaded by Python are not unloaded. Small amounts of memory allocated by the Python interpreter may not be freed (if you find a leak, please report it). Memory tied up in circular references between objects is not freed. Some memory allocated by extension modules may not be freed. Some extensions may not work properly if their initialization routine is called more than once; this can happen if an application calls Py_Initialize and Py_Finalize more than once.
It seems that if your program calls Py_Initialize() and Py_Finalize() more than once, you might run into trouble (which I never did) and leak some memory. However, if you initialize the Python interpreter only once and perform tasks while your main program is running (the approach I'm more familiar with), you won't have much trouble.

Related

How to catch runtime errors from native code in python?

I have the following problem. Let's say I have this Python function:
def func():
    # run some code here which calls some native code
Inside func() I am calling some functions which in turn call some native C code. If any crash happens, the whole Python process crashes altogether. How is it possible to catch and recover from such errors?
One way that came to my mind is to run this function in a separate process. But not just by starting another process: the function uses a lot of memory and objects, and it would be very hard to split that out. Is there something like fork() in C available in Python, to create a copy of the exact same process with the same memory structures and so on?
Or maybe other ideas?
Update:
It seems that there is no real way of catching C runtime errors in Python; they happen at a lower level and crash the whole Python virtual machine.
As solutions you currently have two options:
Use os.fork(), which works only in Unix-like OS environments.
Use multiprocessing and a shared-memory model to share big objects between processes. Usual serialization just won't work with objects that occupy multiple gigabytes of memory (you will simply run out of memory). However, there is a very good Python library called Ray (https://docs.ray.io/en/master/) that performs in-memory serialization of big objects using a shared-memory model; it is ideal for BigData/ML workloads and highly recommended.
As long as you are running on an operating system that supports fork, that's already how the multiprocessing module creates subprocesses. You can use os.fork(), multiprocessing.Process, or multiprocessing.Pool to get what you want.
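For reference, here is the isolation idea sketched at the C level; risky_native_call() is a hypothetical stand-in for the crashing native code, and the pattern is what multiprocessing automates for you: run the dangerous call in a forked child, and let the parent detect abnormal exits instead of dying itself.

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for native code that may segfault.
void risky_native_call() { /* ... */ }

bool run_isolated() {
    pid_t pid = fork();           // child gets a copy-on-write image of the parent
    if (pid == 0) {
        risky_native_call();      // a crash here kills only the child
        _exit(0);                 // clean exit if nothing went wrong
    }
    int status = 0;
    waitpid(pid, &status, 0);     // parent waits and inspects the child's fate
    // WIFSIGNALED(status) is true if the child died from a signal (e.g. SIGSEGV)
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}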

Is it thread safe to modify a static variable?

Since C++11, static variable initialization is guaranteed to be thread-safe. But what about modifying a static variable from multiple threads, like below?
static int initialized = 0;

void Initialize()
{
    if (initialized)
        return;
    initialized = 1; // Is this thread safe?
}
The reason I ask this question is that I am reading the source code for Py_Initialize(). I am trying to embed Python in a multithreaded C++ application, and I am wondering whether it is safe to call Py_Initialize() multiple times from several threads. The implementation of Py_Initialize() boils down to the function _Py_InitializeEx_Private, which looks like this:
// pylifecycle.c
static int initialized = 0;

void
_Py_InitializeEx_Private(int install_sigs, int install_importlib)
{
    if (initialized)
        return;
    initialized = 1;
    // a bunch of other stuff
}
And is the conclusion for C the same as C++?
EDIT
So all the answers are good; I chose the one which cleared things up for me the most.
No, static in this context is only about the storage duration (see http://en.cppreference.com/w/c/language/static_storage_duration).
The variable has no extra thread safety at all over some other variable.
Try using std::call_once for this (see http://en.cppreference.com/w/cpp/thread/call_once); a sketch follows.
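A minimal sketch of the std::call_once approach, assuming C++11 and that nothing else initializes the interpreter:

#include <Python.h>
#include <mutex>

static std::once_flag py_once;

void init_python() {
    // call_once runs the callable exactly once; threads that arrive
    // concurrently block until the first caller has finished.
    std::call_once(py_once, [] { Py_Initialize(); });
}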
It's not thread safe to modify a static variable, but initializing a static variable is thread safe. So you can do:
void my_py_init() {
    static bool x = (Py_Initialize(), true);
}
That's it. You can now call my_py_init from as many threads as you want and Py_Initialize will only ever get called once.
Py_Initialize is not thread-safe. You can call it from multiple threads only if you know that the Python interpreter has already been initialized, but if you can prove that, it would be silly to call the function at all.
Indeed, most Python C-API calls are not thread-safe; you need to acquire the Global Interpreter Lock (GIL) in order to interact with the Python interpreter. (See the Python C-API docs for more details. Read it carefully.)
However, as far as I know you cannot use the standard API to acquire the GIL until the interpreter has been initialized. So if you have multiple threads, any of which might initialize the same Python interpreter, you would need to protect the calls to Py_Initialize with your own mutex. You might well be better off doing the initialization once before you start up any threads, if that is possible with your program logic.
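A sketch of that mutex idea, assuming plain C++11 (the names are illustrative):

#include <Python.h>
#include <mutex>

static std::mutex init_mutex;
static bool interpreter_up = false;

void ensure_python() {
    // Serialize the check-and-initialize so two threads can't both see
    // "not initialized" and race into Py_Initialize().
    std::lock_guard<std::mutex> guard(init_mutex);
    if (!interpreter_up) {
        Py_Initialize();
        interpreter_up = true;
    }
}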
The code you cite:
static int initialized = 0;

void Initialize_If_Necessary()
{
    if (initialized)
        return;
    initialized = 1;
    // Do the initialization only once
}
is clearly not threadsafe in any language, even if initialized is an atomic type. Suppose two threads were simultaneously executing this code before any initialization happened: both of them see initialized as false, so both of them proceed with the initialization. (If you don't have two cores, you could imagine that the first thread is task-switched between the test of initialized and the assignment.)
Modifying a static variable from multiple threads is not safe: the variable's value may be cached in a register, so each core can end up working with its own stale copy rather than the value another thread just wrote.
The first code sample is the typical starting point for what is referred to as 'lazy-initialisation'. It's useful for guaranteeing once-only initialisation of "expensive objects"; but doing so only if needed just before any use of the object.
That specific example doesn't have any serious problems, but it's an oversimplification. And when you look more holistically at lazy-initialisation, you'll see that multi-threaded lazy-initialisation is not a good idea.
The concept of "Thread Safety" goes way beyond just a single variable (static or otherwise). You need to step back and consider things happening to the same1 resources (memory, objects, files, ...) at the same time.
1: Different instances of the same class are not the same thing; but their static members are.
Consider the following extract from your second example.
if (initialized)
    return;
initialized = 1;
// a bunch of other stuff
In the first 3 lines, there's no serious harm if multiple threads execute that code approximately concurrently. Some threads might return early; others might be a little "too quick" and all perform the task of setting initialized = 1;. However, that wouldn't be a concern, since no matter how many threads set the shared variable, the net effect is always the same.
The problem comes in with the fourth line. The one almost nonchalantly brushed aside as "a bunch of other stuff". That "other stuff" is the really critical code, because if it's possible for initialized = 1; to be called multiple times, you need to consider the impact of calling "other stuff" multiple times and concurrently.
Now, in the unlikely event you satisfy yourself that "other stuff" can be called multiple times, there's another concern...
Consider the client code that might be using Python.
Py_Initialize();
//use Python
If two threads call the above simultaneously, with one "returning early" and the other actually performing the initialisation, then the early-returning thread would start (or try to start) using Python before it is fully initialised!
As a bit of a hack, you might try blocking at the if (initialized) line for the duration of the initialisation process. But this is undesirable for 2 reasons:
Multiple threads are likely to be stuck waiting in the early stages of their processing.
Even after initialisation is complete you'd have a small (but totally wasteful) overhead of checking the lock each time you 'lazy-initialise' the Python framework.
Conclusion
Lazy-initialisation has its uses. But you're much better off not trying to perform the lazy initialisation from multiple threads. Rather have a "safe thread" (main thread is usually good enough) that can perform the lazy-initialisation before even creating any threads that would try to use whatever has been initialised. Then you won't have to worry about the thread-safety at all.
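A minimal sketch of that arrangement, assuming C++11 threads; the Python snippet run by the worker is just a placeholder:

#include <Python.h>
#include <thread>

int main() {
    Py_Initialize();                            // eager init on the main thread
    PyThreadState *state = PyEval_SaveThread(); // release the GIL for the workers

    std::thread worker([] {
        // Safe: the interpreter is already up, so the worker only needs the GIL.
        PyGILState_STATE g = PyGILState_Ensure();
        PyRun_SimpleString("print('hello from a worker thread')");
        PyGILState_Release(g);
    });
    worker.join();

    PyEval_RestoreThread(state);                // take the GIL back before shutdown
    Py_Finalize();
    return 0;
}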

How to workaround a memory leak in a vendor's DLL being used in Python?

I'm using a vendor's C API for a piece of business software by loading their library using Python's ctypes module.
After deploying the software I wrote, I found that the vendor's library leaks memory on a consistent and predictable basis, according to the number of calls of a certain function that's part of their API.
I even duplicated the leak in a C program that uses no heap allocations.
I contacted the vendor about the issue, and they said they're working on it, but I probably can't realistically expect a fix until the next version of the software.
I had the idea of reloading the vendor's dll after a certain threshold of calls to the leaking function, but this did not release the leaked memory.
I found that I could force the library to unload like so:
_ctypes.FreeLibrary(vendor_dll._handle)
This frees the memory, but causes the interpreter to crash seemingly randomly after a number of minutes of using the vendor's API.
I found this issue in the Python bug tracker that describes my situation:
https://bugs.python.org/issue14597
It seems that if there's still an open reference to the library, forcing it to unload will inevitably crash the Python interpreter.
Worst case scenario, I'm thinking I could load the vendor's library in a separate process, proxy requests using a multiprocessing Queue, and setup a watchdog to recreate the process if the interpreter dies.
Is there a better way to work around this?
In the end, I fixed the issue by loading the vendor's library in a separate process and accessing it through Pyro4, like so:
import ctypes
import multiprocessing

import Pyro4


class LibraryWorker(multiprocessing.Process):
    def __init__(self):
        super().__init__()

    def run(self):
        # Load the DLL inside the child process so its leaks stay isolated.
        self.library = ctypes.windll.LoadLibrary('vendor_library.dll')
        # Expose this object over Pyro4 without a name server.
        Pyro4.Daemon.serveSimple({self: 'library'}, ns=False)

    def lib_func(self):
        res = self.library.func()
        return res
It was a bit of extra work to massage the old code to not pass ctypes pointers between the two processes, but it works.
With the library loaded in a separate process, I can keep track of the memory usage. When it gets too high, I can terminate and recreate the process to free the memory.

Python C API from C++ app - know when to lock

I am trying to write a C++ class that calls Python methods of a class that does some I/O operations (file, stdout). The problem I have run into is that my class is called from different threads: sometimes the main thread, sometimes others. Obviously I tried to apply the usual approach for Python calls in multi-threaded native applications. Basically everything starts from PyEval_AcquireLock and PyEval_ReleaseLock, or just global locks. According to the documentation, when a thread is already locked a deadlock ensues, and that is what I get when my class is called from the main thread or another one that blocks Python execution.
Python> Cfunc1(): a C++ function that creates threads internally, which in turn leads to calls into my class.
It gets stuck on PyEval_AcquireLock; obviously Python is already locked, i.e. it is waiting for the C++ Cfunc1 call to complete. It completes fine if I omit those locks. It also completes fine when the Python interpreter is ready for the next user command, i.e. when a thread is calling functions in the background rather than from inside a native call.
I am looking for a workaround. I need to distinguish whether or not taking the global lock is allowed, i.e. whether Python is not already locked and is ready to receive the next command... I tried PyGILState_Ensure; unfortunately I see a hang.
Any known API or solution for this ?
(Python 2.4)
Unless you have wrapped your C++ code quite peculiarly, when any Python thread calls into your C++ code, the GIL is held. You may release it in your C++ code (if you want to do some consuming task that doesn't require any Python interaction), and then will have to acquire it again when you want to do any Python interaction -- see the docs: if you're just using the good old C API, there are macros for that, and the recommended idiom is
Py_BEGIN_ALLOW_THREADS
...Do some blocking I/O operation...
Py_END_ALLOW_THREADS
the docs explain:
The Py_BEGIN_ALLOW_THREADS macro opens a new block and declares a hidden local variable; the Py_END_ALLOW_THREADS macro closes the block. Another advantage of using these two macros is that when Python is compiled without thread support, they are defined empty, thus saving the thread state and GIL manipulations.
So you just don't have to acquire the GIL (and shouldn't) until after you've explicitly released it (ideally with that macro) and need to interact with Python in any way again. (Where the docs say "some blocking I/O operation", it could actually be any long-running operation with no Python interaction whatsoever).
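The complementary pattern, for a thread that needs to call back into Python, is PyGILState_Ensure/PyGILState_Release (available since Python 2.3); handle_event() here is a hypothetical Python-side handler:

#include <Python.h>

// Called from a non-Python thread. PyGILState_Ensure acquires the GIL
// (creating a thread state if this thread has never touched Python);
// it can only deadlock if another thread holds the GIL and never
// releases it, which is exactly why long-running C++ work should be
// wrapped in the Py_BEGIN/END_ALLOW_THREADS macros shown above.
void on_native_event() {
    PyGILState_STATE gstate = PyGILState_Ensure();
    PyRun_SimpleString("handle_event()");  // hypothetical handler
    PyGILState_Release(gstate);
}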

Are Python extensions produced by Cython/Pyrex threadsafe?

If not, is there a way I can guarantee thread safety by programming a certain way?
To clarify, when talking about "threadsafe", I mean Python threads, not OS-level threads.
It all depends on the interaction between your Cython code and Python's GIL, as documented in detail here. If you don't do anything special, Cython-generated code will respect the GIL (as will a C-coded extension that doesn't use the GIL-releasing macros); that makes such code "as threadsafe as Python code" -- which isn't much, but is easier to handle than completely free-threading code (you still need to architect multi-threaded cooperation and synchronization, ideally with Queue instances but possibly with locking &c).
Code that has relinquished the GIL and not yet acquired it back MUST NOT in any way interact with the Python runtime and the objects that the Python runtime uses -- this goes for Cython just as well as for C-coded extensions. The upside of it is of course that such code can run on a separate core (until it needs to sync up or in any way communicate with the Python runtime again, of course).
Python's global interpreter lock means that only one thread can be active in the interpreter at any one time. However, once control is passed out to a C extension, another thread can become active within the interpreter. Multiple threads can be created, and nothing prevents a thread from being interrupted in the middle of a critical section.
Non-thread-safe code can be implemented within the interpreter, so nothing about code running within the interpreter is inherently thread-safe. Code in C or Pyrex modules can still modify data structures that are visible to Python code. Native code can, of course, also have threading issues with native data structures.
You can't guarantee thread safety beyond using appropriate design and synchronisation - the GIL on the python interpreter doesn't materially change this.
