When is PyEval_InitThreads meant to be called? [duplicate]

This question already has answers here:
PyEval_InitThreads in Python 3: How/when to call it? (the saga continues ad nauseam)
I'm a bit confused about when I'm supposed to call PyEval_InitThreads. In general, I understand that PyEval_InitThreads must be called whenever a non-Python thread (i.e. a thread that is spawned within an extension module) is used.
However, I'm not sure whether PyEval_InitThreads is meant for C programs that embed the Python interpreter, for Python programs that import C extension modules, or for both.
So, if I write a C extension module that will internally launch a thread, do I need to call PyEval_InitThreads when initializing the module?
Also, PyEval_InitThreads implicitly acquires the Global Interpreter Lock, so after calling it the GIL presumably has to be released or a deadlock will ensue. How do you release the lock? After reading the documentation, PyEval_ReleaseLock() appears to be the way to release the GIL. However, in practice, if I use the following code in a C extension module:
PyEval_InitThreads();
PyEval_ReleaseLock();
...then at runtime Python aborts with:
Fatal Python error: drop_gil: GIL is not locked
So how do you release the GIL after acquiring it with PyEval_InitThreads?

Most applications never need to know about PyEval_InitThreads() at all.
The only time you should use it is if your embedding application or extension module will be making Python C API calls from threads that it spawned itself, outside of Python.
Don't call PyEval_ReleaseLock() in a thread that will later make Python C API calls (unless you re-acquire the lock before those calls). For temporarily giving up the GIL in such a thread, you should really use the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros instead.
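To make that concrete, here is a minimal sketch (the module and function names spam, start_worker and worker are invented for the example) of a C extension that spawns its own pthread and calls back into Python from it. PyEval_InitThreads() is called once at module initialisation (it is unnecessary on Python 3.7+ and deprecated since 3.9), the spawned thread brackets its C API calls with PyGILState_Ensure()/PyGILState_Release(), and the blocking join is wrapped in the ALLOW_THREADS macros rather than PyEval_ReleaseLock():

#include <Python.h>
#include <pthread.h>

/* Runs in a thread Python knows nothing about, so it must acquire
   the GIL itself before touching any Python C API. */
static void *worker(void *arg)
{
    PyGILState_STATE gstate = PyGILState_Ensure();   /* take the GIL */
    PyRun_SimpleString("print('hello from a C-spawned thread')");
    PyGILState_Release(gstate);                      /* give it back */
    return NULL;
}

static PyObject *spam_start_worker(PyObject *self, PyObject *args)
{
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);

    /* Wait for the worker without holding the GIL, so the worker's
       PyGILState_Ensure() (and any other Python threads) can proceed. */
    Py_BEGIN_ALLOW_THREADS
    pthread_join(tid, NULL);
    Py_END_ALLOW_THREADS

    Py_RETURN_NONE;
}

static PyMethodDef SpamMethods[] = {
    {"start_worker", spam_start_worker, METH_NOARGS, "Spawn a C thread."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef spammodule = {
    PyModuleDef_HEAD_INIT, "spam", NULL, -1, SpamMethods
};

PyMODINIT_FUNC PyInit_spam(void)
{
    PyEval_InitThreads();   /* no-op on 3.7+, required on older versions */
    return PyModule_Create(&spammodule);
}

The point of the sketch is that the extension never calls the low-level lock functions directly: the foreign thread uses the PyGILState API, and the Python-visible function only drops the GIL inside the paired BEGIN/END macros.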

Related

Does holding CPython's GIL guarantee that all CPython threads stop?

CPython is a multi-threaded application, and as such on Unix it uses (p)threads. Python extensions (written in C, say) often need to hold the GIL to make sure Python objects don't get corrupted in critical sections of the code. How about other types of data? Specifically, does holding the GIL in a Python extension guarantee that all other threads of CPython stop?
The reason for asking is that I am trying to port to FreeBSD a Python extension (which works on Linux and OSX) that embeds a Lisp compiler/system, ECL, using the Boehm GC, and which crashes during the initialisation of the embedded Boehm GC. Backtraces suggest that another thread kicks in and causes havoc (the pthread implementation on Linux is sufficiently different from FreeBSD's to expect trouble along these lines, too). Is there another mutex in CPython that may be used to achieve the locking?
Specifically, does holding the GIL in a Python extension guarantee that all other threads of CPython stop?
The short answer is no: if other threads are executing code without holding the GIL (e.g. if they're running a C extension that has released the GIL), then they will keep running until they try to re-acquire the GIL (usually when they try to return data to the Python world).
It's also possible that core parts of CPython (the core interpreter and/or built-in functions and packages) release the GIL in similar circumstances, and for similar reasons, as you would in an extension. I have no idea if they actually do, though.
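As a rough illustration of the first part (some_long_c_call() here is just a stand-in for whatever blocking C work the extension does), a thread executing an extension function like the following keeps running no matter who holds the GIL, because it has dropped the GIL itself for the duration of the call:

static PyObject *do_blocking_work(PyObject *self, PyObject *args)
{
    long result;
    Py_BEGIN_ALLOW_THREADS           /* GIL released here ...            */
    result = some_long_c_call();     /* ... so this keeps running even
                                        while another thread holds it    */
    Py_END_ALLOW_THREADS             /* GIL re-acquired before touching
                                        Python objects again             */
    return PyLong_FromLong(result);
}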

How to solve ctypes "while running" frozen

I always use ctypes to call C++ functions from Python. Currently I have designed a UI with PyQt5 to control a piece of hardware, and it calls functions in DLL files to control the device.
However, every time I run one of the C functions, the Python interface freezes. What I want is to run the C++ function in the DLL file while still being able to output some information to the Python user interface (like a printing process: while the printing runs in the DLL, I need to know how far the nozzle has moved).
I wonder if there is any solution to this problem.

Multithreading in C++ with embedded python modules

I'm trying to create a multi-threaded program by launching a boost thread which calls a function that in turn calls some Python module, but the program hangs there: it calls PyGILState_Ensure() and waits indefinitely for the lock to be released. Can you please tell me what is wrong here?
Actually, a Python module calls my C++ code, which calls another Python module in separate threads; that's why I think it's waiting for the GIL to be released, which results in deadlock. So, is there any solution to this using the patch for removing the GIL?
The Python interpreter isn't re-entrant and needs to lock the interpreter while it's being called (see e.g. http://dabeaz.blogspot.be/2011/08/inside-look-at-gil-removal-patch-of.html). In your particular situation it seems like there's another Python call on the interpreter already running, and it's holding the GIL.
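For illustration, here is a minimal embedding sketch of the usual fix (plain pthreads stand in for the boost thread, and the print is a placeholder for your real Python call): after initialisation the main thread releases the GIL with PyEval_SaveThread(), which is what lets PyGILState_Ensure() in the spawned thread succeed instead of hanging indefinitely.

#include <Python.h>
#include <pthread.h>

static void *call_python(void *arg)
{
    /* Each foreign thread takes the GIL for the duration of its calls. */
    PyGILState_STATE g = PyGILState_Ensure();
    PyRun_SimpleString("print('called from a spawned thread')");
    PyGILState_Release(g);
    return NULL;
}

int main(void)
{
    Py_Initialize();
    PyEval_InitThreads();            /* no-op on Python 3.7+ */

    /* The crucial step: the main thread gives up the GIL it holds after
       initialisation; otherwise PyGILState_Ensure() in the worker blocks
       forever, which is the hang described above. */
    PyThreadState *state = PyEval_SaveThread();

    pthread_t t;
    pthread_create(&t, NULL, call_python, NULL);
    pthread_join(t, NULL);

    PyEval_RestoreThread(state);     /* take the GIL back before finalising */
    Py_Finalize();
    return 0;
}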

import inside of a Python thread

I have some functions that interactively load Python modules using __import__.
I recently stumbled upon an article about an "import lock" in Python, that is, a lock specifically for imports (not just the GIL). But the article was old, so maybe that's not true anymore.
This makes me wonder about the practice of importing in a thread.
Are import/__import__ thread-safe?
Can they create deadlocks?
Can they cause performance issues in a threaded application?
EDIT 12 Sept 2012
Thanks for the great reply Soravux.
So imports are thread-safe, and I'm not worried about deadlocks, since the functions that use __import__ in my code don't call each other.
Do you know if the lock is acquired even if the module has already been imported?
If that is the case, I should probably look in sys.modules to check if the module has already been imported before making a call to __import__.
Sure this shouldn't make a lot of difference in CPython since there is the GIL anyway.
However it could make a lot of difference on other implementations like Jython or stackless python.
EDIT 19 Sept 2012
About Jython, here's what they say in the doc:
http://www.jython.org/jythonbook/en/1.0/Concurrency.html#module-import-lock
Python does, however, define a module import lock, which is implemented by Jython. This lock is acquired whenever an import of any name is made. This is true whether the import goes through the import statement, the equivalent __import__ builtin, or related code. It's important to note that even if the corresponding module has already been imported, the module import lock will still be acquired, if only briefly.
So, it seems that it would make sense to check in sys.modules before making an import, to avoid acquiring the lock. What do you think?
Update: Since Python 3.3, import locks are per-module instead of global, and imp is deprecated in favor of importlib. More information on the changelog and this issue ticket.
The original answer below predates Python 3.3
Normal imports are thread-safe because they acquire an import lock prior to execution and release it once the import is done. If you add your own custom imports using the hooks available, be sure to add this locking scheme to it. Locking facilities in Python may be accessed by the imp module (imp.lock_held()/acquire_lock()/release_lock()). Edit: This is deprecated since Python 3.3, no need to manually handle the lock.
Using this import lock won't create any deadlocks or dependency errors aside from the circular dependencies that are already known (module a imports module b, which imports module a). Edit: Python 3.3 changed to a per-module locking mechanism to prevent those deadlocks caused by circular imports.
There exist multiple ways to create new processes or threads, for example fork and clone (assuming a Linux environment). Each way yields different memory behaviors when creating the new process. By default, a fork copies most memory segments (Data (often COW), Stack, Code, Heap), effectively not sharing its content between the child and its parent. The result of a clone (often called a thread, this is what Python uses for threading) shares all memory segments with its parent except the stack. The import mechanism in Python uses the global namespace which is not placed on the stack, thus using a shared segment between its threads. This means that all memory modifications (except for the stack) performed by an import in a thread will be visible to all its other related threads and parent. If the imported module is Python-only, it is thread-safe by design. If an imported module uses non-Python libraries, make sure those are thread-safe, otherwise it will cause mayhem in your multithreaded Python code.
By the way, threaded programs in Python suffer from the GIL, which won't allow much performance gain unless your program is I/O bound or relies on C or other external thread-safe libraries (since they should release the GIL before executing). Running the same imported Python function in two threads won't execute it concurrently because of this GIL. Note that this is only a limitation of CPython; other implementations of Python may behave differently.
To answer your edit: imported modules are all cached by Python. If the module is already loaded in the cache, it won't be run again and the import statement (or function) will return right away. You don't have to implement the cache lookup in sys.modules yourself; Python does that for you and won't take the imp lock for anything, aside from the GIL for the sys.modules lookup.
To answer your second edit: I prefer maintaining simpler code to trying to optimize calls to the libraries I use (in this case, the standard library). The rationale is that the time required to perform something is usually far more important than the time required to import the module that does it. Furthermore, the time required to maintain this kind of code throughout the project is much higher than the time it takes to execute. It all boils down to: "programmer time is more valuable than CPU time".
I could not find an answer in official documentation on this but it appears that in some versions of CPython 3.x, __import__ calls are not thread-safe, and can cause a deadlock. See: https://bugs.python.org/issue38884.

Interrupt Python program deadlocked in a DLL

How can I ensure a python program can be interrupted via Ctrl-C, or a similar mechanism, when it is deadlocked in code within a DLL?
Not sure if this is exactly what you are asking, but there are issues when trying to interrupt (via Ctrl-C) a multi-threaded python process. Here is a video of a talk about the python Global Interpreter Lock that also discusses that issue:
Mindblowing Python GIL
You might want to take a look at this mailing list for a couple other suggestions, but there aren't any conclusive answers.
I've encountered the issue several times, and I can at least confirm that this happens when using FFI in Haskell. I could have sworn that I once saw something in Haskell's FFI documentation mentioning that DLLs would not return from a ctrl-c signal, but I'm not having any luck finding that document.
You can try using Ctrl-Break, but that doesn't break out of a DLL in Haskell and I doubt it will work in Python either.
Update: Ctrl-Break does work for me in Python when Ctrl-C does not, during a call to a DLL function stuck in an infinite loop.
