What causes a Python segmentation fault?

I am implementing Kosaraju's strongly connected components (SCC) graph search algorithm in Python.
The program runs fine on small data sets, but when I run it on a super-large graph (more than 800,000 nodes), it fails with "Segmentation fault".
What might be the cause? Thank you!
Additional Info:
First I got this error when running on the super-large data set:
"RuntimeError: maximum recursion depth exceeded in cmp"
Then I raised the recursion limit with
sys.setrecursionlimit(50000)
but got a "Segmentation fault".
Believe me, it's not an infinite loop; it runs correctly on relatively smaller data. Is it possible the program exhausted some resource?

This happens when a Python extension (written in C) tries to access memory beyond its reach.
You can trace it in the following ways:
Add a sys.settrace call at the very first line of the code.
Use gdb as described by Mark in this answer. At the command prompt:
gdb python
(gdb) run /path/to/script.py
## wait for segfault ##
(gdb) backtrace
## stack trace of the c code
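For the sys.settrace option, here is a minimal sketch (the hook name is illustrative, not from the original answer) that prints every line as it executes, so the last printed line points roughly at where the crash happened:

import sys

def trace_lines(frame, event, arg):
    # print the file and line number on every "line" event
    if event == "line":
        print(f"{frame.f_code.co_filename}:{frame.f_lineno}", flush=True)
    return trace_lines

sys.settrace(trace_lines)

# ... the rest of your script runs here and is traced line by line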

I understand you've solved your issue, but for others reading this thread, here is the answer: you have to increase the stack size that your operating system allocates for the Python process.
The way to do it is operating-system dependent. On Linux, you can check your current value with ulimit -s and increase it with ulimit -s <new_value>.
Try doubling the previous value, and keep doubling until it works or you run out of memory.

Segmentation fault is a generic error; there are many possible reasons for it:
Low memory
Faulty RAM
Fetching a huge data set from the database in a single query (if the fetched data is larger than swap memory)
A wrong query / buggy code
Long loops / deep (multiple) recursion (see the sketch below)
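
On the last point, a minimal sketch (my addition, not from the original answer) of how deep recursion can turn into a segfault once the recursion limit is raised past what the C stack can actually hold:

import sys

def depth(n):
    # each call also consumes a frame on the interpreter's C stack
    return 0 if n == 0 else 1 + depth(n - 1)

sys.setrecursionlimit(1_000_000)  # far beyond the default of ~1000
depth(500_000)                    # may overflow the C stack and segfault
                                  # instead of raising RecursionError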

Updating the ulimit fixed the segfault in my Kosaraju's SCC implementation, for both the Python (a Python segfault... who knew!) and the C++ versions.
On my Mac, I found the possible maximum via:
$ ulimit -s -H
65532

A Google search found me this article, and I did not see the following "personal solution" discussed.
My recent annoyance with Python 3.7 on Windows Subsystem for Linux: on two machines with the same pandas library, one gave me a segmentation fault and the other only reported a warning. It was not clear which one was newer, but "re-installing" pandas solved the problem.
Command that I ran on the buggy machine:
conda install pandas
More details: I was running identical scripts (synced through Git), and both are Windows 10 machines with WSL + Anaconda. Also, on the machine where command-line Python complains about "Segmentation fault (core dumped)", JupyterLab simply restarts the kernel every single time; worse still, no warning is given at all.
Update a few months later: I quit hosting Jupyter servers on the Windows machine. I now use WSL on Windows to forward remote ports opened on a Linux server, and run all my jobs on the remote Linux machine. I have not hit a single execution error in a good number of months :)

I was experiencing this segmentation fault after upgrading dlib on a Raspberry Pi.
I traced back the stack as suggested by Shiplu Mokaddim above and it settled on an OpenBLAS library.
Since OpenBLAS is itself multi-threaded, using it in a multi-threaded application multiplies the number of threads until the segmentation fault. For multi-threaded applications, set OpenBLAS to single-threaded mode.
In a Python virtual environment, tell OpenBLAS to only use a single thread by editing:
$ workon <myenv>
$ nano .virtualenv/<myenv>/bin/postactivate
and add:
export OPENBLAS_NUM_THREADS=1
export OPENBLAS_MAIN_FREE=1
After a reboot I was able to run all my image-recognition apps on the RPi 3B, which were previously crashing it.
Reference:
https://github.com/ageitgey/face_recognition/issues/294
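An equivalent approach (my addition, not part of the original answer) is to set the variable from Python itself, as long as it happens before NumPy or anything else that loads OpenBLAS is imported:

import os

# must run before numpy (or anything else linked against OpenBLAS) is imported
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np  # OpenBLAS now initializes with a single thread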

Looks like you are out of stack memory. You may want to increase it as Davide stated. To do it in Python code, you can run your main() in a thread with a larger stack:

import sys
import threading

def main():
    pass  # write your code here

sys.setrecursionlimit(2097152)    # adjust these numbers
threading.stack_size(134217728)   # to fit your needs

main_thread = threading.Thread(target=main)
main_thread.start()
main_thread.join()

Source: c1729's post on Codeforces. Running it with PyPy is a bit trickier.

I ran into the same error. I learnt from another SO answer that you need to set the recursion limit through the sys and resource modules.
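Something along these lines (a minimal sketch of that approach with illustrative numbers, not a drop-in fix):

import resource
import sys

# raise the soft stack limit for this process up to the hard limit
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))

# then allow deeper Python-level recursion as well
sys.setrecursionlimit(10**6)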

Related

python segmentation fault (core dumped) due to recursion limit? [duplicate]


MacOS Catalina Python forking issue

I am currently facing an issue where multiprocessing in Python with fork as the start method causes a crash on Catalina. The same code worked perfectly fine on Mojave, even without the classic workaround OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES; that line does not seem to have any effect on Catalina anyway. The crash does not produce any catchable exception or traceback, so I am very sorry but I cannot provide more information. It occurs whenever the forked process uses OpenMP threading, i.e. spawns threads itself. Does anyone know how to fix the forking behaviour on Catalina? Using another start method is probably not an option, since I am dealing with non-picklable objects.
It's kind of hard to determine the cause without a crash report.
Attempt this to find your crash reports:
Open the Console application:
Type "Console" into Spotlight or navigate to Applications -> Utilities -> Console.app, then click on Crash Reports.
I had a similar issue, where Python kept crashing when I was running Ansible on Mojave; the classic fix was to add "export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES".
However, with Catalina this was no longer the fix for me. I had to deal with missing links to OpenSSL's libcrypto on OS X; the OpenSSL script below did the trick for me.
Try this script (or run the commands manually) if you indeed have an OpenSSL issue:
https://gist.github.com/cpavlatos/265d5091a89148eec1cfa2d10e200d32
Note: prior to running the above script, you might want to cd to /usr/local/Cellar/openssl/ and see what version you have, then change the script accordingly.
Also worth checking is this thread:
https://github.com/Homebrew/homebrew-core/issues/44996#issuecomment-543945199
Summarized better here:
https://gist.github.com/llbbl/c54f44d028d014514d5d837f64e60bac#gistcomment-3115206

Runtime error occasionally interrupts my Python script

I seem to be getting a runtime error whilst running my Python script in Blackmagic Fusion:
# "The application has requested the Runtime to terminate it in an unusual way".
This does not happen every time I run the script. It only seems to pop up when I give the script a heavy workload, or when I run it multiple times inside the Blackmagic Fusion compositing software without restarting the package. I thought this might be a memory leak, but when I check the memory usage it does not seem to flinch at all.
Does anyone have any idea what might be causing this, or at least a solution of how I might start to debug the script?
Many thanks.
If you know how to reproduce the runtime error, then run your script under pdb.
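For example, a minimal way to do that (the main() entry point here is hypothetical); equivalently, you can launch the whole script with python -m pdb your_script.py:

import pdb

def main():
    ...  # your script's entry point (replace with your own code)

# run the entry point under the debugger so you can step toward the failure
pdb.run("main()")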
Perhaps this will help. It's apparently a common error with Microsoft Visual C++:
http://support.microsoft.com/kb/884538

MAKE through Cygwin overloads memory (too many processes)

What I'm trying to do is install SIP 4.14.7 through Cygwin using the make command. I'm running Python 3.3.2 (with Python added to the PATH) on a Windows 7 x64 SP1 machine with 4 GB of RAM and an Intel Core 2 Duo. Since what I'm doing is from within the Cygwin terminal, I'll avoid using the Win32 path format.
Following the installation instructions provided with sip-4.14.7.zip, here is what I've done:
Uncompressed the .zip into /c/python33/SIP/
Launched the Cygwin terminal and went to the /cygdrive/c/python33/SIP/ folder
Ran python configure.py (no options, since I was fine with the default settings)
Ran make install
As far as I can tell, I followed the instructions as I should have, but obviously I'm not doing something right here.
Here's what happens (screenshot): the number of make.exe processes climbs to about 1800 before Windows runs too low on memory, and then the whole thing reverses itself until there are no more make.exe processes running (second screenshot).
I've Googled this and searched around here on stackoverflow.com but couldn't find anything related to this particular issue. It seems that unless the -j option is used, make should only process one job at a time. I've also tried the -l option, thinking it would limit the number of processes unless enough memory was available, but the results were the same.
I tried to provide as much detail as possible, but if there is any more information that I should post to help diagnose this issue, I'd be glad to provide it. Otherwise, any suggestions here would be much appreciated.
The latest version of Cygwin includes the PyQt4 package (in All -> Python within Setup.exe): python-pyqt4 and python3-pyqt4. If you are trying to live in Cygwin, I'd install that version into Cygwin and use it. No make required, by the looks of it.

How to analyze memory usage from a core dump?

I have a core dump under Linux. The process went on a memory-allocation rampage and I need to find at least which library this happens in.
What tool do you suggest to get a broad overview of where the memory is going? I know the problem is hard to solve fully; any tool that could at least give some clues would help.
[It's a Python process; the suspicion is that the memory allocations are caused by one of the custom modules written in C.]
Try running the Linux perf tool on the Python process with call graphs enabled.
If it's a multi-threaded process, give all the associated LWPs as arguments.
Problem: you need to find which library is misusing memory.
Solution:
1) Use valgrind to find invalid writes or invalid frees of memory:
$ valgrind --tool=memcheck --error-limit=no --track-origins=yes python your_script.py
2) Use gdb to find out which address space the library is mapped into:
$ gdb (your executable) -c (core)
(gdb) vmmap
