Kivy application unexpectedly halting when using Threads - python

I am not sure whether this is an iOS issue or whether this is an issue with Kivy or even with Python (e.g. https://bugs.python.org/issue37788), but I am experiencing some problems with threading.
I have built an iPad app using the Kivy framework that makes several calls to an API, and uses the threading module to asynchronously make requests. Below is the code that handles the API requests:
import json
import requests
import base64
import threading

def thread(function):
    def wrap(*args, **kwargs):
        t = threading.Thread(target=function, args=args, kwargs=kwargs)
        t.start()
        return t
    return wrap

class MathPixAPI:
    stroke_url = '*******************'
    header = {
        "content-type": "application/json",
        "app_id": "*******************",
        "app_key": "*******************"
    }

    @thread
    def post_data(self, file_name: str, root):
        """
        Posts a base64 encoded image to the MathPixAPI then updates the data DictProperty of the ExpressionWriter that
        calls this function
        :param file_name: The name of the file - e.g. "image.png"
        :param root: The ExpressionWriter that calls the function
        """
        image_uri = "data:image/png;base64," + base64.b64encode(open(file_name, "rb").read()).decode()
        r = requests.post("https://api.mathpix.com/v3/text",
                          data=json.dumps({'src': image_uri}),
                          headers=self.header)
        root.data = json.loads(r.text)
The app makes no more than 5 asynchronous requests at one time, and post_data is called from the function below:
def get_image_data(self):
    """
    The function first saves the ExpressionWriter.canvas as a PNG file to the user_data_directory (automatically
    determined depending on the device the user is running the app on). Then this image is sent to the MathPix API
    which then returns data on the handwritten answer (see api.py for more details). The api call updates self.data
    which in turn calls self._on_data().
    """
    file_name = f'{App.get_running_app().user_data_dir}/image_{self.number}.png'
    self.export_to_png(file_name)
    MathPixAPI().post_data(file_name, self)
This works really well, up until the 20th-25th request, upon which the program halts. In Xcode I receive the following error log:
2021-04-09 18:11:02.300179+0100 ccc-writer-3[4261:4790641] [Animation] +[UIView setAnimationsEnabled:] being called from a background thread. Performing any operation from a background thread on UIView or a subclass is not supported and may result in unexpected and insidious behavior. trace=(
0 UIKitCore 0x0000000187cbb538 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 17859896
1 libdispatch.dylib 0x0000000101ce56c0 _dispatch_client_callout + 20
2 libdispatch.dylib 0x0000000101ce71f8 _dispatch_once_callout + 136
3 UIKitCore 0x0000000187cbb4bc 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 17859772
4 UIKitCore 0x0000000187cbb628 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 17860136
5 UIKitCore 0x0000000187abbd64 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 15764836
6 UIKitCore 0x0000000187aae150 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 15708496
7 UIKitCore 0x00000001877b2f20 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 12582688
8 UIKitCore 0x0000000187cb2b30 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 17824560
9 UIKitCore 0x0000000187aacd50 8518EAE3-832B-3FF0-9FA5-9DBE3041F26C + 15703376
10 ccc-writer-3 0x0000000100822960 -[SDL_uikitviewcontroller showKeyboard] + 108
11 ccc-writer-3 0x0000000100823164 UIKit_ShowScreenKeyboard + 60
12 ccc-writer-3 0x00000001007ec490 SDL_StartTextInput + 92
... [A whole bunch of memory addresses] ...
74 ccc-writer-3 0x0000000100610df4 _PyEval_EvalFrameDefault + 5432
75 ccc-writer-3 0x000000010054dfe0 function_code_fastcall + 120
76 ccc-writer-3 0x00000001005505f8 method_vectorcall + 264
77 ccc-writer-3 0x000000010054d95c PyVectorcall_Call + 104
78 ccc-writer-3 0x0000000100770c40 t_bootstrap + 80
79 ccc-writer-3 0x000000010065e8e8 pythread_wrapper + 28
80 libsystem_pthread.dylib 0x00000001cfbb3cb0 _pthread_start + 320
81 libsystem_pthread.dylib 0x00000001cfbbc778 thread_start + 8
)
2021-04-09 18:11:02.308745+0100 ccc-writer-3[4261:4790641] *** Assertion failure in -[_UISimpleFenceProvider trackSystemAnimationFence:], _UISimpleFenceProvider.m:51
2021-04-09 18:11:02.311976+0100 ccc-writer-3[4261:4790641] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'main thread only'
*** First throw call stack:
(0x184dc686c 0x199de1c50 0x184ccc000 0x18606091c 0x186cd20bc 0x187777d30 0x1877cb888 0x186c00e58 0x1875b2610 0x1871c71b8 0x1871c54d0 0x1871c51f0 0x1871c674c 0x1871c67c8 0x1871c682c 0x187541c94 0x1871c3478 0x1871c2b88 0x1877b7f58 0x1877b2fc8 0x187cb2b30 0x187aacd50 0x100822960 0x100823164 0x1007ec490 0x1008d30b8 0x10055680c 0x100614e6c 0x100610df4 0x100615e98 0x10054e160 0x10055058c 0x100614e6c 0x100611d50 0x10054dfe0 0x100614e6c 0x100610df4 0x10054dfe0 0x100614e6c 0x100610df4 0x100615e98 0x10054e160 0x100550664 0x10054d95c 0x100612888 0x100615e98 0x10060f87c 0x100859f50 0x10085e5dc 0x10085f020 0x100bc7598 0x100bc5c78 0x100beac94 0x100591568 0x1005908dc 0x100c116a0 0x1005908dc 0x100610d94 0x10054dfe0 0x100614e6c 0x100610df4 0x100615e98 0x10054e160 0x10055058c 0x100614e6c 0x100611d50 0x100615e98 0x10060f87c 0x100859f50 0x10085e5dc 0x10085f020 0x100bc7598 0x100bc5c78 0x100bcbdc8 0x100beac94 0x100591568 0x1005908dc 0x100610d94 0x10054dfe0 0x10054d95c 0x100612888 0x10054dfe0 0x100614e6c 0x100610df4 0x10054dfe0 0x100614e6c 0x100610df4 0x10054dfe0 0x1005505f8 0x10054d95c 0x100770c40 0x10065e8e8 0x1cfbb3cb0 0x1cfbbc778)
libc++abi.dylib: terminating with uncaught exception of type NSException
*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'main thread only'
terminating with uncaught exception of type NSException
I am not sure what "main thread only" means nor do I have any idea how to resolve this issue. Can anyone clarify what this means and explain what the problem is? How can I stop my program from halting?
Thanks in advance.

I fixed this issue in Xcode by amending the runtime API checking.
Navigate to:
Product > Scheme > Edit Scheme > Run / Debug > Diagnostics
then deselect Main Thread Checker
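An alternative that keeps the Main Thread Checker enabled: the trace shows SDL's showKeyboard being reached from a Python thread (frames 10 and 78), which suggests that assigning root.data inside the worker triggers UI work off the main thread. Below is a minimal sketch of the usual hand-off pattern, written with the standard library only so it runs anywhere. The names results, worker and drain_results are hypothetical and not part of the original app. In a Kivy app the draining step would typically be driven by kivy.clock.Clock.schedule_interval, or the assignment could be moved into a function decorated with kivy.clock.mainthread.

```python
import queue
import threading

# Thread-safe channel from worker threads to the main thread.
results = queue.Queue()

def worker(request_id):
    # Stand-in for the network request. No UI object is touched here;
    # the worker only publishes its result.
    payload = {"id": request_id, "text": f"result {request_id}"}
    results.put(payload)

def drain_results(apply):
    """Call from the main thread: apply every pending result, return the count."""
    handled = 0
    while True:
        try:
            payload = results.get_nowait()
        except queue.Empty:
            return handled
        apply(payload)  # e.g. assign the payload to a widget property
        handled += 1

threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

applied = []
count = drain_results(applied.append)
```

The point of the design is that only the main thread ever touches widget state, so property observers such as _on_data run where UIKit expects them.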

Related

PyAudio and Matplotlib crash when used together

Running the following code:
import threading
import pyaudio
from matplotlib import pyplot as plt

def output():
    p = pyaudio.PyAudio()
    stream_ = p.open(format=pyaudio.paFloat32,
                     channels=1,
                     rate=8000,
                     output=True)
    stream_.stop_stream()
    stream_.close()
    p.terminate()

output_thread = threading.Thread(target=output, args=())
output_thread.start()
output_thread.join()
fig = plt.figure()
ax0 = fig.add_subplot(111)
ax0.plot([1,2,3])
plt.show()
causes Python to crash with the error below. How might I solve this? I am running Python 3.8, PyAudio 0.2.11 and Matplotlib 3.3.1 on macOS 10.15.5.
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes: 0x0000000000000001, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY
Termination Signal: Illegal instruction: 4
Termination Reason: Namespace SIGNAL, Code 0x4
Terminating Process: exc handler [4840]
Application Specific Information:
The current event queue and the main event queue are not the same. This is probably because _TSGetMainThread was called for the first time off the main thread. _TSGetMainThread was called for the first time here:
0 CarbonCore 0x00007fff36993345 _TSGetMainThread + 138
1 CarbonCore 0x00007fff3699324a GetThreadGlobals + 26
2 CarbonCore 0x00007fff3699d5e4 NewPtrClear + 14
3 CarbonCore 0x00007fff369b4aba AVLInit + 62
4 CarbonCore 0x00007fff369b49f1 __INIT_Folders_block_invoke + 9
5 libdispatch.dylib 0x00007fff6f603658 _dispatch_client_callout + 8
6 libdispatch.dylib 0x00007fff6f6047de _dispatch_once_callout + 20
OK, if I run:
fig = plt.figure()
before I create the thread, I can avoid this crash. I'm guessing this lets _TSGetMainThread be called for the first time on the main thread.

Python psutil how to find a child process

Given PyQt5 on Linux, I have an application that starts a terminal emulator (rxvt) and runs a command (gaurdian) which runs yet another program (goo). Like this
medi@medi:~> pstree -p 4610
rxvt(4610)─┬─gaurdian(4612)───goo(4613)
           └─rxvt(4611)
I am trying to find pid of "goo". So I proceed with
gooPID = 12 # some random value to show my point
self.process.start(cmd, cmdOpts)
rxvtPID = self.process.processId()
try:
    for c in psutil.Process(rxvtPID).children(True):
        print("pid=%d name=%s" % (c.pid, c.name()))
        if c.name() == 'gaurdian':
            gooPID = c.pid
except (psutil.ZombieProcess, psutil.AccessDenied, psutil.NoSuchProcess) as err:
    print(err)
print("gooPID=%d " % gooPID )
The trace log is showing:
rxvtPID=4610 name=rxvt
gooPID=12
which suggests that the initial value of gooPID was not changed. It also seems that the children are not being traversed (i.e. I am not seeing children of children, etc.).
Am I doing this right?
I managed to solve this by inserting a sleep(1) before psutil begins to traverse the /proc filesystem. That is:
import time
import psutil

# ----------------------- getPidByName() ----------------------
def getPidByName(self, name):
    rxvtProc = psutil.Process(self.process.processId())
    time.sleep(1)  # else /proc is not ready for read
    pid = None
    try:
        for c in rxvtProc.children(True):
            # assumption: gaurdian has only one child
            if c.name() == name:
                return psutil.Process(c.pid).children()[0].pid
    except psutil.Error as err:
        print(err)
    return pid
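A fixed sleep(1) can be too short on a slow machine and is wasted time on a fast one. As a possible refinement, here is a minimal sketch of polling with a timeout instead. wait_for and fake_probe are hypothetical names; in the real application the probe would be a function that walks the psutil children and returns the pid once the child has appeared, or None otherwise.

```python
import time

def wait_for(probe, timeout=5.0, interval=0.1):
    """Call probe() until it returns a non-None value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            return None  # gave up: the child never showed up
        time.sleep(interval)

# Demonstration with a stand-in probe that only succeeds on its third call,
# mimicking a child process that takes a moment to appear.
calls = {"n": 0}

def fake_probe():
    calls["n"] += 1
    return 4613 if calls["n"] >= 3 else None

pid = wait_for(fake_probe, timeout=2.0, interval=0.01)
```

The caller still has to handle the None case, which here means the process tree never reached the expected shape within the timeout.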

Python is crashing due to libdispatch crashing child thread

I am using the pynetdicom library to receive and process medical DICOM images. The processing is performed in the callback function "on_association_released". However, when receiving certain studies, Python crashes due to what appears to be a child thread crashing.
From the OS X crash report, the libdispatch library seems to be the cause, but I am not sure how or why.
This is the function:
def on_association_released(self):
    if not self.auto_process:
        self.incoming = []
        return
    dicoms = [Dicom(f=x) for x in self.incoming]
    self.incoming = []
    incoming = Study(dicom_list=dicoms)
    log.info("Incoming study: {incoming}".format(**locals()))
    completed_tasks = {}
    time.sleep(1)
    for task in AVAILABLE_PROCESS_TASKS:
        log.info("Trying task: {task}".format(**locals()))
        process_task = task(study=incoming)
        try:
            if process_task.valid:
                log.info("{incoming} is valid for {process_task}".format(**locals()))
                try:
                    process_task.process()
                except Exception as e:
                    log.warning(
                        'Failed to perform {process_task} on {incoming}: \n {e}'.format(**locals())
                    )
                else:
                    log.info("Completed {process_task} for {incoming} !".format(**locals()))
            else:
                log.warning("{incoming} is not a valid study for {process_task}".format(**locals()))
        except Exception as e:
            log.warning("{incoming} could not be assessed by {process_task}".format(**locals()))
            myemail.nhs_mail(recipients=[admin],
                             subject=f"dicomserver {VERSION}: Failed to start listener",
                             message=f"{incoming} could not be assessed by {process_task}: {e.args}"
                             )
This is the final log message from the application log:
2019-03-15 12:19:06 I [process.py:on_association_released:171] Incoming study: Study(1.2.826.0.1.2112370.55.1.12145941)
This is the OSX Crash Report:
Process: Python [84177]
Path: /Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 3.6.1 (3.6.1)
Code Type: X86-64 (Native)
Parent Process: Python [84175]
Responsible: Terminal [346]
User ID: 503
Date/Time: 2019-03-15 12:19:06.371 +0000
OS Version: Mac OS X 10.11.6 (15G1108)
Report Version: 11
Anonymous UUID: E7340644-9523-1C6B-0B2B-74D6043CFED6
Time Awake Since Boot: 590000 seconds
System Integrity Protection: enabled
Crashed Thread: 1
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000110
VM Regions Near 0x110:
-->
__TEXT 0000000100000000-0000000100001000 [ 4K] r-x/rwx SM=COW /Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python
Application Specific Information:
*** multi-threaded process forked ***
crashed on child side of fork pre-exec
This is the top of Thread 1 crash trace:
Thread 1 Crashed:
0 libdispatch.dylib 0x00007fff8e6cc661 _dispatch_queue_push_queue + 345
1 libdispatch.dylib 0x00007fff8e6cab06 _dispatch_queue_wakeup_with_qos_slow + 126
2 libdispatch.dylib 0x00007fff8e6d113f _dispatch_mach_msg_send + 1952
3 libdispatch.dylib 0x00007fff8e6d08dc dispatch_mach_send + 262
4 libxpc.dylib 0x00007fff86858fc9 xpc_connection_send_message_with_reply + 131
5 com.apple.CoreFoundation 0x00007fff8ef43b3f __66-[CFPrefsSearchListSource generationCountFromListOfSources:count:]_block_invoke_2 + 143
6 com.apple.CoreFoundation 0x00007fff8ef4396d _CFPrefsWithDaemonConnection + 381
7 com.apple.CoreFoundation 0x00007fff8ef42af6 __66-[CFPrefsSearchListSource generationCountFromListOfSources:count:]_block_invoke + 150
8 com.apple.CoreFoundation 0x00007fff8ef42893 -[CFPrefsSearchListSource generationCountFromListOfSources:count:] + 179
9 com.apple.CoreFoundation 0x00007fff8ef42174 -[CFPrefsSearchListSource alreadylocked_copyDictionary] + 324
10 com.apple.CoreFoundation 0x00007fff8ef41dbc -[CFPrefsSearchListSource alreadylocked_copyValueForKey:] + 60
11 com.apple.CoreFoundation 0x00007fff8ef41d4c ___CFPreferencesCopyAppValueWithContainer_block_invoke + 60
12 com.apple.CoreFoundation 0x00007fff8ef39a70 +[CFPrefsSearchListSource withSearchListForIdentifier:container:perform:] + 608
13 com.apple.CoreFoundation 0x00007fff8ef397c7 _CFPreferencesCopyAppValueWithContainer + 183
14 com.apple.SystemConfiguration 0x00007fff998b3a9b SCDynamicStoreCopyProxiesWithOptions + 163
15 _scproxy.cpython-36m-darwin.so 0x000000010f0f5a63 get_proxy_settings + 35
16 org.python.python 0x000000010006a604 _PyCFunction_FastCallDict + 436
17 org.python.python 0x00000001000f33e4 call_function + 612
18 org.python.python 0x00000001000f8d84 _PyEval_EvalFrameDefault + 21892
The issue looks awfully similar to a long-standing problem with Python on macOS.
The root cause, as far as I understand it, is that fork() is hard to do right if there are threads involved, unless you immediately exec().
macOS "protects" against the possible pitfalls by crashing a process if it accesses certain system functionality, such as libdispatch, after it has forked but before it has called exec.
Unfortunately these calls can happen in unexpected places, such as in _scproxy.cpython-36m-darwin.so, which is shown at position 15 of the stack trace.
There are a number of Python bugs filed about this (1, 2, 3, for example), but there's no silver bullet as far as I know.
In your particular case, it might be possible to prevent the crash by running your Python interpreter with the environment variable no_proxy=*. This should prevent calls through _scproxy to the system configuration framework to find proxy settings.
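If changing the launch environment is inconvenient, the variable can also be set from inside the program, provided this happens before anything triggers a proxy lookup. A minimal sketch, assuming the lookup goes through the standard library's urllib.request (the host name is only a placeholder):

```python
import os
import urllib.request

# Set this as early as possible, before any code path that may ask
# for proxy settings (and in particular before any fork).
os.environ["no_proxy"] = "*"

# With no_proxy=* in the environment, urllib answers the bypass question
# from the environment and reports that every host bypasses the proxy.
bypassed = urllib.request.proxy_bypass("example.com")
```

Whether this is enough to avoid the crash depends on no other library reaching the system configuration framework after the fork, so treat it as something to try rather than a guaranteed fix.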

exception without explicit reason when deploying U-SQL jobs to Azure by Python SDK

I am using the Python SDK to submit jobs to Azure with adlaJobClient. I have around 30 dynamic U-SQL scripts constructed using Jinja2, which I collect in a list and then submit to Azure one by one. The problem is that after a random number of successful submissions, Python throws an exception in the program console without any further explanation. There is no record of a U-SQL job failure in Azure either. Below is the output and error from Python, followed by the code. When I manually run the same dynamically generated U-SQL query at which execution stops, it runs in Azure without failure.
***** starting query number **** 24
Job is not yet done, waiting for 3 seconds. Current state: Compiling
Job is not yet done, waiting for 3 seconds. Current state: Compiling
Job is not yet done, waiting for 3 seconds. Current state: Compiling
Job is not yet done, waiting for 3 seconds. Current state: starting
Job is not yet done, waiting for 3 seconds. Current state: starting
An exception has occurred, use %tb to see the full traceback.
SystemExit: 1
for script in sql_query_list:
    jobId = str(uuid.uuid4())
    jobResult = adlaJobClient.job.create(adla,jobId,JobInformation(name='Submit ADLA Job '+jobId,type='USql',properties=USqlJobProperties(script=script)))
    try:
        while(jobResult.state != JobState.ended):
            print('Job is not yet done, waiting for 3 seconds. Current state: ' + jobResult.state.value)
            time.sleep(3)
            jobResult = adlaJobClient.job.get(adla, jobId)
        print(' ******* JOB ID ********',jobId)
        print("****QUERY no FINISHED *****",sql_query_list.index(script))
        print ('**** JOB ID RESULT: ****** ' + jobResult.result.value)
    except Exception as e:
        raise ValueError
        print ("xxxxxx JOB SUBMISSION TO ADLA FAILED xxxxxxx")
        print(e)
Option A: Manually log into Portal
The easiest way to check on a failed job is to log into your Azure Portal (http://portal.azure.com), navigate to the Data Lake Analytics account, and click "View all jobs". From the list of jobs you can navigate to your job and view the output with specific error messages. (Keep reading for the automated method.)
Option B: Automated Python Job
You can get the job properties with error messages using job_result.properties.errors. The code sample below performs a "pretty print" of any U-SQL job error and raises an exception with those details.
Parsing the error info:
import yaml

def get_pretty_error(error_message_obj):
    """
    Returns a string describing the USQL error.
    error_message_obj can be obtained via `job_result.error_message[0]`
    """
    # Work on a copy so that pop() does not alter the SDK object itself
    err_info = dict(error_message_obj.__dict__)
    error_msgs = "=" * 80 + "\n"
    error_msgs += "=" * 6 + " ERROR: {}".format(err_info.pop("description", None)) + "\n"
    error_msgs += "=" * 80 + "\n"
    # "details" may be missing, so fall back to an empty string
    error_msgs += (err_info.pop("details", None) or "").replace("\\r\\n", "\n").replace("\\n", "\n").replace("\\t", "\t").rstrip() + "...\n"
    error_msgs += "=" * 80 + "\n"
    error_msgs += "Message: {}\n".format(err_info.pop("message", None))
    error_msgs += "Severity: {}\n".format(str(err_info.pop("severity", None)).upper())
    error_msgs += "Resolution: {}\n".format(err_info.pop("resolution", None))
    inner = err_info.pop("inner_error", None)
    for key in ["end_offset", "line_number", "start_offset", "source", "additional_properties"]:
        # ignore (don't print these)
        err_info.pop(key, None)
    err_info = {x: y for x, y in err_info.items() if y}  # Remove empty keys
    error_msgs += "Addl. Info:\n\t{}\n".format(
        yaml.dump(err_info,
                  default_flow_style=True
                  ).replace("\\t", "\t").replace("\\n", "\n").replace("\n", "\n\t"))
    if inner:
        # If there's an inner error, concatenate that message as well recursively
        error_msgs += get_pretty_error(inner)
    return error_msgs
Making use of this in a wait-for-job function:
import time
from azure.mgmt.datalake.analytics.job.models import JobState

def wait_for_usql_job(adlaJobClient, adla_account, job_id):
    """Wait for completion, on error raise an exception (with specific details)"""
    print("Waiting for job ID '{}'".format(job_id))
    job_result = adlaJobClient.job.get(adla_account, job_id)
    while(job_result.state != JobState.ended):
        print('Job is not yet done, waiting for 3 seconds. Current state: ' + job_result.state.value)
        time.sleep(3)
        job_result = adlaJobClient.job.get(adla_account, job_id)
    # Copy the properties so that pop() does not alter the SDK object
    job_properties = dict(job_result.properties.__dict__)
    detail_msg = (
        "\tCompilation Time: {}\n".format(job_properties.pop("total_compilation_time", None)) +
        "\tQueued Time: {}\n".format(job_properties.pop("total_queued_time", None)) +
        "\tExecution Time: {}\n".format(job_properties.pop("total_running_time", None)))
    print('Job completed with result: {}\n{}'
          .format(job_result.result.value, detail_msg))
    if job_result.result.value == "Succeeded":
        return job_result.result.value
    elif job_result.result.value == "Failed":
        error_msgs = ""
        for error in job_result.error_message:
            # Loop through errors and concatenate error messages
            error_msgs += get_pretty_error(error)
        raise Exception("Job execution failed for job_id '{}':\n{}"
                        .format(job_id, error_msgs))
Notes:
Because these classes are not well documented online, I used the __dict__ property to explore all properties of the JobProperties and ErrorInfo objects, discarding any properties that are blank or that I don't need and printing the rest.
You can optionally rewrite this code to call those properties explicitly, without using __dict__.
Any error can have an inner error, which is why I wrote get_pretty_error() as its own function, so that it can call itself recursively.
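A side note on the console output in the question: SystemExit derives from BaseException rather than Exception, so the except Exception clause in the submission loop cannot intercept it, and none of the prints in that handler would run. Where the exit is raised is not visible from the question. The short sketch below only demonstrates the language behaviour; risky is a hypothetical stand-in for whatever code ends up calling sys.exit(1).

```python
import sys

def risky():
    # Stand-in for library or notebook code that exits the interpreter.
    sys.exit(1)

caught_by = None
exit_code = None
try:
    try:
        risky()
    except Exception:
        # Not reached: SystemExit is not a subclass of Exception.
        caught_by = "Exception"
except SystemExit as exc:
    caught_by = "SystemExit"
    exit_code = exc.code
```

Catching SystemExit explicitly around the polling loop, at least while debugging, would make it possible to log the job id and state at the moment the exit happens.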

SIGILL after fork in Sage/Python

I'm doing some calculations with Sage.
I am playing around with fork. I have a very simple test case which is basically like this:
def fork_test():
    import os
    pid = os.fork()
    if pid != 0:
        print "parent, child: %i" % pid
        os.waitpid(pid, 0)
    else:
        print "child"
        try:
            # some dummy matrix calculation
            pass
        finally:
            os._exit(0)
(Look below for _fork_test_func() for some matrix calculations.)
And I'm getting:
------------------------------------------------------------------------
Unhandled SIGILL: An illegal instruction occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
With this (incomplete) backtrace:
Crashed Thread: 0 Dispatch queue: com.apple.root.default-priority
Exception Type: EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes: 0x0000000000000001, 0x0000000000000000
Application Specific Information:
BUG IN LIBDISPATCH: flawed group/semaphore logic
Thread 0 Crashed:: Dispatch queue: com.apple.root.default-priority
0 libsystem_kernel.dylib 0x00007fff8c6d1d46 __kill + 10
1 libcsage.dylib 0x0000000101717f33 sigdie + 124
2 libcsage.dylib 0x0000000101717719 sage_signal_handler + 364
3 libsystem_c.dylib 0x00007fff86b1094a _sigtramp + 26
4 libdispatch.dylib 0x00007fff89a66c74 _dispatch_thread_semaphore_signal + 27
5 libdispatch.dylib 0x00007fff89a66f3e _dispatch_apply2 + 143
6 libdispatch.dylib 0x00007fff89a66e30 dispatch_apply_f + 440
7 libBLAS.dylib 0x00007fff906ca435 APL_dtrsm + 1963
8 libBLAS.dylib 0x00007fff906702b6 cblas_dtrsm + 882
9 matrix_modn_dense_double.so 0x0000000108612615 void FFLAS::Protected::ftrsmRightLowerNoTransUnit<double>::delayed<FFPACK::Modular<double> >(FFPACK::Modular<double> const&, unsigned long, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, unsigned long, unsigned long) + 2853
10 matrix_modn_dense_double.so 0x0000000108611daa void FFLAS::Protected::ftrsmRightLowerNoTransUnit<double>::delayed<FFPACK::Modular<double> >(FFPACK::Modular<double> const&, unsigned long, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, unsigned long, unsigned long) + 698
11 matrix_modn_dense_double.so 0x0000000108612ccf void FFLAS::Protected::ftrsmRightLowerNoTransUnit<double>::operator()<FFPACK::Modular<double> >(FFPACK::Modular<double> const&, unsigned long, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, FFPACK::Modular<double>::Element*, unsigned long) + 831
12 ??? 0x00007f99e481a028 0 + 140298940424232
Thread 1:
0 libsystem_kernel.dylib 0x00007fff8c6d26d6 __workq_kernreturn + 10
1 libsystem_c.dylib 0x00007fff86b24f4c _pthread_workq_return + 25
2 libsystem_c.dylib 0x00007fff86b24d13 _pthread_wqthread + 412
3 libsystem_c.dylib 0x00007fff86b0f1d1 start_wqthread + 13
Thread 2:
0 libsystem_kernel.dylib 0x00007fff8c6d26d6 __workq_kernreturn + 10
1 libsystem_c.dylib 0x00007fff86b24f4c _pthread_workq_return + 25
2 libsystem_c.dylib 0x00007fff86b24d13 _pthread_wqthread + 412
3 libsystem_c.dylib 0x00007fff86b0f1d1 start_wqthread + 13
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000000 rbx: 0x00007fff5ec8e418 rcx: 0x00007fff5ec8df28 rdx: 0x0000000000000000
rdi: 0x000000000000b8f7 rsi: 0x0000000000000004 rbp: 0x00007fff5ec8df40 rsp: 0x00007fff5ec8df28
r8: 0x00007fff5ec8e418 r9: 0x0000000000000000 r10: 0x000000000000000a r11: 0x0000000000000202
r12: 0x00007f99ea500de0 r13: 0x0000000000000003 r14: 0x00007fff5ec8e860 r15: 0x00007fff906ca447
rip: 0x00007fff8c6d1d46 rfl: 0x0000000000000202 cr2: 0x00007fff74a29848
Logical CPU: 0
Is there something special I need to do after a fork? I looked up the fork decorator of Sage and it looks like it basically does the same.
The crash also happens with the fork decorator of Sage itself. Another test case:
def fork_test2():
    def test():
        # do some stuff
        pass
    from sage.parallel.decorate import fork
    test_ = fork(test, verbose=True)
    test_()
Even simpler test case:
def _fork_test_func():
    while True:
        m = matrix(QQ, 100, [randrange(-100,100) for i in range(100*100)])
        m.right_kernel()

def fork_test():
    import os
    pid = os.fork()
    if pid != 0:
        print "parent, child: %i" % pid
        os.waitpid(pid, 0)
    else:
        print "child"
        try:
            _fork_test_func()
        finally:
            os._exit(0)
Results in a slightly different crash:
python(48672) malloc: *** error for object 0x11185f000: pointer being freed already on death-row
*** set a breakpoint in malloc_error_break to debug
With backtrace:
Crashed Thread: 1 Dispatch queue: com.apple.root.default-priority
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Application Specific Information:
*** error for object 0x11185f000: pointer being freed already on death-row
Thread 0:: Dispatch queue: com.apple.main-thread
0 matrix2.so 0x0000000107fa403f __pyx_pw_4sage_6matrix_7matrix2_6Matrix_71right_kernel_matrix + 27551
1 ??? 0x000000000000000d 0 + 13
Thread 1 Crashed:: Dispatch queue: com.apple.root.default-priority
0 libsystem_kernel.dylib 0x00007fff8c6d239a __semwait_signal_nocancel + 10
1 libsystem_c.dylib 0x00007fff86b17e1b nanosleep$NOCANCEL + 138
2 libsystem_c.dylib 0x00007fff86b7b9a8 usleep$NOCANCEL + 54
3 libsystem_c.dylib 0x00007fff86b67eca __abort + 203
4 libsystem_c.dylib 0x00007fff86b67dff abort + 192
5 libsystem_c.dylib 0x00007fff86b43905 szone_error + 580
6 libsystem_c.dylib 0x00007fff86b43f7d free_large + 229
7 libsystem_c.dylib 0x00007fff86b3b8f8 free + 199
8 libBLAS.dylib 0x00007fff906b0431 __APL_dgemm_block_invoke_0 + 132
9 libdispatch.dylib 0x00007fff89a65f01 _dispatch_call_block_and_release + 15
10 libdispatch.dylib 0x00007fff89a620b6 _dispatch_client_callout + 8
11 libdispatch.dylib 0x00007fff89a631fa _dispatch_worker_thread2 + 304
12 libsystem_c.dylib 0x00007fff86b24d0b _pthread_wqthread + 404
13 libsystem_c.dylib 0x00007fff86b0f1d1 start_wqthread + 13
The same happens also for this:
def fork_test2():
    from sage.parallel.decorate import fork
    test_ = fork(_fork_test_func, verbose=True)
    test_()
-- but only if you used some other matrix calculations before.
This test case also works on a fresh Sage session:
def _fork_test_func(iterator=None):
    if not iterator:
        import itertools
        iterator = itertools.count()
    for i in iterator:
        m = matrix(QQ, 100, [randrange(-100,100) for i in range(100*100)])
        m.right_kernel()

def fork_test():
    _fork_test_func(range(10))
    import os
    pid = os.fork()
    if pid != 0:
        print "parent, child: %i" % pid
        os.waitpid(pid, 0)
    else:
        print "child"
        try:
            _fork_test_func()
        finally:
            os._exit(0)
I have downloaded the binaries for MacOSX 64bit of Sage 5.8.
(Note that I also asked on ask.sagemath.org here.)
Both of these crash reports indicate that a multi-threaded process called fork(), which greatly restricts the set of operations that are safe to execute in the child: essentially you can only call execve() and related functions, along with a few other functions from the list of async-signal-safe functions.
This is documented in the CAVEATS section of the fork(2) manpage as well as in the standard:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
Since many APIs in Mac OS X frameworks will cause the process to become multithreaded, if you want the fork child to be fully usable, you must limit your operations in the parent process before the fork to APIs documented not to make a process multithreaded (essentially only POSIX APIs).
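When that restriction on the parent is not practical, one way to stay within the rule is to start the child as a new program instead of continuing in a forked copy of the parent. The sketch below assumes Python 3 and that the work can be expressed as a separate command. The child_code string is a hypothetical stand-in for the matrix computation, which in Sage would have to be a script run by the Sage interpreter.

```python
import subprocess
import sys

# Work to run in the child, passed as a program rather than as an
# in-memory function, so the child starts from a fresh process image.
child_code = "print(sum(i * i for i in range(10)))"

completed = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,  # collect the child's stdout and stderr
    text=True,            # decode output as str
    check=True,           # raise CalledProcessError on a non-zero exit
)
result = int(completed.stdout.strip())
```

The cost is that the child does not inherit the parent's in-memory state, so inputs and results have to be passed explicitly, for example through arguments, pipes or files.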
