I have a GetVars() function (not supposed to be changed), it throws sys.exit(1) in some cases.
I want to do some cleanup for this case:
try:
common_func.GetVars()
except SystemExit:
cmdline = "-user user1"
sys.argv = ['config.py', cmdline]
import config
config.py should take some args, but it contains the only print statement for now. But it is not executed - is there any way to do it? Just trying to understand what happens, I know the code looks odd :)
UPD:
now I'm trying to run
cur_dir = os.path.dirname(os.path.realpath(__file__))
gcti_cfgDir = os.path.join(cur_dir, "..", "cfg-scripts")
sys.path.append(gcti_cfgDir)
import config
try:
sys.exit(1)
except SystemExit:
try:
import config
except:
print "errr"
I tried it with this mymodule.py file:
$ cat mymodule.py
import sys
try:
sys.exit(1)
except SystemExit:
cmdline = "-user user1"
sys.argv = ['config.py', cmdline]
import config
and with this config.py file:
$ cat config.py
print "Everything is ok for now"
The result is the expected one:
$ python mymodule.py
Everything is ok for now
I am mostly sure the problem is not in the import per se, neither in the SystemExit capturing... Probably your config.py file is broken.
UPDATE: Ah, I believe I got it!
According to your new code, you are importing the config module before the sys.exit(1) and importing it in the except block, too:
import config # <-- here...
try:
sys.exit(1)
except SystemExit:
try:
import config # <-- and here
except:
print "errr"
However, the code of the module is executed just at the first import. The following ones either put the module in the namespace or, if it is already in the namespace, do nothing.
The best solution, as I see it, is to define a function inside your module. If this is the content of your module:
print "Everything is ok for now"
just replace it by
def run():
print "Everything is ok for now"
And instead of importing the module where you want it to be executed, import it only once and call the function:
import config # This line just imports, does not print nothing
config.run() # OTOH, this one prints the line...
try:
sys.exit(1)
except SystemExit:
try:
config.run() # This one prints the line too
except:
print "errr"
Actually, this is surely the best way of dealing with code in Python: put code into functions, put functions into modules, call the functions. It is not generally a good practice to put your executable code directly inside the module, as you were trying to do.
2nd UPDATE: If you cannot change the code of the config.py module, you can still call it using subprocess.call to call the Python interpreter. You can even pass the parameters you were trying to add to sys.argv:
import subprocess
# ...
subprocess.call(['python', 'config.py'])
try:
sys.exit(1)
except SystemExit:
try:
subprocess.call(['python', 'config.py', "-user", "user1"]) # Extra args
except:
print "errr"
Use the atexit module to register cleanup handlers or if you're sure that GetVars is calling sys.exit, monkey patch the latter to throw a custom exception instead of really running so that you can catch it and manually handle it.
Related
I wrote a package that is using multiprocessing.Pool inside one of its functions.
Due to this reason, it is mandatory (as specified in here under "Safe importing of main module") that the outermost calling function can be imported safely e.g. without starting a new process. This is usually achieved using the if __name__ == "__main__": statement as explicitly explained at the link above.
My understanding (but please correct me if I'm wrong) is that multiprocessing imports the outermost calling module. So, if this is not "import-safe", this will start a new process that will import again the outermost module and so on recursively, until everything crashes.
If the outermost module is not "import-safe" when the main function is launched it usually hangs without printing any warning, error, message, anything.
Since using if __name__ == "__main__": is not usually mandatory and the user is usually not always aware of all the modules used inside a package, I would like to check at the beginning of my function if the user complied with this requirement and, if not, raise a warning/error.
Is this possible? How can I do this?
To show this with an example, consider the following example.
Let's say I developed my_module.py and I share it online/in my company.
# my_module.py
from multiprocessing import Pool
def f(x):
return x*x
def my_function(x_max):
with Pool(5) as p:
print(p.map(f, range(x_max)))
If a user (not me) writes his own script as:
# Script_of_a_good_user.py
from my_module import my_function
if __name__ == '__main__':
my_function(10)
all is good and the output is printed as expected.
However, if a careless user writes his script as:
# Script_of_a_careless_user.py
from my_module import my_function
my_function(10)
then the process hangs, no output is produces, but no error message or warning is issued to the user.
Is there a way inside my_function, BEFORE opening Pool, to check if the user used the if __name__ == '__main__': condition in its script and, if not, raise an error saying it should do it?
NOTE: I think this behavior is only a problem on Windows machines where fork() is not available, as explained here.
You can use the traceback module to inspect the stack and find the information you're looking for. Parse the top frame, and look for the main shield in the code.
I assume this will fail when you're working with a .pyc file and don't have access to the source code, but I assume developers will test their code in the regular fashion first before doing any kind of packaging, so I think it's safe to assume your error message will get printed when needed.
Version with verbose messages:
import traceback
import re
def called_from_main_shield():
print("Calling introspect")
tb = traceback.extract_stack()
print(traceback.format_stack())
print(f"line={tb[0].line} lineno={tb[0].lineno} file={tb[0].filename}")
try:
with open(tb[0].filename, mode="rt") as f:
found_main_shield = False
for i, line in enumerate(f):
if re.search(r"__name__.*['\"]__main__['\"]", line):
found_main_shield = True
if i == tb[0].lineno:
print(f"found_main_shield={found_main_shield}")
return found_main_shield
except:
print("Coulnd't inspect stack, let's pretend the code is OK...")
return True
print(called_from_main_shield())
if __name__ == "__main__":
print(called_from_main_shield())
In the output, we see that the first called to called_from_main_shield returns False, while the second returns True:
$ python3 introspect.py
Calling introspect
[' File "introspect.py", line 24, in <module>\n print(called_from_main_shield())\n', ' File "introspect.py", lin
e 7, in called_from_main_shield\n print(traceback.format_stack())\n']
line=print(called_from_main_shield()) lineno=24 file=introspect.py
found_main_shield=False
False
Calling introspect
[' File "introspect.py", line 27, in <module>\n print(called_from_main_shield())\n', ' File "introspect.py", lin
e 7, in called_from_main_shield\n print(traceback.format_stack())\n']
line=print(called_from_main_shield()) lineno=27 file=introspect.py
found_main_shield=True
True
More concise version:
def called_from_main_shield():
tb = traceback.extract_stack()
try:
with open(tb[0].filename, mode="rt") as f:
found_main_shield = False
for i, line in enumerate(f):
if re.search(r"__name__.*['\"]__main__['\"]", line):
found_main_shield = True
if i == tb[0].lineno:
return found_main_shield
except:
return True
Now, it's not super elegant to use re.search() like I did, but it should be reliable enough. Warning: since I defined this function in my main script, I had to make sure that line didn't match itself, which is why I used ['\"] to match the quotes instead of using a simpler RE like __name__.*__main__. Whatever you chose, just make sure it's flexible enough to match all legal variants of that code, which is what I aimed for.
I think the best you can do is to try execute the code and provide a hint if it fails. Something like this:
# my_module.py
import sys # Use sys.stderr to print to the error stream.
from multiprocessing import Pool
def f(x):
return x*x
def my_function(x_max):
try:
with Pool(5) as p:
print(p.map(f, range(x_max)))
except RuntimeError as e:
print("Whoops! Did you perhaps forget to put the code in `if __name__ == '__main__'`?", file=sys.stderr)
raise e
This is of course not a 100% solution, as there might be several other reasons the code throws a RuntimeError.
If it doesn't raise a RuntimeError, an ugly solution would be to explicitly force the user to pass in the name of the module.
# my_module.py
from multiprocessing import Pool
def f(x):
return x*x
def my_function(x_max, module):
"""`module` must be set to `__name__`, for example `my_function(10, __name__)`"""
if module == '__main__':
with Pool(5) as p:
print(p.map(f, range(x_max)))
else:
raise Exception("This can only be called from the main module.")
And call it as:
# Script_of_a_careless_user.py
from my_module import my_function
my_function(10, __name__)
This makes it very explicit to the user.
Struggling to succinctly describe this in the title...
I have a module I want to test:
mod.py:
import subprocess
class MyStuff(object):
def my_fun(self):
try:
print subprocess
out = subprocess.check_output(["echo", "pirates"])
except subprocess.CalledProcessError:
print "caught exception"
And the test module test_mod.py:
import unittest
import mock
from mod import MyStuff
import subprocess
class Tests(unittest.TestCase):
def setUp(self):
self.patched_subprocess = mock.patch(
'mod.subprocess', autospec=True)
self.mock_subprocess = self.patched_subprocess.start()
self.my_stuff = MyStuff()
def tearDown(self):
self.patched_subprocess.stop()
def test_my_fun(self):
self.mock_subprocess.check_output = mock.Mock(
side_effect=subprocess.CalledProcessError(0, "hi", "no"))
with self.assertRaises(subprocess.CalledProcessError):
out = self.my_stuff.my_fun()
if __name__ == '__main__':
unittest.main()
I then run python test_mod.py and I see the following output:
<NonCallableMagicMock name='subprocess' spec='module' id='140654009377872'>
.
----------------------------------------------------------------------
Ran 1 test in 0.007s
OK
I'm pleased that the subprocess object has been mocked, but why is the print "caught exception" statement not executed? I'm guessing it's because the real exception getting throw in test_mod.subprocess.CalledProcessException and not subprocess.CalledProcessException as I intend, but I'm not sure how to resolve that. Any suggestion? Thanks for your time.
I solved this eventually...
The problem was I was mocking the entire subprocess module, which included the CalledProcessError exception! That's why it didn't seem to match the exception I was raising in my test module, because it was a completely different object.
The fix is to mock just subprocess.check_output, D'oh!
I've got a script that imports modules dynamically based on configuration. I'm trying to implement a daemon context (using the python-daemon module) on the script, and it seems to be interfering with python's ability to find the modules in question.
Insite mymodule/__init__.py in setup() I do this:
load_modules(args, config, logger)
try:
with daemon.DaemonContext(
files_preserve = getLogfileHandlers(logger)
):
main_loop(config)
I've got a call to setup() inside mymodule/__main__.py and I'm loading the whole thing this way:
PYTHONPATH=. python -m mymodule
This works fine, but a listening port that gets set up inside load_modules() is closed by the newly added daemon context, so I want to move that function call inside the daemon context like so:
try:
with daemon.DaemonContext(
files_preserve = getLogfileHandlers(logger)
):
load_modules(args, config, logger)
main_loop(config)
Modules are loaded inside load_modules() this way:
for mysubmodule in modules:
try:
i = importlib.import_module("mymodule.{}".format(mysubmodule))
except ImportError as err:
logger.error("import of mymodule.{} failed: {}".format(
mysubmodule, err))
With load_modules() outside the daemon context this works fine. When I move it inside the daemon context it seems to be unable to find the modules it's looking for. I get this:
import of mymodule.submodule failed: No module named submodule
It looks like some sort of namespace problem -- I note that the exception only refers to the submodule portion of the module name I try to import -- but I've compared everything I can think of inside and outside the daemon context, and I can't find the important difference. sys.path is unchanged, the daemon context isn't clearing the environemnt, or chrooting. The cwd changes to / of course, but that shouldn't have any effect on python's ability to find modules, since the absolute path to . appears in sys.path.
What am I missing here?
EDIT: I'm adding an SSCCE to make the situation more clear. The following three files create a module called "mymodule" that can be run from the command line as PYTHONPATH=. python -m mymodule. There are two calls to load_module() in __init__.py, one commented out. You can demonstrate the problem by swapping which one is commented.
mymodule/__main__.py
from mymodule import setup
import sys
if __name__ == "__main__":
sys.exit(setup())
mymodule/__init__.py
import daemon
import importlib
import logging
def main_loop():
logger = logging.getLogger('loop')
logger.debug("Code runs here.")
def load_module():
logger = logging.getLogger('load_module')
submodule = 'foo'
try:
i = importlib.import_module("mymodule.{}".format(submodule))
except ImportError as e:
logger.error("import of mymodule.{} failed: {}".format(
submodule, e))
def setup_logging():
logfile = 'mymodule.log'
fh = logging.FileHandler(logfile)
root_logger = logging.getLogger()
root_logger.addHandler(fh)
root_logger.setLevel(logging.DEBUG)
def get_logfile_handlers(logger):
handlers = []
for handler in logger.handlers:
handlers.append(handler.stream.fileno())
return handlers
def setup():
setup_logging()
logger = logging.getLogger()
# load_module()
with daemon.DaemonContext(
files_preserve = get_logfile_handlers(logger)
):
load_module()
main_loop()
mymodule/foo.py
import logging
logger=logging.getLogger('foo')
logger.debug("Inside foo.py")
I spent a good 4 hours trying to work this one out when I hit it in my own project. The clue is here:
If the module being imported is supposed to be contained within a package then the second argument passed to find_module(), __path__ on the parent package, is used as the source of paths.
(From https://docs.python.org/2/reference/simple_stmts.html#import)
Once you have successfully imported mymodule, python2 no longer uses sys.path to search for the submodules, it uses sys.modules["mymodule"].__path__. When you import mymodule, python2 unhelpfully sets its __path__ to the relative directory it was stored in:
mymodule.__path__ = ['mymodule']
After daemonizing, python's CWD is set to / and the only place the import internals search for mysubmodule is in /mymodule.
I worked around this by using os.chdir() to change CWD back to the old dir after daemonizing:
oldcwd = os.getcwd()
with DaemonizeContext():
os.chdir(oldcwd)
# ... daemon things
This works fine, but a listening port that gets set up inside load_modules() is closed by the newly added daemon context, so
No. load_modules() should load modules. It should not open ports.
If you need to preserve a file or socket opened outside the context, pass it to files_preserve. If possible, it is preferred to simply open files and such inside the context instead, as I suggest above.
I had one script with custom exception classes in the form of:
class DirectionError(Exception):
pass
I had my functions in the same script in the form of:
def func1(x):
if x == 1:
raise DirectionError
I put my function calls into a try/except/except block in the form of:
try:
func1(2)
except DirectionError:
logging.debug("Custom error message")
sys.exit()
except:
logging.debug(traceback.format_exc())
I subsequently moved the functions into a seperate mytools.py file. I import the mytools.py file into my main python script.
I moved the custom exception classes into the mytools.py file but exception is not reaching the main python script.
How do I get those functions in the mytools.py file to send the exception back to the try/except block in my main python script?
Thanks.
It depends on how did you import mytools.
If you imported it as
import mytools
then changing:
except DirectionError:
to:
except mytools.DirectionError:
should work.
If you imported only your function with:
from mytools import func1
change it to:
from mytools import func1, DirectionError
Basically, you need to import the DirectionError class into your main code and reference it correctly.
Besides, your exception raise only when you call func1(1), and you are calling func1(2).
Define the exception in its own scriptfile and then import that file into both mytools.py and your main script
How can I log my Python exceptions?
try:
do_something()
except:
# How can I log my exception here, complete with its traceback?
Use logging.exception from within the except: handler/block to log the current exception along with the trace information, prepended with a message.
import logging
LOG_FILENAME = '/tmp/logging_example.out'
logging.basicConfig(filename=LOG_FILENAME, level=logging.DEBUG)
logging.debug('This message should go to the log file')
try:
run_my_stuff()
except:
logging.exception('Got exception on main handler')
raise
Now looking at the log file, /tmp/logging_example.out:
DEBUG:root:This message should go to the log file
ERROR:root:Got exception on main handler
Traceback (most recent call last):
File "/tmp/teste.py", line 9, in <module>
run_my_stuff()
NameError: name 'run_my_stuff' is not defined
Use exc_info options may be better, remains warning or error title:
try:
# coode in here
except Exception as e:
logging.error(e, exc_info=True)
My job recently tasked me with logging all the tracebacks/exceptions from our application. I tried numerous techniques that others had posted online such as the one above but settled on a different approach. Overriding traceback.print_exception.
I have a write up at http://www.bbarrows.com/ That would be much easier to read but Ill paste it in here as well.
When tasked with logging all the exceptions that our software might encounter in the wild I tried a number of different techniques to log our python exception tracebacks. At first I thought that the python system exception hook, sys.excepthook would be the perfect place to insert the logging code. I was trying something similar to:
import traceback
import StringIO
import logging
import os, sys
def my_excepthook(excType, excValue, traceback, logger=logger):
logger.error("Logging an uncaught exception",
exc_info=(excType, excValue, traceback))
sys.excepthook = my_excepthook
This worked for the main thread but I soon found that the my sys.excepthook would not exist across any new threads my process started. This is a huge issue because most everything happens in threads in this project.
After googling and reading plenty of documentation the most helpful information I found was from the Python Issue tracker.
The first post on the thread shows a working example of the sys.excepthook NOT persisting across threads (as shown below). Apparently this is expected behavior.
import sys, threading
def log_exception(*args):
print 'got exception %s' % (args,)
sys.excepthook = log_exception
def foo():
a = 1 / 0
threading.Thread(target=foo).start()
The messages on this Python Issue thread really result in 2 suggested hacks. Either subclass Thread and wrap the run method in our own try except block in order to catch and log exceptions or monkey patch threading.Thread.run to run in your own try except block and log the exceptions.
The first method of subclassing Thread seems to me to be less elegant in your code as you would have to import and use your custom Thread class EVERYWHERE you wanted to have a logging thread. This ended up being a hassle because I had to search our entire code base and replace all normal Threads with this custom Thread. However, it was clear as to what this Thread was doing and would be easier for someone to diagnose and debug if something went wrong with the custom logging code. A custome logging thread might look like this:
class TracebackLoggingThread(threading.Thread):
def run(self):
try:
super(TracebackLoggingThread, self).run()
except (KeyboardInterrupt, SystemExit):
raise
except Exception, e:
logger = logging.getLogger('')
logger.exception("Logging an uncaught exception")
The second method of monkey patching threading.Thread.run is nice because I could just run it once right after __main__ and instrument my logging code in all exceptions. Monkey patching can be annoying to debug though as it changes the expected functionality of something. The suggested patch from the Python Issue tracker was:
def installThreadExcepthook():
"""
Workaround for sys.excepthook thread bug
From
http://spyced.blogspot.com/2007/06/workaround-for-sysexcepthook-bug.html
(https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1230540&group_id=5470).
Call once from __main__ before creating any threads.
If using psyco, call psyco.cannotcompile(threading.Thread.run)
since this replaces a new-style class method.
"""
init_old = threading.Thread.__init__
def init(self, *args, **kwargs):
init_old(self, *args, **kwargs)
run_old = self.run
def run_with_except_hook(*args, **kw):
try:
run_old(*args, **kw)
except (KeyboardInterrupt, SystemExit):
raise
except:
sys.excepthook(*sys.exc_info())
self.run = run_with_except_hook
threading.Thread.__init__ = init
It was not until I started testing my exception logging I realized that I was going about it all wrong.
To test I had placed a
raise Exception("Test")
somewhere in my code. However, wrapping a a method that called this method was a try except block that printed out the traceback and swallowed the exception. This was very frustrating because I saw the traceback bring printed to STDOUT but not being logged. It was I then decided that a much easier method of logging the tracebacks was just to monkey patch the method that all python code uses to print the tracebacks themselves, traceback.print_exception.
I ended up with something similar to the following:
def add_custom_print_exception():
old_print_exception = traceback.print_exception
def custom_print_exception(etype, value, tb, limit=None, file=None):
tb_output = StringIO.StringIO()
traceback.print_tb(tb, limit, tb_output)
logger = logging.getLogger('customLogger')
logger.error(tb_output.getvalue())
tb_output.close()
old_print_exception(etype, value, tb, limit=None, file=None)
traceback.print_exception = custom_print_exception
This code writes the traceback to a String Buffer and logs it to logging ERROR. I have a custom logging handler set up the 'customLogger' logger which takes the ERROR level logs and send them home for analysis.
You can log all uncaught exceptions on the main thread by assigning a handler to sys.excepthook, perhaps using the exc_info parameter of Python's logging functions:
import sys
import logging
logging.basicConfig(filename='/tmp/foobar.log')
def exception_hook(exc_type, exc_value, exc_traceback):
logging.error(
"Uncaught exception",
exc_info=(exc_type, exc_value, exc_traceback)
)
sys.excepthook = exception_hook
raise Exception('Boom')
If your program uses threads, however, then note that threads created using threading.Thread will not trigger sys.excepthook when an uncaught exception occurs inside them, as noted in Issue 1230540 on Python's issue tracker. Some hacks have been suggested there to work around this limitation, like monkey-patching Thread.__init__ to overwrite self.run with an alternative run method that wraps the original in a try block and calls sys.excepthook from inside the except block. Alternatively, you could just manually wrap the entry point for each of your threads in try/except yourself.
You can get the traceback using a logger, at any level (DEBUG, INFO, ...). Note that using logging.exception, the level is ERROR.
# test_app.py
import sys
import logging
logging.basicConfig(level="DEBUG")
def do_something():
raise ValueError(":(")
try:
do_something()
except Exception:
logging.debug("Something went wrong", exc_info=sys.exc_info())
DEBUG:root:Something went wrong
Traceback (most recent call last):
File "test_app.py", line 10, in <module>
do_something()
File "test_app.py", line 7, in do_something
raise ValueError(":(")
ValueError: :(
EDIT:
This works too (using python 3.6)
logging.debug("Something went wrong", exc_info=True)
What I was looking for:
import sys
import traceback
exc_type, exc_value, exc_traceback = sys.exc_info()
traceback_in_var = traceback.format_tb(exc_traceback)
See:
https://docs.python.org/3/library/traceback.html
Uncaught exception messages go to STDERR, so instead of implementing your logging in Python itself you could send STDERR to a file using whatever shell you're using to run your Python script. In a Bash script, you can do this with output redirection, as described in the BASH guide.
Examples
Append errors to file, other output to the terminal:
./test.py 2>> mylog.log
Overwrite file with interleaved STDOUT and STDERR output:
./test.py &> mylog.log
Here is a version that uses sys.excepthook
import traceback
import sys
logger = logging.getLogger()
def handle_excepthook(type, message, stack):
logger.error(f'An unhandled exception occured: {message}. Traceback: {traceback.format_tb(stack)}')
sys.excepthook = handle_excepthook
This is how I do it.
try:
do_something()
except:
# How can I log my exception here, complete with its traceback?
import traceback
traceback.format_exc() # this will print a complete trace to stout.
maybe not as stylish, but easier:
#!/bin/bash
log="/var/log/yourlog"
/path/to/your/script.py 2>&1 | (while read; do echo "$REPLY" >> $log; done)
To key off of others that may be getting lost in here, the way that works best with capturing it in logs is to use the traceback.format_exc() call and then split this string for each line in order to capture in the generated log file:
import logging
import sys
import traceback
try:
...
except Exception as ex:
# could be done differently, just showing you can split it apart to capture everything individually
ex_t = type(ex).__name__
err = str(ex)
err_msg = f'[{ex_t}] - {err}'
logging.error(err_msg)
# go through the trackback lines and individually add those to the log as an error
for l in traceback.format_exc().splitlines():
logging.error(l)
Heres a simple example taken from the python 2.6 documentation:
import logging
LOG_FILENAME = '/tmp/logging_example.out'
logging.basicConfig(filename=LOG_FILENAME,level=logging.DEBUG,)
logging.debug('This message should go to the log file')