How to extract error detail from traceback by using python regex?


I want to extract the error detail from tracebacks. The tracebacks are extracted from a log file by using this method, and there are many different kinds of them, like the ones below:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/env/common/test/test/__main__.py", line 5, in <module>
    main()
  File "/root/env/common/test/test/cli/parser.py", line 55, in main
    run_script(args)
  File "/root/env/common/test/test/cli/runner.py", line 124, in run_script
    exec_script(args.script, scope=globals(), root=True)
  File "/root/workspace/group/test_regression_utils.py", line 123, in exec_script
    cli_exec_script(*args,**kwargs)
  File "/root/env/common/test/test/cli/runner.py", line 186, in exec_script
    exec(compile(code, scriptpath, 'exec')) in scope
  File "shiju_12.py", line 30, in <module>
  File "/root/moaworkspace/group/testscript/utils/shiju_public.py", line 37, in shiju_move
Exception: close failed!
EndOfStream
EndOfStream
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/env/common/test/test/__main__.py", line 5, in <module>
    main()
  File "/root/env/common/test/test/cli/parser.py", line 55, in main
    run_script(args)
  File "/root/env/common/test/test/cli/runner.py", line 124, in run_script
    exec_script(args.script, scope=globals(), root=True)
  File "/root/env/common/test/test/cli/runner.py", line 186, in exec_script
    exec(compile(code, scriptpath, 'exec')) in scope
  File "/root/env/common/mator/mator/mator.py", line 520, in start
    raise IOError("RPC server not started!")
IOError: RPC server not started
The expected result is:
("XXXX", "Exception: close failed!")
("XXXX", "IOError: RPC server not started")
I have tried detail_regex = r'Traceback.*\n(\s+.*\n)*(.*)\n*'
The result for the second traceback is right, but the result for the first traceback is ("Exception: close failed!", "EndOfStream").
Any ideas?

If I change your pattern to Traceback.*\n(\s+.*\n)*(.*?)\n* it works for the given example. I am not sure though, that it solves your problem completely.
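A variation that may be more robust is sketched below. It assumes the log keeps the standard traceback layout, in which every frame and source line is indented and the exception line is the first unindented line after the header. The names LAST_LINE_PATTERN and extract_errors and the shortened sample log are illustrative, not taken from the question.

```python
import re

# Header, then any number of indented frame/source lines, then the first
# unindented line, which is captured as the exception line.
LAST_LINE_PATTERN = re.compile(
    r"^Traceback \(most recent call last\):\n(?:[ \t]+.*\n)*(\S.*)$",
    re.MULTILINE,
)


def extract_errors(log_text):
    """Return the exception line of every traceback found in log_text."""
    return LAST_LINE_PATTERN.findall(log_text)


# Shortened stand-in for the log in the question (assumed layout).
sample_log = (
    "Traceback (most recent call last):\n"
    '  File "shiju_12.py", line 30, in <module>\n'
    '  File "shiju_public.py", line 37, in shiju_move\n'
    "Exception: close failed!\n"
    "EndOfStream\n"
    "EndOfStream\n"
    "Traceback (most recent call last):\n"
    '  File "mator.py", line 520, in start\n'
    '    raise IOError("RPC server not started!")\n'
    "IOError: RPC server not started\n"
)

errors = extract_errors(sample_log)
print(errors)  # ['Exception: close failed!', 'IOError: RPC server not started']
```

Because the capture stops at the end of the first unindented line, trailing lines such as EndOfStream are ignored. An exception message that spans several lines would be cut to its first line.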

The pattern Traceback \(most recent call last\):(?:\n.*)+?\n(.*?(?:Exception|Error):)\s*(.+) will match:
Traceback \(most recent call last\): Match the header literally
(?:\n.*)+? Non-capturing group, repeated as few times as possible, matching a newline followed by 0+ times any character
\n(.*?(?:Exception|Error):) Match a newline, then capture 0+ characters (non-greedy) followed by Exception or Error and a colon
\s* Match 0+ whitespace characters
(.+) Capturing group, 1+ times any character
For example:
import re
import traceback
EXCEPTION_PATTERN = re.compile(
    r"Traceback \(most recent call last\):(?:\n.*)+?\n(.*?(?:Exception|Error):)\s*(.+)"
)
try:
    hello()
except Exception as ex:
    try:
        1/0
    except Exception as ex:
        ex_match = EXCEPTION_PATTERN.findall(traceback.format_exc())
        print(ex_match)
output:
[('NameError:', "name 'hello' is not defined"), ('ZeroDivisionError:', 'division by zero')]
If you need to count Exception / Error with details:
for i, data in enumerate(ex_match, start=1):
    print(f"No.{i} - {data[0]} {data[1]}")
output:
No.1 - NameError: name 'hello' is not defined
No.2 - ZeroDivisionError: division by zero

Related

Why same python re pattern regex works in single line but not in multi line

import re
regex =re.compile('''
((.*\n){2}
Cannot display: file marked as a binary type.\n
(.*\n){1})
''', re.X)
The code above throws this error:
Traceback (most recent call last):
  File "/test.py", line 8, in <module>
    ''', re.X)
  File "/usr/lib64/python2.7/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/lib64/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
Writing the regex on a single line, however, works fine, with no error:
regex = re.compile('((.*\n){2}Cannot display: file marked as a binary type.\n(.*\n){1})')
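A likely explanation, offered as a hedged reading rather than a confirmed answer: with re.X (verbose mode), unescaped whitespace in the pattern is ignored. In a non-raw string, \n is already a real newline by the time re sees it, so it is dropped as whitespace together with the spaces in the sentence, and (.*\n){2} effectively becomes (.*){2}, which appears to be what Python 2.7 rejects with "nothing to repeat". A minimal sketch of a verbose version that keeps the newlines and spaces, using a raw string and [ ] for literal spaces, is shown below. The name BINARY_NOTICE and the sample text are made up for illustration, and the final dot is escaped here, which the original pattern did not do.

```python
import re

# Raw string: the regex engine receives backslash-n and interprets it itself,
# so verbose mode does not discard it. Literal spaces are written as [ ].
BINARY_NOTICE = re.compile(r'''
    ((.*\n){2}                                                    # two lines before the notice
    Cannot[ ]display:[ ]file[ ]marked[ ]as[ ]a[ ]binary[ ]type\.\n
    (.*\n){1})                                                    # one line after the notice
''', re.X)

# Hypothetical input resembling a diff entry for a binary file.
sample = (
    "Index: logo.png\n"
    "=====\n"
    "Cannot display: file marked as a binary type.\n"
    "svn:mime-type = image/png\n"
)

match = BINARY_NOTICE.search(sample)
print(match is not None)
```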

Error extracting an archive using `tarfile`

I am getting an error while trying to extract a .tar.gz archive using the tarfile library.
Here is the relevant code snippet:
# `gzip_archive_bytes_content` is the content of the gzip archive, in "bytes" format
repo_sources_file_object = io.BytesIO(gzip_archive_bytes_content)
repo_sources_tar_object = tarfile.TarFile(fileobj=repo_sources_file_object)
repo_sources_tar_object.extractall(path="/tmp/")
This is the error I get:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/tarfile.py", line 186, in nti
    s = nts(s, "ascii", "strict")
  File "/usr/local/lib/python3.7/tarfile.py", line 170, in nts
    return s.decode(encoding, errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9a in position 1: ordinal not in range(128)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/tarfile.py", line 2289, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/local/lib/python3.7/tarfile.py", line 1095, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/usr/local/lib/python3.7/tarfile.py", line 1037, in frombuf
    chksum = nti(buf[148:156])
  File "/usr/local/lib/python3.7/tarfile.py", line 189, in nti
    raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/site-packages/my-package/__main__.py", line 87, in <module>
    function(**function_args)
  File "/usr/local/lib/python3.7/site-packages/my-package/chart.py", line 107, in reinstall
    install()
  File "/usr/local/lib/python3.7/site-packages/my-package/chart.py", line 89, in install
    repo_sources_tar_object = tarfile.TarFile(fileobj=repo_sources_file_object)
  File "/usr/local/lib/python3.7/tarfile.py", line 1484, in __init__
    self.firstmember = self.next()
  File "/usr/local/lib/python3.7/tarfile.py", line 2301, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header
Python version: 3.7

I switched from instantiating a tarfile.TarFile object directly to calling tarfile.open(), and that fixed it:
repo_sources_tar_object = tarfile.open(fileobj=repo_sources_file_object)
There is actually a warning about this in the documentation:
Do not use this class directly: use tarfile.open() instead.
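To make the difference concrete, here is a small self-contained sketch. It builds a .tar.gz archive in memory and extracts it through tarfile.open(), which detects the gzip compression on its own when the mode is left at its default. The helper names make_tar_gz_bytes and extract_tar_bytes, the file name hello.txt and the temporary directory are assumptions made for the demo, not part of the original code.

```python
import io
import os
import tarfile
import tempfile


def make_tar_gz_bytes(name, payload):
    """Build a one-file .tar.gz archive in memory (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def extract_tar_bytes(archive_bytes, destination):
    """Extract an in-memory archive and return the names found in destination."""
    # The default mode "r" lets tarfile.open() handle compression transparently.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as archive:
        archive.extractall(path=destination)
    return sorted(os.listdir(destination))


gzip_archive_bytes_content = make_tar_gz_bytes("hello.txt", b"hello\n")
with tempfile.TemporaryDirectory() as tmp_dir:
    extracted_names = extract_tar_bytes(gzip_archive_bytes_content, tmp_dir)
print(extracted_names)  # ['hello.txt']
```

Only extract archives from sources you trust, since member paths inside an archive can point outside the destination directory.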

The best practice is to use a context manager so that the archive is closed automatically when the job is done.
On old Python versions, where TarFile objects are not context managers, one could write:
import contextlib
import io
import tarfile
gzip_archive_bytes_content = b"..."
repo_sources_file_object = io.BytesIO(gzip_archive_bytes_content)
with contextlib.closing(tarfile.open(fileobj=repo_sources_file_object)) as arch:
    arch.extractall(path="/tmp/")
Since Python 2.7 and 3.2, TarFile objects support the with statement themselves, and tarfile.open() returns a TarFile object, so contextlib.closing is not needed there.
So you could write:
with tarfile.open(...) as arch:
    ...

Output parameter in "qgis:heatmapkerneldensityestimation" is not working

This is the code:
import os
import processing
from qgis.core import QgsProcessingFeedback, QgsVectorLayer
# defines the folder
folder = "C:/Users/Pueyo/Google Drive/Consultoria/Mapa escolar/Dades UMAT/heatmap"
# gets all the files in the folder
filelist = os.listdir(folder)
feedback = QgsProcessingFeedback()
# if the file is a shapefile, run the algorithm
for file in filelist:
    if file.endswith('.shp'):
        layer = QgsVectorLayer(folder + file, file, 'ogr')
        file2 = file.replace(".shp", "")
        output = str(folder + "/hm200_" + file2 + ".tif")
        parameters = {'INPUT': layer, 'RADIUS': 200, 'PIXEL_SIZE': 5, 'OUTPUT': output}
        processing.runAndLoadResults('qgis:heatmapkerneldensityestimation', parameters, feedback=feedback)
A screen capture:
https://i.stack.imgur.com/2vLxn.png
This is the error I get:
Traceback (most recent call last):
  File "C:\PROGRA~1\QGIS3~1.4\apps\Python37\lib\code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "<string>", line 15, in <module>
  File "C:/PROGRA~1/QGIS3~1.4/apps/qgis-ltr/./python/plugins\processing\tools\general.py", line 138, in runAndLoadResults
    return Processing.runAlgorithm(alg, parameters=parameters, onFinish=handleAlgorithmResults, feedback=feedback, context=context)
  File "C:/PROGRA~1/QGIS3~1.4/apps/qgis-ltr/./python/plugins\processing\core\Processing.py", line 183, in runAlgorithm
    raise QgsProcessingException(msg)
_core.QgsProcessingException: There were errors executing the algorithm.
The problem must be in how the output string is defined, because I tried the same code with a path written directly in the parameters dictionary and it worked.
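One thing worth checking, offered as a guess rather than a confirmed diagnosis: the input layer path is built as folder + file, with no separator between the two, while the output path does add a "/". The sketch below uses only the standard library to show the difference. The function build_heatmap_paths, the shortened folder and the file name schools.shp are hypothetical.

```python
import os


def build_heatmap_paths(folder, filename, prefix="hm200_"):
    """Return (layer_path, output_path) for one shapefile in folder."""
    layer_path = os.path.join(folder, filename)            # inserts the separator
    stem, _extension = os.path.splitext(filename)          # "schools.shp" -> "schools"
    output_path = os.path.join(folder, prefix + stem + ".tif")
    return layer_path, output_path


folder = "C:/Users/Pueyo/heatmap"  # shortened, hypothetical folder
layer_path, output_path = build_heatmap_paths(folder, "schools.shp")

print(folder + "schools.shp")  # plain concatenation: "heatmapschools.shp" at the end
print(layer_path)
print(output_path)
```

If the layer path is wrong, the layer object is created but invalid, so checking layer.isValid() before running the algorithm would be a cheap way to confirm or rule this out.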

Check if a variable exists

I am using the following function, isset(var), to determine whether a variable exists.
def isset(variable):
    try:
        variable
    except NameError:
        return False
    else:
        return True
It returns True if the variable exists. But if the variable doesn't exist, I get the following:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/lenovo/pyth/master/vzero/__main__.py", line 26, in <module>
    ss.run()
  File "vzero/ss.py", line 4, in run
    snap()
  File "vzero/ss.py", line 7, in snap
    core.display()
  File "vzero/core.py", line 77, in display
    stdout(session(username()))
  File "vzero/core.py", line 95, in session
    if isset(ghi): #current_sessions[user]):
NameError: global name 'ghi' is not defined
I don't want all these errors. I just want it to return False, with no output. How can I do this?

Instead of writing a complex helper function isset and calling it
if not isset('variable_name'):
    # handle the situation
do the following in the place where you want to check for the presence of the variable:
try:
    # some code with the variable in question
except NameError:
    # handle the situation
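The reason isset(ghi) cannot work is that Python evaluates the argument ghi at the call site, before isset is entered, so the NameError is raised outside the function's try block. A minimal sketch of the two usual alternatives follows: catching NameError where the name is used, or looking the name up as a string in a namespace dictionary. The helper is_defined and the variable names are illustrative only.

```python
def is_defined(name, namespace):
    """Return True if the string `name` is a key of the namespace dict."""
    return name in namespace


existing_value = 42

# Option 1: catch NameError at the place where the name is used.
try:
    ghi  # noqa: F821 (intentionally undefined)
except NameError:
    ghi_exists = False
else:
    ghi_exists = True

# Option 2: pass the name as a string and look it up explicitly.
found_existing = is_defined("existing_value", globals())
found_missing = is_defined("ghi", globals())

print(ghi_exists, found_existing, found_missing)  # False True False
```

In most code it is simpler to make sure the name always exists, for example by initialising it to None, and then test for None instead.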

Dynamically alter traceback/stack frame function name

From time to time, I write a generic function in Python that gets called multiple times with different arguments. Usually, this is driven by a definition somewhere else.
For example:
def issue_sql_query(name, select_stmt):
    ...
QUERIES = [
    ("get_user_rows", "SELECT name, rowid FROM table WHERE type == 'USER';"),
    ...
]
results = []
for name, select_stmt in QUERIES:
    results.append((name, issue_sql_query(name, select_stmt)))
If there's an exception in the generic function (i.e., issue_sql_query or somewhere deeper), I have relatively little info in the traceback to identify which definition caused the error.
What I'd like to do is dynamically rename or augment the function name/stack frame so that tracebacks would include some identifying info.
What would be nice is something like this:
  File "test.py", line 21, in <module>
    results.append((name, issue_sql_query(select_stmt)))
  File "test.py", line 11, in issue_sql_query(name="get_user_rows")
    raise RuntimeError("Some error")
RuntimeError: Some error
I could, of course, stick exception handling at the generic points and rebuild the exception using traceback to have more context, which is pretty straightforward and likely the right choice. It gets a little tricky when you have multiple levels of generic functions, but that's certainly possible to handle.
Any other ideas on how to accomplish this? Did I miss some easy way to change the stack frame name?
Edit:
Adding an example of a traceback that is not very useful:
Traceback (most recent call last):
  File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "c:\tmp\report.py", line 767, in <module>
    main()
  File "c:\tmp\report.py", line 750, in main
    charts.append(report.get_chart(title, definition))
  File "c:\tmp\report.py", line 614, in get_chart
    return self.get_bar_chart(title, definition)
  File "c:\tmp\report.py", line 689, in get_bar_chart
    definition, cursor, **kwargs))
  File "c:\tmp\report.py", line 627, in create_key_table
    for row in cursor.execute(full_select_stmt):
sqlite3.OperationalError: near "==": syntax error

You could create wrapper functions for each named query so that you can control the exception thrown:
>>> fns = {}
>>> def issue_sql_query(name, stmt):
...     if name not in fns:
...         def f(name, stmt):
...             # run query here
...             raise Exception(name)
...
...         fns[name] = f
...     return fns[name](name, stmt)
...
>>>
>>> issue_sql_query('b', 'SQL')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in issue_sql_query
  File "<stdin>", line 5, in f
Exception: b
>>> issue_sql_query('a', 'SQL')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in issue_sql_query
  File "<stdin>", line 5, in f
Exception: a
>>>
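If the goal is to have the identifying name appear in the frame line itself, one possible variation on the wrapper idea is to copy the function with a renamed code object. This is only a sketch under assumptions: it needs Python 3.8+ for code.replace(), it relies on tracebacks reading the name from the code object, and the helper names with_traceback_name and run_labeled_query are made up here.

```python
import traceback
import types


def with_traceback_name(func, label):
    """Return a copy of func whose frames show up as `label` in tracebacks."""
    replacements = {"co_name": label}
    if hasattr(func.__code__, "co_qualname"):  # present on Python 3.11+
        replacements["co_qualname"] = label
    renamed_code = func.__code__.replace(**replacements)  # Python 3.8+
    renamed = types.FunctionType(
        renamed_code, func.__globals__, label, func.__defaults__, func.__closure__
    )
    renamed.__kwdefaults__ = func.__kwdefaults__
    return renamed


def issue_sql_query(name, select_stmt):
    # Stand-in for the real query code; always fails for the demo.
    raise RuntimeError("Some error")


def run_labeled_query(name, select_stmt):
    labeled = with_traceback_name(issue_sql_query, "issue_sql_query[%s]" % name)
    return labeled(name, select_stmt)


try:
    run_labeled_query("get_user_rows", "SELECT 1;")
except RuntimeError as error:
    frame_names = [frame.name for frame in traceback.extract_tb(error.__traceback__)]

print(frame_names[-1])  # issue_sql_query[get_user_rows]
```

Catching the exception and re-raising it with the query name added to the message, as the question itself suggests, is the more conventional route. The renaming trick mainly helps when the exception type and message must stay untouched.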
