Parse PostgreSQL - pycparser.plyparser.ParseError before: pgwin32_signal_event

I need to parse the source code of an open-source project, PostgreSQL, using pycparser.
While parsing it, the following error is raised:
Traceback (most recent call last):
File "examples\using_cpp_libc.py", line 48, in <module>
getAllFiles(projectName)
File "examples\using_cpp_libc.py", line 29, in getAllFiles
ast = parse_file(dirName+'\\'+fname, use_cpp = True, cpp_path = 'cpp', cpp_args = [r'-nostdinc',r'-Iutils/fake_libc_include',r'-Iprojects/postgresql/src/include'])
File "G:\python\pycparser-master\pycparser\__init__.py", line 92, in parse_file
return parser.parse(text, filename)
File "G:\python\pycparser-master\pycparser\c_parser.py", line 152, in parse
debug=debuglevel)
File "G:\python\pycparser-master\pycparser\ply\yacc.py", line 334, in parse
return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
File "G:\python\pycparser-master\pycparser\ply\yacc.py", line 1204, in parseopt_notrack
tok = call_errorfunc(self.errorfunc, errtoken, self)
File "G:\python\pycparser-master\pycparser\ply\yacc.py", line 193, in call_errorfunc
r = errorfunc(token)
File "G:\python\pycparser-master\pycparser\c_parser.py", line 1838, in p_error
column=self.clex.find_tok_column(p)))
File "G:\python\pycparser-master\pycparser\plyparser.py", line 67, in _parse_error
raise ParseError("%s: %s" % (coord, msg))
pycparser.plyparser.ParseError: projects/postgresql/src/include/pg_config_os.h:366:15: before: pgwin32_signal_event
I am using postgresql-9.6.9, built with Visual Studio Express 2017 on Windows 10 (64-bit).

The blog post you quoted in the comment is the canonical resource. Parsing large C projects is not easy - they have their own quirks - so it takes work. I doubt it's resolvable within the confines of a Stack Overflow question.
You need to start tackling the issues one by one. For example, look at the pgwin32_signal_event token in pg_config_os.h: why can't it be parsed? Perhaps its type is unparsable? Was the type defined at all? Could it be added to a "fake" header? Unfortunately, there's no easy way to do this other than working through the issues one at a time.
Be sure to preprocess the file you're parsing first, dumping the full preprocessed version into a single .c file - this gets all the types into a single file you can work with.
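To make that "one by one" workflow a little less tedious, here is a minimal sketch of a helper that pulls the file, line and column out of a pycparser error message and prints the surrounding source lines. The function names parse_error_location and show_context are made up for this example, and the regular expression only assumes the "path:line:col: before: token" shape seen in the traceback above.

```python
import re

# Assumed message shape, taken from the traceback in the question:
#   "path:line:col: before: token"
ERROR_RE = re.compile(
    r"^(?P<path>.+):(?P<line>\d+):(?P<col>\d+): before: (?P<token>.+)$"
)


def parse_error_location(message):
    """Split an error message into (path, line, column, token), or return None."""
    match = ERROR_RE.match(message.strip())
    if match is None:
        return None
    return (
        match.group("path"),
        int(match.group("line")),
        int(match.group("col")),
        match.group("token"),
    )


def show_context(source_lines, line_no, radius=2):
    """Return the lines around line_no (1-based), marking the offending one."""
    start = max(line_no - 1 - radius, 0)
    end = min(line_no + radius, len(source_lines))
    out = []
    for idx in range(start, end):
        marker = ">>" if idx == line_no - 1 else "  "
        out.append("%s %4d: %s" % (marker, idx + 1, source_lines[idx].rstrip("\n")))
    return out


message = ("projects/postgresql/src/include/pg_config_os.h:366:15: "
           "before: pgwin32_signal_event")
location = parse_error_location(message)
```

With the location in hand, you can open the reported file, pass its lines to show_context, and check whether the type used just before the token is defined anywhere the preprocessor could see it.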


Google App Engine dev_appserver.py: watcher_ignore_re flag "is not JSON serializable"

When I run dev_appserver.py with the watcher_ignore_re option, I get an error message saying that the regex is not JSON serializable.
Is this a bug in the development server? Am I using this command improperly? The command and call stack are shown below.
C:\Users\mes65\Documents\MyProject>"C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\bin\dev_appserver.py" ^
--watcher_ignore_re="(.*\.git|.*\.idea|tmp\.py)" ^
"C:\Users\mes65\Documents\MyProject"
WARNING 2018-06-06 09:28:59,161 appinfo.py:1622] lxml version "2.3" is deprecated, use one of: "3.7.3"
INFO 2018-06-06 09:28:59,187 devappserver2.py:120] Skipping SDK update check.
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\dev_appserver.py", line 96, in <module>
_run_file(__file__, globals())
File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\dev_appserver.py", line 90, in _run_file
execfile(_PATHS.script_file(script_name), globals_)
File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 454, in <module>
main()
File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 442, in main
dev_server.start(options)
File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 163, in start
bool(ssl_certificate_paths), options)
File "C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\metrics.py", line 166, in Start
self._cmd_args = json.dumps(vars(cmd_args)) if cmd_args else None
File "C:\Python27\lib\json\__init__.py", line 244, in dumps
return _default_encoder.encode(obj)
File "C:\Python27\lib\json\encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
File "C:\Python27\lib\json\encoder.py", line 270, in iterencode
return _iterencode(o, 0)
File "C:\Python27\lib\json\encoder.py", line 184, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <_sre.SRE_Pattern object at 0x00000000063C2188> is not JSON serializable
This looks like an issue with the Google Analytics code built into the development server (google-cloud-sdk\platform\google_appengine\google\appengine\tools\devappserver2\devappserver2.py, on or around line 316), which tries to send all of your command-line options to Google Analytics. If you clear the analytics client ID by adding the command-line option --google_analytics_client_id= (note: '=' with no value after it), the appserver won't call the analytics code that tries, and fails, to JSON-serialize the SRE object. However, since you are on Windows, I find that --watcher_ignore_re does not work anyway, even once you get past this issue.
There is a comment about this in file_watcher.py:
TODO: b/33178251 - Add watcher_ignore_re support for windows.
I also ran into this usability problem on Windows and was really disappointed. I looked for workarounds but did not find a suitable one.
In the end, I decided to implement watcher_ignore_re support for Windows myself. I put the required changes in my GitHub repo.
In short, the changes are:
Add _watcher_ignore_re and _skip_files_re properties and their setters
Add the import statement from google.appengine.tools.devappserver2 import watcher_common and use it in the newly created def _path_ginored
Filter additional_changes before adding them to the watcher's changed files
To resolve the problem with the non-serializable regex attribute mentioned above, those attributes should be dropped from the serialized dictionary. The fix was added in a subsequent commit and can be checked at metrics.py:185-193.
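As a rough illustration of the "drop them from the serialized dictionary" idea, here is a small standalone sketch. It is not the code from the repo or from metrics.py; the helper name serializable_args and the sample options are invented for the example. It simply skips every option whose value json.dumps rejects, such as a compiled regex.

```python
import argparse
import json
import re


def serializable_args(cmd_args):
    """Return vars(cmd_args) without the values json.dumps cannot encode."""
    safe = {}
    for key, value in vars(cmd_args).items():
        try:
            json.dumps(value)
        except TypeError:
            # e.g. a compiled regex such as the watcher_ignore_re option
            continue
        safe[key] = value
    return safe


# Hypothetical options object standing in for the parsed command line.
options = argparse.Namespace(
    port=8080,
    watcher_ignore_re=re.compile(r"(.*\.git|.*\.idea|tmp\.py)"),
)
payload = json.dumps(serializable_args(options), sort_keys=True)
```

An alternative would be to serialize the regex as its pattern string instead of dropping it, which keeps the information in the payload.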
I hope this helps others enjoy developing on GAE on Windows :)

pycparser ParseError

I am trying to create an AST using pycparser, but the following error is printed:
Traceback (most recent call last):
File "C:\Work\RE\Tools\VarsExporter\BuildExportedDb.py", line 1076, in <module>
main()
File "C:\Work\RE\Tools\VarsExporter\BuildExportedDb.py", line 1032, in main
ast = parse_file(i_file)
File "C:\Python27\lib\site-packages\pycparser\__init__.py", line 93, in parse_file
return parser.parse(text, filename)
File "C:\Python27\lib\site-packages\pycparser\c_parser.py", line 152, in parse
debug=debuglevel)
File "C:\Python27\lib\site-packages\pycparser\ply\yacc.py", line 331, in parse
return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
File "C:\Python27\lib\site-packages\pycparser\ply\yacc.py", line 1199, in parseopt_notrack
tok = call_errorfunc(self.errorfunc, errtoken, self)
File "C:\Python27\lib\site-packages\pycparser\ply\yacc.py", line 193, in call_errorfunc
r = errorfunc(token)
File "C:\Python27\lib\site-packages\pycparser\c_parser.py", line 1761, in p_error
column=self.clex.find_tok_column(p)))
File "C:\Python27\lib\site-packages\pycparser\plyparser.py", line 67, in _parse_error
raise ParseError("%s: %s" % (coord, msg))
ParseError: Objectffly\SerDb.i:43:18: before: __loff_t
What causes this error, and how can I handle it?
Any suggestions on how I can debug it and find out what's going on?
From the pycparser FAQ on GitHub:
C code almost always #includes various header files from the standard C library, like stdio.h. While (with some effort) pycparser can be made to parse the standard headers from any C compiler, it's much simpler to use the provided "fake" standard includes in utils/fake_libc_include. These are standard C header files that contain only the bare necessities to allow valid parsing of the files that use them.
To solve the issue I have successfully used the method described here.
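For reference, a minimal sketch of what using the fake includes can look like from Python. The helper names build_cpp_args and parse_with_fake_libc are made up for this example, and the include paths are placeholders you would adapt to your checkout. parse_file and its use_cpp, cpp_path and cpp_args parameters are pycparser's own.

```python
def build_cpp_args(include_dirs):
    """Build preprocessor arguments that hide the real system headers."""
    args = ["-nostdinc"]  # do not search the compiler's own include directories
    args.extend("-I" + directory for directory in include_dirs)
    return args


def parse_with_fake_libc(path, include_dirs, cpp_path="cpp"):
    """Preprocess `path` with the given include dirs and return pycparser's AST."""
    # Imported lazily so build_cpp_args can be used without pycparser installed.
    from pycparser import parse_file

    return parse_file(
        path,
        use_cpp=True,
        cpp_path=cpp_path,
        cpp_args=build_cpp_args(include_dirs),
    )


# Placeholder paths: the fake headers ship in pycparser's utils/fake_libc_include.
args = build_cpp_args(["utils/fake_libc_include", "myproject/include"])
```

Listing the fake include directory first means that a line such as #include <stdio.h> resolves to the stub header rather than to the real one, which is where names like __loff_t tend to come from.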

Unable to ping managed nodes using ansible-2.0

I downloaded ansible-2.0.0-0.2.alpha2.tar.gz and installed it on my control machine. However, I am now unable to ping any of my machines. Previously, with v1.9.2, I could communicate with them. Now it gives the following error:
Unexpected Exception: lstat() argument 1 must be encoded string without NULL bytes, not str
the full traceback was:
Traceback (most recent call last):
File "/usr/bin/ansible", line 79, in <module>
sys.exit(cli.run())
File "/usr/lib/python2.6/site-packages/ansible/cli/adhoc.py", line 111, in run
inventory = Inventory(loader=loader, variable_manager=variable_manager, host_list=self.options.inventory)
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init__.py", line 77, in __init__
self.parse_inventory(host_list)
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init__.py", line 133, in parse_inventory
host.vars = combine_vars(host.vars, self.get_host_variables(host.name))
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init__.py", line 499, in get_host_variables
self.vars_per_host[hostname] = self.get_host_variables(hostname, vault_password=vault_password)
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init__.py", line 529, in get_host_variables
vars = combine_vars(vars, self.get_host_vars(host))
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init__.py", line 653, in get_host_vars
return self.get_hostgroup_vars(host=host, group=None, new_pb_basedir=new_pb_basedir)
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init__.py", line 702, in _get_hostgroup_vars
base_path = os.path.realpath(os.path.join(basedir, "host_vars/%s" % host.name))
File "/usr/lib64/python2.6/posixpath.py", line 365, in realpath
if islink(component):
File "/usr/lib64/python2.6/posixpath.py", line 132, in islink
st = os.lstat(path)
TypeError: lstat() argument 1 must be encoded string without NULL bytes, not str
Any help would be appreciated.
This is a known bug caused by some Unicode changes made to the playbook parser in 2.0. Several versions of Python shipped with a version of shlex.split() that fails horribly on Unicode input; you likely have one of them installed. The bug has been worked around, and the fix will be included in the next drop. See https://github.com/ansible/ansible/issues/12257

Merging PDF files with Python3

I am writing a small script that needs to merge many one-page pdf files. I want the script to run with Python3 and to have as few dependencies as possible.
For the PDF merging part, I tried using PyPdf. However, the Python 3 support seems to be buggy; it can't handle Inkscape-generated PDF files (which I need). I have the current git version of PyPdf installed, and the following test script doesn't work:
import PyPDF2

output_pdf = PyPDF2.PdfFileWriter()
with open("testI.pdf", "rb") as input:
    input_pdf = PyPDF2.PdfFileReader(input)
    output_pdf.addPage(input_pdf.getPage(0))
    # Write while the input file is still open: page data is read lazily.
    with open("test.pdf", "wb") as output:
        output_pdf.write(output)
It throws the following stack trace:
Traceback (most recent call last):
File "test.py", line 7, in <module>
output.addPage(input.getPage(0))
File "/usr/lib/python3.3/site-packages/pyPdf/pdf.py", line 420, in getPage
self._flatten()
File "/usr/lib/python3.3/site-packages/pyPdf/pdf.py", line 574, in _flatten
self._flatten(page.getObject(), inherit)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 165, in getObject
return self.pdf.getObject(self).getObject()
File "/usr/lib/python3.3/site-packages/pyPdf/pdf.py", line 616, in getObject
retval = readObject(self.stream, self)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 66, in readObject
return DictionaryObject.readFromStream(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 526, in readFromStream
value = readObject(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 57, in readObject
return ArrayObject.readFromStream(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 152, in readFromStream
obj = readObject(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 86, in readObject
return NumberObject.readFromStream(stream)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 231, in readFromStream
return FloatObject(name.decode("ascii"))
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 207, in __new__
return decimal.Decimal.__new__(cls, str(value), context)
TypeError: optional argument must be a context
The same script, however, works flawlessly with Python 2.7.
What am I doing wrong here? Is it a bug in the library? Can I work around it without touching the PyPDF library?
So I found the answer. The decimal.Decimal class in Python 3.3 shows some weird behaviour. This is the corresponding Stack Overflow question: Instantiate Decimal class. I added a workaround to the PyPDF2 library and submitted a pull request.
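For anyone hitting the same TypeError, here is a minimal sketch of one way to sidestep it. It is not the patch that went into PyPDF2, and the class name SafeFloat is made up for the example. Judging from the traceback, the failing call passes a context argument positionally to decimal.Decimal.__new__, so the sketch assumes that forwarding the context only when one was actually supplied avoids the problem.

```python
import decimal


class SafeFloat(decimal.Decimal):
    """Decimal subclass that forwards `context` only when one is given."""

    def __new__(cls, value="0", context=None):
        if context is None:
            # Assumption: the affected 3.3 builds rejected None as a context,
            # so leave the argument out entirely in that case.
            return decimal.Decimal.__new__(cls, str(value))
        return decimal.Decimal.__new__(cls, str(value), context)


number = SafeFloat("1.5")
with_context = SafeFloat("2.25", decimal.Context(prec=10))
```

The values behave like ordinary Decimal instances, so code that does arithmetic on them does not need to change.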
Just to make sure you are aware of already existing tools that do exactly this:
PDFtk
PDFjam (my favourite, requires LaTeX though)
Directly with GhostScript:
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf file1.pdf file2.pdf

Parsing XML exception

I'm new to python, and seriously need help! I have a number of errors I can't figure out. I'm using python 2.7 on a mac. Here is the list of errors:
Traceback (most recent call last):
File "minihiveosc.py", line 378, in <module>
swhive = SWMiniHiveOSC( options.host, options.hport, options.ip, options.port, options.minibees, options.serial, options.baudrate, options.config, [1,options.minibees], options.verbose, options.apimode )
File "minihiveosc.py", line 280, in __init__
self.hive.load_from_file( config )
File "/Users/Puffin/Documents/python/pydon/pydon/pydonhive.py", line 396, in load_from_file
hiveconf = cfgfile.read_file( filename )
File "/Users/Puffin/Documents/python/pydon/pydon/minibeexml.py", line 116, in read_file
tree = ET.parse( filename )
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1183, in parse
tree.parse(source, parser)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
parser.feed(data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1643, in feed
self._raiseerror(v)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1507, in _raiseerror
raise err
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 164, column 8
Any chance someone can help me?
Thanks!
What you posted in your question is called a "Traceback", and it shows only one error:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 164, column 8
All the lines before it show how Python got there: in the file minihiveosc.py, on line 378, some code was executed (shown in the traceback), which then led to line 280 of the same file, where something else was called, and so on.
Every time Python calls a function, the current state is pushed onto the stack to make room for the next context, and when an exception occurs Python can show you this stack to help you diagnose your problem.
In this case, you are feeding the XML parser a document that has an error in it; when the parser got to line 164, column 8, it found something it didn't expect. You'll need to inspect that document to see what the problem is; it will be around that area.
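If opening the file and counting lines by hand is awkward, a small sketch like the following can pull out the offending line for you. The helper name locate_xml_error and the sample document are made up for the example. ParseError.position is the (line, column) pair that ElementTree attaches to the exception.

```python
import xml.etree.ElementTree as ET


def locate_xml_error(text):
    """Return (line, column, offending_line) for malformed XML, else None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based
        lines = text.splitlines()
        offending = lines[line - 1] if 0 < line <= len(lines) else ""
        return line, column, offending
    return None


# A bare '&' is not allowed in XML; it would have to be written as '&amp;'.
broken = "<config>\n<name>bees & hives</name>\n</config>"
result = locate_xml_error(broken)
```

For a config file on disk you would read its contents first and pass the text in; the returned line is the one to inspect for stray characters such as an unescaped '&' or '<'.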
This is simply because your XML file is not well-formed at line 164, column 8. When the parser tries to read that line, it raises the error. Have a look at your document to see what is wrong there.
This is a single error with its stack trace.
Creating the SWMiniHiveOSC object caused the error while executing the load_from_file(config) method. The file name comes from options.config. Your XML config file is not well-formed: there is an invalid token at line 164, column 8 of that file. The problem is with the XML file, not the Python code.
