DXF file parsing with the dxfgrabber library in Python

I want to parse a DXF file to obtain its objects (lines, points, text and so on) with the dxfgrabber library.
The code is below:
#!/usr/bin/env python
import dxfgrabber
dxf = dxfgrabber.readfile("1.dxf")
print ("DXF version : {}".format(dxf.dxfversion))
But it raises an error:
Traceback (most recent call last):
File "parsing.py", line 6, in <module>
dxf = dxfgrabber.readfile("1.dxf")
File "/usr/local/lib/python2.7/dist-packages/dxfgrabber/__init__.py", line 43, in readfile
with io.open(filename, encoding=get_encoding()) as fp:
File "/usr/local/lib/python2.7/dist-packages/dxfgrabber/__init__.py", line 39, in get_encoding
info = dxfinfo(fp)
File "/usr/local/lib/python2.7/dist-packages/dxfgrabber/tags.py", line 96, in dxfinfo
tag = next(tagreader)
File "/usr/local/lib/python2.7/dist-packages/dxfgrabber/tags.py", line 52, in __next__
return next_tag()
File "/usr/local/lib/python2.7/dist-packages/dxfgrabber/tags.py", line 45, in next_tag
raise StopIteration()
StopIteration
The simple 1.dxf file contains only a line.
The file is available at https://docs.google.com/file/d/0BySHG7k180kETlQ2UnRxQmxoUk0/edit?usp=sharing
Is this a bug in the dxfgrabber library?
Is there a good library for parsing DXF files in Python?
I am using dxfgrabber 0.4 and Python 2.7.3.

I contacted the developer. He says that in the current version, 0.5.1, line 49 of __init__.py should be changed to: with io.open(filename) as fp:
Then it works (io was missing).
He will make this correction official in version 0.5.2 soon.

You can only read DXF files written in the AutoCAD format!
Try DraftSight, a free AutoCAD clone that exports DXF quite well, and save in the DXF R12 format.
This should solve your problem.
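The traceback above ends inside dxfinfo, which reads tags from the start of the file, so it can help to inspect the header directly before blaming either the file or the library. An ASCII DXF file is a sequence of line pairs: an integer group code followed by a value. The following is a minimal sketch using only the standard library, not dxfgrabber's own reader. The helper names iter_dxf_tags and find_header_var are made up for illustration, and the sketch assumes an ASCII (not binary) DXF file.

```python
import io


def iter_dxf_tags(stream):
    """Yield (group_code, value) pairs from an ASCII DXF text stream."""
    while True:
        code_line = stream.readline()
        value_line = stream.readline()
        if not code_line or not value_line:
            return  # end of file, or a dangling group code without a value
        # int() raises ValueError if the group code line is not a number,
        # which is itself a hint that the file is not well-formed ASCII DXF.
        yield int(code_line.strip()), value_line.strip()


def find_header_var(stream, name):
    """Return the value of a header variable such as "$ACADVER", or None."""
    tags = iter_dxf_tags(stream)
    for code, value in tags:
        if code == 9 and value == name:
            # Group code 9 names a header variable; its value is the next tag.
            following = next(tags, None)
            return following[1] if following else None
        if code == 0 and value == "ENDSEC":
            return None  # first section ended without the variable
    return None
```

A possible use is `with io.open("1.dxf", errors="ignore") as fp: print(find_header_var(fp, "$ACADVER"))`. If this prints None, the file has no $ACADVER header variable (or no HEADER section at all), which may be worth mentioning when reporting the problem.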

Related

How to Display .raw Dataset?

I'm trying to write a script to display the images in the file burned_wood_with_tape_1664x512x256_12bit.raw from this website: https://figshare.com/articles/SSOCT_test_dataset_for_OCTproZ/12356705
for a research project. However, I can't find a way to display the images in this .raw dataset.
This is the code I have, based on other questions on Stack Overflow:
import rawpy
import imageio
path = "Datasets/burned_wood_with_tape_1664x512x256_12bit.raw"
for item in path:
    item_path = path + item
    raw = rawpy.imread(item_path)
    rgb = raw.postprocess()
    rawpy.imshow(rgb)
But I get this error:
Traceback (most recent call last):
File "[ENTER PATH]", line 7, in <module>
raw = rawpy.imread(item_path)
File "[ENTER PATH]\lib\site-packages\rawpy\__init__.py", line 20, in imread
d.open_file(pathOrFile)
File "rawpy\_rawpy.pyx", line 404, in rawpy._rawpy.RawPy.open_file
File "rawpy\_rawpy.pyx", line 914, in rawpy._rawpy.RawPy.handle_error
rawpy._rawpy.LibRawIOError: b'Input/output error'
The data you have is not a ".raw" photo file. It is a dataset intended for the "Virtual OCT System" of OCTproZ (https://github.com/spectralcode/OCTproZ/). The rawpy library does not help here, because it is meant for raw photos from cameras.
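If you still want to look at the numbers yourself, a headerless binary file can be read with the standard library alone. The sketch below is only a guess at the layout: it assumes, from the file name, 1664 samples per line and 512 lines per frame, and it assumes that each 12-bit sample is stored in an unsigned 16-bit little-endian word. Neither assumption is confirmed by the dataset description, and read_raw_frame is a hypothetical helper name.

```python
import array
import sys


def read_raw_frame(path, samples_per_line, lines_per_frame, frame_index=0):
    """Read one frame of unsigned 16-bit samples from a headerless file.

    Returns a list of rows, each a list of integers.
    Raises EOFError if the file is too short for the requested frame.
    """
    count = samples_per_line * lines_per_frame
    values = array.array("H")  # unsigned 16-bit items (assumed sample size)
    with open(path, "rb") as fp:
        fp.seek(frame_index * count * values.itemsize)
        values.fromfile(fp, count)
    if sys.byteorder != "little":
        values.byteswap()  # the file is assumed to be little-endian
    return [
        values[row * samples_per_line:(row + 1) * samples_per_line].tolist()
        for row in range(lines_per_frame)
    ]
```

With those assumptions, `read_raw_frame(path, 1664, 512)` would return the first frame as 512 rows of 1664 values, which any plotting library can show as a grayscale image. Turning that raw data into a finished OCT image is the job of OCTproZ.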

Parsing PostgreSQL: pycparser.plyparser.ParseError before: pgwin32_signal_event

I need to parse the open-source project PostgreSQL using pycparser.
While parsing its source code, the following error arises:
Traceback (most recent call last):
File "examples\using_cpp_libc.py", line 48, in <module>
getAllFiles(projectName)
File "examples\using_cpp_libc.py", line 29, in getAllFiles
ast = parse_file(dirName+'\\'+fname, use_cpp = True, cpp_path = 'cpp', cpp_args = [r'-nostdinc',r'-Iutils/fake_libc_include',r'-Iprojects/postgresql/src/include'])
File "G:\python\pycparser-master\pycparser\__init__.py", line 92, in parse_file
return parser.parse(text, filename)
File "G:\python\pycparser-master\pycparser\c_parser.py", line 152, in parse
debug=debuglevel)
File "G:\python\pycparser-master\pycparser\ply\yacc.py", line 334, in parse
return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
File "G:\python\pycparser-master\pycparser\ply\yacc.py", line 1204, in parseopt_notrack
tok = call_errorfunc(self.errorfunc, errtoken, self)
File "G:\python\pycparser-master\pycparser\ply\yacc.py", line 193, in call_errorfunc
r = errorfunc(token)
File "G:\python\pycparser-master\pycparser\c_parser.py", line 1838, in p_error
column=self.clex.find_tok_column(p)))
File "G:\python\pycparser-master\pycparser\plyparser.py", line 67, in _parse_error
raise ParseError("%s: %s" % (coord, msg))
pycparser.plyparser.ParseError: projects/postgresql/src/include/pg_config_os.h:366:15: before: pgwin32_signal_event
I am using postgresql-9.6.9, built with Visual Studio Express 2017 on Windows 10 (64-bit).
The blog post you quoted in the comment is the canonical resource. Parsing large C projects is not easy - they have their own quirks - so it takes work. I doubt it's resolvable within the confines of a Stack Overflow question.
You need to start tackling the issues one by one - for example look at the pgwin32_signal_event token in pg_config_os.h - why can't it be parsed? Perhaps its type is unparsable? Was it defined? Could it be added to a "fake" header, etc. Unfortunately, there's no easy way to do this except working through the issues one by one.
Be sure to preprocess the file you're parsing first, dumping the full preprocessed version into a single .c file - this gets all the types into a single file you can work with.
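The preprocessing step can be scripted so that each problem file ends up as one flat .c file you can open and search. This is a rough sketch, assuming a cpp executable on the PATH and the same -nostdinc and -I arguments as in the question. The function names build_cpp_command and preprocess_to_file are made up for illustration.

```python
import subprocess


def build_cpp_command(source, include_dirs, cpp_path="cpp"):
    """Build the preprocessor command line as a list of arguments."""
    command = [cpp_path, "-nostdinc"]
    command += ["-I" + directory for directory in include_dirs]
    command.append(source)
    return command


def preprocess_to_file(source, output, include_dirs, cpp_path="cpp"):
    """Run the preprocessor on source and dump the result into output."""
    text = subprocess.check_output(
        build_cpp_command(source, include_dirs, cpp_path),
        universal_newlines=True,
    )
    with open(output, "w") as fp:
        fp.write(text)
    return output
```

The dumped file can then be searched for pgwin32_signal_event to see the declaration exactly as the parser sees it, and passed to pycparser's parse_file with use_cpp=False once the offending construct is understood.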

NBT Parser Minecraft mca file not a gzipped file error

I am trying to read a Minecraft world from the filesystem with Python, including the .mca region/anvil files, using the NBT 1.4.1 module (Named Binary Tag Reader/Writer), which is supposed to read the NBT format used in Minecraft. It works fine for files such as level.dat, but throws an error for region files such as r.0.0.mca.
Edit: I am referring to the auto-generated world files that Minecraft stores in the .minecraft/saves/"MyWorld"/ folder, such as level.dat (which works), and the .mca files stored in the .minecraft/saves/"MyWorld"/region/ folder, such as r.0.0.mca (which don't work). I uploaded two sample files from one of my worlds.
Code:
from nbt import nbt
level_file = nbt.NBTFile("level.dat", "rb") # works
region_file = nbt.NBTFile("r.0.0.mca", "rb")# does not work
Error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 508, in __init__
self.parse_file()
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 532, in parse_file
type = TAG_Byte(buffer=self.file)
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 85, in __init__
self._parse_buffer(buffer)
File "/usr/local/lib/python3.5/dist-packages/nbt/nbt.py", line 90, in _parse_buffer
self.value = self.fmt.unpack(buffer.read(self.fmt.size))[0]
File "/usr/lib/python3.5/gzip.py", line 274, in read
return self._buffer.read(size)
File "/usr/lib/python3.5/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'\x00\x00')
Any suggestions how to get this working?
r.0.0.mca is most definitely not compressed. About 80% of the bytes are zeros.
It turns out that the NBT library only supports .mcr region files, which were replaced by .mca files about 6 years ago. However, mcedit is written in Python and supports those files. Due to changes in the Minecraft save format, the interpretation of the content needs to be adjusted, but the files can be read successfully.
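For reference, a region file is not one gzip stream, which is consistent with the error above: it starts with a table of chunk locations, and each chunk is compressed separately further into the file. The sketch below pulls the decompressed NBT bytes of one chunk out of a region file with the standard library only. It follows the commonly documented region layout (4-byte location entries, 4096-byte sectors, a 4-byte big-endian length plus a 1-byte compression type before each chunk) and targets Python 3. The function name read_chunk_nbt_bytes is made up, and interpreting the resulting NBT data is left to an NBT parser.

```python
import gzip
import struct
import zlib

SECTOR_SIZE = 4096  # region files are organised in 4 KiB sectors


def read_chunk_nbt_bytes(fp, chunk_x, chunk_z):
    """Return the decompressed NBT bytes of one chunk, or None if absent.

    fp must be a binary file object opened on a region file.
    """
    # Each region holds 32 x 32 chunks; the location table has one
    # 4-byte entry per chunk at the very start of the file.
    index = (chunk_x % 32) + (chunk_z % 32) * 32
    fp.seek(4 * index)
    entry = fp.read(4)
    if len(entry) < 4:
        return None
    offset_sectors = int.from_bytes(entry[:3], "big")
    sector_count = entry[3]
    if offset_sectors == 0 and sector_count == 0:
        return None  # chunk has not been generated

    fp.seek(offset_sectors * SECTOR_SIZE)
    length, compression = struct.unpack(">IB", fp.read(5))
    payload = fp.read(length - 1)  # length includes the compression byte
    if compression == 1:
        return gzip.decompress(payload)
    if compression == 2:
        return zlib.decompress(payload)
    raise ValueError("unsupported compression type: %d" % compression)
```

A possible use is `with open("r.0.0.mca", "rb") as fp: data = read_chunk_nbt_bytes(fp, 0, 0)`.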

Python: 'NoneType' object has no attribute 'decompressobj'

I'm using Python 2.7.11 on Ubuntu.
I'm trying to open an Excel file (.xlsx) in Python using xlrd package. However I get the following error when I try to use the open_workbook() function from the package to open my Excel file:
Traceback (most recent call last):
File "TileInserter.py", line 15, in <module>
book = open_workbook(sheetPath, on_demand=True)
File "/usr/local/lib/python2.7/site-packages/xlrd/__init__.py", line 422, in open_workbook
ragged_rows=ragged_rows,
File "/usr/local/lib/python2.7/site-packages/xlrd/xlsx.py", line 761, in open_workbook_2007_xml
zflo = zf.open(component_names['xl/_rels/workbook.xml.rels'])
File "/usr/local/lib/python2.7/zipfile.py", line 1010, in open
close_fileobj=should_close)
File "/usr/local/lib/python2.7/zipfile.py", line 526, in __init__
self._decompressor = zlib.decompressobj(-15)
AttributeError: 'NoneType' object has no attribute 'decompressobj'
I googled the cause of this error and found that it can happen if the zlib library is not installed. But when I checked using PHP's phpinfo() function, it showed that zlib is installed, and in the latest version (1.2.8).
So I'm stuck now. Does anyone know how to solve this issue?
EDIT: My actual code in TileInserter.py goes like this (TileInserter.py and TileList.xlsx being in the same directory):
from xlrd import open_workbook
sheetPath = "TileList.xlsx"
#some more variables
#Open Excel file
book = open_workbook(sheetPath, on_demand=True)
for name in book.sheet_names():
    if name.endswith('1'):
        sheet = book.sheet_by_name(name)
I see on http://www.python-excel.org/ that there's a library openpyxl that is recommended for working with .xlsx files. This may be what you need instead of xlrd.
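Note that the traceback fails inside Python's own zipfile module, where the name zlib is None. In the standard library, zipfile falls back to zlib = None when "import zlib" fails, so what matters is whether this particular Python interpreter has the zlib module, not what PHP reports. A quick check, as a small sketch (module_available is a made-up helper name):

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


print("zlib available: {}".format(module_available("zlib")))
```

If this prints False for the interpreter under /usr/local/lib/python2.7, that Python was most likely built without zlib support, and any library that reads .xlsx files through zipfile would hit the same error.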

Merging PDF files with Python3

I am writing a small script that needs to merge many one-page pdf files. I want the script to run with Python3 and to have as few dependencies as possible.
For the PDF merging part, I tried using PyPdf. However, the Python 3 support seems to be buggy: it can't handle Inkscape-generated PDF files (which I need). I have the current git version of PyPdf installed, and the following test script doesn't work:
import PyPDF2
output_pdf = PyPDF2.PdfFileWriter()
with open("testI.pdf", "rb") as input:
    input_pdf = PyPDF2.PdfFileReader(input)
    output_pdf.addPage(input_pdf.getPage(0))
with open("test.pdf", "wb") as output:
    output_pdf.write(output)
It throws the following stack trace:
Traceback (most recent call last):
File "test.py", line 7, in <module>
output.addPage(input.getPage(0))
File "/usr/lib/python3.3/site-packages/pyPdf/pdf.py", line 420, in getPage
self._flatten()
File "/usr/lib/python3.3/site-packages/pyPdf/pdf.py", line 574, in _flatten
self._flatten(page.getObject(), inherit)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 165, in getObject
return self.pdf.getObject(self).getObject()
File "/usr/lib/python3.3/site-packages/pyPdf/pdf.py", line 616, in getObject
retval = readObject(self.stream, self)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 66, in readObject
return DictionaryObject.readFromStream(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 526, in readFromStream
value = readObject(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 57, in readObject
return ArrayObject.readFromStream(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 152, in readFromStream
obj = readObject(stream, pdf)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 86, in readObject
return NumberObject.readFromStream(stream)
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 231, in readFromStream
return FloatObject(name.decode("ascii"))
File "/usr/lib/python3.3/site-packages/pyPdf/generic.py", line 207, in __new__
return decimal.Decimal.__new__(cls, str(value), context)
TypeError: optional argument must be a context
The same script, however, works flawlessly with Python 2.7.
What am I doing wrong here? Is it a bug in the library? Can I work around it without touching the PyPDF library?
So I found the answer. The decimal.Decimal class in Python 3.3 shows some weird behaviour. The corresponding Stack Overflow question is "Instantiate Decimal class". I added a workaround to the PyPDF2 library and submitted a pull request.
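The workaround itself is not shown here. Purely as an illustration of the kind of change that avoids the "optional argument must be a context" error, a Decimal subclass can pass the context on only when one was actually supplied. This is a hypothetical sketch, not the patch that was submitted, and ContextSafeDecimal is a made-up name.

```python
import decimal


class ContextSafeDecimal(decimal.Decimal):
    """Decimal subclass that forwards a context only if one is given."""

    def __new__(cls, value="0", context=None):
        if context is None:
            # Do not pass None through as the context argument.
            return decimal.Decimal.__new__(cls, str(value))
        return decimal.Decimal.__new__(cls, str(value), context)
```

With this pattern, `ContextSafeDecimal("1.5")` never hands a non-context object to the Decimal constructor.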
Just to make sure you are aware of already existing tools that do exactly this:
PDFtk
PDFjam (my favourite, requires LaTeX though)
Directly with GhostScript:
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf file1.pdf file2.pdf
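Since the goal was a script with few dependencies, the Ghostscript command above can also be driven from Python 3 with the standard library. This is a small sketch that assumes a gs executable on the PATH; build_gs_merge_command and merge_pdfs are made-up names.

```python
import subprocess


def build_gs_merge_command(output, inputs, gs_path="gs"):
    """Build the Ghostscript command that merges inputs into output."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return [
        gs_path,
        "-dBATCH",
        "-dNOPAUSE",
        "-q",
        "-sDEVICE=pdfwrite",
        "-sOutputFile=" + output,
    ] + list(inputs)


def merge_pdfs(output, inputs, gs_path="gs"):
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.check_call(build_gs_merge_command(output, inputs, gs_path))
```

For example, `merge_pdfs("finished.pdf", ["file1.pdf", "file2.pdf"])` runs the same command as the line above.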
