pyth error [rtf to xml/html] - python

I am trying to convert RTF to XML/xhtml using python 3.6.1.
Python Code: https://github.com/brendonh/pyth/blob/master/examples/reading/rtf15.py
import sys
import os.path
from pyth.plugins.rtf15.reader import Rtf15Reader
from pyth.plugins.xhtml.writer import XHTMLWriter
if len(sys.argv) > 1:
    filename = sys.argv[1]
else:
    filename = os.path.normpath(os.path.join(os.path.dirname(__file__), '../../tests/rtfs/sample.rtf'))
doc = Rtf15Reader.read(open(filename, "rb"))
print(XHTMLWriter.write(doc, pretty=True).read())
Error:
Traceback (most recent call last):
  File "C:\xx\file1.py", line 14, in <module>
    from pyth.plugins.rtf15.reader import Rtf15Reader
  File "C:\Python 3.6.1\lib\site-packages\pyth\plugins\rtf15\reader.py", line 594
    match = re.match(ur'HYPERLINK "(.*)"', destination)
                                        ^
SyntaxError: invalid syntax
May I know how to solve the syntax issue?
Thank you.

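Please check the package page:
https://pypi.python.org/pypi/pyth/0.6.0
The pyth package supports only Python 2.x; it does not work on Python 3.x. The ur'...' string prefix at line 594 of reader.py is valid only in Python 2, which is why Python 3 reports a SyntaxError there.
PS:
In the sample code you linked, the line
print XHTMLWriter.write(doc, pretty=True).read()
uses the Python 2 print statement, not the Python 3 print() function. Please check.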
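The SyntaxError in the traceback points at a ur'...' string literal. As a minimal sketch (the helper name compiles is made up for illustration and is not part of pyth), the following checks under Python 3 whether a source snippet with that prefix compiles at all:

```python
def compiles(source):
    """Return True if `source` is syntactically valid for the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

# The ur prefix is the form used on line 594 of pyth's reader.py.
with_ur_prefix = "pattern = ur'HYPERLINK \"(.*)\"'"
# A plain raw string is the Python 3 spelling of the same pattern.
with_r_prefix = "pattern = r'HYPERLINK \"(.*)\"'"

print(compiles(with_ur_prefix))  # False on Python 3
print(compiles(with_r_prefix))   # True
```

This only illustrates why the import fails; it does not make pyth itself run on Python 3.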


Using type stubs for Python stdlib with mypy

Consider the following MWE:
import hashlib
def tstfun(h: hashlib._hashlib.HASH):
    print(h)
h = hashlib.md5()
tstfun(h)
# reveal_type(h)
Running this as-is yields - no surprise:
$ python mypytest.py
<md5 _hashlib.HASH object @ 0x7fa645dedd90>
But checking this with mypy fails with:
$ mypy mypytest.py
mypytest.py:4: error: Name 'hashlib._hashlib.HASH' is not defined
Found 1 error in 1 file (checked 1 source file)
Now, revealing the type on h (commenting in that reveal_type line):
$ mypy mypytest.py
mypytest.py:4: error: Name 'hashlib._hashlib.HASH' is not defined
mypytest.py:10: note: Revealed type is 'hashlib._Hash'
Found 1 error in 1 file (checked 1 source file)
Well, ok, then changing the type hint from hashlib._hashlib.HASH to hashlib._Hash:
$ python mypytest.py
Traceback (most recent call last):
  File "/radarugs/hintze/s4-cnc-tools/mypytest.py", line 4, in <module>
    def tstfun(h: hashlib._HASH):
AttributeError: module 'hashlib' has no attribute '_HASH'
$ mypy mypytest.py
mypytest.py:4: error: Name 'hashlib._HASH' is not defined
Found 1 error in 1 file (checked 1 source file)
...which is the worst outcome.
How to check if the type stubs for the hashlib are correctly found and used by mypy? What else to check? What do I get wrong?
According to the traceback, you used hashlib._HASH, not hashlib._Hash.
With this code:
import hashlib
def tstfun(h: hashlib._Hash):
    print(h)
h = hashlib.md5()
tstfun(h)
mypy reports: Success: no issues found in 1 source file
Using hashlib._Hash is correct for mypy, but that name exists only in the type stubs, not in the hashlib module at runtime. You therefore also need from __future__ import annotations if you don't want to put the annotation in quotes. See https://github.com/python/typeshed/issues/2928
from __future__ import annotations
import hashlib
def tstfun(h: hashlib._Hash):
    print(h)
h = hashlib.md5()
tstfun(h)
N.B.: from __future__ import annotations is available starting in Python 3.7.0b1. See https://docs.python.org/3/library/__future__.html
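For interpreters older than 3.7, the quoted form that the answer alludes to might look like the sketch below. It assumes the stub name hashlib._Hash from the answer; the name may differ between mypy/typeshed versions, and sha256 is used here only as an arbitrary example hash.

```python
import hashlib

def tstfun(h: "hashlib._Hash") -> str:
    # The quoted annotation is stored as a plain string and is never evaluated
    # at runtime, so the interpreter does not try to look up hashlib._Hash.
    return h.hexdigest()

print(tstfun.__annotations__["h"])   # the annotation is kept as a string
print(tstfun(hashlib.sha256(b"")))   # runs without an AttributeError
```

A type checker still reads the string as a type, so the check on the argument is not lost.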

name 'OSPF_Link' is not defined

I have a python script like this:
#!/usr/bin/env python
from scapy.all import *
from ospf import *
def ourSend(packet):
    sendp(packet,iface='eth1')
host1='10.0.3.2'
advr_routers='10.0.8.7'
host2='10.0.2.2'
sequence=0x80000918
link2host1 = OSPF_Link(id=host1,data='10.0.3.1',type=2,metric=1)
link2host2 = OSPF_Link(id=host2,data='10.0.2.2',type=2,metric=1)
link2victim = OSPF_Link(id="192.168.200.20",data="255.255.255.255",type=3,metric=1)
IPlayer=IP(src='10.0.1.2',dst='224.0.0.5')
OSPFHdr=OSPF_Hdr(src='10.0.6.1')
rogueLsa=Ether()/IPlayer/OSPFHdr/OSPF_LSUpd(lsacount=1,lsalist=[OSPF_Router_LSA(options=0x22,id='10.0.3.1',adrouter=advr_routers,seq=sequence,\
    linkcount=3,linklist=[link2victim,link2host1,link2host2])])
ourSend(rogueLsa)
When I ran it, I got a scapy error, which I resolved with git pyrt...
Now when I run the Python script I get a different error:
$ python scipt.py
WARNING: No route found for IPv6 destination :: (no default route?)
Traceback (most recent call last):
  File "s.py", line 19, in <module>
    link2host1 = OSPF_Link(id=host1,data='10.0.3.1',type=2,metric=1)
NameError: name 'OSPF_Link' is not defined
Thank you
I had to fetch ospf.py separately again and then run my script.
This is what I tried, and it worked.
After the git pyrt step, OSPF_Link was no longer available, so I used ospf.py again and the problem is now solved.

python NameError: global name '__file__' is not defined

When I run this code in python 2.7, I get this error:
Traceback (most recent call last):
  File "C:\Python26\Lib\site-packages\pyutilib.subprocess-3.5.4\setup.py", line 30, in <module>
    long_description = read('README.txt'),
  File "C:\Python26\Lib\site-packages\pyutilib.subprocess-3.5.4\setup.py", line 19, in read
    return open(os.path.join(os.path.dirname(__file__), *rnames)).read()
NameError: global name '__file__' is not defined
code is:
import os
from setuptools import setup
def read(*rnames):
    return open(os.path.join(os.path.dirname(__file__), *rnames)).read()
setup(name="pyutilib.subprocess",
    version='3.5.4',
    maintainer='William E. Hart',
    maintainer_email='wehart@sandia.gov',
    url = 'https://software.sandia.gov/svn/public/pyutilib/pyutilib.subprocess',
    license = 'BSD',
    platforms = ["any"],
    description = 'PyUtilib utilites for managing subprocesses.',
    long_description = read('README.txt'),
    classifiers = [
        'Development Status :: 4 - Beta',
        'Intended Audience :: End Users/Desktop',
        'License :: OSI Approved :: BSD License',
        'Natural Language :: English',
        'Operating System :: Microsoft :: Windows',
        'Operating System :: Unix',
        'Programming Language :: Python',
        'Programming Language :: Unix Shell',
        'Topic :: Scientific/Engineering :: Mathematics',
        'Topic :: Software Development :: Libraries :: Python Modules'],
    packages=['pyutilib', 'pyutilib.subprocess', 'pyutilib.subprocess.tests'],
    keywords=['utility'],
    namespace_packages=['pyutilib'],
    install_requires=['pyutilib.common', 'pyutilib.services']
    )
This error occurs when you enter the line os.path.join(os.path.dirname(__file__)) in the Python interactive shell.
The interactive shell does not define __file__, because there is no source file whose path it could hold.
So put the line os.path.join(os.path.dirname(__file__)) in a file such as file.py and then run python file.py. That works because __file__ is then set to the path of that file.
I had the same problem with PyInstaller and py2exe, and found the resolution in the cx_Freeze FAQ.
When you run your script from the console or as a frozen application, the calls below give you the "execution path", not the "actual file path":
print(os.getcwd())
print(sys.argv[0])
print(os.path.dirname(os.path.realpath('__file__')))
Source:
http://cx-freeze.readthedocs.org/en/latest/faq.html
Your old line (initial question):
def read(*rnames):
    return open(os.path.join(os.path.dirname(__file__), *rnames)).read()
Substitute your line of code with the following snippet.
import os
import sys
def find_data_file(filename):
    if getattr(sys, 'frozen', False):
        # The application is frozen
        datadir = os.path.dirname(sys.executable)
    else:
        # The application is not frozen
        # Change this bit to match where you store your data files:
        datadir = os.path.dirname(__file__)
    return os.path.join(datadir, filename)
With the above code you can add your application to your OS path and run it from anywhere without it failing to find its data/configuration files.
Tested with python:
3.3.4
2.7.13
I've run into cases where __file__ doesn't work as expected. But the following hasn't failed me so far:
import inspect
src_file_path = inspect.getfile(lambda: None)
This is the closest thing to a Python analog to C's __FILE__.
The behavior of Python's __file__ is much different than C's __FILE__. The C version will give you the original path of the source file. This is useful in logging errors and knowing which source file has the bug.
Python's __file__ only gives you the name of the currently executing file, which may not be very useful in log output.
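Change your code as follows; it works for me:
os.path.dirname(os.path.abspath("__file__"))
Note that the quoted "__file__" is an ordinary string, so this expression evaluates to the current working directory rather than to the directory of the script.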
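If you're using the code inside a .py file, use:
os.path.abspath(__file__)
If you're running the code directly in an interactive session or in a Jupyter notebook, where __file__ is not defined, put __file__ inside double quotes:
os.path.abspath("__file__")
The quoted form is resolved against the current working directory, so it avoids the NameError but does not identify a source file.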
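To see what the quoted variant actually returns, here is a small sketch using only the standard library. It treats "__file__" as what it is in that expression, a plain string, and compares the result with the working directory:

```python
import os

# "__file__" in quotes is just a relative file name, so abspath() resolves it
# against the current working directory.
quoted = os.path.abspath("__file__")

print(quoted)
print(os.path.dirname(quoted) == os.getcwd())  # True
```

So the quoted form is a way of spelling os.getcwd(); whether that matches the script's location depends on where the process was started.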
Are you using the interactive interpreter? You can use
sys.argv[0]
You should read: How do I get the path of the current executed file in Python?
If all you are looking for is your current working directory, os.getcwd() will give you the same thing as os.path.dirname(__file__), provided the script was started from the directory that contains it and you have not changed the working directory elsewhere in your code. os.getcwd() also works in interactive mode.
So
os.path.join(os.path.dirname(__file__))
becomes
os.path.join(os.getcwd())
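The two values differ as soon as a script is started from another directory. The sketch below writes a throwaway script into one temporary directory and runs it with a different working directory; the file name show_paths.py and the helper compare_paths are made up for this illustration, and capture_output needs Python 3.7 or later.

```python
import os
import subprocess
import sys
import tempfile

def compare_paths():
    """Run a tiny script from a foreign working directory.

    Returns (directory of the script according to __file__, os.getcwd() inside the script).
    """
    with tempfile.TemporaryDirectory() as script_dir, \
         tempfile.TemporaryDirectory() as work_dir:
        script = os.path.join(script_dir, "show_paths.py")
        with open(script, "w") as f:
            f.write(
                "import os\n"
                "print(os.path.dirname(os.path.abspath(__file__)))\n"
                "print(os.getcwd())\n"
            )
        # cwd=work_dir starts the child process in a directory other than the script's.
        result = subprocess.run(
            [sys.executable, script],
            cwd=work_dir, capture_output=True, text=True, check=True,
        )
        file_dir, cwd = result.stdout.splitlines()
        return file_dir, cwd

file_dir, cwd = compare_paths()
print(file_dir)
print(cwd)
```

The first line printed is the script's own directory, the second is the directory the process was started in, which is why os.getcwd() is only a substitute when the two happen to coincide.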
You will get this if you are running the commands from the python shell:
>>> __file__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name '__file__' is not defined
You need to execute the file directly, by passing it in as an argument to the python command:
$ python somefile.py
In your case, it should really be python setup.py install
If you're exec'ing a file via the command line, you can use this hack:
import sys
import traceback
def get_this_filename():
    try:
        raise NotImplementedError("No error")
    except Exception as e:
        # The last traceback entry is the frame that raised, i.e. this file
        exc_type, exc_value, exc_traceback = sys.exc_info()
        filename = traceback.extract_tb(exc_traceback)[-1].filename
    return filename
This worked for me in the UnrealEnginePython console, calling py.exec myfile.py
If you are using a Jupyter notebook and run something like:
MODEL_NAME = os.path.basename(__file__)[:-3]
you get:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-10-f391bbbab00d> in <module>
----> 1 MODEL_NAME = os.path.basename(__file__)[:-3]
NameError: name '__file__' is not defined
You can place a '!' in front like this:
!MODEL_NAME = os.path.basename(__file__)[:-3]
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `MODEL_NAME = os.path.basename(__file__)[:-3]'
Note that '!' hands the line to the shell instead of Python. The NameError goes away only because bash rejects the line, as the output above shows, and MODEL_NAME is not assigned.
I'm having exactly the same problem, probably with the same tutorial. The function definition:
def read(*rnames):
    return open(os.path.join(os.path.dirname(__file__), *rnames)).read()
is buggy, since os.path.dirname(__file__) will not return what you need. Try replacing os.path.dirname(__file__) with os.path.dirname(os.path.abspath(__file__)):
def read(*rnames):
    return open(os.path.join(os.path.dirname(os.path.abspath(__file__)), *rnames)).read()
I've just told Andrew that the code snippet in the current docs doesn't work; hopefully it will be corrected.
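I think you can do this, which gets your local file path:
import os
import sys
try:
    approot = os.path.dirname(os.path.abspath(__file__))
except NameError:
    # __file__ is not defined here; fall back to the script path in sys.argv[0]
    approot = os.path.dirname(os.path.abspath(sys.argv[0]))
my_dir = os.path.join(approot, 'f_dir')
# Create the directory if it does not exist yet
if not os.path.isdir(my_dir):
    os.makedirs(my_dir)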

pandas.DataFrame.load/save between python2 and python3: pickle protocol issues

I haven't figured out how to load and save pickles between Python 2 and 3 with pandas DataFrames. There is a 'protocol' option in the pickler that I've played with unsuccessfully, but I'm hoping someone has a quick idea for me to try. Here is the code to get the error:
python2.7
>>> import pandas; from pylab import *
>>> a = pandas.DataFrame(randn(10,10))
>>> a.save('a2')
>>> a = pandas.DataFrame.load('a2')
>>> a = pandas.DataFrame.load('a3')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pandas-0.10.1-py2.7-linux-x86_64.egg/pandas/core/generic.py", line 30, in load
    return com.load(path)
  File "/usr/local/lib/python2.7/site-packages/pandas-0.10.1-py2.7-linux-x86_64.egg/pandas/core/common.py", line 1107, in load
    return pickle.load(f)
ValueError: unsupported pickle protocol: 3
python3
>>> import pandas; from pylab import *
>>> a = pandas.DataFrame(randn(10,10))
>>> a.save('a3')
>>> a = pandas.DataFrame.load('a3')
>>> a = pandas.DataFrame.load('a2')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.3/site-packages/pandas-0.10.1-py3.3-linux-x86_64.egg/pandas/core/generic.py", line 30, in load
    return com.load(path)
  File "/usr/local/lib/python3.3/site-packages/pandas-0.10.1-py3.3-linux-x86_64.egg/pandas/core/common.py", line 1107, in load
    return pickle.load(f)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf4 in position 0: ordinal not in range(128)
Maybe expecting pickle to work between Python versions is a bit optimistic?
I had the same problem. You can change the protocol of the dataframe pickle file with the following function in python3:
import pickle
def change_pickle_protocol(filepath,protocol=2):
    # Load the pickle, then rewrite the same file using the requested protocol
    with open(filepath,'rb') as f:
        obj = pickle.load(f)
    with open(filepath,'wb') as f:
        pickle.dump(obj,f,protocol=protocol)
Then you should be able to open it in Python 2 with no problem.
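To confirm which protocol a pickle was written with, one can look at its first two bytes: pickles written with protocol 2 or higher start with the byte 0x80 followed by the protocol number. The sketch below uses a plain dict rather than a DataFrame to stay self-contained, and the helper name pickle_protocol_of is made up for this example.

```python
import pickle

def pickle_protocol_of(data):
    """Return the protocol number in the pickle header, or None if there is none.

    Protocols 0 and 1 do not write a header, so None is returned for them.
    """
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return None

payload = {"a": [1, 2, 3]}
as_v2 = pickle.dumps(payload, protocol=2)

print(pickle_protocol_of(as_v2))       # 2
print(pickle.loads(as_v2) == payload)  # True
```

For a DataFrame, a matching protocol is necessary but may not be sufficient, since the pandas versions on both sides also have to understand each other's pickled objects.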
If you use pandas.DataFrame.to_pickle(), you can modify the pandas source code as follows to make the pickle protocol configurable:
1) In the source file /pandas/io/pickle.py (copy the original to /pandas/io/pickle.py.ori before modifying it), search for the following lines:
def to_pickle(obj, path):
    pkl.dump(obj, f, protocol=pkl.HIGHEST_PROTOCOL)
Change these lines to:
def to_pickle(obj, path, protocol=pkl.HIGHEST_PROTOCOL):
    pkl.dump(obj, f, protocol=protocol)
2) In the source file /pandas/core/generic.py (copy the original to /pandas/core/generic.py.ori before modifying it), search for the following lines:
def to_pickle(self, path):
    return to_pickle(self, path)
Change these lines to:
def to_pickle(self, path, protocol=None):
    return to_pickle(self, path, protocol)
3) Restart your Python kernel if it is running, then save your dataframe using any available pickle protocol (0, 1, 2, 3, 4):
# Python 2.x can read this
df.to_pickle('my_dataframe.pck', protocol=2)
# protocol will be the highest (4), Python 2.x can not read this
df.to_pickle('my_dataframe.pck')
4) After a pandas upgrade, repeat steps 1 and 2.
5) (optional) Ask the developers to add this capability to official releases, because your code will throw an exception in any other Python environment that lacks these changes.
Nice day!
You can override the highest protocol available for the pickle package:
import pickle as pkl
import pandas as pd
if __name__ == '__main__':
    # this constant is defined in pickle.py in the pickle package:
    pkl.HIGHEST_PROTOCOL = 2
    # 'foo.pkl' was saved in pickle protocol 4
    df = pd.read_pickle(r"C:\temp\foo.pkl")
    # 'foo_protocol_2' will be saved in pickle protocol 2
    # and can be read in pandas with Python 2
    df.to_pickle(r"C:\temp\foo_protocol_2.pkl")
This is definitely not an elegant solution, but it does the job without changing the pandas code directly.
UPDATE: I found that newer versions of pandas allow you to specify the pickle protocol in the .to_pickle function:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_pickle.html
DataFrame.to_pickle(path, compression='infer', protocol=4)

commands module problem

Hi, I'm trying to execute a bash command in Python by importing the commands module. I think I asked the same question here before. However, this time it doesn't work.
The script is as below:
#!/usr/bin/python
import os,sys
import commands
import glob
path= '/home/xxx/nearline/bamfiles'
bamfiles = glob.glob(path + '/*.bam')
for bamfile in bamfiles:
    fullpath = os.path.join(path,bamfile)
    txtfile = commands.getoutput('/share/bin/samtools/samtools ' + 'view '+ fullpath)
    line=txtfile.readlines()
    print line
I think this samtools view command produces a .txt file.
I got the errors:
Traceback (most recent call last):
  File "./try.py", line 12, in ?
    txtfile = commands.getoutput('/share/bin/samtools/samtools ' + 'view '+ fullpath)
  File "/usr/lib64/python2.4/commands.py", line 44, in getoutput
    return getstatusoutput(cmd)[1]
  File "/usr/lib64/python2.4/commands.py", line 54, in getstatusoutput
    text = pipe.read()
SystemError: Objects/stringobject.c:3518: bad argument to internal function
It seems the problem is with commands.getoutput.
Thanks
I would recommend using subprocess
From the commands documentation:
Deprecated since version 2.6: The commands module has been removed in Python 3.0. Use the subprocess module instead.
Update: Just realized you're using Python 2.4. An easy way to execute a command is os.system()
A quick Google search for "SystemError: Objects/stringobject.c:3518: bad argument to internal function" brings up several bug reports, such as https://www.mercurial-scm.org/bts/issue1225 and http://www.modpython.org/pipermail/mod_python/2007-June/023852.html. It appears to be an issue with Fedora in combination with Python 2.4, but I am not exactly sure about that. I would suggest that you follow Michael's advice and use os.system or os.popen to accomplish this task. The changes to your code would be:
import os,sys
import glob
path= '/home/xxx/nearline/bamfiles'
bamfiles = glob.glob(path + '/*.bam')
for bamfile in bamfiles:
    fullpath = os.path.join(path,bamfile)
    txtfile = os.popen('/share/bin/samtools/samtools ' + 'view '+ fullpath)
    line=txtfile.readlines()
    print line
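On a current Python 3 interpreter (not the Python 2.4 of the question), the subprocess route recommended above could be sketched as follows. The helper name run_and_get_lines is made up for this example, and the demo command simply runs the Python interpreter so the snippet works without samtools installed.

```python
import subprocess
import sys

def run_and_get_lines(cmd):
    """Run `cmd` (a list of arguments, no shell involved) and return its stdout as lines."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str
        check=True,               # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.splitlines()

# Stand-in command for the demo; it just prints two lines.
lines = run_and_get_lines([sys.executable, "-c", "print('first'); print('second')"])
print(lines)
```

For the loop in the question, the argument list would presumably be ['/share/bin/samtools/samtools', 'view', fullpath]. Passing a list instead of a concatenated string also avoids quoting problems with unusual file names.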
