MaltParser Not Working in Python NLTK

I am having trouble getting MaltParser to work with Python NLTK.
Here is my code so far:
import os
import nltk

os.environ["MALT_PARSER"] = "C:/Python34/maltparser-1.8.1"
os.environ["MALTPARSERHOME"] = "C:/Python34/maltparser-1.8.1"
parser8 = nltk.parse.malt.MaltParser(
    working_dir="C:/Python34/maltparser-1.8.1", mco="engmalt.poly-1.7",
    additional_java_args=['-Xmx512m'])
txt = "This is a test sentence"
parser8.raw_parse(txt)
I have downloaded a pre-trained model and selected it for use.
This is the response I get:
runfile('C:/Anaconda/Lib/site-packages/nltk/malt2.py', wdir='C:/Anaconda/Lib/site-packages/nltk')
Traceback (most recent call last):
File "<ipython-input-38-73069e4ee673>", line 1, in <module>
runfile('C:/Anaconda/Lib/site-packages/nltk/malt2.py', wdir='C:/Anaconda/Lib/site-packages/nltk')
File "C:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 580, in runfile
execfile(filename, namespace)
File "C:/Anaconda/Lib/site-packages/nltk/malt2.py", line 14, in <module>
parser8.raw_parse(txt)
File "C:\Anaconda\lib\site-packages\nltk\parse\malt.py", line 139, in raw_parse
return self.parse(words, verbose)
File "C:\Anaconda\lib\site-packages\nltk\parse\malt.py", line 126, in parse
return self.parse_sents([sentence], verbose)[0]
File "C:\Anaconda\lib\site-packages\nltk\parse\malt.py", line 114, in parse_sents
return self.tagged_parse_sents(tagged_sentences, verbose)
File "C:\Anaconda\lib\site-packages\nltk\parse\malt.py", line 194, in tagged_parse_sents
"code %d" % (' '.join(cmd), ret))
Exception: MaltParser parsing (java -Xmx512m -jar C:/Python34/maltparser-1.8.1\malt.jar -w C:/Python34/maltparser-1.8.1 -c engmalt.poly-1.7.mco -i C:\Python34\maltparser-1.8.1\malt_input.conllqgpbye -o C:\Python34\maltparser-1.8.1\malt_output.conllib1nx0 -m parse) failed with exit code 2
I have followed all the advice in this post: How to use malt parser in python nltk.
Specifically:
-I downloaded the latest version of MaltParser.
-Using pip, I uninstalled and reinstalled NLTK to get the latest version, which includes the change in malt.py that allows 'additional_java_args' to be passed as a parameter.
-I renamed the jar file to 'malt.jar'.
-I set environment variables pointing both MALT_PARSER and MALTPARSERHOME to the working directory.
-I've tried both the linear and poly pre-trained models.
-The code for malt.py can be found here: http://www.nltk.org/_modules/nltk/parse/malt.html
If there isn't an apparent solution, how can I continue to debug this myself?
It seems there's some slash (/ vs \) inconsistency in the paths of the command the exception reports, but nothing I do seems to fix it.
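One way to dig further (a sketch, assuming the paths above are correct): check that malt.jar and the .mco model actually exist where the wrapper expects them, then re-run the java command from the exception by hand with stderr captured, since the NLTK wrapper only surfaces the exit code.

import os
import subprocess

# Debugging sketch: verify the files the wrapper expects, then re-run the
# java command from the exception to see the error behind exit code 2.
malt_dir = "C:/Python34/maltparser-1.8.1"
for name in ("malt.jar", "engmalt.poly-1.7.mco"):
    path = os.path.join(malt_dir, name)
    print(path, "exists" if os.path.exists(path) else "MISSING")

cmd = ["java", "-Xmx512m", "-jar", os.path.join(malt_dir, "malt.jar"),
       "-w", malt_dir, "-c", "engmalt.poly-1.7", "-m", "parse"]
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate(b"")  # empty input, just to surface errors
print(err.decode("utf-8", "replace"))

If java itself prints a usable message here (for example, that it cannot find the .mco model in the working directory), that narrows things down far more than the exit code does.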

Related

Unable to build V8 in Windows 10

I'm using this link as a reference (https://medium.com/angular-in-depth/how-to-build-v8-on-windows-and-not-go-mad-6347c69aacd4) to build V8, but I think it's out of date or I'm doing something wrong.
I can't run the ninja -C out.gn/x64.release command because it constantly shows this error:
ninja: error: loading 'build.ninja': The system cannot find the file specified.
ninja: Entering directory `out.gn\foo
I'm also getting this error:
D:\v8_dev\v8Engine\v8>gn args out.gn\foo
Waiting for editor on "D:\v8_dev\v8Engine\v8\out.gn\foo\args.gn"...
Generating files...
Traceback (most recent call last):
File "D:/v8_dev/v8Engine/v8/build/vs_toolchain.py", line 561, in <module>
sys.exit(main())
File "D:/v8_dev/v8Engine/v8/build/vs_toolchain.py", line 557, in main
return commands[sys.argv[1]](*sys.argv[2:])
File "D:/v8_dev/v8Engine/v8/build/vs_toolchain.py", line 371, in CopyDlls
_CopyRuntime(target_dir, runtime_dir, target_cpu, debug=False)
File "D:/v8_dev/v8Engine/v8/build/vs_toolchain.py", line 346, in _CopyRuntime
suffix)
File "D:/v8_dev/v8Engine/v8/build/vs_toolchain.py", line 284, in _CopyUCRTRuntime
assert len(ucrt_files) > 0
AssertionError
ERROR at //build/toolchain/win/BUILD.gn:49:3: Script returned non-zero exit code.
exec_script("../../vs_toolchain.py",
^----------
Current dir: D:/v8_dev/v8Engine/v8/out.gn/foo/
Command: D:/v8_dev/depot_tools/bootstrap-3_8_0_chromium_8_bin/python/bin/python.exe D:/v8_dev/v8Engine/v8/build/vs_toolchain.py copy_dlls D:/v8_dev/v8Engine/v8/out.gn/foo Release x64
Returned 1.
See //BUILD.gn:905:1: which caused the file to be included.
action("postmortem-metadata") {
^------------------------------
Ultimately I want it to work as shown in the embedded picture.
If you are looking for the solution, follow this link (https://github.com/pmed/v8-nuget). For Visual Studio users it works really well, without all the hassle.
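As for the original AssertionError, a quick check worth trying (my assumption: _CopyUCRTRuntime asserts because its glob over the Windows 10 SDK's UCRT redist DLLs came back empty, i.e. the SDK is missing or incomplete; the SDK path below is the default install location, not something from the question):

import glob
import os

# Diagnostic sketch: vs_toolchain.py copies ucrt*.dll from the Windows 10
# SDK redist directory; if this glob finds nothing, the assert fires.
sdk_dir = os.path.expandvars(r"%ProgramFiles(x86)%\Windows Kits\10")
dlls = glob.glob(os.path.join(sdk_dir, "Redist", "ucrt", "DLLs", "x64", "ucrt*.dll"))
print("found %d UCRT DLLs" % len(dlls))  # 0 reproduces the AssertionError

If this prints 0, installing or repairing the Windows 10 SDK before re-running gn gen is a plausible fix.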

How can I compile Bootstrap 4 scss with python webassets?

I'm trying to simply compile Bootstrap 4 with python webassets and having zero success. For now I'm just trying to do this within the bootstrap/scss directory so path issues are less of a big deal. Within this directory I have added a main.scss file with one line:
@import "bootstrap.scss";
I have a script called test_scss.py that looks like this:
from webassets import Bundle, Environment
my_env = Environment(directory='.', url='/')
css = Bundle('main.scss', filters='scss', output='all.css')
my_env.register('css_all', css)
print(my_env['css_all'].urls())
When I run this script, I get an error trace like this:
Traceback (most recent call last):
File "./test_scss.py", line 11, in <module>
print(my_env['css_all'].urls())
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/bundle.py", line 806, in urls
urls.extend(bundle._urls(new_ctx, extra_filters, *args, **kwargs))
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/bundle.py", line 765, in _urls
*args, **kwargs)
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/bundle.py", line 619, in _build
force, disable_cache=disable_cache, extra_filters=extra_filters)
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/bundle.py", line 543, in _merge_and_apply
kwargs=item_data)
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/merge.py", line 276, in apply
return self._wrap_cache(key, func)
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/merge.py", line 218, in _wrap_cache
content = func().getvalue()
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/merge.py", line 251, in func
getattr(filter, type)(data, out, **kwargs_final)
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/filter/sass.py", line 196, in input
self._apply_sass(_in, out, os.path.dirname(source_path))
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/filter/sass.py", line 190, in _apply_sass
return self.subprocess(args, out, _in, cwd=child_cwd)
File "/Users/benlindsay/miniconda/lib/python3.6/site-packages/webassets/filter/__init__.py", line 527, in subprocess
proc.returncode, stdout, stderr))
webassets.exceptions.FilterError: scss: subprocess returned a non-success result code: 65, stdout=b'',
stderr=b'DEPRECATION WARNING: Importing from the current working directory will
not be automatic in future versions of Sass. To avoid future errors, you can add it
to your environment explicitly by setting `SASS_PATH=.`, by using the -I command
line option, or by changing your Sass configuration options.
Error: Invalid CSS after "...lor}: #{$value}": expected "{", was ";"
on line 4 of /Users/benlindsay/scratch/python/webassets/test-2/bootstrap/scss/_root.scss
from line 11 of /Users/benlindsay/scratch/python/webassets/test-2/bootstrap/scss/bootstrap.scss
from line 1 of standard input
Use --trace for backtrace.
If I follow the instructions and set the environment variable SASS_PATH=., that gets rid of that part of the error message, but I still get the error:
Error: Invalid CSS after "...lor}: #{$value}": expected "{", was ";"
on line 4 of /Users/benlindsay/scratch/python/webassets/test-2/bootstrap/scss/_root.scss
from line 11 of /Users/benlindsay/scratch/python/webassets/test-2/bootstrap/scss/bootstrap.scss
from line 1 of standard input
Use --trace for backtrace.
I don't know SCSS syntax well yet, but I'd bet a lot of money this is me doing something wrong and not an error in the Bootstrap SCSS. Any thoughts on what I'm doing wrong would be greatly appreciated.
Turns out it actually kind of was a problem on Bootstrap's end. See https://github.com/sass/sass/issues/2383, specifically the quote:
This is a bug in our implementation—the parser shouldn't crash—but those Bootstrap styles aren't valid for Sass 3.5 as written.
Anyway, I just needed to update to the latest version of Ruby Sass (which apparently the webassets module depends on) and that fixed it.
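A quick way to confirm which Sass is actually being run (a sketch; webassets' scss filter shells out to the sass executable on your PATH, so that is the version that has to be new enough):

import subprocess

# Print the Ruby Sass version that webassets' scss filter will invoke.
print(subprocess.check_output(["sass", "--version"]).decode().strip())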

How to get TextBlob to work with all users on Ubuntu?

I'm trying to get TextBlob up and running for some teammates on a Unix server. It appears to work just fine when I run scripts that use TextBlob as root, but when I try on the new account I created, I get the following error:
**********************************************************************
Resource u'tokenizers/punkt/english.pickle' not found. Please
use the NLTK Downloader to obtain the resource: >>>
nltk.download()
Searched in:
- '/home/USERNAME/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- u''
**********************************************************************
Traceback (most recent call last):
File "sampleClassifier.py", line 25, in <module>
cl = NaiveBayesClassifier(train)
File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 192, in __init__
self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 169, in extract_features
return self.feature_extractor(text, self.train_set)
File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 81, in basic_extractor
word_features = _get_words_from_dataset(train_set)
File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 63, in _get_words_from_dataset
return set(all_words)
File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 62, in <genexpr>
all_words = chain.from_iterable(tokenize(words) for words, _ in dataset)
File "/usr/local/lib/python2.7/dist-packages/textblob/classifiers.py", line 59, in tokenize
return word_tokenize(words, include_punc=False)
File "/usr/local/lib/python2.7/dist-packages/textblob/tokenizers.py", line 72, in word_tokenize
for sentence in sent_tokenize(text))
File "/usr/local/lib/python2.7/dist-packages/textblob/base.py", line 64, in itokenize
return (t for t in self.tokenize(text, *args, **kwargs))
File "/usr/local/lib/python2.7/dist-packages/textblob/decorators.py", line 38, in decorated
raise MissingCorpusError()
textblob.exceptions.MissingCorpusError:
Looks like you are missing some required data for this feature.
To download the necessary data, simply run
python -m textblob.download_corpora
or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.
The machine we're working with is very small, so I can't overwhelm it by downloading the corpora several times for different users. Does anyone know how I might fix this issue? I already have the data installed for root, but I don't know where the packages are or how to find them.
Following the instructions in the docs should work. Try setting the NLTK_DATA environment variable and see if it works for the new user.
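A sketch of the shared-data alternative (assuming you can write to /usr/share/nltk_data, which is already on the search path listed in the error message): download the corpora once, as root, to a system-wide directory so every account finds them without a second copy.

import nltk

# Run once (e.g. as root): put the missing data in a directory that
# every user's NLTK searches by default.
nltk.download("punkt", download_dir="/usr/share/nltk_data")

After that, every account should find tokenizers/punkt without a per-user copy, and no NLTK_DATA variable is needed when using one of the default directories.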

pycparser.plyparser.ParseError on complex struct

I'm trying to use pycparser to parse this C code:
https://github.com/nbeaver/mx-trunk/blob/0b80678773582babcd56fe959d5cfbb776cc0004/libMx/d_adsc_two_theta.c
A repo with a minimal example and Makefile is here:
https://github.com/nbeaver/pycparser-problem
Using pycparser v2.14 (from pip) and gcc 4.9.2 on Debian Jessie.
Things I have tried:
-Pass the -nostdinc flag to gcc and include the fake_libc_include folder.
-Use -D'__attribute__(x)=' to take out GCC extensions.
-Use fake headers for e.g. <sys/param.h>.
-Use -std=c99 in case the code is not C99 compatible.
-Reproduce the redis example in case there is something weird with my machine.
This is what the traceback looks like:
Traceback (most recent call last):
File "just_parse.py", line 21, in <module>
parse(path)
File "just_parse.py", line 9, in parse
ast = pycparser.parse_file(filename)
File "/home/nathaniel/.local/lib/python2.7/site-packages/pycparser/__init__.py", line 93, in parse_file
return parser.parse(text, filename)
File "/home/nathaniel/.local/lib/python2.7/site-packages/pycparser/c_parser.py", line 146, in parse
debug=debuglevel)
File "/home/nathaniel/.local/lib/python2.7/site-packages/pycparser/ply/yacc.py", line 265, in parse
return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
File "/home/nathaniel/.local/lib/python2.7/site-packages/pycparser/ply/yacc.py", line 1047, in parseopt_notrack
tok = self.errorfunc(errtoken)
File "/home/nathaniel/.local/lib/python2.7/site-packages/pycparser/c_parser.py", line 1680, in p_error
column=self.clex.find_tok_column(p)))
File "/home/nathaniel/.local/lib/python2.7/site-packages/pycparser/plyparser.py", line 55, in _parse_error
raise ParseError("%s: %s" % (coord, msg))
pycparser.plyparser.ParseError: in/d_adsc_two_theta.c:63:82: before: .
The traceback points to this line:
https://github.com/nbeaver/mx-trunk/blob/0b80678773582babcd56fe959d5cfbb776cc0004/libMx/d_adsc_two_theta.c#L63
Which in turn points to this #define macro:
https://github.com/nbeaver/mx-trunk/blob/0b80678773582babcd56fe959d5cfbb776cc0004/libMx/mx_motor.h#L484
The cause appears to be the offsetof() macro. Minimal working examples are fixed by recent pycparser commits, however:
https://github.com/eliben/pycparser/issues/87
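Until an upgrade past v2.14 is possible, a workaround in the same spirit as the -D'__attribute__(x)=' trick above might be to stub offsetof() out at preprocessing time, so the nested member access ("before: .") never reaches the parser. A sketch (parse_file's use_cpp/cpp_path/cpp_args parameters are real; the file and include paths are placeholders):

import pycparser

# Define offsetof() away before pycparser sees it; the parse error comes
# from the member access inside the offsetof() argument.
ast = pycparser.parse_file(
    "libMx/d_adsc_two_theta.c",      # placeholder path
    use_cpp=True,
    cpp_path="gcc",
    cpp_args=["-E", "-nostdinc",
              "-Ifake_libc_include",  # pycparser's fake headers
              r"-D__attribute__(x)=",
              r"-Doffsetof(t,m)=0"])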

Python error trying to convert JaCoCo to Cobertura

I am trying to convert a JaCoCo coverage report to Cobertura format (since Shippable only supports Cobertura). This guy claims to have a tool to convert JaCoCo to Cobertura; however, when running his script I get the following error:
Traceback (most recent call last):
File "cover2cover.py", line 151, in <module>
jacoco2cobertura(filename, source_root)
File "cover2cover.py", line 139, in jacoco2cobertura
convert_root(root, into, source_root)
File "cover2cover.py", line 127, in convert_root
packages.append(convert_package(package))
File "cover2cover.py", line 113, in convert_package
c_classes.append(convert_class(j_class, j_package))
File "cover2cover.py", line 100, in convert_class
c_methods.append(convert_method(j_method, j_method_lines))
File "cover2cover.py", line 85, in convert_method
convert_lines(j_lines, c_method)
File "cover2cover.py", line 33, in convert_lines
for jline in j_lines:
File "cover2cover.py", line 23, in method_lines
larger = list(int(jm.attrib['line']) for jm in jmethods if int(jm.attrib['line']) > start_line)
File "cover2cover.py", line 23, in <genexpr>
larger = list(int(jm.attrib['line']) for jm in jmethods if int(jm.attrib['line']) > start_line)
KeyError: 'line'
I know nothing about python, so any help would be appreciated.
I don't know Python either, but I know that Python 2 and Python 3 have significant differences. Perhaps you ran into that?
I was able to run the script ok with this version:
$> python --version
Python 2.7.11
To ensure I got the script without any download, browser, or line-ending issues, I cloned the git repo:
$> git clone https://github.com/rix0rrr/cover2cover.git
Then the script ran on the first try against my JaCoCo XML file.
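For completeness, the KeyError points at line 23 of cover2cover.py: some <method> elements in the JaCoCo XML apparently carry no line attribute (my guess: synthetic or compiler-generated methods). A defensive sketch of that line would skip such entries instead of crashing:

# cover2cover.py line 23, sketched fix: ignore methods without a "line"
# attribute rather than raising KeyError on them.
larger = list(int(jm.attrib['line']) for jm in jmethods
              if 'line' in jm.attrib and int(jm.attrib['line']) > start_line)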
