Error while running Haystack models on CPU [duplicate]

2023-01-25 08:21:21,659 - ERROR - Traceback (most recent call last):
File "/home/xyzUser/project/queue_handler/document_queue_listner.py", line 148, in __process_and_acknowledge
pipeline_result = self.__process_document_type(message, pipeline_input)
File "/home/xyzUser/project/queue_handler/document_queue_listner.py", line 194, in __process_document_type
pipeline_result = bill_parser_pipeline.process(pipeline_input)
File "/home/xyzUser/project/main/billparser/__init__.py", line 18, in process
bill_extractor_model = MachineGeneratedBillExtractorModel()
File "/home/xyzUser/project/main/billparser/models/qa_model.py", line 25, in __new__
cls.__model = TransformersReader(model_name_or_path=cls.__model_path, use_gpu=False)
File "/home/xyzUser/project/.env/lib/python3.8/site-packages/haystack/nodes/base.py", line 48, in wrapper_exportable_to_yaml
init_func(self, *args, **kwargs)
File "/home/xyzUser/project/.env/lib/python3.8/site-packages/haystack/nodes/reader/transformers.py", line 93, in __init__
self.model = pipeline(
File "/home/xyzUser/project/.env/lib/python3.8/site-packages/transformers/pipelines/__init__.py", line 542, in pipeline
return task_class(model=model, framework=framework, task=task, **kwargs)
File "/home/xyzUser/project/.env/lib/python3.8/site-packages/transformers/pipelines/question_answering.py", line 125, in __init__
super().__init__(
File "/home/xyzUser/project/.env/lib/python3.8/site-packages/transformers/pipelines/base.py", line 691, in __init__
self.device = device if framework == "tf" else torch.device("cpu" if device < 0 else f"cuda:{device}")
TypeError: '<' not supported between instances of 'torch.device' and 'int'
This is the error message I got after installing a requirements.txt file from my project. I think it is related to torch, but I don't know how to fix it. I am new to Hugging Face transformers and don't know if it is a version issue.

This was a bug in the transformers package for a number of versions prior to v4.22.0: that particular line of code does not check whether the device argument might be a torch.device before comparing it with an int. Tracing through git blame, we can find that changeset 9d4a45509ab introduced the much-needed if isinstance(device, torch.device): check (line 764 of the resulting file), which ensures this error won't happen. Checking the tags that contain that changeset shows that releases v4.22.0 and later should include this particular fix. As a refresher, to update a specific package, activate the environment and issue the following:
pip install -U transformers
Alternatively with a specific version, e.g.:
pip install -U transformers==4.22.0
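For reference, here is a minimal sketch of what the fixed logic does. The function name resolve_device is illustrative, not the actual transformers API; only the isinstance guard mirrors the changeset above:

import torch

def resolve_device(device):
    # The pre-fix code evaluated `device < 0` directly, which raises a
    # TypeError when `device` is already a torch.device instance.
    # The v4.22.0 fix checks for that case first:
    if isinstance(device, torch.device):
        return device
    # Otherwise `device` is an int: -1 selects the CPU, >= 0 a CUDA card.
    return torch.device("cpu" if device < 0 else f"cuda:{device}")

print(resolve_device(-1))                   # cpu
print(resolve_device(torch.device("cpu")))  # cpu, no TypeError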

Related

Impossible to retrieve data from pyattck module

I am using the pyattck module to retrieve information from MITRE ATT&CK.
Versions:
- pyattck==7.0.0
- pyattck-data==2.5.2
Then, I just created a simple main.py file to test the module.
from pyattck import Attck

def main():
    attck = Attck()
    for technique in attck.enterprise.techniques:
        print(technique.name)

if __name__ == '__main__':
    main()
When running the main.py script I get the following exception:
Traceback (most recent call last):
File "/<path>/main.py", line 15, in <module>
main()
File "/<path>/main.py", line 8, in main
for technique in attck.enterprise.techniques:
File "/<path_venv>/lib/python3.10/site-packages/pyattck/attck.py", line 253, in enterprise
from .enterprise import EnterpriseAttck
File "/<path_venv>/lib/python3.10/site-packages/pyattck/enterprise.py", line 7, in <module>
class EnterpriseAttck(Base):
File "/<path_venv>/lib/python3.10/site-packages/pyattck/enterprise.py", line 42, in EnterpriseAttck
__attck = MitreAttck(**Base.config.get_data("enterprise_attck_json"))
File "/<path_venv>/lib/python3.10/site-packages/pyattck_data/attack.py", line 55, in __init__
raise te
File "/<path_venv>/lib/python3.10/site-packages/pyattck_data/attack.py", line 53, in __init__
self.__attrs_init__(**kwargs)
File "<attrs generated init pyattck_data.attack.MitreAttck>", line 14, in __attrs_init__
File "/<path_venv>/lib/python3.10/site-packages/pyattck_data/attack.py", line 66, in __attrs_post_init__
raise te
File "/<path_venv>/lib/python3.10/site-packages/pyattck_data/attack.py", line 62, in __attrs_post_init__
data = TYPE_MAP.get(item['type'])(**item)
TypeError: 'NoneType' object is not callable
Does anyone know where the issue is? Maybe I have forgotten to import something? It would be helpful to know whether this module actually works in another version; this one is the latest stable one as of the time of writing.
UPDATE
There is an issue with this project: MITRE added some new features that are not supported by the module and make it unusable.
There is an issue on GitHub related to this.
It has already been fixed in later releases. You just need to update the pyattck-data package from the bugged version 2.5.2 to 2.6.1 (or anything newer).
If you are using pip, just run this:
pip install --upgrade pyattck-data
If you are using conda (inside your venv):
conda update pyattck-data
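To confirm the upgrade took effect, you can check the installed version. A quick sketch using only the standard library (importlib.metadata requires Python 3.8+):

from importlib.metadata import version

# The fix ships in pyattck-data 2.6.1, so anything at or above that works.
print(version("pyattck-data"))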

Error can't get attribute Net when saving PyTorch model with MLFlow

After installing MLFlow using one-click-mlflow, I saved a PyTorch model using the default command that I found in the user guide. You can find the command below:
mlflow.pytorch.log_model(net, artifact_path="model", pickle_module=pickle)
The saved neural network is very simple: it is basically a two-layer neural network with Xavier initialization and hyperbolic tangent as the activation function.
class Net(T.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.hid1 = T.nn.Linear(n_features, 10)
        self.hid2 = T.nn.Linear(10, 10)
        self.oupt = T.nn.Linear(10, 1)
        T.nn.init.xavier_uniform_(self.hid1.weight)
        T.nn.init.zeros_(self.hid1.bias)
        T.nn.init.xavier_uniform_(self.hid2.weight)
        T.nn.init.zeros_(self.hid2.bias)
        T.nn.init.xavier_uniform_(self.oupt.weight)
        T.nn.init.zeros_(self.oupt.bias)

    def forward(self, x):
        z = T.tanh(self.hid1(x))
        z = T.tanh(self.hid2(z))
        z = self.oupt(z)
        return z
Everything runs fine in the Jupyter Notebook. I can log metrics and other artifacts, but when I save the model I get the following error message:
2021/10/13 09:21:00 WARNING mlflow.utils.requirements_utils: Found torch version (1.9.0+cu111) contains a local version label (+cu111). MLflow logged a pip requirement for this package as 'torch==1.9.0' without the local version label to make it installable from PyPI. To specify pip requirements containing local version labels, please use `conda_env` or `pip_requirements`.
2021/10/13 09:21:00 WARNING mlflow.utils.requirements_utils: Found torchvision version (0.10.0+cu111) contains a local version label (+cu111). MLflow logged a pip requirement for this package as 'torchvision==0.10.0' without the local version label to make it installable from PyPI. To specify pip requirements containing local version labels, please use `conda_env` or `pip_requirements`.
2021/10/13 09:21:01 ERROR mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /tmp/tmpnl9dsoye/model/data, flavor: pytorch)
Traceback (most recent call last):
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/environment.py", line 212, in infer_pip_requirements
return _infer_requirements(model_uri, flavor)
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/requirements_utils.py", line 263, in _infer_requirements
modules = _capture_imported_modules(model_uri, flavor)
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/requirements_utils.py", line 221, in _capture_imported_modules
_run_command(
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/requirements_utils.py", line 173, in _run_command
raise MlflowException(msg)
mlflow.exceptions.MlflowException: Encountered an unexpected error while running ['/home/ucsky/.virtualenv/mymodel/bin/python', '/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/_capture_modules.py', '--model-path', '/tmp/tmpnl9dsoye/model/data', '--flavor', 'pytorch', '--output-file', '/tmp/tmplyj0w2fr/imported_modules.txt', '--sys-path', '["/home/ucsky/project/ofi-ds-research/incubator/ofi-pe-fr/notebook/guillaume-simon/06-modelisation-pytorch", "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/git/ext/gitdb", "/usr/lib/python39.zip", "/usr/lib/python3.9", "/usr/lib/python3.9/lib-dynload", "", "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages", "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/IPython/extensions", "/home/ucsky/.ipython", "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/gitdb/ext/smmap"]']
exit status: 1
stdout:
stderr: Traceback (most recent call last):
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/_capture_modules.py", line 125, in <module>
main()
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/_capture_modules.py", line 118, in main
importlib.import_module(f"mlflow.{flavor}")._load_pyfunc(model_path)
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/pytorch/__init__.py", line 723, in _load_pyfunc
return _PyTorchWrapper(_load_model(path, **kwargs))
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/pytorch/__init__.py", line 626, in _load_model
return torch.load(model_path, **kwargs)
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/torch/serialization.py", line 607, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/torch/serialization.py", line 882, in _load
result = unpickler.load()
File "/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/torch/serialization.py", line 875, in find_class
return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'Net' on <module '__main__' from '/home/ucsky/.virtualenv/mymodel/lib/python3.9/site-packages/mlflow/utils/_capture_modules.py'>
Can somebody explain to me what is wrong?
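One likely cause, offered as an assumption since it is not confirmed in this thread: pickle records a class by reference, so when MLflow's capture subprocess calls torch.load, the Net class must be importable, and a class defined inside a notebook lives in __main__, which that subprocess cannot import it from. A common workaround is to move the class into a real module; the file name mynet.py and the code_paths usage below are illustrative (check your MLflow version's documentation):

# mynet.py -- define the model in an importable module, not in the notebook
import torch as T

class Net(T.nn.Module):
    def __init__(self, n_features):
        super(Net, self).__init__()
        self.hid1 = T.nn.Linear(n_features, 10)
        self.oupt = T.nn.Linear(10, 1)

    def forward(self, x):
        return self.oupt(T.tanh(self.hid1(x)))

Then, in the notebook, import it and ship the source file alongside the model:

from mynet import Net
import mlflow.pytorch

net = Net(n_features=4)
mlflow.pytorch.log_model(net, artifact_path="model", code_paths=["mynet.py"])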

Error when loading a custom model with spacy

I'm trying to load a custom model called 'ru2' into spaCy (for NLP processing).
It can be found here: https://github.com/buriy/spacy-ru
The problem is that when I call
nlp = spacy.load('ru2')
doc = nlp(text)
I see the error
C:\ProgramData\Anaconda3\lib\importlib\_bootstrap.py:205: RuntimeWarning: spacy.tokens.span.Span size changed, may indicate binary incompatibility. Expected 72 from C header, got 80 from PyObject
return f(*args, **kwds)
Traceback (most recent call last):
File "C://.../nlp/src/ie/main.py", line 125, in <module>
main(examp_dict['Poroshenko'])
File "C://.../nlp/src/ie/main.py", line 92, in main
nlp = spacy.load('ru2')
File "C:\ProgramData\Anaconda3\lib\site-packages\spacy\__init__.py", line 27, in load
return util.load_model(name, **overrides)
File "C:\ProgramData\Anaconda3\lib\site-packages\spacy\util.py", line 133, in load_model
return load_model_from_path(Path(name), **overrides)
File "C:\ProgramData\Anaconda3\lib\site-packages\spacy\util.py", line 173, in load_model_from_path
return nlp.from_disk(model_path)
File "C:\ProgramData\Anaconda3\lib\site-packages\spacy\language.py", line 791, in from_disk
util.from_disk(path, deserializers, exclude)
File "C:\ProgramData\Anaconda3\lib\site-packages\spacy\util.py", line 630, in from_disk
reader(path / key)
File "C:\ProgramData\Anaconda3\lib\site-packages\spacy\language.py", line 781, in <lambda>
deserializers["tokenizer"] = lambda p: self.tokenizer.from_disk(p, exclude=["vocab"])
File "tokenizer.pyx", line 391, in spacy.tokenizer.Tokenizer.from_disk
File "tokenizer.pyx", line 432, in spacy.tokenizer.Tokenizer.from_bytes
File "C:\ProgramData\Anaconda3\lib\site-packages\spacy\util.py", line 606, in from_bytes
msg = srsly.msgpack_loads(bytes_data)
File "C:\ProgramData\Anaconda3\lib\site-packages\srsly\_msgpack_api.py", line 29, in msgpack_loads
msg = msgpack.loads(data, raw=False, use_list=use_list)
File "C:\ProgramData\Anaconda3\lib\site-packages\srsly\msgpack\__init__.py", line 60, in unpackb
return _unpackb(packed, **kwargs)
File "_unpacker.pyx", line 191, in srsly.msgpack._unpacker.unpackb
TypeError: unhashable type: 'list'
I searched for similar questions on the Internet:
https://github.com/explosion/spaCy/issues/2715
https://spacy.io/usage#unhashable-list
But none of those solutions works for me.
I use
msgpack==0.5.6 (even downgraded as suggested in the link above)
spacy==2.1.4
Here is the relevant passage from https://spacy.io/usage#troubleshooting:
If you’re training models, writing them to disk, and versioning them with git, you might encounter this error when trying to load them in a Windows environment. This happens because a default install of Git for Windows is configured to automatically convert Unix-style end-of-line characters (LF) to Windows-style ones (CRLF) during file checkout (and the reverse when committing). While that’s mostly fine for text files, a trained model written to disk has some binary files that should not go through this conversion. When they do, you get the error above. You can fix it by either changing your core.autocrlf setting to "false", or by committing a .gitattributes file to your repository to tell git on which files or folders it shouldn’t do LF-to-CRLF conversion, with an entry like path/to/spacy/model/** -text. After you’ve done either of these, clone your repository again.
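Concretely, the two fixes from that passage come down to the following (the model path is a placeholder for wherever your model lives in the repository). Either disable the conversion globally:

git config core.autocrlf false

or add a .gitattributes entry and re-clone:

path/to/spacy/model/** -text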
It might be because the version number of SpaCy used to generate your model is not the same as the version of SpaCy you have installed. (I don't know of course, just mentioning it in case it helps.)
Adding to the answer above, another quick fix would be to manually download the zip from the repository.

Python, how to handle the "ValueError: unsupported pickle protocol: 4" error?

I'm new to Python.
I have to run this TargetFinder script ("Custom Analyses").
I installed all the required Python packages, copied the code into a script I named main.py, and ran it.
I got this error:
[davide#laptop]$ python main.py
Traceback (most recent call last):
File "main.py", line 8, in <module>
training_df = pd.read_hdf('./paper/targetfinder/K562/output-epw/training.h5', 'training').set_index(['enhancer_name', 'promoter_name'])
File "/usr/lib64/python2.7/site-packages/pandas/io/pytables.py", line 330, in read_hdf
return store.select(key, auto_close=auto_close, **kwargs)
File "/usr/lib64/python2.7/site-packages/pandas/io/pytables.py", line 680, in select
return it.get_result()
File "/usr/lib64/python2.7/site-packages/pandas/io/pytables.py", line 1364, in get_result
results = self.func(self.start, self.stop, where)
File "/usr/lib64/python2.7/site-packages/pandas/io/pytables.py", line 673, in func
columns=columns, **kwargs)
File "/usr/lib64/python2.7/site-packages/pandas/io/pytables.py", line 2786, in read
values = self.read_array('block%d_values' % i)
File "/usr/lib64/python2.7/site-packages/pandas/io/pytables.py", line 2327, in read_array
data = node[:]
File "/usr/lib64/python2.7/site-packages/tables/vlarray.py", line 677, in __getitem__
return self.read(start, stop, step)
File "/usr/lib64/python2.7/site-packages/tables/vlarray.py", line 817, in read
outlistarr = [atom.fromarray(arr) for arr in listarr]
File "/usr/lib64/python2.7/site-packages/tables/atom.py", line 1211, in fromarray
return cPickle.loads(array.tostring())
ValueError: unsupported pickle protocol: 4
I have no idea what this pickle protocol means, and my colleagues don't know anything about it either.
How can I solve this problem?
I'm using Python 2.7.5 on CentOS Linux release 7.2.1511 (Core).
The pickle protocol is basically the file format. From the documentation:
"The higher the protocol used, the more recent the version of Python needed to read the pickle produced."
Pickle protocol version 4 was added in Python 3.4; your Python version (2.7.5) does not support it. Either upgrade to Python 3.4 or later (current is 3.5), or create the pickle using a lower protocol (2) as the third parameter to pickle.dump().
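For example, whichever Python 3 script produces the pickled data can pass protocol 2 explicitly so that Python 2.7 can read it back. A minimal sketch with a placeholder payload:

import pickle

data = {"enhancer_name": "e1", "promoter_name": "p1"}  # placeholder payload

# Protocol 2 is the highest protocol that Python 2 understands.
with open("training.pkl", "wb") as f:
    pickle.dump(data, f, 2)  # third positional argument = protocol

with open("training.pkl", "rb") as f:
    print(pickle.load(f))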
This sometimes happens due to incorrect data in the Redis database. Try this:
sudo redis-cli flushall
It's a Python version issue; upgrade to the latest Python version and try again.

Unable to ping managed nodes using ansible-2.0

I downloaded ansible-2.0.0-0.2.alpha2.tar.gz and installed it on my control machine. However, now I'm not able to ping any of my machines. Previously, using v1.9.2, I could communicate with them. Now it gives the following error:
Unexpected Exception: lstat() argument 1 must be encoded string without NULL bytes, not str
the full traceback was:
Traceback (most recent call last):
File "/usr/bin/ansible", line 79, in
sys.exit(cli.run())
File "/usr/lib/python2.6/site-packages/ansible/cli/adhoc.py", line 111, in run
inventory = Inventory(loader=loader, variable_manager=variable_manager, host_list=self.options.inventory)
File "/usr/lib/python2.6/site-packages/ansible/inventory/init.py", line 77, in init
self.parse_inventory(host_list)
File "/usr/lib/python2.6/site-packages/ansible/inventory/init.py", line 133, in parse_inventory
host.vars = combine_vars(host.vars, self.get_host_variables(host.name))
File "/usr/lib/python2.6/site-packages/ansible/inventory/init.py", line 499, in get_host_variables
self.vars_per_host[hostname] = self.get_host_variables(hostname, vault_password=vault_password)
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init.py", line 529, in get_host_variables
vars = combine_vars(vars, self.get_host_vars(host))
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init_.py", line 653, in get_host_vars
return self.get_hostgroup_vars(host=host, group=None, new_pb_basedir=new_pb_basedir)
File "/usr/lib/python2.6/site-packages/ansible/inventory/__init_.py", line 702, in _get_hostgroup_vars
base_path = os.path.realpath(os.path.join(basedir, "host_vars/%s" % host.name))
File "/usr/lib64/python2.6/posixpath.py", line 365, in realpath
if islink(component):
File "/usr/lib64/python2.6/posixpath.py", line 132, in islink
st = os.lstat(path)
TypeError: lstat() argument 1 must be encoded string without NULL bytes, not str
Any help would be appreciated.
This is a known bug caused by some Unicode changes made to the playbook parser in 2.0. Several versions of Python shipped with a version of shlex.split() that fails horribly on Unicode input; you likely have one of them installed. The bug has been worked around, and the fix will be included in the next drop. See https://github.com/ansible/ansible/issues/12257
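The failure mode being described looks roughly like this on an affected interpreter (an illustration of the reported shlex behavior on Python 2.6, assuming a path like the one in the traceback):

# Python 2.6 (affected versions only)
import shlex

tokens = shlex.split(u'host_vars/web01')
# On buggy interpreters each token comes back padded with NUL bytes,
# e.g. 'h\x00\x00\x00o\x00\x00\x00s\x00\x00\x00t...', and feeding such a
# token to os.lstat() then raises:
# TypeError: lstat() argument 1 must be encoded string without NULL bytes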
