Can't load model after adding custom component to pipeline (spaCy) - python

I trained an NER model from scratch on a custom dataset, with no other components in the pipeline. I then loaded the model, added another component (Phrase Matcher) to the existing pipeline, and trained the NER again. Now I can't load the newly saved model.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/deusxmachine/.local/lib/python2.7/site-packages/spacy/__init__.py", line 19, in load
return util.load_model(name, **overrides)
File "/home/deusxmachine/.local/lib/python2.7/site-packages/spacy/util.py", line 116, in load_model
return load_model_from_path(Path(name), **overrides)
File "/home/deusxmachine/.local/lib/python2.7/site-packages/spacy/util.py", line 156, in load_model_from_path
component = nlp.create_pipe(name, config=config)
File "/home/deusxmachine/.local/lib/python2.7/site-packages/spacy/language.py", line 215, in create_pipe
raise KeyError("Can't find factory for '{}'.".format(name))
KeyError: u"Can't find factory for 'my_component'."`

Related

Calling a pre-trained model from huggingface results in AttributeError

I'm trying to set up a pre-trained model from huggingface. This is the start of my code:
from transformers import pipeline, TFAutoModelForSequenceClassification
model_name = "microsoft/deberta-base-mnli"
text_classifier = pipeline("text-classification", model=model_name)
model = TFAutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2, ignore_mismatched_sizes=True, from_pt=True)
But this last line already produces the following error:
Traceback (most recent call last):
File "/home/liv/PycharmProjects/ZSTC_for_MHC/models/baselines/single_task.py", line 29, in <module>
model = TFAutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2, ignore_mismatched_sizes=True,
File "/home/liv/PycharmProjects/ZSTC_for_MHC/.venv/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 446, in from_pretrained
return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
File "/home/liv/PycharmProjects/ZSTC_for_MHC/.venv/lib/python3.9/site-packages/transformers/modeling_tf_utils.py", line 1972, in from_pretrained
return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
File "/home/liv/PycharmProjects/ZSTC_for_MHC/.venv/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 122, in load_pytorch_checkpoint_in_tf2_model
logger.info(f"PyTorch checkpoint contains {sum(t.numel() for t in pt_state_dict.values()):,} parameters")
File "/home/liv/PycharmProjects/ZSTC_for_MHC/.venv/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 122, in <genexpr>
logger.info(f"PyTorch checkpoint contains {sum(t.numel() for t in pt_state_dict.values()):,} parameters")
AttributeError: 'dict' object has no attribute 'numel'
I'm not calling numel myself, and I don't even know which dictionary the error refers to. Furthermore, I want to use TensorFlow for this, so I don't know whether it's normal that the error comes from PyTorch code. Does anyone know how to fix this? Google didn't turn up any solutions.
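One hedged workaround, since the crash happens inside the PyTorch-to-TensorFlow conversion path (load_pytorch_checkpoint_in_tf2_model): if PyTorch is acceptable, loading the native PyTorch class skips the from_pt weight conversion entirely. A sketch, not a confirmed fix for this transformers version:
from transformers import AutoModelForSequenceClassification

# Load the checkpoint natively in PyTorch; no PT->TF conversion is involved
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-base-mnli",
    num_labels=2,
    ignore_mismatched_sizes=True,
)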

Error while loading fine-tuned simpletransformer model in Docker Container

I am saving and loading a model using torch.save() and torch.load().
While loading a fine-tuned simpletransformers model in a Docker container, I am facing this error, which I have not been able to resolve:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 594, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 853, in _load
result = unpickler.load()
File "/usr/local/lib/python3.7/dist-packages/transformers/models/xlm_roberta/tokenization_xlm_roberta.py", line 161, in __setstate__
self.sp_model.Load(self.vocab_file)
File "/usr/local/lib/python3.7/dist-packages/sentencepiece.py", line 367, in Load
return self.LoadFromFile(model_file)
File "/usr/local/lib/python3.7/dist-packages/sentencepiece.py", line 177, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
OSError: Not found: "/home/jupyter/.cache/huggingface/transformers/9df9ae4442348b73950203b63d1b8ed2d18eba68921872aee0c3a9d05b9673c6.00628a9eeb8baf4080d44a0abe9fe8057893de20c7cb6e6423cddbf452f7d4d8": No such file or directory Error #2
If anyone has any idea about it, please let me know.
I am using:
torch==1.7.1+cu101
sentence-transformers 0.3.9
simpletransformers 0.51.15
transformers 4.4.2
tensorflow 2.2.0
I suggest using state_dict objects - plain Python dictionaries - since they can easily be saved, updated, and restored, giving you flexibility when restoring the model later. Here is the recommended save/load pattern for models via state_dict:
Save
torch.save(model.state_dict(), PATH)
Load
model = TheModelClass(*args, **kwargs)  # re-create the architecture first
model.load_state_dict(torch.load(PATH))
model.eval()  # put dropout/batch-norm layers into inference mode
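A hedged aside on the actual traceback: the unpickler is rebuilding the tokenizer and re-opening its SentencePiece vocab from a hard-coded Hugging Face cache path that existed on the training machine but not in the container. Saving the tokenizer files explicitly avoids pickling that path; a sketch, assuming an XLM-RoBERTa tokenizer as in the traceback and a hypothetical /model_dir:
from transformers import XLMRobertaTokenizer

# On the training machine: tokenizer is the fine-tuned model's tokenizer object;
# write its files next to the model weights
tokenizer.save_pretrained("/model_dir")

# Inside the container: rebuild the tokenizer from those files,
# instead of the host's ~/.cache/huggingface path baked into the pickle
tokenizer = XLMRobertaTokenizer.from_pretrained("/model_dir")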

Cannot load BERT from local disk

I am trying to use the Hugging Face transformers API to load a locally downloaded M-BERT model, but it throws an exception.
I cloned this repo: https://huggingface.co/bert-base-multilingual-cased
bert = TFBertModel.from_pretrained("input/bert-base-multilingual-cased")
The directory structure is that of the cloned repo.
But I am getting this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py", line 1277, in from_pretrained
missing_keys, unexpected_keys = load_tf_weights(model, resolved_archive_file, load_weight_prefix)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py", line 467, in load_tf_weights
with h5py.File(resolved_archive_file, "r") as f:
File "/usr/local/lib/python3.7/dist-packages/h5py/_hl/files.py", line 408, in __init__
swmr=swmr)
File "/usr/local/lib/python3.7/dist-packages/h5py/_hl/files.py", line 173, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 81, in <module>
__main__()
File "train.py", line 59, in __main__
model = create_model(num_classes)
File "/content/drive/My Drive/msc-project/code/model.py", line 26, in create_model
bert = TFBertModel.from_pretrained("input/bert-base-multilingual-cased")
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py", line 1280, in from_pretrained
"Unable to load weights from h5 file. "
OSError: Unable to load weights from h5 file. If you tried to load a TF 2.0 model from a PyTorch checkpoint, please set from_pt=True.
Where am I going wrong?
Need help!
Thanks in advance.
As was already pointed out in the comments, your from_pretrained param should be either the id of a model hosted on huggingface.co or a local path:
A path to a directory containing model weights saved using
save_pretrained(), e.g., ./my_model_directory/.
See documentation
Looking at your stacktrace, it seems your code runs inside
/content/drive/My Drive/msc-project/code/model.py, so unless your model is in
/content/drive/My Drive/msc-project/code/input/bert-base-multilingual-cased/ it won't load.
I would also set the path to match the documentation example, i.e.:
bert = TFBertModel.from_pretrained("./input/bert-base-multilingual-cased/")

Where to download the resnet50.h5 file

I got the following error when trying to load a ResNet50 model. Where should I download the resnet50.h5 file?
Traceback (most recent call last):
File "C:\Users\drlng\Desktop\image-captioning-keras-resnet-main\app.py", line 61, in <module>
resnet = load_model('resnet.h5')
File "C:\Users\drlng\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\saving\save.py", line 211, in load_model
loader_impl.parse_saved_model(filepath)
File "C:\Users\drlng\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\saved_model\loader_impl.py", line 111, in parse_saved_model
raise IOError("SavedModel file does not exist at: %s/{%s|%s}" %
OSError: SavedModel file does not exist at: resnet.h5/{saved_model.pbtxt|saved_model.pb}
I use resnet50.py to build my model and read the ResNet50 weights from the link below:
weights
You can download the pre-trained models there; it works well for me.
If you are looking for pre-trained weights of ResNet-50, you can find them here.
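A hedged alternative that sidesteps hunting for the .h5 manually: Keras can download and cache the ImageNet weights itself. A sketch, assuming TF 2.x and that app.py expects a file named resnet.h5:
from tensorflow.keras.applications import ResNet50

# The first call downloads the weights to ~/.keras/models automatically
resnet = ResNet50(weights="imagenet")
resnet.save("resnet.h5")  # so load_model('resnet.h5') in app.py can find it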

KeyError: u'NearestNeighbors' on loading saved model from tf.contrib.factorization.KMeansClustering

I am trying to do the following:
1. Run kmeans clustering using tensorflow (1.8.0)
2. Save the model using kmeans.export_savedmodel
3. Use the model using tf.saved_model.loader.load
I am using the exact script at: https://www.tensorflow.org/api_docs/python/tf/contrib/factorization/KMeansClustering
I am using following code for saving the model:
Input Receiver:
def serving_input_receiver_fn():
    feature_spec = {"x": tf.FixedLenFeature(dtype=tf.float32, shape=[2])}
    model_placeholder = tf.placeholder(dtype=tf.string, shape=[None], name='input')
    receiver_tensors = {"model_inputs": model_placeholder}
    features = tf.parse_example(model_placeholder, feature_spec)
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
Export:
kmeans.export_savedmodel("/path/", serving_input_receiver_fn)
To import I use:
tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING],"/path")
On last step I run into this issue:
Traceback (most recent call last):
File "restore_model.py", line 6, in <module>
tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], "/Users/z001t3k/work/codebase/ContentPipeline/cep-scripts/cep/datacollection/algorithms/cluster_model/1525963476")
File "/Users/z001t3k/python_virtualenvs/tensorflow/lib/python2.7/site-packages/tensorflow/python/saved_model/loader_impl.py", line 219, in load
saver = tf_saver.import_meta_graph(meta_graph_def_to_load, **saver_kwargs)
File "/Users/z001t3k/python_virtualenvs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1955, in import_meta_graph
**kwargs)
File "/Users/z001t3k/python_virtualenvs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py", line 743, in import_scoped_meta_graph
producer_op_list=producer_op_list)
File "/Users/z001t3k/python_virtualenvs/tensorflow/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
return func(*args, **kwargs)
File "/Users/z001t3k/python_virtualenvs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 460, in import_graph_def
_RemoveDefaultAttrs(op_dict, producer_op_list, graph_def)
File "/Users/z001t3k/python_virtualenvs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 227, in _RemoveDefaultAttrs
op_def = op_dict[node.op]
KeyError: u'NearestNeighbors'
Tensorflow is having trouble locating the NearestNeighbors op, which is part of the graph you're loading. Ops defined in contrib are loaded dynamically when you import the corresponding contrib package in Python.
So just add
import tensorflow.contrib.factorization
before loading the SavedModel.
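Put together, the load step would look roughly like this (a sketch, assuming TF 1.8 and the export path from the question):
import tensorflow as tf
import tensorflow.contrib.factorization  # registers contrib ops such as NearestNeighbors

with tf.Session() as sess:
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], "/path")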
