Unable to download saved model from online resource, pickle error

I am unable to download and use a model I saved earlier from an online repository. Here's the code:
model = Model().double() # Model is defined in another class
state_dict = torch.hub.load_state_dict_from_url(r'https://filebin.net/j2977ux7kts41aft/checkpoint_best.pt?t=wjbujfoo')
model.load_state_dict(state_dict)
model.eval()
This gives me the following error:
Traceback (most recent call last):
File "/path/file.py", line 47, in <module>
state_dict = torch.hub.load_state_dict_from_url(r'https://filebin.net/j2977ux7kts41aft/checkpoint_best.pt?t=wjbujfoo')
File "anaconda3/envs/torch_env/lib/python3.6/site-packages/torch/hub.py", line 466, in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File "/anaconda3/envs/torch_env/lib/python3.6/site-packages/torch/serialization.py", line 386, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "anaconda3/envs/torch_env/lib/python3.6/site-packages/torch/serialization.py", line 563, in _load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x0a'.
The model resides in:
https://filebin.net/j2977ux7kts41aft/checkpoint_best.pt?t=wjbujfoo
Note that I can download the file manually without any problem, and then load it with torch.load(path) without errors, but I need to do it from code! Could it be that the serialization somehow messes up the pickle encoding when downloading from a URL?
Edit: I don't have to use filebin; any online storage that supports what I'm trying to do will suffice.

This code, using the link from the 'download button' together with the map_location parameter, works fine for me (the invalid load key '\x0a' is a newline, which suggests the original URL was returning a text/HTML page rather than the raw checkpoint):
state_dict = torch.hub.load_state_dict_from_url(r'https://filebin.net/j2977ux7kts41aft/checkpoint_best.pt?t=wjbujfoo', map_location=torch.device('cpu'))
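If you would rather not rely on torch.hub at all, a minimal alternative sketch (my own suggestion, not part of the answer above) is to download the raw file from code and then load it locally, exactly like the manual download that already works; the URL is the one from the question:

import urllib.request
import torch

# Download the raw checkpoint bytes to a local file...
url = r'https://filebin.net/j2977ux7kts41aft/checkpoint_best.pt?t=wjbujfoo'
urllib.request.urlretrieve(url, 'checkpoint_best.pt')

# ...then load it the same way the manual-download case already works.
state_dict = torch.load('checkpoint_best.pt', map_location=torch.device('cpu'))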

The problem was indeed in the environment configuration. I created the model with PyTorch 1.0.2 and then updated to 1.2.0 in order to use torch.hub, which gave me the pickle error. After training a new model in 1.2.0, the error is gone.
Hope this helps someone in the future :)

Related

Changing a custom ResNet-18 architecture subtly and still using it in pretrained mode

Can I change a custom ResNet-18 architecture and still use it in pretrained=True mode? I am making a subtle change to the architecture of a custom resnet18, and when I run it, I get the error shown below.
This is how the custom resnet18 is called:
model = Resnet_18.resnet18(pretrained=True, embedding_size=args.dim_embed)
The new change in the custom resnet18:
self.layer_attend1 = nn.Sequential(nn.Conv2d(layers[0], layers[0], stride=2, padding=1, kernel_size=3),
nn.AdaptiveAvgPool2d(1),
nn.Softmax(1))
I am loading the checkpoint using:
checkpoint = torch.load(args.resume, encoding='latin1')
args.start_epoch = checkpoint['epoch']
best_acc = checkpoint['best_prec1']
tnet.load_state_dict(checkpoint['state_dict'])
The output of running the model is:
/scratch3/venv/fashcomp/lib/python3.8/site-packages/torchvision/transforms/transforms.py:310: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
warnings.warn("The use of the transforms.Scale transform is deprecated, " +
=> loading checkpoint 'runs/nondisjoint_l2norm/model_best.pth.tar'
Traceback (most recent call last):
File "main.py", line 352, in <module>
main()
File "main.py", line 145, in main
tnet.load_state_dict(checkpoint['state_dict'])
File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tripletnet:
Missing key(s) in state_dict: "embeddingnet.embeddingnet.layer_attend1.0.weight", "embeddingnet.embeddingnet.layer_attend1.0.bias".
So, how can you implement small architectural changes without retraining from scratch every time?
P.S.: Cross-posting here: https://discuss.pytorch.org/t/can-i-change-a-custom-resnet-18-architecture-subtly-and-still-use-it-in-pre-trained-true-mode/130783 Thanks a lot to Rodrigo Berriel for teaching me about https://meta.stackexchange.com/a/141824/913043
If you really want to do this, you should construct the model and then call load_state_dict with the argument strict=False (https://pytorch.org/docs/stable/generated/torch.nn.Module.html?highlight=load_state_dict#torch.nn.Module.load_state_dict).
Keep in mind that A) you should explicitly initialize any new layers you added, because they won't be covered by the state dict, and B) the model will probably not work out of the box because of the untrained weights, but it should train faster than a fully randomly initialized model.
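A minimal sketch of that suggestion, reusing names from the question (the checkpoint keys are the ones in your traceback; the choice of initializer is an assumption on my part):

import torch
import torch.nn as nn

# Load only the weights that still match the modified architecture.
checkpoint = torch.load(args.resume, map_location='cpu')
tnet.load_state_dict(checkpoint['state_dict'], strict=False)  # ignores the missing layer_attend1 keys

# Explicitly initialize the new block, since the checkpoint has no weights for it.
for m in tnet.embeddingnet.embeddingnet.layer_attend1.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        nn.init.zeros_(m.bias)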

Error while loading fine-tuned simpletransformer model in Docker Container

I am saving and loading a model using the torch.save() and torch.load() commands.
While loading a fine-tuned simpletransformers model in a Docker container, I am facing this error, which I am not able to resolve:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 594, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 853, in _load
result = unpickler.load()
File "/usr/local/lib/python3.7/dist-packages/transformers/models/xlm_roberta/tokenization_xlm_roberta.py", line 161, in __setstate__
self.sp_model.Load(self.vocab_file)
File "/usr/local/lib/python3.7/dist-packages/sentencepiece.py", line 367, in Load
return self.LoadFromFile(model_file)
File "/usr/local/lib/python3.7/dist-packages/sentencepiece.py", line 177, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
OSError: Not found: "/home/jupyter/.cache/huggingface/transformers/9df9ae4442348b73950203b63d1b8ed2d18eba68921872aee0c3a9d05b9673c6.00628a9eeb8baf4080d44a0abe9fe8057893de20c7cb6e6423cddbf452f7d4d8": No such file or directory Error #2
If anyone has any idea about it, please let me know.
I am using:
torch ==1.7.1+cu101
sentence-transformers 0.3.9
simpletransformers 0.51.15
transformers 4.4.2
tensorflow 2.2.0
I suggest using state_dict objects (plain Python dictionaries), as they can be easily saved, updated, and restored, giving you flexibility when restoring the model later. Here are the recommended save/load methods for saving models with a state_dict:
Save
torch.save(model.state_dict(), PATH)
Load
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()
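As a follow-up, if the container has no GPU, it also helps to pass map_location when loading, so tensors saved on a GPU machine are mapped to the CPU (a small sketch reusing the placeholders above):

model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH, map_location=torch.device('cpu')))
model.eval()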

AttributeError: format not found - pyodide + joblib.dump + scikit-learn (TfidfVectorizer)

I have pickled an SMS spam prediction model using pickle. Now, I want to use Pyodide to load the model in the browser.
I have loaded the pickled file using pickle.loads in the browser:
console.log("Pyodide loaded, downloading pretrained ML model...")
const model = (await blobToBase64(await (await fetch("/model.pkl")).blob())).replace("data:application/octet-stream;base64,", "")
console.log("Loading model into Pyodide...")
await pyodide.loadPackage("scikit-learn")
await pyodide.loadPackage("joblib")
pyodide.runPython(`
import base64
import pickle
from io import BytesIO
classifier, vectorizer = pickle.loads(base64.b64decode('${model}'))
`)
This works.
But, when I try to call:
const prediction = pyodide.runPython(`
vectorized_message = vectorizer.transform(["Call +172949 if you want to get $1000 immediately!!!!"])
classifier.predict(vectorized_message)[0]
`)
It gives an error (in vectorizer.transform): AttributeError: format not found
The full error dump is below.
Uncaught (in promise) Error: Traceback (most recent call last):
File "/lib/python3.8/site-packages/pyodide/_base.py", line 70, in eval_code
eval(compile(mod, "<exec>", mode="exec"), ns, ns)
File "<exec>", line 2, in <module>
File "/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 1899, in transform
return self._tfidf.transform(X, copy=False)
File "/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 1513, in transform
X = X * self._idf_diag
File "/lib/python3.8/site-packages/scipy/sparse/base.py", line 319, in __mul__
return self._mul_sparse_matrix(other)
File "/lib/python3.8/site-packages/scipy/sparse/compressed.py", line 478, in _mul_sparse_matrix
other = self.__class__(other) # convert to this format
File "/lib/python3.8/site-packages/scipy/sparse/compressed.py", line 28, in __init__
if arg1.format == self.format and copy:
File "/lib/python3.8/site-packages/scipy/sparse/base.py", line 525, in __getattr__
raise AttributeError(attr + " not found")
AttributeError: format not found
_hiwire_throw_error https://cdn.jsdelivr.net/pyodide/v0.16.1/full/pyodide.asm.js:8
__runPython https://cdn.jsdelivr.net/pyodide/v0.16.1/full/pyodide.asm.js:8
_runPythonInternal https://cdn.jsdelivr.net/pyodide/v0.16.1/full/pyodide.asm.js:8
runPython https://cdn.jsdelivr.net/pyodide/v0.16.1/full/pyodide.asm.js:8
<anonymous> http://localhost/:41
async* http://localhost/:46
pyodide.asm.js:8:39788
In Python it works fine, though.
What might I be doing wrong?
It's likely a pickle portability issue. Pickles should be portable between architectures¹ (here amd64 and wasm32); however, they are not portable across package versions. This means that package versions should be identical between the environment where you train your model and the one where you do the inference (pyodide).
pyodide 0.16.1 includes Python 3.8.2, scipy 0.17.1, and scikit-learn 0.22.2. Unfortunately, this means that you will have to build that version of scipy (and possibly numpy) from source to train the model, since a Python 3.8 binary wheel doesn't exist for such an outdated version of scipy. In the future this should be resolved with pyodide#1293.
The particular error you are getting is likely due to a scipy.sparse version mismatch; see scipy#6533.
¹Though tree-based models in scikit-learn are at present not portable across architectures, and so won't unpickle in pyodide. This is a known bug that should be fixed (scikit-learn#19602).
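A quick sanity check you can run in the training environment before pickling, assuming the pyodide 0.16.1 versions quoted above:

import sys
import sklearn
import scipy

# The training environment must match what pyodide 0.16.1 ships.
assert sys.version_info[:2] == (3, 8), sys.version
assert sklearn.__version__ == '0.22.2', sklearn.__version__
assert scipy.__version__ == '0.17.1', scipy.__version__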

No module named 'parse_config' while trying to load checkpoint in PyTorch

I have a checkpoint file saved after training a model in PyTorch. I have to inspect it in a different module, so I tried to load the checkpoint using the following code.
map_location = lambda storage, loc: storage
checkpoint = torch.load("model.pt", map_location=map_location)
But it raises a ModuleNotFoundError, which I couldn't find a way to resolve.
The error traceback:
Traceback (most recent call last):
File "main.py", line 11, in <module>
model = loadmodel(hook_feature)
File "/home/../model_loader.py", line 21, in loadmodel
checkpoint = torch.load(settings.MODEL_FILE, map_location=map_location)
File "/home/../.conda/envs/envreporting/lib/python3.6/site-packages/torch/serialization.py", line 584, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/home/../.conda/envs/envreporting/lib/python3.6/site-packages/torch/serialization.py", line 842, in _load
result = unpickler.load()
ModuleNotFoundError: No module named 'parse_config'
I couldn't find an existing issue relevant to this one.
Is it possible that you used https://github.com/victoresque/pytorch-template for training the model? In that case, the project also saves its config in the checkpoint, and you need to import the parse_config.py file in order to load it.
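A minimal sketch of that fix (the clone path is a hypothetical placeholder):

import sys
import torch

# Make the training project importable so that unpickling can resolve
# the reference to the parse_config module stored in the checkpoint.
sys.path.append('/path/to/pytorch-template')  # wherever the template project is cloned
import parse_config  # noqa: F401 -- imported only so torch.load can find it

checkpoint = torch.load('model.pt', map_location=lambda storage, loc: storage)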

Why is ValueError thrown by keras.models.model_from_config() in the Keras-to-TensorFlow exporting example?

The Keras website has this article about exporting Keras models to core TensorFlow. However, the step
new_model = model_from_config(config)
throws an error:
Traceback (most recent call last):
File "/home/hal9000/tf_serving_experiments/sndbx.py", line 38, in <module>
new_model = model_from_config(config)
File "/home/hal9000/keras2env/local/lib/python2.7/site-packages/keras/models.py", line 304, in model_from_config
return layer_module.deserialize(config, custom_objects=custom_objects)
File "/home/hal9000/keras2env/local/lib/python2.7/site-packages/keras/layers/__init__.py", line 54, in deserialize
printable_module_name='layer')
File "/home/hal9000/keras2env/local/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 122, in deserialize_keras_object
raise ValueError('Improper config format: ' + str(config))
ValueError: Improper config format: {'layers': [{'class_name': 'InputLayer', 'config': {...
People have suggested that there's a problem using the model_from_config() method with Keras v1 models since the release of v2. However, I have tried this with a range of models from different versions, including the built-in Keras ResNet50 and a simple single-layer MLP defined in that very script. All throw the same error.
It would appear that the keras.utils.generic_utils.deserialize_keras_object() method wants to find a key "class_name" or "config" in the config dictionary (see source). Upon inspection of the config dict that get_config() creates, there is no such entry; instead there are keys:
"input_layers"
"layers"
"name"
"output_layers"
I also opened an issue https://github.com/fchollet/keras/issues/7232 and created a Gist that you can run for yourself and see the error. https://gist.github.com/9thDimension/e1cdb2cd11f11309bfaf297b276f7456
Keras 2.0.6
Tensorflow 1.1.0
For whatever reason, the dictionary that keras.models.Model.get_config() returns is not compatible with the keras.models.model_from_config() method for rehydrating models.
I replaced these with equivalent calls to model.to_json() and keras.models.model_from_json() and was able to proceed successfully.
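A minimal sketch of that JSON round trip (assuming model is the Keras model from the question):

from keras.models import model_from_json

json_string = model.to_json()             # serialize the architecture to JSON
new_model = model_from_json(json_string)  # rehydrate a fresh, uncompiled model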
The inverse of keras.models.Model.get_config() seems to be keras.models.Model.from_config(), not keras.models.model_from_config() (see https://keras.io/models/about-keras-models/). Try that instead?
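A minimal sketch of that suggestion (again assuming model is the model from the question):

from keras.models import Model

config = model.get_config()
# Model.from_config is the documented inverse of Model.get_config;
# model_from_config expects a {'class_name': ..., 'config': ...} wrapper instead.
new_model = Model.from_config(config)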
