I am trying to use the Hugging Face transformers API to load a locally downloaded M-BERT model, but it is throwing an exception.
I cloned this repo: https://huggingface.co/bert-base-multilingual-cased and I am loading it with:
bert = TFBertModel.from_pretrained("input/bert-base-multilingual-cased")
The directory structure is:
But I am getting this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py", line 1277, in from_pretrained
missing_keys, unexpected_keys = load_tf_weights(model, resolved_archive_file, load_weight_prefix)
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py", line 467, in load_tf_weights
with h5py.File(resolved_archive_file, "r") as f:
File "/usr/local/lib/python3.7/dist-packages/h5py/_hl/files.py", line 408, in __init__
swmr=swmr)
File "/usr/local/lib/python3.7/dist-packages/h5py/_hl/files.py", line 173, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 81, in <module>
__main__()
File "train.py", line 59, in __main__
model = create_model(num_classes)
File "/content/drive/My Drive/msc-project/code/model.py", line 26, in create_model
bert = TFBertModel.from_pretrained("input/bert-base-multilingual-cased")
File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py", line 1280, in from_pretrained
"Unable to load weights from h5 file. "
OSError: Unable to load weights from h5 file. If you tried to load a TF 2.0 model from a PyTorch checkpoint, please set from_pt=True.
Where am I going wrong?
Need help!
Thanks in advance.
As was already pointed out in the comments, your from_pretrained parameter should be either the ID of a model hosted on huggingface.co or a local path:
A path to a directory containing model weights saved using
save_pretrained(), e.g., ./my_model_directory/.
See the documentation.
Looking at your stack trace, it seems your code runs inside /content/drive/My Drive/msc-project/code/model.py, so unless your model is in /content/drive/My Drive/msc-project/code/input/bert-base-multilingual-cased/ it won't load.
I would also set the path to match the documentation example, i.e.:
bert = TFBertModel.from_pretrained("./input/bert-base-multilingual-cased/")
Related
Does torch.save work on Hugging Face models (I am using ViT)? I assumed it does.
My error:
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 379, in save
_save(obj, opened_zipfile, pickle_module, pickle_protocol)
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 499, in _save
zip_file.write_record(name, storage.data_ptr(), num_bytes)
OSError: [Errno 116] Stale file handle
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1815, in <module>
main()
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1748, in main
train(args=args)
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_dist_maml_l2l.py", line 1795, in train
meta_train_iterations_ala_l2l(args, args.agent, args.opt, args.scheduler)
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/training/meta_training.py", line 213, in meta_train_iterations_ala_l2l
log_train_val_stats(args, args.it, step_name, train_loss, train_acc, training=True)
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/logging_uu/wandb_logging/supervised_learning.py", line 55, in log_train_val_stats
_log_train_val_stats(args=args,
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/logging_uu/wandb_logging/supervised_learning.py", line 113, in _log_train_val_stats
save_for_supervised_learning(args, ckpt_filename='ckpt.pt')
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/checkpointing_uu/supervised_learning.py", line 54, in save_for_supervised_learning
torch.save({'training_mode': args.training_mode,
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 380, in save
return
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/serialization.py", line 259, in __exit__
self.file_like.write_end_of_file()
RuntimeError: [enforce fail at inline_container.cc:298] . unexpected pos 2736460544 vs 2736460432
My code:
# - ckpt
args_pickable: Namespace = uutils.make_args_pickable(args)
# note not saving any objects, to make sure checkpoint is loadable later with no problems
torch.save({'training_mode': args.training_mode,
            'it': args.it,
            'epoch_num': args.epoch_num,
            # 'args': args_pickable,  # some versions of this might not have args!
            # decided only to save the dict version to avoid this ckpt not working, make it args when loading
            'args_dict': vars(args_pickable),  # some versions of this might not have args!
            'model_state_dict': get_model_from_ddp(args.model).state_dict(),
            'model_str': str(args.model),  # added later, to make it easier to check what optimizer was used
            'model_hps': args.model_hps,
            'model_option': args.model_option,
            'opt_state_dict': args.opt.state_dict(),
            'opt_str': str(args.opt),
            'opt_hps': args.opt_hps,
            'opt_option': args.opt_option,
            'scheduler_str': str(args.scheduler),
            'scheduler_state_dict': try_to_get_scheduler_state_dict(args.scheduler),
            'scheduler_hps': args.scheduler_hps,
            'scheduler_option': args.scheduler_option,
            },
           pickle_module=pickle,
           f=args.log_root / ckpt_filename)
If this is not the right way to checkpoint Hugging Face (HF) models, what is?
Cross-posted on the HF discussion forum: https://discuss.huggingface.co/t/torch-save-with-hugging-face-models-fails/25034
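For reference, a minimal sketch of the two checkpointing styles commonly used with HF models, assuming a transformers ViT model (the model ID below is only a placeholder); this is a sketch of the saving approaches, not a diagnosis of the stale-file-handle error above:
import torch
from transformers import ViTModel

model = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")

# Option 1: plain torch.save of the state_dict (restore later with load_state_dict).
torch.save(model.state_dict(), "vit_state_dict.pt")

# Option 2: the transformers-native format (restore later with from_pretrained).
model.save_pretrained("vit_ckpt_dir/")
reloaded = ViTModel.from_pretrained("vit_ckpt_dir/")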
I have an h5 file that contains the "catvnoncat" dataset. When I try to run the following code, I get an error, which I include at the bottom. I have tried getting the dataset from three different sources to rule out the possibility of a corrupted file.
What is causing the problem?
# My Code
import h5py
train_dataset = h5py.File('train_catvnoncat.h5', "r")
# The Error
Traceback (most recent call last):
File "f:\Downloads\Compressed\coursera-deep-learning-specialization-master_2\coursera-deep-learning-specialization-master\C1 - Neural Networks and Deep Learning\Week 2\Logistic Regression as a Neural Network\lr_utils.py", line 4, in <module>
train_dataset = h5py.File('train_catvnoncat.h5', "r")
File "C:\Users\Mohsen\AppData\Local\Programs\Python\Python39\lib\site-packages\h5py\_hl\files.py", line 507, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
File "C:\Users\Mohsen\AppData\Local\Programs\Python\Python39\lib\site-packages\h5py\_hl\files.py", line 220, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py\h5f.pyx", line 106, in h5py.h5f.open
FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = 'train_catvnoncat.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Your code is looking in the current working directory, which is not where the file is.
Based on the error message, it looks like you are on Windows. Is the file 'train_catvnoncat.h5' in your Downloads folder? Find that file on your system and copy its full path. You can then update this:
train_dataset = h5py.File('train_catvnoncat.h5', "r")
to something that contains the full path, similar to this:
train_dataset = h5py.File('C:/Users/Mohsen/Downloads/train_catvnoncat.h5', "r")
Pay attention to the slashes and make sure you update the file path to the actual value.
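Alternatively, a small sketch that resolves the file relative to the script itself (assuming train_catvnoncat.h5 sits next to lr_utils.py) avoids depending on the current working directory:
from pathlib import Path
import h5py

# Resolve the file next to this script instead of the current working directory.
data_path = Path(__file__).resolve().parent / "train_catvnoncat.h5"
train_dataset = h5py.File(data_path, "r")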
I have written the code for an entity extraction model using BERT, but when I run the train.py file I get a ValueError.
This is the structure of my code with the configuration file in VS Code; I downloaded the BERT models from here.
Error
>> (myenv) PS D:\Transformers\bert-entity-extraction> python src/train.py
Configuration Complete!
Traceback (most recent call last):
File "src/train.py", line 83, in <module>
model = EntityModel(num_tag = num_tag, num_pos = num_pos)
File "D:\Transformers\bert-entity-extraction\src\model.py", line 25, in __init__
self.bert = transformers.BertModel.from_pretrained(config.BASE_MODEL_PATH)
File "C:\Users\hp\anaconda3\envs\myenv\lib\site-packages\transformers\modeling_utils.py", line 1080, in from_pretrained
**kwargs,
File "C:\Users\hp\anaconda3\envs\myenv\lib\site-packages\transformers\configuration_utils.py", line 427, in from_pretrained
config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "C:\Users\hp\anaconda3\envs\myenv\lib\site-packages\transformers\configuration_utils.py", line 492, in get_config_dict
user_agent=user_agent,
File "C:\Users\hp\anaconda3\envs\myenv\lib\site-packages\transformers\file_utils.py", line 1289, in cached_path
raise ValueError(f"unable to parse {url_or_filename} as a URL or as a local path")
ValueError: unable to parse D:\Transformers\bert-entity-extraction\input\bert-base-uncased_L-12_H-768_A-12\config.json as a URL or as a local path
How to fix this?
I got the following error when trying to load a ResNet50 model. Where should I download the resnet50.h5 file?
Traceback (most recent call last):
File "C:\Users\drlng\Desktop\image-captioning-keras-resnet-main\app.py", line 61, in <module>
resnet = load_model('resnet.h5')
File "C:\Users\drlng\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\keras\saving\save.py", line 211, in load_model
loader_impl.parse_saved_model(filepath)
File "C:\Users\drlng\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\saved_model\loader_impl.py", line 111, in parse_saved_model
raise IOError("SavedModel file does not exist at: %s/{%s|%s}" %
OSError: SavedModel file does not exist at: resnet.h5/{saved_model.pbtxt|saved_model.pb}
I used resnet50.py to build my model and read the ResNet50 weights from the link below:
weights best!
You can download the pre-trained models there. It works well.
If you are looking for pre-trained weights of ResNet-50, you can find them here.
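If the goal is simply ImageNet-pretrained ResNet-50 weights, a minimal sketch using the Keras applications API (an alternative that downloads and caches the weights itself, so no manual .h5 file is needed; not necessarily the same weights file the original script expected) could be:
from tensorflow.keras.applications import ResNet50

# Downloads the ImageNet weights on first use and caches them under ~/.keras/models.
resnet = ResNet50(weights="imagenet")
resnet.summary()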
I am following the page below:
https://github.com/tensorflow/models/tree/master/inception
I got to the point where I have to run:
bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=32 --train_dir=/tmp/imagenet_train --data_dir=/tmp/imagenet_data
However, I got the error below:
Traceback (most recent call last):
File "/home/demo/anaconda3/envs/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/imagenet_train.py", line 41, in <module>
tf.app.run()
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/demo/anaconda3/envs/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/imagenet_train.py", line 35, in main
tf.gfile.DeleteRecursively(FLAGS.train_dir)
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 420, in delete_recursively
pywrap_tensorflow.DeleteRecursively(compat.as_bytes(dirname), status)
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: /tmp/imagenet_train
My DATA_DIR is /tmp/imagenet_data from the previous step: bazel-bin/inception/download_and_preprocess_imagenet "${DATA_DIR}"
But what would my train_dir be? The docs don't mention it, and it looks like an empty folder is incorrect.
For me, it works if I set --train_dir=/tmp. Also, you have the processed dataset in the same directory; --train_dir and --data_dir should not coincide with each other.
Location of where to place the ImageNet data: DATA_DIR=$HOME/imagenet-data
Can you tell me if you are still running into problems after changing the directory?
--train_dir is the path to an empty directory where the model checkpoints and events files are stored as the model is trained.