Key Error When Creating Training Data in TensorFlow Object Detection - python

I keep getting this error with my new training data. The example data works, but my own data raises a KeyError. The only difference between my data and the example data is that mine has more classes.
Full Error:
Traceback (most recent call last):
File "object_detection/", line 185, in <module>
File "C:\Users\edupt\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\", line 125, in run
File "object_detection/", line 180, in main
image_dir, train_examples)
File "object_detection/", line 152, in create_tf_record
tf_example = dict_to_tf_example(data, label_map_dict, image_dir)
File "object_detection/", line 97, in dict_to_tf_example
KeyError: '300424' <---------- THAT IS THE NAME OF ONE OF THE CLASSES
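A KeyError on a class name at this point usually means that a name found in an annotation has no entry in the label map. The traceback does not show the failing line, so the sketch below assumes it looks the class name up in label_map_dict. The helper find_missing_classes and the sample class names are hypothetical; it lists every annotated class that the label map lacks.

```python
def find_missing_classes(label_map_dict, annotation_class_names):
    """Return the annotated class names that have no entry in the label map."""
    missing = {name for name in annotation_class_names if name not in label_map_dict}
    return sorted(missing)


# Hypothetical data: a label map (class name -> integer id) and the class
# names collected from the annotation files.
label_map_dict = {"300421": 1, "300422": 2, "300423": 3}
annotation_class_names = ["300421", "300424", "300423", "300424"]

missing = find_missing_classes(label_map_dict, annotation_class_names)
print(missing)  # ['300424']
```

If the list is not empty, the label map file probably needs an item for each missing name.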


Does checkpointing with fail with hugging face -- if not what is the right way to checkpoint and load a hugging face (HF) model?

Does work on hugging face models (I am using vit)? I assumed yes.
My error:
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/", line 379, in save
_save(obj, opened_zipfile, pickle_module, pickle_protocol)
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/", line 499, in _save
zip_file.write_record(name, storage.data_ptr(), num_bytes)
OSError: [Errno 116] Stale file handle
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/", line 1815, in <module>
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/", line 1748, in main
File "/shared/rsaas/miranda9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/", line 1795, in train
meta_train_iterations_ala_l2l(args, args.agent, args.opt, args.scheduler)
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/training/", line 213, in meta_train_iterations_ala_l2l
log_train_val_stats(args,, step_name, train_loss, train_acc, training=True)
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/logging_uu/wandb_logging/", line 55, in log_train_val_stats
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/logging_uu/wandb_logging/", line 113, in _log_train_val_stats
save_for_supervised_learning(args, ckpt_filename='')
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/checkpointing_uu/", line 54, in save_for_supervised_learning
{'training_mode': args.training_mode,
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/", line 380, in save
File "/home/miranda9/miniconda3/envs/metalearning_gpu/lib/python3.9/site-packages/torch/", line 259, in __exit__
RuntimeError: [enforce fail at] . unexpected pos 2736460544 vs 2736460432
my code:
# - ckpt
args_pickable: Namespace = uutils.make_args_pickable(args)
# note not saving any objects, to make sure checkpoint is loadable later with no problems
{'training_mode': args.training_mode,
'epoch_num': args.epoch_num,
# 'args': args_pickable, # some versions of this might not have args!
# decided only to save the dict version to avoid this ckpt not working, make it args when loading
'args_dict': vars(args_pickable), # some versions of this might not have args!
'model_state_dict': get_model_from_ddp(args.model).state_dict(),
'model_str': str(args.model), # added later, to make it easier to check what optimizer was used
'model_hps': args.model_hps,
'model_option': args.model_option,
'opt_state_dict': args.opt.state_dict(),
'opt_str': str(args.opt),
'opt_hps': args.opt_hps,
'opt_option': args.opt_option,
'scheduler_str': str(args.scheduler),
'scheduler_state_dict': try_to_get_scheduler_state_dict(args.scheduler),
'scheduler_hps': args.scheduler_hps,
'scheduler_option': args.scheduler_option,
f=args.log_root / ckpt_filename)
If this is not the right way to checkpoint Hugging Face (HF) models, what is?
Cross-posted on the HF discussion forum.
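HF models are ordinary torch.nn.Module subclasses, so saving a state dict should work for them, and the library also provides save_pretrained() and from_pretrained(). Errno 116 (stale file handle) is typically raised by a network filesystem rather than by the serializer. One possible workaround, not confirmed for this setup, is to serialize to a local temporary file and then move the finished file to the shared directory. The sketch below uses pickle as a stand-in for the real serializer so that it runs without extra dependencies; save_via_local_tmp and write_with_pickle are hypothetical names.

```python
import os
import pickle
import shutil
import tempfile
from pathlib import Path


def save_via_local_tmp(write_fn, destination):
    """Write a checkpoint to a local temp file, then move it to `destination`.

    `write_fn` is any callable that takes a file path and writes the
    checkpoint there (an assumption of this sketch, not an existing API).
    """
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(suffix=".ckpt.tmp")
    os.close(fd)
    try:
        write_fn(tmp_name)  # all the small writes happen on local disk
        shutil.move(tmp_name, str(destination))  # one transfer to the target
    finally:
        if os.path.exists(tmp_name):  # clean up if writing or moving failed
            os.remove(tmp_name)
    return destination


def write_with_pickle(path):
    # Stand-in for the real serializer; the dict contents are made up.
    with open(path, "wb") as f:
        pickle.dump({"epoch_num": 3, "model_state_dict": {"w": [0.1, 0.2]}}, f)


ckpt_path = save_via_local_tmp(write_with_pickle, Path(tempfile.mkdtemp()) / "ckpt.pt")
print(ckpt_path.exists())  # True
```

In the real code, write_fn would wrap the actual save call with the checkpoint dict shown in the question.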

XGBoostError: Unicode-3 is not supported

I am trying to load an XGBClassifier in my streamlit app from a pickle file.
When I load it and try to predict on the new input values, it throws the error:
XGBoostError: [11:25:40] c:\users\administrator\workspace\xgboost-win64_release_1.6.0\src\data\array_interface.h:462: Unicode-3 is not supported.
The entire traceback is:
2022-07-02 11:25:40.046 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\\Anaconda3\lib\site-packages\streamlit\scriptrunner\", line 554, in _run_script
exec(code, module.__dict__)
File "", line 250, in <module>
File "C:\Users\\Anaconda3\lib\site-packages\xgboost\", line 1434, in predict
class_probs = super().predict(
File "C:\Users\\Anaconda3\lib\site-packages\xgboost\", line 1049, in predict
predts = self.get_booster().inplace_predict(
File "C:\Users\\Anaconda3\lib\site-packages\xgboost\", line 2102, in inplace_predict
File "C:\Users\\Anaconda3\lib\site-packages\xgboost\", line 203, in _check_call
raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [11:25:40] c:\users\administrator\workspace\xgboost-win64_release_1.6.0\src\data\array_interface.h:462: Unicode-3 is not supported.
I load the model this way:
clf = pickle.load(open('xgb.pkl', "rb"))
clf = xgboost.XGBClassifier(tree_method ="hist", enable_categorical=True)
And I predict using:
I had a similar problem that came with the same XGBoostError. In my case the cause was the dtype of the ndarray, which needed to be object.
Assuming that your feat_list is a numpy.ndarray and that you create it like this:
feat_list = np.array(features)
adding dtype=object:
feat_list = np.array(features, dtype=object)
should do the trick.
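To see why the dtype matters, here is a small sketch with made-up feature values. When the list contains strings, np.array() picks a fixed-width unicode dtype such as <U3, which is presumably what "Unicode-3" in the error message refers to (an assumption, not something the thread confirms). With dtype=object the array stores the Python objects themselves.

```python
import numpy as np

# Hypothetical feature values; in the app they would come from user input.
features = ["low", "mid", "top"]

as_default = np.array(features)               # NumPy infers a unicode dtype
as_object = np.array(features, dtype=object)  # elements stay Python objects

print(as_default.dtype.kind)  # U
print(as_object.dtype)        # object
```

Printing feat_list.dtype right before the predict call is a quick way to check which case applies.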

Cannot load BERT from local disk

I am trying to use the Hugging Face Transformers API to load a locally downloaded M-BERT model, but it throws an exception.
I clone this repo:
bert = TFBertModel.from_pretrained("input/bert-base-multilingual-cased")
The directory structure is:
But I am getting this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/transformers/", line 1277, in from_pretrained
missing_keys, unexpected_keys = load_tf_weights(model, resolved_archive_file, load_weight_prefix)
File "/usr/local/lib/python3.7/dist-packages/transformers/", line 467, in load_tf_weights
with h5py.File(resolved_archive_file, "r") as f:
File "/usr/local/lib/python3.7/dist-packages/h5py/_hl/", line 408, in __init__
File "/usr/local/lib/python3.7/dist-packages/h5py/_hl/", line 173, in make_fid
fid =, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 88, in
OSError: Unable to open file (file signature not found)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "", line 81, in <module>
File "", line 59, in __main__
model = create_model(num_classes)
File "/content/drive/My Drive/msc-project/code/", line 26, in create_model
bert = TFBertModel.from_pretrained("input/bert-base-multilingual-cased")
File "/usr/local/lib/python3.7/dist-packages/transformers/", line 1280, in from_pretrained
"Unable to load weights from h5 file. "
OSError: Unable to load weights from h5 file. If you tried to load a TF 2.0 model from a PyTorch checkpoint, please set from_pt=True.
Where am I going wrong?
Need help!
Thanks in advance.
As was already pointed out in the comments, the argument to from_pretrained should be either the id of a hosted model or a local path:
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
See the documentation.
Looking at your stack trace, it seems your code is run inside /content/drive/My Drive/msc-project/code/, so unless your model is in /content/drive/My Drive/msc-project/code/input/bert-base-multilingual-cased/ it won't load.
I would also write the path in the same form as the documentation example, i.e.:
bert = TFBertModel.from_pretrained("./input/bert-base-multilingual-cased/")
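Before calling from_pretrained, it can help to check what the relative path actually resolves to and whether the weights file is a real HDF5 file. The sketch below is a hypothetical helper, check_local_model_dir, using only the standard library. It assumes the TF weights are stored as tf_model.h5 next to config.json. The h5py message "file signature not found" means the opened file is not valid HDF5; an incomplete download or a Git LFS pointer file left by a plain clone are possible causes, though the thread does not confirm either.

```python
from pathlib import Path

# HDF5 files normally start with this 8-byte signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def check_local_model_dir(model_dir, weights_name="tf_model.h5"):
    """Return a list of problems found with a local model directory."""
    model_dir = Path(model_dir).resolve()  # relative paths depend on the cwd
    if not model_dir.is_dir():
        return [f"directory not found: {model_dir}"]
    problems = []
    if not (model_dir / "config.json").is_file():
        problems.append(f"missing config.json in {model_dir}")
    weights = model_dir / weights_name
    if not weights.is_file():
        problems.append(f"missing weights file: {weights}")
    else:
        with open(weights, "rb") as f:
            if f.read(len(HDF5_SIGNATURE)) != HDF5_SIGNATURE:
                problems.append(f"not an HDF5 file: {weights}")
    return problems


for problem in check_local_model_dir("./input/bert-base-multilingual-cased/"):
    print(problem)
```

An empty result only means the basic checks passed; it does not guarantee that the weights load.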

“ValueError: too many values to unpack (expected 3)” when trying to run StyleGAN2 generating command

I got this error when trying to use Derrick Schultz's repository for training StyleGAN2 in Google Colab.
The command was:
!python generate-images --network=/content/drive/My\ Drive/stylegan2-colab-test/stylegan2/results/00002-stylegan2-birdaus-1gpu-config-f/submit_config.pkl --seeds=3875451-3876000 --truncation-psi=0.7
Everything before that was done as in the tutorial, but with my own dataset. Then I got:
Local submit - run_dir: results/00006-generate-images
dnnlib: Running run_generator.generate_images() on localhost...
Loading networks from "/content/drive/My Drive/stylegan2-colab-test/stylegan2/results/00002-stylegan2-birdaus-1gpu-config-f/submit_config.pkl"...
Traceback (most recent call last):
File "", line 490, in <module>
File "", line 485, in main
dnnlib.submit_run(sc, func_name_map[subcmd], **kwargs)
File "/content/drive/My Drive/stylegan2-colab-test/stylegan2/dnnlib/submission/", line 343, in submit_run
return farm.submit(submit_config, host_run_dir)
File "/content/drive/My Drive/stylegan2-colab-test/stylegan2/dnnlib/submission/internal/", line 22, in submit
return run_wrapper(submit_config)
File "/content/drive/My Drive/stylegan2-colab-test/stylegan2/dnnlib/submission/", line 280, in run_wrapper
File "/content/drive/My Drive/stylegan2-colab-test/stylegan2/", line 120, in generate_images
_G, _D, Gs = pretrained_networks.load_networks(network_pkl)
File "/content/drive/My Drive/stylegan2-colab-test/stylegan2/", line 76, in load_networks
G, D, Gs = pickle.load(stream, encoding='latin1')
ValueError: too many values to unpack (expected 3)
This code was supposed to work as is, but it doesn't. Unfortunately I'm not a coder myself. What needs to be changed here?
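The command passes submit_config.pkl as --network. Assuming that file holds the pickled run configuration (a dict-like object) rather than the trained networks, unpacking it into three names fails in exactly this way, because iterating a dict yields its keys. The trained networks are usually stored in a separate network-snapshot-*.pkl file in the same results directory, so pointing --network at one of those may be the fix. The sketch below reproduces the error with a made-up dict; load_three_networks is a hypothetical helper that fails with a clearer message.

```python
import io
import pickle


def load_three_networks(stream):
    """Unpickle `stream` and return a (G, D, Gs) triple, or raise a clear error."""
    obj = pickle.load(stream, encoding="latin1")
    if not isinstance(obj, (tuple, list)) or len(obj) != 3:
        raise ValueError(
            f"expected a pickled (G, D, Gs) triple, got {type(obj).__name__}; "
            "is this a network pickle or a config pickle?"
        )
    return tuple(obj)


# A made-up stand-in for a pickled run configuration with more than 3 keys.
config_like = {"run_dir": "results", "num_gpus": 1, "run_name": "x", "user_name": "y"}

try:
    G, D, Gs = config_like  # unpacking iterates over the 4 dict keys
except ValueError as err:
    print(err)  # too many values to unpack (expected 3)

try:
    load_three_networks(io.BytesIO(pickle.dumps(config_like)))
except ValueError as err:
    print(err)
```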

What train_dir to use for Tensorflow imagenet_train to train from scratch?

I am following the below page
I got to the point where I have to run:
bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=32 --train_dir=/tmp/imagenet_train --data_dir=/tmp/imagenet_data
However, I got the error below:
Traceback (most recent call last):
File "/home/demo/anaconda3/envs/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/", line 41, in <module>
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/demo/anaconda3/envs/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/", line 35, in main
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/lib/io/", line 420, in delete_recursively
pywrap_tensorflow.DeleteRecursively(compat.as_bytes(dirname), status)
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/", line 24, in __exit__
File "/home/demo/anaconda3/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/", line 466, in raise_exception_on_not_ok_status
tensorflow.python.framework.errors_impl.FailedPreconditionError: /tmp/imagenet_train
My DATA_DIR is /tmp/imagenet_data, from the previous step: bazel-bin/inception/download_and_preprocess_imagenet "${DATA_DIR}"
But what should my train_dir be? The doc doesn't mention it. It looks like an empty folder is incorrect.
For me, it works if I set --train_dir=/tmp. Also, you have the processed dataset in the same directory; --train_dir and --data_dir should not coincide.
Location of where to place the ImageNet data: DATA_DIR=$HOME/imagenet-data
Can you tell me if you are still running into problems after changing the directory?
--train_dir is the path to an empty directory where the model checkpoints and events files are stored as the model is trained.
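Following that description, a small pre-flight check can catch a bad directory choice before training starts. The traceback above shows the training script calling delete_recursively on the train directory, so it should be a dedicated directory with nothing worth keeping in it. The helper below, check_train_dir, is a hypothetical sketch using only the standard library; it is not part of the Inception code.

```python
from pathlib import Path


def check_train_dir(train_dir, data_dir):
    """Validate that `train_dir` is a dedicated, empty directory and create it."""
    train_dir = Path(train_dir).resolve()
    data_dir = Path(data_dir).resolve()
    if train_dir == data_dir:
        raise ValueError("--train_dir and --data_dir must be different directories")
    if train_dir.exists():
        if not train_dir.is_dir():
            raise ValueError(f"{train_dir} exists but is not a directory")
        if any(train_dir.iterdir()):
            raise ValueError(f"{train_dir} is not empty")
    train_dir.mkdir(parents=True, exist_ok=True)
    return train_dir
```

For example, check_train_dir("/tmp/imagenet_train", "/tmp/imagenet_data") would pass only if /tmp/imagenet_train is missing or empty.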
