I'm building a Named Entity Recognition (NER) model using the Hugging Face implementation of emilyalsentzer/Bio_ClinicalBERT. Up to today, I've had no issues with the model. I'm hopeful that someone can help me understand why it's currently not working as expected.
Question 1 - today, trying to train using:
MODEL_NAME = 'emilyalsentzer/Bio_ClinicalBERT'
model = text.sequence_tagger('bilstm-bert', preproc, bert_model=MODEL_NAME)
results in this error:
404 Client Error: Not Found for url: https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT/resolve/main/tf_model.h5
Does Hugging Face offer any kind of health check to ascertain the status of their models?
Question 2 - working with files (model.h5, model.json, and preproc.sav) I'd saved from earlier training iterations, I'm getting the same 404 error shown above. I don't understand where in these files the call to Hugging Face is occurring. It doesn't seem to be in the .json, and the .h5 and .sav file formats are hard to inspect. Read more about what these files are:
https://medium.com/analytics-vidhya/how-to-deploy-your-neural-network-model-using-ktrain-ae255b134c77
Back in February, I used these exact model.h5, model.json, and preproc.sav files to run the NER app with Streamlit without any problem. I'm not sure whether this is a temporary issue with Bio_ClinicalBERT or whether I need to retool my original approach because of potentially permanent problems with this transformer model.
Judging from the URL you provided for Question 1, I understand that you are trying to load pre-trained weights for a TensorFlow model (tf_model.h5).
However, the pre-trained model you are trying to load is only available in a PyTorch version. Follow the link to the model page and look at the tags at the top: this model only has a PyTorch tag, not a TensorFlow tag.
It's my understanding that in this case you should load pre-trained weights from the PyTorch checkpoint. To that end, you would add this parameter to the function that does the loading:
from_pt=True
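As a rough illustration of the point above, here is a minimal sketch of how one might check which weight files a model repository exposes before deciding whether from_pt=True is needed. The URL pattern is copied from the 404 message in the question. The helper names (weights_url, file_exists, needs_from_pt) are hypothetical, and the file name pytorch_model.bin is an assumption about how the PyTorch checkpoint is stored; some repositories may use a different file name.

```python
import urllib.error
import urllib.request

HUB_URL = "https://huggingface.co"


def weights_url(model_name, filename, revision="main"):
    """Build a URL following the pattern seen in the 404 message."""
    return f"{HUB_URL}/{model_name}/resolve/{revision}/{filename}"


def file_exists(url, opener=urllib.request.urlopen):
    """Return True on HTTP 200, False if the server answers 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other HTTP error is a real problem, so surface it


def needs_from_pt(model_name, exists=file_exists):
    """Decide whether from_pt=True is needed (hypothetical helper)."""
    if exists(weights_url(model_name, "tf_model.h5")):
        return False  # native TensorFlow weights are available
    # Assumption: the PyTorch weights are stored as pytorch_model.bin.
    if exists(weights_url(model_name, "pytorch_model.bin")):
        return True
    raise FileNotFoundError(f"no known weight file found for {model_name}")
```

Calling needs_from_pt('emilyalsentzer/Bio_ClinicalBERT') needs network access. The exists parameter is there so the decision logic can be exercised offline with a stand-in function.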
As mentioned in this post in the ktrain repo, you should be able to safely ignore the error.
The 404 error simply means that transformers could not find a TensorFlow version of this particular model (which is what ktrain requests). If you go to the Hugging Face model hub, you'll see that the model you're using was only uploaded as a PyTorch model.
In cases like this, ktrain automatically downloads the PyTorch version of the model checkpoint and then loads it as a TensorFlow model for training/fine-tuning. If you call model.summary(), it should show that the model was loaded successfully.
I think the reason you didn't see the error before is that only newer versions of Hugging Face transformers display the 404 message.
Related
I have trained a door-detection model in YOLOv3, but now I need it in TensorFlow Lite. The problem is that to train my model for TensorFlow, I need annotation files in ".csv" or ".xml" format, but the ones I have are "*.txt". I did find software for creating annotation files manually by drawing rectangles on pictures, but I cannot do that for thousands of images because of time constraints.
Can anyone guide me how to handle such situation?
I followed the link below, but the resulting model did not work.
https://medium.com/analytics-vidhya/yolov3-to-tensorflow-lite-conversion-4602cec5c239
I think it would be better to train a TensorFlow implementation on your data; converting the TensorFlow model to TFLite should then be easy.
Here is YOLOv3 in TF: https://github.com/YunYang1994/tensorflow-yolov3
Then use the official TensorFlow tools to convert it to TFLite: https://www.tensorflow.org/lite/convert
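If the existing "*.txt" annotations are in the usual YOLO layout (one line per box: class_id x_center y_center width height, with the box values normalised to 0..1), they can probably be converted by script rather than redrawn by hand. The following is a minimal sketch under that assumption. The function names are hypothetical, the CSV column layout is only one common convention and may need adjusting for the training code you pick, and the image size is assumed to be known (for example, read with an imaging library).

```python
import csv
import io


def yolo_line_to_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, xmin, ymin, xmax, ymax) in pixels."""
    class_id, xc, yc, w, h = line.split()
    # Scale the normalised centre and size back to pixel units.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Convert centre/size to corner coordinates, clamped to the image.
    xmin = max(0, round(xc - w / 2))
    ymin = max(0, round(yc - h / 2))
    xmax = min(img_w, round(xc + w / 2))
    ymax = min(img_h, round(yc + h / 2))
    return int(class_id), xmin, ymin, xmax, ymax


def yolo_to_csv(label_text, filename, img_w, img_h, class_names):
    """Return CSV text with a header and one row per box in a YOLO label file."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["filename", "width", "height", "class",
                     "xmin", "ymin", "xmax", "ymax"])
    for line in label_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        class_id, xmin, ymin, xmax, ymax = yolo_line_to_box(line, img_w, img_h)
        writer.writerow([filename, img_w, img_h, class_names[class_id],
                         xmin, ymin, xmax, ymax])
    return out.getvalue()
```

For thousands of images, the same conversion could be run in a loop over the label files, writing the header once and appending the rows for each image.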
I used a blank spaCy model with custom Gensim word vectors. Then I trained the model to get a pipeline with components in the following order:
entityruler1, ner1, entityruler2, ner2
After training it, I saved it in a folder through
nlp.to_disk('path to folder')
However, if I try to load the same model using
nlp1 = spacy.load('path to folder')
it gives me this error:
ValueError: [E109] Model for component 'ner' not initialized. Did you forget to load a model, or forget to call begin_training()?
I cannot find any solution online. What might be the reason I am getting this? How do I successfully load and use my pretrained model?
Upgrading to spaCy version 2.3.7 resolved this error. :)
I trained a model in Google Cloud ML and saved it in the SavedModel format. I've attached the directory for the saved model below.
https://drive.google.com/drive/folders/18ivhz3dqdkvSQY-dZ32TRWGGW5JIjJJ1?usp=sharing
I am trying to load the model into R using the following code, but it returns <tensorflow.python.training.tracking.tracking.AutoTrackable> with an object size of 552 bytes, which is definitely not correct. If anyone can load the model properly, I would love to know how you did it. I assume it can also be loaded into Python, so that would work too. The model was trained on a GPU; I'm not sure which TensorFlow version was used. Thank you very much!
library(keras)
list.files("/path/to/inceptdual400OG")
og400<-load_model_tf("/path/to/inceptdual400OG")
Since the shared model is no longer available (the link says it is in the trash folder) and the question does not specify it, I can't tell which framework you used to save the model in the first place. I suggest trying either the Keras load function or the TensorFlow load function, depending on which type of saved model file you have.
Bear in mind that you may need to set the argument "compile = FALSE" if the model is already compiled.
If you trained your model with tf>=2.0, remember to load the latest library versions, because of dependency incompatibilities between TensorFlow and Keras. The output of rsconnect::appDependencies() would also be worth checking.
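To decide which load function applies, it can help to check what kind of artifact is actually on disk. Below is a minimal sketch using only the Python standard library. The helper name detect_model_format is hypothetical, and the check assumes the HDF5 signature sits at the very start of the file, which is the typical case.

```python
import os

# Signature bytes at the start of a typical HDF5 file.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def detect_model_format(path):
    """Guess how a saved model was written (hypothetical helper).

    Returns "savedmodel", "hdf5", or "unknown".
    """
    if os.path.isdir(path):
        # A TensorFlow SavedModel is a directory holding saved_model.pb
        # (or saved_model.pbtxt), usually next to a variables/ folder.
        for name in ("saved_model.pb", "saved_model.pbtxt"):
            if os.path.isfile(os.path.join(path, name)):
                return "savedmodel"
        return "unknown"
    if os.path.isfile(path):
        with open(path, "rb") as handle:
            if handle.read(len(HDF5_MAGIC)) == HDF5_MAGIC:
                return "hdf5"
    return "unknown"
```

A "savedmodel" result points to the TensorFlow loader (load_model_tf in R, tf.saved_model.load in Python), while "hdf5" points to the Keras HDF5 loader (load_model_hdf5 in R).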
I am trying to load a pre-trained ResNet model from the MadryLab CIFAR-10 challenge into CleverHans to compute transfer attacks.
However, restoring the saved models into the model_zoo.madry_lab_challenges.cifar10_model.ResNet object does not work. It appears to restore fine initially, but when I try to actually use the model, I get an error such as:
Attempting to use uninitialized value
ResNet/unit_3_1/residual_only_activation/BatchNorm/moving_mean
The easiest way to reproduce this error is to actually just run the provided attack_model.py example included in CleverHans here:
https://github.com/tensorflow/cleverhans/blob/master/examples/madry_lab_challenges/cifar10/attack_model.py
It encounters the same error after loading the model when it tries to use it, on both adv_trained and naturally_trained models.
Is there a workaround to this problem?
It seems the other option is to use the cleverhans.model.CallableModelWrapper instead, but I haven't been able to find an example of how to use that.
I'm currently developing a prediction model using TensorFlow. The model works well for a customer, so I'm trying to turn it into a real product.
My model needs to be retrained on the customer's input as time passes, and it should be deployed on the customer's infrastructure (not SaaS or cloud). Moreover, I'd like to protect my code and models.
From my understanding of TensorFlow, a trained model can be exported as a protobuf and frozen, keeping only the nodes required for prediction. I tried freeze_graph.py from the TensorFlow repo and successfully ran my prediction model using Golang and the libtensorflow.so runtime. (Alternatively, I could use TensorFlow Serving and C++.)
If I can train my model on our company's infra, I could say "Okay, let's get some beers". However, my model has to be trained on the customer's infra, and without python code, it seems like I cannot train my model.
https://www.tensorflow.org/versions/r0.12/how_tos/language_bindings/index.html
At this time, support for gradients, functions and control flow operations ("if" and "while") is not available in languages other than Python. This will be updated when the C API provides necessary support.
Is there any workaround for deploying a TF app without exposing the Python code or the model? Thanks in advance.
You can still use Python with a pre-trained model, without exposing all the code you needed to build it in the first place. As an example of this, have a look at the Inception retraining code, which loads a pretrained GraphDef and then retrains a new top layer:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py