Issue using the transformers package inside a Docker image - Python

I am using a transformers pipeline to perform sentiment analysis on sample texts in 6 different languages. I tested the code in my local JupyterHub and it worked fine. But when I wrap it in a Flask application and build a Docker image out of it, execution hangs at the pipeline inference line and never returns the sentiment scores.
macOS Catalina 10.15.7 (no GPU)
Python version: 3.8
Transformers package: 4.4.2
torch version: 1.6.0
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
classifier = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
results = classifier(["We are very happy to show you the Transformers library.", "We hope you don't hate it."])
print([i['score'] for i in results])
The above code works fine in a Jupyter notebook and gives me the expected result:
[0.7495927810668945,0.2365245819091797]
But when I create a Docker image with the Flask wrapper, it gets stuck at the results = classifier([input_data]) line and the execution runs forever.
My folder structure is as follows:
- src
  |-- app
  |   |-- main.py
  |-- Dockerfile
  |-- requirements.txt
I used the Dockerfile below to create the image:
FROM tiangolo/uwsgi-nginx-flask:python3.8
COPY ./requirements.txt /requirements.txt
COPY ./app /app
WORKDIR /app
RUN pip install -r /requirements.txt
RUN echo "uwsgi_read_timeout 1200s;" > /etc/nginx/conf.d/custom_timeout.conf
And my requirements.txt file is as follows:
pandas==1.1.5
transformers==4.4.2
torch==1.6.0
My main.py script looks like this:
from flask import Flask, json, request, jsonify
import traceback
import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

app = Flask(__name__)
app.config["JSON_SORT_KEYS"] = False

model_name = 'nlptown/bert-base-multilingual-uncased-sentiment'
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

@app.route("/")
def hello():
    return "Model: Sentiment pipeline test"

@app.route("/predict", methods=['POST'])
def predict():
    json_request = request.get_json(silent=True)
    input_list = [i['text'] for i in json_request["input_data"]]
    results = nlp(input_list)  ########## Getting stuck here
    for result in results:
        print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
    score_list = [round(i['score'], 4) for i in results]
    return jsonify(score_list)

if __name__ == "__main__":
    app.run(host='0.0.0.0', debug=False, port=80)
My input payload is of the form
{"input_data" : [{"text" : "We are very happy to show you the Transformers library."},
{"text" : "We hope you don't hate it."}]}
I tried looking through the transformers GitHub issues but couldn't find a matching one. The execution works fine when using the Flask development server, but it runs forever once I wrap it up and build the Docker image. I am not sure if I am missing an additional dependency that should be included when creating the Docker image.
Thanks.

I was having a similar issue. It seems that starting the app somehow pollutes the memory of the transformers models. It is probably something to do with how Flask does threading, but I have no idea why. What fixed it for me was doing the troublesome work (loading the models) in a different thread.
import threading

def preload_models():
    "LOAD MODELS"
    return 0

def start_app():
    app = create_app()
    register_handlers(app)
    # load the models in a separate thread before the app starts serving
    preloading = threading.Thread(target=preload_models)
    preloading.start()
    preloading.join()
    return app
First reply here. I would be really glad if this helps.
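For reference, here is a minimal sketch of how the same idea could be applied to the main.py from the question; the model name and route come from the question, while the preloading wiring is a hypothetical adaptation:

# Hypothetical adaptation of the threaded preloading idea to the question's main.py.
import threading

from flask import Flask, jsonify, request
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

app = Flask(__name__)
nlp = None  # filled in by preload_models() before the first request arrives

def preload_models():
    global nlp
    model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

# Load the models in a separate thread at import time, before serving requests.
preloading = threading.Thread(target=preload_models)
preloading.start()
preloading.join()

@app.route("/predict", methods=["POST"])
def predict():
    json_request = request.get_json(silent=True)
    input_list = [i["text"] for i in json_request["input_data"]]
    results = nlp(input_list)
    return jsonify([round(r["score"], 4) for r in results])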

Flask uses port 5000 by default. When creating a Docker image, it's important to make sure the port is set up accordingly. Replace the last line with the following:
app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 5000)))
Also be sure to import os at the top.
Lastly, in the Dockerfile, add:
EXPOSE 5000
CMD ["python", "./main.py"]

Related

Problem with init() function for model deployment in Azure

I want to deploy a model in Azure but I'm struggling with the following problem.
I have my model registered in Azure. The file with the .sav extension is located locally. The registration looks like the following:
import urllib.request
from azureml.core.model import Model
# Register model
model = Model.register(ws, model_name="my_model_name.sav", model_path="model/")
I have my score.py file. The init() function in the file looks like this:
import json
import numpy as np
import pandas as pd
import os
import pickle
from azureml.core.model import Model
def init():
    global model
    model_path = Model.get_model_path(model_name='my_model_name.sav', _workspace='workspace_name')
    model = pickle(open(model_path, 'rb'))
But when I try to deploy, I see the following error:
"code": "AciDeploymentFailed",
"statusCode": 400,
"message": "Aci Deployment failed with exception: Your container application crashed. This may be caused by errors in your scoring file's init() function.
1. Please check the logs for your container instance: leak-tester-pm. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
And when I run print(service.logs()) I have the following output (I have only one model registered in Azure):
None
Am I doing something wrong when loading the model in the score.py file?
P.S. The .yml file for the deployment:
name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
  - python=3.6.2
  - pip:
    - scikit-learn==0.24.2
    - azureml-defaults
    - numpy
    - pickle-mixin
    - pandas
    - xgboost
    - azure-ml-api-sdk
channels:
  - anaconda
  - conda-forge
The local inference server allows you to quickly debug your entry script (score.py). If the underlying scoring script has a bug, the server will fail to initialize or serve the model; instead, it will throw an exception and point to the location where the issue occurred.
There are two possible reasons for the error or exception:
An HTTP server issue that needs troubleshooting.
The Docker deployment itself.
You need to debug the procedure you followed. In some cases, HTTP server issues cause problems during initialization (init()).
Check the Azure Machine Learning inference HTTP server for better debugging from the server's perspective.
The Dockerfile mentioned looks good, but it's better to debug once using the steps mentioned in https://learn.microsoft.com/en-us/azure/machine-learning/how-to-troubleshoot-deployment-local#dockerlog
Try the code below inside the init() function:
def init():
    global model
    # path to the pickled model file inside the deployed container
    model_path = "modelfoldername/model.pkl"
    # load the model from disk
    model = pickle.load(open(model_path, 'rb'))
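For completeness, a minimal sketch of a full entry script built around that idea is shown below. The model file name comes from the question; AZUREML_MODEL_DIR is the environment variable Azure ML sets inside the scoring container, and the JSON input shape in run() is a hypothetical example:

# score.py - a sketch of a complete entry script; the input format in run() is hypothetical.
import json
import os
import pickle

import numpy as np

def init():
    global model
    # Azure ML mounts the registered model under AZUREML_MODEL_DIR inside the container.
    model_dir = os.getenv("AZUREML_MODEL_DIR", ".")
    model_path = os.path.join(model_dir, "my_model_name.sav")
    with open(model_path, "rb") as f:
        model = pickle.load(f)  # note: pickle.load, not pickle(...)

def run(raw_data):
    try:
        data = np.array(json.loads(raw_data)["data"])
        result = model.predict(data)
        return result.tolist()
    except Exception as e:
        return {"error": str(e)}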

How can I run python code after a DBT run (or a specific model) is completed?

I would like to be able to run an ad-hoc Python script that accesses and runs analytics on the model calculated by a dbt run. Are there any best practices around this?
We recently built a tool that caters very much to this scenario. It leverages the ease of referencing dbt tables from Python-land. It's called fal.
The idea is that you would define the python scripts you would like to run after your dbt models are run:
# schema.yml
models:
  - name: iris
    meta:
      owner: "@matteo"
      fal:
        scripts:
          - "notify.py"
And then the file notify.py is called if the iris model was run in the last dbt run:
# notify.py
import os
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

CHANNEL_ID = os.getenv("SLACK_BOT_CHANNEL")
SLACK_TOKEN = os.getenv("SLACK_BOT_TOKEN")

client = WebClient(token=SLACK_TOKEN)
message_text = f"""Model: {context.current_model.name}
Status: {context.current_model.status}
Owner: {context.current_model.meta['owner']}"""

try:
    response = client.chat_postMessage(
        channel=CHANNEL_ID,
        text=message_text
    )
except SlackApiError as e:
    assert e.response["error"]
Each script is run with a reference to the current model it is running for, available in a context variable.
To start using fal, just pip install fal and start writing your python scripts.
For production, I'd recommend an orchestration layer such as Apache Airflow.
See this blog post to get started, but essentially you'll have an orchestration DAG (note - not a dbt DAG) that does something like:
dbt run <with args> -> your python code
Fair warning, though, this can add a bit of complexity to your project.
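For illustration, a minimal Airflow DAG along the lines described above might look like the sketch below; the dbt project path, schedule, and analytics callable are hypothetical placeholders:

# A sketch of the "dbt run <with args> -> your python code" orchestration DAG described above.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator

def run_analytics():
    # Ad-hoc analysis over the tables that dbt just built (placeholder).
    ...

with DAG(
    dag_id="dbt_then_python",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="cd /path/to/dbt/project && dbt run",
    )
    analytics = PythonOperator(
        task_id="run_analytics",
        python_callable=run_analytics,
    )
    dbt_run >> analytics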
I suppose you could get a similar effect with a CI/CD tool like GitHub Actions or CircleCI.

Running out of memory when deploying an extremely simple Flask app in Heroku

I want to deploy a simple machine learning model (resnet34) made with Fast AI to Heroku.
My whole flask app is a single file:
from flask import Flask
from fastai.vision.all import *
import requests

app = Flask(__name__)
learn = load_learner("./export.pkl")

@app.route("/<path:image_url>")
def hello_world(image_url):
    print(image_url)
    response = requests.get(image_url)
    img = PILImage.create(response.content)
    predictions = learn.predict(img)
    print(predictions)
    return predictions[0]
It works fine a couple of times, but Heroku then starts logging out-of-memory errors.
I don't understand why this is happening... my intuition tells me that the garbage collector should be fine here.
Here is my requirements.txt:
-f https://download.pytorch.org/whl/torch_stable.html
torch==1.8.1+cpu
torchvision==0.9.1+cpu
fastai>=2.3.1
Flask==2.0.1
gunicorn==20.1.0
Pillow
requests==2.26.0
EDIT: The answer I posted myself is not completely right. The root cause was that I wasn't closing the images:
correct code:
@app.route("/<path:image_url>")
def hello_world(image_url):
    print(image_url)
    response = requests.get(image_url)
    img = PILImage.create(response.content)
    predictions = learn.predict(img)
    img.close()
    return predictions[0]
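A slight variation on the corrected route is to close the image even when prediction raises; this is just a sketch reusing the app, learn, and PILImage objects from the code above:

@app.route("/<path:image_url>")
def hello_world(image_url):
    response = requests.get(image_url)
    img = PILImage.create(response.content)
    try:
        predictions = learn.predict(img)
    finally:
        img.close()  # release the image even if predict() fails
    return predictions[0]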
I believe the issue was that the pycache was getting bigger and bigger.
Be sure to run your app with the following env var set:
PYTHONDONTWRITEBYTECODE=1
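If you would rather not rely on the environment variable, Python exposes the same switch at runtime; note it only affects modules imported after it is set (a minimal sketch):

import sys

# Equivalent to PYTHONDONTWRITEBYTECODE=1 for any imports that happen after this line.
sys.dont_write_bytecode = True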

Getting error messages when importing the tensorflow package even after installing it

Good day everyone. I got a module from the internet about NMT. The module imports tensorflow, but unfortunately, even after installing tensorflow on my system using pip, I still get an error. Here is the error:
from tensorflow.keras.models import load_model
ModuleNotFoundError: No module named 'tensorflow'
The module hello_app.py is below:
from flask import Flask
from flask import request
from flask import jsonify
import uuid
import os
from tensorflow.keras.models import load_model
import numpy as np

app = Flask(__name__)

EXPECTED = {
    "cylinders": {"min": 3, "max": 8},
    "displacement": {"min": 68.0, "max": 455.0},
    "horsepower": {"min": 46.0, "max": 230.0},
    "weight": {"min": 1613, "max": 5140},
    "acceleration": {"min": 8.0, "max": 24.8},
    "year": {"min": 70, "max": 82},
    "origin": {"min": 1, "max": 3}
}

# Load neural network when Flask boots up
model = load_model(os.path.join("../dnn/", "mpg_model.h5"))

@app.route('/api/mpg', methods=['POST'])
def calc_mpg():
    content = request.json
    errors = []
    for name in content:
        if name in EXPECTED:
            expected_min = EXPECTED[name]['min']
            expected_max = EXPECTED[name]['max']
            value = content[name]
            if value < expected_min or value > expected_max:
                errors.append(f"Out of bounds: {name}, has value of: {value}, but should be between {expected_min} and {expected_max}.")
        else:
            errors.append(f"Unexpected field: {name}.")
    # Check for missing input fields
    for name in EXPECTED:
        if name not in content:
            errors.append(f"Missing value: {name}.")
    if len(errors) < 1:
        x = np.zeros((1, 7))
        # Predict
        x[0, 0] = content['cylinders']
        x[0, 1] = content['displacement']
        x[0, 2] = content['horsepower']
        x[0, 3] = content['weight']
        x[0, 4] = content['acceleration']
        x[0, 5] = content['year']
        x[0, 6] = content['origin']
        pred = model.predict(x)
        mpg = float(pred[0])
        response = {"id": str(uuid.uuid4()), "mpg": mpg, "errors": errors}
    else:
        response = {"id": str(uuid.uuid4()), "errors": errors}
    print(content['displacement'])
    return jsonify(response)

if __name__ == '__main__':
    app.run(host='0.0.0.0', debug=True)
I would really appreciate your answers. Thank you.
This is the github repo where I got the code
https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_13_01_flask.ipynb
To avoid package or version conflicts, you can use a virtual environment:
pip install virtualenv
virtualenv -p /usr/bin/python3 tf
source tf/bin/activate
(tf)$ pip install tensorflow
If you have Anaconda or conda:
# Set up an Anaconda environment
conda create --name tf python=3
# Activate the new environment
source activate tf
(tf)$ pip install tensorflow
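Once the environment is active, here is a quick sanity check that the interpreter you are running actually sees TensorFlow (a minimal sketch):

# Confirm which interpreter is running and that tensorflow resolves inside it.
import sys

import tensorflow as tf

print(sys.executable)   # should point inside the tf environment
print(tf.__version__)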

How to log Hydra's multi-run in MLflow

I am trying to manage the results of machine learning experiments with MLflow and Hydra, so I tried running things with Hydra's multi-run feature.
I used the following code as a test.
import mlflow
import hydra
from hydra import utils
from pathlib import Path
import time

@hydra.main('config.yaml')
def main(cfg):
    print(cfg)
    mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')
    mlflow.set_experiment(cfg.experiment_name)
    mlflow.log_param('param1', 5)
    # mlflow.log_param('param1',5)
    # mlflow.log_param('param1',5)
    with mlflow.start_run():
        mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')

if __name__ == '__main__':
    main()
This code does not work; I got the following error:
Exception: Run with UUID [RUNID] is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True
So I modified the code as follows:
import mlflow
import hydra
from hydra import utils
from pathlib import Path
import time

@hydra.main('config.yaml')
def main(cfg):
    print(cfg)
    mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')
    mlflow.set_experiment(cfg.experiment_name)
    mlflow.log_param('param1', 5)
    # mlflow.log_param('param1',5)
    # mlflow.log_param('param1',5)
    with mlflow.start_run(nested=True):
        mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')

if __name__ == '__main__':
    main()
This code works, but the artifact is not saved.
I made the following corrections to save the artifacts:
import mlflow
import hydra
from hydra import utils
from pathlib import Path
import time

@hydra.main('config.yaml')
def main(cfg):
    print(cfg)
    mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')
    mlflow.set_experiment(cfg.experiment_name)
    mlflow.log_param('param1', 5)
    # mlflow.log_param('param1',5)
    # mlflow.log_param('param1',5)
    mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')

if __name__ == '__main__':
    main()
As a result, the artifacts are now saved.
However, when I run the following command
python test.py model=A,B hidden=12,212,31 -m
only the artifact from the last set of run conditions is saved.
How can I modify this so that MLflow manages the experiment parameters while taking advantage of Hydra's multi-run feature?
MLFlow is not officially supported by Hydra. At some point there will be a plugin that will make this smoother.
Looking at the errors you are reporting (and without running your code):
One thing you can try is to use the Joblib launcher plugin to get job isolation through processes (this requires Hydra 1.0.0rc1 or newer).
What you are observing is due to the interaction between MLFlow and Hydra. As far as MLflow can tell, all of your Hydra multiruns are the same MLflow run!
Since both frameworks use the term "run", I will need to be verbose in the following text. Please bear with me.
If you don't explicitly start an MLflow run, MLflow will do it for you when you call mlflow.log_params or mlflow.log_artifacts. Within a Hydra multirun context, it appears that instead of creating a new MLflow run for each Hydra run, the previous MLflow run is inherited after the first Hydra run. This is why you get an error where MLflow thinks you are trying to update parameter values during logging: mlflow.exceptions.MlflowException: Changing param values is not allowed.
You can fix this by wrapping your MLFlow logging code within a with mlflow.start_run() context manager:
import mlflow
import hydra
from hydra import utils
from pathlib import Path

@hydra.main(config_path="", config_name='config.yaml')
def main(cfg):
    print(cfg)
    mlflow.set_tracking_uri('file://' + utils.get_original_cwd() + '/mlruns')
    mlflow.set_experiment(cfg.experiment_name)
    with mlflow.start_run() as run:
        mlflow.log_params(cfg)
        mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')
        print(run.info.run_id)  # just to show each run is different

if __name__ == '__main__':
    main()
The context manager will start and end MLflow runs properly, preventing the issue from occurring.
Alternatively, you can also start and end an MLFlow run manually:
activerun = mlflow.start_run()
mlflow.log_params(cfg)
mlflow.log_artifact(Path.cwd() / '.hydra/config.yaml')
print(activerun.info.run_id) # just to show each run is different
mlflow.end_run()
This is related to the way you defined your MLflow run. You call log_param and then start_run, so you have two concurrent MLflow runs, which explains the error. You could try getting rid of the following line in your first code sample and see what happens:
mlflow.log_param('param1',5)
