import matplotlib failed while deploying my model in AWS sagemaker


I have deployed my model on AWS successfully, but while testing I get a runtime error on the line "import matplotlib.pyplot as plt". I think it is due to the PyTorch framework version I used (framework_version=1.2.0), but I face the same issue when I use higher versions as well.
PyTorchModel(model_data=model_artifact,
             role=role,
             framework_version='1.2.0',  # must be a string; a bare 1.2.0 is a syntax error
             entry_point='predict.py',
             predictor_cls=ImagePredictor)
I have another issue when I use version 1.0.0: I am not able to import libraries from subdirectories, and the deployment itself fails.
For example, I have some code files in a "Code" directory.
from Code.CTModel import NetWork ---> **this line fails with "No module named Code" when I use version 1.0.0**
Ultimately, I want to know how to import libraries that live in subdirectories.

It sounds like you want to inject some additional code libraries into the SageMaker PyTorch serving container. You might have to dig into the source code for how the PyTorch serving container is built to further customize it: https://github.com/aws/sagemaker-pytorch-inference-toolkit, or build your own image.
Digging into that source code a bit, I see that the container has enabled the importing of arbitrary code, but only when "multi-model mode" is enabled. Can you verify that the code exists under a directory "code" in your model directory and that "multi-model mode" is enabled?
def initialize(self, context):
    # Adding the 'code' directory path to sys.path to allow importing user modules when multi-model mode is enabled.
    if (not self._initialized) and ENABLE_MULTI_MODEL:
        code_dir = os.path.join(context.system_properties.get("model_dir"), 'code')
        sys.path.append(code_dir)
        self._initialized = True
Reference: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/c4e7abc49aeebc2f9b6035337548a90e4330113d/src/sagemaker_pytorch_serving_container/handler_service.py#L47
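The effect of that sys.path.append call can be reproduced outside SageMaker. The following is only a sketch, not the container's behavior: it builds a throwaway model directory with a hypothetical code/ct_model_demo.py (the module name and the import_from_code_dir helper are made up for illustration), shows the import failing, and then shows it succeeding once the code directory is on sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_code_dir(model_dir, module_name):
    """Append <model_dir>/code to sys.path, then import module_name from it."""
    code_dir = str(Path(model_dir) / "code")
    if code_dir not in sys.path:
        sys.path.append(code_dir)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Throwaway model directory with a hypothetical user module inside code/.
model_dir = Path(tempfile.mkdtemp())
(model_dir / "code").mkdir()
(model_dir / "code" / "ct_model_demo.py").write_text(
    "class NetWork:\n    layers = 3\n"
)

# Before the path is added, Python cannot see the module.
try:
    importlib.import_module("ct_model_demo")
    found_before = True
except ModuleNotFoundError:
    found_before = False

# After the path is added, the same import succeeds.
module = import_from_code_dir(model_dir, "ct_model_demo")
print("importable before:", found_before)
print("importable after:", hasattr(module, "NetWork"))
```

The same idea could be applied at the top of predict.py by appending the directory that contains the "Code" folder, assuming that folder is actually shipped with the model.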
If this all seems complicated to you (it is), you might want to look into some standardized formats for serializing your PyTorch model such as https://onnx.ai/. I'd love to learn more about what you're trying to do here sometime if you reach out to me at contact@modelzoo.dev. I'm beta-testing a platform that enables deployment in a single line of code and would love to test it out here.

Let me put my question at a slightly higher level. I have predict.py, a Jupyter notebook, a Code directory, an Evaluation directory and other .py files in source_dir:
--Code
    --ResNet.py
    --Densenet.py
    --DataLoader.py
--Evaluation
    --Evaluation.py
--predict.py
--CT_Code.ipynb
When I execute the predict file from a Jupyter notebook on my local system, all the modules are imported properly and everything works fine. But when I deploy the same thing from a SageMaker notebook, I face the issues mentioned in my question: I am not able to import libraries from the Code directory, nor some basic modules like imageio, PIL and Matplotlib.
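One way to narrow this down is to log, from inside the deployed predict.py, which modules the container cannot find. This is a minimal sketch using only the standard library; the missing_modules helper is hypothetical. It accepts top-level module names only. Whether a requirements.txt placed in source_dir gets installed depends on the container version, so that should be checked against the SageMaker documentation for the framework_version in use.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules the question reports as failing to import in the container.
required = ["imageio", "PIL", "matplotlib"]
print("missing in this environment:", missing_modules(required))
```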

Related

Converting a Python project to DLL or decreasing the size and imports

I have a Python project for OCR MRZ detection with two modules: one for ID cards, which uses EasyOCR and PyTorch, and one for passport documents, which uses Pytesseract and TensorFlow.
I need to prepare this project for deployment. I have tried several methods, but none of them was practical for the deployment process.
I tried PyInstaller with a couple of configurations. With the --onefile option the setup is great, but the exe takes too long to unpack when executed.
I then tried the --onedir option. The delay was gone, but the installation package was too complicated and too large (1.8 GB).
I tried to "compile" the Python code with Cython, but even with a helloworld.py sample app I couldn't make it work. I got a couple of errors during gcc compilation; the last one was due to the msvcp package, which I had installed but still got the error.
Finally, I used Nuitka to get a DLL-like file to import in C# and use like a package. I successfully created a test .pyd file from helloworld.py, but I couldn't import it in C# as I had planned.
What I need is to package this project as a simpler, smaller application that is ready for deployment and whose source code is hard to reverse-engineer. For passport OCR I can switch development to C#, but for ID cards I couldn't find any alternative OCR library that extracts the MRZ information, so I need to use at least the ID OCR module from my Python project.
Any help would be appreciated,
Thanks

MWAA - Airflow - PythonVirtualenvOperator requires virtualenv

I am using AWS's MWAA service (2.2.2) to run a variety of DAGs, most of which are implemented with standard PythonOperator types. I bundle the DAGs into an S3 bucket alongside any shared requirements, then point MWAA to the relevant objects & versions. Everything runs smoothly so far.
I would now like to implement a DAG using the PythonVirtualenvOperator type, which AWS acknowledge is not supported out of the box. I am following their guide on how to patch the behaviour using a custom plugin, but continue to receive an error from Airflow, shown at the top of the dashboard in big red writing:
DAG Import Errors (1)
... ...
AirflowException: PythonVirtualenvOperator requires virtualenv, please install it.
I've confirmed that the plugin is indeed being picked up by Airflow (I see it referenced in the admin screen), and for the avoidance of doubt I am using the exact code provided by AWS in their examples for the DAG. AWS's documentation on this is pretty light and I've yet to stumble across any community discussion for the same.
From AWS's docs, we'd expect the plugin to run at startup prior to any DAGs being processed. The plugin itself appears to effectively rewrite the venv command to use the pip-installed version, rather than that which is installed on the machine, however I've struggled to verify that things are happening in the order I expect. Any pointers on debugging the instance's behavior would be very much appreciated.
Has anyone faced a similar issue? Is there a gap in the MWAA documentation that needs addressing? Am I missing something incredibly obvious?
Possibly related, but I do see this warning in the scheduler's logs, which may indicate why MWAA is struggling to resolve the dependency?
WARNING: The script virtualenv is installed in '/usr/local/airflow/.local/bin' which is not on PATH.

Airflow uses shutil.which to look for virtualenv. The virtualenv installed via requirements.txt isn't on the PATH, so adding its directory to PATH solves this.
The documentation at https://docs.aws.amazon.com/mwaa/latest/userguide/samples-virtualenv.html is wrong on this point. The plugin below adds the missing PATH entry:
import os
from airflow.plugins_manager import AirflowPlugin
import airflow.utils.python_virtualenv
from typing import List

def _generate_virtualenv_cmd(tmp_dir: str, python_bin: str, system_site_packages: bool) -> List[str]:
    cmd = ['python3', '/usr/local/airflow/.local/lib/python3.7/site-packages/virtualenv', tmp_dir]
    if system_site_packages:
        cmd.append('--system-site-packages')
    if python_bin is not None:
        cmd.append(f'--python={python_bin}')
    return cmd

airflow.utils.python_virtualenv._generate_virtualenv_cmd = _generate_virtualenv_cmd

# This is the added path code
os.environ["PATH"] = f"/usr/local/airflow/.local/bin:{os.environ['PATH']}"

class VirtualPythonPlugin(AirflowPlugin):
    name = 'virtual_python_plugin'
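Why the PATH line matters can be checked without Airflow. The sketch below assumes a POSIX system and uses a temporary directory as a stand-in for /usr/local/airflow/.local/bin, with a fake executable named virtualenv_demo (a made-up name, chosen so that a real virtualenv on the machine does not interfere). shutil.which only finds the executable once its directory is on PATH.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path

# Stand-in for /usr/local/airflow/.local/bin containing a fake executable.
bin_dir = Path(tempfile.mkdtemp())
fake_tool = bin_dir / "virtualenv_demo"
fake_tool.write_text("#!/bin/sh\nexit 0\n")
fake_tool.chmod(fake_tool.stat().st_mode | stat.S_IXUSR)

original_path = os.environ.get("PATH", "")

# Not on PATH yet: shutil.which cannot resolve the command.
before = shutil.which("virtualenv_demo")

# Prepend the directory, as the plugin does for the real virtualenv.
os.environ["PATH"] = f"{bin_dir}{os.pathsep}{original_path}"
after = shutil.which("virtualenv_demo")

# Restore the environment so the demo has no lasting side effects.
os.environ["PATH"] = original_path

print("before:", before)
print("after:", after)
```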

Python Serverless (SLS): Runtime.ImportModuleError: Unable to import module

I am working on a project that is using AWS CodeBuild to deploy a Serverless (SLS) function that is written in Python.
The deployment works fine within code build. It successfully creates the function and I can view the lambda within the Lambda AWS UI. Whenever the function is triggered, I get the error seen below:
Runtime.ImportModuleError: Unable to import module 'some/function': attempted relative import with no known parent package
It is extremely frustrating as I know the function exists at that directory listed above. During the CodeBuild script, I can ls into the directory and confirm that it indeed exists. The function is defined in my serverless.yml file as follows:
functions:
  file-blaster:
    runtime: python3.7
    handler: some/function.function_name
    events:
      - existingS3:
          bucket: some_bucket
          events:
            - s3:ObjectCreated:*
          rules:
            - prefix: ${opt:stage}/some/prefix
Sadly, I haven't been able to crack this one. Has anyone had a similar experience while working with SLS and python in the cloud?
It seems odd that SLS would build and deploy successfully, but the Lambda itself can't find the function.

This will be a short answer for what is a somewhat longer discussion on Python imports. You can do the research yourself on the hectic and confusing battle between relative and absolute imports as a design for a python project.
The Gist:
It is necessary to understand that the base for Python imports in SLS functions is the directory where the serverless.yml file lives (I imagine it is similar to having a main.py that calls the other files referenced as "functions" in the SLS YAML). In my case, I had not structured my imports as absolute imports when I hit the issue. I switched all of my imports to absolute paths, so that the package would continue to work when I moved it around.
The error I was given, Runtime.ImportModuleError: Unable to import module 'some/function': attempted relative import with no known parent package, described the actual issue poorly. It should have said that the packages used by some/function were not found when attempting a relative import, because that was the actual problem that needed fixing.
Hopefully this helps someone else out someday. Let me know if I can provide more information where I haven't already.
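The difference can be reproduced locally. This is a sketch under assumptions, not a model of how Lambda loads handlers: it creates a hypothetical package some_pkg in a temporary directory, loads one handler file as a top-level module (so its relative import has no parent package), and then loads an equivalent handler that uses an absolute import rooted at the project directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: <root>/some_pkg/{__init__,helper,rel_handler,abs_handler}.py
root = Path(tempfile.mkdtemp())
pkg = root / "some_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "helper.py").write_text("VALUE = 42\n")
(pkg / "rel_handler.py").write_text("from .helper import VALUE\n")
(pkg / "abs_handler.py").write_text("from some_pkg.helper import VALUE\n")
importlib.invalidate_caches()

# Case 1: the handler file is imported as a top-level module, so the
# relative import inside it has no parent package to resolve against.
sys.path.insert(0, str(pkg))
try:
    importlib.import_module("rel_handler")
    relative_error = None
except ImportError as exc:
    relative_error = str(exc)
finally:
    sys.path.remove(str(pkg))

# Case 2: the project root is the import base and the handler uses an
# absolute import, which resolves regardless of how the file is reached.
sys.path.insert(0, str(root))
try:
    abs_module = importlib.import_module("some_pkg.abs_handler")
finally:
    sys.path.remove(str(root))

print("relative import:", relative_error)
print("absolute import value:", abs_module.VALUE)
```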

I think you need to change your handler property from:
handler: some/function.function_name
to
handler: some/function.{lambda handler name}
For example, if the folder structure is:
- some
    - function1.py
then the template will be:
functions:
  file-blaster:
    runtime: python3.7
    handler: some/function1.lambda_handler
For more details, see https://serverless.com/framework/docs/providers/aws/guide/functions/
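A quick local check can catch a handler string that does not match the code before deploying. The check_handler helper below is hypothetical and only a sketch: it assumes the handler has the form path/to/module.function_name relative to the directory containing serverless.yml, and it inspects the file with ast rather than importing it.

```python
import ast
import tempfile
from pathlib import Path


def check_handler(project_root, handler):
    """Report whether the file and function named by a handler string exist."""
    module_path, _, function_name = handler.rpartition(".")
    source_file = Path(project_root) / f"{module_path}.py"
    if not source_file.is_file():
        return f"missing file: {module_path}.py"
    tree = ast.parse(source_file.read_text())
    defined = {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    if function_name not in defined:
        return f"{function_name!r} is not defined in {module_path}.py"
    return "ok"


# Throwaway project matching the folder structure in this answer.
project = Path(tempfile.mkdtemp())
(project / "some").mkdir()
(project / "some" / "function1.py").write_text(
    "def lambda_handler(event, context):\n    return 'done'\n"
)

print(check_handler(project, "some/function1.lambda_handler"))
print(check_handler(project, "some/function1.function_name"))
```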

In Shiny, Python Virtual environment PERMISSION DENIED (Error 126)

We are building a user interface app in R Shiny that predicts a continuous variable with a machine learning model.
Since we built the machine learning model with Python 3 and sklearn, we want to write Python code in R Shiny that calls the model and its corresponding functions.
We used the R package "reticulate" to create a Python virtual environment that holds the Python packages and through which we can call Python 3 functions.
We selected the virtual environment with the following line of code (a function from the R package "reticulate"):
use_virtualenv("env", required = TRUE)
We do have the directory "env/bin", which contains the python and python3 executables.
The Shiny app worked perfectly locally. However, when we tried to publish it, it gave the following error (please see picture), even though the app had been deployed successfully and shinyapps.io reported it as running.
The issue was "Error 126": permission was denied when our app tried to access the virtual environment. We found no similar case on Stack Overflow, and we have spent a long time debugging without resolving the issue.
If anyone knows how to solve this problem, could you please post your solution below? (We hope the solution will not change our basic layout, i.e. calling a Python-built model from Shiny and publishing through Shiny.) We really appreciate your efforts to help us out!
Thank you so much!

Could you share the code where the actual call to the Python script is made? Is it a Python module function that you are calling from R Shiny? What does the Python module or function do and return? I have used reticulate inside Shiny to call Python scripts and it works fine. I didn't need to set the environment; I just provide the path to the Python script and call it like any other R function.

If you're trying to deploy to shinyapps.io, you may need to set the RETICULATE_PYTHON env variable so that reticulate uses the correct version of Python when running your app:
VIRTUALENV_NAME = 'env'
Sys.setenv(RETICULATE_PYTHON = paste0('/home/shiny/.virtualenvs/',
                                      VIRTUALENV_NAME,
                                      '/bin/python'))
Full example here demonstrates one method for configuring a Shiny + reticulate app so that it can easily run both locally and on shinyapps.io.

Got errors, while running exe file built with pyinstaller and Google Cloud API integration in python

I am working on a single-file Python project.
I integrated the Google Cloud API for real-time speech streaming and recognition.
It works well with the command python aaa.py.
Now I need a Windows build (.exe), so I used PyInstaller and got the aaa.exe file successfully.
But I got this error while running speech streaming with the Google Cloud API:
[Errno 2] No such file or directory:
'D:\AI\ai\dist\AAA\google\cloud\gapic\speech\v1\speech_client_config.json'
So I copied the speech_client_config.json file to the required path, and after that I got the error below:
Exception in 'grpc._cython.cygrpc.ssl_roots_override_callback' ignored
E0511 01:13:14.320000000 3108 src/core/lib/security/security_connector/security_connector.cc:1170] assertion failed: pem_root_certs != nullptr
I cannot find a way to get a working build with the Google Cloud API.
I am using Python version 2.7.14.
Any help would be appreciated.
Thanks.

I had the same problem. If you are willing to distribute roots.pem with your executable (just search for the file - it should be buried deep within the installation directory of grpcio), I had luck fixing this by setting GRPC_DEFAULT_SSL_ROOTS_FILE_PATH environment variable to the full path of this roots.pem file.
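As a sketch of that approach: the environment variable can be set at the very top of the program, before any gRPC channel is created. The point_grpc_at_roots helper is hypothetical, and it assumes roots.pem has been shipped next to the bundled files; sys._MEIPASS is the attribute PyInstaller sets to the bundle directory in a frozen app, with the current working directory used here as a fallback for unfrozen runs.

```python
import os
import sys
import tempfile
from pathlib import Path


def point_grpc_at_roots(base_dir=None, filename="roots.pem"):
    """Set GRPC_DEFAULT_SSL_ROOTS_FILE_PATH to <base_dir>/<filename> if the file exists.

    Returns the path that was set, or None when the file is not there.
    """
    if base_dir is None:
        # Bundle directory when frozen by PyInstaller, else the working directory.
        base_dir = getattr(sys, "_MEIPASS", os.getcwd())
    roots = Path(base_dir) / filename
    if not roots.is_file():
        return None
    os.environ["GRPC_DEFAULT_SSL_ROOTS_FILE_PATH"] = str(roots)
    return str(roots)


# Demo with a throwaway directory standing in for the distributed app folder.
app_dir = tempfile.mkdtemp()
(Path(app_dir) / "roots.pem").write_text("placeholder, not a real certificate\n")
configured = point_grpc_at_roots(app_dir)
print("GRPC_DEFAULT_SSL_ROOTS_FILE_PATH =", configured)
```

In a real program the call would come before importing or using the Google Cloud client, so that gRPC sees the variable when it loads its root certificates.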

Update 2021
To anyone who is experiencing this issue: I got it working thanks to these amazing people. See the full conversation on this GitHub issue.
Here is the link
Step 1
Credits to @cbenhagen & @rising-stark on this GitHub link.
A PyInstaller hook does the trick. Create a Python file named hook-grpc.py with this code:
from PyInstaller.utils.hooks import collect_data_files
datas = collect_data_files('grpc')
Step 2
Put the hook-grpc.py file in the \site-packages\PyInstaller\hooks directory of the Python environment you are running. You can typically find it at
C:\Users\yourusername\AppData\Local\Programs\Python\Python37\Lib\site-packages\PyInstaller\hooks
Note:
Change yourusername and Python37 to your own username and the Python version you are using.
For Anaconda users the location might be different. Check this site to find the path of the Anaconda Python environment you are using.
Step 3
Once you've done that, you can convert your .py program to .exe using PyInstaller, and it should work.

This looks to me like an SSL credentials mistake. I think you are not being allowed to connect to Google Cloud. Check this code snippet and this documentation.
