How to load a private Python package when loading an MLflow model?

I am trying to use a private Python package inside a model built on mlflow.pyfunc.PythonModel.
My conda.yaml looks like this:
channels:
- defaults
dependencies:
- python=3.10.4
- pip
- pip:
  - mlflow==2.1.1
  - pandas
  - --extra-index-url <private-pypa-repo-link>
  - <private-package>
name: model_env
python_env.yaml
python: 3.10.4
build_dependencies:
- pip==23.0
- setuptools==58.1.0
- wheel==0.38.4
dependencies:
- -r requirements.txt
requirements.txt
mlflow==2.1.1
pandas
--extra-index-url <private-pypa-repo-link>
<private-package>
When running the following
import mlflow
model_uri = '<run_id>'
# Load model as a PyFuncModel.
loaded_model = mlflow.pyfunc.load_model(model_uri)
# Predict on a Pandas DataFrame.
import pandas as pd
t = loaded_model.predict(pd.read_json("test.json"))
print(t)
The result is
WARNING mlflow.pyfunc: Encountered an unexpected error (InvalidRequirement('Parse error at "\'--extra-\'": Expected W:(0-9A-Za-z)')) while detecting model dependency mismatches. Set logging level to DEBUG to see the full traceback.
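For illustration, the warning likely stems from mlflow parsing each requirements line as a PEP 508 requirement, which rejects pip option lines such as --extra-index-url. A minimal sketch reproducing the message with the packaging library (assuming a pyparsing-based packaging version, i.e. older than 22):
from packaging.requirements import Requirement, InvalidRequirement

try:
    # Option lines are valid in requirements.txt but are not requirements
    Requirement("--extra-index-url https://example.com/simple")
except InvalidRequirement as exc:
    print(exc)  # Parse error at "'--extra-'": Expected W:(0-9A-Za-z)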
Adding the following before loading the model makes it work:
dep = mlflow.pyfunc.get_model_dependencies(model_uri)
print(dep)
import subprocess
import sys
subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", dep])
Is there a way to automatically install these dependencies rather than doing it explicitly? What are my options for getting MLflow to install the private package?

Answering my own question here. It turns out the issue is that I was trying to use the keyring library, which needs to be pre-installed and is not supported when doing inference in a virtual environment.
There are ways to get around it though.
Option 1: Add the authentication token to the extra-index-url itself, as shown below. You can find it documented in this Stack Overflow question.
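For illustration, a hypothetical requirements.txt fragment with the token embedded in the index URL (the username, token, and host are placeholders; pip also expands ${ENV_VAR} references in requirements files if you prefer not to hard-code the token):
--extra-index-url https://<username>:<token>@<private-pypa-repo-link>/simple
<private-package>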
Option 2: MLflow allows you to log any dependencies with the model itself using the code_path argument (link). Using this method, you can skip adding your private package as a requirement. This question also touches on the same topic. The code would look a bit like this:
mlflow.pyfunc.save_model(
    path=dest_path,
    python_model=MyModel(),
    artifacts=_get_artifact_dict(t_dir),
    conda_env=conda_env,
    # Adding the current script file as a code dependency
    code_path=[os.path.realpath(__file__)],  # add any other scripts here
)
Opt for the first approach if saving the authentication token in requirements.txt is feasible; otherwise use the second approach. The downside of the code_path solution is that your package's code gets replicated with each model.

Related

Can't import deepspeech on kivy for android

I am using Kivy to create an Android app. I need to install the deepspeech framework; however, for deepspeech to be installed it is necessary to create a recipe.
I created a recipe and built the APK. There were no errors in the build, it created the APK, and, as far as I could see in the folders, deepspeech was built. However, after I install the app on the phone and try to run it, it crashes and says there is no module named deepspeech.
Does anyone know what I am doing wrong? I've been stuck on this for a while now and can't seem to find the end of this :/.
from pythonforandroid.recipe import PythonRecipe
from pythonforandroid.toolchain import current_directory, shprint
import sh

class deepspeechRecipe(PythonRecipe):
    version = 'v0.9.2'
    url = 'https://github.com/mozilla/DeepSpeech/archive/{version}.tar.gz'
    depends = ['numpy', 'setuptools']
    call_hostpython_via_targetpython = False
    site_packages_name = 'deepspeech'

    def build_arch(self, arch):
        env = self.get_recipe_env(arch)
        with current_directory(self.get_build_dir(arch.arch)):
            # Build python bindings
            hostpython = sh.Command(self.hostpython_location)
            shprint(hostpython, 'setup.py', 'build_ext', _env=env)
        # Install python bindings
        super().build_arch(arch)

    def get_recipe_env(self, arch):
        env = super().get_recipe_env(arch)
        numpy_recipe = self.get_recipe('numpy', self.ctx)
        env['CFLAGS'] += ' -I' + numpy_recipe.get_build_dir(arch.arch)
        # env['LDFLAGS'] += ' -L' + sqlite_recipe.get_lib_dir(arch)
        env['LIBS'] = env.get('LIBS', '') + ' -lnumpy'
        return env

recipe = deepspeechRecipe()
Buildozer: 1.4.0
requirements = python3==3.7.14, hostpython3==3.7.14, kivy, kivymd, sqlite3, numpy==1.14.5, deepspeech, apsw
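For completeness, a custom recipe is typically wired into the build via buildozer.spec; a minimal sketch, assuming the recipe file lives in a local ./recipes folder (the path is a placeholder):
p4a.local_recipes = ./recipes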
If you need any extra information, I can add it.
I have already tried using tensorflow to run the model; however, the model gives an array as the output and I don't know the right procedure to transform that into text form.
I have already tried other recipes (like opencv) and all work fine.
Edit:
I found out that when I use the recipe it does run and build properly, but only the deepspeech_training part, because the setup.py only installs that. To install other parts, like the model class, it is necessary to use another setup.py located in "native_client/python", but that requires the rest of the folders, so I still need to figure that out.
Edit 2: I was able to build the packages that I wanted (the inference part of deepspeech); however, when I run it, it gives the following error:
python : ImportError: dlopen failed: library "libc++_shared.so" not found: needed by /data/user/0/org.test.myapp/files/app/_python_bundle/site-packages/deepspeech/_impl.so in namespace classloader-namespace
python : Python for android ended.
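In case it helps, one possible direction for the libc++_shared.so error is asking python-for-android to bundle the shared C++ STL; a minimal sketch, assuming a p4a version whose Recipe class exposes the need_stl_shared flag:
class deepspeechRecipe(PythonRecipe):
    # ... attributes as in the recipe above ...
    # Link against and package libc++_shared.so into the APK
    need_stl_shared = True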
Add pillow to your requirements and check if it works!
requirements = python3==3.7.14, hostpython3==3.7.14, kivy, kivymd, sqlite3, numpy==1.14.5, deepspeech, apsw, pillow

Custom Docker file for Azure ML Environment that contains COPY statements errors with COPY failed: /path no such file or directory

I'm trying to submit an experiment to Azure ML using a Python script.
The Environment being initialised uses a custom Dockerfile.
env = Environment(name="test")
env.docker.base_image = None
env.docker.base_dockerfile = './Docker/Dockerfile'
env.docker.enabled = True
However, the Dockerfile needs a few COPY statements, but those fail as follows:
Step 9/23 : COPY requirements-azure.txt /tmp/requirements-azure.txt
COPY failed: stat /var/lib/docker/tmp/docker-builder701026190/requirements-azure.txt: no such file or directory
The Azure host environment responsible for building the image does not contain the files the Dockerfile requires; those exist on my local development machine, from where I initiate the Python script.
I've been searching the whole day for a way to add these files to the environment, but without success.
Below an excerpt from the Dockerfile and the python script that submits the experiment.
FROM mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04 as base
COPY ./Docker/requirements-azure.txt /tmp/requirements-azure.txt # <- breaks here
[...]
Here is how I'm submitting the experiment:
from azureml.core import Workspace, Experiment
from azureml.core.environment import Environment
from azureml.core.model import Model
from azureml.core.compute import ComputeTarget
from azureml.train.estimator import Estimator
import os
ws = Workspace.from_config(path='/mnt/azure/config/workspace-config.json')
env = Environment(name="test")
env.docker.base_image = None
env.docker.base_dockerfile = './Docker/Dockerfile'
env.docker.enabled = True
compute_target = ComputeTarget(workspace=ws, name='GRComputeInstance')
estimator = Estimator(
    source_directory='/workspace/',
    compute_target=compute_target,
    entry_script="./src/ml/train/main.py",
    environment_definition=env
)
experiment = Experiment(workspace=ws, name="estimator-test")
run = experiment.submit(estimator)
run.wait_for_completion(show_output=True, wait_post_processing=True)
Any idea?
I think the correct way to set up the requirements for your project is described under "Define an inference configuration":
name: project_environment
dependencies:
- python=3.6.2
- scikit-learn=0.20.0
- pip:
  # You must list azureml-defaults as a pip dependency
  - azureml-defaults>=1.0.45
  - inference-schema[numpy-support]
See this
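A minimal sketch of wiring such a YAML spec into an Environment, assuming it is saved as environment.yml (a placeholder filename) and that your azureml-core version provides Environment.from_conda_specification:
from azureml.core.environment import Environment

# Build the environment from the conda specification file above
env = Environment.from_conda_specification(name="project_environment",
                                           file_path="environment.yml")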
I think you need to look for "using your own base image", e.g. in the Azure docs here. For building the actual Docker image you have two options:
Build on Azure build servers. Here you need to upload all required files together with your Dockerfile to the build environment. (Alternatively, you could consider making the requirements-azure.txt file available via HTTP, so that the build environment can fetch it from anywhere.)
Build locally with your own Docker installation and upload the final image to the correct Azure registry (see the sketch below).
This is just the broad outline; at the moment I can't give more detailed recommendations. Hope it helps anyway.
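For the second option, a minimal sketch, assuming a hypothetical registry address and image name (myregistry.azurecr.io and myimage:latest are placeholders):
# Built and pushed beforehand, e.g.:
#   docker build -t myregistry.azurecr.io/myimage:latest -f Docker/Dockerfile .
#   docker push myregistry.azurecr.io/myimage:latest
env = Environment(name="test")
env.docker.base_image = "myimage:latest"
env.docker.base_image_registry.address = "myregistry.azurecr.io"
# The image already contains all dependencies, so skip conda/pip management
env.python.user_managed_dependencies = True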

Dependency missing when running AzureML Estimator in docker environment

Scenario description
I'm trying to submit a training script to AzureML (I want to use AmlCompute, but I'm starting/testing locally first, for debugging purposes).
The train.py script I have uses a custom package (arcus.ml), and I believe I have specified the right settings and dependencies, but I still get the error:
User program failed with ModuleNotFoundError: No module named 'arcus.ml'
Code and reproduction
This the python code I have:
name = 'test'
script_params = {
    '--test-par': 0.2
}
est = Estimator(source_directory='./' + name,
                script_params=script_params,
                compute_target='local',
                entry_script='train.py',
                pip_requirements_file='requirements.txt',
                conda_packages=['scikit-learn', 'tensorflow', 'keras'])
run = exp.submit(est)
print(run.get_portal_url())
This is the (fully simplified) train.py script in the test directory:
from arcus.ml import dataframes as adf
from azureml.core import Workspace, Dataset, Datastore, Experiment, Run
# get hold of the current run
run = Run.get_context()
ws = run.get_environment()
print('training finished')
And this is my requirements.txt file
arcus-azureml
arcus-ml
numpy
pandas
azureml-core
tqdm
joblib
scikit-learn
matplotlib
tensorflow
keras
Logs
In the log file of the run, I can see this section, so it seems the external module is being installed anyway:
Collecting arcus-azureml
Downloading arcus_azureml-1.0.3-py3-none-any.whl (3.1 kB)
Collecting arcus-ml
Downloading arcus_ml-1.0.6-py3-none-any.whl (2.1 kB)
It could be that there's an issue with the arcus-ml 1.0.6 wheel; as Anders pointed out, it doesn't seem to have any code. Could you try with the earlier version arcus-ml==1.0.5?
I think this error isn't necessarily about Azure ML. I think the error has to do with the difference between using a hyphen and a period in your package name. But I'm a Python packaging newb.
In a new conda environment on my laptop, I ran the following
> conda create -n arcus python=3.6 -y
> conda activate arcus
> pip install arcus-ml
> python
>>> from arcus.ml import dataframes as adf
ModuleNotFoundError: No module named 'arcus'
When I looked in the env's site-packages folder, I didn't see the arcus/ml folder structure I was expecting. There's no arcus code there at all, only the .dist-info directory:
~/opt/anaconda3/envs/arcus/lib/python3.6/site-packages
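One way to verify what a wheel actually ships, independent of Azure ML (the download directory is a placeholder):
> pip download arcus-ml==1.0.6 --no-deps -d /tmp/wheels
> python -m zipfile -l /tmp/wheels/arcus_ml-1.0.6-py3-none-any.whl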

How to make custom Op in TensorFlow importable in Python?

I have implemented a kernel for my custom Op, and put it into /tensorflow/core/user_ops as custom_op.cc. Inside the Op I do all the registering stuff, like REGISTER_OP and REGISTER_KERNEL_BUILDER.
Then I implemented the gradient for this Op in Python, and I put it in the same folder as custom_op_grad.py. I did all the registering here as well (@ops.RegisterGradient).
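For context, a minimal sketch of what the gradient registration in custom_op_grad.py typically looks like (the registered op name "CustomOp" and the returned gradient are assumptions for illustration):
from tensorflow.python.framework import ops

@ops.RegisterGradient("CustomOp")
def _custom_op_grad(op, grad):
    # Return one gradient tensor per input of the op;
    # passing grad through unchanged is just a placeholder.
    return [grad]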
I have created the BUILD file, with the following content:
load("//tensorflow:tensorflow.bzl", "tf_custom_op_library")
tf_custom_op_library(
name = "custom_op.so",
srcs = ["custom_op.cc"],
)
py_library(
name = "custom_op_grad",
srcs = ["custom_op_grad.py"],
srcs_version = "PY2",
deps = [
":custom_op_grad",
"//tensorflow:tensorflow_py",
],
)
After that, I rebuild Tensorflow:
pip uninstall tensorflow
bazel clean
bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
cp -r bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/__main__/* bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-any.whl
After all this, when I try to use my Op by calling tf.user_ops.custom_op, it tells me that the module doesn't have it.
Maybe there are some additional steps I have to do? Or am I doing something wrong with the BUILD file?
OK, I found the solution. I just removed the BUILD file, and my custom Op was successfully built and importable in Python using tensorflow.user_ops.custom_op().
To use the gradient I had to put its code directly inside tensorflow/python/user_ops/user_ops.py. Not the most elegant solution, but it works for now.
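As a side note, a common alternative that avoids rebuilding TensorFlow entirely is loading the compiled library at runtime with tf.load_op_library; a sketch, assuming the .so path in the bazel output tree (the path is a placeholder and the snake_case op name follows from the question):
import tensorflow as tf

# Load the shared object produced by tf_custom_op_library
custom_op_module = tf.load_op_library('bazel-bin/tensorflow/core/user_ops/custom_op.so')
# The registered op is then exposed as a snake_case Python function, e.g.:
# result = custom_op_module.custom_op(inputs)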

/usr/bin/python3.3 not found from brp-scl-python-bytecompile during mock build

I'm trying to build python33-python-virtualenv under CentOS 6. Currently I'm just trying to rebuild the version present in: https://www.softwarecollections.org/repos/rhscl/python33/epel-6-x86_64/python33-python-virtualenv-1.10.1-1.el6.src.rpm
I'm getting an error: /usr/lib/rpm/brp-scl-python-bytecompile: line 47: /usr/bin/python3.3: No such file or directory
Any idea what I might be doing wrong?
NB: I'm doing this in a mock environment, with scl defined to python33.
You need to have the 'python33-build' package installed in mock every time you build a sub-package of the python33 collection. You need to modify the mock config as follows:
replace: config_opts['chroot_setup_cmd'] = 'install @buildsys-build'
with: config_opts['chroot_setup_cmd'] = 'install @build scl-utils-build python33-build'
Generally, there needs to be a '-build' package installed every time you build a sub-package of a collection. The '-build' package is built from the meta package source. In this specific case it would come from the python33 source:
https://copr.fedoraproject.org/coprs/rhscl/python33-el7/build/27227/
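For reference, a typical rebuild invocation once the config is updated (the config name is assumed to match the epel-6-x86_64 target from the question):
mock -r epel-6-x86_64 --rebuild python33-python-virtualenv-1.10.1-1.el6.src.rpm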
