Use Pandas in Azure Functions - python

I'm trying to create a basic Python function and use it in an Azure Function App (Consumption plan). I used the HTTP trigger template via VS Code and was able to get it deployed to Azure. However, when I try to use pandas in the logic, I get an error that I am not able to rectify. I'm a rookie in Python; can you suggest how to fix it?
Tools used: VS Code, Azure Functions Tools
Python version installed locally: 3.8.5
Azure Function App Python version: 3.8

It seems the pandas module hasn't been installed in your function app on Azure. You need to add pandas to your local requirements.txt and then deploy the function from local to Azure; the deployment will install the modules listed in requirements.txt.
You can run this command in the "Terminal" window to add the pandas line to your requirements.txt automatically:
pip freeze > requirements.txt
After running the command above, your requirements.txt should look something like this:
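For example, if pandas is the only extra dependency, it might contain roughly the following (your exact versions will differ, and pip freeze will also list pandas' own dependencies):
azure-functions
numpy==1.19.5
pandas==1.1.5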

Related

ModuleNotFoundError: No module named 'xgboost.sklearn' only on cloud. It works locally

I have a problem with the xgboost.sklearn module.
I have a project developed with Visual Studio Code (+ Azure extensions), written in Python. I need to import xgboost.sklearn, so I added this to requirements.txt:
azure-functions
azure-cosmos
pybind11
scipy==1.5.4
pyyaml==6.0
numpy==1.19.5
pandas==1.1.5
scikit-learn==0.24.2
xgboost==0.80
I run it locally (F5, start debugging) and everything works just fine. So I deployed the functions and called my endpoint. I got a 500:
Result: Failure Exception: ModuleNotFoundError: No module named 'xgboost.sklearn' Stack:.......
(line with import xgboost.sklearn)
I tried pip freeze > requirements.txt - it did not help.
I have "azureFunctions.scmDoBuildDuringDeployment": true in settings.json.
My resources on Azure contain xgboost and sklearn (.python_packages/lib/site-packages).
How can I fix it?
After reproducing from my end, I received a similar issue when I deployed with a manually written requirements.txt. It worked fine after following the steps below.
Step 1: Install scikit-learn and xgboost locally, then generate the requirements.txt file using pip freeze > requirements.txt.
Step 2: Deploy the function to the Function App.
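For reference, the terminal commands for these two steps might look like this (the function app name below is a placeholder):
pip install scikit-learn xgboost
pip freeze > requirements.txt
func azure functionapp publish <your-function-app-name>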
I then tested my Function App and it executed successfully.
If it still fails, try different versions of scikit-learn by trial and error and see which one works.

Install Python modules in Azure Functions

I am learning how to use Azure Functions and am using my web scraping script in one.
It uses the BeautifulSoup (bs4) and pymysql modules.
It worked fine when I tried it locally in a virtual environment, following this MS guide:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-create-first-azure-function-azure-cli?pivots=programming-language-python&tabs=cmd%2Cbrowser#run-the-function-locally
But when I create the function App and publish the script to it, Azure Functions logs give me this error:
Failure Exception: ModuleNotFoundError: No module named 'pymysql'.
It must happen when attempting to import it.
I really don't know how to proceed. Where should I specify which modules it needs to install?
You need to check whether you have generated a requirements.txt that lists all of the modules your function needs. When you deploy the function to Azure, it installs the modules listed in requirements.txt automatically.
You can generate the module list in requirements.txt with the command below, run locally:
pip freeze > requirements.txt
Then deploy the function to Azure by running the publish command:
func azure functionapp publish hurypyfunapp --build remote
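For this question's modules, the generated requirements.txt might contain lines like these (pip freeze will pin exact versions and also list the packages' own dependencies):
azure-functions
beautifulsoup4
PyMySQL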
For more information about deploying a Python function from local to Azure, please refer to this tutorial.
By the way, if you use the Consumption plan for your Python function, "Kudu" is not available. If you want to use "Kudu", you need to create an App Service plan for it instead of a Consumption plan.
Hope it helps~
You need to upload the installed modules when deploying to Azure. You can upload them using Kudu:
https://github.com/projectkudu/kudu/wiki/Kudu-console
As an alternative, you can also open the Kudu console and run pip install there, for example:
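A hypothetical command for this question's modules (the target folder below is the .python_packages path where Azure Functions typically loads packages from; adjust it to your app's layout):
python -m pip install beautifulsoup4 pymysql --target .python_packages/lib/site-packages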
Install Python packages from the Python code itself with the following snippet (tried and verified on Azure Functions):
def install(module_name, package_name=None):
    # Install the pip package if the module cannot be imported.
    # Note: the import name can differ from the pip name (e.g. bs4 vs beautifulsoup4).
    from importlib import import_module
    try:
        import_module(module_name)
    except ImportError:
        from sys import executable as se
        from subprocess import check_call
        check_call([se, '-m', 'pip', '-q', 'install', package_name or module_name])

for module_name, package_name in [('bs4', 'beautifulsoup4'), ('pymysql', None)]:
    install(module_name, package_name)
The libraries in the list are installed when the Azure Function is triggered for the first time. For subsequent triggers, you can comment out or remove the installation code.

Firebase on AWS Lambda Import Error

I am trying to connect Firebase with an AWS Lambda. I am using their firebase-admin SDK. I have installed and created the dependency package as described here. But I am getting this error on Lambda:
Unable to import module 'index':
Failed to import the Cloud Firestore library for Python.
Make sure to install the "google-cloud-firestore" module.
I have previously also tried setting up a similar function using node.js but I received an error message because GRPC was not configured. I think that this error message might be stemming from that same problem. I don't know how to fix this. I have tried:
pip install grpcio -t path/to/...
and installing google-cloud-firestore, but neither fixed the problem. When I run the code from my terminal, I get no errors.
Part of the problem here is that grpcio compiles a platform-specific dynamic module, cygrpc.cpython-37m-darwin.so in my case. According to this response, you cannot import dynamic modules from a zip file: https://stackoverflow.com/a/58140801
Updating to Python 3.8 fixed this for me.
As Alex DeBrie mentioned in his article on serverless.com,
The plugins section registers the plugin with the Framework. In the custom section, we tell the plugin to use Docker when installing packages with pip. It will use a Docker container that's similar to the Lambda environment so the compiled extensions will be compatible. You will need Docker installed for this to work.
In other words, the environment differs between local and Lambda, so the compiled extensions differ too. If you use a container to hold the packages installed by pip, the container mimics the Lambda environment and everything runs fine.
If you use the Serverless Framework to deploy your Python app to AWS Lambda, add these lines to the serverless.yml file:
...
plugins:
  - serverless-python-requirements
...
custom:
  pythonRequirements:
    dockerizePip: non-linux
    dockerImage: mlupin/docker-lambda:python3.9-build
...
then serverless-python-requirements will automatically start a Docker container based on the mlupin/docker-lambda:python3.9-build image.
This container mimics the Lambda environment and lets pip install and compile everything inside it, so the compiled extensions will be compatible.
This worked in my case. Hope this helps.
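A typical workflow, assuming the Serverless CLI, Node.js and Docker are installed, might then be:
npm install --save-dev serverless-python-requirements
serverless deploy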

python: cannot import name beam_runner_api_pb2

I am relatively new to Python and Beam and I have followed the Apache Beam - Python Quickstart (here) to the last letter. My Python 2.7 virtual environment was created with conda.
I cloned the example from https://github.com/apache/beam
When I try to run
python -m apache_beam.examples.wordcount --input sample_text.txt --output counts
I get the following error
/Users/name/anaconda3/envs/py27/bin/python: cannot import name beam_runner_api_pb2
(which after searching I understand means that there is a circular import)
I have no idea where to begin. Is this a bug, or is something wrong with my setup?
(I have now tried redoing the example in three different virtual environments - all with the same result)
It seems it was my mistake: I did not correctly install the Google Cloud Platform (GCP) components. Once I did this, it all worked.
# As part of the initial setup, install Google Cloud Platform specific extra components.
pip install apache-beam[gcp]

Unable to install pandas on AWS Lambda

I'm trying to install and run pandas on an Amazon Lambda instance. I've used the recommended zip method of packaging my code file model_a.py and related python libraries (pip install pandas -t /path/to/dir/) and uploaded the zip to Lambda. When I try to run a test, this is the error message I get:
Unable to import module 'model_a': C extension:
/var/task/pandas/hashtable.so: undefined symbol: PyFPE_jbuf not built.
If you want to import pandas from the source directory, you may need
to run 'python setup.py build_ext --inplace' to build the C extensions
first.
Looks like an error in a variable defined in hashtable.so that comes with the pandas installer. Googling for this did not turn up any relevant articles. There were some references to a failure in numpy installation but nothing concrete. Would appreciate any help in troubleshooting this! Thanks.
I would advise you to use Lambda layers for additional libraries. The size of a Lambda deployment package is limited, but with layers the unzipped total can go up to 250 MB (more here).
AWS has open-sourced a good package for dealing with data in Lambdas that includes pandas, and has also packaged it so it is convenient to use as a Lambda layer. You can find instructions here.
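As a minimal sketch, assuming a layer providing pandas is attached to the function, the handler code itself needs nothing special (the handler below is just an illustration):
import json
import pandas as pd  # resolved from the attached Lambda layer at runtime

def lambda_handler(event, context):
    # Trivial check that pandas works: build a DataFrame and return an aggregate.
    df = pd.DataFrame({"value": [1, 2, 3]})
    return {"statusCode": 200, "body": json.dumps({"sum": int(df["value"].sum())})}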
I have successfully run pandas code on Lambda before. If your development environment is not binary-compatible with the Lambda environment, you will not be able to simply run pip install pandas -t /some/dir and package it up into a Lambda .zip file. Even if you are developing on Linux, you may still run into compatibility issues.
So, how do you get around this? The solution is actually pretty simple: run your pip install on a Lambda container and use the pandas module that it downloads/builds instead. When I did this, I had a build script that would spin up an instance of the lambci/lambda container on my local system (a clone of the AWS Lambda container in Docker), bind my local build folder to /build, and run pip install pandas -t /build/. Once that's done, kill the container and you have the Lambda-compatible pandas module in your local build folder, ready to zip up and send to AWS along with the rest of your code.
You can do this for an arbitrary set of Python modules by using a requirements.txt file, and you can even do it for arbitrary versions of Python by first creating a virtual environment on the lambci container. I haven't needed to do this for a couple of years, so there may be better tools by now, but this approach should at least be functional.
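A rough sketch of that build step, assuming Docker is installed and a requirements.txt sits in the current folder (the image tag and paths may need adjusting for your runtime):
docker run --rm -v "$PWD":/var/task lambci/lambda:build-python3.8 pip install -r requirements.txt -t build/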
If you want to install it directly through the AWS Console, I made a step-by-step YouTube tutorial; check out the video here: How to install Pandas on AWS Lambda
