Startup.cs equivalent in Python for Azure Functions / Dependency Injection in Python

I am a C# developer who has been tasked with converting some deployed C# Azure Functions (mostly webhooks / SB) to Python.
I am struggling with the Python equivalent of dependency injection.
Take, for example, an API client class that continuously calls some 3rd-party API to push and pull data. In .NET, if I had a webhook function that needed this API client, I would register a singleton service in Startup.cs and inject it into my Azure webhook function. This is advantageous because I can handle the token refreshing inside the service class itself and keep the token in memory, instead of having to re-create an instance of the API client class every time the webhook fires.
How do I do this in Python? Or what is the right way to do something similar in the same environment (Azure Functions), where we store tokens in memory AND create a service once and use that same service across multiple functions?
Thanks

While not widely used because of the way Python is set up, dependency injection can be implemented in Python. There is a Python package called dependency-injector that helps with implementing DI.
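For instance, here is a minimal sketch of registering a singleton API client with dependency-injector; the `ApiClient` class and the configuration key are hypothetical stand-ins for your own code:

```python
# Minimal dependency-injector sketch; ApiClient and the "api.base_url"
# configuration key are hypothetical stand-ins.
from dependency_injector import containers, providers


class ApiClient:
    def __init__(self, base_url: str):
        self.base_url = base_url
        self._token = None  # refresh and cache the token inside the client

    def get(self, path: str):
        ...  # refresh self._token if needed, then call the 3rd-party API


class Container(containers.DeclarativeContainer):
    config = providers.Configuration()
    # Singleton: every caller that asks for api_client gets the same instance.
    api_client = providers.Singleton(ApiClient, base_url=config.api.base_url)
```

A function would then create `container = Container()` and set `container.config.api.base_url.from_env("API_BASE_URL")` once at module load, and call `container.api_client()` wherever it needs the shared instance.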
We should not store the tokens in the function's local memory: functions are stateless and serverless, which means we have neither control over nor knowledge of the servers our function runs on, so we don't know how long anything will stay in memory. Instead, once you acquire the tokens, add them to Azure Key Vault so that you can retrieve them whenever needed.
You can share a Python class among your functions: just place it in a separate folder and import it in each function, for example:
from ..common_files import test_file
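A minimal sketch of what such a shared file could contain; the folder, module, and class names are hypothetical, and the instance is created once per worker process (so it is reused on a warm worker, but not guaranteed to survive scale-out or recycling, which is why Key Vault is recommended above for anything that must persist):

```python
# common_files/api_client.py (hypothetical module)
import os


class ApiClient:
    def __init__(self, base_url: str):
        self.base_url = base_url
        self._token = None

    def _refresh_token_if_needed(self):
        if self._token is None:
            self._token = "..."  # acquire or refresh the token here

    def push(self, payload: dict):
        self._refresh_token_if_needed()
        # call the 3rd-party API with self._token


# Created once when the module is first imported, then shared by every
# function that imports it, e.g. `from ..common_files.api_client import client`.
client = ApiClient(base_url=os.environ.get("THIRD_PARTY_API_URL", ""))
```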
Refer to this article by Szymon Miks for examples of dependency injection in Python.
Refer to this documentation on sharing files between functions.
Refer to the following documentation on Azure Key Vault about retrieving keys.
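For the Key Vault part, a minimal sketch using the azure-identity and azure-keyvault-secrets packages might look like this; the vault URL and secret name are placeholders:

```python
# Read a secret from Azure Key Vault; vault URL and secret name are placeholders.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# DefaultAzureCredential picks up the Function App's managed identity when deployed.
credential = DefaultAzureCredential()
secret_client = SecretClient(
    vault_url="https://<your-vault-name>.vault.azure.net",
    credential=credential,
)


def get_api_token() -> str:
    # Fetch the stored token/secret by name.
    return secret_client.get_secret("third-party-api-token").value
```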

Related

How can we automate a Python client application that is used as an interface for users to call APIs?

We have made a Python client that is used as an interface for users. Some functions are defined in the client which internally call the APIs and return output to the users.
My requirement is to automate these Python client functions and validate the output.
Please suggest tools to use.
There are several ways to do that:
You can write multiple tests for your application as test cases that call your functions, get the results, and validate them. This is called a "feature test". To do that, you can use Python's unittest library and run the tests periodically (see the sketch after this list).
If you have a web application, you can use Selenium to build automated test flows. (You can also run it in a Docker container.)
The other option is to write another Python application that calls your functions, or sends requests wherever you want, to fetch the specific data and validate it. (It's essentially the same as the other two options, just with a different implementation.)
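For the unittest route, a minimal sketch might look like this; `my_client.get_items` is a hypothetical stand-in for one of your client functions:

```python
# Feature-test sketch; my_client and get_items are hypothetical names for
# your client module and one of its functions.
import unittest

from my_client import get_items


class FeatureTests(unittest.TestCase):
    def test_get_items_returns_expected_shape(self):
        # Call the client function that internally calls the API.
        result = get_items(limit=5)
        # Validate the shape of the output rather than exact values,
        # since live API data may change between runs.
        self.assertIsInstance(result, list)
        self.assertLessEqual(len(result), 5)
        for item in result:
            self.assertIn("id", item)


if __name__ == "__main__":
    unittest.main()
```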
The most straightforward way is to use Python for this; the simplest solution would be a library like pytest. A more comprehensive option would be something like Robot Framework.
Given that you have jmeter in your tags, I assume that at some point you will want to run a performance test; however, it might be easier to use Locust for this, as it's a pure-Python load-testing framework (see the sketch below).
If you still want to use JMeter, it's possible to call Python programs using the OS Process Sampler.
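A minimal Locust sketch, assuming the client ultimately calls an HTTP API; the host and the /api/items path are placeholders:

```python
# locustfile.py - minimal load-test sketch; the endpoint path is a placeholder.
from locust import HttpUser, task, between


class ApiUser(HttpUser):
    wait_time = between(1, 3)  # each simulated user pauses 1-3 s between tasks

    @task
    def list_items(self):
        # Every simulated user repeatedly hits this endpoint.
        self.client.get("/api/items")
```

Run it with something like `locust -f locustfile.py --host https://your-api.example.com` and watch the statistics in the Locust web UI.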

Azure Function App split between Consumption plan and App Service plan

I've recently started working on a project in Azure Functions that has two components.
Both components use the same shared code, however the plans these components use seem to differ.
An API endpoint for web users for controlling objects saved in a Cosmos DB - would benefit from an App Service plan because of the cold/warm issues with the Consumption plan.
A backend scheduled process that uses these objects - would benefit from a Consumption plan, since it could use automatic scaling and exact execution time is not that important.
I thought of using the Premium plan to solve both issues, but it seems pretty expensive (my workload is pretty low, and from the calculator on Azure's page it looks like around $150 a month by default - correct me if I'm wrong).
I was wondering if there is a way to split a function app into two plans, or have two function apps share code.
Thanks!
I was wondering if there is a way to split a function app into two plans, or have two function apps share code.
The first option is not something that can be done. The second option is the one to go for here. You can put the shared code in a separate Class Library project and reference it from both Function Apps.
A class library defines types and methods that are called by an application. If the library targets .NET Standard 2.0, it can be called by any .NET implementation (including .NET Framework) that supports .NET Standard 2.0. If the library targets .NET 5, it can be called by any application that targets .NET 5.
When you create a class library, you can distribute it as a NuGet package or as a component bundled with the application that uses it.
More information: Tutorial: Create a .NET class library using Visual Studio
EDIT:
For Python, you can create modules.
Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a module can be imported into other modules or into the main module (the collection of variables that you have access to in a script executed at the top level and in calculator mode).
More info: 6. Modules

Azure function apps with Python one time initialization

I'm using Azure Function Apps with Python. I have two dozen function apps that all use a Postgres DB and Custom Vision. All function apps are set up as HttpTriggers. Right now, when a function is triggered, a new database handler (or Custom Vision handler) object is created, used, and terminated when the function app call is done.
It seems very counterproductive to instantiate new objects on every single request that comes in. Is there a way to instantiate shared objects once and then pass them to a function when it is called?
In general, Azure Functions are intended to be stateless and not share objects from one invocation to the next. However, there are some exceptions.
Sharing Connection Objects
The Azure docs describe the Improper Instantiation antipattern and recommend sharing connection objects that are intended to be opened once and reused throughout an application, rather than recreated on every call.
There are some things to keep in mind for this to work for you, mainly:
The key element of this antipattern is repeatedly creating and destroying instances of a shareable object. If a class is not shareable (not thread-safe), then this antipattern does not apply.
They have some walkthroughs there that will probably help you. Since your question is fairly generic, the best I can do is recommend you read through it and see if that will help you.
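In a Python Azure Function, one way to follow that guidance is the minimal sketch below: module-level objects are created when the worker process loads and are reused across invocations. The "DB_CONN" app setting and the query are hypothetical:

```python
# Create the Postgres connection pool once per worker process and reuse it
# across invocations; "DB_CONN" is a hypothetical app setting holding a DSN.
import os

import azure.functions as func
from psycopg2.pool import ThreadedConnectionPool

# A threaded pool is used because a single connection is not safely
# shareable across concurrent invocations.
_pool = ThreadedConnectionPool(1, 5, os.environ["DB_CONN"])


def main(req: func.HttpRequest) -> func.HttpResponse:
    conn = _pool.getconn()
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")  # placeholder query
            cur.fetchone()
        return func.HttpResponse("ok")
    finally:
        _pool.putconn(conn)
```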
Durable Functions
The alternative is to consider Durable Functions instead of the standard ones. They are intended to be able to pass objects between functions, making them not quite stateless.
Durable Functions is an advanced extension for Azure Functions that isn't appropriate for all applications. This article assumes that you have a strong familiarity with concepts in Azure Functions and the challenges involved in serverless application development.

Is there anyway to get a callback after every insert/delete on Google Cloud Datastore?

I would like to sync my Cloud Datastore contents with an index in ElasticSearch. I would like for the ES index to always be up to date with the contents of Datastore.
I noticed that an equivalent mechanism is available in the Appengine Python Standard Environment by implementing a _post_put_hook method in a Datastore Model. This doesn't seem to be possible however using the google-cloud-datastore library available for use in the flex environment.
Is there any way to receive a callback after every insert? Or will I have to put up a "proxy" API in front of the datastore API which will update my ES index after every insert/delete?
The _post_put_hook() of ndb.Model only works if you have written the entity to Datastore through NDB, and yes, unfortunately the NDB library is only available in the App Engine Python Standard Environment. I don't know of such a feature in Cloud Datastore. If I remember correctly, Firebase Realtime Database and Firestore have triggers for writes, but I guess you are not eager to migrate the database either.
In Datastore you would either need a "proxy" API, as you suggested, or you would need to modify your Datastore client(s) to do this on every successful write operation. The latter comes with a higher risk of failures and stale data in ElasticSearch, especially if the client is outside your control.
I believe a custom API makes sense if consistent and up-to-date search records are important for your use cases. Datastore and Python / NDB (maybe with Cloud Endpoints) would be a good approach.
I have a similar solution running on GAE Python Standard (although with the builtin Search API instead of ElasticSearch). If you choose this route you should be aware of two potential caveats:
_post_put_hook() is always called, even if the put operation failed. I have added a code sample below. You can find more details in the docs: model hooks, hook methods, check_success().
Exporting the data to ElasticSearch or the Search API will prolong your response time. This might be no issue for background tasks; just call the export feature inside _post_put_hook(). But if a user made the request, this could be a problem. For these cases, you can defer the export operation to a separate task, either by using the deferred.defer() method or by creating a push task. More or less, they are the same. Below, I use defer().
Add a class method for every kind for which you want to export search records. Whenever something goes wrong, or you move apps / datastores, add new search indexes, etc., you can call this method, which will then query all entities of that kind from Datastore batch by batch and export the search records (see the sketch after the example below).
Example with deferred export:
```python
from google.appengine.ext import deferred, ndb


class CustomModel(ndb.Model):

    def _post_put_hook(self, future):
        try:
            if future.check_success() is None:
                deferred.defer(export_to_search, self.key)
        except:
            pass  # or log error to Cloud Console with logging.error('blah')


def export_to_search(key=None):
    try:
        if key is not None:
            entity = key.get()
            if entity is not None:
                call_export_api(entity)
    except:
        pass  # or log error to Cloud Console with logging.error('blah')
```
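A rough sketch of the per-kind re-export described above; it is written as a module-level function rather than a class method so that deferred can pickle it, and call_export_api is the same placeholder as in the example:

```python
# Re-export search records for one kind, batch by batch, chaining deferred
# tasks so no single request runs too long.
from google.appengine.datastore.datastore_query import Cursor
from google.appengine.ext import deferred


def export_all_custom_models(cursor_urlsafe=None, batch_size=100):
    cursor = Cursor(urlsafe=cursor_urlsafe) if cursor_urlsafe else None
    entities, next_cursor, more = CustomModel.query().fetch_page(
        batch_size, start_cursor=cursor)
    for entity in entities:
        call_export_api(entity)
    if more and next_cursor:
        # Queue the next batch as a separate task.
        deferred.defer(export_all_custom_models,
                       cursor_urlsafe=next_cursor.urlsafe(),
                       batch_size=batch_size)
```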

How to show Spark application's percentage of completion on AWS EMR (and Boto3)?

I am running a Spark step on AWS EMR; this step is added to EMR through Boto3. I would like to return to the user a percentage of completion of the task. Is there any way to do this?
I was thinking of calculating this percentage from the number of completed Spark stages. I know this won't be very precise, as stage 4 may take twice as long as stage 5, but I am fine with that.
Is it possible to access this information with Boto3?
I checked the list_steps method (here are the docs), but the response only tells me whether the step is running, without any other information.
DISCLAIMER: I know nothing about AWS EMR and Boto3
I would like to return to the user a percentage of completion of the task, is there any way to do this?
Any way? Perhaps. Just register a SparkListener and intercept events as they come. That's how web UI works under the covers (which is the definitive source of truth for Spark applications).
Use spark.extraListeners property to register a SparkListener and do whatever you want with the events.
Quoting the official documentation's Application Properties:
spark.extraListeners A comma-separated list of classes that implement SparkListener; when initializing SparkContext, instances of these classes will be created and registered with Spark's listener bus. If a class has a single-argument constructor that accepts a SparkConf, that constructor will be called; otherwise, a zero-argument constructor will be called.
You could also consider REST API interface:
In addition to viewing the metrics in the UI, they are also available as JSON. This gives developers an easy way to create new visualizations and monitoring tools for Spark. The JSON is available for both running applications, and in the history server. The endpoints are mounted at /api/v1. E.g., for the history server, they would typically be accessible at http://<server-url>:18080/api/v1, and for a running application, at http://localhost:4040/api/v1.
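For the REST API route, a rough sketch of computing the completed-stages percentage the question proposes might look like this; the driver host is a placeholder, and the 4040 port assumes a running application:

```python
# Estimate progress as completed stages / total stages via Spark's REST API.
# The driver host below is a placeholder.
import requests

BASE = "http://<driver-host>:4040/api/v1"


def completion_estimate():
    apps = requests.get(BASE + "/applications").json()
    if not apps:
        return None
    app_id = apps[0]["id"]
    stages = requests.get("{}/applications/{}/stages".format(BASE, app_id)).json()
    if not stages:
        return None
    completed = sum(1 for stage in stages if stage["status"] == "COMPLETE")
    return 100.0 * completed / len(stages)
```

As the question notes, this is only a rough estimate, since stages are not equal in duration.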
This is not supported at the moment and I don't think it will be anytime soon.
You'll just have to follow the application logs the old-fashioned way. So maybe consider formatting your logs in a way that tells you what has actually finished.
