Save new blob files using Azure Durable Functions (Python)

I have a project that will be containerized in Docker. It consists of a DurableFunctionsHttpStart, an orchestrator, and an activity function. The activity function is a web scraper that, once the data has been downloaded, saves a CSV into Azure Blob Storage. When I tested my function, I got the following error inside the Docker container:
HttpResponseError: This request is not authorized to perform this operation using this resource type.
I tried using the function binding with an out blob, as below:
and also tried using the Python SDK, as shown below:
This is the host.json of my activity function:
The problem only started once I began using the durable function; with a normal HTTP trigger I don't have any issue.
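For reference, a minimal sketch of the SDK approach inside the activity function might look like the following; it is not the original code, and the connection-string setting, the activity input binding name, the container name and the blob name are all assumptions:
# Sketch of an activity function uploading the scraped CSV with the blob SDK (names/settings are assumptions).
import os
from azure.storage.blob import BlobServiceClient

def main(csv_data: str) -> str:
    # Assumes a storage connection string in the AzureWebJobsStorage app setting.
    service = BlobServiceClient.from_connection_string(os.environ["AzureWebJobsStorage"])
    container = service.get_container_client("scraped-data")  # assumed container name
    container.upload_blob("results.csv", csv_data, overwrite=True)
    return "uploaded"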

Related

Upload large blob to Azure storage container using App service and function app

I am working on a project to allow users to upload blobs into a blob container in our storage account. I developed a simple UI (Flask) using Azure App Service to let users choose files to upload, and then I want to upload these files to the blob container.
My original design is UI -> Blob Container via the Python Storage SDK:
containerClient.upload_blob(filename, file)
But I am facing a timeout issue in Azure App Service when uploading large files.
So I changed the upload UI to dropzone.js and enabled chunked uploading, so that the server consistently receives responses and avoids the timeout.
Another issue that comes up is that the upload is executed for every chunk, and the blob container only receives the last chunk of the data I upload. (From the documentation, I know that chunking is used automatically in blob uploads. I wonder if we are able to track the progress of the upload? If so, I probably don't need dropzone.js for chunked uploading.)
I also tried another approach: creating an Azure Function (HTTP trigger) and then sending a request to that endpoint to start the blob upload.
for f in files:
    fileToSend = {'file': (f.filename, f.stream, f.content_type, f.headers)}
    r = requests.post('https://myazurefunctionapp.azurewebsites.net/api/funcName', files=fileToSend)
In the Azure function, I use the Python Storage SDK to connect to the container and then upload the blob:
container = ContainerClient.from_connection_string(conn_str, container_name)
for k, f in req.files.items():
    container.upload_blob(f.filename, f)
But I notice that the function is triggered per chunk (request), and I also end up with only the last chunk of data in the container.
I wonder what the better workflow would be, or whether there is any way to make sure the upload is completed (in the Azure function) before starting the upload to the blob container.
Many Thanks,
• Storage clients default to a 32 MB maximum single-block upload. When a block blob upload is larger than the value of the 'SingleBlobUploadThresholdInBytes' property, storage clients break the file into blocks of the maximum allowed size and upload those. Since the blob you are trying to upload is larger than 32 MB, it is broken into the allowed smaller chunks rather than sent in a single request. Also, you might not be using the correct blob service client, i.e., the client that interacts with the resources: the storage account, blob storage containers, and blobs.
Below is an example of client object creation, which requires the storage account's blob service endpoint URL and a credential that allows you to access the storage account (a fuller upload sketch follows at the end of this answer):
from azure.storage.blob import BlobServiceClient
# 'credential' can be an account key, a SAS token, or a token credential such as DefaultAzureCredential
service = BlobServiceClient(account_url="https://<my-storage-account-name>.blob.core.windows.net/", credential=credential)
• Similarly, since you are using code like the above in Python to create a blob service client for interacting with the storage account, refer to the documentation link below, which describes in detail how to write Python code that integrates with Blob Storage for storing massive amounts of unstructured data, such as text or binary data.
https://learn.microsoft.com/en-us/python/api/overview/azure/storage-blob-readme?view=azure-python
You can deploy this code in your App Service or function and set the trigger accordingly for uploading and downloading blobs from the storage account. The documentation also describes how to configure authentication for this process, to ensure that the correct users and files are given access.
Also refer to the following documentation link for details on how to configure a blob trigger function in Azure for various interactions with the storage account when users initiate transactions through it.
https://learn.microsoft.com/en-us/azure/storage/blobs/blob-upload-function-trigger?tabs=azure-portal
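As a complement to the client creation shown above, a hedged sketch of uploading a large local file in blocks with the v12 Python SDK might look like this; the account URL, container and file names, and the block-size values are assumptions:
# Sketch: upload a large file and let the SDK split it into blocks (values below are assumptions).
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()
service = BlobServiceClient(
    account_url="https://<my-storage-account-name>.blob.core.windows.net/",
    credential=credential,
    max_single_put_size=4 * 1024 * 1024,   # blobs larger than this are uploaded as staged blocks
    max_block_size=4 * 1024 * 1024,        # size of each staged block
)
blob = service.get_blob_client(container="uploads", blob="large-file.bin")
with open("large-file.bin", "rb") as data:
    # upload_blob handles the chunking; max_concurrency uploads blocks in parallel
    blob.upload_blob(data, overwrite=True, max_concurrency=4)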

Can we use an Azure Data Explorer function (for example series_decompose()) locally or anywhere in a Python program?

There is a function in Azure Data Explorer, series_decompose(), and I need to use this function in my Python program locally, with data from SQL.
So can I do it, and if yes, then how?
You can run KQL functions such as series_decompose() when using Kusto (a.k.a. ADX or Azure Data Explorer), or another service that runs on top of Kusto, e.g. Azure Monitor, Sentinel, etc.
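So rather than evaluating it locally, one option is to query an ADX cluster from Python and let Kusto run series_decompose() there. A minimal sketch with the azure-kusto-data package might look like this; the cluster URL, database name, table name and aggregation are assumptions:
# Sketch: run series_decompose() on an ADX cluster from Python (cluster/database/table names are assumptions).
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster = "https://<my-cluster>.<region>.kusto.windows.net"
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster)
client = KustoClient(kcsb)

query = """
MyTimeseriesTable
| make-series value=avg(Value) on Timestamp step 1h
| extend (baseline, seasonal, trend, residual) = series_decompose(value)
"""
response = client.execute("MyDatabase", query)
for row in response.primary_results[0]:
    print(row)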

Loading a new CSV in Azure Blob Storage to SQL DB

I am loading a CSV file into an Azure Blob Storage account. I would like a process to be triggered when a new file is added, which takes the new CSV and BCP-loads it into an Azure SQL database.
My idea is to have an Azure Data Factory pipeline that is event triggered. However, I am stuck as to what to do next. Should an Azure Function be triggered that takes this CSV and uses BCP to load it into the DB? Can Azure Functions even use BCP?
I am using Python.
Please check the link below. Basically, you want to copy new files as well as modified files, and for that a single Copy Data activity is useful. Use an event-based trigger (when a file is created) instead of a scheduled one.
https://www.mssqltips.com/sqlservertip/6365/incremental-file-load-using-azure-data-factory/
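For completeness, since the question asks whether an Azure Function could do the load: a function-based alternative is also possible. Below is a hedged sketch of a blob-triggered Python function that inserts the CSV rows with pyodbc rather than BCP; the connection-string setting, target table and column layout are assumptions:
# Sketch: blob-triggered Azure Function loading a CSV into Azure SQL with pyodbc (names/settings are assumptions).
import csv
import io
import os

import azure.functions as func
import pyodbc

def main(inputblob: func.InputStream):
    rows = list(csv.reader(io.StringIO(inputblob.read().decode("utf-8"))))
    header, data = rows[0], rows[1:]

    conn = pyodbc.connect(os.environ["SQL_CONNECTION_STRING"])  # assumed app setting
    cursor = conn.cursor()
    cursor.fast_executemany = True  # batch the inserts
    placeholders = ",".join("?" * len(header))
    cursor.executemany(f"INSERT INTO dbo.StagingTable VALUES ({placeholders})", data)  # assumed table
    conn.commit()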

How to read and modify a csv file on one bucket in cloud storage and save the results in another bucket using Cloud Functions

I have CSV files coming into a folder in a Cloud Storage bucket, and I want to create a Cloud Function that opens the CSV, adds a new column to it and then saves the result to another bucket as a new.csv file.
Is there a way to do that using a Python Cloud Function?
Thanks in advance.
The idea that you’re trying to implement is totally possible and can be achieved using Google Cloud Functions.
For that, you would need to create a storage-triggered Cloud Function. More specifically, you can create your function in a way that it will respond to change notifications emerging from your Google Cloud Storage.
These notifications can be configured to respond to various events inside a bucket: object creation, deletion, archiving and metadata updates.
For the situation described, you will need to use the trigger google.storage.object.finalize.
This event is sent when a new object is created in the bucket or an existing object is overwritten, and a new generation of that object is created.
Here you can find sample code for a storage-triggered Cloud Function written in Python, while this tutorial gives a more detailed overview of how storage-triggered functions are used.
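As a rough illustration of that setup, a finalize-triggered function that adds a column and writes the result to a second bucket might look like the sketch below; the destination bucket name and the added column are assumptions:
# Sketch: storage-triggered Cloud Function (google.storage.object.finalize) that adds a column to a CSV.
import csv
import io

from google.cloud import storage

def process_csv(event, context):
    client = storage.Client()

    # The event payload carries the bucket and object name of the uploaded CSV.
    source_blob = client.bucket(event["bucket"]).blob(event["name"])
    reader = csv.reader(io.StringIO(source_blob.download_as_text()))

    output = io.StringIO()
    writer = csv.writer(output)
    for i, row in enumerate(reader):
        row.append("new_column" if i == 0 else "some_value")  # assumed new column
        writer.writerow(row)

    # Write the modified file to another bucket (assumed name) as new.csv.
    client.bucket("my-destination-bucket").blob("new.csv").upload_from_string(
        output.getvalue(), content_type="text/csv"
    )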

Communicating with an Azure container using a serverless function

I have created a Python serverless function in Azure that gets executed when a new file is uploaded to Azure Blob Storage (BlobTrigger). The function extracts certain properties of the file and saves them in the DB. As the next step, I want this function to copy and process the same file inside a container instance running in ACI. The result of the processing should be returned to the same Azure function.
This is a hypothetical architecture that I am currently brainstorming. I wanted to know if this is feasible. Can you provide some pointers on how I can achieve this?
I don't see any ContainerTrigger-style functionality that would allow me to trigger the container and process my next steps.
I have tried the code examples mentioned here, but they do not really perform the tasks that I need: https://github.com/Azure-Samples/aci-docs-sample-python/blob/master/src/aci_docs_sample.py
Based on the comments above, you can consider the following.
Azure Container Instance
Deploy your container in ACI (Azure Container Instances) and expose an HTTP endpoint from the container, just like any web URL. Trigger the Azure Function using the blob storage trigger and then pass your blob file URL to the HTTP endpoint exposed by your container. Process the file there and return the response back to the Azure function, just like a normal HTTP request/response (see the sketch after this answer).
Alternatively, you can completely bypass the Azure function and trigger your ACI container instance using Logic Apps, process the file, and save it directly to the database.
When you are using an Azure function, make sure this is a short-lived process, since the function will time out after a certain time (the default is 5 minutes). For long processing you may have to consider Azure Durable Functions.
The following URL can help you understand this better.
https://github.com/Azure-Samples/aci-event-driven-worker-queue
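A hedged sketch of the first option (a blob-triggered function handing the blob URL to an HTTP endpoint on the container) might look like this; the ACI endpoint URL, the app setting and the JSON contract are assumptions:
# Sketch: blob-triggered function that asks a container in ACI to process the file (endpoint URL is an assumption).
import logging
import os

import azure.functions as func
import requests

def main(myblob: func.InputStream):
    logging.info("New blob: %s (%s bytes)", myblob.name, myblob.length)

    # Hand the blob URI to the HTTP endpoint exposed by the ACI container and wait for the result.
    aci_endpoint = os.environ.get("ACI_PROCESS_URL", "http://<my-aci-dns-label>.<region>.azurecontainer.io/process")
    response = requests.post(aci_endpoint, json={"blob_uri": myblob.uri}, timeout=120)
    response.raise_for_status()

    logging.info("Container result: %s", response.text)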
