How to upload a large string in an Azure Blob? - python

I'm currently learning how to work with Azure and am stuck on a problem storing data in a storage account.
I have three strings and want to store each of them in a separate blob. With the first two, my code works fine, but the third one causes some retries and ends with a timeout.
My code is running within an Azure Function.
Here is a minimal example:
from azure.storage.blob import BlobClient
blob_client = BlobClient.from_connection_string(
    conn_str='<STORAGE_ACCOUNT_CONNECTION_STRING>',
    container_name='<CONTAINER_NAME>',
    blob_name='<NAME_OF_BLOB>',
)
dic_blob_props = blob_client.upload_blob(
    data='<INFORMATION_THAT_SHOULD_GO_TO_THE_BLOB>',
    blob_type="BlockBlob",
    overwrite=True,
)
For the first two strings everything works fine, but the third fails. The strings have the following lengths:
len(s_1) = 1246209
len(s_2) = 8794086
len(s_3) = 24518001
Most likely it is because the third string is too long, but there must be a way to save it, right?
I have already tried setting the timeout in the .upload_blob method with timeout=600, but this changed neither the result nor the time until a new write attempt is made.
The error is:
Exception: ServiceResponseError: ('Connection aborted.', timeout('The write operation timed out'))
If you have any ideas about the problem, please let me know :-)

In my case, the problem disappeared after I deployed the function to the cloud. It seems the problem was related to debugging locally with Visual Studio Code.

On my side, I don't have the problem. You can have a look at my code:
__init__.py
import logging
import azure.functions as func
def main(req: func.HttpRequest, outputblob: func.Out[func.InputStream]) -> func.HttpResponse:
    logging.info('This code is to upload a string to a blob.')
    # Same length as the string that failed in the question
    s_3 = "x" * 24518001
    outputblob.set(s_3)
    return func.HttpResponse(
        "The string has already been uploaded to a blob.",
        status_code=200
    )
function.json
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "anonymous",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"route": "{test}",
"methods": [
"get",
"post"
]
},
{
"type": "http",
"direction": "out",
"name": "$return"
},
{
"name": "outputblob",
"type": "blob",
"path": "test1/{test}.txt",
"connection": "str",
"direction": "out"
}
]
}
local.settings.json
{
"IsEncrypted": false,
"Values": {
"AzureWebJobsStorage": "",
"FUNCTIONS_WORKER_RUNTIME": "python",
"str":"DefaultEndpointsProtocol=https;AccountName=0730bowmanwindow;AccountKey=xxxxxx==;EndpointSuffix=core.windows.net"
}
}
Then I hit the endpoint http://localhost:7071/api/bowman. It uploads the string to the blob without a timeout error.
So I think the problem is related to the method you use.
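If you want to stay with the SDK rather than the output binding, the following is a minimal sketch of one thing to try. It rests on assumptions not confirmed in the question: that the local upload link is slow enough for a single large PUT to hit the client-side socket timeout, and that the timeout argument of upload_blob is only the server-side operation timeout. The helper names make_blob_client and upload_text are made up for illustration, and the sizes and the 600-second value are illustrative rather than recommended settings.

```python
def make_blob_client(conn_str, container_name, blob_name):
    """Build a BlobClient that uploads in small blocks with a longer socket timeout."""
    # Imported here so upload_text below can be tried without the SDK installed.
    from azure.storage.blob import BlobClient

    return BlobClient.from_connection_string(
        conn_str=conn_str,
        container_name=container_name,
        blob_name=blob_name,
        max_single_put_size=4 * 1024 * 1024,  # larger payloads are sent as blocks
        max_block_size=4 * 1024 * 1024,       # size of each uploaded block
        connection_timeout=600,               # client-side timeout in seconds (illustrative)
    )


def upload_text(blob_client, text):
    """Encode the string once and upload it as a block blob, overwriting any old blob."""
    data = text.encode("utf-8")
    return blob_client.upload_blob(data, blob_type="BlockBlob", overwrite=True)
```

With these settings the 24 MB string would be sent as several smaller requests instead of one, so each individual write has less data to push before the timeout applies.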

Related

Azure Python Function: Underscore are not working in routes

Hello, I am creating an Azure Python Function with a POST/GET route whose URL path looks like http://localhost:7071/api/indicator/sjz_jb_1. For that I wrote the following code.
In the previous version I used {indicator:alpha?}, based on https://learn.microsoft.com/nl-nl/azure/azure-functions/functions-bindings-http-webhook-trigger?tabs=python#customize-the-http-endpoint, but with the alpha constraint underscores do not work. After searching the internet I found the Stack Overflow post "Underscore in URL not working with attribute routing". It is based on C#; the function still runs with the regex route below, but the request still results in a page not found.
function.json
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "function",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
],
"route": "indicator/{indicator:regex(^[a-zA-Z_]+$)}"
},
{
"type": "http",
"direction": "out",
"name": "$return"
}
]
}
__init__.py
import logging
import json
import azure.functions as func
from shared_code.Metadata import Visualisatieportal_indicatoren
def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    # Read the {indicator} segment captured by the route template
    ind = req.route_params.get('indicator')
    logging.info(ind)
    indicator1 = Visualisatieportal_indicatoren.getVisualisatieportal_indicatoren(ind)
    result = json.dumps(indicator1.__dict__)
    # Return the serialized indicator (the original returned str(id), the built-in function)
    return func.HttpResponse(result)
My question is: what do I need to change so that underscores are supported in the route of the endpoint?
Many thanks,
Erik
Regarding your question about what to change so that underscores are supported in the route: as a workaround, you can try the following route, which also allows digits:
"route": "indicator/{indicator:regex(^[a-zA-Z0-9_]*$)}"
Also, based on this MS doc, the following attribute cannot be used in Python functions; it is supported only in C# and Java.
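To see why the two patterns behave differently for the URL in the question, here is a small sketch that checks both against the segment sjz_jb_1. It uses Python's re module purely for illustration; the route constraint itself is evaluated by the Functions host, so this only assumes that these simple character classes behave the same way there.

```python
import re

# Pattern from the question: letters and underscores only, at least one character
original = re.compile(r"^[a-zA-Z_]+$")
# Pattern from the answer: letters, digits and underscores
suggested = re.compile(r"^[a-zA-Z0-9_]*$")

value = "sjz_jb_1"  # route segment from the URL in the question

# The trailing "1" is a digit, which the original character class does not allow
original_matches = original.match(value) is not None
suggested_matches = suggested.match(value) is not None
```

It looks as though the digit at the end of the segment, rather than the underscore, is what the original pattern rejects.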

Delete CosmosDB Container Items

I am trying to create an Azure Function (implemented in Python) to delete an item in a CosmosDB container. Using Azure Cosmos DB Input & Output bindings, I was able to add, query and update items but I was not able to find a method that could delete one. Is it possible to delete an item using the binding methods?
The following code is what I am currently using to do a simple update.
__init__.py file
import logging
import azure.functions as func
def main(req: func.HttpRequest, doc: func.Out[func.Document]) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    # Defaults, so the check below also works when the body is not valid JSON
    bc_id_no = None
    trip_id = None
    departure_time = ""
    arrival_time = ""
    try:
        req_body = req.get_json()
    except ValueError:
        pass
    else:
        bc_id_no = req_body.get('bc_id_no')
        trip_id = req_body.get('trip_id')
        departure_time = req_body.get('departure_time')
        arrival_time = req_body.get('arrival_time')
    if bc_id_no and trip_id:
        # Build the document and hand it to the Cosmos DB output binding
        newdocs = func.DocumentList()
        input_dict = {
            "bc_id_no": bc_id_no,
            "id": trip_id,
            "departure_time": departure_time,
            "arrival_time": arrival_time
        }
        newdocs.append(func.Document.from_dict(input_dict))
        doc.set(newdocs)
        return func.HttpResponse("This HTTP triggered function executed successfully.")
    else:
        return func.HttpResponse(
            "bc_id_no or trip_id not available",
            status_code=200
        )
function.json
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "anonymous",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
],
"route": "update_rec"
},
{
"type": "cosmosDB",
"direction": "out",
"name": "doc",
"databaseName": "mockDB",
"collectionName": "mockCollection",
"connectionStringSetting": "AzureCosmosDBConnectionString"
},
{
"type": "http",
"direction": "out",
"name": "$return"
}
]
}
I understand that it may be possible to use the sqlQuery configuration property of the input binding to specify a delete statement (I'm not sure whether that is even good practice), but I am wondering whether another method for deletion is available.
The bindings only support querying and reading (input binding) or adding (output binding); there is no delete support.
There is no configuration you can pass that would make the binding execute a Delete, it's just not there in the code: https://github.com/Azure/azure-webjobs-sdk-extensions/tree/cosmos/v3.x/src/WebJobs.Extensions.CosmosDB
The only alternative I can think of is if you used the Python SDK directly inside the Function to perform the delete: https://learn.microsoft.com/azure/cosmos-db/sql-api-sdk-python
Just make sure that the instance is created and maintained outside of the execution scope: https://learn.microsoft.com/azure/azure-functions/manage-connections#static-clients
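A minimal sketch of that approach with the azure-cosmos package could look like the following. The database, collection and setting names are taken from the function.json in the question. The helper names get_container and delete_trip are made up for illustration, and using bc_id_no as the partition key is an assumption; substitute the container's actual partition key value.

```python
import os

_container = None  # created once and reused across invocations


def get_container():
    """Create the container client on first use and cache it at module level."""
    global _container
    if _container is None:
        # Imported lazily so delete_trip can be tried without the SDK installed.
        from azure.cosmos import CosmosClient

        client = CosmosClient.from_connection_string(
            os.environ["AzureCosmosDBConnectionString"]
        )
        database = client.get_database_client("mockDB")
        _container = database.get_container_client("mockCollection")
    return _container


def delete_trip(container, trip_id, bc_id_no):
    """Delete one item by id; bc_id_no as partition key is an assumption."""
    container.delete_item(item=trip_id, partition_key=bc_id_no)
```

Inside main you would then call delete_trip(get_container(), trip_id, bc_id_no) instead of setting the output binding.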

Azure Functions Blob deployment

I am going off the documentation here: https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-output?tabs=python
Here is the code I currently have:
function.json
{
"bindings": [
{
"queueName": "myqueue-items",
"connection": "nameofstorageaccount_STORAGE",
"name": "queuemsg",
"type": "queueTrigger",
"direction": "in"
},
{
"name": "inputblob",
"type": "blob",
"dataType": "binary",
"path": "samples-workitems/{queueTrigger}",
"connection": "nameofstorageaccount_STORAGE",
"direction": "in"
},
{
"name": "outputblob",
"type": "blob",
"dataType": "binary",
"path": "samples-workitems/{queueTrigger}-Copy",
"connection": "nameofstorageaccount_STORAGE",
"direction": "out"
}
],
"disabled": false,
"scriptFile": "__init__.py"
}
__init__.py
import logging
import azure.functions as func
def main(queuemsg: func.QueueMessage, inputblob: bytes, outputblob: func.Out[bytes]):
    logging.info(f'Python Queue trigger function processed {len(inputblob)} bytes')
    # Write the input blob's bytes to the "-Copy" output blob
    outputblob.set(inputblob)
If I understand correctly, this function should be triggered when a blob is added to a container, and it should save a copy of that blob inside the same container.
The function runs, but nothing happens when a blob is uploaded to the container. I would like to trigger some code when a blob is uploaded; this is the only full example I have found with Python and a blob trigger.
Appreciate any help,
Thanks! :)
No. If you read the document, it states that the function is triggered when a message is sent to the queue:
The following example shows blob input and output bindings in a
function.json file and Python code that uses the bindings. The
function makes a copy of a blob. The function is triggered by a
queue message that contains the name of the blob to copy. The new
blob is named {originalblobname}-Copy.
If you want to execute a function when a blob is created, please see Blob Trigger example here: https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-trigger?tabs=python.

Azure Functions how to return an HttpResponse or display a message before the script finishes

I have a Python Azure Function that is one file and one main function:
def main(req: func.HttpRequest) -> func.HttpResponse:
    # [bunch of code]
    return func.HttpResponse("the file will be deleted in 10 minutes", status_code=200)
It creates a file inside Azure Blob storage for the user and deletes it in 10 minutes. I use time.sleep(600) to do this. However, the message only arrives at the end of this timer, after the file has already been deleted.
How can I make the HttpResponse show the message before the script ends, and then wait 10 minutes before deleting the file?
I've tried adding func.HttpResponse('the file will be deleted in 10 minutes') before the time.sleep(600) but it doesn't return anything.
For a Function with http output binding like this you have to return the http response at the end for the response to work. So with a single Function, you cannot achieve this. Continue reading for the alternate solution.
This problem is typically an 'asynchronous' processing example where you want to respond immediately like "ok, I am going to do this" while it "queues further processing" to be continued in the backend. To achieve this in Azure Function you will need 2 functions as below:
Function 1 : Http trigger, http output and Queue output binding (for simplicity I will use storage queue).
Function 2 : Queue trigger (will get triggered by the message queued by function 1).
Function 1 (update according to your need):
JSON:
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "function",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
]
},
{
"type": "http",
"direction": "out",
"name": "$return"
},
{
"type": "queue",
"direction": "out",
"name": "msg",
"queueName": "outqueue",
"connection": "AzureStorageQueuesConnectionString"
}
]
}
Code:
import azure.functions as func
def main(req: func.HttpRequest, msg: func.Out[str]) -> func.HttpResponse:
    # [bunch of code]
    input_msg = "<create the message body required by function 2>"
    # Queue the follow-up work, then respond immediately
    msg.set(input_msg)
    return func.HttpResponse("the file will be deleted in 10 minutes", status_code=201)
Function 2 (update according to your need):
JSON:
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "msg",
"type": "queueTrigger",
"direction": "in",
"queueName": "outqueue",
"connection": "AzureStorageQueuesConnectionString"
}
]
}
Code:
import json
import azure.functions as func
def main(msg: func.QueueMessage):
    # below is just an example of parsing the message, in your case it might be taking the blob info required for deleting
    message = json.dumps({
        'id': msg.id,
        'body': msg.get_body().decode('utf-8'),
        'expiration_time': (msg.expiration_time.isoformat()
                            if msg.expiration_time else None),
        'insertion_time': (msg.insertion_time.isoformat()
                           if msg.insertion_time else None),
        'time_next_visible': (msg.time_next_visible.isoformat()
                              if msg.time_next_visible else None),
        'pop_receipt': msg.pop_receipt,
        'dequeue_count': msg.dequeue_count
    })
    # [bunch of code]
NOTE: You can also look at Durable Functions where you can handle complex workflow and would not need to manage the queueing yourself. But since your scenario in this case is quite simple, I did not cover it.
This is because the actual response isn't sent before the function itself returns something to the pipeline - the pipeline will then return the result to the caller.
And instead of doing this wonky 10-minute waiting inside a function app (which is something you really never should do), I'd create a queue message, set the initial invisibility to 10 minutes, add to e.g. delete-file-queue. Have a QueueTrigger somewhere, listening to delete-file-queue, and do the deletion of the file.
So instead, do something like this (I'm not super familiar with Functions in python, so treat this as pseudo code):
def main(req: func.HttpRequest) -> func.HttpResponse:
    # handle whatever you have to, but do NOT include time.sleep
    queue_client.send_message("path/to/blob", visibility_timeout=600)
    # the message will end up in the back of the queue, and
    # it'll stay invisible for 600 seconds
    # this is something we don't have to wait for, and thus, the following
    # will return immediately
    return func.HttpResponse("file will be deleted in 10 minutes")
Your QueueTrigger would then be something like this:
def main(filename: func.QueueMessage, inputblob: func.InputStream) -> None:
    # check if inputblob is none, if not, delete it
    ...
In your function.json, you should include bindings for filename and inputblob:
{
"name": "filename",
"type": "queueTrigger",
"direction": "in",
"queueName": "delete-file-queue",
"connection": "MyStorageConnectionString"
},
{
"name": "inputblob",
"type": "blob",
"path": "{queueTrigger}",
"connection": "MyStorageConnectionString",
"direction": "in"
}
Guide to initializing a queue_client.
And more info here.
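As a rough sketch of how the queue_client used above could be created and used with the azure-storage-queue package: the queue name and the MyStorageConnectionString setting come from the bindings above, while the helper names make_queue_client and schedule_delete are made up for illustration. The base64 encode policy is an assumption that the Functions host is running with its default queue message encoding.

```python
import os


def make_queue_client(queue_name="delete-file-queue"):
    """Build a QueueClient whose messages a queue trigger can decode."""
    # Imported lazily so schedule_delete can be tried without the SDK installed.
    from azure.storage.queue import QueueClient, TextBase64EncodePolicy

    return QueueClient.from_connection_string(
        conn_str=os.environ["MyStorageConnectionString"],
        queue_name=queue_name,
        # Assumption: the host expects base64-encoded message bodies
        message_encode_policy=TextBase64EncodePolicy(),
    )


def schedule_delete(queue_client, blob_path, delay_seconds=600):
    """Queue a message that stays invisible until the delay has passed."""
    return queue_client.send_message(blob_path, visibility_timeout=delay_seconds)
```

Creating the client once at module level, rather than on every request, keeps connections reused between invocations.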

upload hyperlink data to azure binding

I am trying to stream data from a hyperlink destination to Azure Storage. I have to do this via a binding since I want to run it from an Azure Function App.
file -- function.json:
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "anonymous",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
]
},
{
"type": "http",
"direction": "out",
"name": "$return"
},
{
"type": "blob",
"direction": "out",
"name": "outputBlob",
"path": "samples-workitems/{rand-guid}",
"connection": ""
}
]
}
file -- __init__.py:
import logging
import cdsapi
import azure.functions as func
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
def main(req: func.HttpRequest, outputBlob: func.Out[func.InputStream]) -> func.HttpResponse:
    logging.info('Python HTTP trigger function is about to process request.')
    try:
        source_blob = "http://www.africau.edu/images/default/sample.pdf"
        with open(source_blob, "rb") as data:
            print(data)
            outputBlob.set(data)
    except Exception as ex:
        logging.info(" error!", ex, "occurred.")
    return func.HttpResponse(
        "This HTTP triggered function executed successfully.",
        status_code=200
    )
I have tested the binding and it works. When I simply call outputBlob.set("sample string"), the data is written as it should be.
I am stuck on converting the data from the hyperlink to bytes (or a blob). When running the code above, I get the error Exception: TypeError: not all arguments converted during string formatting. Any help with converting this and uploading it to Azure Storage is appreciated.
The problem is that you were trying to read the file from a URL with open(source_blob, "rb"), which won't work because open is for local files only. I have changed your code as shown below, using the requests module to fetch the remote URL and set the response content on the blob.
import requests
source_url = "http://www.africau.edu/images/default/sample.pdf"
with requests.get(source_url, stream=True) as r:
    # Fail early on HTTP errors instead of storing an error page
    r.raise_for_status()
    outputBlob.set(r.content)
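As a side note, the TypeError quoted in the question most likely does not come from open itself but from the logging.info(" error!", ex, "occurred.") call in the except block, which passes extra arguments to a message that has no placeholders. The sketch below reproduces that with plain string formatting; format_like_logging is a made-up helper that only mimics the msg % args step performed by the logging module.

```python
def format_like_logging(msg, *args):
    """Mimic how the logging module builds the final message: msg % args."""
    return msg % args if args else msg


# Two extra arguments but no %s placeholders: formatting fails
try:
    format_like_logging(" error!", ValueError("boom"), "occurred.")
    failure = None
except TypeError as exc:
    failure = str(exc)

# One placeholder per argument: formatting succeeds
ok_message = format_like_logging("error %s occurred.", ValueError("boom"))
```

In the function itself, logging.exception("error %s occurred.", ex) would log the real exception together with its traceback.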
