I've been reading a lot around different ways to shutdown a Flask app but I don't get it how I could implement something for my use case.
I wrote and am testing a simple Flask app which takes a POST request to create some resources within Google Cloud. This Flask app is deployed into a container and is running on Cloud Run.
My question is, I want to shutdown the app right after a 200 response or would there be a way to handle one request per Cloud Run instance?
app = Flask(__name__)
#app.route('/', methods=['POST'])
def main():
#some validation on the request.json
try:
kick_off_terraform()
return ("Success", 200)
except Exception as e:
print(e)
After doing some research I found out I can control the concurrency on the GCP side, and that way I can allow only one request per instance on Cloud Run.
gcloud run deploy $SERVICE_NAME
--image gcr.io/$GCP_PROJECT_ID/$SERVICE_NAME:$CI_COMMIT_SHORT_SHA
--region=us-east1
--platform managed
--concurrency=1
Sadly hacks like --concurrency=1 or --max-instances=1 are not great because shutting down the server after a request may cause the request to fail. (When I did that in the past, requests failed.)
Based on your question I am guessing you might not have fully grasped Cloud Run runtime behavior. Please note that:
You don't need to "shut down" a container on Cloud Run. It automatically suspends once all requests finish, and you are not even charged for the idle time that occurs outside of a request.
Operations like kick_off_terraform() can't happen in the background (they have to finish before you return the response), because Cloud Run currently doesn't allocate CPU in the background.
What you need is something like "run to completion containers" and you may need to wait a bit for that to be supported by Cloud Run.
Related
Hi i have a flask application that Build as a docker image to serve as an api
this image is deployed to multiple environments (DEV/QA/PROD)
i want to use an applicationInsight for each environment
using a single application Insight works fine
here is a code snippet
app.config['APPINSIGHTS_INSTRUMENTATIONKEY'] = APPINSIGHTS_INSTRUMENTATIONKEY
appinsights = AppInsights(app)
#app.after_request
def after_request(response):
appinsights.flush()
return response
but to have multiple application i need to configure app.config with the key of the app insight
i thought of this solution which thourghs errors
here a snippet :
app = Flask(__name__)
def monitor(key):
app.config['APPINSIGHTS_INSTRUMENTATIONKEY'] = key
appinsights = AppInsights(app)
#app.after_request
def after_request(response):
appinsights.flush()
return response
#app.route("/")
def hello():
hostname = urlparse(request.base_url).hostname
print(hostname)
if hostname == "dev url":
print('Dev')
monitor('3ed57a90-********')
if hostname == "prod url":
print('prod')
monitor('941caeca-********-******')
return "hello"
this example contains the function monitor which reads the url and decide which app key to give so it can send metrics to the right place but apparently i can't do those processes after the request is sent (is there a way a config variable can be changed based on the url condition ?)
error Message :
AssertionError: The setup method 'errorhandler' can no longer be called on the application. It has already handled its first request, any changes will not be applied consistently. Make sure all imports, decorators, functions, etc. needed to set up the application are done before running it.
i hope someone can guide me to a better solution
thanks in advance
AFAIK, Normally Application Insights SDK collect the telemetry data, and it has sent to Azure by batch. So, you have to keep a single application insights resource for an application. Use staging for use different application insights for same application.
When the request started for the service till to complete his response the Application insights has taking care of the specific service life cycle. The application while start to end it will track the information. So that we can't use more than one Application Insights in single application.
When Application starts the AI start collecting telemetry data when Application stops then the AI stops gathering telemetry information. We were using Flush to even though in between application stops to send information to AI.
I have tried what you have used. It confirms the same in the log
I have tried with single application insights I can be able to collect all telemetry information.
I'm trying to write a script that uses #response.call_on_close on Google Cloud Run, to send an inmediate return to the invoker and do some processing after that, so that the invoker doesn't keep waitng.
The script involves the use of Selenium, and it works fine in a local Cloud Run, but when deployed on the actual Cloud I get a "devToolsActivePort file doesn't exist" error.
When I comment out all of the #response.call_on_close part and invoke it directly, it also works fine, so it has nothing to do with Selenium, there must be a problem with the decorator part but I can't figure it out.
This is the code I'm using to make the call:
from flask import Flask, request
from scraper import scrap
app = Flask(__name__)
#app.after_request
def response_processor(response):
request_json = request.get_json()
keyword = request_json['keyword']
tztimezone = request_json['tztimezone']
#response.call_on_close
def process_after_request():
scrap(keyword, topic, tztimezone)
return response
#app.route("/", methods=['GET', 'POST'])
def main():
if request.method != 'POST':
return 'Only POST requests are accepted', 405
return ''
Any help will be greatly appreciated.
Thanks!
You cannot perform CPU processing after you return a response to the client.
Google Cloud Run considers the service request is complete once you return a response. The CPU will be put to sleep for your container until the next request.
This link will help:
Lifecycle of a container on Cloud Run
Cloud Run just got a new feature: Always on cpu to disable the CPU throttling.
However, you want pays the processing time ONLY when the request is processed, you will pay for the instance until it is offloaded (about 15 minutes after the latest request received).
But, you have a CPU and memory cost discount (25% and 20%)
Be careful:
If your background job takes more than 15 minutes and the instance didn't receive new request, it will be offloaded and your job killed. This feature is designed to continue the process during several seconds after the request end.
Cloud Run team doesn't guaranty the 15 minutes (commonly observed) especially if the platform get an heavy demand on new Cloud Run instances.
In this case, you can combine this feature with the min instance feature. We can discuss further the different tradeoff of this design.
In short:
I have a Django application being served up by Apache on a Google Compute Engine VM.
I want to access a secret from Google Secret Manager in my Python code (when the Django app is initialising).
When I do 'python manage.py runserver', the secret is successfully retrieved. However, when I get Apache to run my application, it hangs when it sends a request to the secret manager.
Too much detail:
I followed the answer to this question GCP VM Instance is not able to access secrets from Secret Manager despite of appropriate Roles. I have created a service account (not the default), and have given it the 'cloud-platform' scope. I also gave it the 'Secret Manager Admin' role in the web console.
After initially running into trouble, I downloaded the a json key for the service account from the web console, and set the GOOGLE_APPLICATION_CREDENTIALS env-var to point to it.
When I run the django server directly on the VM, everything works fine. When I let Apache run the application, I can see from the logs that the service account credential json is loaded successfully.
However, when I make my first API call, via google.cloud.secretmanager.SecretManagerServiceClient.list_secret_versions , the application hangs. I don't even get a 500 error in my browser, just an eternal loading icon. I traced the execution as far as:
grpc._channel._UnaryUnaryMultiCallable._blocking, line 926 : 'call = self._channel.segregated_call(...'
It never gets past that line. I couldn't figure out where that call goes so I couldnt inspect it any further than that.
Thoughts
I don't understand GCP service accounts / API access very well. I can't understand why this difference is occurring between the django dev server and apache, given that they're both using the same service account credentials from json. I'm also surprised that the application just hangs in the google library rather than throwing an exception. There's even a timeout option when sending a request, but changing this doesn't make any difference.
I wonder if it's somehow related to the fact that I'm running the django server under my own account, but apache is using whatever user account it uses?
Update
I tried changing the user/group that apache runs as to match my own. No change.
I enabled logging for gRPC itself. There is a clear difference between when I run with apache vs the django dev server.
On Django:
secure_channel_create.cc:178] grpc_secure_channel_create(creds=0x17cfda0, target=secretmanager.googleapis.com:443, args=0x7fe254620f20, reserved=(nil))
init.cc:167] grpc_init(void)
client_channel.cc:1099] chand=0x2299b88: creating client_channel for channel stack 0x2299b18
...
timer_manager.cc:188] sleep for a 1001 milliseconds
...
client_channel.cc:1879] chand=0x2299b88 calld=0x229e440: created call
...
call.cc:1980] grpc_call_start_batch(call=0x229daa0, ops=0x20cfe70, nops=6, tag=0x7fe25463c680, reserved=(nil))
call.cc:1573] ops[0]: SEND_INITIAL_METADATA...
call.cc:1573] ops[1]: SEND_MESSAGE ptr=0x21f7a20
...
So, a channel is created, then a call is created, and then we see gRPC start to execute the operations for that call (as far as I read it).
On Apache:
secure_channel_create.cc:178] grpc_secure_channel_create(creds=0x7fd5bc850f70, target=secretmanager.googleapis.com:443, args=0x7fd583065c50, reserved=(nil))
init.cc:167] grpc_init(void)
client_channel.cc:1099] chand=0x7fd5bca91bb8: creating client_channel for channel stack 0x7fd5bca91b48
...
timer_manager.cc:188] sleep for a 1001 milliseconds
...
timer_manager.cc:188] sleep for a 1001 milliseconds
...
So, we a channel is created... and then nothing. No call, no operations. So the python code is sitting there waiting for gRPC to make this call, which it never does.
The problem appears to be that the forking behaviour of Apache breaks gRPC somehow. I couldn't nail down the precise cause, but after I began to suspect that forking was the issue, I found this old gRPC issue that indicates that forking is a bit of a tricky area.
I tried to reconfigure Apache to use a different 'Multi-processing Module', but as my experience in this is limited, I couldn't get gRPC to work under any of them.
In the end, I switched to using nginx/uwsgi instead of Apache/mod_wsgi, and I did not have the same issue. If you're trying to solve a problem like this and you have to use Apache, I'd advice further investigating Apache forking, how gRPC handles forking, and the different MPMs available for Apache.
I'm facing a similar issue. When running my Flask Application with eventlet==0.33.0 and gunicorn https://github.com/benoitc/gunicorn/archive/ff58e0c6da83d5520916bc4cc109a529258d76e1.zip#egg=gunicorn==20.1.0. When calling secret_client.access_secret_version it hangs forever.
It used to work fine with an older eventlet version, but we needed to upgrade to the latest version of eventlet due to security reasons.
I experienced a similar issue and I was able to solve with the following:
import grpc.experimental.gevent as grpc_gevent
from gevent import monkey
from google.cloud import secretmanager
monkey.patch_all()
grpc_gevent.init_gevent()
client = secretmanager.SecretManagerServiceClient()
I am running a flask application which has an option in the UI where the user can click a button and it calls an endpoint to perform some analysis.
The application is served as follows:
from waitress import serve
serve(app, host="0.0.0.0", port=5000)
After around ~1 minute, I am receiving a gateway timeout in the UI:
504 Gatway Time-out
However, the flask application keeps doing the work behind and after 2 minutes it completes the processing and I can see that it submits the data on the db side. So the process is not timing out itself.
I tried already passing channel_timeout argument to a much higher value (default seems 120 seconds) but with no luck. I know that this makes sense to implement in a different way where the user doesn't have to wait for these two minutes, but I am looking if there is such a timeout set by default and whether it can be increased.
The application is deployed in K8s and the UI is exposed via ingress. Could the timeout come from ingress instead?
The problem was with the ingress controller default timeout.
I managed to get around this issue by changing the implementation and having this job run as background task instead.
I've set up a Google Cloud BigTable cluster on my project. The main codebase for the project runs within a standard Python App Engine environment, which can't use the gcloud-python library because of the reliance on grpcio. To get around this, I've set up a Python App Engine Flexible Environment service within the same project and written a very simple Flask server to run on it, which I can then hit from my standard environment. The code looks something like this:
from gcloud import bigtable
app = Flask(__name__)
client = bigtable.Client(project=bigtable_config.PROJECT_ID, read_only=True)
cluster = client.cluster(bigtable_config.ZONE_ID, bigtable_config.CLUSTER_ID)
table = cluster.table(bigtable_config.TABLE_ID)
#app.route("/query/<start_key>/<end_key>")
def run_query(start_key, end_key):
if not client.is_started():
client.start()
row_data = table.read_rows(start_key=start_key, end_key=end_key)
row_data.consume_all()
// do some stuff to the row data here, get results
return jsonify(results)
I can run this code locally and it works great. I can deploy it to my service and it continues to work great. However, if the service sits idle for some period of time (I've typically noticed it after about an hour), then every request I make starts failing with this error:
NetworkError(code=StatusCode.UNAUTHENTICATED, details="Request had invalid authentication credentials.")
If I redeploy the service, it starts working again. I do not observe this behavior when I'm running the service locally.
What am I doing wrong? I'm assuming I'm making some mistake in my setup of the client, where it's not properly using the app engine credentials. Do I need to kill the client and restart it when I encounter this error?
This issue is being tracked at github.