this is my first question on stackoverflow and I'm new to programming:
What is the right way to load data into the GAE datastore when deploying my app? This should only happen once at deployment.
In other words: How can I call methods in my code, such that these methods are only called when I deploy my app?
The GAE documentation for python2.7 says, that one shouldn't call a main function, so I can't do this:
if __name__ == '__main__':
initialize_datastore()
main()
Create a handler that is restricted to admins only. When that handler is invoked with a simple GET request you could have it check to see if the seed data exists and if it doesn't, insert it.
Configuring a handler to require login or administrator status.
Another option is to write a Python script that utilizes the Remote API. This would allow you to access local data sources such as a CSV file or a locally hosted database and wouldn't require you to create a potentially unwieldy handler.
Read about the Remote API in the docs.
Using the Remote API Shell - Google App Engine
Related
In short:
I have a Django application being served up by Apache on a Google Compute Engine VM.
I want to access a secret from Google Secret Manager in my Python code (when the Django app is initialising).
When I do 'python manage.py runserver', the secret is successfully retrieved. However, when I get Apache to run my application, it hangs when it sends a request to the secret manager.
Too much detail:
I followed the answer to this question GCP VM Instance is not able to access secrets from Secret Manager despite of appropriate Roles. I have created a service account (not the default), and have given it the 'cloud-platform' scope. I also gave it the 'Secret Manager Admin' role in the web console.
After initially running into trouble, I downloaded the a json key for the service account from the web console, and set the GOOGLE_APPLICATION_CREDENTIALS env-var to point to it.
When I run the django server directly on the VM, everything works fine. When I let Apache run the application, I can see from the logs that the service account credential json is loaded successfully.
However, when I make my first API call, via google.cloud.secretmanager.SecretManagerServiceClient.list_secret_versions , the application hangs. I don't even get a 500 error in my browser, just an eternal loading icon. I traced the execution as far as:
grpc._channel._UnaryUnaryMultiCallable._blocking, line 926 : 'call = self._channel.segregated_call(...'
It never gets past that line. I couldn't figure out where that call goes so I couldnt inspect it any further than that.
Thoughts
I don't understand GCP service accounts / API access very well. I can't understand why this difference is occurring between the django dev server and apache, given that they're both using the same service account credentials from json. I'm also surprised that the application just hangs in the google library rather than throwing an exception. There's even a timeout option when sending a request, but changing this doesn't make any difference.
I wonder if it's somehow related to the fact that I'm running the django server under my own account, but apache is using whatever user account it uses?
Update
I tried changing the user/group that apache runs as to match my own. No change.
I enabled logging for gRPC itself. There is a clear difference between when I run with apache vs the django dev server.
On Django:
secure_channel_create.cc:178] grpc_secure_channel_create(creds=0x17cfda0, target=secretmanager.googleapis.com:443, args=0x7fe254620f20, reserved=(nil))
init.cc:167] grpc_init(void)
client_channel.cc:1099] chand=0x2299b88: creating client_channel for channel stack 0x2299b18
...
timer_manager.cc:188] sleep for a 1001 milliseconds
...
client_channel.cc:1879] chand=0x2299b88 calld=0x229e440: created call
...
call.cc:1980] grpc_call_start_batch(call=0x229daa0, ops=0x20cfe70, nops=6, tag=0x7fe25463c680, reserved=(nil))
call.cc:1573] ops[0]: SEND_INITIAL_METADATA...
call.cc:1573] ops[1]: SEND_MESSAGE ptr=0x21f7a20
...
So, a channel is created, then a call is created, and then we see gRPC start to execute the operations for that call (as far as I read it).
On Apache:
secure_channel_create.cc:178] grpc_secure_channel_create(creds=0x7fd5bc850f70, target=secretmanager.googleapis.com:443, args=0x7fd583065c50, reserved=(nil))
init.cc:167] grpc_init(void)
client_channel.cc:1099] chand=0x7fd5bca91bb8: creating client_channel for channel stack 0x7fd5bca91b48
...
timer_manager.cc:188] sleep for a 1001 milliseconds
...
timer_manager.cc:188] sleep for a 1001 milliseconds
...
So, we a channel is created... and then nothing. No call, no operations. So the python code is sitting there waiting for gRPC to make this call, which it never does.
The problem appears to be that the forking behaviour of Apache breaks gRPC somehow. I couldn't nail down the precise cause, but after I began to suspect that forking was the issue, I found this old gRPC issue that indicates that forking is a bit of a tricky area.
I tried to reconfigure Apache to use a different 'Multi-processing Module', but as my experience in this is limited, I couldn't get gRPC to work under any of them.
In the end, I switched to using nginx/uwsgi instead of Apache/mod_wsgi, and I did not have the same issue. If you're trying to solve a problem like this and you have to use Apache, I'd advice further investigating Apache forking, how gRPC handles forking, and the different MPMs available for Apache.
I'm facing a similar issue. When running my Flask Application with eventlet==0.33.0 and gunicorn https://github.com/benoitc/gunicorn/archive/ff58e0c6da83d5520916bc4cc109a529258d76e1.zip#egg=gunicorn==20.1.0. When calling secret_client.access_secret_version it hangs forever.
It used to work fine with an older eventlet version, but we needed to upgrade to the latest version of eventlet due to security reasons.
I experienced a similar issue and I was able to solve with the following:
import grpc.experimental.gevent as grpc_gevent
from gevent import monkey
from google.cloud import secretmanager
monkey.patch_all()
grpc_gevent.init_gevent()
client = secretmanager.SecretManagerServiceClient()
I am getting the error while deploying the Azure function from the local system.
I wen through some blogs and it is stating that my function is unable to connect with the Azure storage account which has the functions meta data.
Also, The function on the portal is showing the error as: Azure Functions runtime is unreachable
Earlier my function was running but after integrating the function with a Azure premium App service plan it has stooped working. My assumption is that my app service plan having some restriction for the inbound/outbound traffic rule and Due to this it is unable to establish the connection with the function's associated storage account.
Also, I would like to highlight that if a function is using the premium plan then we have to add few other configuration properties.
WEBSITE_CONTENTAZUREFILECONNECTIONSTRING = "DefaultEndpointsProtocol=https;AccountName=blob_container_storage_acc;AccountKey=dummy_value==;EndpointSuffix=core.windows.net"
WEBSITE_CONTENTSHARE = "my-function-name"
For the WEBSITE_CONTENTSHARE property I have added the function app name but I am not sure with the value.
Following is the Microsoft document reference for the function properties
Microsoft Function configuration properties link
Can you please help me to resolve the issue.
Note: I am using python for the Azure functions.
I have created a new function app with Premium plan and selected the interpreter as Python. When we select Python, OS will be automatically Linux.
Below is the message we get to create functions for Premium plan function App:
Your app is currently in read only mode because Elastic Premium on Linux requires running from a package.
PortalScreenshot
We need to create, deploy and run function apps from a package, refer to the documentation on how we can run functions from package.
Documentation
Make sure to add all your local.settings.json configurations to Application Settings in function app.
Not sure of what kind of Azure Function you are using but usually when there is a Storage Account associated, we need to specify the AzureWebJobsStorage field in the serviceDependencies.json file inside Properties folder. And when I had faced the same error, the cause was that while publishing the azure function from local, some settings from the local.settings.json were missing in the Application Settings of the app service under Configuration blade.
There can be few more things which you can recheck:
Does the storage account you are trying to use existing still or is deleted by any chance.
While publishing the application from local, using the web deploy method, the publish profile is correct or has any issues.
Disabling the function app and then stopping the app service before redeploying it.
Hope any of the above mentions help you diagnose and solve the issue.
The thing is that there is a difference in how the function deployed using Consumption vs Premium service plan.
Consumption - working out of the box.
Premium - need to add the WEBSITE_RUN_FROM_PACKAGE = 1 in the function Application settings. (see https://learn.microsoft.com/en-us/azure/azure-functions/run-functions-from-deployment-package for full details)
I would like to upgrade my app engine python version from Python 2 to Python 3. But in second generation app engine we cannot use login field in the handler in app.yaml to make certain pages in app engine only accessible to admin.
As per the guidelines Google suggests as follows: The login field is not supported. Use Cloud Identity and Access Management for user management.
I am not able to figure how can I use Identity and Access Management to control login access?
Are you trying to have admin only endpoints that can actually be used called by an admin user? Or are you trying to have admin only endpoints that are only meant to run cron jobs and/or enqueue tasks?
If it is the former (i.e. have pages/handlers that will actually be viewed by admin personnel), then the dcumentation here may be what you're looking for. Unfortunately, as I have noticed with app engine documentation, you may have to read pages upon pages of "theory" and never see sample code you can try to actually use. My guess however is that you will probably end up writing a decorator to check user authorization and authentication, found below.
If you are only trying to limit access to endpoints to secure running cron jobs and queueing tasks, then you are probably looking for this and this solution. Basically, you write a decorator to verify if the endpoint/handler is being called by a cron job or a task queue. Here's working code which should be good to run:
# main.py
from flask import Flask, request, redirect, render_template
app = Flask(__name__)
# Define the decorator to protect your end points
def validate_cron_header(protected_function):
def cron_header_validator_wrapper(*args, **kwargs):
# https://cloud.google.com/appengine/docs/standard/python3/scheduling-jobs-with-cron-yaml#validating_cron_requests
header = request.headers.get('X-Appengine-Cron')
# If you are validating a TASK request from a TASK QUEUE instead of a CRON request, then use 'X-Appengine-TaskName' instead of 'X-Appengine-Cron'
# example:
# header = request.headers.get('X-Appengine-TaskName')
# Other possible headers to check can be found here: https://cloud.google.com/tasks/docs/creating-appengine-handlers#reading_app_engine_task_request_headers
# If the header does not exist, then don't run the protected function
if not header:
# here you can raise an error, redirect to a page, etc.
return redirect("/")
# Run and return the protected function
return protected_function(*args, **kwargs)
# The line below is necessary to allow the use of the wrapper on multiple endpoints
# https://stackoverflow.com/a/42254713
cron_header_validator_wrapper.__name__ = protected_function.__name__
return cron_header_validator_wrapper
#app.route("/example/protected/handler")
#validate_cron_header
def a_protected_handler():
# Run your code here
your_response_or_error_etc = "text"
return your_response_or_error_etc
#app.route("/yet/another/example/protected/handler/<myvar>")
#validate_cron_header
def another_protected_handler(some_var=None):
# Run your code here
return render_template("my_sample_template", some_var=some_var)
Now you have to use Cloud IAM Client Librares in your code in order to provide access. You can find an example of how to use it in Python 3 here.
User authentication in most of Google Cloud Platform is very different from the Python 2 App Engine approach. You could fully implement user login in a variety of ways in your app to make this restriction, but you can't just set a property in a .yaml file to do this.
Or, try putting your admin functions all in a separate App Engine service, then use IAP (Identity-Aware Proxy) to restrict access to that service to desired users. This page should help. And here's a codelab I wrote that has detailed steps to protect an entire App Engine app, not just one service, with IAP.
Things have changed since this question was asked so providing an updated answer
As of March, 2022
You can still use login:required in app.yaml file and it will force a visitor to login with their gmail account.
Python 3 now supports users API (see announcement) which means that if a page is protected by login: required, you can now call is_current_user_admin() on the route handler to confirm it is your administrator
Even if you don't wish to use the users API, you can still get the details of the logged in user (for pages protected by login:required) by checking for any of the following headersX-Appengine-User-Email, X-Appengine-User-Id. You can refer to my response to this other SO question
This used to be done in the adminstrator console. Now, according to the docs, it's controlled by a setting in the application's configuration files.
I updated my app.yaml file to include these lines and redeployed it:
#
# Module Settings
# https://cloud.google.com/appengine/docs/python/modules/#Python_Configuration
#
module: default
instance_class: F2
However, I haven't noticed any improvement in my application's performance. Specifically, I have a (cpu-bound) script that was taking 4-5 secs to run and there has been no difference since the change.
So my question is: am I doing this correctly? And is there a way to confirm (for example, in the logs or elsewhere in the admin console) the level at which my application's servers are running?
I should note that I am testing this on an unbilled application. Although I couldn't find any information in the docs that indicated this feature was limited to billed applications, I know that some features are unavailable on unbilled apps.
The settings you have there look correct.
If you are using modules, and it looks like you are, you can confirm the frontend instance class is what you set it to by viewing the "Versions" page on the old app engine console at http://appengine.google.com/
If you aren't using modules you can view the instance type on the "Application Settings" page.
Unfortunately, there doesn't seem to be a way to check the frontend instance class using the new cloud console.
If you look under Instances in the application dashboard you can see which ones you currently have running.
I have a grok-based webapp that persists data using ZODB. Can I query the object db offline i.e. from a python script that would be run on the webserver hosting the grok/paste webapp instance?
And would there be any issues in doing so while the web server is interacting with the database simultaneously?
You can open the ZODB with python and inspect the data, yes. To do so while you also run the web site, you'll need to use a concurrency layer like ZEO or RelStorage; plain FileStorage does not support concurrent access otherwise.