How to deploy a flask application with a config file? - python

I have a flask application and I use a config file with some sensitive information. I was wondering how to deploy my application with the config file without releasing the sensitive information it holds.

TLDR; Create a class to hold your config secrets, store the actual secrets in environment variables on your host machine, and read in the environment variables in your app.
Detailed implementation below.
This is my folder structure:
api
|_ config
   |_ config.py
|_ app.py
Then inside of my app.py, which actually starts my Flask application, it looks roughly like this (I've excluded everything that doesn't matter).
import os

from flask import Flask

from config.config import config

def create_app(app_environment=None):
    if app_environment is None:
        app = Flask(__name__)
        app.config.from_object(config[os.getenv('FLASK_ENV', 'dev')])
    else:
        app = Flask(__name__)
        app.config.from_object(config[app_environment])
    return app

if __name__ == "__main__":
    app = create_app(os.getenv('FLASK_ENV', 'dev'))
    app.run()
This allows you to dynamically specify an app environment. For example, you can pass the app environment by setting an environment variable and reading it in before you call create_app(). This is extremely useful if you containerize your Flask app using Docker or some other virtualization tool.
Lastly, my config.py file looks like this. You would change the attributes in each of my environment configs to your secrets.
import os

class ProdConfig:
    # Production API configuration
    API_TOKEN = os.environ.get('PROD_MARKET_STACK_API_KEY_SECRET')

class DevConfig:
    # Development API configuration
    API_TOKEN = os.environ.get('API_KEY_SECRET')

class TestConfig:
    # Test API configuration
    API_TOKEN = os.environ.get('MARKET_STACK_API_KEY')

config = {
    'dev': DevConfig,
    'test': TestConfig,
    'prod': ProdConfig
}
Further, you would access your config secrets throughout any module in your Flask application via:
from flask import current_app

current_app.config['API_TOKEN']
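For instance, a minimal sketch of reading the token inside a request handler (the route below is purely illustrative, and it assumes the create_app() factory from app.py above):

from flask import current_app

from app import create_app

app = create_app('dev')

@app.route('/token-check')
def token_check():
    # Read the secret from the active app's config at request time
    token = current_app.config['API_TOKEN']
    return {'token_configured': token is not None}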

I believe the answer to your question may be more related to where your application is being deployed, rather than which web-framework you are using.
As far as I understand, it's bad practice to store/track sensitive information (passwords and API keys, for example) in your source files, and you should probably avoid that.
If you have already commited that sensitive data and you want to remove it completely from your git history, I recommend checking this GitHub page.
A couple of high level solutions could be:
Have your config file access environment variables instead of hard-coded values.
If you are using a cloud service such as Google Cloud Platform or AWS, you could use a secret manager to store your data and fetch it safely from your app (see the sketch after this list).
Another approach could be storing the information encrypted (maybe with something like KMS) and decrypting it when needed (my least favorite).
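As an illustration of the secret-manager route, a minimal sketch using AWS Secrets Manager via boto3 could look like this (the secret name and the key inside it are assumptions, not something from the question):

import json

import boto3

def get_api_token(secret_name='prod/market-stack/api-token', region='us-east-1'):
    # Fetch the secret from AWS Secrets Manager instead of keeping it in source control
    client = boto3.client('secretsmanager', region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    # Secrets are commonly stored as a JSON string of key/value pairs
    return json.loads(response['SecretString'])['API_TOKEN']

The returned value would then be assigned to API_TOKEN in the appropriate config class instead of being read from an environment variable.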

I have deployed my Flask web app API on Azure. I have a lot of config files, so I created a separate directory where I keep all of them. This is how my project directory looks:
configs
    -> app_config.json
    -> client_config.json
logs
    -> app_debug.log
    -> app_error.log
data
    -> some other data related files
app.py
app.py is my main Python file, from which I import all the config files; below is how I use them:
import json
import os

config_file = os.path.join(os.path.dirname(__file__), 'configs', 'app_config.json')

# Load the config data from the JSON config file
with open(config_file) as json_data:
    config_data = json.load(json_data)
After this I can easily use config_data anywhere in the code:
mongo_db = connect_mongodb(username=config_data['MongoUsername'], password=config_data['MongoPassword'], url=config_data['MongoDBURL'], port=config_data['Port'], authdbname=config_data['AuthDBName'])

Related

Can't serve swagger on remotely hosted server with different url

I'm trying to serve a simple service using Flask and flask_restx (a fork of flask-restplus) that will eventually be served on AWS.
When it is served, I want to generate a Swagger page so others can test it easily.
from flask import Flask
from flask_restx import Api

from my_service import service_namespace

app = Flask(__name__)
api = Api(app, version='1.0')
api.add_namespace(service_namespace)

if __name__ == '__main__':
    app.run(debug=True)
When I test it locally (e.g. localhost:5000), it works just fine. The problem is, when it is hosted on AWS, because it has a specific domain (gets redirected?) (e.g. my-company.com/chris-service to a container), the documentation page is unable to find its required static files, such as the CSS:
What I've looked and tried
Python (Flask + Swagger) Flasgger throwing 404 error
flask python creating swagger document error
404 error in Flask
Also tried adding Blueprint (albeit without knowing exactly what it does):
from flask import Blueprint, Flask
from flask_restx import Api

app = Flask(__name__)
blueprint = Blueprint(
    "api", __name__,
    root_path="/chris-service",
    # url_prefix="/chris-service",  # doesn't work
)
api = Api(blueprint)
app.register_blueprint(blueprint)
...
And still no luck.
Update
So here's more information as per the comments (pseudo, but technically identical)
The access point for the Swagger page is my-company.com/chris (with or without http:// or https:// doesn't make a difference)
When connecting to the above address, the request URL for the assets are my-company.com/swaggerui/swagger-ui.css
You can access the asset in my-company.com/chris/swaggerui/swagger-ui.css
So my attempted resolution (which didn't work) was to somehow change the root_path (not even sure if that's the correct wording), as shown in "What I've looked and tried" above.
I've spent about a week trying to solve this but can't find a way.
Any help would be greatly appreciated :) Thanks
The Swagger parameters are defined in the apidoc.py file, where the default apidoc object is also created. So if you want to customize it, you have to change it before the app and api are initialized.
In your case, url_prefix should be changed (I recommend using an environment variable so you can set url_prefix flexibly):
$ export URL_PREFIX='/chris'

from os import environ

from flask import Flask
from flask_restx import Api, apidoc

if (url_prefix := environ.get('URL_PREFIX', None)) is not None:
    apidoc.apidoc.url_prefix = url_prefix

app = Flask(__name__)
api = Api(app)
...

if __name__ == '__main__':
    app.run()
Always very frustrating when stuff is working locally but not when deployed to AWS. Reading this github issue, these 404 errors on swagger assets are probably caused by:
Missing javascript swagger packages
Probably not the case, since flask-restx handles this for you; and if this were the issue, running it locally would not work either.
Missing gunicorn settings
Make sure that you are also setting gunicorn up correctly with --forwarded-allow-ips if you are deploying with it (you should be). If you are in a Kubernetes cluster you can set this to *:
https://docs.gunicorn.org/en/stable/settings.html#forwarded-allow-ips
According to this post, you also have to explicitly set
settings.FLASK_SERVER_NAME to something like http://ec2-10-221-200-56.us-west-2.compute.amazonaws.com:5000
If that does not work, try to deploy a flask-restx example; that should definitely work and rules out any errors on your end.
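For reference, a minimal flask-restx app of the kind you could deploy as a sanity check might look like this (the namespace and route names are illustrative):

from flask import Flask
from flask_restx import Api, Namespace, Resource

app = Flask(__name__)
api = Api(app, version='1.0')

ns = Namespace('ping', description='Simple health check')

@ns.route('/')
class Ping(Resource):
    def get(self):
        # If the Swagger UI renders and this endpoint responds, the deployment itself is fine
        return {'status': 'ok'}

api.add_namespace(ns)

if __name__ == '__main__':
    app.run(debug=True)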

How to connect to the remote Google Datastore from localhost using dev_appserver.py?

I simply need an efficient way to debug a GAE application, and to do so I need to connect to the production GAE infrastructure from localhost when running dev_appserver.py.
The following code works well if I run it as a separate script:
import argparse

try:
    import dev_appserver
    dev_appserver.fix_sys_path()
except ImportError:
    print('Please make sure the App Engine SDK is in your PYTHONPATH.')
    raise

from google.appengine.ext import ndb
from google.appengine.ext.remote_api import remote_api_stub

def main(project_id):
    server_name = '{}.appspot.com'.format(project_id)
    remote_api_stub.ConfigureRemoteApiForOAuth(
        app_id='s~' + project_id,
        path='/_ah/remote_api',
        servername=server_name)

    # List the first 10 keys in the datastore.
    keys = ndb.Query().fetch(10, keys_only=True)
    for key in keys:
        print(key)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument('project_id', help='Your Project ID.')
    args = parser.parse_args()
    main(args.project_id)
With this script, I was able to get data from the remote Datastore. But where do I need to put the same code in my application (which is obviously not a single script) to make it work?
I've tried to put the remote_api_stub.ConfigureRemoteApiForOAuth() code in appengine_config.py, but I got a recursion error.
I'm running the app like this:
dev_appserver.py app.yaml --admin_port=8001 --enable_console --support_datastore_emulator=no --log_level=info
The application uses NDB to access Google Datastore.
The application contains many modules and files, and I simply don't know where to put the remote_api_stub auth code.
I hope somebody from the Google team sees this topic, because I've searched the whole internet without any results. It's unbelievable how many people are developing apps for the GAE platform, yet it looks like nobody is developing/debugging apps locally.

No api proxy found for service "app_identity_service" when running GAE script

I'm trying to run a custom script to upload static files to a bucket.
import os
import sys

sys.path.append("/tools/google_appengine")

from google.appengine.ext import vendor
from google.appengine.api import app_identity

vendor.add('../libraries')

import cloudstorage as gcs

STATIC_DIR = '../dashboard/dist'

def main():
    bucket_path = ''.join('/' + app_identity.get_default_gcs_bucket_name())
What I've been trying so far:
- initialize the stubs manually:
def initialize_service_apis():
    from google.appengine.tools import dev_appserver
    from google.appengine.tools.dev_appserver_main import ParseArguments

    args, option_dict = ParseArguments(sys.argv)  # Otherwise the option_dict isn't populated.
    dev_appserver.SetupStubs('local', **option_dict)
(taken from https://blairconrad.wordpress.com/2010/02/20/automated-testing-using-app-engine-service-apis-and-a-memcaching-memoizer/)
But this gives me an import error when importing the dev_appserver lib.
Is there any way to resolve this issue?
I need this script for an automatic deployment process.
The No api proxy found for service <blah> error messages typically indicate attempts to use GAE standard env infrastructure (packages under google.appengine in your case) inside standalone scripts, which is not OK. See GAE: AssertionError: No api proxy found for service "datastore_v3".
You have 2 options:
keep the code but make it execute inside a GAE app (as a request handler, for example), not as a standalone script
drop the GAE libraries and switch to libraries designed to be used from standalone scripts. In your case you're looking for the Cloud Storage Client Libraries (a sketch follows below). You may also need to adjust access control on the respective GAE app bucket.
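For the second option, a rough sketch using the standalone google-cloud-storage client library could look like this (the bucket name is an assumption, and credentials are expected to come from the environment):

import os

from google.cloud import storage

STATIC_DIR = '../dashboard/dist'
BUCKET_NAME = 'your-app-id.appspot.com'  # assumption: the app's default GCS bucket

def upload_static_files():
    client = storage.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS from the environment
    bucket = client.bucket(BUCKET_NAME)
    for root, _, files in os.walk(STATIC_DIR):
        for name in files:
            local_path = os.path.join(root, name)
            blob_name = os.path.relpath(local_path, STATIC_DIR)
            bucket.blob(blob_name).upload_from_filename(local_path)

if __name__ == '__main__':
    upload_static_files()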
I'm not familiar with dev_appserver.SetupStubs(), but I received this same error message while running unit tests in a testbed. In that environment, you have to explicitly enable stubs for any services you wish to test (see the docs).
In particular, initializing the app identity stub solved my problem:
from google.appengine.ext import testbed

t = testbed.Testbed()
t.activate()  # the testbed must be activated before initializing any stubs
t.init_app_identity_stub()

Local filesystem as a remote storage in Django

I use Amazon S3 as a part of my webservice. The workflow is the following:
User uploads lots of files to web server. Web server first stores them locally and then uploads to S3 asynchronously
User sends http-request to initiate job (which is some processing of these uploaded files)
Web service asks worker to do the job
Worker does the job and uploads result to S3
User requests the download link from web-server, somedbrecord.result_file.url is returned
User downloads result using this link
To work with files I use the QueuedStorage backend. I initialize my FileFields like this:
user_uploaded_file = models.FileField(..., storage=queued_s3storage, ...)
result_file = models.FileField(..., storage=queued_s3storage, ...)
Where queued_s3storage is an object of class derived from ...backends.QueuedStorage and remote field is set to '...backends.s3boto.S3BotoStorage'.
Now I'm planning to deploy the whole system to one machine and run everything locally, so I want to replace this '...backends.s3boto.S3BotoStorage' with something based on my local filesystem.
The first workaround was to use FakeS3, which can "emulate" S3 locally. It works, but it is not ideal; it's just extra, unnecessary overhead.
I have an Nginx server running and serving static files from particular directories. How do I create my "remote storage" class that actually stores files locally, but provides download links that point to files served by Nginx (something like http://myip:80/filedir/file1)? Is there a standard library class for that in Django?
The default storage backend for media files is local storage.
Your settings.py defines these two settings:
MEDIA_ROOT (link to docs) -- this is the absolute path to the local file storage folder
MEDIA_URL (link to docs) -- this is the webserver HTTP path (e.g. '/media/' or '//%s/media' % HOSTNAME)
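For example, a minimal local-development sketch of those two settings (the concrete paths are assumptions) might be:

# settings.py
import os

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

MEDIA_ROOT = os.path.join(BASE_DIR, 'media')  # absolute path on disk where uploads land
MEDIA_URL = '/media/'  # URL prefix under which the webserver exposes those files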
These are used by the default storage backend to save media files. From Django's default/global settings.py:
# Default file storage mechanism that holds media.
DEFAULT_FILE_STORAGE = 'django.core.files.storage.FileSystemStorage'
This configured default storage is used in FileFields for which no storage kwarg is provided. It can also be accessed like so: from django.core.files.storage import default_storage.
So if you want to vary the storage for local development and production use, you can do something like this:
# file_storages.py
from django.conf import settings
from django.core.files.storage import default_storage
from whatever.backends.s3boto import S3BotoStorage

app_storage = None
if settings.DEBUG:
    app_storage = default_storage
else:
    app_storage = S3BotoStorage()
And in your models:
# models.py
from file_storages import app_storage
# ...
result_file = models.FileField(..., storage=app_storage, ...)
Lastly, you want nginx to serve the files in MEDIA_ROOT directly. Just make sure that the nginx location matches the path in MEDIA_URL.
I'm planning to deploy the whole system on one machine to run everything locally
Stop using QueuedStorage then, because "[QueuedStorage] enables having a local and a remote storage backend" and you've just said you don't want a remote.
Just use FileSystemStorage and configure nginx to serve settings.MEDIA_ROOT at the MEDIA_URL location (a sketch follows below).
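A minimal sketch of that local-only setup (the model is hypothetical, named here only for illustration):

from django.core.files.storage import FileSystemStorage
from django.db import models

# Falls back to MEDIA_ROOT / MEDIA_URL from settings.py
local_storage = FileSystemStorage()

class JobResult(models.Model):
    # result_file.url resolves to MEDIA_URL plus the file's relative path,
    # which nginx can serve straight from MEDIA_ROOT
    result_file = models.FileField(upload_to='results/', storage=local_storage)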

Amazon CloudFront distribution_id as a credential with Boto

I'm new to Python and Boto, but I've managed to sort out file uploads from my server to S3.
Now, once I've uploaded a new file, I want to issue an invalidation request.
I've got the code to do that:
import boto
print 'Connecting to CloudFront'
cf = boto.connect_cloudfront()
cf.create_invalidation_request(aws_distribution_id, ['/testkey'])
But I'm getting an error: NameError: name 'aws_distribution_id' is not defined
I guessed that I could add the distribution id to the ~/.boto config, like the aws_secret_access_key etc:
$ cat ~/.boto
[Credentials]
aws_access_key_id = ACCESS-KEY-ID-GOES-HERE
aws_secret_access_key = ACCESS-KEY-SECRET-GOES-HERE
aws_distribution_id = DISTRIBUTION-ID-GOES-HERE
But that's not actually listed in the docs, so I'm not too surprised it failed:
http://docs.pythonboto.org/en/latest/boto_config_tut.html
My problem is I don't want to add the distribution_id to the script, as I run it on both my live and staging servers, and I have different S3 and CloudFront setups for both.
So I need the distribution_id to change per server, which is how I've got the AWS access keys set.
Can I add something else to the boto config, or is there a Python user defaults file I could add it to?
Since you can have multiple cloudfront distributions per account, it wouldn't make sense to configure it in .boto.
You could have another config file specific to your own environment and run your invalidation script with that config file as an argument (or have the same file, but with different data depending on your environment).
I solved this by using the ConfigParser. I added the following to the top of my script:
import ConfigParser
import os

# Read the conf file (expand '~' so the path resolves to the user's home directory)
config = ConfigParser.ConfigParser()
config.read(os.path.expanduser('~/my-app.cnf'))
distribution_id = config.get('aws_cloudfront', 'distribution_id')
And inside the conf file at ~/my-app.cnf:
[aws_cloudfront]
distribution_id = DISTRIBUTION_ID
So on my live server I just need to drop the cnf file into the user's home dir and change the distribution_id.
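Putting the two pieces together, the invalidation call from the question would then use the parsed value (a sketch, assuming the same boto setup as above):

import boto

# distribution_id comes from the ConfigParser snippet above
cf = boto.connect_cloudfront()
cf.create_invalidation_request(distribution_id, ['/testkey'])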
