How to set boto3 connect timeout and read timeout using environment variables? - python

I'm dealing with a large existing python application and trying to set global defaults for all the boto3 resources and clients that don't have any specified. Since there are many of them, I don't want to update every place the resources are created to create a botocore Config object; so it seems to make sense to use the environment variable approach to configuration. I want to set 4 configurations related to timeout and retry, but of the 4, the documentation only indicates that 2 of them can be set via environment variables. Same for configuring using a config file.
botocore.config.Config supports connect_timeout, read_timeout, and a retries option that takes mode and max_attempts.
But the environment variables only cover AWS_MAX_ATTEMPTS and AWS_RETRY_MODE (at least according to the documentation). How can I set connect_timeout and read_timeout through environment variables?

I don't think this is possible at the moment.
It seems that the environment variable names are all explicitly defined in botocore's config provider.
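One possible workaround, sketched here under assumptions, is to read the timeouts from environment variables of your own and turn them into keyword arguments for botocore.config.Config. The variable names MY_APP_BOTO_CONNECT_TIMEOUT and MY_APP_BOTO_READ_TIMEOUT and the helper name are hypothetical; botocore itself does not recognize them, so each client or resource still has to be given the resulting Config.

```python
import os


def timeout_config_kwargs(environ=os.environ):
    """Build Config keyword arguments from application-defined variables.

    The variable names below are hypothetical; botocore does not read them.
    """
    mapping = (
        ("connect_timeout", "MY_APP_BOTO_CONNECT_TIMEOUT"),
        ("read_timeout", "MY_APP_BOTO_READ_TIMEOUT"),
    )
    kwargs = {}
    for option, variable in mapping:
        raw = environ.get(variable)
        if raw:  # skip unset or empty variables so botocore keeps its defaults
            kwargs[option] = float(raw)
    return kwargs


# Possible usage (requires boto3 and botocore to be installed):
#   import boto3
#   from botocore.config import Config
#   client = boto3.client("s3", config=Config(**timeout_config_kwargs()))
```

This keeps the values out of the code, but it does not remove the need to pass a Config wherever a client or resource is created.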

Related

Learning Python fast: how can I protect some private connections from being exposed?

Hi I'm new to the community and new to Python, experienced but rusty on other high level languages, so my question is simple.
I made a simple script to connect to a private ftp server, and retrieve daily information from it.
from ftplib import FTP

# Open an FTP connection to the server and retrieve the inventory file
def FTPconnection(file_name):
    ftp = FTP('ftp.serveriuse.com')
    ftp.login('mylogin', 'password')
    # List the files in the current directory
    print("Current File List:")
    ftp.dir()  # dir() prints the listing itself and returns None
    # Get the latest csv file from the server
    # ftp.cwd("/pub")
    gfile = open(file_name, "wb")
    ftp.retrbinary('RETR ' + file_name, gfile.write)
    gfile.close()
    ftp.quit()

FTPconnection('test1.csv')
FTPconnection('test2.csv')
That's the whole script. It passes my credentials and then calls the function FTPconnection on the two files I'm retrieving.
My other script, which processes the files, has an import statement, because I tried to use this script as a module. All the import does is connect to the FTP server and fetch the information.
import ftpconnect as ftpc
This is in the other Python script, the one that does the processing.
It works, but I want to improve it, so I need some guidance on best practices, because in Spyder 4.1.5 I get a 'Module ftpconnect called but unused' warning, so I am probably missing something here. I'm developing on macOS using Anaconda and Python 3.8.5.
I'm trying to build an app to automate some tasks, but I couldn't find anything about modules that guided me to better code; everything simply says you have to import whatever .py file name you used and that will be considered a module.
My final question: how do you normally protect private information (FTP credentials) from being exposed? This is not about protecting my code, only the credentials.
There are a few options for storing passwords and other secrets that a Python program needs to use, particularly a program that needs to run in the background where it can't just ask the user to type in the password.
Problems to avoid:
Checking the password in to source control where other developers or even the public can see it.
Other users on the same server reading the password from a configuration file or source code.
Having the password in a source file where others can see it over your shoulder while you are editing it.
Option 1: SSH
This isn't always an option, but it's probably the best. Your private key is never transmitted over the network; SSH just runs mathematical calculations to prove that you have the right key.
In order to make it work, you need the following:
The database or whatever you are accessing needs to be accessible by SSH. Try searching for "SSH" plus whatever service you are accessing. For example, "ssh postgresql". If this isn't a feature on your database, move on to the next option.
Create an account to run the service that will make calls to the database, and generate an SSH key.
Either add the public key to the service you're going to call, or create a local account on that server, and install the public key there.
Option 2: Environment Variables
This one is the simplest, so it might be a good place to start. It's described well in the Twelve Factor App. The basic idea is that your source code just pulls the password or other secrets from environment variables, and then you configure those environment variables on each system where you run the program. It might also be a nice touch if you use default values that will work for most developers. You have to balance that against making your software "secure by default".
Here's an example that pulls the server, user name, and password from environment variables.
import os
server = os.getenv('MY_APP_DB_SERVER', 'localhost')
user = os.getenv('MY_APP_DB_USER', 'myapp')
password = os.getenv('MY_APP_DB_PASSWORD', '')
db_connect(server, user, password)
Look up how to set environment variables in your operating system, and consider running the service under its own account. That way you don't have sensitive data in environment variables when you run programs in your own account. When you do set up those environment variables, take extra care that other users can't read them. Check file permissions, for example. Of course any users with root permission will be able to read them, but that can't be helped. If you're using systemd, look at the service unit, and be careful to use EnvironmentFile instead of Environment for any secrets. Environment values can be viewed by any user with systemctl show.
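Applied to the FTP script from the question, the same idea might look like the sketch below. The variable names MY_APP_FTP_HOST, MY_APP_FTP_USER and MY_APP_FTP_PASSWORD and the helper name are assumptions made for illustration. Unlike the example above, this version refuses to run when a secret is missing rather than falling back to an empty password, which leans toward the "secure by default" side of the trade-off.

```python
import os


def load_ftp_credentials(environ=os.environ):
    """Return (host, user, password) read from hypothetical environment variables."""
    required = ("MY_APP_FTP_USER", "MY_APP_FTP_PASSWORD")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        # Fail early instead of attempting a login with blank credentials.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    host = environ.get("MY_APP_FTP_HOST", "localhost")
    return host, environ["MY_APP_FTP_USER"], environ["MY_APP_FTP_PASSWORD"]


# Possible usage inside the download function:
#   from ftplib import FTP
#   host, user, password = load_ftp_credentials()
#   ftp = FTP(host)
#   ftp.login(user, password)
```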
Option 3: Configuration Files
This is very similar to the environment variables, but you read the secrets from a text file. I still find the environment variables more flexible for things like deployment tools and continuous integration servers. If you decide to use a configuration file, Python supports several formats in the standard library, like JSON, INI, netrc, and XML. You can also find external packages like PyYAML and TOML. Personally, I find JSON and YAML the simplest to use, and YAML allows comments.
Three things to consider with configuration files:
Where is the file? Maybe a default location like ~/.my_app, and a command-line option to use a different location.
Make sure other users can't read the file.
Obviously, don't commit the configuration file to source control. You might want to commit a template that users can copy to their home directory.
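The first two considerations can be sketched roughly as follows, assuming a JSON file and a POSIX system. The default location ~/.my_app comes from the text above; the helper name and the exact permission rule (no access at all for group or others) are assumptions.

```python
import json
import os
import stat


def load_config(path=None):
    """Load a JSON config file, refusing files that other users can access."""
    path = path or os.path.expanduser("~/.my_app")
    mode = os.stat(path).st_mode
    # Any permission bit set for group or others means the file is too open.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(
            f"{path} is accessible by other users; try: chmod 600 {path}"
        )
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A command-line option for a different location could simply pass its value as path.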
Option 4: Python Module
Some projects just put their secrets right into a Python module.
# settings.py
db_server = 'dbhost1'
db_user = 'my_app'
db_password = 'correcthorsebatterystaple'
Then import that module to get the values.
# my_app.py
from settings import db_server, db_user, db_password
db_connect(db_server, db_user, db_password)
One project that uses this technique is Django. Obviously, you shouldn't commit settings.py to source control, although you might want to commit a file called settings_template.py that users can copy and modify.
I see a few problems with this technique:
Developers might accidentally commit the file to source control. Adding it to .gitignore reduces that risk.
Some of your code is not under source control. If you're disciplined and only put strings and numbers in here, that won't be a problem. If you start writing logging filter classes in here, stop!
If your project already uses this technique, it's easy to transition to environment variables. Just move all the setting values to environment variables, and change the Python module to read from those environment variables.

Best way to define Django global variables on Apache startup

I have some configuration in a JSON file and in the database, and I want to load that configuration on Django startup (Apache server startup). I will be using those global variables throughout the application.
For example: an external server connection API or the number of instances.
What is the best way to define the global variables? I want to load the JSON file when the server starts and use the variable values until the server stops.
It sounds like the thing you're probably looking for is environment variables - you can always use a small script to set the environment variables from the JSON that you have at present.
Setting these in your .bashrc file or, more preferably a virtualenv will let you:
Take sensitive settings, like SECRET_KEY out of version control.
Offer database settings, either by supplying them as a DB URL or as separate environment variables.
Set both Django settings and other useful variables outside of the immediate Django project.
The django-environ docs have a useful tutorial on how to set it up. The Django Cookie-Cutter project makes extensive use of Environment Variables (including DB and mailserver settings), and is a great place to pick up hints and approaches.
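The "small script" idea mentioned above could look something like this sketch. It assumes the JSON file holds a single top-level object; the MY_APP_ prefix, the upper-casing of keys and the function name are illustrative choices, not part of Django or django-environ.

```python
import json
import os


def export_json_to_env(path, prefix="MY_APP_", environ=os.environ):
    """Copy top-level JSON keys into environment variables (hypothetical naming)."""
    with open(path, encoding="utf-8") as handle:
        settings = json.load(handle)
    for key, value in settings.items():
        name = prefix + key.upper()
        # Environment values must be strings; keep other types as JSON text.
        text = value if isinstance(value, str) else json.dumps(value)
        # setdefault leaves variables that are already set untouched.
        environ.setdefault(name, text)
    return environ
```

Changes to os.environ only affect the current process and its children, so a call like this would have to run early in that process, for example at the top of the settings module or the WSGI entry point.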

Session bus initialization

I'm trying to use D-Bus to control another application. When using Python bindings, it is possible to use D-Bus just with dbus.SessionBus().
However, other applications require the environment variables DBUS_SESSION_BUS_ADDRESS and DBUS_SESSION_BUS_PID to be set first; otherwise they report that the name "was not provided by any .service files".
My question is, why is it necessary for some applications to set up the environment variables? Is there a standard procedure to initialize the session bus in some situations?
Just a guess: the Python client might be able to use X11 to discover the session bus address (in addition to using the DBUS_SESSION_BUS_ADDRESS environment variable). It is stored in the _DBUS_SESSION_BUS_ADDRESS property of the _DBUS_SESSION_BUS_SELECTION_[hostname]_[uuid] selection owner window (uuid is the content of /var/lib/dbus/machine-id).

Postgres: is set_config() / current_setting() a private/robust stack for application variables?

In my application I have triggers that need access to things like user id. I am storing that information with
set_config('PRIVATE.' || 'user_id', '221', false)
then, while I am doing operations that modify the database, triggers may do:
user_id = current_setting('PRIVATE.user_id');
it seems to work great. My database actions are mostly from python, psycopg2, once I get a connection I'll do the set_config() as my first operation, then go about my database business. Is this practice a good one or could data leak from one session to another? I was doing this sort of thing with the SD and GD variables in plpython, but that language proved too heavy for what I was trying to do so I had to shift to plpgsql.
While it's not really what they're designed for, you can use GUCs as session variables.
They can also be transaction scoped, with SET LOCAL or the set_config equivalent.
So long as you don't allow the user to run arbitrary SQL they're a reasonable choice, and session-local GUCs aren't shared with other sessions. They're not designed for secure session-local storage but they're handy places to stash things like an application's "current user" if you're not using SET ROLE or SET SESSION AUTHORIZATION for that.
Do be aware that the user can define them via environment variables if you let them run a libpq based client, e.g.
$ PGOPTIONS="-c myapp.user_id=fred" psql -c "SHOW myapp.user_id;"
myapp.user_id
---------------
fred
(1 row)
Also, on older PostgreSQL versions you had to declare the namespace in postgresql.conf before you could use it.
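From Python, the session-scoped and transaction-scoped variants described above might be wrapped as in this sketch. It assumes a psycopg2-style cursor (the %s parameter style); the helper names are hypothetical and the myapp prefix mirrors the psql example. The third argument of set_config is is_local: true limits the value to the current transaction, false keeps it for the session.

```python
def set_app_variable(cursor, name, value, transaction_only=False):
    """Store a value in a custom GUC through set_config()."""
    cursor.execute(
        "SELECT set_config(%s, %s, %s)",
        (f"myapp.{name}", str(value), transaction_only),
    )


def get_app_variable(cursor, name):
    """Read the value back with current_setting()."""
    cursor.execute("SELECT current_setting(%s)", (f"myapp.{name}",))
    return cursor.fetchone()[0]


# Possible usage right after opening a connection:
#   with conn.cursor() as cur:
#       set_app_variable(cur, "user_id", 221)
```

Passing the setting name and value as query parameters avoids building the SQL string by hand.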

Do any python tools exist for rolling over log/configuration files?

Here's what I'm trying to do:
modify a default configuration file on a local machine (Thing.conf).
save the previous config file on a virtual client to something like Thing.conf.1, and keep track of the previous 10 or so conf files. (Thing.conf.2, Thing.conf.3, etc.)
push that configuration file to the remote virtual client (/etc/thing/Thing.conf).
to be clear -- step 2 is the crux of the problem here, step 1 and 3 are just for context.
The Python logging framework has a RotatingFileHandler, which also allows you to force a rollover with RotatingFileHandler.doRollover(). I'm not sure if that's what you're after, though. It'll allow you to roll over the log (config?) files on the virtual client where the logging is presumably being done.
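A minimal sketch of that idea, assuming it runs on the machine where Thing.conf lives: the handler is used purely for its renaming logic and never logs anything. The function name and the keep parameter are made up for illustration; delay=True is used so the handler does not create or reopen the file itself.

```python
import os
from logging.handlers import RotatingFileHandler


def rotate_and_write(path, new_text, keep=10):
    """Shift path to path.1, path.1 to path.2, and so on, then write new_text to path."""
    # delay=True: the handler opens no stream, so doRollover() only renames files.
    handler = RotatingFileHandler(path, backupCount=keep, delay=True)
    try:
        if os.path.exists(path):
            handler.doRollover()
    finally:
        handler.close()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
```

Backups beyond keep are dropped, so with keep=10 the oldest surviving copy is Thing.conf.10.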
