I'm trying to set up a Django/DRF application on Elastic Beanstalk, and for whatever reason, Django just can't see the desired environment variables. When I log in, I can see them just fine, by using
$ eb ssh
$ cat /opt/python/current/env
I can also see them, except the relatively sensitive ones involving RDS, simply by using $ eb printenv.
All of that appears to be set up and working properly. However, Django reads the environment immediately on startup, and it appears that the environment variables just aren't set yet. I've experimented with simply inserting print(os.environ) in settings.py, and when I do, I see a whole bunch of environment variables I don't need (e.g. 'SUPERVISOR_GROUP_NAME': 'httpd'), and none of the ones I've set myself, like DJ_SECRET_KEY.
I've since changed the code to report the absence of specific environment variables when it loads the settings, and from a recent run, it generated this:
[Wed Nov 23 15:56:38.164153 2016] [:error] [pid 15708] DJ_SECRET_KEY not in environment; falling back to hardcoded value.
[Wed Nov 23 15:56:38.189717 2016] [:error] [pid 15708] RDS_DB_NAME not in environment; falling back to sqlite
[Wed Nov 23 15:56:38.189751 2016] [:error] [pid 15708] AWS_STORAGE_BUCKET_NAME not in environment; falling back to local static storage.
Again, those variables are set in the EB configuration, and they show up in every other reporting tool EB gives me. They just aren't set in time for Django to read them when it launches and reads settings.py.
This looks pretty close to this issue, but it's not really the same: I know how to see / load the environment variables into the shell when ssh-ing into the eb instance; they're just not showing up when I need them to for the actual project.
This is almost exactly the issue I'm having, but the accepted-correct answer makes no sense to me, and the top-voted answer doesn't apply; those files are already in git.
How should I configure things so that Django can see the environment variables?
Given that EB stores all these environment variables in a canonical location as a bash script, I ended up simply having bash source the script and updating os.environ from the parsed results.
I created the file get_eb_env.py in parallel to my settings.py. Its main contents:
import os
import subprocess

ENV_PATH = '/opt/python/current/env'

def patch_environment(path=ENV_PATH):
    "Patch the current environment, os.environ, with the contents of the specified environment file."
    # mostly pulled from a very useful snippet: http://stackoverflow.com/a/3505826/504550
    command = ['bash', '-c', 'source {path} && env'.format(path=path)]
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, universal_newlines=True)
    proc_stdout, _ = proc.communicate(timeout=5)
    # proc_stdout is just a big string, not a file-like object;
    # we can't iterate directly over its lines.
    for line in proc_stdout.splitlines():
        (key, _, value) = line.partition("=")
        os.environ[key] = value
Then, I just import and call patch_environment() near the head of my settings.py.
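Since the answer only describes the call site, here is a self-contained sketch of the same approach that can be run anywhere; the throwaway tempfile stands in for /opt/python/current/env, and DJ_SECRET_KEY is just the example variable from the question:

```python
import os
import subprocess
import tempfile

def patch_environment(path):
    # Same approach as get_eb_env.py above: have bash source the env
    # file, dump the resulting environment, and copy it into os.environ.
    command = ['bash', '-c', 'source {path} && env'.format(path=path)]
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            universal_newlines=True)
    proc_stdout, _ = proc.communicate(timeout=5)
    for line in proc_stdout.splitlines():
        key, _, value = line.partition('=')
        os.environ[key] = value

# Demonstrate against a throwaway env file standing in for
# /opt/python/current/env.
with tempfile.NamedTemporaryFile('w', suffix='.env', delete=False) as f:
    f.write('export DJ_SECRET_KEY=example-secret\n')

patch_environment(f.name)
print(os.environ['DJ_SECRET_KEY'])  # example-secret
```

On the actual instance you would drop the tempfile part and call patch_environment() with the default path at the top of settings.py, before any os.environ lookups.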
I solved the problem by modifying the key DJANGO_AWS_SECRET_ACCESS_KEY, because the generated key contained a backtick (`), which was interpreted as a command.
This is not exactly what you are asking for, but I hope this solves your issue anyway.
I am using a lot of environment variables in settings. I think Elastic Beanstalk sets everything up once before it runs container commands and such; the variables are not available at that point, and that is why your logging shows them as empty. But do you really need the variables at that point?
You can put whatever local development settings you need in local_settings.py and keep that file out of version control.
We use them like this.
if 'DB_HOST' in os.environ:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': 'ebdb',
            'USER': 'ebroot',
            'PASSWORD': 'ebpassword',
            'HOST': os.environ['DB_HOST'],
            'PORT': '3306',
        }
    }

try:
    from local_settings import *
except ImportError:
    pass
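The same pattern maps directly onto the RDS variables from the question; a sketch (the RDS_* names are the ones Elastic Beanstalk conventionally provides, and the SQLite fallback mirrors the question's log message):

```python
import os

# Use RDS when Elastic Beanstalk provides the variables,
# otherwise fall back to a local SQLite file.
if 'RDS_DB_NAME' in os.environ:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': os.environ['RDS_DB_NAME'],
            'USER': os.environ['RDS_USERNAME'],
            'PASSWORD': os.environ['RDS_PASSWORD'],
            'HOST': os.environ['RDS_HOSTNAME'],
            'PORT': os.environ['RDS_PORT'],
        }
    }
else:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': 'db.sqlite3',
        }
    }
```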
They are also available when you run container_commands:
container_commands:
  01_do_something:
    command: "do_something_script_${PARAM1}.sh"
Related
I installed trac using BitNami the other day, and after restarting my computer I'm not able to get it running as a service today. I see this error in the error log:
[Fri Dec 02 08:52:40.565865 2016] [:error] [pid 4052:tid 968] C:\Bitnami\trac-1.0.13-0\python\lib\site-packages\setuptools-7.0-py2.7.egg\pkg_resources.py:1045: UserWarning: C:\WINDOWS\system32\config\systemprofile\AppData\Roaming\Python-Eggs is writable by group/others and vulnerable to attack when used with get_resource_filename. Consider a more secure location (set with .set_extraction_path or the PYTHON_EGG_CACHE environment variable).
Everyone's suggestion is to move the PYTHON_EGG_CACHE folder path to C:\egg or to suppress the warning at the command line.
I've already set PYTHON_EGG_CACHE for the system, in trac's setenv.bat file, and in the trac.wsgi file, but it's not picking up the changes when I try to start the service.
I also can't change the permissions on the folder in Roaming using chmod as I would on Linux, and I can't remove any more of the existing permissions on that folder (myself, Administrators, System), as Corporate IT doesn't allow Administrators to be removed, and that isn't an unreasonable policy.
I found out that another service was running on port 8080, which I had set trac up on, and that was causing the trouble. The error in the logs did not point to that as the issue, however.
I am taking over a django project which another developer maintained. The service runs on an Ubuntu machine. ZEO is used for content caching, Ajax/Dajax for asynchronous content, Celery for task management, and Django for the project itself.
The service is usually reached via a specific IP address, which limits access to specific URLs: http://my_server_ip. Without my knowingly changing anything, this stopped working. Instead of taking me to the splash page, entering the IP would hang, unsuccessfully trying to connect. I don't get a 404, 500 or any other error; it just sits and continually tries to load, as if waiting to connect or to receive content.
I attempted to restart the service in the hope that this would solve the problem; it did not. I performed a system reboot and ran the following commands, per the prior developer's documentation, to restart the server.
From within the django project:
runzeo -a localhost:8090 -f /path/to/operations_cache.fs
su project_owner
python manage.py celery worker -n multiprocessing_worker --loglevel=debug -Q multiprocessing_queue
python manage.py celery worker --concurrency=500 --pool=eventlet --loglevel=debug -Q celery -n eventlet_worker
The two celery commands had to be run as the owner of the project directory.
Finally, I ran sudo service apache2 restart. Upon completion, I tried to navigate to the webpage but got the same behavior: it hung on connecting. The trac pages do work at http://my_server_ip/trac.
The following is all I have found in the apache log files.
error.log
[Fri Feb 06 16:01:11 2015] [error] /usr/lib/python2.7/dist-packages/configobj.py:145: DeprecationWarning: The compiler package is deprecated and removed in Python 3.x.
[Fri Feb 06 16:01:11 2015] [error] import compiler
[Fri Feb 06 16:01:11 2015] [error]
access.log
my.ip - user [06/Feb/2015:15:55:40 -0500] "GET / HTTP/1.1" 500 632 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:35.0 Gecko/20100101 Firefox/35.0"
I have tried looking into the django logs, but nothing appears there. Perhaps I am not finding the correct logs.
As a starting point, where can I find out what the system is hanging on? How do I determine whether it is an issue with django, apache, the interaction between the two, etc.? That would help me zero in on what specifically is happening.
Edit
I was able to resolve my problem, though I cannot say for sure what fixed it. I suspect the issue had to do with permissions on the static files folder. I serve my content through the www-data user. After going through the steps described above, I ran yes yes | python manage.py collectstatic as the user www-data. The server was able to restart, and I was able to access the trac content as well as the django content.
I am hesitant to post this as an answer because I do not know with certainty whether my solution described here is the step which solved the problem.
I have a Python CGI script that runs perfectly on my local Apache 2.2.22 server. It outputs the correct result and so on. But when I try to execute it on my virtual hosting, I get a 500 error.
I really have no idea why it does not work.
Apache error log looks like:
[Wed Jul 12 16:06:54 2013] [error] [client 89.223.235.12] Premature end of script headers: /home/u67442/rrrrr.com/cgi-bin/test.py
[Wed Jul 12 16:09:31 2013] [error] [client 89.223.235.12] Premature end of script headers: /home/u67442/rrrrr.com/cgi-bin/test.py
I've already tried following things:
I'm sure that the path to the interpreter is correct: #!/usr/local/bin/python. Another CGI script works fine with this path.
I have set chmod 755 to this script.
I have set end-of-line characters in UNIX-format.
I use correct HTTP-header: print "Content-type:text/html\n\n"
Output section of script:
print "Content-type:text/html\n\n"
print "<html>"
print "<head>"
print "<title>Results</title>"
print "<head><h2 align=center>Results</h2></head>"
print "</head>"
print '<body bgcolor="#e8f3d8">'
print "<hr>"
print "<b>Result = %s </b>" % str(round(total_sum, 5))
print "</body>"
print "</html>"
The funny thing is that another VERY similar script with the same path, header, EOL, output and so on works perfectly on both the local server and the virtual hosting. And it's very strange that this script works fine on my local apache but crashes with a 500 internal server error on the virtual web hosting. I really don't know what to do. Technical support says the problem is in my script.
I have only one idea: a timeout waiting for output. Data processing in my script takes about 15-25 seconds.
What can you advise?
Check your error_log.
If you can, run the script from the command line of the virtual host.
If you can, su - webserveruser and do it again.
Are you trying to import a module that is not present on the server?
Does the webserver process have permission to fetch the data that ends up in total_sum?
To narrow down the problem, try hard-coding a value for total_sum and commenting out the code that fetches data and computes it. Does the rest work then?
Does the virtual host run the same version of python as your local server? If not, check that your code works in both versions.
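Following the suggestion above to hard-code total_sum, a stripped-down version of the script might look like this (parenthesised print keeps it valid under both Python 2 and 3); if this version works on the host, the problem is in the data-processing code:

```python
#!/usr/local/bin/python
# Minimal CGI check: total_sum is hard-coded so that only the
# header and output path are exercised, with no data processing.
total_sum = 1.23456

body = "<html><body><b>Result = %s</b></body></html>" % str(round(total_sum, 5))
print("Content-type:text/html\n")
print(body)
```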
Problem solved. The problem was in the script: I replaced the function that extracts data from the *.dat files in one of the 'while' loops.
Anyway, I still do not understand why it worked on the local server and didn't work on the virtual hosting.
I have a Python application that is set up using the new New Relic configuration variables in the dotcloud.yml file, which works fine.
However, I want to run a sandbox instance as a test/staging environment, so I want to be able to set the environment of the newrelic agent so that it uses a different configuration section of the ini file. My dotcloud.yml is set up as follows:
www:
  type: python
  config:
    python_version: 'v2.7'
    enable_newrelic: True
  environment:
    NEW_RELIC_LICENSE_KEY: *****************************************
    NEW_RELIC_APP_NAME: Application Name
    NEW_RELIC_LOG: /var/log/supervisor/newrelic.log
    NEW_RELIC_LOG_LEVEL: info
    NEW_RELIC_CONFIG_FILE: /home/dotcloud/current/newrelic.ini
I have custom environment variables so that the sandbox is set to "test" and the live application is set to "production".
I am then calling the following in my wsgi.py:
NEWRELIC_CONFIG = os.environ.get('NEW_RELIC_CONFIG_FILE')
ENVIRONMENT = os.environ.get('MY_ENVIRONMENT', 'test')
newrelic.agent.initialize(NEWRELIC_CONFIG, ENVIRONMENT)
However the dotcloud instance is already enabling newrelic because I get this in the uwsgi.log file:
Sun Nov 18 18:50:12 2012 - unable to load app 0 (mountpoint='') (callable not found or import error)
Traceback (most recent call last):
  File "/home/dotcloud/current/wsgi.py", line 15, in <module>
    newrelic.agent.initialize(NEWRELIC_CONFIG, ENVIRONMENT)
  File "/opt/ve/2.7/local/lib/python2.7/site-packages/newrelic-1.8.0.13/newrelic/config.py", line 1414, in initialize
    log_file, log_level)
  File "/opt/ve/2.7/local/lib/python2.7/site-packages/newrelic-1.8.0.13/newrelic/config.py", line 340, in _load_configuration
    'environment "%s".' % (_config_file, _environment))
newrelic.api.exceptions.ConfigurationError: Configuration has already been done against differing configuration file or environment. Prior configuration file used was "/home/dotcloud/current/newrelic.ini" and environment "None".
So it would seem that the newrelic agent is being initialised before uwsgi.py is called.
So my question is:
Is there a way to initialise the newrelic environment?
The easiest way to do this, without changing any code, would be the following.
Create a new sandbox app on dotCloud (see http://docs.dotcloud.com/0.9/guides/flavors/ for more information about creating apps in sandbox mode)
$ dotcloud create -f sandbox <app_name>
Deploy your code to the new sandbox app.
$ dotcloud push
Now you should have the same code running in both your live and sandbox apps. But because you want to change some of the ENV variables for the sandbox app, you need to do one more step.
According to this page http://docs.dotcloud.com/0.9/guides/environment/#adding-environment-variables there are 2 different ways of adding ENV variables.
Using the dotcloud.yml's environment section.
Using the dotcloud env cli command
Whereas dotcloud.yml allows you to define different environment variables for each service, dotcloud env set environment variables for the whole application. Moreover, environment variables set with dotcloud env supersede environment variables defined in dotcloud.yml.
That means that if we want to have different values for our sandbox app, we just need to run a dotcloud env command to set those variables on the sandbox app, which will override the ones in your dotcloud.yml
If we just want to change one variable, we would run this command.
$ dotcloud env set NEW_RELIC_APP_NAME='Test Application Name'
If we want to update more than one at a time, we would do the following.
$ dotcloud env set \
'NEW_RELIC_APP_NAME="Test Application Name"' \
'NEW_RELIC_LOG_LEVEL=debug'
To make sure that you have set your env variables correctly, you can run the following command.
$ dotcloud env list
Notes
The commands above use the new dotCloud 0.9.x CLI; if you are using the older one, you will need to either upgrade to the new one or refer to the documentation for the old CLI: http://docs.dotcloud.com/0.4/guides/environment/
When you set your environment variables, your application is restarted so that the variables can be applied; to limit your downtime, set all of them in one command.
Unless they are doing something odd, you should be able to override the app_name supplied by the agent configuration file by doing:
import newrelic.agent
newrelic.agent.global_settings().app_name = 'Test Application Name'
Don't call newrelic.agent.initialize() a second time.
This will only work if app_name is listing a single application to report data to.
Is there a way to have mod_wsgi reload all modules (maybe in a particular directory) on each load?
While working on the code, it's very annoying to restart apache every time something is changed. The only option I've found so far is to put modname = reload(modname) below every import... but that's also really annoying, since it means I'll have to go through and remove them all at a later date.
The link:
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
should be emphasised. It should also be emphasised that on UNIX systems the daemon mode of mod_wsgi must be used, and you must implement the code monitor described in the documentation. The whole-process reloading option will not work for embedded mode of mod_wsgi on UNIX systems. Even though on Windows systems the only option is embedded mode, it is possible, through a bit of trickery, to do the same thing by triggering an internal restart of Apache from the code monitoring script. This is also described in the documentation.
The following solution is aimed at Linux users only, and has been tested to work under Ubuntu Server 12.04.1
To run WSGI under daemon mode, you need to specify WSGIProcessGroup and WSGIDaemonProcess directives in your Apache configuration file, for example
WSGIProcessGroup my_wsgi_process
WSGIDaemonProcess my_wsgi_process threads=15
More details are available in http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives
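In context, a minimal VirtualHost using daemon mode might look like the following (server name and filesystem paths here are illustrative):

```apache
<VirtualHost *:80>
    ServerName example.com

    # One daemon process group dedicated to this site
    WSGIDaemonProcess my_wsgi_process threads=15
    WSGIProcessGroup my_wsgi_process

    WSGIScriptAlias / /var/www/my_django_site/my_django_site/wsgi.py
</VirtualHost>
```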
An added bonus is extra stability if you are running multiple WSGI sites under the same server, potentially with VirtualHost directives. Without daemon processes, I found two Django sites conflicting with each other and alternately turning up 500 Internal Server Errors.
At this point, your server is in fact already monitoring your WSGI site for changes, though it only watches the file you specified using WSGIScriptAlias, like
WSGIScriptAlias / /var/www/my_django_site/my_django_site/wsgi.py
This means that you can force the WSGI daemon process to reload by changing the WSGI script. Of course, you don't have to change its contents, but rather,
$ touch /var/www/my_django_site/my_django_site/wsgi.py
would do the trick.
By utilizing the method above, you can automatically reload a WSGI site in production environment without restarting/reloading the entire Apache server, or modifying your WSGI script to do production-unsafe code change monitoring.
This is particularly useful when you have automated deploy scripts, and don't want to restart the Apache server on deployment.
During development, you may use a filesystem changes watcher to invoke touch wsgi.py every time a module under your site changes, for example, pywatch
The mod_wsgi documentation on code reloading is your best bet for an answer.
I know it's an old thread but this might help someone. To kill your process when any file in a certain directory is written to, you can use something like this:
monitor.py
import os, sys, time, signal, threading, atexit
import inotify.adapters

def _monitor(path):
    i = inotify.adapters.InotifyTree(path)
    print "monitoring", path
    while 1:
        for event in i.event_gen():
            if event is not None:
                (header, type_names, watch_path, filename) = event
                if 'IN_CLOSE_WRITE' in type_names:
                    prefix = 'monitor (pid=%d):' % os.getpid()
                    print "%s %s/%s changed," % (prefix, path, filename), 'restarting!'
                    os.kill(os.getpid(), signal.SIGKILL)

def start(path):
    t = threading.Thread(target=_monitor, args=(path,))
    t.setDaemon(True)
    t.start()
    print 'Started change monitor. (pid=%d)' % os.getpid()
In your server startup, call it like:
server.py
import monitor
monitor.start(<directory which contains your wsgi files>)
if your main server file is in the directory which contains all your files, you can go like:
monitor.start(os.path.dirname(__file__))
Adding other folders is left as an exercise...
You'll need to 'pip install inotify'
This was cribbed from the code here: https://code.google.com/archive/p/modwsgi/wikis/ReloadingSourceCode.wiki#Restarting_Daemon_Processes
This is an answer to my duplicate question here: WSGI process reload modules