Why are my Django / uWSGI vassal logs empty? - python

I am running my Django site as a vassal of the uWSGI Emperor. I have created /etc/uwsgi-emperor/vassals/mysite.ini as follows:
[uwsgi]
socket = /var/opt/mysite/uwsgi.sock
chmod-socket = 775
chdir = /opt/mysite
master = true
virtualenv = /opt/mysite_virtualenv
env = DJANGO_SETTINGS_MODULE=mysite.settings
module = mysite.wsgi:application
uid = www-data
gid = www-data
processes = 1
threads = 1
plugins = python3,logfile
logger = file:/var/log/uwsgi/app/mysite.log
vacuum = true
But /var/log/uwsgi/app/mysite.log does not get created. If I touch it, it remains empty. This occurs even after I trigger 500-style errors in the application.
Why aren't my logs being written?

The vassal does not have permission to write to the file (or to create it in the first place). You should:
cd /var/log/uwsgi/app
touch mysite.log # create the file
chown www-data:www-data mysite.log # give the vassal permission
(where www-data:www-data matches the uid and gid values in your ini file).
Logs will start appearing shortly.
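If the vassal also lacks permission to create files in /var/log/uwsgi/app in the first place, another option (assuming the same paths and user as above) is to give www-data ownership of the directory itself, e.g. chown www-data:www-data /var/log/uwsgi/app, so that uWSGI can create the log file on its own.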

Related

Celery: The module was not found

I am using Open Semantic Search (OSS) and I would like to monitor its processes using the Flower tool. Regarding the workers that Celery needs, the OSS website states:
The workers will do tasks like analysis and indexing of the queued files. The workers are implemented by etl/tasks.py and will be started automatically on boot by the service opensemanticsearch.
This tasks.py file looks as follows:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
#
# Queue tasks for batch processing and parallel processing
#

import time

# Queue handler
from celery import Celery

# ETL connectors
from etl import ETL
from etl_delete import Delete
from etl_file import Connector_File
from etl_web import Connector_Web
from etl_rss import Connector_RSS

verbose = True
quiet = False

app = Celery('etl.tasks')
app.conf.CELERYD_MAX_TASKS_PER_CHILD = 1

etl_delete = Delete()
etl_web = Connector_Web()
etl_rss = Connector_RSS()

#
# Delete document with URI from index
#

@app.task(name='etl.delete')
def delete(uri):
    etl_delete.delete(uri=uri)

#
# Index a file
#

@app.task(name='etl.index_file')
def index_file(filename, wait=0, config=None):
    if wait:
        time.sleep(wait)

    etl_file = Connector_File()

    if config:
        etl_file.config = config

    etl_file.index(filename=filename)

#
# Index file directory
#

@app.task(name='etl.index_filedirectory')
def index_filedirectory(filename):
    from etl_filedirectory import Connector_Filedirectory

    connector_filedirectory = Connector_Filedirectory()
    result = connector_filedirectory.index(filename)

    return result

#
# Index a webpage
#

@app.task(name='etl.index_web')
def index_web(uri, wait=0, downloaded_file=False, downloaded_headers=[]):
    if wait:
        time.sleep(wait)

    result = etl_web.index(uri, downloaded_file=downloaded_file, downloaded_headers=downloaded_headers)

    return result

#
# Index full website
#

@app.task(name='etl.index_web_crawl')
def index_web_crawl(uri, crawler_type="PATH"):
    import etl_web_crawl

    result = etl_web_crawl.index(uri, crawler_type)

    return result

#
# Index webpages from sitemap
#

@app.task(name='etl.index_sitemap')
def index_sitemap(uri):
    from etl_sitemap import Connector_Sitemap

    connector_sitemap = Connector_Sitemap()
    result = connector_sitemap.index(uri)

    return result

#
# Index RSS Feed
#

@app.task(name='etl.index_rss')
def index_rss(uri):
    result = etl_rss.index(uri)

    return result

#
# Enrich with / run plugins
#

@app.task(name='etl.enrich')
def enrich(plugins, uri, wait=0):
    if wait:
        time.sleep(wait)

    etl = ETL()
    etl.read_configfile('/etc/opensemanticsearch/etl')
    etl.read_configfile('/etc/opensemanticsearch/enhancer-rdf')

    etl.config['plugins'] = plugins.split(',')

    filename = uri

    # if it exists, delete the protocol prefix file://
    if filename.startswith("file://"):
        filename = filename.replace("file://", '', 1)

    parameters = etl.config.copy()
    parameters['id'] = uri
    parameters['filename'] = filename

    parameters, data = etl.process(parameters=parameters, data={})

    return data

#
# Read command line arguments and start
#

# if running (not imported to use its functions), run main function
if __name__ == "__main__":

    from optparse import OptionParser

    parser = OptionParser("etl-tasks [options]")

    parser.add_option("-q", "--quiet", dest="quiet", action="store_true", default=False, help="Don't print status (filenames) while indexing")
    parser.add_option("-v", "--verbose", dest="verbose", action="store_true", default=False, help="Print debug messages")

    (options, args) = parser.parse_args()

    if options.verbose == False or options.verbose == True:
        verbose = options.verbose
        etl_delete.verbose = options.verbose
        etl_web.verbose = options.verbose
        etl_rss.verbose = options.verbose

    if options.quiet == False or options.quiet == True:
        quiet = options.quiet

    app.worker_main()
I read multiple tutorials about Celery, and from my understanding this command should do the job:
celery -A etl.tasks flower
but it doesn't. The result is the statement
Error: Unable to load celery application. The module etl was not found.
Same for
celery -A etl.tasks worker --loglevel=debug
so Celery itself seems to be causing the trouble, not Flower. I also tried e.g. celery -A etl.index_filedirectory worker --loglevel=debug, but with the same result.
What am I missing? Do I have to somehow tell Celery where to find etl.tasks? Online research doesn't really turn up a similar case; most of the "module not found" errors seem to occur while importing things. So possibly it's a silly question, but I couldn't find a solution anywhere. I hope you guys can help me. Unfortunately, I won't be able to respond until Monday, sorry in advance.
I had the same issue. I installed and configured my queue as follows, and it works.
Install RabbitMQ
MacOS
brew install rabbitmq
sudo vim ~/.bash_profile
In bash_profile add the following line:
PATH=$PATH:/usr/local/sbin
Then reload bash_profile:
source ~/.bash_profile
Linux
sudo apt-get install rabbitmq-server
Configure RabbitMQ
Launch the queue:
sudo rabbitmq-server
In another Terminal, configure the queue:
sudo rabbitmqctl add_user myuser mypassword
sudo rabbitmqctl add_vhost myvhost
sudo rabbitmqctl set_user_tags myuser mytag
sudo rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"
Launch Celery
I would suggest going into the folder that contains task.py and using the following command:
celery -A task worker -l info -Q celery --concurrency 5
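If you go this route, the Celery app in your tasks.py would point at the broker you just configured; roughly like this (a sketch that reuses the placeholder credentials and vhost from the rabbitmqctl commands above):
from celery import Celery

# amqp://<user>:<password>@<host>/<vhost> is the standard RabbitMQ broker URL format
app = Celery('etl.tasks', broker='amqp://myuser:mypassword@localhost/myvhost')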
Beware that this error can mean two things:
The module is missing.
The module exists but cannot be loaded, e.g. because it contains errors such as a SyntaxError.
To check that it's not the latter, run:
python -c "import <myModuleContainingTasksDotPyFile>"
In the context of this question:
python -c "import etl"
If it crashes, fix this first (unlike with celery, you'll get a detailed error message).
The solutions above did not work for me.
I had the same issue, and my problem was that in the main celery.py (which was in my SmartCalend folder) I had:
app = Celery('proj')
but instead I had to write:
app = Celery('SmartCalend')
where SmartCalend is the actual app name where celery.py belongs (!). Not any random word, but precisely the app name. That's barely mentioned anywhere, only in the official docs.
Try export PYTHONPATH=<parent directory>, where parent directory is the folder that contains etl. Run the Celery worker and see if it fixes your problem. This is probably one of the most common Celery "issues" (not really Celery, but Python in general). Alternatively, run the Celery worker from that folder.
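A quick way to check what that does from Python itself; the /opt/oss path below is only a placeholder for wherever the etl package actually lives:
import sys

sys.path.insert(0, "/opt/oss")   # same effect as export PYTHONPATH=/opt/oss
import etl.tasks                 # should now import without a ModuleNotFoundError
print(etl.tasks.app)             # the Celery app instance that -A etl.tasks refers to
If the import fails here, Celery will fail in the same way, so fix the path (or the errors inside the module) first.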
Answer for MacOS Catalina:
When you install celery with pip (pip install celery), Python can import celery, but you are not able to launch celery from the terminal because the terminal does not know about the celery executable.
Add celery to the path to fix:
nano ~/.bash_profile
In the file add: export PATH="/Users/gavinbelson/Library/Python/2.7/bin:$PATH"
To save the file in the nano editor: ctrl+o, then enter, then ctrl+x
To update the terminal with your change type: source ~/.bash_profile
Now you should be able to type celery in the terminal window
---- Note: this is for the default python command, which runs version 2.7. If you are using python3, you would need to alter the path variable accordingly.
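If changing PATH is not an option, running Celery through the interpreter should also work, e.g. python -m celery -A etl.tasks worker, since that only requires the celery package to be importable rather than the celery executable being on PATH.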

Cron job with django application

I would like to use a cron task in order to delete media files if the condition is True.
Users generate export files that are stored in the Media folder. In order to clean up export files in the background, I have a cron task that loops over each file and checks whether the expiry delay has passed.
I used the django-cron library.
Example:
File in Media Folder : Final_Products___2019-04-01_17:50:43.487845.xlsx
My cron task looks like this:
import datetime
import os
from datetime import timedelta

from django.conf import settings
from django_cron import CronJobBase, Schedule


class MyCronExportJob(CronJobBase):
    """ Cron job which removes expired files at 18:30 """
    RUN_AT_TIMES = ['18:30']
    schedule = Schedule(run_at_times=RUN_AT_TIMES)
    code = 'app.export_cron_job'

    def do(self):
        now = datetime.datetime.now()
        media_folder = os.listdir(os.path.join(settings.MEDIA_ROOT, 'exports'))

        for files in media_folder:
            file = os.path.splitext(files.split(settings.EXPORT_TITLE_SEPARATOR, 1)[1])[0]
            if datetime.datetime.strptime(file, '%Y-%m-%d_%H:%M:%S.%f') + timedelta(minutes=settings.EXPORT_TOKEN_DELAY) < now:
                os.remove(os.path.join(os.path.join(settings.MEDIA_ROOT, 'exports'), files))

# settings.EXPORT_TOKEN_DELAY = 60 * 24
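For reference, this is what the parsing does to the example filename above, assuming settings.EXPORT_TITLE_SEPARATOR is '___' (the separator value is not shown in the question, so that part is a guess):
import datetime
import os

name = 'Final_Products___2019-04-01_17:50:43.487845.xlsx'
stamp = os.path.splitext(name.split('___', 1)[1])[0]      # '2019-04-01_17:50:43.487845'
created = datetime.datetime.strptime(stamp, '%Y-%m-%d_%H:%M:%S.%f')
expires = created + datetime.timedelta(minutes=60 * 24)   # EXPORT_TOKEN_DELAY of one day
print(expires)                                            # the file should be removed once now() is past this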
I edited my crontab (crontab -e):
30 18 * * * source /home/user/Bureau/Projets/app/venv/bin/activate.csh && python /home/user/Bureau/Projets/app/src/manage.py runcrons --force app.cron.MyCronExportJob
Then I launched service cron restart
But nothing has changed. My file is still there, even though it should be removed because its date plus settings.EXPORT_TOKEN_DELAY is already in the past.
I'm using Ubuntu for local dev and FreeBSD as the production server environment.
EDIT:
I tried a few things, but crontab still doesn't work for the moment.
1) * * * * * /bin/date >> /home/user/Bureau/Projets/app/cron_output
==> It works, so crontab works
2) I ran : python manage.py runcrons in my console
==> It works
3) I ran this script (cron.sh):
source /home/user/.bashrc
cd /home/user/Bureau/Projets/app
pyenv activate app
python src/manage.py runcrons --force
deactivate
==> It works
4) I ran this crontab line :
35 10 * * * /home/user/Bureau/Projets/app/utility/cron.sh
==> Service restarted at 10:32, I waited until 10:38: nothing!

uWSGI stats using uwsgitop and socket showing JSONDecodeError

I am not getting uWSGI stats using uwsgitop and the socket. I have configured uWSGI stats with a socket, and when I try to get the stats with the command:
uwsgitop /var/www/uwsgi/proj.socket
it throws the error
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I am using uwsgi version 2.0.17.1.
Here is my uwsgi ini file
[uwsgi]
# Multi Thread Support
enable-threads = true
# Django-related settings
# the base directory (full path)
chdir = /home/user/base-dir/proj-path/
# Django's wsgi file
module = proj.wsgi
# the virtualenv (full path)
home = /home/user/base-path/
# process-related settings
# master
master = true
# maximum number of worker processes
processes = 10
socket = /var/www/uwsgi/proj.socket
# ... with appropriate permissions - may be needed
chmod-socket = 666
# clear environment on exit
vacuum = true
daemonize = /var/www/uwsgi/uwsgi.log
pidfile = /var/www/uwsgi/uwsgi_hub.pid
logto = /var/log/proj_uwsgi%n.log
uid = user
gid = user
http-auto-gzip = true
memory-report = True
py-tracebacker=/var/www/uwsgi/proj.socket
--stats /var/www/uwsgi/proj.socket
uwsgitop needs to be pointed at the uWSGI stats server rather than at the application socket; the application socket speaks the binary uwsgi protocol and returns no JSON, which is most likely why you get the JSONDecodeError at char 0. Also note that inside an ini file options are written without the leading dashes, so the --stats line at the end of your config is probably not being applied. I think you should have something like this in your config file:
socket = /var/www/uwsgi/proj.socket
stats = /var/www/uwsgi/stats.socket
And run uwsgitop on the stats socket, like so:
uwsgitop /var/www/uwsgi/stats.socket
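If you want to sanity-check the stats socket without uwsgitop, the stats server simply writes a JSON document to whoever connects; a minimal sketch, assuming the stats path from the answer above:
import json
import socket

s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
s.connect('/var/www/uwsgi/stats.socket')

data = b''
while True:
    chunk = s.recv(4096)
    if not chunk:
        break
    data += chunk
s.close()

print(json.loads(data.decode('utf-8')))   # parsed uWSGI stats (workers, requests, ...)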

running python-daemon as non-priviliged user and keeping group-memberships

I'm writing a daemon in Python using the python-daemon package. The daemon is started at boot time (init.d) and needs to access various devices.
The daemon is to run on an embedded system (BeagleBone) running Ubuntu.
Now my problem is that I want to run the daemon as an unprivileged user (e.g. mydaemon) rather than as root.
In order to allow the daemon to access the devices, I added that user to the required groups.
In the Python code I use daemon.DaemonContext(uid=uidofmydaemon).
The process started by root daemonizes nicely and is owned by the correct user, but I get permission denied errors when trying to access the devices.
I wrote a small test application, and it seems that the process does not inherit the group memberships of the user.
#!/usr/bin/python
import logging, daemon, os

if __name__ == '__main__':
    lh = logging.StreamHandler()
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    logger.addHandler(lh)

    uid = 1001  ## UID of the daemon user

    with daemon.DaemonContext(uid=uid,
                              files_preserve=[lh.stream],
                              stderr=lh.stream):
        logger.warn("UID : %s" % str(os.getuid()))
        logger.warn("groups: %s" % str(os.getgroups()))
When I run the above code as the user with uid=1001, I get something like:
$ ./testdaemon.py
UID: 1001
groups: [29,107,1001]
whereas when I run the above code as root (or su), I get:
$ sudo ./testdaemon.py
UID: 1001
groups: [0]
How can I create a daemon process that is started by root but runs with a different effective uid and intact group memberships?
My current solution involves dropping root privileges before starting the actual daemon, using the --chuid argument of start-stop-daemon:
start-stop-daemon \
--start \
--chuid daemonuser \
--name testdaemon \
--pidfile /var/run/testdaemon/test.pid \
--startas /tmp/testdaemon.py \
-- \
--pidfile /var/run/testdaemon/test.pid \
--logfile=/var/log/testdaemon/testdaemon.log
The drawback of this solution is that I need to create all directories the daemon needs to write to (notably /var/run/testdaemon and /var/log/testdaemon), with the proper permissions, before starting the actual daemon.
I would have preferred to write that logic in Python rather than in bash.
For now that works, but methinks this should be solvable in a more elegant fashion.
This can be fixed by monkey patching the daemon module; the code is as follows:
import os, grp, pwd

class DaemonError(Exception):
    pass

class DaemonOSEnvironmentError(DaemonError, OSError):
    pass

def change_process_owner(uid, gid):
    try:
        # This line adds all the groups the user is a member of
        # in order to keep the expected permissions
        os.setgroups(
            [g.gr_gid for g in grp.getgrall()
             if pwd.getpwuid(uid).pw_name in g.gr_mem]
        )
        os.setgid(gid)
        os.setuid(uid)
    except Exception as exc:
        error = DaemonOSEnvironmentError(u"Unable to change process owner (%(exc)s)" % vars())
        raise error
And then the monkey patch:
import daemon
daemon.daemon.change_process_owner = change_process_owner
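A minimal sketch of how this could be wired into the daemon script from the question (the uid/gid values are illustrative, and it assumes a python-daemon version whose internal change_process_owner takes exactly (uid, gid)); the patch has to be installed before the context is entered, so that the patched function is used when privileges are dropped:
import os
import sys

import daemon
import daemon.daemon

# install the patch before the context opens
daemon.daemon.change_process_owner = change_process_owner  # the function defined above

with daemon.DaemonContext(uid=1001, gid=1001,
                          stdout=sys.stdout, stderr=sys.stderr):
    print("groups: %s" % os.getgroups())  # should now include the daemon user's supplementary groups
Newer python-daemon releases also expose an initgroups option on DaemonContext that sets the supplementary groups of the target user, which may make the patch unnecessary.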

Read Celery configuration from Python properties file

I have an application that needs to initialize Celery and other things (e.g. database). I would like to have a .ini file that would contain the applications configuration. This should be passed to the application at runtime.
development.ini:
[celery]
broker=amqp://localhost/
backend=amqp://localhost/
task.result.expires=3600
[database]
# database config
# ...
celeryconfig.py:
from celery import Celery
import ConfigParser

config = ConfigParser.RawConfigParser()
config.read(...)  # Pass this from the command line somehow

celery = Celery('myproject.celery',
                broker=config.get('celery', 'broker'),
                backend=config.get('celery', 'backend'),
                include=['myproject.tasks'])

# Optional configuration, see the application user guide.
celery.conf.update(
    CELERY_TASK_RESULT_EXPIRES=config.getint('celery', 'task.result.expires')
)

# Initialize database, etc.

if __name__ == '__main__':
    celery.start()
To start Celery, I call:
celery worker --app=myproject.celeryconfig -l info
Is there any way to pass in the config file without doing something ugly like setting an environment variable?
Alright, I took Jordan's advice and used an env variable. This is what I ended up with in celeryconfig.py:
from celery import Celery
import os
import sys
import ConfigParser

CELERY_CONFIG = 'CELERY_CONFIG'

if CELERY_CONFIG not in os.environ:
    sys.stderr.write('Missing env variable "%s"\n\n' % CELERY_CONFIG)
    sys.exit(2)

configfile = os.environ['CELERY_CONFIG']
if not os.path.isfile(configfile):
    sys.stderr.write('Can\'t read file: "%s"\n\n' % configfile)
    sys.exit(2)

config = ConfigParser.RawConfigParser()
config.read(configfile)

celery = Celery('myproject.celery',
                broker=config.get('celery', 'broker'),
                backend=config.get('celery', 'backend'),
                include=['myproject.tasks'])

# Optional configuration, see the application user guide.
celery.conf.update(
    CELERY_TASK_RESULT_EXPIRES=config.getint('celery', 'task.result.expires'),
)

if __name__ == '__main__':
    celery.start()
To start Celery:
$ export CELERY_CONFIG=development.ini
$ celery worker --app=myproject.celeryconfig -l info
How is setting an environment variable ugly? You can set an environment variable alongside the current version of your application, derive it from the hostname, or have your build/deployment process overwrite the file: in development you copy development.ini over to settings.ini in a well-known location, and in production you copy production.ini over to settings.ini.
Any of these options are quite common. Using a configuration management tool such as Chef or Puppet to put the file in place is a good option.
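As a rough sketch of the hostname-based variant (the file names and the prod- naming convention here are only assumptions, reusing the ini files from the question):
import socket
import ConfigParser

# pick the config file from the machine's hostname, e.g. prod-* boxes use production.ini
hostname = socket.gethostname()
configfile = 'production.ini' if hostname.startswith('prod') else 'development.ini'

config = ConfigParser.RawConfigParser()
config.read(configfile)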
