I'm currently using Celery 3.1 in a Django project on Python 2.7.
So far we have been using the Django ORM as the broker for development and staging environments. That was convenient because you could pretty much just check out the sources, install the dependencies, run the migrations, and celery worker would work out of the box.
I'm thinking about how to set that up after upgrading to Celery 4.x, since the Django ORM broker has been removed. Are there any message brokers that can simply be pip-installed and don't require any local setup or a separately launched server process?
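One option worth checking for development is kombu's filesystem transport, which ships with Celery's dependencies and needs no separate server process. A minimal sketch, assuming that transport is available in your Celery 4.x install (the paths are placeholders and the directories must exist before the worker starts):

# celeryconfig.py -- dev-only broker via kombu's filesystem transport;
# nothing extra to install or launch.
broker_url = 'filesystem://'
broker_transport_options = {
    'data_folder_in': '/tmp/celery/queue',   # both folders can point at
    'data_folder_out': '/tmp/celery/queue',  # the same directory
}
result_backend = 'file:///tmp/celery/results'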
Related
I have implemented a Flask application with Celery for background task processing, using Redis as the message queue. On the development server I am able to send an asynchronous request and continue the background process with Celery. I have read that the latest version of Celery isn't compatible with Windows. How can I make it run under Microsoft IIS (Internet Information Services) in production?
If not, what are the best alternative background processing/task queue options that would work on the IIS production server?
It is really simple: run the Celery workers on Linux boxes.
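To make that split concrete, here is a minimal sketch (the Redis host, route, and task are placeholders): the Flask app under IIS only needs the broker URL to enqueue work, while the worker that actually executes the tasks runs on a Linux box with celery -A app.celery worker.

# app.py -- the Flask app enqueues; a worker on Linux consumes from the same broker.
from celery import Celery
from flask import Flask

flask_app = Flask(__name__)
celery = Celery(flask_app.name, broker='redis://redis.example.internal:6379/0')

@celery.task
def send_report(user_id):
    ...  # long-running work, executed only by the Linux worker

@flask_app.route('/report/<int:user_id>')
def queue_report(user_id):
    send_report.delay(user_id)  # just enqueues and returns immediately
    return 'queued', 202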
I have a small infrastructure plan that does not include Django, but because of my experience with Django, I really like Celery. All I really need for my project is Redis + Celery. Instead of using the local filesystem, I'd like to keep everything in Redis. My current architecture uses Redis for everything until it is ready to dump the results to AWS S3. Admittedly I don't have a great reason for using Redis instead of the filesystem; I've just invested so much into architecting this with Docker and scalability in mind that it feels wrong not to.
I was searching for a non-Django database scheduler a while back too, but it looked like there was nothing else available. So I took the Django scheduler code and modified it to use SQLAlchemy. It should be even easier to make it use Redis instead.
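For reference, celery beat lets you plug in a custom scheduler class, so a SQLAlchemy- or Redis-backed scheduler can replace the default file-based PersistentScheduler. A hedged sketch (the myproject.schedulers:RedisScheduler path is hypothetical, standing in for whatever scheduler you wrote):

from celery import Celery

app = Celery('myproject', broker='redis://localhost:6379/0')

# Point beat at a custom scheduler class instead of the default
# PersistentScheduler; the class path below is a placeholder.
# (Older Celery versions use the CELERYBEAT_SCHEDULER setting name instead.)
app.conf.beat_scheduler = 'myproject.schedulers:RedisScheduler'

# Equivalently, from the command line:
#   celery -A myproject beat -S myproject.schedulers:RedisScheduler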
It turns out that you can!
First I created this little project from the tutorial on celeryproject.org.
That went great so I built a Dockerized demo as a proof of concept.
Things I learned from this project
Docker
using --link to create network connections between containers
running commands inside containers
Dockerfile
using FROM to build images iteratively
using official images
using CMD for images that "just work"
Celery
using Celery without Django (see the sketch after this list)
using Celerybeat without Django
using Redis as a queue broker
project layout
task naming requirements
Python
proper project layout for setuptools/setup.py
installation of project via pip
using entry_points to make console_scripts accessible
using setuid and setgid to de-escalate privileges for the celery daemon
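As a concrete illustration of the Celery points above, a standalone (non-Django) app with Redis as the broker boils down to something like this; the module, task, and URL values are illustrative rather than taken from the demo project:

# proj/tasks.py -- minimal Celery app without Django, Redis as broker
# and result backend.
from celery import Celery

app = Celery(
    'proj',
    broker='redis://localhost:6379/0',
    backend='redis://localhost:6379/1',
)

@app.task
def add(x, y):
    return x + y

Task names default to the importable module path of the function, which is where the naming requirements mentioned above come from, and the worker is started with celery -A proj.tasks worker.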
I want to send tasks from a web server (running Django) to a remote machine that hosts a RabbitMQ server and some workers that I implemented with Celery.
If I follow the recommended Celery approach, it seems I have to share the task code between both machines, which means replicating the workers' logic in the web app's codebase.
So:
Is there a best practice for this? Since the code is duplicated, I am thinking about using a git submodule (so the workers' code would be pulled into both the web app repo and the workers repo).
Or should I use something other than Celery?
Am I missing something?
One way to manage this is to store your workers in your Django project. Django and Celery play nicely with each other, allowing you to use parts of your Django project in your Celery app. http://celery.readthedocs.org/en/latest/django/first-steps-with-django.html
Deploying this would mean that your web application would not use the modules involved with your Celery workers, and on your Celery machine your Django views and such would never be used. This usually only results in a couple of megs of unused Django application code.
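The wiring described in that guide is small. For Celery 3.1 it is roughly the following proj/celery.py ("proj" is a placeholder for your project package); newer Celery versions use a CELERY_-namespaced config_from_object call and autodiscover_tasks() without the lambda:

# proj/celery.py -- the standard Django integration module from the Celery docs.
import os

from celery import Celery
from django.conf import settings

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')

app = Celery('proj')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)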
You can use send_task. It takes the same parameters as apply_async, but you only have to give the task name. You can send tasks without loading the task module in Django:
app.send_task('tasks.add', args=[2, 2], kwargs={})
http://celery.readthedocs.org/en/latest/reference/celery.html#celery.Celery.send_task
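In other words, the web server side only needs a Celery app configured with the broker URL; it never imports the workers' task modules. A rough sketch, with the broker URL and task name as placeholders:

# Producer side (the Django web server): no task code is imported here.
from celery import Celery

app = Celery(broker='amqp://guest@rabbit-host//', backend='rpc://')

# Enqueue by name; the worker on the remote machine owns the implementation.
result = app.send_task('tasks.add', args=[2, 2], kwargs={})
print(result.get(timeout=10))  # requires a result backend to be configured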
I am confused about the differences between these two packages while trying to set up Celery in my Django project.
What are the differences between the two, if any? When reading tutorials online I see them both used, and I'm not sure which would be best for me. It appears that djcelery is kind of like celery but tailored for Django? But celery doesn't need to be included in INSTALLED_APPS while djcelery does.
Thank you
Django-celery was a project that provided Celery integration for Django, but it is no longer required.
You don't have to install django-celery anymore. Since version 3.1, Django is supported out of the box.
So to install celery you can use pip:
pip install -U Celery
This is a note from Celery's First Steps with Django tutorial:
Note:
Previous versions of Celery required a separate library to work with
Django, but since 3.1 this is no longer the case. Django is supported
out of the box now so this document only contains a basic way to
integrate Celery and Django. You will use the same API as non-Django
users so it’s recommended that you read the First Steps with Celery
tutorial first and come back to this tutorial. When you have a working
example you can continue to the Next Steps guide.
When using Django with Celery versions before 3.1, you had to install django-celery from PyPI; Celery itself would then be installed as a dependency.
Djcelery hooks your Django project into Celery, which is a more general tool used with a variety of application stacks.
Here is Celery's getting started with Django guide, which describes installing django-celery and setting up your first tasks.
Previous versions of Celery required a separate library to work with Django, but since 3.1 this is no longer the case. Django is supported out of the box now so this document only contains a basic way to integrate Celery and Django. You’ll use the same API as non-Django users: https://docs.celeryproject.org/en/latest/django/first-steps-with-django.html#configuring-your-django-project-to-use-celery
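For completeness, the other bit of wiring the current docs add is importing the app in the project's __init__.py so it loads when Django starts ("proj" is a placeholder for your package name):

# proj/__init__.py -- ensure the Celery app is loaded with Django so that
# @shared_task decorators bind to it.
from .celery import app as celery_app

__all__ = ('celery_app',)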
How should the project be deployed and run? There are loads of tools in this space. Which should be used, and why?
Supervisor
Gunicorn
Nginx
Fabric
Boto
Pip
Virtualenv
Load balancers
It depends on your configuration. We are using the following stack for our environment on Rackspace, but you can set up the same thing on AWS with EC2 instances.
Ubuntu 11.04
Varnish (in-memory cache) to avoid disk seeks
Nginx to serve static content
Apache to serve dynamic content (mod_wsgi)
Python 2.7.2 with Django
Jenkins for our continuous builds
Git for version control
Fabric for deployment
The way it works: Jenkins polls the origin Git repository for pushes and pulls the changes down, builds a Python egg, runs the unit tests, uses Fabric to deploy the egg to the necessary environments, and reloads the Apache config so that the forked Apache processes pick up the new egg.
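A skeletal fabfile for that last step might look like this (Fabric 1.x API; the host names, paths, and egg filename are placeholders):

# fabfile.py -- rough sketch of the deploy task Jenkins could call after a
# green build; everything host- and path-specific is a placeholder.
from fabric.api import env, put, sudo

env.hosts = ['app1.example.com', 'app2.example.com']

def deploy(egg='dist/myproject-1.0-py2.7.egg'):
    put(egg, '/tmp/myproject.egg')
    # install into the target environment (easy_install understands .egg files)
    sudo('easy_install /tmp/myproject.egg')
    # reload Apache so the forked mod_wsgi processes pick up the new code
    sudo('/etc/init.d/apache2 reload')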
Hope this helps.
As Michael Klockel already stated, it depends on your configuration. I have:
Ubuntu 10.04 LTS
Nginx
uWSGI
Git for version control
Python virtualenv and pip
You can check the deployment settings here:
Django, Virtualenv, nginx + uwsgi import module wsgi error
and why I use nginx and uwsgi here:
http://nichol.as/benchmark-of-python-web-servers
I also use Fabric for deploying the app, and Chef Solo: http://ericholscher.com/blog/2010/nov/8/building-django-app-server-chef/
Johnny Cache for SQL query caching, and Raven with Sentry to keep a log of what's going on in the app.
I'd use uWSGI + Nginx from a performance perspective (I think the comparison has already been linked in another answer), and pip and virtualenv for deployment, as this keeps things self-contained and facilitates clean deployment using Fabric or similar. Use Git for version control. Jenkins can handle continuous integration. I'd put the AWS load balancer (ELB) in front of your EC2 instances for balancing; it does the job without you having to fret too much about it. Use django-storages for uploading your static files to S3, which saves you the effort of having another server to hand out static files.
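A hedged sketch of the django-storages side (the bucket name, keys, and the boto-based backend path are assumptions appropriate to that era of the library):

# settings.py -- push collectstatic output to S3 via django-storages;
# values below are placeholders. Add 'storages' to INSTALLED_APPS as well.
STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
AWS_ACCESS_KEY_ID = 'your-access-key-id'
AWS_SECRET_ACCESS_KEY = 'your-secret-key'
AWS_STORAGE_BUCKET_NAME = 'my-static-bucket'
STATIC_URL = 'https://my-static-bucket.s3.amazonaws.com/'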
However, it depends a little on your admin overhead. If you're looking for something clean and simple for deployment and scaling, I'd scrap the whole AWS EC2 stack, use Heroku as a front end, and S3 for your static files. This saves all the admin time of maintaining the boxes and allows you to concentrate on development.