I am starting to work with the GAE Task Queue system and things seem to be working fine except for one issue. Everything works in my Django-nonrel project with the default queue, but it breaks with named queues and says it cannot find them. I also noticed that they do not show up in the Console as expected. I followed the guide and assumed that just having queue.yaml in the project would be enough to see them.
Here is an example:
queue:
- name: bob
  max_concurrent_requests: 200
  rate: 20/s
- name: default
  rate: 10/s
I would expect to see the default and another task queue called "bob" in the Console.
Am I missing something in my configuration of this? Doesn't the presence of the queue.yaml set things up properly?
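For context, this is roughly how I enqueue to the named queue (the handler URL here is illustrative, not my real one):
from google.appengine.api import taskqueue

# Enqueue onto the 'bob' queue defined in queue.yaml above;
# '/tasks/bob_worker' is a placeholder handler URL.
taskqueue.add(queue_name='bob', url='/tasks/bob_worker')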
I am running GAE 1.6.1.
Thanks,
RB
I am building an online shop, following Chapter 7 of the book "Django 3 by Example" by Antonio Melé.
Everything works fine on my local machine. It also works well when I deploy it to Heroku.
However, when I try to use Celery and RabbitMQ (CloudAMQP, Little Lemur, on Heroku), the email message the worker was supposed to send to the customer is not sent. The task takes more than 30 seconds and then it crashes:
heroku[router]: at=error code=H12 desc="Request timeout" method=POST
I have created a tasks.py file with the task of sending emails. My settings.py file includes the following lines for Celery:
broker_url = os.environ.get('CLOUDAMQP_URL')
broker_pool_limit = 1
broker_heartbeat = None
broker_connection_timeout = 30
result_backend = None
event_queue_expires = 60
worker_prefetch_multiplier = 1
worker_concurrency = 50
This was taken from https://www.cloudamqp.com/docs/celery.html
And my Procfile is as follows:
web: gunicorn shop.wsgi --log-file -
worker: celery worker --app=tasks.app
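For reference, my tasks.py is roughly shaped like the following (a simplified sketch with placeholder names, not the book's exact code; the Celery app lives in this module because the Procfile's --app=tasks.app points at it):
import os

from celery import Celery
from django.core.mail import send_mail

# Placeholder project/settings names; adjust to the real project layout.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'shop.settings')

app = Celery('tasks')
# Assumption: the app is configured from Django settings; with this
# namespace, only CELERY_-prefixed settings (e.g. CELERY_BROKER_URL) are read.
app.config_from_object('django.conf:settings', namespace='CELERY')


@app.task
def order_created(order_id, customer_email):
    # Simplified: the real task looks up the order and builds the message.
    return send_mail('Order {}'.format(order_id),
                     'Thank you for your order.',
                     'shop@example.com', [customer_email])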
Am I missing something?
Thanks!
I'm fairly familiar with Heroku, though not with your tech stack. The general approach to dealing with a Heroku timeout is this:
First, determine exactly what is causing the timeout. One or more things are taking a lot of time.
Now you have 3 main options.
Heroku Scheduler (or one of several other similar add-ons). Very useful if you can run a script of some sort via a terminal command, and a check every 10 minutes/1 hour/24 hours to see whether the script needs to run is good enough for you. I typically find this to be the most straightforward solution, but it's not always an acceptable one. Depending on what you are emailing, an email being delayed 5-15 minutes might be acceptable.
Background process worker. It looks like this is what you are trying to do with Celery, but it's not configured right; I probably can't help much on that (a minimal sketch of the general pattern follows this list).
Optimize. The reason Heroku sets a 30-second timeout is that, generally speaking, there really isn't a good reason for a user to wait 30 seconds for a response. I'm scratching my head as to why sending an email would take more than 30 seconds, unless you need to send a few hundred of them or the email is very, very large. Alternatively, you might be doing a lot of work before you send the email, though that raises the question of why not do that work separately from the send-email command. I suspect you should look into the why of this before you try to get a background process worker set up.
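For illustration, the usual shape of the background-worker option is that the view returns immediately and the worker does the slow part. A minimal sketch (the view and task names here are made up, not your code):
from celery import shared_task
from django.core.mail import send_mail
from django.http import JsonResponse


@shared_task
def send_confirmation_email(address):
    # Runs on the worker dyno, so the web request never waits on SMTP.
    send_mail('Order received', 'Thanks for your order!',
              'shop@example.com', [address])


def checkout_view(request):
    # ... create the order ...
    send_confirmation_email.delay(request.POST['email'])  # returns instantly
    return JsonResponse({'status': 'ok'})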
After several days trying to solve this issue, I contacted the support department at CloudAMQP.
They helped me figure out that the problem was related to Celery not identifying my BROKER_URL properly.
Then I came across this nice comment by @jainal09 here. There was an extra variable that should be set in settings.py:
CELERY_BROKER_URL = '<broker address given by Heroku config>'
Adding that extra line solved the problem. Now Heroku is able to send the email correctly.
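Concretely, that extra line can read the same Heroku config var the other settings already use; a minimal sketch of the relevant bit of settings.py:
import os

# CLOUDAMQP_URL is the config var the Heroku CloudAMQP add-on sets.
# When Celery is configured from Django settings with the CELERY_ prefix,
# it looks for CELERY_BROKER_URL rather than the lowercase broker_url.
CELERY_BROKER_URL = os.environ.get('CLOUDAMQP_URL')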
I tried using Celery for some I/O tasks for my project, but I reached a dead end and I would really appreciate your help with this if possible.
So what I'm trying to achieve, basically, is to first run a group of the same task on multiple remote machines (e.g. copying some files onto them) and after that run another group of another task type (e.g. installing a Python module on the machines).
I've tried implementing this like so:
final_job = chain(
    group(copy_files_job) | HandleResults().s(),
    group(install_module_job) | HandleResults().s(),
)
result = final_job.delay()
What I would also like to achieve with this is to report the result of each group of tasks back to a web interface. I'm not entirely sure whether this is the correct way of doing what I want with Celery.
But running this returns NotImplementedError: Starting chords requires a result backend to be configured. That shouldn't be the case, though, because I use Redis as both broker and result backend, and it works well if I don't add the second group of tasks with its handle-results task (group(install_module_job) | HandleResults().s()).
So it's clear that the error is not the right one. I guess Celery is trying to tell me that I configured final_job the wrong way, but I really don't know how else I can write what I'm trying to achieve.
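For reference, piping a group into another task (group(...) | HandleResults().s()) is what Celery turns into a chord under the hood, so the same structure written with explicit chords would look something like this (copy_files_job, install_module_job and HandleResults are the names from the snippet above, defined elsewhere in the project):
from celery import chain, chord

# Each chord runs its group's tasks in parallel and then passes the list of
# results to HandleResults; the chain runs the two stages one after another.
final_job = chain(
    chord(copy_files_job, HandleResults().s()),
    chord(install_module_job, HandleResults().s()),
)
result = final_job.delay()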
I have a piece of Dask code that runs on my local machine and works 90% of the time, but it sometimes gets stuck. By stuck I mean: no crash, no error printed, no CPU usage; it just never ends.
I googled around and think it may be due to a dead worker. It would be very useful if I could see the worker logs and figure out why.
But I cannot find my worker logs. I edited config.yaml to add logging but still see nothing from stderr.
Then I went to dashboard --> info --> logs and saw a blank page.
The code it gets stuck on is
X_test = df_test.to_dask_array(lengths=True)
or
proba = y_pred_proba_train[:, 1].compute()
and my ~/.config/dask/config.yaml or ~/.dask/config.yaml looks like
logging:
  distributed: info
  distributed.client: warning
  distributed.worker: debug
  bokeh: error
I am using
python 3.6
dask 1.1.4
All I need is a way to see the log so that I can try to figure out what goes wrong.
Thanks
Joseph
Worker logs are usually managed by whatever system you use to set up Dask.
Perhaps you used something like Kubernetes or Yarn or SLURM?
These systems all have ways to get logs back.
Unfortunately, once a Dask worker is no longer running, Dask itself has no ability to collect logs for you. You need to use the system that you use to launch Dask.
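If you're launching everything on one machine straight from Python (no cluster manager), one option, assuming your version of distributed provides Client.get_worker_logs, is to create the cluster explicitly and pull the recent log lines through the client; a rough sketch:
from dask.distributed import Client, LocalCluster

# Start the workers explicitly so they are real, queryable processes.
cluster = LocalCluster(n_workers=4)
client = Client(cluster)

# ... run the computation that gets stuck ...

# Fetch the most recent log lines kept in memory by the workers and scheduler
# (availability of these methods depends on your distributed version).
print(client.get_worker_logs())
print(client.get_scheduler_logs())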
We use Python (2.7) / Django (1.8.1) and Gunicorn (19.4.5) for our web application, with supervisor (3.0) to monitor it. I have recently encountered two issues with logging:
Django was logging into the previous day's logs (we have log rotation enabled).
Django was not logging anything at all.
The first scenario is understandable: the log rotation changed the file but Django was not updated.
The second scenario was fixed when I restarted the supervisor process, which again led me to believe that the file descriptor was not updated in the Django process.
I came across this SO thread, which states:
Each child is an independent process, and file handles in the parent may be closed in the child after a fork (assuming POSIX). In any case, logging to the same file from multiple processes is not supported.
So I have a few questions:
My Gunicorn has 4 child processes; if one of them fails while writing to a log file, will the other child processes be unable to use it? And how do I debug these kinds of scenarios?
Personally, I find debugging errors in the Python logging module to be difficult. Can someone point out how to debug errors such as this, or is there any way I can monkey-patch logging to not fail silently? (Kindly read the update section.)
I have seen that it is Django's log rotation (rather than some script scheduled via cron) that causes issue 1 as explained above. So which approach is preferable?
Note: the logging config is not the problem. I have already spent a fair amount of time ruling that out. Also, if the config were the issue, Django would not write log files after a process restart.
Update:
For my second question, I see that the logging module provides a raiseExceptions option for failures, although this is discouraged in production environments. Documentation here. So now my question becomes: how do I set this in Django?
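For what it's worth, raiseExceptions is just a module-level flag, so one hedged way to set it is anywhere that runs at startup, e.g. settings.py:
import logging

# Module-level flag: when True, errors raised while emitting log records are
# reported (handlers print a traceback to stderr) instead of being swallowed.
logging.raiseExceptions = True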
I felt like closing this question; it seems a bit awkward and stupid after 2 months. But I guess being stupid is part of learning, and I want this to serve as a reference for people who stumble across it.
Scenario 1: Django, when using TimedRotatingFileHandler, sometimes seems not to update the file descriptor and hence writes to old log files unless we restart supervisor. We have yet to find the reason for this behaviour and will update this if we find it. For now we are using WatchedFileHandler and the logrotate utility to rotate the logs (a sketch of the handler config follows).
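A rough sketch of the handler swap in the Django LOGGING dict (the file path and logger name are illustrative, not our real ones); rotation itself is then left to logrotate:
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'app_file': {
            # Reopens the file if logrotate moves it, so no stale descriptor.
            'class': 'logging.handlers.WatchedFileHandler',
            'filename': '/var/log/myapp/django.log',
        },
    },
    'loggers': {
        'myapp': {
            'handlers': ['app_file'],
            'level': 'INFO',
        },
    },
}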
Scenario 2: This is the stupid part. When I was logging with some string formatting I forgot to pass enough variables, which is why the logger was erring. But this didn't get propagated. Locally, when testing, I found that the logging module was actually throwing that error, but silently, and any logs after it in the module were not getting printed. Lessons learned from this scenario were:
If there is a problem with logging, first check whether the string formatting errs.
Use log.debug('example: {msg}'.format(msg=msg)) instead of log.debug('example: %s', msg), so formatting mistakes surface at the call site instead of being swallowed inside the logging module.
I've got a Google App Engine app which runs some code on a dynamic backend defined as follows:
backends:
- name: downloadfilesbackend
  class: B1
  instances: 1
  options: dynamic
I've recently made some changes to my code and added a second backend. I've moved some tasks from the front end to the new backend and they work fine. However, I want to move the tasks that originally ran on downloadfilesbackend to the new backend (to save on instance hours). I am doing this simply by changing the name of the target to the new backend, i.e.
taskqueue.add(queue_name="organise-files",
              url=queue_organise_files,
              target='organise-files-backend')
However, despite giving the new backend's name as the target, the tasks are still being run by the old backend. Any idea why this is happening or how I can fix it?
EDIT:
The old backend is running new tasks - I've checked this.
I've also been through all of my code to check to see if anything is calling the old backend and nothing is. There are only two methods which added tasks to the old backend, and both of these methods have been changed as detailed above.
I stopped the old backend for a few hours, to see whether this would change anything, but all that happened was that the tasks got jammed until I restarted the backend. The new backend is running other tasks fine, so it's definitely been updated correctly...
It's taken a while, but I've finally discovered that it is not enough to just change the code and upload it using the SDK. If the code running on the backend, or the code sending tasks to the backend handler, is changed, then you must run
appcfg backends <dir> update [backend]
Documentation for this command is here. I haven't seen any documentation that says this explicitly; it was another, related error I was experiencing that prompted me to try this. Just thought I'd let people know in case they're having a similar problem.