I'm following these instructions http://django-cron.readthedocs.org/en/latest/installation.html and can't understand the meaning of the "a unique code" line.
from django_cron import CronJobBase, Schedule

class MyCronJob(CronJobBase):
    RUN_EVERY_MINS = 120  # every 2 hours

    schedule = Schedule(run_every_mins=RUN_EVERY_MINS)
    code = 'my_app.my_cron_job'  # a unique code

    def do(self):
        pass  # do your thing here
Can anyone explain to me what this line does?
code = 'my_app.my_cron_job' # a unique code
Looking at the code here:
def make_log(self, *messages, **kwargs):
    cron_log = self.cron_log
    cron_job = getattr(self, 'cron_job', self.cron_job_class)
    cron_log.code = cron_job.code
we can see that this "unique code" identifies a particular cron task. Every time your cron task is executed, a CronJobLog instance is created with cron_log.code = cron_job.code.
So it is possible to filter the logs that belong to a particular task:
last_job = CronJobLog.objects.filter(code=cron_job.code).latest('start_time')
That is why it must be unique: so the logs of one cron task are not mixed with those of another.
I suppose it serves the same purpose as an id, but the code has a meaningful, human-readable value.
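For example (a minimal sketch; the second job class and its code value are hypothetical), a second cron job in the same project must declare its own, different code, otherwise its CronJobLog entries would be indistinguishable from MyCronJob's:

from django_cron import CronJobBase, Schedule

class CleanupCronJob(CronJobBase):
    RUN_EVERY_MINS = 1440  # once a day
    schedule = Schedule(run_every_mins=RUN_EVERY_MINS)
    code = 'my_app.cleanup'  # distinct from 'my_app.my_cron_job'

    def do(self):
        pass  # clean something up

With distinct codes, the latest('start_time') lookup shown above is guaranteed to return the last run of the right job.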
Related
I would like to use Django signals to trigger a celery task like so:
def delete_content(sender, instance, **kwargs):
    task_id = uuid()
    task = delete_libera_contents.apply_async(kwargs={"instance": instance}, task_id=task_id)
    task.wait(timeout=300, interval=2)
But I'm always running into kombu.exceptions.EncodeError: Object of type MusicTracks is not JSON serializable
Now I'm not sure how to treat the MusicTracks instance, since it's a model class instance. How can I properly pass such instances to my task?
In my tasks.py I have the following:
@app.task(name="Delete Libera Contents", queue='high_priority_tasks')
def delete_libera_contents(instance, **kwargs):
    libera_backend = instance.file.libera_backend
    ...
Never send a model instance to a Celery task; you should only send plain variables, for example the instance's primary key, and then inside the Celery task look the instance up by that pk and run your logic.
Your code should look like this:
views.py
def delete_content(sender, instance, **kwargs):
    task_id = uuid()
    task = delete_libera_contents.apply_async(kwargs={"instance_pk": instance.pk}, task_id=task_id)
    task.wait(timeout=300, interval=2)
tasks.py
@app.task(name="Delete Libera Contents", queue='high_priority_tasks')
def delete_libera_contents(instance_pk, **kwargs):
    instance = Instance.objects.get(pk=instance_pk)
    libera_backend = instance.file.libera_backend
    ...
You can find this rule in the Celery documentation (I can't find the link), and one of the reasons is this situation:
you send your instance to a Celery task (and for whatever reason it is delayed for 5 minutes)
then, before your task has finished, your project runs some logic that changes this instance
then the Celery task's time comes and it uses the old version of the instance, so it works with stale, effectively corrupted data
(this is the reason as I think it is, not from the documentation)
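The serialization error from the question points at the same rule: Celery's default JSON serializer can only encode plain JSON types (numbers, strings, lists, dicts), which is another way of seeing why a primary key passes where a model instance does not. A minimal standalone illustration using only the standard library:

import json

json.dumps({"instance_pk": 42})       # fine: an int is JSON serializable
json.dumps({"instance": object()})    # raises TypeError: Object of type object is not JSON serializable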
First off, sorry for making the question a bit confusing, especially for the people who have already written an answer.
In my case, the delete_content signal can be triggered by three different models, so it actually looks like this:
@receiver(pre_delete, sender=MusicTracks)
@receiver(pre_delete, sender=Movies)
@receiver(pre_delete, sender=TvShowEpisodes)
def delete_content(sender, instance, **kwargs):
    delete_libera_contents.delay(instance_pk=instance.pk)
So every time one of these models triggers a delete action, this signal will also trigger a celery task to actually delete the stuff in the background (all stored on S3).
As I cannot and should not pass instances around directly, as pointed out by @oruchkin, I pass instance.pk to the Celery task, where I then have to look the instance up, since the task does not know which model triggered the delete action:
@app.task(name="Delete Libera Contents", queue='high_priority_tasks')
def delete_libera_contents(instance_pk, **kwargs):
    if Movies.objects.filter(pk=instance_pk).exists():
        instance = Movies.objects.get(pk=instance_pk)
    elif MusicTracks.objects.filter(pk=instance_pk).exists():
        instance = MusicTracks.objects.get(pk=instance_pk)
    elif TvShowEpisodes.objects.filter(pk=instance_pk).exists():
        instance = TvShowEpisodes.objects.get(pk=instance_pk)
    else:
        raise logger.exception("Task: 'Delete Libera Contents', reports: No instance found (code: JFN4LK) - Warning")
    libera_backend = instance.file.libera_backend
You might ask why I do not simply pass the sender from the signal to the Celery task. I tried this as well and again, as already pointed out, I cannot pass it either; it fails with:
kombu.exceptions.EncodeError: Object of type ModelBase is not JSON serializable
So it really seems I have to obtain the instance the hard way, using the if/elif/else clauses in the Celery task.
I am getting an error in a program which uses threading to perform functions simultaneously. This process is a job which runs once per hour, and the data that comes in is from a SQL view.
The function that is called in the target argument returns a dictionary, and the program says
"dict object is not callable".
Inside the function, a dictionary is returned.
My question is what this function should return; if I don't return anything, will it affect any other thread?
# inside the jobs in django I call this function
def ticket_booking():
    query = " SELECT * FROM vw_ticket_list;"
    ttd = []
    try:
        result = query_to_dicts(query)
        tickets = json.loads(to_json(result))
        if tickets:
            # Calling the vaccination push message (php).
            for ticket in tickets:
                # Getting Generic details used by all categories
                full_name = tickets['full_name']
                gender = tickets['gender']
                email = tickets[email]
                if tickets['language_id'] == 1:  # book english movie tickets
                    # Calling the function inside the Thread for a Syncronuz Call (Wait and Watch)
                    myThread = threading.Thread(target=book_english(email, full_name))
                    myThread.start()
                    myThread.join()
                if tickets['language_id'] == 2:  # book italian movie tickets
                    myThread = threading.Thread(target=book_italian(email, full_name, gender))
                    myThread.start()  # Starting the Thread
                    myThread.join()  # Will return here if sth is returned
As you can see in the code comments, only if the book_italian function returns something can it return here, and I have 5 threads in total to execute simultaneously. The book_italian function looks like this:
def book_italian(email, fullname, gender):
    try:
        # posts data to another server
        b = requests.post(some postdata)
        a = log.objects.create(log data from b in crct format)
        return a  # ---> {"Message": "Log created successfully"}
a is of type dict, and I tried to change it to many other types but it still gives me the same error. Also, this job isn't running when run in crontab.
When you use threading.Thread to execute something, you should separate the target callable object (such as a function) from its corresponding arguments (parameters) and pass them separately:
myThread = threading.Thread(target=book_italian, args=(email, full_name, gender))
Refer to the documentation.
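As a rough sketch of how the booking loop from the question could look with that change (book_english and book_italian are assumed to behave as in your code, the ticket dicts are assumed to have 'full_name', 'gender', 'email' and 'language_id' keys, and I'm assuming you meant to index the loop variable ticket rather than the whole tickets list):

import threading

for ticket in tickets:
    full_name = ticket['full_name']
    gender = ticket['gender']
    email = ticket['email']
    if ticket['language_id'] == 1:    # book english movie tickets
        # pass the callable plus its arguments; do not call it here
        my_thread = threading.Thread(target=book_english, args=(email, full_name))
    elif ticket['language_id'] == 2:  # book italian movie tickets
        my_thread = threading.Thread(target=book_italian, args=(email, full_name, gender))
    else:
        continue
    my_thread.start()   # run the booking in a separate thread
    my_thread.join()    # wait for it to finish before the next ticket

The original error comes from target=book_english(email, full_name): that calls the function immediately and hands its return value (a dict) to Thread as the target, and a dict is not callable when the thread later tries to run it.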
I'm writing code that's supposed to collect some file names using a recursive function (scan_folder) and write them into an SQLite database with a second function (update_db).
The first issue is that whenever scan_folder() calls itself, it calls update_db() immediately afterwards, although it shouldn't. Because of this, the database gets updated A LOT. Maybe I could pop the values that get passed to the second function after it finishes, but I'd like to know why this is happening.
class Sub:
    def __init__(self, parent, scan_type):
        self.database = ConnectionToDatabase()
        self.database_name = ConnectionToDatabase().database_name()

    def scan_folder(self):
        connection = sqlite3.connect(self.database_name)
        try:
            cursor = connection.cursor()
            for file_name in os.listdir(self.parent):
                if file_name.endswith('.srt'):
                    if self.scan_type is True:
                        cursor.execute('SELECT count(*) FROM subs WHERE name = ?', (file_name,))
                else:
                    current_path = "".join((self.parent, "/", file_name))
                    if os.path.isdir(current_path):
                        dot = Sub(current_path, self.scan_type)
                        # I THINK HERE IS THE ERROR, ACCORDING TO PYCHARM DEBUGGER
                        # HERE THE update_db() IS CALLED AND ONLY AFTER IT FINISHES, dot.scan_folder() BEGINS
                        dot.scan_folder()
            connection.close()  # Closes connection that adds subtitle names into the database
        finally:
            self.database.update_database(dirty_files_amount)
Here begins the second function:
class ConnectionToDatabase:
    def __init__(self):
        self.database = './sub_master.db'

    def update_database(self, dirty_files_amount):
        connection_update = sqlite3.connect(self.database)
        cursor = connection_update.cursor()
        for sub_name in to_update:
            cursor.execute('UPDATE subs SET ad_found = 1 WHERE name = ?', (sub_name,))
        connection_update.commit()
        connection_update.close()
This is just a hunch, but right here:
dot = Sub(current_path, self.scan_type)
You're setting it equal to your Sub class, and in that class's __init__ you have:
self.database = ConnectionToDatabase()
self.database_name = ConnectionToDatabase().database_name()
which goes through your ConnectionToDatabase class, where your update_db is residing.
When I call scan_folder it enters an if/else statement which gets every file and folder in the current directory. When it doesn't find anything else there, instead of jumping back to the previous directory, it calls update_db first.
The best thing to do is to just rewrite the whole thing; as stated previously, the functions are doing too many things.
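A minimal sketch of what such a rewrite could look like, assuming the goal is to scan recursively first and touch the database only once afterwards (the function names here are mine, not from the original code):

import os
import sqlite3

def collect_srt_files(parent):
    # Pure scanning: recurse into folders and return the .srt file names, no database work here.
    found = []
    for file_name in os.listdir(parent):
        current_path = os.path.join(parent, file_name)
        if os.path.isdir(current_path):
            found.extend(collect_srt_files(current_path))
        elif file_name.endswith('.srt'):
            found.append(file_name)
    return found

def mark_subs(database, sub_names):
    # One database pass, called a single time after the scan has finished.
    connection = sqlite3.connect(database)
    try:
        cursor = connection.cursor()
        for sub_name in sub_names:
            cursor.execute('UPDATE subs SET ad_found = 1 WHERE name = ?', (sub_name,))
        connection.commit()
    finally:
        connection.close()

mark_subs('./sub_master.db', collect_srt_files('/path/to/subtitles'))

Because scanning no longer owns a database connection, there is nothing left for the recursion to trigger: the finally block that called update_database on every return is gone.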
So this is kind of a Python design question plus multiple inheritance. I'm working on a program of mine and I've run into an issue I can't figure out a decent way of solving.
To keep it simple: the software scans a log event file generated by another program. Initially it creates and stores each event in a representative event object. But I want to access them quickly and with a more robust query language, so I'm loading them into a SQL DB after doing a little processing on each event, which means there's more data than before. When I query the DB, I want to recreate an object for each entry representing the event, so it's easier to work with.
The problem I'm running into is that I want to avoid a lot of duplicate code and technically I should be able to just reuse some of the code in the original classes for each event. Example:
class AbstractEvent:
    __init__(user, time)
    getTime()
    getUser()

class MessageEvent(AbstractEvent):
    __init__(user, time, msg)
    getMessage()

class VideoEvent(AbstractEvent):
    pass
But there is extra data after it's gone into the DB, so there need to be new subclasses:
class AbstractEventDB(AbstractEvent):
    __init__(user, time, time_epoch)
    getTimeEpoch()
    (static/classmethod) fromRowResult(row)

class MessageEventDB(AbstractEventDB, MessageEvent):
    __init__(user, time, msg, time_epoch, tags)
    getTags()
    (static/classmethod) fromRowResult(row)

class VideoEventDB(AbstractEventDB, VideoEvent):
    pass
This is a simpler version than what's actually happening, but it shows the gist. I change long-form timestamps from the log file into epoch timestamps when they go into the DB, and various tags are added to message events, but other events have nothing extra beyond the timestamp change.
The above is ideally how I would like to structure it, but the problem I've run into is that the call signatures are completely different on the DB object side compared to the simple event side, so when I try to call super() I get an error about expected arguments missing.
I was hoping someone might be able to offer some advice on how to structure it and avoid duplicating code 10-20 times over, particularly in the fromRowResult (a factory method). Help much appreciated.
I think what you are looking for is a Python implementation of the decorator design pattern.
http://en.wikipedia.org/wiki/Decorator_pattern
The main idea is to replace multiple inheritance with inheritance + composition:
class AbstractEvent(object):
    def __init__(self, user, time):
        self.user = user
        self.time = time

class MessageEvent(AbstractEvent):
    def __init__(self, user, time, msg):
        super(MessageEvent, self).__init__(user, time)
        self.msg = msg

class AbstractEventDBDecorator(object):
    def __init__(self, event, time_epoch):
        # event is a member of the class. Using dynamic typing, the event member will
        # be an AbstractEvent or a MessageEvent at runtime.
        self.event = event
        self.time_epoch = time_epoch

    @classmethod
    def fromRowResult(cls, row):
        abstract_event = AbstractEvent(row.user, row.time)
        abstract_event_db = AbstractEventDBDecorator(abstract_event, row.time_epoch)
        return abstract_event_db

class MessageEventDB(AbstractEventDBDecorator):
    def __init__(self, message_event, time_epoch, tags):
        super(MessageEventDB, self).__init__(message_event, time_epoch)
        self.tags = tags

    @classmethod
    def fromRowResult(cls, row):
        message_event = MessageEvent(row.user, row.time, row.msg)
        message_event_db = MessageEventDB(message_event, row.time_epoch, row.tags)
        return message_event_db

class Row:
    def __init__(self, user, time, msg, time_epoch, tags):
        self.user = user
        self.time = time
        self.msg = msg
        self.time_epoch = time_epoch
        self.tags = tags

if __name__ == "__main__":
    me = MessageEvent("user", "time", "msg")
    r = Row("user", "time", "Message", "time_epoch", "tags")
    med = MessageEventDB.fromRowResult(r)
    print med.event.msg
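The reason this avoids the original super() problem is that the DB-side classes no longer inherit from the plain event classes at all, so there are no conflicting __init__ signatures to reconcile: each DB decorator simply wraps an already-constructed event object and adds the DB-only fields (time_epoch, tags) on top, and fromRowResult is the single place that knows how to build both pieces from a row.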
I am trying to get the sender filter for the task_success signal working, e.g.:
@celery.task
def run_timer(crawl_start_time):
    return crawl_start_time

@task_success.connect
def run_timer_success_handler(sender, result, **kwargs):
    print '##################################'
    print 'in run_timer_success_handler'
The above works fine, but if I try to filter by sender, it never works:
@task_success.connect(sender='tasks.run_timer')
def run_timer_success_handler(sender, result, **kwargs):
    print '##################################'
    print 'in run_timer_success_handler'
I also tried:
@task_success.connect(sender='run_timer')
@task_success.connect(sender=run_timer)
@task_success.connect(sender=globals()['run_timer'])
None of them work.
How do I effectively use the sender filter to ensure that my callback is called only for the run_timer task and not the others?
In this case it's better to filter the sender inside the function, like this:

@task_success.connect
def run_timer_success_handler(sender, result, **kwargs):
    if sender == '...':  # your task's name here
        ...
This is because the current Celery signals implementation has an issue when the task sender and the worker are different Python processes.
It converts your sender into an identifier and uses that for filtering, but Celery sends the task by its string name. Here is the problematic code (celery.utils.dispatch.signals):
def _make_id(target):  # pragma: no cover
    if hasattr(target, 'im_func'):
        return (id(target.im_self), id(target.im_func))
    return id(target)
And id('tasks.run_timer') in your process is not the same as id('tasks.run_timer') in a worker process. If you want, you may hack it and replace id with a hash function.
http://docs.celeryproject.org/en/latest/userguide/signals.html#task-success
...
Sender is the task object executed. (not the same as after_task_publish.sender)
...
So, you should do:

@task_success.connect(sender=run_timer)
def run_timer_success_handler(sender, result, **kwargs):
    ...
It works for me. Good Luck.