Django __init__.py file not executing pygame code

I'm currently developing a website using django that stores the IP address of people who access the site in a database. The idea is, that every week at midnight the website completes a traceroute on every item in the database and then maps the geographical location of each hop to the destination onto a map using pygame. This map is then stored as an image and shown on the website.
Currently, everything works individually as it should.
My __init__.py file currently looks something like this:
import subprocess, sys

print("Starting scripts...")
P = subprocess.Popen([sys.executable, "backgroundProcess.py"],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
print("...Scripts started")
When running this from the command prompt, or even from the GUI, it works fine, and the map is drawn correctly if the time is right. However, when the script runs as the website starts, the text is printed correctly ("Starting scripts..." and "...Scripts started") but the map is not drawn. In short, my question is: does Django limit what you can do in __init__.py files?

Code at a module's top level (whether it's a package's __init__.py or any regular module) is only executed once per process, on the first import of that module. Since Django is served by long-running processes, this code will only get executed when a new process is started.
The simplest solution for a scheduled background task is to write a Django management command and call it from a cron job (or whatever scheduler is available on your system). You really don't need that subprocess stuff here...
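For reference, a minimal sketch of such a management command (the app name, file name, and task body below are placeholders; the actual traceroute/pygame logic would come from your existing script):

import sys  # hypothetical file: yourapp/management/commands/draw_map.py
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Runs the weekly traceroute and redraws the pygame map."

    def handle(self, *args, **options):
        # Call your existing traceroute/map-drawing code here, e.g.:
        # draw_traceroute_map()  # hypothetical helper from backgroundProcess.py
        self.stdout.write("Map regenerated.")

(Both the management/ and management/commands/ directories need an __init__.py.) A crontab entry along the lines of 0 0 * * 1 /path/to/python /path/to/manage.py draw_map would then run it every Monday at midnight; the paths are placeholders for your environment.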


How to add cronjob/scheduler for Python scripts on EC2 AWS?

I have a question regarding one of my React apps that I recently developed. It's basically a landing page with a React frontend and a Node+Express backend, and it scrapes data from various pages (the scrapers are written in Python).
Right now, the React app itself is hosted on Heroku, and the scrapers work, but they are not scheduled automatically.
Current setup is the following:
EC2 for Python scrapers
AWS RDS MySQL for the database that the EC2 scrapers write to
I have created a separate file to execute all the other scrapers.
main.py
import time
import schedule
import os
from pathlib import Path

print('python script executed')

# check the current working directory so the scraper paths resolve correctly
path = os.getcwd()
print(path)

#exec(open("/home/ec2-user/testing_python/lhvscraper.py").read())

filenames = [
    # output table: fundsdata
    Path("/home/ec2-user/testing_python/lhvscraper.py"),
    Path("/home/ec2-user/testing_python/luminorscrapertest.py"),
    Path("/home/ec2-user/testing_python/sebscraper.py"),
    Path("/home/ec2-user/testing_python/swedscraper.py"),
    Path("/home/ec2-user/testing_python/tulevascraper.py"),
    # output table: feesdata
    Path("/home/ec2-user/testing_python/feesscraper.py"),
    # output table: yield_y1_data
    Path("/home/ec2-user/testing_python/yield_1y_scraper.py"),
    # output table: navdata
    #Path("/home/ec2-user/testing_python/navscraper.py"),
]

def main_scraper_scheduler():
    print("scheduler is working")
    for filename in filenames:
        print(filename)
        with open(filename) as infile:
            exec(infile.read())
        time.sleep(11)

schedule.every(10).seconds.do(main_scraper_scheduler)

while True:
    schedule.run_pending()
    time.sleep(1)
I have successfully established a connection between MySQL and EC2 and tested it over PuTTY:
if I execute main.py, all of the scrapers run, insert new data into the MySQL database tables, and then repeat (see the code above). The only problem is that when I close PuTTY (kill the connection), main.py stops running.
So my question is: how do I set things up so that main.py keeps running on its own (say, once a day at 12 PM) without me executing it?
I understand this involves setting up a cron job or a scheduler of some kind, but I haven't managed to get it working yet, so your help is very much needed.
Thanks in advance!
To avoid making crontab files overly long, Linux has canned entries that run things hourly, daily, weekly, or monthly. You don't have to modify any crontab to use this. ANY executable script that is located in /etc/cron.hourly will automatically be run once an hour. ANY executable script that is located in /etc/cron.daily will automatically be run once per day (usually at 6:30 AM), and so on. Just make sure to include a #! line for Python, and chmod +x to make it executable. Remember that it will run as root, and you can't necessarily predict which directory it will start in. Make no assumptions.
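As a minimal sketch, a script dropped into /etc/cron.daily could look like the following (the file name is arbitrary; the paths are the ones from your main.py, and the schedule/while-True loop is dropped because cron now provides the scheduling):

#!/usr/bin/env python3
# Hypothetical /etc/cron.daily/run-scrapers -- remember chmod +x.
import subprocess
from pathlib import Path

SCRAPERS = [
    Path("/home/ec2-user/testing_python/lhvscraper.py"),
    Path("/home/ec2-user/testing_python/sebscraper.py"),
    # ... the rest of the scraper files ...
]

for script in SCRAPERS:
    # Cron runs this as root from an unpredictable directory,
    # so absolute paths are used throughout.
    subprocess.run(["/usr/bin/python3", str(script)], check=False)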
The alternative is to add a line to your own personal crontab. You can list your crontab with crontab -l, and you can edit it with crontab -e. To run something once a day at noon, you might add:
0 12 * * * /home/user/src/my_daily_script.py

Execute external *.exe application using Python and display output in real time

Let me introduce the goal of the application I'm building: I am creating a front-end GUI using PySide (Qt) for a Fortran-based application used in the framework of CFD. The Fortran application is compiled as a *.exe file and, when executed, it continuously reports the simulated lapse of time and other output details (when I launch it from the console, these data continuously appear until it finishes).
For example, if I executed the external code from the console I would get
>> myCFDapplication.exe
Initializing...
Simulation start...
Time is 0.2
Time is 0.4
Time is 0.6
Time is 0.8
Time is 1.0
Simulation finished
>>
with quite a long lapse of time between one "Time is ..." line and the next.
The objective of the GUI is to generate the initialization files for the external application, to launch it, and finally to show the user the computation output in real time (as plain text).
From other similar topics on this site, I have been able to launch my external application from Python using the following code:
import os, sys
import subprocess

procExe = subprocess.Popen("pru", shell=True, stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE, universal_newlines=True)

while procExe.poll() is None:
    line = procExe.stdout.readline()
    print("Print:" + line)
but the output is only displayed when the execution finishes, and moreover, the whole GUI freezes until that moment.
I would like to know how to launch the external application from Python, get its output in real time, and pass it to the GUI as it arrives. The idea would be to print each output line inside a "TextEdit" widget using the function append(each_output_line).
Check out Non-blocking read on a subprocess.PIPE in python and look at the use of Queues to do a non-blocking read of the subprocess. The biggest change for your Qt application is that you are probably going to have to use multiprocessing since, as you have observed, anything blocking in your application is going to freeze the GUI.
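A minimal sketch of that queue-based approach, assuming the executable emits one status line at a time (a background thread is shown here; a multiprocessing.Process would be wired up the same way, and in the Qt application the print call would become a signal that appends to the TextEdit):

import subprocess
from queue import Empty, Queue
from threading import Thread

def enqueue_output(pipe, queue):
    # Runs in the background so readline() can block without freezing the GUI.
    for line in iter(pipe.readline, ''):
        queue.put(line)
    pipe.close()

proc = subprocess.Popen(["myCFDapplication.exe"], stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, universal_newlines=True)
q = Queue()
Thread(target=enqueue_output, args=(proc.stdout, q), daemon=True).start()

while proc.poll() is None:
    try:
        line = q.get(timeout=0.1)  # returns as soon as a line is available
    except Empty:
        continue                   # nothing yet; a GUI event loop keeps running
    print("Print: " + line.rstrip())  # in Qt: textEdit.append(line.rstrip())
# (any lines still queued after the process exits could be drained here)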

Web2Py - configure a scheduler

I have an application written in Web2Py that contains some modules. I need to call some functions out of a module on a periodic basis, say once daily. I have been trying to get a scheduler working for that purpose but am not sure how to get it working properly. I have referred to this and this to get started.
I have a scheduler.py file in the models directory, which contains code like this:
from gluon.scheduler import Scheduler
from Module1 import Module1

def daily_task():
    module1 = Module1()
    module1.action1(arg1, arg2, arg3)

daily_task_scheduler = Scheduler(db, tasks=dict(my_daily_task=daily_task))
In default.py I have the following code for the scheduler:
def daily_periodic_task():
    daily_task_scheduler.queue_task('daily_running_task', repeats=0, period=60)
(For testing I am running it every 60 seconds; for daily runs I plan to use period=86400.)
In my Module1 class (in Module1.py), I have this kind of code:
def action1(self, arg1, arg2, arg3):
    for row in db().select(db.table1.ALL):
        row.processed = 'processed'
        row.update_record()
One of the issues I am facing is that I don't understand clearly how to make this scheduler automatically execute action1 on a daily basis.
When I launch my application using a command like python web2py.py -K my_app, it shows this in the console:
web2py Web Framework
Created by Massimo Di Pierro, Copyright 2007-2015
Version 2.11.2-stable+timestamp.2015.05.30.16.33.24
Database drivers available: sqlite3, imaplib, pyodbc, pymysql, pg8000
starting single-scheduler for "my_app"...
However, when I browse to:
http://127.0.0.1:8000/my_app/default/daily_periodic_task
I just see "None" as text displayed on the screen and I don't see any changes produced by the scheduled task in my database table.
Whereas when I browse to:
http://127.0.0.1:8000/my_app/default/index
I get an error stating "This web page is not available", which basically indicates my application never started.
When I start my application normally using python web2py.py, the application loads fine, but I don't see any changes produced by the scheduled task in my database table.
I am unable to figure out what I am doing wrong here and how to properly use the scheduler with Web2Py. Basically, I need to know how I can start my application normally along with the scheduled tasks properly running in the background.
Any help in this regard would be highly appreciated.
Running python web2py.py starts the built-in web server, enabling web2py to respond to HTTP requests (i.e., serving web pages to a browser). This has nothing to do with the scheduler and will not result in any scheduled tasks being run.
To run scheduled tasks, you must start one or more background workers via:
python web2py.py -K myapp
The above does not start the built-in web server and therefore does not enable you to visit web pages. It simply starts a worker process that will be available to execute scheduled tasks.
Also, note that the above does not actually result in any tasks being scheduled. To schedule a task, you must insert a record in the db.scheduler_task table, which you can do via any of the usual methods of inserting records (including using appadmin) or programmatically via the scheduler.queue_task method (which is what you use in your daily_periodic_task action).
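A minimal sketch of that programmatic route (note that the first argument to queue_task must match a key in the tasks dict passed to Scheduler, i.e. 'my_daily_task' from your scheduler.py; the values shown are for a once-a-day schedule):

def daily_periodic_task():
    # Queue the task once; the name must match the key in
    # tasks=dict(my_daily_task=daily_task) from scheduler.py.
    return daily_task_scheduler.queue_task(
        'my_daily_task',
        repeats=0,      # repeat forever
        period=86400,   # once every 24 hours
    )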
Note, you can simultaneously start the built-in web server and a scheduler worker process via:
python web2py.py -a yourpassword -K myapp -X
So, to schedule a daily task and have it actually executed, you need to (a) start a scheduler worker and (b) schedule the task. You can schedule the task by visiting your daily_periodic_task action, but note that you only need to visit that action once, as once the task has been scheduled, it remains in effect indefinitely (given that you have set repeats=0).
If the task does not appear to be working, it is possible there is something wrong with the task itself that is resulting in an error; you can check the db.scheduler_run table for the recorded status and traceback.

Printing PDFs using Python, win32api, and Acrobat Reader 9

I have reports that I am sending to a system that requires them to be in a readable PDF format. I tried all of the free libraries and applications, and the only ones I found that worked were Adobe's Acrobat family.
I wrote a quick script in Python that uses win32api to print a PDF to my printer with the default registered application (Acrobat Reader 9) and then kills the task upon completion, since Acrobat likes to leave the window open when called from the command line.
I compiled it into an executable and pass the values in through the command line
(for example printer.exe %OUTFILE% %PRINTER%); this is then called from a batch file:
import os, sys, win32api, win32print, time
# Command Line Arguments.
pdf = sys.argv[1]
tempprinter = sys.argv[2]
# Get Current Default Printer.
currentprinter = win32print.GetDefaultPrinter()
# Set Default printer to printer passed through command line.
win32print.SetDefaultPrinter(tempprinter)
# Print PDF using default application, AcroRd32.exe
win32api.ShellExecute(0, "print", pdf, None, ".", 0)
# Reset Default Printer to saved value
win32print.SetDefaultPrinter(currentprinter)
# Timer for application close
time.sleep(2)
# Kill application and exit script
os.system("taskkill /im AcroRd32.exe /f")
This seemed to work well for a large volume, ~2000 reports in a 3-4 hour period, but some drop off and I'm not sure if the script is getting overwhelmed or if I should look into multithreading or something else.
The fact that it handles such a large volume with so few drop-offs leads me to believe the issue is not with the script, but I'm not sure if it's an issue with the host system, Adobe Reader, or something else.
Any suggestions or opinions would be greatly appreciated.
Based on your feedback (win32api.ShellExecute() is probably not synchronous), your problem is the timeout: If your computer or the print queue is busy, the kill command can arrive too early.
If your script runs concurrently (i.e. you print all documents at once instead of one after the other), the kill command could even kill the wrong process (i.e. an acrobat process started by another invocation of the script).
So what you need is better synchronization. There are a couple of things you can try:
Convert this into a server script which starts Acrobat once, then sends many print commands to the same process and terminates afterwards.
Use a global lock to make sure that only a single instance of the script ever runs. I suggest creating a folder somewhere; creating a folder is an atomic operation on every file system, and if the folder already exists, the script is active somewhere (see the sketch after this list).
On top of that, you need to know when the job is finished. Use win32print.EnumJobs() for this.
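A minimal sketch combining the folder-based lock with an EnumJobs() wait (the lock path and the 2-second poll interval are assumptions):

import os
import sys
import time
import win32print

LOCK_DIR = r"C:\temp\pdf_print.lock"

def wait_for_queue(printer_name):
    # Poll until the printer reports no queued jobs.
    handle = win32print.OpenPrinter(printer_name)
    try:
        while win32print.EnumJobs(handle, 0, -1, 1):
            time.sleep(2)
    finally:
        win32print.ClosePrinter(handle)

try:
    os.mkdir(LOCK_DIR)  # atomic: fails if another instance holds the lock
except OSError:
    sys.exit("Another instance is already printing.")
try:
    # ... ShellExecute() the PDF here, as in the original script ...
    wait_for_queue(win32print.GetDefaultPrinter())
finally:
    os.rmdir(LOCK_DIR)  # release the lock even if printing fails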
If that fails, another solution could be to install a Linux server somewhere. You can run a Python server on this box which accepts print jobs that you send with the help of a small Python script on your client machine. The server can then print the PDFs for you in the background.
This approach allows you to add any kind of monitoring you like (sending mail when something fails, or a status mail after all jobs have finished).

Python: running command-line servers - they're not listening properly

I'm attempting to start a server app (in Erlang; it opens ports and listens for HTTP requests) from the command line using pexpect (or even directly using subprocess.Popen()).
The app starts fine, logs (via pexpect) to the screen fine, and I can interact with it via the command line as well...
The issue is that the server won't listen for incoming requests. The app listens when I start it manually by typing commands on the command line; starting it via subprocess/pexpect somehow stops the app from listening...
When I start it manually, "netstat -tlp" shows the app as listening; when I start it via Python (subprocess/pexpect), netstat does not register the app...
I have a feeling it has something to do with the environment, the way Python forks things, etc.
Any ideas?
thank you
basic example:
Note:
"-pz" just adds ./ebin to the library search path of the erl VM, where modules are looked up.
"-run" runs moduleName without any parameters.
import pexpect

command_str = "erl -pz ./ebin -run moduleName"
child = pexpect.spawn(command_str)
child.interact()  # Give control of the child to the user
All of this stuff works correctly, which is strange. I have logging inside my code and all the log messages appear as they should. The server wouldn't listen even if I started its process via a bash script, so I don't think it's the Python code that's causing it (that's why I have a feeling it's something about the way the new OS process is started).
It could be to do with the way that command line arguments are passed to the subprocess.
Without more specific code, I can't say for sure, but I had this problem working on sshsplit ( https://launchpad.net/sshsplit )
To pass arguments correctly (in this example "ssh -ND 3000"), you should use something like this:
import subprocess

openargs = ["ssh", "-ND", "3000"]
print("Launching %s" % " ".join(openargs))
p = subprocess.Popen(openargs, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
This will not only allow you to see exactly what command you are launching, but should correctly pass the values to the executable. Although I can't say for sure without seeing some code, this seems the most likely cause of failure (could it also be that the program requires a specific working directory, or configuration file?).
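If the working directory or environment does turn out to be the culprit, both can be pinned explicitly when spawning the process; a small sketch, with the directory and environment values as assumptions:

import os
import subprocess

# Hypothetical: run erl from the project directory with an explicit environment,
# so the spawned process sees the same cwd/env as a manual shell session would.
env = dict(os.environ, HOME="/home/user")
p = subprocess.Popen(["erl", "-pz", "./ebin", "-run", "moduleName"],
                     cwd="/home/user/project", env=env,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)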
