How to run a Python script online every N minutes? - python

To work a bit on my Python, I decided to try coding a simple script for my private use which monitors sites with offers and sends you an email whenever a new offer you are interested in pops up. I guess I could handle the coding part (extracting the newest one from the HTML and such), but I've never actually run a script online that needs to be fired every N minutes or so. What kind of hosting/server do I need to make my script run independently of my computer, refresh every, say, 5 minutes, and send me an email when there's an update?

If you have shell access, you can use crontab to schedule a recurring job.
Otherwise you can use a service like SetCronJob or EasyCron to invoke a script regularly.
Some hosting providers also offer similar functionality in their administration interface.
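If you go the crontab route, a minimal sketch might look like the following; the URL, email addresses, and file paths are placeholders, and the HTML extraction is left to your own parsing code:

# crontab entry (edit with crontab -e), fires every 5 minutes:
# */5 * * * * /usr/bin/python3 /home/you/check_offers.py

import smtplib
import urllib.request
from email.message import EmailMessage

URL = "https://example.com/offers"        # site to monitor (placeholder)
SEEN_FILE = "/home/you/last_offer.txt"    # remembers the last offer seen

def latest_offer():
    html = urllib.request.urlopen(URL).read().decode("utf-8")
    # ... extract the newest offer from the HTML here ...
    return html[:100]

def notify(offer):
    msg = EmailMessage()
    msg["Subject"] = "New offer"
    msg["From"] = "me@example.com"
    msg["To"] = "me@example.com"
    msg.set_content(offer)
    with smtplib.SMTP("localhost") as server:  # assumes a local mail server
        server.send_message(msg)

if __name__ == "__main__":
    offer = latest_offer()
    try:
        last = open(SEEN_FILE).read()
    except FileNotFoundError:
        last = ""
    if offer != last:
        with open(SEEN_FILE, "w") as f:
            f.write(offer)
        notify(offer)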

Related

How to run a script on startup in PostgreSQL?

I tried to create a pgAgent job, but I can't seem to make it work the way I want. I can schedule a maintenance job and put my script there, but that is not exactly what I want to do.
To be more precise, what I want is to start a script that will subscribe to a broker, without the user having to start the script manually. Is there something I can do?
It is a bit unclear what you want to do.
If you want some program to be started right after PostgreSQL, create an appropriate startup script (that depends on your operating system).
If you want something to run in the database right when it is starting up, write a PostgreSQL module in C and add it to shared_preload_libraries in postgresql.conf.
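For the first case, on a systemd-based Linux distribution, a unit roughly like the following would start a subscriber script once PostgreSQL is up; the unit name, paths, and script are placeholders:

# /etc/systemd/system/broker-subscriber.service (hypothetical name)
[Unit]
Description=Start broker subscriber after PostgreSQL
After=postgresql.service
Requires=postgresql.service

[Service]
ExecStart=/usr/bin/python3 /opt/scripts/subscribe.py
Restart=on-failure

[Install]
WantedBy=multi-user.target

Enable it with systemctl enable broker-subscriber.service so it also comes up on boot.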

What is the best way to run python scripts in the background of Ubuntu?

I created a Python script that works as a bot for Instagram (using Selenium).
Currently I have 5 profiles running; for each of them all the files are stored in a folder (named after the IG profile), and for each profile I have a screen session where I can see the "log" of that program.
But now, 5 profiles are difficult to manage and sometimes also a little messy.
Is there a way to see the logs of all 5 scripts in a single window?
I'm also open to another way of running the scripts in the background, maybe not "screen" but something else.
Thank you
If you really want to go the clean way, and if you think this will get bigger, you might want to have a look at Django and Celery.
You can create a web interface, so that you can monitor things any way you like.
And you can have cron-like jobs with Celery so that your bot is always on, has recurring tasks, etc.
More info in their respective docs, as usual: http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html
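As a rough sketch of what a recurring Celery task could look like (the broker URL, module name, and profile names are assumptions, not anything from your setup):

# tasks.py: a minimal Celery app with one periodic job per profile
from celery import Celery

app = Celery("bots", broker="redis://localhost:6379/0")  # assumes a local Redis broker

@app.task
def run_bot(profile):
    # start or poke the Selenium bot for this profile here
    print("running bot for", profile)

app.conf.beat_schedule = {
    "run-profile1-every-5-min": {
        "task": "tasks.run_bot",
        "schedule": 300.0,  # seconds
        "args": ("profile1",),
    },
}

# start everything with: celery -A tasks worker --beat --loglevel=info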

Save a script's variables inside the code and restore them after a reboot

On my VPS I have 4 Python scripts running, and it's been 60 days since I rebooted the VPS. Now I have to, but if I reboot the VPS my Python variables and data will be lost, because I don't store them in a file; they only live in variables inside the Python scripts.
My OS is Ubuntu Server 16.04 LTS, and I ran my Python code with the nohup command so that it keeps running in the background.
Now I need a way to stop my scripts without losing their variables, and to start them again with the same variables and data after I reboot the VPS.
Is there any way I can do this?
Also, I'm sorry for the writing mistakes in my question.
Python doesn't provide any way of doing this.
But you might be able to use CRIU, or a similar tool, to freeze and snapshot the interpreter process. Then, after restart, you can resume the snapshot into a new process that just picks up exactly where you left off.
It may not work.1 But there's a good chance it will. This is essentially the same thing as a Live Migration in the CRIU docs, except that you're not migrating to a new computer/container/etc., just to the future of the same computer. So, start reading with that page, and follow the links from there.
You should probably test before you commit to it.
* Try it (obviously without the system restart; just kill -9 the process) on a Python script that doesn't do anything important (maybe one that increments a counter, prints it out, sleeps for a second, and repeats).
* Maybe try it on a script that does similar kinds of stuff to what yours are doing.
* If it's safe to have two copies of one of your programs running at the same time (they're not going to stomp all over each other writing to the same file, or fight over the same socket, or whatever), start a second copy and test dump/kill/resume that.
* Try it on one of your real processes, still without restart.
* Try it on all four.
* Cross your fingers, sacrifice a chicken, and do it for real.
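The dump/kill/resume step from the list above, in CRIU command-line terms, looks roughly like this (the PID and image directory are placeholders; CRIU typically needs root, and --shell-job for processes attached to a terminal):

criu dump -t 12345 -D /tmp/snapshot --shell-job     # snapshot the process (it is stopped after the dump)
criu restore -D /tmp/snapshot --shell-job           # later: resume from the snapshot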
If that doesn't pan out, the only option I can think of is to go through your scripts, manually figure out everything that needs to be saved and how it could be accessed from the top-level global, and do that in the debugger.
Ideally, you'll write a script that will automate accessing and saving all that stuff—plus another one to feed it into a new instance at restart. Then you just pdb the live interpreters and start dumping everything.
This is guaranteed to be a whole lot of work, and not much fun. On the plus side, it is guaranteed to work if you do it right. On the third hand, it's pretty easy to not do it right.
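A minimal sketch of the saving half, assuming the state you care about is plain, picklable data sitting in module-level globals (the function names are hypothetical, not part of any library):

import pickle

def dump_state(module_globals, path):
    # keep only values that pickle cleanly; skip modules, functions, open files, etc.
    state = {}
    for name, value in module_globals.items():
        if name.startswith("_"):
            continue
        try:
            pickle.dumps(value)
        except Exception:
            continue
        state[name] = value
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_state(module_globals, path):
    # call this at startup in the new process to put the saved values back
    with open(path, "rb") as f:
        module_globals.update(pickle.load(f))

# from pdb inside a live interpreter, something like:
#   import saver; saver.dump_state(globals(), "/tmp/script1.state")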
1. If you rely on open files, pipes, sockets, etc., CRIU does about as much as you could do, which is more than you might expect at first, but still not everything you could possibly want… Also, if you're using almost all of your RAM, it can be hard to wedge things back into exactly the same state. And there are probably other possible issues.

Run a Python script automatically every week (Windows)

I wrote a little Python script that fetches data from some website using hard-coded credentials (I know that's bad, but it's not part of this question).
The website has new data every day, and I'm gathering data from a whole week and parsing it into a single .pdf.
I've already adjusted the script to generate a PDF of the last week by default (no params needed).
I'm kind of lazy and don't want to run the script by hand every week.
So is it possible to run the script at certain times, for example every Monday at 10 am?
Sure, just use Windows' Task Scheduler. There you can create new tasks to your delight and have it run commands at whatever times or intervals you want. The Task Scheduler's GUI should be self-explanatory, but to be concrete for your example:
* Configure the run time (weekly, Monday, 10 am) under Triggers
* Add a new action and give it your Python interpreter as the command and your script as the argument
* Configure the rest according to your needs
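The same task can also be created from the command line with schtasks; the interpreter and script paths below are placeholders:

schtasks /Create /TN "WeeklyReport" /SC WEEKLY /D MON /ST 10:00 /TR "C:\Python39\python.exe C:\scripts\weekly_report.py"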

GAE Backend fails to respond to start request

This is probably a truly basic thing that I'm simply having an odd time figuring out in a Python 2.5 app.
I have a process that will take roughly an hour to complete, so I made a backend. To that end, I have a backends.yaml with something like the following:
backends:
- name: mybackend
  options: dynamic
  start: /path/to/script.py
(The script is just raw computation. There's no notion of an active web session anywhere.)
On toy data, this works just fine.
This used to be public, so I would navigate to the page, the script would start, and it would time out after about a minute (the HTTP timeout plus the 30s shutdown grace period, I assume). I figured this was a browser issue, so I repeated the same thing with a cron job. No dice. Then I switched to using a push queue and adding a targeted task, since on paper it looks like it would wait for 10 minutes. Same thing.
All 3 time out after that minute, which means I'm not decoupling the request from the backend like I believe I am.
I'm assuming that I need to write a proper handler for the backend to do the work, but I don't exactly know how to write the handler/webapp2 route. Do I handle _ah/start/ or make a new endpoint for the backend? How do I handle the subdomain? It still seems like the wrong thing to do (I'm sticking a long process directly into a request of sorts), but I'm at a loss otherwise.
So the root cause ended up being doing the following in the script itself:
models = MyModel.all()
for model in models:
    # Magic happens
I was basically taking it for granted that the query would automatically batch my Query.all() over many entities, but it was dying at around the 1000th entity. I originally wrote that it was computation only because I completely ignored the fact that the reads can fail.
The actual solution to the problem we wanted to solve ended up being "use the mapreduce library", since we were trying to look at every model for analysis.
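For reference, batching manually with query cursors instead of a bare Query.all() (old google.appengine.ext.db API) would look roughly like this sketch; MyModel and the batch size are placeholders, and the mapreduce library remains the better fit for this kind of full-dataset analysis:

query = MyModel.all()
results = query.fetch(500)
while results:
    for model in results:
        # Magic happens, one batch at a time
        pass
    cursor = query.cursor()                   # position just after the last fetched entity
    query = MyModel.all().with_cursor(cursor)
    results = query.fetch(500)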
