Python watchdog script: load URLs asynchronously

I have a simple Python script which checks a few URLs:
f = urllib2.urlopen(urllib2.Request(url))
Since I have the socket timeout set to 5 seconds, it is sometimes annoying to wait up to 5 seconds times the number of URLs for the results.
Is there an easy, standard way to run those URL checks asynchronously without much overhead? The script must use only standard Python components on a vanilla Ubuntu distribution (no additional installations).
Any ideas?

I wrote something called multibench a long time ago. I used it for almost the same thing you want to do here, which was to call multiple concurrent instances of wget and see how long it takes to complete. It is a crude load testing and performance monitoring tool. You will need to adapt this somewhat, because this runs the same command n times.
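For the URL checks in the question, the same idea of running the work concurrently can be sketched with the standard library alone. This is a minimal sketch, not the multibench tool described above. It assumes Python 3 (urllib.request and concurrent.futures; the question's urllib2 is Python 2 only, where concurrent.futures is not part of the standard library), and the names check_url and check_all are made up for illustration.

```python
import concurrent.futures
import urllib.request


def check_url(url, timeout=5):
    """Return (url, HTTP status) on success or (url, "error: ...") on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return url, response.status
    except Exception as exc:  # URLError, timeouts, malformed URLs, ...
        return url, "error: %s" % exc


def check_all(urls, check=check_url, max_workers=10):
    """Run `check` on every URL in a thread pool and collect the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the checks concurrently and yields results in input order.
        return dict(pool.map(check, urls))
```

With enough workers, the total wait should be close to the slowest single check (at most roughly the 5 second timeout) rather than the sum of all of them. Calling check_all(list_of_urls) returns a dict mapping each URL to its status or error text.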

Install additional software. It's a waste of time to reinvent something just because of packaging decisions made by someone else.

Related

How do I make my python script less laggy?

I am new to Python and I've just created this script:
import os
import os.path
import time
while True:
    if os.path.isfile('myPathTo/shutdown.svg'):
        os.remove('myPathTo/shutdown.svg')
        time.sleep(1)
        os.system('cd C:\Windows\PSTools & psshutdown -d -t 0')
As you can see, this script is very short, and I think there is a way to make it less laggy. On my PC, it is using about 30% of my processor:
[Image: Python stats on my PC]
I don't really know why it is using so many resources, so I need your help :)
A little explanation of the program:
I'm using IFTTT to send a file (shutdown.svg) to my Google Drive, which is synchronized to my PC, when I ask Google Home to shut down my PC.
When Python detects the file, it has to remove it and shut down the PC. I've added time between these actions to make sure the script does not check the file too many times, to reduce lag. Maybe 1 second is too short?
"I've added time between these actions to make sure the script does not check the file too many times to reduce lag"
This loop sleeps for 1 second only when the file is found, just before shutting down. While the file is absent it never sleeps, so it checks again and again at full speed. Move sleep(1) out of the if block so that it runs on every iteration.
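A minimal sketch of the corrected loop, with the sleep on every pass rather than only when the file is found. The function name wait_and_remove and the poll_interval parameter are made up for illustration, and the shutdown command itself is left out.

```python
import os
import time


def wait_and_remove(path, poll_interval=1.0):
    """Block until `path` exists, then delete it and return."""
    while not os.path.isfile(path):
        # Sleeping on every pass is what keeps the CPU idle between checks.
        time.sleep(poll_interval)
    os.remove(path)
```

Once wait_and_remove('myPathTo/shutdown.svg') returns, the script can run the shutdown command from the question.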
"Maybe 1 second is too short?"
If you can, make this sleep time as long as you can tolerate.
If your only task is to shut down the PC, there are many other ways to watch for an update, such as a cron job that runs a script at regular intervals, or a lightweight server.

Is it possible to see what a Python process is doing?

I created a Python script that grabs some info from various websites. Is it possible to analyze how long it takes to download the data and how long it takes to write it to a file?
I am interested in knowing how much it could improve if I ran it on a better PC (it is currently running on a crappy old laptop).
If you just want to know how long a process takes to run, the time command is pretty handy. Just run time <command> and it will report how long the command took, broken down into a few categories: wall-clock time, system (kernel) time, and user-space time. This won't tell you which parts of the script are taking up the time. You can always use a profiler if you need that type of information.
That said, as Barmar said, if you aren't doing much processing of the sites you are grabbing, the laptop is probably not going to be a limiting factor.
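For the per-function breakdown that a profiler gives, one option is the standard library's cProfile module. The following is a minimal sketch, assuming Python 3; the helper name profile_call is made up for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

Printing the returned report shows which functions the time went to. A whole script can also be profiled without code changes by running python -m cProfile -s cumulative on it.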
You can always store the system time in a variable before the block of code you want to test, store it again afterwards, and then compare the two.
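A minimal sketch of that approach, assuming Python 3, where time.perf_counter is a clock intended for measuring intervals. The helper name timed is made up for illustration.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Wrapping the download step and the file-writing step separately, for example data, download_seconds = timed(download, url) with a hypothetical download function, shows how the total time splits between the network and the disk.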

How do I load up python in the shell from Vim, but not use it right away?

I'm doing TDD, but the system I'm working with takes 6 seconds to get through boilerplate code. This code is not part of my work, nor of my tests (it's Autodesk Maya's headless/batch/CLI Python mode). I've talked to support, and there's no way around the load time, so I thought maybe I could load and initialize Python first in the background, as I code, and then my mapping would simply run the nosetests inside of that when I'm ready. My tests take something like 0.01 seconds, so this should feel instantaneous, which would really help the red/green/refactor cycle.
In short, instead of firing off /path/to/mayapy /path/to/runtests.py /current/buffer/path, Vim would just fire up /path/to/mayapy with the boilerplate stuff from runtests.py, then somehow hold onto that running instance. When I hit my mapping, it would send into that running instance the call to nosetest with the current buffer's path (and then fire up another instance to hold onto while waiting for the next run). How do I hold onto the running instance and call into it later? I'm even considering having a chain of 2 or 3, for the times when I make minor mistakes and rerun 2 seconds later.
Vim-ipython, the excellent work of Paul Ivanov, is an interface between vim and ipython sessions (demo video). This may relieve you of some of the boilerplate of sending buffers to python and waiting on results.
I'm not entirely sure this is exactly what you want, but with a bit of Python and Vim glue code it may be a good step in the right direction. I'm guessing you'd need to experiment a bit to get a workflow you're happy with.

GAE Backend fails to respond to start request

This is probably a truly basic thing that I'm simply having an odd time figuring out in a Python 2.5 app.
I have a process that will take roughly an hour to complete, so I made a backend. To that end, I have a backend.yaml that has something like the following:
- name: mybackend
  options: dynamic
  start: /path/to/script.py
(The script is just raw computation. There's no notion of an active web session anywhere.)
On toy data, this works just fine.
This used to be public, so I would navigate to the page, the script would start, and it would time out after about a minute (HTTP + 30s shutdown grace period, I assume). I figured this was a browser issue, so I repeated the same thing with a cron job. No dice. I then switched to using a push queue and adding a targeted task, since on paper it looks like it would wait for 10 minutes. Same thing.
All 3 time out after that minute, which means I'm not decoupling the request from the backend like I believe I am.
I'm assuming that I need to write a proper Handler for the backend to do work, but I don't exactly know how to write the Handler/webapp2Route. Do I handle _ah/start/ or make a new endpoint for the backend? How do I handle the subdomain? It still seems like the wrong thing to do (I'm sticking a long-process directly into a request of sorts), but I'm at a loss otherwise.
So the root cause ended up being doing the following in the script itself:
models = MyModel.all()
for model in models:
    # Magic happens
I was basically taking for granted that the query would automatically batch my Query.all() over many entities, but it was dying at the 1000th entry or so. I originally wrote that the script was computation only because I completely ignored the fact that the reads can fail.
The actual solution to our problem ended up being "use the map-reduce library", since we were trying to look at each model for analysis.

Long-running Python script

I have an application made of the following parts:
client->nginx->uwsgi(python)
Some of the Python scripts can run for a long time (2-6 minutes). After a script finishes I should return its content to the client, but the connection breaks with the error "504 Gateway Timeout". What can I use in my case to avoid this error?
So is your goal to reduce the run time of the scripts, or to not have them time out? Browsers are going to give up on a 6 minute request no matter what you try.
Perhaps try doing the work on the server, and then polling for progress with AJAX requests?
Or, if possible, try optimizing the scripts. For example, if you have some horribly slow SQL stuff going on, try cleaning that up.
Otherwise, without more information, a more specific answer is hard to give.
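The polling suggestion above could be sketched roughly as follows: one request starts the work and returns a job id immediately, and later requests (the AJAX polls) ask for the job's state. This is a sketch under stated assumptions, not a drop-in solution. The names start_job and job_status are made up, the job table lives in the memory of a single process, and threads must be enabled in the server; with several uwsgi workers the state would need to live somewhere shared, such as a database or a file.

```python
import threading
import uuid

_jobs = {}
_jobs_lock = threading.Lock()


def start_job(func, *args):
    """Start func(*args) in a background thread and return a job id at once."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"state": "running", "result": None}

    def runner():
        try:
            outcome = {"state": "done", "result": func(*args)}
        except Exception as exc:
            outcome = {"state": "failed", "result": str(exc)}
        with _jobs_lock:
            _jobs[job_id] = outcome

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def job_status(job_id):
    """Return a copy of the job record, or an 'unknown' record for a bad id."""
    with _jobs_lock:
        return dict(_jobs.get(job_id, {"state": "unknown", "result": None}))
```

Each individual HTTP request then finishes quickly, so neither nginx nor the browser has to hold a connection open for minutes.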
I once set up a system where the "main page" contained an iframe which showed the output of the long-running program as text/plain. I think the handler for the iframe content was a Python CGI script which emitted all headers and then the program output line by line under an Apache server.
I don't know whether this would work under your configuration.
This heavily depends on your server setup (i.e. how easy it is to push data back to the client), but is it possible, while running your lengthy application, to periodically send some "null" content (e.g. plain newlines, assuming your output is HTML) so that the browser thinks this is just a slow connection and not a stalled one?
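As a rough illustration of that idea, a WSGI application can return a generator that emits a newline at regular intervals while the real work runs in a thread, and then emits the real output. This is a sketch under assumptions: keepalive_body and make_app are made-up names, the work function is expected to return bytes, and it only helps if the proxy in front passes the chunks through. nginx buffers uwsgi responses by default (the uwsgi_buffering directive), while its read timeout applies between two successive reads, so periodic output is what keeps that timeout from firing.

```python
import threading


def keepalive_body(work, interval=1.0):
    """Yield b"\n" every `interval` seconds until work() is done, then its output."""
    result = {}

    def runner():
        result["body"] = work()

    thread = threading.Thread(target=runner)
    thread.start()
    while thread.is_alive():
        thread.join(interval)  # wait up to `interval` seconds for the work
        if thread.is_alive():
            yield b"\n"  # filler so the connection does not look stalled
    # If work() raised, there is no body; fall back to an empty chunk.
    yield result.get("body", b"")


def make_app(work):
    """Build a WSGI application that streams keepalive_body(work)."""
    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return keepalive_body(work)
    return application
```

Because the status line and headers are sent before the work finishes, a failure inside the work can no longer be reported as an HTTP error status, which is one trade-off of this approach.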
