What exactly is happening when I call time.sleep(5) in a python script? Is the program using a lot of resources from the computer?
I see people using the sleep function in their programs to schedule tasks, but this requires you leave your hard drive running the whole time right? That would be taking for you computer over the long haul right?
I'm trying to figure out what's to run programs at specific times remotely, but I haven't found an explanation of how to do this that is very intuitive.
Any suggestions?
sleep will mark the process (thread) for being inactive until the given time is up. During this time the kernel will simply not schedule this process (thread). It will not waste resources.
Hard disks typically have spin-down policies based solely on their usage. If they aren't accessed for a specific time, they will spin down. They will spin up as soon as some process (thread) is accessing them again.
This means that letting a process (thread) sleep for some time gives the hard disk a chance to spin down (especially if the sleep duration is large, say, more than some minutes).
Related
We have been using Airflow for a while, it is just great.
Now we are considering moving some of our very frequent tasks into our airflow server too.
Let's say I have a script running every second.
What's the best practice to schedule it with airflow:
Run this script in DAG that is scheduled every second. I highly doubt this will be the solution, there is significant overhead for a DAGRUN
Run this script in a while loop that stops after 6 hours, then schedule it on Airflow to be run every 6 hour?
Create a DAG with no schedule, put the task in a while True loop with proper sleep time, so the task will never terminates unless there is an error.
Any other suggestions?
Or this kind of task is just not suitable for Airflow? should do it with a lambda function and AWS scheduler?
Cheers!
What's the best practice to schedule it
... this kind of task is just not suitable for Airflow?
It is not suitable.
In particular, your airflow is probably configured to re-examine the set of DAGs every 5 seconds, which doesn't sound like a good fit for a 1-second task. Plus the ratio of scheduling overhead to work performed would not be attractive. I suppose you could schedule five simultaneous tasks, twelve times per minute, and have them sleep zero to four seconds, but that's just crazy. And likely you would need to "lock against yourself" to avoid having simultaneous sibling tasks step on each other's toes.
The six-hour suggestion (2.) is not crazy. I will view it as a sixty-minute #hourly task instead, since overheads are similar. Exiting after an hour and letting airflow respawn has several benefits. Log rolling happens at regular intervals. If your program crashes, it will be restarted before too long. If your host reboots, again your program is restarted before too long. Downside is that your business need may view "more than a minute" as "much too long". And coordinating overlapping tasks, or gap between tasks, at the hour boundary may pose some issues.
Your stated needs exactly match the problem that Supervisor addresses. Just use that. You will always have exactly one copy of your event loop running, even if the app crashes, even if the host crashes. Log rolling and other administrative details have already been addressed. The code base is mature and lots of folks have beat on it and incorporated their feature requests. It fits what you want.
I am trying to get my head around threading vs. CPU usage. There are plenty of discussions about threading vs. multiprocessing (a good overview being this answer) so I decided to test this out by launching a maximum number of threads on my 8 CPU laptop running Windows 10, Python 3.4.
My assumption was that all the threads would be bound to a single CPU.
EDIT: it turns out that it was not a good assumption. I now understand that for multithreaded code, only one piece of python code can run at once (no matter where/on which core). This is different for multiprocessing code (where processes are independent and run indeed independently).
While I read about these differences, it is one answer which actually clarified this point.
I think it also explains the CPU view below: that it is an average view of many threads spread out on many CPUs, but only one of them running at one given time (which "averages" to all of them running all the time).
It is not a duplicate of the linked question (which addresses the opposite problem, i.e. all threads on one core) and I will leave it hanging in case someone has a similar question one day and is hopefully helped by my enlightenment.
The code
import threading
import time
def calc():
time.sleep(5)
while True:
a = 2356^36
n = 0
while True:
try:
n += 1
t = threading.Thread(target=calc)
t.start()
except RuntimeError:
print("max threads: {n}".format(n=n))
break
else:
print('.')
time.sleep(100000)
Led to 889 threads being started.
The load on the CPUs was however distributed (and surprisingly low for a pure CPU calculation, the laptop is otherwise idle with an empty load when not running my script):
Why is it so? Are the threads constantly moved as a pack between CPUs and what I see is just an average (the reality being that at a given moment all threads are on one CPU)? Or are they indeed distributed?
As of today it is still the case that 'one thread holds the GIL'. So one thread is running at a time.
The threads are managed on the operating system level. What happens is that every 100 'ticks' (=interpreter instruction) the running thread releases the GIL and resets the tick counter.
Because the threads in this example do continuous calculations, the tick limit of 100 instructions is reached very fast, leading to an almost immediate release of the GIL and a 'battle' between threads starts to acquire the GIL.
So, my assumption is that your operating system has a higher than expected load , because of (too) fast thread switching + almost continuous releasing and acquiring the GIL. The OS spends more time on switching than actually doing any useful calculation.
As you mention yourself, for using more than one core at a time, it's better to look at multiprocessing modules (joblib/Parallel).
Interesting read:
http://www.dabeaz.com/python/UnderstandingGIL.pdf
Um. The point of multithreading is to make sure they work gets spread out. A really easy cheat is to use as many threads as you have CPU cores. The point is they are all independent so they can actually run at the same time. If they were on the same core only one thread at a time could really run at all. They'd pass that core back and forth for processing at the OS level.
Your assumption is wrong and bizarre. What would ever lead you to think they should run on the same CPU and consequently go at 1/8th speed? As the only reason to thread them is typically to get the whole batch to go faster than a single core alone.
In fact, what the hell do you think writing parallel code is for if not to run independently on several cores at the same time? Like this would be pointless and hard to do, let's make complex fetching, branching, and forking routines to accomplish things slower than one core just plugging away at the data?
I am running an os.system(cmd) in a for-loop. Since sometimes it hangs, I am trying to use process=subprocess.pOpen(cmd) in a for-loop. But I want to know the following:
If I do sleep(60) and then check if the process is still running by using process.poll(), how do I differentiate between process actually running even after 1 minute and process that hung?
If I kill the process which hung, will the for-loop still continue or will it exit?
Thanks!
I don't know of any general way to tell whether a process is hung or working. If a process hangs due to a locking issue, then it might consume 0% CPU and you might be able to guess that it is hung and not working; but if it hangs with an infinite loop, the process might make the CPU 100% busy but not accomplish any useful work. And you might have a process communicating on the network, talking to a really slow host with long timeouts; that would not be hung but would consume 0% CPU while waiting.
I think that, in general, the only hope you have is to set up some sort of "watchdog" system, where your sub-process uses inter-process communication to periodically send a signal that means "I'm still alive".
If you can't modify the program you are running as a sub-process, then at least try to figure out why it hangs, and see if you can then figure out a way to guess that it has hung. Maybe it normally has a balanced mix of CPU and I/O, but when it hangs it goes in a tight infinite loop and the CPU usage goes to 100%; that would be your clue that it is time to kill it and restart. Or, maybe it writes to a log file every 30 seconds, and you can monitor the size of the file and restart it if the file doesn't grow. Or, maybe you can put the program in a "verbose" mode where it prints messages as it works (either to stdout or stderr) and you can watch those. Or, if the program works as a daemon, maybe you can actively query it and see if it is alive; for example, if it is a database, send a simple query and see if it succeeds.
So I can't give you a general answer, but I have some hope that you should be able to figure out a way to detect when your specific program hangs.
Finally, the best possible solution would be to figure out why it hangs, and fix the problem so it doesn't happen anymore. This may not be possible, but at least keep it in mind. You don't need to detect the program hanging if the program never hangs anymore!
P.S. I suggest you do a Google search for "how to monitor a process" and see if you get any useful ideas from that.
A common way to detect things that have stopped working is to have them emit a signal at roughly regular intervals and have another process monitor the signal. If the monitor sees that no signal has arrived after, say, twice the interval it can take action such as killing and restarting the process.
This general idea can be used not only for software but also for hardware. I have used it to restart embedded controllers by simply charging a capacitor from an a.c. coupled signal from an output bit. A simple detector monitors the capacitor and if the voltage ever falls below a threshold it just pulls the reset line low and at the same time holds the capacitor charged for long enough for the controller to restart.
The principle for software is similar; one way is for the process to simply touch a file at intervals. The monitor checks the file modification time at intervals and if it is too old kills and restarts the process.
In OP's case the subprocess could write a status code to a file to say how far it has got in its work.
I have written a data munging script that is very CPU intensive. It has been running for a few days now, but now (thanks to trace messages sent to the console), I can see that it is not working (actually, has not been working for the last 10 hours or so.
When I run top, I notice that the process is either sleeping (S) or in uninterreptable sleep (D). This is wasting a lot of time.
I used sudo renice -10 PID to change the process's nice value, and after running for a short while, I notice that the process has gone back to sleep again.
My question(s):
Is there anything I can do to FORCE the script to run until it finishes (if even it means the machine is unusable until the end of the script?
Is there a yield command I can use in Python, which allows me to periodically pass control to other process/threads to stop the scheduler from trying to put my script to sleep?.
I am using python 2.7.x on Ubuntu 10.0.4
The scheduler will only put your process on hold if there is another process ready to run. If you have no other processes which hog up the CPU, your process will be running most of the time. The scheduler does not put your process to sleep just because it feels like it.
My guess is that there is some reason your process is not runnable, e.g. it is blocking and waiting for I/O or data.
Question
I need to make a short write to a file every 15 seconds (and sleep the rest of the time)... it seems to me that multithreading or multiprocessing would be useful to address this, by having a dedicated thread or process to do the file write. Which would be better in terms of timing/reliability, as well as memory footprint?
Background
I am writing a small Python application for a Chumby (so limited memory availability -- 128MB total system memory); to stop the default Chumby control panel from restarting after I've killed it, a temp file needs to be written to every 15 seconds or so to "fool" the watchdog process that would ordinarily restart the control panel. The main application may be busy doing other things, and I don't want to try to have to "watch the clock" as it's doing its other things to make sure it squeezes in the temp file write.
The Chumby seems to be Linux-based, so signal.setitimer() should be available (provided you have access to Python 2.6 or above).
This function allows you to install a handler that is periodically called, so you don't need a thread or process.