Prevent freezing when running python scripts from terminal - python

I am running some Python scripts in my Linux terminal that happen to be pretty resource intensive, and while they run my system becomes nearly unresponsive until the process has completed. I know there are commands like nice and cpulimit, but I haven't found a good way to open a terminal that is itself resource limited (with a configurable percentage of resources devoted to it) and that can be used to run any script during that session.
So is there a good way to do this?
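A minimal sketch of the nice idea mentioned in the question, done from Python rather than from the shell. It assumes a POSIX system. The helper name run_low_priority is made up for illustration. Niceness is inherited by child processes, so anything the child starts is deprioritized as well. This only lowers CPU scheduling priority; it does not cap memory or I/O.

```python
import os
import subprocess
import sys


def run_low_priority(cmd, increment=10):
    """Run cmd with its niceness raised by `increment` (POSIX only).

    Hypothetical helper: the name and the default increment are
    illustrative, not taken from the question.
    """
    return subprocess.run(
        cmd,
        preexec_fn=lambda: os.nice(increment),  # runs in the child before exec
        capture_output=True,
        text=True,
    )


# The child reports its own niceness; os.nice(0) reads it without changing it.
result = run_low_priority([sys.executable, "-c", "import os; print(os.nice(0))"])
print("child niceness:", result.stdout.strip())
```

The same inheritance is why starting a shell under nice affects every script run in that session.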

Related

How to analyse performance of python without completing it?

I was writing a Python script using a third-party module, and the workload was so big that the OS (Linux with 32GB memory) killed it every time before it could complete. We learned from syslog that it ran out of physical memory, so the kernel's OOM killer terminated it.
Many current performance analysis tools, e.g. profile, require the script to complete and cannot go into the modules that the script uses. I reckon this is a common case: the script cannot complete, and performance analysis is desperately needed in exactly that situation. Any advice?
From the original question:
Profile is an amazing tool for performance analysis and does not require completion, and can go into the module that the script used. I think for this question, the best answer is to use profile.
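A small sketch of profiling a run that never finishes, using the standard library's cProfile and pstats. The functions busy_work, workload and profile_partial are hypothetical stand-ins, and the failure is simulated with a raised MemoryError. Note the limitation: a process killed by the OOM killer receives SIGKILL, which cannot be caught, so this only helps when the run ends with an exception or an interrupt.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical stand-in for an expensive call into a third-party module."""
    return sum(i * i for i in range(n))


def workload():
    """Hypothetical workload that dies partway through."""
    busy_work(200_000)
    raise MemoryError("simulated failure before completion")


def profile_partial(func):
    """Profile func and return a text report even if func does not complete."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    except (MemoryError, KeyboardInterrupt):
        pass  # the run did not complete; keep what was collected so far
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


report = profile_partial(workload)
print(report)
```

The report lists the calls made before the failure, including those inside called functions.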

How to automatically rerun a python program after it finishes? Supervisord?

I have a python program that I would like to constantly be running updates and gathering new data. Essentially, I am gathering data from a bunch of domains. My processors take about a day and a half to run. Once they finish, I'd like them to automatically start over again.
I don't want to use a while loop to just restart the processes without killing everything related first because some of the packages that I am using to support these processors (mainly pyV8) have a problem of memory slowly accumulating and I'm not a good enough programmer to dive into debugging a memory leak in a big package like that. So, I need all of the related processes to successfully die and then come back to life.
I have heard that supervisord can do this type of work, but don't like messing around with .conf files and would prefer to keep everything inside of python.
Summary: Is there a package that will kill a script and all of its related processes, so that I could put it in a while loop or otherwise create this behavior inside a Python script?
I don't see why you couldn't use supervisord. The configuration is really simple and very flexible and it's not limited to python programs.
For example, you can create file /etc/supervisor/conf.d/myprog.conf:
[program:myprog]
command=/opt/myprog/bin/myprog --opt1 --opt2
directory=/opt/myprog
user=myuser
Then reload supervisor's config:
$ sudo supervisorctl reload
and it's on. Isn't it simple enough?
More about supervisord configuration: http://supervisord.org/subprocess.html
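For readers who still prefer to stay inside Python, as the question asks, here is a rough sketch of the restart loop. It is not what the answer above recommends, and it assumes a POSIX system. The names run_once and run_forever are made up for illustration. The child is started in its own session so that its process group can be killed as a whole once it exits. Processes that move themselves into another session would escape this cleanup.

```python
import os
import signal
import subprocess
import sys
import time


def run_once(cmd):
    """Run cmd in its own session, then kill whatever is left in its group."""
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait()
    finally:
        try:
            # The child is a session leader, so its PID is also its
            # process-group ID; this reaches leftover helper processes.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # nothing left in the group


def run_forever(cmd, pause_seconds=5):
    """Restart cmd every time it finishes (hypothetical helper)."""
    while True:
        code = run_once(cmd)
        print(f"{cmd!r} exited with {code}; restarting", file=sys.stderr)
        time.sleep(pause_seconds)


# Example call (not executed here, since it never returns):
# run_forever([sys.executable, "/opt/myprog/bin/myprog"])
```

Because every run is a fresh process, memory leaked by a library during one run is released when that run ends.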

How to embed a Python interpreter on a website

I am attempting to build an educational coding site, similar to Codecademy, but I am frankly at a loss as to what steps should be taken. Could I be pointed in the right direction in including even a simple python interpreter in a webapp?
One option might be to use PyPy to create a sandboxed python. It would limit the external operations someone could do.
Once you have that set up, your website would take the code source, send it over ajax to your webserver, and the server would run the code in a subprocess of a sandboxed python instance. You would also be able to kill the process if it took longer than say 5 seconds. Then you return the output back as a response to the client.
See these links for help on a PyPy sandbox:
http://doc.pypy.org/en/latest/sandbox.html
http://readevalprint.com/blog/python-sandbox-with-pypy.html
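The server-side step described above (run the submitted code in a subprocess, kill it after about 5 seconds, return the output) can be sketched as follows. This is only the timeout plumbing and is NOT a sandbox: it uses the ordinary interpreter via sys.executable, which in a real deployment would be replaced by the sandboxed interpreter. The function name run_submission and the status strings are assumptions for illustration.

```python
import subprocess
import sys


def run_submission(source, timeout_seconds=5):
    """Run source in a child interpreter and return (status, output).

    WARNING: no sandboxing is done here. sys.executable is a placeholder
    for a sandboxed interpreter.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-I", "-c", source],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed on expiry
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    status = "ok" if result.returncode == 0 else "error"
    return status, result.stdout + result.stderr


print(run_submission("print(1 + 1)"))
print(run_submission("while True: pass", timeout_seconds=1))
```

The returned pair is what the web handler would serialize back to the client.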
Creating a fully interactive REPL would be even more involved. You would need to keep an interpreter alive for each client on your server, accept ajax "lines" of input, run them through the interpreter by communicating with the running process, and return the output.
Overall, not trivial. You would need some strong dev skills to do this comfortably. You may find this task a bit daunting if you are just learning.
There's more to do here than you think.
The major problem is that you cannot let people run arbitrary Python code on your webserver. For example, what happens if they do
import os
os.system("rm -rf *.*")
So clearly you have to run this Python code securely. But then you have the problem of securing Python, which is basically impossible because of how dynamic it is. And so you'll probably have to run the Python shell in a virtual machine, which comes with its own headaches.
Have you seen e.g. http://code.google.com/p/google-app-engine-samples/downloads/detail?name=shell_20091112.tar.gz&can=2&q=?
One recent option for this is to use repl.it.
This option is attractive because the compilers are written in JavaScript, so compilation and execution happen on the user's side, meaning that your server never runs the submitted code.
They have compilers for: Python3, Python, Javascript, Java, Ruby, PHP...
I strongly recommend checking their site at http://repl.it
Look into LXC containers. They have a pretty cool API that you can use to create lightweight Linux containers. You could run the subprocess commands inside such a container, so that the end user could not mess with your main server.

Working implementation of daemon in Python

Does anyone know of a working and well documented implementation of a daemon using python? Please post a link here if you know of a project that fits these two requirements.
Three options I can think of:
1. Make a cron job that calls your script. Cron is a common name for a GNU/Linux daemon that periodically launches scripts according to a schedule you set. You add your script to a crontab or place a symlink to it in a special directory, and the daemon handles launching it in the background. You can read more at Wikipedia. There are several different cron daemons, but your GNU/Linux system should already have one installed.
2. Use a Pythonic approach (a library, for example) so that your script can daemonize itself. Yes, it will require a simple event loop (where your events are timer triggers, possibly provided by the sleep function). Here is the one I recommend and use: A simple unix/linux daemon in Python
3. Use the Python multiprocessing module. The nitty-gritty of forking a process etc. is hidden in this implementation. It's pretty neat.
I wouldn't recommend options 2 or 3, because you would in fact be repeating cron functionality. The Linux system paradigm is to let multiple simple tools interact and solve your problems. Unless there are additional reasons why you should make a daemon (beyond triggering periodically), choose cron.
Also, if you daemonize with a loop and a crash happens, make sure that you have logs that will help you debug, and devise a way for the script to start again. If the script is added as a cron job, by contrast, it will simply be triggered again at the next scheduled interval.
If you just want to run a daemon, consider Supervisor, a daemon that itself controls and manages daemons.
If you want to look at the nitty-gritty, you can check out Supervisor's launch script or some of the responses to this lazyweb request.
Check this link for a double-fork daemon: http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
The code is readable and well-documented. For detailed information on Unix daemons, take a look at chapter 13 of W. Richard Stevens's book 'Advanced Programming in the Unix Environment'.
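For orientation, the double-fork technique that the linked recipe implements looks roughly like this. This is a condensed sketch written for this page, not the recipe's code, and it assumes a POSIX system. It leaves out PID files, logging and signal handling. The function name daemonize is illustrative.

```python
import os
import sys


def daemonize():
    """Detach the calling process using the classic Unix double fork."""
    # Flush first so buffered output is not duplicated across the forks.
    sys.stdout.flush()
    sys.stderr.flush()

    if os.fork() > 0:
        os._exit(0)  # first parent exits; the child is not a group leader
    os.setsid()      # become leader of a new session, drop the terminal
    if os.fork() > 0:
        os._exit(0)  # session leader exits; the grandchild is not a
                     # session leader and cannot reacquire a terminal

    os.chdir("/")    # do not keep any mount point busy
    os.umask(0)

    # Point stdin, stdout and stderr at /dev/null.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
```

A script would call daemonize() once at startup and then enter its main loop. Only the final grandchild returns from the call.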

Measuring CPU time per-thread on Windows

I'm developing a long-running multi-threaded Python application for Windows, and I want the process to know the CPU time that each of its threads has taken. I can get the overall times for the entire process with os.times() but I need to know the per-thread times.
I know that there are external tools such as the Sysinternals Process Explorer, but my program itself needs to have this information. If I were on Linux, I would look in the /proc filesystem, as described here. If I were writing C code, I'd use the GetThreadTimes call, as described here.
So how can I accomplish this on Windows using Python?
win32process.GetThreadTimes
You want the Python for Windows Extensions to do hairy Windows things.
Or you can simply use yappi. (https://code.google.com/p/yappi/) It transparently uses GetThreadTimes() if CPU clock type is selected for profiling.
See here also for an example: https://code.google.com/p/yappi/wiki/YThreadStats_v082
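A different route from the ones above, offered as a sketch: on Python 3.7 or later the standard library has time.thread_time(), which returns the CPU time of the calling thread. As far as I know, CPython implements it on Windows on top of GetThreadTimes. The sketch assumes each thread can record its own figure before it exits. The names timed, spin, sleeper and cpu_seconds are made up for illustration.

```python
import threading
import time

cpu_seconds = {}  # thread name -> CPU seconds used by that thread
lock = threading.Lock()


def timed(target):
    """Wrap a thread target so the thread records its own CPU time on exit."""
    def wrapper(*args, **kwargs):
        try:
            return target(*args, **kwargs)
        finally:
            with lock:
                # thread_time() only reports on the calling thread.
                name = threading.current_thread().name
                cpu_seconds[name] = time.thread_time()
    return wrapper


def spin(n):
    """CPU-bound hypothetical workload."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sleeper(seconds):
    """Idle hypothetical workload: wall time passes, little CPU is used."""
    time.sleep(seconds)


threads = [
    threading.Thread(target=timed(spin), args=(2_000_000,), name="spinner"),
    threading.Thread(target=timed(sleeper), args=(0.2,), name="sleeper"),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

for name, seconds in cpu_seconds.items():
    print(f"{name}: {seconds:.3f}s CPU")
```

Reading another thread's CPU time from outside that thread is not possible with this call; for that, GetThreadTimes via the Python for Windows Extensions or yappi remain the options.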
