What is Python's equivalent to 'ulimit'? - python

I'm trying to implement a check on system resources for the current shell (basically everything in ulimit) in Python to see if enough resources can be allocated. I've found the resource module, but it doesn't seem to have all the information ulimit provides (e.g. POSIX message queues and real-time priority). Is there a way to find the soft and hard limits for these in Python without using external libraries? I'd like to avoid running ulimit as a subprocess if possible but if it's the only way, will do so.

Use resource.getrlimit(). If there's no constant for the limit in the resource module, look it up in /usr/include/bits/resource.h:
$ grep RLIMIT_MSGQUEUE /usr/include/bits/resource.h
__RLIMIT_MSGQUEUE = 12,
#define RLIMIT_MSGQUEUE __RLIMIT_MSGQUEUE
Then you can define the constant yourself:
import resource
RLIMIT_MSGQUEUE = 12
print(resource.getrlimit(RLIMIT_MSGQUEUE))
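On newer interpreters the header lookup may not be needed: since Python 3.4 the resource module exposes RLIMIT_MSGQUEUE and RLIMIT_RTPRIO on Linux. A minimal sketch that prefers the module's own constant and only falls back to the header value; the fallback numbers 12 and 14 are Linux-specific assumptions, and get_limit is just an illustrative helper name:

```python
import resource

# Fallback numbers copied from Linux's bits/resource.h; other platforms differ.
LINUX_FALLBACKS = {"RLIMIT_MSGQUEUE": 12, "RLIMIT_RTPRIO": 14}

def get_limit(name):
    """Return (soft, hard) for the named limit, e.g. "RLIMIT_MSGQUEUE"."""
    constant = getattr(resource, name, LINUX_FALLBACKS.get(name))
    if constant is None:
        raise ValueError("unknown resource limit: " + name)
    return resource.getrlimit(constant)

for name in ("RLIMIT_NOFILE", "RLIMIT_MSGQUEUE", "RLIMIT_RTPRIO"):
    soft, hard = get_limit(name)
    print(name, soft, hard)
```

A value equal to resource.RLIM_INFINITY means the limit is unlimited.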

Related

How can I safely run untrusted python code?

Here is the scenario: my website runs code written by its users on my server, and that code is potentially unsafe.
I want to disable some Python reserved words and built-ins, such as eval, exec, print and so on, to protect my running environment.
Is there a simple way to implement this without changing the Python interpreter? My Python version is 2.7.10.
Many thanks.
Disabling names at the Python level won't help, as there are numerous ways around it. See this and this post for more info. This is what you need to do:
For CPython, use RestrictedPython to define a restricted subset of Python.
For PyPy, use sandboxing. It allows you to run arbitrary Python code in a special environment that serializes all input/output so you can check it and decide which commands are allowed before actually running them.
Since version 3.8, Python supports audit hooks, which let you intercept certain actions and abort them:
import sys
def audit(event, args):
    if event == 'compile':
        sys.exit('nice try!')
sys.addaudithook(audit)
eval('5')
Additionally, to protect your host OS, use
either virtualization (safer) such as KVM or VirtualBox
or containerization (much lighter) such as lxd or docker
In the case of containerization with docker you may need to add AppArmor or SELinux policies for extra safety. lxd already comes with AppArmor policies by default.
Make sure you run the code as a user with as few privileges as possible.
Rebuild the virtual machine/container for each user.
Whichever solution you use, don't forget to limit resource usage (RAM, CPU, storage, network). Use cgroups if your chosen virtualization/containerization solution does not support these kinds of limits.
Last but not least, use timeouts to prevent your users' code from running forever.
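As a rough illustration of the last two points, here is a minimal sketch that runs a snippet in a child interpreter with a CPU-time limit (via the resource module) and a wall-clock timeout. It is Unix-only, run_limited is a hypothetical helper name, the limit values are arbitrary, and on its own it is nowhere near a sandbox; it only shows the mechanics:

```python
import resource
import subprocess
import sys

def run_limited(code, cpu_seconds=1, wall_seconds=5):
    """Run `code` in a child interpreter; return (returncode, stdout) or None on timeout."""
    def apply_limits():
        # Runs in the child just before exec: cap its CPU time (Unix only).
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],
            capture_output=True, text=True,
            timeout=wall_seconds, preexec_fn=apply_limits,
        )
    except subprocess.TimeoutExpired:
        return None  # wall-clock limit hit; subprocess.run has killed the child
    return proc.returncode, proc.stdout

print(run_limited("print(1 + 1)"))
```

A child that exceeds the CPU limit is terminated by a signal, which shows up as a negative return code. Memory, storage and network limits would still have to come from the container or cgroup configuration.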
One way is to shadow the methods:
def not_available(*args, **kwargs):
    return 'Not allowed'
eval = not_available
exec = not_available
print = not_available
However, someone smart can always do this:
import builtins
builtins.print('this works!')
So the real solution is to parse the code and not allow the input if it has such statements (rather than trying to disable them).
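A minimal sketch of what such a parse-and-reject step could look like, using the standard ast module. The BANNED set and the find_banned_names helper are illustrative assumptions, and a name check like this can still be bypassed (for example through getattr with a computed string), so treat it as a first filter rather than as protection:

```python
import ast

# Illustrative blocklist; a real policy would need far more thought.
BANNED = {"eval", "exec", "print", "__import__"}

def find_banned_names(source):
    """Return a sorted list of banned constructs found in the source text."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id in BANNED:
            found.add(node.id)
        elif isinstance(node, ast.Attribute) and node.attr in BANNED:
            found.add(node.attr)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            found.add("import")
    return sorted(found)

print(find_banned_names("import builtins\nbuiltins.print('this works!')"))
```

The code is only parsed, never executed, so the check itself is safe to run on untrusted input.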

Why use Python's os module methods instead of executing shell commands directly?

I am trying to understand what is the motivation behind using Python's library functions for executing OS-specific tasks such as creating files/directories, changing file attributes, etc. instead of just executing those commands via os.system() or subprocess.call()?
For example, why would I want to use os.chmod instead of doing os.system("chmod...")?
I understand that it is more "pythonic" to use Python's available library methods as much as possible instead of just executing shell commands directly. But, is there any other motivation behind doing this from a functionality point of view?
I am only talking about executing simple one-line shell commands here. When we need more control over the execution of the task, I understand that using subprocess module makes more sense, for example.
It's faster: os.system and subprocess.call create new processes, which is unnecessary for something this simple. In fact, os.system and subprocess.call with the shell argument usually create at least two new processes: the first one being the shell, and the second one being the command that you're running (if it's not a shell built-in like test).
Some commands are useless in a separate process. For example, if you run os.system("cd dir/"), it will change the current working directory of the child process, but not of the Python process. You need to use os.chdir for that.
You don't have to worry about special characters interpreted by the shell. os.chmod(path, mode) will work no matter what the filename is, whereas os.system("chmod 777 " + path) will fail horribly if the filename is something like ; rm -rf ~. (Note that you can work around this if you use subprocess.call without the shell argument.)
You don't have to worry about filenames that begin with a dash. os.chmod("--quiet", mode) will change the permissions of the file named --quiet, but os.system("chmod 777 --quiet") will fail, as --quiet is interpreted as an option. This is true even for subprocess.call(["chmod", "777", "--quiet"]).
You have fewer cross-platform and cross-shell concerns, as Python's standard library is supposed to deal with that for you. Does your system have a chmod command? Is it installed? Does it support the parameters that you expect it to support? The os module will try to be as cross-platform as possible and documents when that's not possible.
If the command you're running has output that you care about, you need to parse it, which is trickier than it sounds, as you may forget about corner-cases (filenames with spaces, tabs and newlines in them), even when you don't care about portability.
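To make the filename points concrete, here is a small sketch that changes the permissions of a deliberately awkward filename without involving a shell. The filename and the mode are made up for the example:

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as directory:
    # A name that a shell would split at the semicolon and partly parse as an option.
    awkward = os.path.join(directory, "--quiet; echo oops")
    open(awkward, "w").close()

    # The path is passed to the system call as-is: no quoting, no option parsing.
    os.chmod(awkward, 0o600)
    mode = stat.S_IMODE(os.stat(awkward).st_mode)
    print(oct(mode))
```

If an external chmod really is required, passing an argument list with a conventional "--" end-of-options marker, as in subprocess.call(["chmod", "600", "--", path]), avoids both the shell and the leading-dash problem on typical Unix systems.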
It is safer. To give you an idea, here is an example script:
import os
file = raw_input("Please enter a file: ")
os.system("chmod 777 " + file)
If the input from the user was test; rm -rf ~, this would then delete the home directory.
This is why it is safer to use the built-in function.
It is also why you should use subprocess rather than os.system when you do need to run a command.
There are four strong cases for preferring Python's more-specific methods in the os module over using os.system or the subprocess module when executing a command:
Redundancy - spawning another process is redundant and wastes time and resources.
Portability - Many of the methods in the os module are available on multiple platforms, while many shell commands are OS-specific.
Understanding the results - Spawning a process to execute arbitrary commands forces you to parse the results from the output and understand if and why a command has done something wrong.
Safety - A process can potentially execute any command it's given. This is a weak design and it can be avoided by using specific methods in the os module.
Redundancy (see redundant code):
You're actually executing a redundant "middle-man" on your way to the eventual system calls (chmod in your example). This middle man is a new process or sub-shell.
From os.system:
Execute the command (a string) in a subshell ...
And subprocess is just a module to spawn new processes.
You can do what you need without spawning these processes.
Portability (see source code portability):
The os module's aim is to provide generic operating-system services and its description starts with:
This module provides a portable way of using operating system dependent functionality.
You can use os.listdir on both Windows and Unix. Trying to use os.system / subprocess for this functionality will force you to maintain two calls (for ls / dir) and check what operating system you're on. This is not as portable and will cause even more frustration later on (see "Understanding the command's results").
Understanding the command's results:
Suppose you want to list the files in a directory.
If you're using os.system("ls") / subprocess.call(['ls']), you can only get the process's output back, which is basically a big string with the file names.
How can you tell a file with a space in its name from two files?
What if you have no permission to list the files?
How should you map the data to python objects?
These are only off the top of my head, and while there are solutions to these problems, why solve again a problem that was already solved for you?
This is an example of following the Don't Repeat Yourself principle (often referred to as "DRY") by not repeating an implementation that already exists and is freely available to you.
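The three questions above more or less answer themselves when the listing is done in-process. A minimal sketch using os.scandir; list_files is an illustrative helper name, and returning None for an unreadable directory is just one possible convention:

```python
import os
import tempfile

def list_files(directory):
    """Return the sorted names of regular files, or None if access is denied."""
    try:
        with os.scandir(directory) as entries:
            return sorted(entry.name for entry in entries if entry.is_file())
    except PermissionError:
        return None  # a distinct, catchable error instead of text on stderr

with tempfile.TemporaryDirectory() as directory:
    for name in ("a b.txt", "c.txt"):
        open(os.path.join(directory, name), "w").close()
    names = list_files(directory)
    print(names)
```

Each name arrives as its own Python string, so a space inside a filename is never ambiguous.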
Safety:
os.system and subprocess are powerful. It's good when you need this power, but it's dangerous when you don't. When you use os.listdir, you know it cannot do anything other than list files or raise an error. When you use os.system or subprocess to achieve the same behaviour, you can potentially end up doing something you did not mean to do.
Injection Safety (see shell injection examples):
If you use input from the user as a new command, you've basically given them a shell. This is much like SQL injection providing a shell in the DB for the user.
An example would be a command of the form:
# ... read some user input
os.system(user_input + " some continuation")
This can be easily exploited to run any arbitrary code using the input: NASTY COMMAND;# to create the eventual:
os.system("NASTY COMMAND; # some continuation")
There are many such commands that can put your system at risk.
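When a command does have to be run with user-supplied text, two standard-library tools reduce the injection risk: passing an argument list (so no shell is involved) and shlex.quote (when a POSIX shell string is unavoidable). A small sketch; the child here is simply the Python interpreter echoing its argument, chosen only so the example does not depend on any particular external program:

```python
import shlex
import subprocess
import sys

user_input = "test; rm -rf ~"

# An argument list bypasses the shell: the whole string arrives as one argument.
echo_command = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input]
result = subprocess.run(echo_command, capture_output=True, text=True)
print(result.stdout)

# If a shell command string is unavoidable, quote each untrusted piece first.
quoted = shlex.quote(user_input)
print(quoted)
```

Note that shlex.quote targets POSIX shells only; it is not a safe way to build commands for cmd.exe.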
For a simple reason - when you call a shell function, it creates a sub-shell which is destroyed after your command exits, so if you change directory in a shell, it does not affect your environment in Python.
Besides, creating a sub-shell is time consuming, so running OS commands through a shell will hurt your performance.
EDIT
I had some timing tests running:
In [379]: %timeit os.chmod('Documents/recipes.txt', 0755)
10000 loops, best of 3: 215 us per loop
In [380]: %timeit os.system('chmod 0755 Documents/recipes.txt')
100 loops, best of 3: 2.47 ms per loop
In [382]: %timeit call(['chmod', '0755', 'Documents/recipes.txt'])
100 loops, best of 3: 2.93 ms per loop
The internal function runs more than 10 times faster.
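The session above is from IPython on Python 2. A rough way to repeat the comparison on Python 3 with only the standard library is sketched below; time_chmod is an illustrative helper, the call count is arbitrary, and the absolute numbers will vary a lot between machines:

```python
import os
import shutil
import subprocess
import tempfile
import timeit

def time_chmod(calls=50):
    """Return (seconds per os.chmod call, seconds per external chmod call or None)."""
    with tempfile.NamedTemporaryFile() as tmp:
        in_process = timeit.timeit(
            lambda: os.chmod(tmp.name, 0o755), number=calls
        ) / calls
        chmod = shutil.which("chmod")
        if chmod is None:
            return in_process, None  # no external chmod to compare against
        external = timeit.timeit(
            lambda: subprocess.call([chmod, "755", tmp.name]), number=calls
        ) / calls
    return in_process, external

in_process, external = time_chmod()
print("os.chmod:", in_process, "external chmod:", external)
```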
EDIT2
There may be cases when invoking an external executable yields better results than Python packages - I just remembered a mail sent by a colleague of mine saying that the performance of gzip called through subprocess was much higher than the performance of a Python package he used. But certainly not when we are talking about standard OS packages emulating standard OS commands.
Shell calls are OS-specific, whereas Python os module functions are not, in most cases. And they avoid spawning a subprocess.
It's far more efficient. The "shell" is just another OS binary which contains a lot of system calls. Why incur the overhead of creating the whole shell process just for that single system call?
The situation is even worse when you use os.system for something that's not a shell built-in. You start a shell process which in turn starts an executable which then (two processes away) makes the system call. At least subprocess would have removed the need for a shell intermediary process.
It's not specific to Python, this. systemd is such an improvement to Linux startup times for the same reason: it makes the necessary system calls itself instead of spawning a thousand shells.

Using resource in windows

I've got a script that uses Python's resource module (see http://docs.python.org/library/resource.html for information). Now I want to port this script to Windows. Is there an alternative to it? (The Python docs label it as "Unix only".)
If there isn't, is there any other workaround?
I'm using the following method/constant:
resource.getrusage(resource.RUSAGE_CHILDREN)
resource.RLIMIT_CPU
Thank you
PS: I'm using python 2.7 / 3.2
There's no good way of doing this generically for all "resources", which is why it's a Unix-only module. For CPU speed only, you can either use registry keys to set the process id limit:
http://technet.microsoft.com/en-us/library/ff384148%28WS.10%29.aspx
As done here:
http://code.activestate.com/recipes/286159/
IMPORTANT: Backup your registry before trying anything with registry
Or you could set the thread priority:
http://msdn.microsoft.com/en-us/library/ms685100%28VS.85%29.aspx
As done here:
http://nullege.com/codes/search/win32process.SetThreadPriority
For other resources you'll have to scrape together similar DLL access APIs to achieve the desired effect. You should first ask yourself whether you need this behavior. Oftentimes you can limit CPU time by sleeping the thread in operation at convenient times to allow the OS to swap processes, and memory can be controlled programmatically by checking data structure sizes.
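For the getrusage half of the question, one pragmatic approach is to import resource only where it exists and degrade gracefully elsewhere. A minimal sketch; children_cpu_seconds is a hypothetical helper, and the os.times() fallback is a weak substitute because on Windows its children fields are reported as zero:

```python
import os

try:
    import resource  # available on Unix only
except ImportError:
    resource = None

def children_cpu_seconds():
    """Return user + system CPU seconds used by waited-for child processes."""
    if resource is not None:
        usage = resource.getrusage(resource.RUSAGE_CHILDREN)
        return usage.ru_utime + usage.ru_stime
    # Fallback: os.times() exists everywhere, but the children fields are 0 on Windows.
    times = os.times()
    return times.children_user + times.children_system

print(children_cpu_seconds())
```

There is no comparable standard-library fallback for RLIMIT_CPU; on Windows that has to come from OS-specific APIs such as the ones linked above.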

How can I access Ring 0 with Python?

This answer, stating that the naming of classes in Python is not done because of special privileges, here confuses me.
How can I access lower rings in Python?
Is the low-level io for accessing lower level rings?
If it is, which rings can I access with that?
Is the statement "This function is intended for low-level I/O." referring to lower level rings or to something else?
C tends to be a prominent language in OS programming. When there is the OS class in Python, does it mean that I can access C code through that class?
Suppose I am playing with bizarre machine-language code and I want to somehow understand what it means. Are there some tools in Python which I can use to analyze such things? If there is not, is there some way that I could still use Python to control some tool which controls the bizarre machine language? [ctypes suggested in comments]
If Python has nothing to do with the low-level privileged stuff, does it still offer some wrappers to control the privileged?
Windows and Linux both use ring 0 for kernel code and ring 3 for user processes. The advantage of this is that user processes can be isolated from one another, so the system continues to run even if a process crashes. By contrast, a bug in ring 0 code can potentially crash the entire machine.
One of the reasons ring 0 code is so critical is that it can access hardware directly. By contrast, when a user-mode (ring 3) process needs to read some data from a disk:
1. the process executes a special instruction telling the CPU it wants to make a system call
2. CPU switches to ring 0 and starts executing kernel code
3. kernel checks that the process is allowed to perform the operation
4. if permitted, the operation is carried out
5. kernel tells the CPU it has finished
6. CPU switches back to ring 3 and returns control to the process
Processes belonging to "privileged" users (e.g. root/Administrator) run in ring 3 just like any other user-mode code; the only difference is that the check at step 3 always succeeds. This is a good thing because:
root-owned processes can crash without taking the entire system down
many user-mode features are unavailable in the kernel, e.g. swappable memory, private address space
As for running Python code in lower rings - kernel-mode is a very different environment, and the Python interpreter simply isn't designed to run in it, e.g. the procedure for allocating memory is completely different.
In the other question you reference, both os.open() and open() end up making the open() system call, which checks whether the process is allowed to open the corresponding file and performs the actual operation.
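The difference between the two is the level of the Python interface, not the privilege level. A short sketch that reads the same temporary file both ways (the file name and contents are made up for the example):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")
    with open(path, "wb") as f:
        f.write(b"hello")

    # High-level: a buffered file object managed by Python's io layer.
    with open(path, "rb") as f:
        high_level = f.read()

    # Low-level: a raw file descriptor, read with os.read and closed explicitly.
    fd = os.open(path, os.O_RDONLY)
    try:
        low_level = os.read(fd, 1024)
    finally:
        os.close(fd)

print(high_level == low_level)
```

Both paths run entirely in user mode and go through the same kernel permission check.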
I think SimonJ's answer is very good, but I'm going to post my own because from your comments it appears you're not quite understanding things.
Firstly, when you boot an operating system, what you're doing is loading the kernel into memory and saying "start executing at address X". The kernel, that code, is essentially just a program, but of course nothing else is loaded, so if it wants to do anything it has to know the exact commands for the specific hardware it has attached to it.
You don't have to run a kernel. If you know how to control all the attached hardware, you don't need one, in fact. However, it was rapidly realised way back when that there are many types of hardware one might face and having an identical interface across systems to program against would make code portable and generally help get things done faster.
So the function of the kernel, then, is to control all the hardware attached to the system and present it in a common interface, called an API (application programming interface). Code for programs that run on the system doesn't talk directly to hardware. It talks to the kernel. So userland programs don't need to know how to ask a specific hard disk to read sector 0x213E or whatever, but the kernel does.
Now, the description of ring 3 provided in SimonJ's answer is how userland is implemented - with isolated, unprivileged processes with virtual private address spaces that cannot interfere with each other, for the benefits he describes.
There's also another level of complexity in here, namely the concept of permissions. Most operating systems have some form of access control, whereby "administrators" have total control of the system and "users" have a restricted subset of options. So a request from an ordinary user's program to open a file belonging to an administrator should fail under this sort of approach. The user who runs the program forms part of the program's context, if you like, and what the program can do is constrained by what that user can do.
Most of what you could ever want to achieve (unless your intention is to write a kernel) can be done in userland as the root/administrator user, where the kernel does not deny any API requests made to it. It's still a userland program. It's still a ring 3 program. But for most (nearly all) uses it is sufficient. A lot can be achieved as a non-root/administrative user.
That applies to the python interpreter and by extension all python code running on that interpreter.
Let's deal with some uncertainties:
The naming of os and sys I think is because these are "systems" tasks (as opposed to say urllib2). They give you ways to manipulate and open files, for example. However, these go through the python interpreter which in turn makes a call to the kernel.
I do not know of any kernel-mode python implementations. Therefore to my knowledge there is no way to write code in python that will run in the kernel (linux/windows).
There are two types of privileged: privileged in terms of hardware access and privileged in terms of the access control system provided by the kernel. Python can be run as root/an administrator (indeed on Linux many of the administration gui tools are written in python), so in a sense it can access privileged code.
Writing a C extension or controlling a C application from Python would ostensibly mean you are either using code added to the interpreter (userland) or controlling another userland application. However, if you wrote a kernel module in C (Linux) or a driver in C (Windows), it would be possible to load that driver and interact with it via the kernel APIs from Python. An example might be creating a /proc entry in C and then having your Python application pass messages via read/write to that /proc entry (which the kernel module would have to handle via a write/read handler). Essentially, you write the code you want to run in kernel space and add to or extend the kernel API in one of many ways so that your program can interact with that code.
"Low-level" IO means having more control over the type of IO that takes place and how you get that data from the operating system. It is low level compared to higher level functions still in Python that give you easier ways to read files (convenience at the cost of control). It is comparable to the difference between read() calls and fread() or fscanf() in C.
Health warning: Writing kernel modules, if you get it wrong, will at best result in that module not being properly loaded; at worst your system will panic/bluescreen and you'll have to reboot.
The final point about machine instructions I cannot answer here. It's a totally separate question and it depends. There are many tools capable of analysing code like that, I'm sure, but I'm not a reverse engineer. However, I do know that many tools that hook into binary code (e.g. gdb, valgrind) do not need kernel modules to do their work.
You can use the inpout library http://logix4u.net/parallel-port/index.php
import ctypes
# Example of strobing data out with nStrobe pin (note - inverted)
# Get 50kbaud without the read, 30kbaud with
read = []
for n in range(4):
    ctypes.windll.inpout32.Out32(0x37a, 1)
    ctypes.windll.inpout32.Out32(0x378, n)
    read.append(ctypes.windll.inpout32.Inp32(0x378))  # Dummy read to see what is going on
    ctypes.windll.inpout32.Out32(0x37a, 0)
print(read)
[note: I was wrong. usermode code can no longer access ring 0 on modern unix systems. -- jc 2019-01-17]
I've forgotten what little I ever knew about Windows privileges. In all Unix systems with which I'm familiar, the root user can access all ring0 privileges. But I can't think of any mapping of Python modules with privilege rings.
That is, the 'os' and 'sys' modules don't give you any special privileges. You have them, or not, due to your login credentials.
How can I access lower rings in Python?
ctypes
Is the low-level io for accessing lower level rings?
No.
Is the statement "This function is intended for low-level I/O." referring to lower level rings or to something else?
Something else.
C tends to be a prominent language in OS programming. When there is the OS class in Python, does it mean that I can access C code through that class?
All of CPython is implemented in C.
The os module (it's not a class, it's a module) is for accessing OS APIs. C has nothing to do with access to OS APIs. Python accesses the APIs "directly".
Suppose I am playing with bizarre machine-language code and I want to somehow understand what it means. Are there some tools in Python which I can use to analyze such things?
"playing with"?
"understand what it means"? is your problem. You read the code, you understand it. Whether or not Python can help is impossible to say. What don't you understand?
If there is not, is there some way that I could still use Python to control some tool which controls the bizarre machine language? [ctypes suggested in comments]
ctypes
If Python has nothing to do with the low-level privileged stuff, does it still offer some wrappers to control the privileged?
You don't "wrap" things to control privileges.
Most OSes work like this:
You grant privileges to a user account.
The OS APIs check the privileges granted to the user making the OS API request.
If the user has the privileges, the OS API call succeeds.
If the user lacks the privileges, the OS API call fails (in Python, it raises an exception).
That's all there is to it.
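From the Python side, that failure surfaces as an ordinary exception. A minimal sketch; try_read is an illustrative helper, and what it returns for a protected path such as /etc/shadow depends entirely on which user runs it:

```python
import os
import tempfile

def try_read(path):
    """Report whether the current user may read the first byte of `path`."""
    try:
        with open(path, "rb") as f:
            f.read(1)
    except PermissionError:
        return "denied"   # the kernel refused the request for this user
    except FileNotFoundError:
        return "missing"
    return "ok"

with tempfile.NamedTemporaryFile() as tmp:
    readable = try_read(tmp.name)
missing = try_read(os.path.join(tempfile.gettempdir(), "no-such-file-for-this-sketch"))

print(readable, missing, try_read("/etc/shadow"))
```

Nothing in the code changes between a privileged and an unprivileged run; only the outcome of the kernel's check does.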
