I found myself having to implement the following use case: I need to run a webapp in which users can submit C programs, which need to be run safely on my backend.
I'm trying to get this done using Node. In the past, I had to do something similar but the user-submitted code was JavaScript code, and I got away with using Node vm2 module. Essentially, I would create a VM and call its run method with the user submitted code as a string argument, then collect the output and do whatever I had to.
I'm trying to understand if using the same moule could help me with C code as well. The idea would be to use exec to first call gcc and compile the user code. Afterwards, I would use a VM to run exec again, this time passing the generated executable as a result. Would this be safe?
I don't understand vm2 deeply enough to know whether the safety is only limited to executing JS code or if it can be trusted to also run any arbitrary shell command safely.
In case vm2 isn't appropriate, what would be another way to run an executable in a sandboxed fashion in Node? Feel free to also suggest Python-based solutions, if you know any. Please note that the code will still be executed in a separate container as the main app regardless, but I want to make extra sure users cannot easily just tear it down at their liking.
Thank you in advance.
I am currently experiencing the same challenge as you, trying to execute safely some untrusted code using spawn, so what I can tell you is that vm2 only works for JS/TS code, but can't control what happens to a new process created by spawn, fork or exec.
For now I haven't found any good solution, but I'm thinking of trying to run the process as a user with limited rights.
As you seem to have access to the C source code, I would advise you to search how to run untrusted C programs (in plain C), and see if you can manipulate the C code in order to have a safer environment from this point of view.
I'm trying to check in python (ubuntu) if a different program is using the camera/microphone of my computer.
I thought of which syscalls are being used when accessing the camera/microphone.
I know that the syscalls "access" and "open" are being used but there are probably specific parameters for that.
And if I know which syscalls are being used, how can I also know if the
program is using those specific syscalls?
I have an example code in which I'm checking if a file.exe added any new files:
if "open(\"" + file_path + "\", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3" in system_calls_list:
programs_which_added_new_files.append(file_path)
First, I have created a file which adds new files and than I wrote all the syscalls of the file to a list (system_calls_list). And than I'm checking if it has a specific syscall (open) with specific parameters. If it dos'e, I can know that the file I'm checking added new files and than append it's path to a different list (programs_which_added_new_files). The same concept should go to camera/microphone.
Thank you for your help :)
The problem is that there exists multiple audio frameworks for Linux.
The oldest, OSS, uses multiple device names under /dev that you are supposed to open (see list here). It is unlikely that you have this installed on a recent Ubuntu.
What you are probably using is ALSA and PulseAudio. In this case, the program is probably connecting to PulseAudio over a UNIX socket (for example /run/user/1000/pulse/native), but this is an implementation detail of PulseAudio (in particular it can also work over IP). You should use the PulseAudio API to find out, for example if you run pactl list source-outputs you should see a list of clients and the recording devices they are connected to.
I have a program that creates a bunch of movie files. I runs as a cron job and every time it runs the movies from the previous iteration are moved to a 'previous' folder so that there is always a previous version to look at.
These movie files are accessed across a network by various users and that's where I'm running into a problem.
When the script runs and tries to move the files it throws a resource busy error because the files are open by various users. Is there a way in Python to force close these files before I attempt to move them?
Further clarification:
JMax is correct when he mentions it is server level problem. I can access our windows server through Administrative Tools > Computer Management > Shared Folders > Open Files and manually close the files there, but I am wondering whether there is a Python equivalent which will achieve the same result.
something like this:
try:
shutil.move(src, dst)
except OSError:
# Close src file on all machines that are currently accessing it and try again.
This question has nothing to do with Python, and everything to do with the particular operating system and file system you're using. Could you please provide these details?
At least in Windows you can use Sysinternals Handle to force a particular handle to a file to be closed. Especially as this file is opened by another user over a network this operation is extremely destabilising and will probably render the network connection subsequently useless. You're looking for the "-c" command-line argument, where the documentation reads:
Closes the specified handle (interpreted as a hexadecimal number). You
must specify the process by its PID.
WARNING: Closing handles can cause application or system instability.
And if you're force-closing a file mounted over Samba in Linux, speaking from experience this is an excruciating experience in futility. However, others have tried with mixed success; see Force a Samba process to close a file.
As far as I know you have to end the processes which access the file. At least on Windows
The .close() method doesn't work on your object file?
See dive into Python for more information on file objects
[EDIT] I've re-read your question. Your problem is that users do open the same file from the network and you want them to close the file? But can you access to their OS?
[EDIT2] The problem is more on a server level to disconnect the user that access the file. See this example for Windows servers.
This answer, stating that the naming of classes in Python is not done because of special privileges, here confuses me.
How can I access lower rings in Python?
Is the low-level io for accessing lower level rings?
If it is, which rings I can access with that?
Is the statement "This function is intended for low-level I/O." referring to lower level rings or to something else?
C tends to be prominent language in os -programming. When there is the OS -class in Python, does it mean that I can access C -code through that class?
Suppose I am playing with bizarre machine-language code and I want to somehow understand what it means. Are there some tools in Python which I can use to analyze such things? If there is not, is there some way that I could still use Python to control some tool which controls the bizarre machine language? [ctypes suggested in comments]
If Python has nothing to do with the low-level privileged stuff, do it still offers some wrappers to control the privileged?
Windows and Linux both use ring 0 for kernel code and ring 3 for user processes. The advantage of this is that user processes can be isolated from one another, so the system continues to run even if a process crashes. By contrast, a bug in ring 0 code can potentially crash the entire machine.
One of the reasons ring 0 code is so critical is that it can access hardware directly. By contrast, when a user-mode (ring 3) process needs to read some data from a disk:
the process executes a special instruction telling the CPU it wants to make a system call
CPU switches to ring 0 and starts executing kernel code
kernel checks that the process is allowed to perform the operation
if permitted, the operation is carried out
kernel tells the CPU it has finished
CPU switches back to ring 3 and returns control to the process
Processes belonging to "privileged" users (e.g. root/Administrator) run in ring 3 just like any other user-mode code; the only difference is that the check at step 3 always succeeds. This is a good thing because:
root-owned processes can crash without taking the entire system down
many user-mode features are unavailable in the kernel, e.g. swappable memory, private address space
As for running Python code in lower rings - kernel-mode is a very different environment, and the Python interpreter simply isn't designed to run in it, e.g. the procedure for allocating memory is completely different.
In the other question you reference, both os.open() and open() end up making the open() system call, which checks whether the process is allowed to open the corresponding file and performs the actual operation.
I think SimonJ's answer is very good, but I'm going to post my own because from your comments it appears you're not quite understanding things.
Firstly, when you boot an operating system, what you're doing is loading the kernel into memory and saying "start executing at address X". The kernel, that code, is essentially just a program, but of course nothing else is loaded, so if it wants to do anything it has to know the exact commands for the specific hardware it has attached to it.
You don't have to run a kernel. If you know how to control all the attached hardware, you don't need one, in fact. However, it was rapidly realised way back when that there are many types of hardware one might face and having an identical interface across systems to program against would make code portable and generally help get things done faster.
So the function of the kernel, then, is to control all the hardware attached to the system and present it in a common interface, called an API (application programming interface). Code for programs that run on the system don't talk directly to hardware. They talk to the kernel. So user land programs don't need to know how to ask a specific hard disk to read sector 0x213E or whatever, but the kernel does.
Now, the description of ring 3 provided in SimonJ's answer is how userland is implemented - with isolated, unprivileged processes with virtual private address spaces that cannot interfere with each other, for the benefits he describes.
There's also another level of complexity in here, namely the concept of permissions. Most operating systems have some form of access control, whereby "administrators" have total control of the system and "users" have a restricted subset of options. So a kernel request to open a file belonging to an administrator should fail under this sort of approach. The user who runs the program forms part of the program's context, if you like, and what the program can do is constrained by what that user can do.
Most of what you could ever want to achieve (unless your intention is to write a kernel) can be done in userland as the root/administrator user, where the kernel does not deny any API requests made to it. It's still a userland program. It's still a ring 3 program. But for most (nearly all) uses it is sufficient. A lot can be achieved as a non-root/administrative user.
That applies to the python interpreter and by extension all python code running on that interpreter.
Let's deal with some uncertainties:
The naming of os and sys I think is because these are "systems" tasks (as opposed to say urllib2). They give you ways to manipulate and open files, for example. However, these go through the python interpreter which in turn makes a call to the kernel.
I do not know of any kernel-mode python implementations. Therefore to my knowledge there is no way to write code in python that will run in the kernel (linux/windows).
There are two types of privileged: privileged in terms of hardware access and privileged in terms of the access control system provided by the kernel. Python can be run as root/an administrator (indeed on Linux many of the administration gui tools are written in python), so in a sense it can access privileged code.
Writing a C extension or controlling a C application to Python would ostensibly mean you are either using code added to the interpreter (userland) or controlling another userland application. However, if you wrote a kernel module in C (Linux) or a Driver in C (Windows) it would be possible to load that driver and interact with it via the kernel APIs from python. An example might be creating a /proc entry in C and then having your python application pass messages via read/write to that /proc entry (which the kernel module would have to handle via a write/read handler. Essentially, you write the code you want to run in kernel space and basically add/extend the kernel API in one of many ways so that your program can interact with that code.
"Low-level" IO means having more control over the type of IO that takes place and how you get that data from the operating system. It is low level compared to higher level functions still in Python that give you easier ways to read files (convenience at the cost of control). It is comparable to the difference between read() calls and fread() or fscanf() in C.
Health warning: Writing kernel modules, if you get it wrong, will at best result in that module not being properly loaded; at worst your system will panic/bluescreen and you'll have to reboot.
The final point about machine instructions I cannot answer here. It's a totally separate question and it depends. There are many tools capable of analysing code like that I'm sure, but I'm not a reverse engineer. However, I do know that many of these tools (gdb, valgrind) e.g. tools that hook into binary code do not need kernel modules to do their work.
You can use inpout library http://logix4u.net/parallel-port/index.php
import ctypes
#Example of strobing data out with nStrobe pin (note - inverted)
#Get 50kbaud without the read, 30kbaud with
read = []
for n in range(4):
ctypes.windll.inpout32.Out32(0x37a, 1)
ctypes.windll.inpout32.Out32(0x378, n)
read.append(ctypes.windll.inpout32.Inp32(0x378)) #Dummy read to see what is going on
ctypes.windll.inpout32.Out32(0x37a, 0)
print read
[note: I was wrong. usermode code can no longer access ring 0 on modern unix systems. -- jc 2019-01-17]
I've forgotten what little I ever knew about Windows privileges. In all Unix systems with which I'm familiar, the root user can access all ring0 privileges. But I can't think of any mapping of Python modules with privilege rings.
That is, the 'os' and 'sys' modules don't give you any special privileges. You have them, or not, due to your login credentials.
How can I access lower rings in Python?
ctypes
Is the low-level io for accessing lower level rings?
No.
Is the statement "This function is intended for low-level I/O." referring to lower level rings or to something else?
Something else.
C tends to be prominent language in os -programming. When there is the OS -class in Python, does it mean that I can access C -code through that class?
All of CPython is implemented in C.
The os module (it's not a class, it's a module) is for accessing OS API's. C has nothing to do with access to OS API's. Python accesses the API's "directly".
Suppose I am playing with bizarre machine-language code and I want to somehow understand what it means. Are there some tools in Python which I can use to analyze such things?
"playing with"?
"understand what it means"? is your problem. You read the code, you understand it. Whether or not Python can help is impossible to say. What don't you understand?
If there is not, is there some way that I could still use Python to control some tool which controls the bizarre machine language? [ctypes suggested in comments]
ctypes
If Python has nothing to do with the low-level privileged stuff, do it still offers some wrappers to control the privileged?
You don't "wrap" things to control privileges.
Most OS's work like this.
You grant privileges to a user account.
The OS API's check the privileges granted to the user making the OS API request.
If the user has the privileges, the OS API works.
If the user lacks the privileges, the OS API raises an exception.
That's all there is to it.
I'm trying to use setfsuid() with python 2.5.4 and RHEL 5.4.
Since it's not included in the os module, I wrapped it in a C module of my own and installed it as a python extension module using distutils.
However when I try to use it I don't get the expected result.
setfsuid() returns value indicating success (changing from a superuser), but I can't access files to which only the newly set user should have user access (using open()), indicating that fsuid was not truely changed.
I tried to verify setfsuid() worked, by running it consecutively twice with the same user input
The result was as if nothing had changed, and on every call the returned value was of old user id different from the new one. I also called getpid() from the module, and from the python script, both returned the same id. so this is not the problem.
Just in case it's significant, I should note that I'm doing all of this from within an Apache daemon process (WSGI).
Anyone can provide an explanation to that?
Thank you
The ability to change the FSUID is limited to either root or non-root processes with the CAP_SETFCAP capability. These days it's usually considered bad practice to run a webserver with root permissions so, most likely, you'll need to set the capability on the file server (see man capabilities for details). Please note that doing this could severly affect your overall system's security. I'd recommend considering spawning a small backend process that runs as root and converses with your WSGI app via a local UNIX socket prior to mucking with the security of a high-profile target like Apache.