I need to be able to list the command line arguments (if any) passed to other running processes. I have the PIDs already of the running processes on the system, so basically I need to determine the arguments passed to process with given PID XXX.
I'm working on a core piece of a Python module for managing processes. The code is written as a Python extension in C and will be wrapped by a higher level Python library. The goal of this project is to avoid dependency on third party libs such as the pywin32 extensions, or on ugly hacks like calling 'ps' or taskkill on the command line, so I'm looking for a way to do this in C code.
I've Googled this around and found some brief suggestions of using CreateRemoteThread() to inject myself into the other process, then run GetCommandLine() but I was hoping someone might have some working code samples and/or better suggestions.
UPDATE: I've found full working demo code and a solution using NtQueryInformationProcess on CodeProject: http://www.codeproject.com/KB/threads/GetNtProcessInfo.aspx - It's not ideal since it's "unsupported" to cull the information directly from the NTDLL structures but I'll live with it. Thanks to all for the suggestions.
UPDATE 2: I managed through more Googling to dig up a C version that does not use C++ code, and is a little more direct/concisely pointed toward this problem. See http://wj32.wordpress.com/2009/01/24/howto-get-the-command-line-of-processes/ for details.
Thanks!
To answer my own question, I finally found a CodeProject solution that does exactly what I'm looking for:
http://www.codeproject.com/KB/threads/GetNtProcessInfo.aspx
As @Reuben already pointed out, you can use NtQueryInformationProcess to retrieve this information. Unfortunately it's not a recommended approach, but given that the only other solution seems to be to incur the overhead of a WMI query, I think we'll take this approach for now.
Note that this seems to not work if using code compiled from 32bit Windows on a 64bit Windows OS, but since our modules are compiled from source on the target that should be OK for our purposes. I'd rather use this existing code and should it break in Windows 7 or a later date, we can look again at using WMI. Thanks for the responses!
UPDATE: A more concise and C only (as opposed to C++) version of the same technique is illustrated here:
http://wj32.wordpress.com/2009/01/24/howto-get-the-command-line-of-processes/
The cached solution:
http://74.125.45.132/search?q=cache:-wPkE2PbsGwJ:windowsxp.mvps.org/listproc.htm+running+process+command+line&hl=es&ct=clnk&cd=1&gl=ar&client=firefox-a
in CMD
WMIC /OUTPUT:C:\ProcessList.txt PROCESS get Caption,Commandline,Processid
or
WMIC /OUTPUT:C:\ProcessList.txt path win32_process get Caption,Processid,Commandline
Also:
http://mail.python.org/pipermail/python-win32/2007-December/006498.html
http://tgolden.sc.sabren.com/python/wmi_cookbook.html#running_processes
seems to do the trick:
import wmi
c = wmi.WMI()
for process in c.Win32_Process():
    print(process.CommandLine)
By using psutil ( https://github.com/giampaolo/psutil ):
>>> import psutil, os
>>> psutil.Process(os.getpid()).cmdline()
['C:\\Python26\\python.exe', '-O']
>>>
The WMI approach mentioned in another response is probably the most reliable way of doing this. Looking through MSDN, I spotted what looks like another possible approach; it's documented, but it's not clear whether it's fully supported. In MSDN's language, it "may be altered or unavailable in future versions of Windows..."
In any case, provided that your process has the right permissions, you should be able to call NtQueryInformationProcess with a ProcessInformationClass of ProcessBasicInformation. In the returned PROCESS_BASIC_INFORMATION structure, you should get back a pointer to the target process's Process Environment Block (as field PebBaseAddress). The ProcessParameters field of the PEB will give you a pointer to an RTL_USER_PROCESS_PARAMETERS structure. The CommandLine field of that structure will be a UNICODE_STRING structure. (Be careful not to make too many assumptions about the string; there are no guarantees that it will be NULL-terminated, and it's not clear whether or not you'll need to strip off the name of the executed application from the beginning of the command line.)
I haven't tried this approach--and as I mentioned above, it seems a bit... iffy (read: non-portable)--but it might be worth a try. Best of luck...
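The caution about the UNICODE_STRING can be made concrete with a small Python sketch. It assumes you have already copied the raw bytes of the CommandLine buffer out of the target process (for example with ReadProcessMemory, not shown here) and that you know the structure's Length field, which counts bytes rather than characters. The function name decode_unicode_string and the simulated buffer are hypothetical.

```python
def decode_unicode_string(raw_buffer, length_in_bytes):
    """Decode a UNICODE_STRING payload using its Length field.

    raw_buffer: bytes copied from the target process (assumed input).
    length_in_bytes: the Length field of the UNICODE_STRING, in bytes.
    """
    # Trust Length instead of searching for a terminating NUL,
    # because the buffer is not guaranteed to be NUL-terminated.
    payload = raw_buffer[:length_in_bytes]
    return payload.decode("utf-16-le")


# Simulated buffer: the command line followed by unrelated bytes, no NUL.
command_line = "python.exe -O"
raw = command_line.encode("utf-16-le") + "junk".encode("utf-16-le")

decoded = decode_unicode_string(raw, len(command_line) * 2)
```

Slicing by Length keeps the trailing bytes out of the result even though nothing terminates the string.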
If you aren't the parent of these processes, then this is not possible using documented functions :( Now, if you're the parent, you can do your CreateRemoteThread trick, but otherwise you will almost certainly get Access Denied unless your app has admin rights.
I have a Python 3 file. I want to use an open-source tool on the internet (nltk), but unfortunately it only supports Python 2. There is no way for me to convert it to Python 3, nor can I convert my Python 3 file to Python 2.
If the user does not give a certain argument (on argparse) then I do something in my file. If the user does give a certain argument, however, I need to use nltk.
Writing a Python 2 script that uses nltk and then executing that script from my Python 3 script
My current idea is to write a script in Python 2 that does what I want with nltk and then run that from my current Python 3 script. However, I don't actually know how to do this.
I found this code: os.system(command) and so I will modify it to be os.system("python py2.py") (where py2.py is my newly written Python 2 file).
I'm not sure if that will work.
I also don't know if that is the most efficient way to solve my problem. I cannot find any information about it on the internet.
The data transferred will probably be quite large. Currently, my test data is about 6600 lines, utf-8. Functionality is more important than how long it takes (to a certain extent) in my case.
Also, how would I pass values from my Python 2 script to my Python 3 script?
Thanks
Is there any other way to do this?
Well, if you're sure you can't convert your script to Python 2, then having one script call the other by running the Python interpreter probably is the best way. (And, this being Python, the best way is, or at least should be, the only way.)
But are you sure? Between the six module, the 3to2 tool, and __future__ statements, it may not be as hard as you think.
Anyway, if you do need to have one script call the other, you should almost never use os.system. As the docs for that function say:
The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function. See the Replacing Older Functions with the subprocess Module section in the subprocess documentation for some helpful recipes.
The simplest version is this:
subprocess.check_call(["python", "py2.py"])
This runs your script, waits for it to finish, and raises an exception if the script returns failure—basically, what you wanted to do with os.system, but better. (For example, it doesn't spawn an unnecessary extra shell, it takes care of error handling, etc.)
That assumes whatever other data you need to share is being shared in some implicit, external way (e.g., by accessing files with the same name). You might be better off passing data to py2.py as command-line arguments and/or stdin, passing data back via stdout, or even opening an explicit pipe or socket to pass things over. Without knowing more about exactly what you need to do, it's hard to suggest anything, but the docs, especially the section Replacing Older Functions with the subprocess Module, have lots of discussion on the options.
To give you an idea, here's a simple example: to pass one of your filename arguments to py2.py, and then get data back from py2.py to py3.py, just have py3.py do this:
py2output = subprocess.check_output(["python", "py2.py", my_args[0]])
And then in py2.py, just print whatever you want to send back.
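For larger data, such as the few thousand lines mentioned in the question, stdin and stdout may be more comfortable than command-line arguments. Here is a minimal sketch, assuming JSON as the exchange format (my choice, not something the question requires). To keep it runnable anywhere, the child is an inline script started with sys.executable; in the real setup you would pass the path of your Python 2 interpreter and py2.py instead. CHILD_SOURCE and run_child are hypothetical names.

```python
import json
import subprocess
import sys

# Stand-in for py2.py: reads a JSON list on stdin, writes a JSON list on stdout.
CHILD_SOURCE = r"""
import json, sys
lines = json.load(sys.stdin)
json.dump([line.upper() for line in lines], sys.stdout)
"""


def run_child(lines, interpreter=sys.executable):
    """Send `lines` to the child on stdin and return what it prints as JSON."""
    proc = subprocess.run(
        [interpreter, "-c", CHILD_SOURCE],
        input=json.dumps(lines).encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the child fails
    )
    return json.loads(proc.stdout.decode("utf-8"))


result = run_child(["first line", "second line"])
```

The child only ever sees text on stdin, so the two interpreters do not need to share any objects or modules.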
The question "Anyone hear when NLTK 3.0 will be out?" here on SO points out that...
There's a Python 3 branch:
https://github.com/nltk/nltk/tree/nltk-py3k
The answer is from July 2011. Things may have improved since then.
I have just looked at https://github.com/nltk/nltk. There is at least the document that talks about Python 3 port related things https://github.com/nltk/nltk/blob/2and3/web/dev/python3porting.rst.
Here is a longer discussion on NLTK and Python 3 that you may be interested in.
And the Grants to Assist Kivy, NLTK in Porting to Python 3 (published 3 days ago) is directly related to the problem.
How do I, at run-time (no LD_PRELOAD), intercept/hook a C function like fopen() on Linux, a la Detours for Windows? I'd like to do this from Python (hence, I'm assuming that the program is already running a CPython VM) and also reroute to Python code. I'm fine with just hooking shared library functions. I'd also like to do this without having to change the way the program is run.
One idea is to roll my own tool based on ptrace(), or on rewriting code found with dlsym() or in the PLT, and targeting ctypes-generated C-callable functions, but I thought I'd ask here first. Thanks.
One of the ltrace developers describes a way to do this. See this post, which includes a full patch to catch dynamically loaded libraries. To call it from Python, you'll probably need to write a C module.
google-perftools has their own implementation of Detour under src/windows/preamble_patcher* . This is windows-only at the moment, but I don't see any reason it wouldn't work on any x86 machine except for the fact that it uses win32 functions to look up symbol addresses.
A quick scan of the code and I see these win32 functions used, all of which have linux versions:
GetModuleHandle/GetProcAddress : get the function address. dlsym can do this.
VirtualProtect : to allow modification of the assembly. mprotect.
GetCurrentProcess: getpid
FlushInstructionCache (apparently a nop according to the comments)
It doesn't seem too hard to get this compiled and linked into python, but I'd send a message to the perftools devs and see what they think.
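The first two items in that list can be explored from Python with ctypes before writing any C. This is only a sketch of the lookup step, assuming a Linux (or similar Unix) system where CDLL(None) exposes the symbols of the running process. It does not patch anything, and the variable names are my own.

```python
import ctypes
import mmap

# CDLL(None) opens the running process itself, like dlopen(NULL).
process_symbols = ctypes.CDLL(None)

# Attribute access resolves the symbol through dlsym() under the hood,
# which is the role GetProcAddress plays on Windows.
fopen_address = ctypes.cast(process_symbols.fopen, ctypes.c_void_p).value

# mprotect() operates on whole pages, so a patcher would first round the
# function address down to a page boundary before changing protections.
page_start = fopen_address & ~(mmap.PAGESIZE - 1)
```

Actually rewriting the function preamble would still need mprotect and careful machine-code handling, which is where the perftools code would come in.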
I'm developing a web game in pure Python, and want some simple scripting available to allow for more dynamic game content. Game content can be added live by privileged users.
It would be nice if the scripting language could be Python. However, it can't run with access to the environment the game runs on since a malicious user could wreak havoc which would be bad. Is it possible to run sandboxed Python in pure Python?
Update: In fact, since true Python support would be way overkill, a simple scripting language with Pythonic syntax would be perfect.
If there aren't any Pythonic script interpreters, are there any other open source script interpreters written in pure Python that I could use? The requirements are support for variables, basic conditionals and function calls (not definitions).
This is really non-trivial.
There are two ways to sandbox Python. One is to create a restricted environment (i.e., very few globals etc.) and exec your code inside this environment. This is what Messa is suggesting. It's nice but there are lots of ways to break out of the sandbox and create trouble. There was a thread about this on Python-dev a year ago or so in which people did things from catching exceptions and poking at internal state to break out to byte code manipulation. This is the way to go if you want a complete language.
The other way is to parse the code and then use the ast module to kick out constructs you don't want (e.g. import statements, function calls etc.) and then to compile the rest. This is the way to go if you want to use Python as a config language etc.
Another way (which might not work for you since you're using GAE), is the PyPy sandbox. While I haven't used it myself, word on the intertubes is that it's the only real sandboxed Python out there.
Based on your description of the requirements (The requirements are support for variables, basic conditionals and function calls (not definitions)) , you might want to evaluate approach 2 and kick out everything else from the code. It's a little tricky but doable.
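A rough sketch of the second approach, assuming Python 3.8 or later: parse the script, walk the tree, and reject anything that is not on a small whitelist of node types. The whitelist below covers variables, basic conditionals and calls to named functions only. The names ALLOWED_NODES, check_script and is_allowed are hypothetical, and this is an illustration rather than a vetted sandbox.

```python
import ast

# Node types the mini-language is allowed to contain (an assumed whitelist).
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.If, ast.Name, ast.Load, ast.Store,
    ast.Constant, ast.Call, ast.Compare, ast.BoolOp, ast.BinOp, ast.UnaryOp,
    ast.And, ast.Or, ast.Not, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
)


def check_script(source, allowed_functions):
    """Return the parsed tree, or raise ValueError on a forbidden construct."""
    tree = ast.parse(source, mode="exec")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed construct: " + type(node).__name__)
        if isinstance(node, ast.Name) and node.id.startswith("_"):
            raise ValueError("disallowed name: " + node.id)
        if isinstance(node, ast.Call):
            # Only plain calls to functions the game explicitly exposes.
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in allowed_functions):
                raise ValueError("disallowed call")
    return tree


def is_allowed(source, allowed_functions):
    try:
        check_script(source, allowed_functions)
        return True
    except (ValueError, SyntaxError):
        return False


script_ok = is_allowed("hp = 5\nif hp < 10:\n    heal(hp)", {"heal"})
```

Imports, attribute access, function definitions and loops are all rejected simply because their node types are missing from the whitelist.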
Roughly ten years after the original question, Python 3.8.0 comes with auditing. Can it help? Let's limit the discussion to hard-drive writing for simplicity - and see:
from sys import addaudithook
def block_mischief(event, arg):
    if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r')
            or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
        raise IOError('file write forbidden')
addaudithook(block_mischief)
So far exec could easily write to disk:
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
But we can forbid it at will, so that no wicked user can access the disk from the code supplied to exec(). Python modules like numpy or pickle eventually go through Python's own file access, so they are barred from writing to disk, too. External program calls have been explicitly disabled as well.
WRITE_LOCK = True
exec("open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("open('/tmp/FILE','a').write('pwned by l33t h4xx0rz')", dict(locals()))
exec("numpy.savetxt('/tmp/FILE', numpy.eye(3))", dict(locals()))
exec("import subprocess; subprocess.call('echo PWNED >> /tmp/FILE', shell=True)", dict(locals()))
An attempt to remove the lock from within exec() seems to be futile, since the auditing hook uses a different copy of locals that is not accessible to the code run by exec. Please prove me wrong.
exec("print('muhehehe'); del WRITE_LOCK; open('/tmp/FILE','w')", dict(locals()))
...
OSError: file write forbidden
Of course, the top-level code can enable file I/O again.
del WRITE_LOCK
exec("open('/tmp/FILE','w')", dict(locals()))
Sandboxing within CPython has proven extremely hard, and many previous attempts have failed. This approach is also not entirely secure, e.g. for public web access:
Hypothetical compiled modules that make direct OS calls may not be audited by CPython; whitelisting safe pure-Python modules is recommended.
There is definitely still the possibility of crashing or overloading the CPython interpreter.
There may even remain some loopholes for writing files to the hard drive, but I could not use any of the usual sandbox-evasion tricks to write a single byte. We can say the "attack surface" of the Python ecosystem reduces to a rather narrow list of events to be (dis)allowed: https://docs.python.org/3/library/audit_events.html
I would be thankful to anybody pointing me to the flaws of this approach.
EDIT: So this is not safe either! I am very thankful to @Emu for his clever hack using exception catching and introspection:
#!/usr/bin/python3.8
from sys import addaudithook
def block_mischief(event, arg):
    if 'WRITE_LOCK' in globals() and ((event=='open' and arg[1]!='r') or event.split('.')[0] in ['subprocess', 'os', 'shutil', 'winreg']):
        raise IOError('file write forbidden')
addaudithook(block_mischief)
WRITE_LOCK = True
exec("""
import sys
def r(a, b):
    try:
        raise Exception()
    except:
        del sys.exc_info()[2].tb_frame.f_back.f_globals['WRITE_LOCK']
import sys
w = type('evil',(object,),{'__ne__':r})()
sys.audit('open', None, w)
open('/tmp/FILE','w').write('pwned by l33t h4xx0rz')""", dict(locals()))
I guess that auditing+subprocessing is the way to go, but do not use it on production machines:
https://bitbucket.org/fdominec/experimental_sandbox_in_cpython38/src/master/sandbox_experiment.py
AFAIK it is possible to run a code in a completely isolated environment:
exec somePythonCode in {'__builtins__': {}}, {}
But in such an environment you can do almost nothing :) (you cannot even import a module; but a malicious user can still run an infinite recursion or exhaust memory.) You would probably want to add some modules that serve as the interface to your game engine.
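The statement form of exec shown above is Python 2 syntax. In Python 3 the same idea might look like the sketch below, where run_restricted and the tiny add API are hypothetical stand-ins for a game-engine interface. As other answers in this thread point out, an empty __builtins__ is not a real security boundary, so treat this as a convenience rather than a sandbox.

```python
def run_restricted(source, api):
    """Execute `source` with no builtins, exposing only the names in `api`."""
    env = {"__builtins__": {}}
    env.update(api)
    exec(source, env)
    return env


# The script can call what we hand it...
env = run_restricted("total = add(2, 3)", {"add": lambda a, b: a + b})

# ...but a plain import fails because __import__ is not available.
try:
    run_restricted("import os", {})
    import_blocked = False
except ImportError:
    import_blocked = True
```

Infinite loops and memory exhaustion are not prevented by this; those would need limits imposed from outside the interpreter.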
I'm not sure why nobody mentions this, but Zope 2 has a thing called Python Script, which is exactly that - restricted Python executed in a sandbox, without any access to filesystem, with access to other Zope objects controlled by Zope security machinery, with imports limited to a safe subset.
Zope in general is pretty safe, so I would imagine there are no known or obvious ways to break out of the sandbox.
I'm not sure how exactly Python Scripts are implemented, but the feature was around since like year 2000.
And here's the magic behind PythonScripts, with detailed documentation: http://pypi.python.org/pypi/RestrictedPython - it even looks like it doesn't have any dependencies on Zope, so can be used standalone.
Note that this is not for safely running arbitrary python code (most of the random scripts will fail on first import or file access), but rather for using Python for limited scripting within a Python application.
This answer is from my comment to a question closed as a duplicate of this one: Python from Python: restricting functionality?
I would look into a two-server approach. The first server is the privileged web server where your code lives. The second server is a very tightly controlled server that only provides a web service or RPC service and runs the untrusted code. You provide your content creator with your custom interface. For example, if you allowed the end user to create items, you would have a lookup that called the server with the code to execute and the set of parameters.
Here's an abstract example for a healing potion.
{function_id='healing potion', action='use', target='self', inventory_id='1234'}
The response might be something like
{hp='+5' action={destroy_inventory_item, inventory_id='1234'}}
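To make the request/response shape a little more tangible, here is a sketch of what the handler on the restricted server could look like if the messages were plain dictionaries (for example, decoded JSON). The handle_request function, the error messages and the set-based inventory are all hypothetical; only the field names come from the example above.

```python
def handle_request(request, inventory):
    """Interpret one request on the restricted server and describe the effects.

    The privileged server stays in charge of actually applying them.
    """
    item_id = request["inventory_id"]
    if item_id not in inventory:
        return {"error": "unknown inventory item"}
    if request["function_id"] == "healing potion" and request["action"] == "use":
        return {
            "hp": "+5",
            "action": {"name": "destroy_inventory_item", "inventory_id": item_id},
        }
    return {"error": "unsupported request"}


request = {
    "function_id": "healing potion",
    "action": "use",
    "target": "self",
    "inventory_id": "1234",
}
response = handle_request(request, inventory={"1234"})
```

Because the restricted server only returns a description of effects, the privileged server can validate each effect before touching real game state.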
Hmm. This is a thought experiment, I don't know of it being done:
You could use the compiler package to parse the script. You can then walk this tree, prefixing all identifiers - variables, method names e.t.c. (also has|get|setattr invocations and so on) - with a unique preamble so that they cannot possibly refer to your variables. You could also ensure that the compiler package itself was not invoked, and perhaps other blacklisted things such as opening files. You then emit the python code for this, and compiler.compile it.
The docs note that the compiler package is not in Python 3.0, but does not mention what the 3.0 alternative is.
In general, this is parallel to how forum software and such try to whitelist 'safe' Javascript or HTML e.t.c. And they historically have a bad record of stomping all the escapes. But you might have more luck with Python :)
I think your best bet is going to be a combination of the replies thus far.
You'll want to parse and sanitise the input - removing any import statements for example.
You can then use Messa's exec sample (or something similar) to allow the code execution against only the builtin variables of your choosing - most likely some sort of API defined by yourself that provides the programmer access to the functionality you deem relevant.
Spinning off from another thread, when is it appropriate to use os.system() to issue commands like rm -rf, cd, make, xterm, ls ?
Considering there are analog versions of the above commands (except make and xterm), I'm assuming it's safer to use these built-in python commands instead of using os.system()
Any thoughts? I'd love to hear them.
Rule of thumb: if there's a built-in Python function to achieve this functionality use this function. Why? It makes your code portable across different systems, more secure and probably faster as there will be no need to spawn an additional process.
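As a small illustration of that rule of thumb, here are standard-library counterparts to a few of the commands from the question, run inside a temporary directory so nothing real is touched. The directory and file names are made up for the example; os.chdir would be the counterpart of cd.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()

# mkdir -p build/cache
scratch = os.path.join(workdir, "build", "cache")
os.makedirs(scratch)

with open(os.path.join(scratch, "a.txt"), "w") as handle:
    handle.write("data")

# ls build/cache
listing = os.listdir(scratch)

# rm -rf build
shutil.rmtree(os.path.join(workdir, "build"))
remaining = os.listdir(workdir)

# Clean up the (now empty) temporary directory.
os.rmdir(workdir)
```

None of these calls starts a process or a shell, and they behave the same on Windows and Unix.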
One of the problems with system() is that it implies knowledge of the shell's syntax and language for parsing and executing your command line. This creates potential for a bug where you didn't validate input properly, and the shell might interpret something like variable substitution or determining where an argument begins or ends in a way you don't expect. Also, another OS's shell might have divergent syntax from your own, including very subtle divergence that you won't notice right away. For reasons like these I prefer to use execve() instead of system() -- you can pass argv tokens directly and not have to worry about something in the middle (mis-)parsing your input.
Another problem with system() (this also applies to using execve()) is that when you code that, you are saying, "look for this program, and pass it these args". This makes a couple of assumptions which may lead to bugs. First is that the program exists and can be found in $PATH. Maybe on some system it won't. Second, maybe on some system, or even a future version of your own OS, it will support a different set of options. In this sense, I would avoid doing this unless you are absolutely certain the system you will run on will have the program. (Like maybe you put the callee program on the system to begin with, or the way you invoke it is mandated by something like POSIX.)
Lastly... There's also a performance hit associated with looking for the right program, creating a new process, loading the program, etc. If you are doing something simple like a mv, it's much more efficient to use the system call directly.
These are just a few of the reasons to avoid system(). Surely there are more.
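In Python, the closest everyday equivalent to passing argv tokens directly is giving subprocess a list instead of a command string. A minimal sketch, using sys.executable as the child program purely so the example runs anywhere; the tricky string is an arbitrary illustration:

```python
import subprocess
import sys

# An argument full of characters a shell would normally interpret.
tricky = "two words; $HOME * > out.txt"

completed = subprocess.run(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", tricky],
    stdout=subprocess.PIPE,
    check=True,
)
received = completed.stdout.decode("utf-8")
```

Since no shell sits in the middle, the child receives the argument exactly as written, with no splitting, globbing, substitution or redirection.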
Darin's answer is a good start.
Beyond that, it's a matter of how portable you plan to be. If your program is only ever going to run on a reasonably "standard" and "modern" Linux then there's no reason for you to re-invent the wheel; if you tried to re-write make or xterm they'd be sending the men in the white coats for you. If it works and you don't have platform concerns, knock yourself out and simply use Python as glue!
If compatibility across unknown systems was a big deal you could try looking for libraries to do what you need done in a platform independent way. Or you need to look into a way to call on-board utilities with different names, paths and mechanisms depending on which kind of system you're on.
The only time that os.system might be appropriate is for a quick-and-dirty solution for a non-production script or some kind of testing. Otherwise, it is best to use built-in functions.
Your question seems to have two parts. You mention calling commands like "xterm", "rm -rf", and "cd".
Side Note: you cannot call 'cd' in a sub-shell. I bet that was a trick question ...
As far as other command-level things you might want to do, like "rm -rf SOMETHING", there is already a python equivalent. This answers the first part of your question. But I suspect you are really asking about the second part.
The second part of your question can be rephrased as "should I use system() or something like the subprocess module?".
I have a simple answer for you: just say NO to using "system()", except for prototyping.
It's fine for verifying that something works, or for that "quick and dirty" script, but there are just too many problems with os.system():
1. It forks a shell for you -- fine if you need one
2. It expands wild cards for you -- fine unless you don't have any
3. It handles redirect -- fine if you want that
4. It dumps output to stderr/stdout and reads from stdin by default
5. It tries to understand quoting, but it doesn't do very well (try 'Cmd" > "Ofile')
6. Related to #5, it doesn't always grok argument boundaries (i.e. arguments with spaces in them might get screwed up)
Just say no to "system()"!
I would suggest that you only use os.system for things that there are not already equivalents for within the os module. Why make your life harder?
The os.system call is starting to be 'frowned upon' in python. The 'new' replacement would be subprocess.call or subprocess.Popen in the subprocess module. Check the docs for subprocess
The other nice thing about subprocess is you can read the stdout and stderr into variables, and process that without having to redirect to other file(s).
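For instance, a sketch along these lines captures both streams separately. The child here is just an inline Python one-liner started with sys.executable so that the example is self-contained:

```python
import subprocess
import sys

child_code = "import sys; sys.stdout.write('result'); sys.stderr.write('warning')"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# communicate() waits for the child and returns both streams as bytes.
out, err = proc.communicate()
```

With os.system you would only get the exit status back, and the output would go straight to the terminal.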
Like others have said above, there are modules for most things. Unless you're trying to glue together many other commands, I'd stick with the things included in the library. If you're copying files, use shutil, working with archives you've got modules like tarfile/zipfile and so on.
Good luck.
I'm noticing that even for system modules, code completion doesn't work too well.
For example, if I have a simple file that does:
import re
p = re.compile(pattern)
m = p.search(line)
If I type p., I don't get completion for methods I'd expect to see (I don't see search() for example, but I do see others, such as func_closure(), func_code()).
If I type m., I don't get any completion whatsoever (I'd expect .groups(), in this case).
This doesn't seem to affect all modules. Has anyone seen this behaviour, and does anyone know how to correct it?
I'm running Vim 7.2 on WinXP, with the latest pythoncomplete.vim from vim.org (0.9), running python 2.6.2.
Completion for this kind of thing is tricky, because it would need to execute the actual code to work.
For example p.search() could return None or a MatchObject, depending on the data that is passed to it.
This is why omni-completion does not work here, and probably never will. It works for things that can be statically determined, for example a module's contents.
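A tiny example of why the type of m cannot be known without running the code (the pattern and input strings are arbitrary):

```python
import re

p = re.compile(r"\d+")

hit = p.search("abc 123")   # a match object, so hit.group() exists
miss = p.search("abc")      # None, which has no .groups() or .group()

matched_text = hit.group()
```

The same line of source yields two different types depending on the input, so a completer that only reads the file has nothing definite to offer after m.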
I never got the builtin omnicomplete to work for any languages. I had the most success with pysmell (which seems to have been updated slightly more recently on github than in the official repo). I still didn't find it to be reliable enough to use consistently but I can't remember exactly why.
I've resorted to building an extensive set of snipMate snippets for my primary libraries and using the default tab completion to supplement.