I'm working on a grid system which has a number of very powerful computers. These can be used to execute Python functions very quickly. My users have a number of Python functions which take a long time to calculate on workstations. Ideally they would like to be able to call some functions on a remote, powerful server, but have them appear to be running locally.
Python has an old function called "apply" - it's mostly useless these days now that python supports the extended-call syntax (e.g. **arguments), however I need to implement something that works a bit like this:
rapply = Rapply( server_hostname ) # Set up a connection
result = rapply( fn, args, kwargs ) # Remotely call the function
assert result == fn( *args, **kwargs ) #Just as a test, verify that it has the expected value.
Rapply should be a class which can be used to remotely execute some arbitrary code (fn could be literally anything) on a remote server. It will send back the result which the rapply function will return. The "result" should have the same value as if I had called the function locally.
Now let's suppose that fn is a user-provided function. I need some way of sending it over the wire to the execution server. If I could guarantee that fn was always something simple, it could just be a string containing Python source code... but what if it were not so simple?
What if fn might have local dependencies: It could be a simple function which uses a class defined in a different module, is there a way of encapsulating fn and everything that fn requires which is not standard-library? An ideal solution would not require the users of this system to have much knowledge about python development. They simply want to write their function and call it.
Just to clarify, I'm not interested in discussing what kind of network protocol might be used to implement the communication between the client & server. My problem is how to encapsulate a function and its dependencies as a single object which can be serialized and remotely executed.
I'm also not interested in the security implications of running arbitrary code on remote servers - let's just say that this system is intended purely for research and it is within a heavily firewalled environment.
Take a look at Pyro (Python Remote Objects). It lets you set up services on all the computers in your cluster and invoke them directly, or indirectly through a name server and a publish-subscribe mechanism.
It sounds like you want to do the following.
Define a shared filesystem space.
Put ALL your python source in this shared filesystem space.
Define simple agents or servers that will "execfile" a block of code.
Your client then contacts the agent (REST protocol with POST methods works well for this) with the block of code.
The agent saves the block of code and does an execfile on that block of code.
Since all agents share a common filesystem, they all have the same Python library structure.
We do this with a simple WSGI application we call "batch server". We have a RESTful protocol for creating and checking on remote requests.
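A minimal sketch of the agent's core step described above (save the block, then run it), assuming Python 3, where `execfile` no longer exists and `exec(compile(...))` stands in for it. The name `run_block` and the `work_dir` parameter are hypothetical, and the HTTP/REST layer is left out entirely.

```python
import os
import tempfile


def run_block(code, work_dir=None):
    """Save a block of code to a file, execute it, and return its namespace."""
    # Save the block first, as the agent described above does.
    fd, path = tempfile.mkstemp(suffix=".py", dir=work_dir)
    with os.fdopen(fd, "w") as handle:
        handle.write(code)

    # Python 3 stand-in for execfile(path, namespace).
    namespace = {"__name__": "__main__", "__file__": path}
    with open(path) as handle:
        exec(compile(handle.read(), path, "exec"), namespace)
    return namespace


ns = run_block("result = sum(range(10))")
print(ns["result"])
```

Because the agents share a filesystem, any `import` inside the block resolves against the same library tree on every machine; the client would read results back from the returned namespace or from files the block writes.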
Stackless Python had the ability to pickle and unpickle running code. Unfortunately, the current implementation doesn't support this feature.
You could use a ready-made clustering solution like Parallel Python. You can relatively easily set up multiple remote slaves and run arbitrary code on them.
You could use an SSH connection to the remote PC and run the commands on the other machine directly. You could even copy the Python code to the machine and execute it.
Syntax:
cat ./test.py | sshpass -p 'password' ssh user@remote-ip "python - script-arguments-if-any for test.py script"
1) Here "test.py" is the local Python script.
2) sshpass is used to pass the SSH password to the ssh connection.
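The same pipe can be driven from Python with `subprocess`. This is only a sketch: it assumes key-based SSH authentication instead of `sshpass`, and the helper names `build_ssh_command` and `run_remote` are made up for illustration.

```python
import subprocess


def build_ssh_command(user, host, script_args=()):
    """Build an ssh command that runs Python remotely, reading the script from stdin."""
    return ["ssh", f"{user}@{host}", "python", "-", *script_args]


def run_remote(script_path, user, host, script_args=()):
    """Pipe a local script to the remote interpreter and return its stdout."""
    with open(script_path, "rb") as script:
        completed = subprocess.run(
            build_ssh_command(user, host, script_args),
            stdin=script,            # same effect as: cat script | ssh ...
            stdout=subprocess.PIPE,
            check=True,              # raise if ssh or the script fails
        )
    return completed.stdout.decode()


print(build_ssh_command("user", "remote-ip", ["--verbose"]))
```

Note that ssh joins the trailing arguments into one string for the remote shell, so script arguments containing spaces would need quoting.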
I know the question title is weird!
I have two virtual machines. The first one has limited resources, while the second one has enough resources, just like a normal machine. The first machine will receive a signal from an external device. This signal will trigger a python compiler to execute a script. The script is big and the first machine does not have enough resources to execute it.
I can copy the script to the second machine to run it there, but I can't make the second machine receive the external signal. I am wondering if there is a way to make the compiler on the first machine (once the external signal is received) call the compiler on the second machine, so that the compiler on the second machine executes the script and uses the second machine's resources. Please see the attached image.
Assume that the connection is established between the two machines, that they can see each other, and that the second machine has a copy of the script. I just need the commands that pass the execution to the second machine and make it use its own resources.
You should look into the microservice architecture to do this.
You can achieve this either by using Flask and sending server requests between the machines, or with something like Nameko, which will allow you to create a "bridge" between machines and call functions between them (this seems to be what you are more interested in). Example for Nameko:
Machine 2 (executor of resource-intensive script):
from nameko.rpc import rpc

class Stuff(object):
    # Nameko requires a service name; the proxy on Machine 1 looks it up by this name.
    name = "Stuff"

    @rpc
    def example(self):
        return "Function running on Machine 2."
You would run the above script through the Nameko shell, as detailed in the docs.
Machine 1:
from nameko.standalone.rpc import ClusterRpcProxy

# This is the AMQP server that Machine 2 would be running.
config = {
    'AMQP_URI': AMQP_URI  # e.g. "pyamqp://guest:guest@localhost"
}

with ClusterRpcProxy(config) as cluster_rpc:
    cluster_rpc.Stuff.example()  # Function running on Machine 2.
More info here.
Hmm, there are many approaches to this problem.
If you want a python only solution, you can check out
dispy http://dispy.sourceforge.net/
Or Dask. https://dask.org/
If you want a robust solution (what I use on my home computing cluster but imo overkill for your problem) you can use
SLURM. SLURM is basically a way to string multiple computers together into a "supercomputer". https://slurm.schedmd.com/documentation.html
For a semi-quick, hacky solution, you can write a microservice. Essentially, your "weak" computer will receive the message and then send an HTTP request to your "strong" computer. Your strong computer will contain the actual program, compute the results, and pass them back to your "weak" computer.
Flask is an easy and lightweight solution for this.
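A rough sketch of that round trip. To stay dependency-free it uses the standard library's `http.server` and `http.client` in place of Flask; `heavy_computation`, `ComputeHandler` and `request_computation` are placeholder names, and both roles run in one process here purely for demonstration.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def heavy_computation(n):
    """Placeholder for the resource-intensive script on the strong machine."""
    return sum(i * i for i in range(n))


class ComputeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body sent by the weak machine and run the job.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"result": heavy_computation(payload["n"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def request_computation(host, port, n):
    """Runs on the weak machine: forward the job and wait for the result."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("POST", "/", body=json.dumps({"n": n}),
                     headers={"Content-Type": "application/json"})
        return json.loads(conn.getresponse().read())["result"]
    finally:
        conn.close()


# Demo: port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), ComputeHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
answer = request_computation("127.0.0.1", server.server_address[1], 10)
print(answer)
server.shutdown()
server.server_close()
```

In a real deployment the server part would run on the strong machine, and the weak machine would call `request_computation` from whatever code handles the external signal.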
All of these solutions require some type of networking. At the least, the computers need to be on the same LAN or both have access over the web.
There are many other approaches not mentioned. For example, you can export an NFS (Network File System) share and have one computer put a file in the shared folder and the other computer perform work on the file. I'm sure there are plenty of other contrived ways to accomplish this task :). I'd be happy to expand on a particular method if you want.
Here is the scenario: my website has to run unsafe code, generated by website users, on my server.
I want to disable some reserved words in Python to protect my running environment, such as eval, exec, print and so on.
Is there a simple way (without changing the python interpreter, my python version is 2.7.10) to implement the feature I described before?
Many thanks.
Disabling names at the Python level won't help, as there are numerous ways around it. See this and this post for more info. This is what you need to do:
For CPython, use RestrictedPython to define a restricted subset of Python.
For PyPy, use sandboxing. It allows you to run arbitrary python code in a special environment that serializes all input/output so you can check it and decide which commands are allowed before actually running them.
Since version 3.8 Python supports audit hooks so you can completely prevent certain actions:
import sys

def audit(event, args):
    if event == 'compile':
        sys.exit('nice try!')

sys.addaudithook(audit)

eval('5')
Additionally, to protect your host OS, use
either virtualization (safer) such as KVM or VirtualBox
or containerization (much lighter) such as lxd or docker
In the case of containerization with docker you may need to add AppArmor or SELinux policies for extra safety. lxd already comes with AppArmor policies by default.
Make sure you run the code as a user with as few privileges as possible.
Rebuild the virtual machine/container for each user.
Whichever solution you use, don't forget to limit resource usage (RAM, CPU, storage, network). Use cgroups if your chosen virtualization/containerization solution does not support these kinds of limits.
Last but not least, use timeouts to prevent your users' code from running forever.
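As one illustration of the timeout point, a minimal sketch that runs the submitted code in a separate interpreter with `subprocess.run` and a time limit. It assumes Python 3.7+, the name `run_with_timeout` is hypothetical, and it covers only the timeout, not the isolation or resource limits listed above.

```python
import subprocess
import sys


def run_with_timeout(code, seconds):
    """Run code in a separate interpreter; return (finished, stdout)."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False, ""
    return True, completed.stdout


print(run_with_timeout("print(2 + 2)", 5))
print(run_with_timeout("while True: pass", 1))
```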
One way is to shadow the methods:
def not_available(*args, **kwargs):
    return 'Not allowed'

eval = not_available
exec = not_available
print = not_available
However, someone smart can always do this:
import builtins
builtins.print('this works!')
So the real solution is to parse the code and not allow the input if it has such statements (rather than trying to disable them).
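A minimal sketch of such a check using the standard `ast` module. It assumes the submitted code parses under the Python 3 grammar (the question's 2.7 treats `print` and `exec` as statements), and the `FORBIDDEN_NAMES` set and `find_violations` helper are illustrative choices, not a complete policy. A static check like this can be bypassed, so it is a first filter rather than a sandbox.

```python
import ast

# Illustrative deny-list; a real policy would need careful review.
FORBIDDEN_NAMES = {"eval", "exec", "compile", "__import__", "open"}


def find_violations(source):
    """Return a sorted list of disallowed constructs found in the source."""
    violations = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            violations.add("import")
        elif isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            violations.add(node.id)
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            # Dunder attributes are a common route back to the builtins.
            violations.add(node.attr)
    return sorted(violations)


print(find_violations("import builtins\nbuiltins.print('this works!')"))
print(find_violations("total = sum([1, 2, 3])"))
```

The input would be rejected whenever the returned list is non-empty.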
Python's Fabric provides the ability to invoke fabric functions outside of the fab utility using the execute function. A contextual problem arises when an execute function is invoked within another function that was called using execute. Fabric loses the context of the outer execute when the inner execute is invoked and never recovers it. For example:
env.roledefs = {
    'webservers': ['web1','web2'],
    'load_balancer': ['lb1']
}

@roles('webserver')
def deploy_code():
    #ship over tar.gz of code to unpack.
    ...
    execute(remove_webserver_from_load_balancer, sHost=env.host_string)
    ...
    #shutdown webserver, unpack files, and restart web server
    ...
    execute(add_webserver_to_load_balancer, sHost=env.host_string)

@roles('load_balancer')
def remove_webserver_from_load_balancer(sHost=None):
    ssh("remove_host %s" % sHost)

execute(deploy_code)
After the first call to execute, Fabric completely loses its context and executes all further commands within the deploy_code function with host_string='lb1' instead of 'web1'. How can I get it to remember it?
I came up with this hack, but I feel like it could break on future releases:
with settings(**env):
    execute(remove_webserver_from_load_balancer, sHost=env.host_string)
This effectively saves all state and restores it after the call, but seems like an unintended use of the function. Is there a better way to tell Fabric that it's in a nested execute and to use a settings stack or an equivalent method to remember state?
Thanks!
You're not using Fabric right: you would normally just call fab deploy_code instead of running the fabfile as if it were a plain Python script. I'd suggest going through the tutorial for a better idea of how to structure your fabfile.
Anyhow though, you can look here for how to use execute(), and here for more of the specifics.
You have a typo: you've dropped the 's' from the webservers role, which might account for you not having a good host string when you want it in the second task.
But that aside, you can also set roles and hosts in the execute() command itself.
Can you recommend a Python tool or module that allows scheduling tasks on remote machines in a network?
Note that the solution must be able to not only run certain jobs/commands on remote machines, but also verify that jobs etc are still running (for example, consider the case where a machine dies after a task has been assigned to it?)
RPyC, or Remote Python Call, is a transparent and symmetrical Python library for remote procedure calls, clustering and distributed computing. Here is an example from Wikipedia:
import rpyc

conn = rpyc.classic.connect("hostname")  # assuming a classic server is running on 'hostname'
print(conn.modules.sys.path)
conn.modules.sys.path.append("lucy")
print(conn.modules.sys.path[-1])

# a version of 'ls' that runs remotely
def remote_ls(path):
    ros = conn.modules.os
    for filename in ros.listdir(path):
        stats = ros.stat(ros.path.join(path, filename))
        print("%d\t%d\t%s" % (stats.st_size, stats.st_uid, filename))

remote_ls("/usr/bin")

# and exceptions...
try:
    f = conn.builtin.open("/non/existent/file/name")
except IOError:
    pass
To check if the remote server has died after assigning it a job, you can use the ping method of the Connection class. The complete API is described here.
Fabric (http://docs.fabfile.org/en/1.0.1/index.html) is a pretty good toolkit for various sysadmin and deployment tasks. It comes with a few predefined tasks but also gives you the flexibility to add what you need.
I highly recommend it.
You should be able to use Python WMI; for *NIX-based systems, the equivalent would be a wrapper around SSH and cron.
I have a script that uses GTK, and I need to know when another copy of the script starts. If one does, the window should extend.
Please, tell me the way I can detect it.
You could use a D-Bus service. Your script would start a new service if none is found running in the current session, and otherwise send a D-Bus message to the running instance (the message can carry "anything", including strings, lists and dicts).
The GTK-based library libunique (missing Python bindings?) uses this approach in its implementation of "unique" applications.
You can use a PID file to determine if the application is already running (just search for "python daemon" on Google to find some working implementations).
If you detected that the program is already running, you can communicate with the running instance using named pipes.
The new copy could search for running copies and send a SIGUSR1 signal, triggering a callback in your running process that then handles all the magic.
See the signal library for details and the list of things that can go wrong.
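A small sketch of the signal idea, assuming a POSIX system (SIGUSR1 is not available on Windows) and assuming the new copy already knows the PID of the running instance, for example from a PID file. The handler name and the `received` list are illustrative; a GTK program would resize its window in the handler instead.

```python
import os
import signal

received = []


def on_new_copy(signum, frame):
    """Runs in the already-running instance when a new copy signals it."""
    received.append(signum)  # a GTK app would extend its window here


# The running instance installs the handler once at startup.
signal.signal(signal.SIGUSR1, on_new_copy)


def notify_running_instance(pid):
    """Called by the new copy once it knows the PID of the running instance."""
    os.kill(pid, signal.SIGUSR1)


# Demo: the process signals itself, standing in for a second copy.
notify_running_instance(os.getpid())
print(len(received))
```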
I've done that in several ways, depending upon the scenario.
In one case my script had to listen on a TCP port, so I'd just check whether the port was available; if it was, no other copy was running. This was sufficient for me, but in certain cases the port might already be in use because some other kind of application is listening on it. You can use OS calls to find out who is listening on the port, or try sending data and checking the response.
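The port check can be sketched as follows. The function name `claim_port` is made up, and the sketch only distinguishes "port free" from "port taken"; it does not identify who holds the port.

```python
import socket


def claim_port(port, host="127.0.0.1"):
    """Try to bind the port; return the socket for the first copy, else None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError:
        # Something (possibly another copy) already holds the port.
        sock.close()
        return None
    sock.listen(1)
    return sock


# Demo: port 0 lets the OS pick a free port for the "first copy".
first = claim_port(0)
port = first.getsockname()[1]
second = claim_port(port)
print(second is None)
first.close()
```

The first copy has to keep the returned socket open for its whole lifetime, since closing it releases the port.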
In another case I used a PID file. Just decide on a location and a filename, and every time your script starts, read that file to get a PID. If that PID is running, it means another copy is already there. Otherwise create the file and write your process ID in it. This is pretty simple. If you are using Django then you can simply use Django's daemonizer: "from django.utils import daemonize". Otherwise you can use this script: http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
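The PID-file steps above could look roughly like this. It is a sketch for POSIX systems only; the function names are hypothetical, and the read-then-write sequence is not atomic, so two copies starting at the same instant could both conclude they are first.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def another_copy_running(pid_path):
    """Return True if another copy is running; otherwise claim the PID file."""
    try:
        with open(pid_path) as handle:
            pid = int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        pid = None  # no file yet, or unreadable contents

    if pid is not None and pid != os.getpid() and pid_is_running(pid):
        return True

    # No live owner: record our own PID (this also replaces a stale file).
    with open(pid_path, "w") as handle:
        handle.write(str(os.getpid()))
    return False
```

A script would call `another_copy_running(...)` once at startup with an agreed path and, if it returns True, contact the running instance instead of opening a second window.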