I'm using the Pyomo library to solve an optimization problem with the Gurobi optimizer. I have an academic license for Gurobi, but as far as I know this should not pose any kind of limitation. I'd like to optimize the model with different parameters in parallel. In other words, I have many instances of the same model with different parameters that could be solved independently.
I've followed the instructions I found on Pyomo's documentation page about Pyro and dispatching (here is the reference). Basically, I started the Pyomo name server with pyomo_ns, then the dispatch server with dispatch_srvr, and finally launched four instances of pyro_mip_server.
When I launch my Python script, it works and no errors occur. The problem is that it takes the same amount of time as the serial execution. I've also monitored the activity of all eight cores my CPU has, and only one is constantly at 100% load. It's as if no concurrent execution is happening at all.
I'm using the SolverManager provided by the Pyomo library to submit the different instances of my model from the Python script. You can find the guide on how to use it here.
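The submission part of my script looks roughly like the sketch below (simplified, not my exact code; parameter_sets and build_instance() are placeholders for my actual setup, and the exact queue() signature may differ slightly between Pyomo versions):

```python
# Simplified sketch of how I queue the instances through the 'pyro' solver manager.
# parameter_sets and build_instance() are placeholders for my actual setup.
from pyomo.opt import SolverManagerFactory

solver_manager = SolverManagerFactory('pyro')

action_handles = []
for params in parameter_sets:          # one entry per model instance
    instance = build_instance(params)  # builds a concrete Pyomo model
    ah = solver_manager.queue(instance, opt='gurobi')
    action_handles.append(ah)

# Collect results as the remote pyro_mip_servers finish them.
for _ in action_handles:
    done = solver_manager.wait_any()
    results = solver_manager.get_results(done)
```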
I'm on a Linux machine, the latest LTS Ubuntu distro.
Has anybody had this experience, or does someone know what the problem is here? If you need any additional info, just let me know :). Thank you!
We're using gRPC and protobuf objects as our communication method for a reinforcement learning project, using Python as the client and an external program as the server.
We're currently in the process of trying to get multiple environments running in parallel to speed up training times, hence the use of Ray.
However, I am running into issues getting our existing communication objects to serialize properly for ray.remote calls and the like.
This is the main error message that comes up:
“TypeError: cannot pickle ‘google.protobuf.pyext._message.MessageDescriptor’ object”
Any potential solutions or ideas? My Python knowledge is lacking a little, so I'm not exactly sure what the best way around this is.
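For context, a sketch of what we're doing is below (env_pb2 and its Observation message are simplified placeholders for our generated protobuf code). Passing serialized bytes across the Ray boundary with SerializeToString()/ParseFromString(), as in the sketch, seems to sidestep the pickling, but I'm not sure it's the cleanest approach:

```python
# Simplified sketch; env_pb2 and Observation stand in for our generated protobuf
# code. Passing the message object itself to the remote task is what raises the
# MessageDescriptor pickling error, so only bytes cross the Ray boundary here.
import ray
import env_pb2  # placeholder for the generated protobuf module

ray.init()

@ray.remote
def step_environment(obs_bytes: bytes) -> bytes:
    obs = env_pb2.Observation()
    obs.ParseFromString(obs_bytes)      # rebuild the message inside the worker
    # ... do the gRPC call / environment step here ...
    reply = env_pb2.Observation()       # placeholder reply
    return reply.SerializeToString()    # return plain bytes, not a message

obs = env_pb2.Observation()
reply_bytes = ray.get(step_environment.remote(obs.SerializeToString()))
```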
I come from a sort of HPC background, and I am just starting to learn about machine learning in general and TensorFlow in particular. I was initially surprised to find out that distributed TensorFlow is designed to communicate over TCP/IP by default, though it makes sense in hindsight given what Google is and the kind of hardware it uses most commonly.
I am interested in experimenting with running TensorFlow in parallel with MPI on a cluster. From my perspective, this should be advantageous because latency should be much lower thanks to MPI's use of Remote Direct Memory Access (RDMA) across machines without shared memory.
So my question is: why doesn't this approach seem to be more common, given the increasing popularity of TensorFlow and machine learning? Isn't latency a bottleneck? Is there something about the typical problems being solved that makes this sort of solution impractical? Are there likely to be any meaningful differences between calling TensorFlow functions in a parallel way vs. implementing MPI calls inside the TensorFlow library?
Thanks
It seems TensorFlow already supports MPI, as stated at https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/mpi
MPI support for TensorFlow was also discussed at https://arxiv.org/abs/1603.02339
Generally speaking, keep in mind that MPI is best at sending/receiving messages, but not so great at sending notifications and acting upon events.
Last but not least, MPI support for multi-threaded applications (e.g. MPI_THREAD_MULTIPLE) has not always been production-ready among MPI implementations.
These are two general statements, and I honestly do not know whether they are relevant for TensorFlow.
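To make the first point concrete, here is the plain message-passing model MPI is built around, shown with mpi4py purely for illustration (nothing TensorFlow-specific; run under mpirun with two ranks):

```python
# Minimal mpi4py sketch of blocking point-to-point messaging.
# Run with: mpirun -n 2 python demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Rank 0 sends a (picklable) Python object to rank 1.
    comm.send({"weights": [0.1, 0.2]}, dest=1, tag=0)
elif rank == 1:
    # Rank 1 blocks until the matching message arrives.
    data = comm.recv(source=0, tag=0)
    print("rank 1 received", data)
```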
According to the docs in the TensorFlow git repo, TensorFlow actually uses the gRPC library by default, which is built on the HTTP/2 protocol (itself carried over TCP) rather than raw TCP sockets, and this paper should give you some insight. Hope this information is useful.
I am new to TensorFlow, Linux, and ML. I am trying to use a GPU in another system in my lab for training my model. I have connected to the system using SSH.
Now what I am stuck on is how I should write the Python code. One thing I can do is run Python in the terminal window where I can see the username of the other machine I am connected to, but that takes a lot of effort and is not an efficient way of doing it.
What I want to do is write the Python code in a file (on my machine) and run it on the machine that has the GPU. Can you describe how I should do that?
P.S.: I understand this is a very basic question, but I would appreciate it if you could help me with it.
Sorry to plug my own site, but I described how to do this with PyCharm in a blog post.
Really hope this helps you out! If you have any more questions, feel free to ask!
TensorFlow has a Distributed TensorFlow tutorial out now. Since you can connect over SSH, this is an option.
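Roughly, the setup from that tutorial looks like the sketch below (TF 1.x API; the address is a placeholder for your lab machine, and this only covers the simplest single-worker case):

```python
# --- on the GPU machine (placeholder address 192.168.0.10:2222) ---
import tensorflow as tf

cluster = tf.train.ClusterSpec({"worker": ["192.168.0.10:2222"]})
server = tf.train.Server(cluster, job_name="worker", task_index=0)
server.join()  # keep serving graph-execution requests

# --- on your own machine ---
# import tensorflow as tf
# with tf.device("/job:worker/task:0"):
#     c = tf.constant("built locally, executed on the GPU box")
# with tf.Session("grpc://192.168.0.10:2222") as sess:
#     print(sess.run(c))
```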
I currently have an executable that, when running, uses all the cores on my server. I want to add another server and have the jobs split between the two machines, with each job still using all the cores on the machine it is running on. If both machines are busy, I need the next job to queue until one of the two machines becomes free.
I thought this might be controlled by Python; however, I am a novice and not sure which Python package would be best for this problem.
I liked the "heapq" package for queuing the jobs; however, it looks like it is designed for single-server use. I then looked into IPython.parallel, but it seemed more designed for creating a separate smaller job for every core (on one or more servers).
I saw a huge list of different options here (https://wiki.python.org/moin/ParallelProcessing), but I could do with some guidance as to which way to go for a problem like this.
Can anyone suggest a package that may help with this problem, or a different way of approaching it?
Celery does exactly what you want: it makes it easy to distribute a task queue across multiple (many) machines.
See the Celery tutorial to get started.
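A minimal sketch, assuming a Redis broker (any broker Celery supports works); the key detail for your case is to run one worker per machine with concurrency 1, so each job keeps the whole machine and further jobs wait in the queue:

```python
# tasks.py -- minimal Celery sketch; broker URL and executable path are placeholders.
# Start one worker per machine with:  celery -A tasks worker --concurrency=1
import subprocess
from celery import Celery

app = Celery('tasks', broker='redis://queue-host:6379/0')

@app.task
def run_job(args):
    # The executable itself uses every core of whichever machine picked this up.
    return subprocess.call(['/path/to/executable'] + list(args))
```

From your own code you then queue jobs with run_job.delay(['--input', 'job1.dat']); if both workers are busy, the call returns immediately and the job waits in the broker until a machine frees up.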
Alternatively, IPython has its own parallel computing library built in, based on ZeroMQ; see the introduction. I have not used this before, but it looks pretty straightforward.
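Untested sketch of what that route would look like (hosts and paths are placeholders; you would start one engine per machine, e.g. via ipcluster):

```python
# Sketch of the IPython.parallel / ipyparallel route: one engine per machine,
# jobs dispatched through a load-balanced view to whichever engine is free.
import subprocess
import ipyparallel as ipp

rc = ipp.Client()                  # connects to the running controller
view = rc.load_balanced_view()

def run_job(args):
    # Placeholder path; the executable grabs all cores of the engine's machine.
    return subprocess.call(['/path/to/executable'] + list(args))

async_result = view.apply_async(run_job, ['--input', 'job1.dat'])
print(async_result.get())          # blocks until some engine finishes the job
```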
Suppose I have a script written in Python or Ruby, or a program written in C. How do I ensure that the script has no access to network capabilities?
You more or less gave a generic answer yourself by tagging the question with "sandbox", because that's what you need: some kind of sandbox. Things that come to mind are using Jython or JRuby, which run on the JVM. Within the JVM you can create a sandbox using a policy file, so no code in the JVM can do things you don't allow.
For C code, it's more difficult. The brute force answer could be to run your C code in a virtual machine with no networking capabilities. I really don't have a more elegant answer right now for that one. :)
Unless you're using a sandboxed version of Python (PyPy's sandboxed build, for example), there is no reliable way to switch off network access from within the script itself. Of course, you could run under a VM with network access shut off.
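For completeness, the in-process trick people usually mean is monkeypatching the socket module so connection attempts raise, as sketched below; as said, this only stops cooperative code and is trivially bypassed (re-importing the real module, ctypes, spawning a subprocess, and so on):

```python
# Unreliable in-process approach: replace the socket entry points so ordinary
# network calls fail. Determined code can still get around this.
import socket

def _blocked(*args, **kwargs):
    raise RuntimeError("network access disabled for this script")

socket.socket = _blocked
socket.create_connection = _blocked

# Any library that now tries to open a socket hits the RuntimeError, e.g.:
# import urllib.request
# urllib.request.urlopen("http://example.com")   # raises RuntimeError
```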
Firewalls can block specific applications or processes from accessing the network. ZoneAlarm is a good one that I have used to do exactly what you want in the past. So it can be done programmatically, but I don't know nearly enough about OS programming to offer any advice on how to go about doing it.