my python Ray script runs on a single worker only

I am new to Ray and, after having read the documentation, I came up with a script that mimics what I want to do later with Ray. Here is my script:
import ray
import time
import h5py

@ray.remote
class Analysis:

    def __init__(self):
        self._file = h5py.File('./Data/Trajectories/MDANSE/apoferritin.h5')

    def __getstate__(self):
        print('I dump')
        d = self.__dict__.copy()
        del d['_file']
        return d

    def __setstate__(self, state):
        self.__dict__ = state
        self._file = h5py.File('./Data/Trajectories/MDANSE/apoferritin.h5')

    def run_step(self, index):
        time.sleep(5)
        print('I run a step', index)

    def combine(self, index):
        print('I combine', index)

ray.init(num_cpus=4)

a = Analysis.remote()
obj_id = ray.put(a)
for i in range(100):
    output = ray.get(a.run_step.remote(i))
My problem is that when I run this script it runs on a single worker, as indicated by the Ray output, whereas I would expect 4 workers to be fired up. Would you know what is wrong with my script?

Quoting from the Ray docs on actors:

Methods called on different actors can execute in parallel, and methods called on the same actor are executed serially in the order that they are called.

Another issue with the above code is that ray.get is a blocking call: the loop waits for each step to finish before submitting the next one.
I would suggest instantiating multiple actors and distributing the jobs across them, like:

num_cpus = 4  # the same value passed to ray.init above

actors = [Analysis.remote() for i in range(num_cpus)]
outputs = []
for i in range(100):
    outputs.append(actors[i % num_cpus].run_step.remote(i))
output = ray.get(outputs)
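If you also want to avoid blocking on the whole batch at the end, ray.wait can hand back results as they finish. A minimal sketch under the same setup (reusing the actors and num_cpus from above):

# Sketch: process results as they complete instead of one big blocking ray.get.
remaining = [actors[i % num_cpus].run_step.remote(i) for i in range(100)]
while remaining:
    done, remaining = ray.wait(remaining, num_returns=1)
    result = ray.get(done[0])  # run_step returns nothing, so result is None here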

Related

Run an object method in a daemon thread in python

I am trying to simulate an environment with VMs and trying to run an object method in a background thread. My code looks like the following.
hyper_v.py file:
import random
from threading import Thread
from virtual_machine import VirtualMachine

class HyperV(object):
    def __init__(self, hyperv_name):
        self.hyperv_name = hyperv_name
        self.vms_created = {}

    def create_vm(self, vm_name):
        if vm_name not in self.vms_created:
            vm1 = VirtualMachine({'vm_name': vm_name})
            self.vms_created[vm_name] = vm1
            vm1.boot()
        else:
            print('VM:', vm_name, 'already exists')

    def get_vm_stats(self, vm_name):
        print('vm stats of ', vm_name)
        print(self.vms_created[vm_name].get_values())

if __name__ == '__main__':
    hv = HyperV('temp')
    vm_name = 'test-vm'
    hv.create_vm(vm_name)
    print('getting vm stats')
    th2 = Thread(name='vm1_stats', target=hv.get_vm_stats(vm_name))
    th2.start()
virtual_machine.py file in the same directory:
import random, time, uuid, json
from threading import Thread

class VirtualMachine(object):
    def __init__(self, interval=2, *args, **kwargs):
        self.vm_id = str(uuid.uuid4())
        #self.vm_name = kwargs['vm_name']
        self.cpu_percentage = 0
        self.ram_percentage = 0
        self.disk_percentage = 0
        self.interval = interval

    def boot(self):
        print('Bootingup', self.vm_id)
        th = Thread(name='vm1', target=self.update())
        th.daemon = True  # Setting the thread as daemon thread to run in background
        print(th.isDaemon())  # This prints true
        th.start()

    def update(self):
        # This method needs to run in the background, simulating an actual vm with changing values.
        i = 0
        while i < 5:  # Added counter for debugging; ideally this would be while True
            i += 1
            time.sleep(self.interval)
            print('updating', self.vm_id)
            self.cpu_percentage = round(random.uniform(0, 100), 2)
            self.ram_percentage = round(random.uniform(0, 100), 2)
            self.disk_percentage = round(random.uniform(0, 100), 2)

    def get_values(self):
        return_json = {'cpu_percentage': self.cpu_percentage,
                       'ram_percentage': self.ram_percentage,
                       'disk_percentage': self.disk_percentage}
        return json.dumps(return_json)
The idea is to create a thread that keeps updating the values; on request, we read the values of the VM object by calling vm_obj.get_values(). We would be creating multiple VM objects to simulate multiple VMs running in parallel, and we need to get the information from a particular VM on request.
The problem I am facing is that the update() function of the VM does not run in the background (even though the thread is set as a daemon thread).
The method call hv.get_vm_stats(vm_name) waits until the completion of vm_object.update() (which is called by vm_object.boot()) and then prints the stats. I would like to get the stats of the VM on request while keeping vm_object.update() running in the background forever.
Please share your thoughts if I am overlooking anything related to the basics. I tried looking into issues related to the Python threading library but could not come to any conclusion. Any help is greatly appreciated. The next step would be to have a REST API to call these functions to get the data of any VM, but I am stuck with this problem.
Thanks in advance,
As pointed out by @Klaus D in the comments, my mistake was using the parentheses when specifying the target function in the thread definition, which resulted in the function being called right away.
target=self.update() will call the method right away. Remove the () to hand the method over to the thread without calling it.
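For illustration, a minimal corrected boot() based on the question's code (note that the th2 = Thread(name='vm1_stats', target=hv.get_vm_stats(vm_name)) line in hyper_v.py has the same problem):

def boot(self):
    print('Bootingup', self.vm_id)
    th = Thread(name='vm1', target=self.update)  # no (): pass the method, don't call it
    th.daemon = True  # daemon thread runs in the background and dies with the main thread
    th.start()

Since the updater is a daemon thread, the main program has to stay alive (for example, sleeping in a loop) for the stats to keep updating.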

Using Multiprocessing with Modules

I am writing a module such that in one function I want to use the Pool function from the multiprocessing library in Python 3.6. I have done some research on the problem and it seems that you cannot use if __name__ == "__main__", as the code is not being run from main. I have also noticed that the Python pool processes get initialized in my task manager but essentially are stuck.
So for example:
class myClass():
    ...
    # lots of different functions here
    ...
    def multiprocessFunc():
        # do stuff in here

    def funcThatCallsMultiprocessFunc():
        array = [array of filenames to be called]
        if __name__ == "__main__":
            p = Pool(processes=20)
            p.map_async(multiprocessFunc, array)
I tried to remove the if __name__ == "__main__" part but still no dice. Any help would be appreciated.
It seems to me that you have just missed out a self. in your code. I should think this will work:
class myClass():
    ...
    # lots of different functions here
    ...
    def multiprocessFunc(self, file):
        # do stuff in here

    def funcThatCallsMultiprocessFunc(self):
        array = [array of filenames to be called]
        p = Pool(processes=20)
        p.map_async(self.multiprocessFunc, array)  # added self. here
Now having done some experiments, I see that map_async could take quite some time to start up (I think because multiprocessing creates processes) and any test code might call funcThatCallsMultiprocessFunc and then quit before the Pool has got started.
In my tests I had to wait for over 10 seconds after funcThatCallsMultiprocessFunc before calls to multiprocessFunc started. But once started, they seemed to run just fine.
This is the actual code I've used:
MyClass.py
from multiprocessing import Pool
import time
import string

class myClass():
    def __init__(self):
        self.result = None

    def multiprocessFunc(self, f):
        time.sleep(1)
        print(f)
        return f

    def funcThatCallsMultiprocessFunc(self):
        array = [c for c in string.ascii_lowercase]
        print(array)
        p = Pool(processes=20)
        p.map_async(self.multiprocessFunc, array, callback=self.done)
        p.close()

    def done(self, arg):
        self.result = 'Done'
        print('done', arg)
Run.py
from MyClass import myClass
import time

def main():
    c = myClass()
    c.funcThatCallsMultiprocessFunc()
    for i in range(30):
        print(i, c.result)
        time.sleep(1)

if __name__ == "__main__":
    main()
The if __name__ == '__main__' construct is an import protection. You want to use it to stop multiprocessing from running your setup on import.
In your case, you can leave out this protection in the class setup. Be sure to protect the execution points of the class in the calling file like this:
def apply_async_with_callback():
    pool = mp.Pool(processes=30)
    for i in range(z):
        pool.apply_async(parallel_function, args=(i, x, y), callback=callback_function)
    pool.close()
    pool.join()
    print("Multiprocessing done!")

if __name__ == '__main__':
    apply_async_with_callback()

Python Class Instance member variable isn't being updated inside thread

I am currently creating separate instances of my class, Example, then creating a thread for each instance with the class's execute_example_thread function as the thread target. The thread function continues running as long as the member variable m_exit_signal is not set to True. Once Control, Shift, and 2 are pressed on the keyboard, the member variable isn't updated from within the thread instance.
The problem is that the thread function isn't recognizing any change to the member variable. Why isn't it detecting the change? Is the while loop preventing it from doing so?
import keyboard
import multiprocessing
import time

class Example:
    m_exit_signal = False

    def __init__(self):
        keyboard.add_hotkey('control, shift, 2', lambda: self.exit_signaled())

    def execute_example_thread(self):
        exit_status = self.m_exit_signal
        # THREAD continues till exit is called!
        while exit_status == False:
            time.sleep(5)
            exit_status = self.m_exit_signal
            print(exit_status)

    def exit_signaled(self):
        self.m_exit_signal = True
        print("Status {0}".format(self.m_exit_signal))

example_objects = []
example_objects.append(Example())
example_objects.append(Example())
example_threads = []
for value in example_objects:
    example_threads.append(multiprocessing.Process(target=value.execute_example_thread, args=()))
    example_threads[-1].start()
Multiprocessing forks your code so that it runs in a separate process. In the code above, the keyboard callback is calling the method on the instances present in the parent process, while the loop (and a copy of the class instance) is actually running in a forked version in a child process. In order to signal the child, you need to share a variable between the processes and use it to pass data back and forth. Try the code below.
import keyboard
import multiprocessing as mp
import time

class Example(object):
    def __init__(self, hot_key):
        self.run = mp.Value('I', 1)
        keyboard.add_hotkey('control, shift, %d' % hot_key, self.exit_signaled)
        print("Initialized {}".format(mp.current_process().name))

    def execute(self):
        while self.run.value:
            time.sleep(1)
            print("Running {}".format(mp.current_process().name))
        print("{} stopping".format(mp.current_process().name))

    def exit_signaled(self):
        print("exit signaled from {}".format(mp.current_process().name))
        self.run.value = 0

p1 = mp.Process(target=Example(1).execute)
p1.start()
time.sleep(0.1)
p2 = mp.Process(target=Example(2).execute)
p2.start()
Here the parent and the child of each instance share self.run, an mp.Value. To share data, you need to use one of these shared objects, not just any Python variable.
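As a minimal illustration of why an mp.Value works where a plain attribute does not (the names in this sketch are made up for the example):

import multiprocessing as mp
import time

def worker(flag):
    # The child sees the parent's update because the Value lives in shared
    # memory; a plain Python attribute would only change the child's
    # private copy of the object.
    while flag.value:
        time.sleep(0.2)
    print('child saw the flag change, exiting')

if __name__ == '__main__':
    flag = mp.Value('I', 1)  # unsigned int, initially 1
    p = mp.Process(target=worker, args=(flag,))
    p.start()
    time.sleep(1)
    flag.value = 0  # visible inside the child process
    p.join()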

Python - Non-empty shared list on separate thread appears empty

I've two classes - MessageProducer and MessageConsumer.
MessageConsumer does the following:
receives messages and puts them in its message list "_unprocessed_msg_q"
on a separate worker thread, moves the messages to an internal list "_in_process_msg_q"
on the worker thread, processes messages from "_in_process_msg_q"
On my development environment, I'm facing an issue with step 2 above: after adding a message in step 1, when the worker thread checks the length of "_unprocessed_msg_q", it gets it as zero.
When step 1 is repeated, the list properly shows 2 items on the thread on which the item was added. But in step 2, on the worker thread, len(_unprocessed_msg_q) again returns zero.
Not sure why this is happening. I would really appreciate any help on this.
I'm using Ubuntu 16.04 having Python 2.7.12.
Below is the sample source code. Please let me know if more information is required.
import logging
import threading
import time

# LOG was not defined in the original snippet; a basic logger stands in for it.
logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
LOG = logging.getLogger(__name__)

class MessageConsumerThread(threading.Thread):
    def __init__(self):
        super(MessageConsumerThread, self).__init__()
        self._unprocessed_msg_q = []
        self._in_process_msg_q = []
        self._lock = threading.Lock()
        self._stop_processing = False

    def start_msg_processing_thread(self):
        self._stop_processing = False
        self.start()

    def stop_msg_processing_thread(self):
        self._stop_processing = True

    def receive_msg(self, msg):
        with self._lock:
            LOG.info("Before: MessageConsumerThread::receive_msg: "
                     "len(self._unprocessed_msg_q)=%s" %
                     len(self._unprocessed_msg_q))
            self._unprocessed_msg_q.append(msg)
            LOG.info("After: MessageConsumerThread::receive_msg: "
                     "len(self._unprocessed_msg_q)=%s" %
                     len(self._unprocessed_msg_q))

    def _queue_unprocessed_msgs(self):
        with self._lock:
            LOG.info("MessageConsumerThread::_queue_unprocessed_msgs: "
                     "len(self._unprocessed_msg_q)=%s" %
                     len(self._unprocessed_msg_q))
            if self._unprocessed_msg_q:
                LOG.info("Moving messages from unprocessed to in_process queue")
                self._in_process_msg_q += self._unprocessed_msg_q
                self._unprocessed_msg_q = []
                LOG.info("Moved messages from unprocessed to in_process queue")

    def run(self):
        while not self._stop_processing:
            # Allow other threads to add messages to message queue
            time.sleep(1)
            # Move unprocessed listeners to in-process listener queue
            self._queue_unprocessed_msgs()
            # If nothing to process continue the loop
            if not self._in_process_msg_q:
                continue
            for msg in self._in_process_msg_q:
                self.consume_message(msg)
            # Clean up processed messages
            del self._in_process_msg_q[:]

    def consume_message(self, msg):
        print(msg)

class MessageProducerThread(threading.Thread):
    def __init__(self, producer_id, msg_receiver):
        super(MessageProducerThread, self).__init__()
        self._producer_id = producer_id
        self._msg_receiver = msg_receiver

    def start_producing_msgs(self):
        self.start()

    def run(self):
        for i in range(1, 10):
            msg = "From: %s; Message:%s" % (self._producer_id, i)
            self._msg_receiver.receive_msg(msg)

def main():
    msg_receiver_thread = MessageConsumerThread()
    msg_receiver_thread.start_msg_processing_thread()
    msg_producer_thread = MessageProducerThread(producer_id='Producer-01',
                                                msg_receiver=msg_receiver_thread)
    msg_producer_thread.start_producing_msgs()
    msg_producer_thread.join()
    msg_receiver_thread.stop_msg_processing_thread()
    msg_receiver_thread.join()

if __name__ == '__main__':
    main()
Following is the log that I get:
INFO: MessageConsumerThread::_queue_unprocessed_msgs: len(self._unprocessed_msg_q)=0
INFO: Before: MessageConsumerThread::receive_msg: len(self._unprocessed_msg_q)=0
INFO: After: MessageConsumerThread::receive_msg: **len(self._unprocessed_msg_q)=1**
INFO: MessageConsumerThread::_queue_unprocessed_msgs: **len(self._unprocessed_msg_q)=0**
INFO: MessageConsumerThread::_queue_unprocessed_msgs: len(self._unprocessed_msg_q)=0
INFO: Before: MessageConsumerThread::receive_msg: len(self._unprocessed_msg_q)=1
INFO: After: MessageConsumerThread::receive_msg: **len(self._unprocessed_msg_q)=2**
INFO: MessageConsumerThread::_queue_unprocessed_msgs: **len(self._unprocessed_msg_q)=0**
This is not a good design for your application.
I spent some time trying to debug this, but threading code is naturally complicated, so we should try to simplify it instead of making it even more confusing.
When I see threading code in Python, I usually see it written in a procedural form: a normal function that is passed to threading.Thread as the target argument that drives each thread. That way, you don't need to write code for a new class that will have a single instance.
Another thing is that, although Python's global interpreter lock itself guarantees lists won't get corrupted if modified in two separate threads, lists are not a recommended "thread data passing" data structure. You should probably look at threading.Queue for that.
The thing that is wrong in this code at first sight is probably not the cause of your problem, due to your use of locks, but it might be. Instead of
self._unprocessed_msg_q = []
which creates a new list object that the other thread momentarily holds no reference to (so it might write data to the old list), you should do:
self._unprocessed_msg_q[:] = []
Or just use the del slice thing you do in the other method.
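To see the difference, here is a quick illustration of rebinding versus in-place mutation:

a = [1, 2, 3]
b = a        # b refers to the same list object as a
a = []       # rebinds a to a new list; b still sees [1, 2, 3]
print(b)     # [1, 2, 3]

a = b
a[:] = []    # mutates the shared list in place
print(b)     # [] -- both names see the change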
But to be on the safe side, and to have more maintainable and less surprising code, you really should change to a procedural approach there, assuming Python threading. Treat "Thread" as the "final" object that can do its thing, and then use Queues around it:
# coding: utf-8

from __future__ import print_function
from __future__ import unicode_literals

from threading import Thread
try:
    from queue import Queue, Empty
except ImportError:
    from Queue import Queue, Empty
import time
import random

TERMINATE_SENTINEL = object()
NO_DATA_SENTINEL = object()

class Receiver(object):
    def __init__(self, queue):
        self.queue = queue
        self.in_process = []

    def receive_data(self, data):
        self.in_process.append(data)

    def consume_data(self):
        print("received data:", self.in_process)
        del self.in_process[:]

    def receiver_loop(self):
        queue = self.queue
        while True:
            try:
                data = queue.get(block=False)
            except Empty:
                print("got no data from queue")
                data = NO_DATA_SENTINEL
            if data is TERMINATE_SENTINEL:
                print("Got sentinel: exiting receiver loop")
                break
            if data is not NO_DATA_SENTINEL:  # skip the no-data placeholder
                self.receive_data(data)
            time.sleep(random.uniform(0, 0.3))
            if queue.empty():
                # Only process data if we have nothing to receive right now:
                self.consume_data()
                print("sleeping receiver")
                time.sleep(1)
        if self.in_process:
            self.consume_data()

def producer_loop(queue):
    for i in range(10):
        time.sleep(random.uniform(0.05, 0.4))
        print("putting {0} in queue".format(i))
        queue.put(i)

def main():
    msg_queue = Queue()
    msg_receiver_thread = Thread(target=Receiver(msg_queue).receiver_loop)
    time.sleep(0.1)
    msg_producer_thread = Thread(target=producer_loop, args=(msg_queue,))
    msg_receiver_thread.start()
    msg_producer_thread.start()
    msg_producer_thread.join()
    msg_queue.put(TERMINATE_SENTINEL)
    msg_receiver_thread.join()

if __name__ == '__main__':
    main()
Note that since you want multiple methods in the receiver thread to do things with the data, I used a class, but it does not inherit from Thread and does not have to worry about its workings. All its methods are called within the same thread: no need for locks, no worries about race conditions within the receiver class itself. For communicating outside the class, the Queue class is structured to handle any race conditions for us.
The producer loop, as it is just a dummy producer, has no need at all to be written in class form. But it would look just the same if it had more methods.
(The random sleeps help visualize what would happen in "real world" message receiving.)
Also, you might want to take a look at something like:
https://www.thoughtworks.com/insights/blog/composition-vs-inheritance-how-choose
Finally I was able to solve the issue. In the actual code, I have a Manager class that is responsible for instantiating MessageConsumerThread as the last thing in its initializer:
class Manager(object):
    def __init__(self):
        ...
        ...
        self._consumer = MessageConsumerThread(self)
        self._consumer.start_msg_processing_thread()
The problem seems to be with passing self to the MessageConsumerThread initializer while Manager is still executing its own initializer (even though those are the last two steps). The moment I moved the creation of the consumer out of the initializer, the consumer thread was able to see the elements in "_unprocessed_msg_q".
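For illustration, the fix looks roughly like this (a sketch, since the real Manager code is not shown here):

class Manager(object):
    def __init__(self):
        ...
        # the consumer is no longer created here

manager = Manager()
# Create and start the consumer only after Manager.__init__ has finished:
manager._consumer = MessageConsumerThread(manager)
manager._consumer.start_msg_processing_thread()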
Please note that the issue is still not reproducible with the above sample code; it manifests itself in the production environment only. Without the above fix, I tried a queue and a dictionary as well but observed the same issue. After the fix, I tried with a queue and a list and was able to successfully execute the code.
I really appreciate and thank @jsbueno and @ivan_pozdeev for their time and help! The community at Stack Overflow is very helpful!

Executing tasks in parallel in python

I am using Python 2.7 and I have some code that looks like this:
task1()
task2()
task3()
dependent1()

task4()
task5()
task6()
dependent2()

dependent3()
The only dependencies here are as follows: dependent1 needs to wait for tasks 1-3, dependent2 needs to wait for tasks 4-6, and dependent3 needs to wait for dependents 1-2. The following would be okay: running all six tasks in parallel first, then the first two dependents in parallel, then the final dependent.
I prefer to have as many tasks as possible running in parallel. I've googled for some modules, but I was hoping to avoid external libraries, and I'm not sure how the Queue-Thread technique can solve my problem (maybe someone can recommend a good resource?).
The builtin threading.Thread class offers all you need: start to start a new thread and join to wait for the end of a thread.
import threading

def task1():
    pass

def task2():
    pass

def task3():
    pass

def task4():
    pass

def task5():
    pass

def task6():
    pass

def dep1():
    t1 = threading.Thread(target=task1)
    t2 = threading.Thread(target=task2)
    t3 = threading.Thread(target=task3)
    t1.start()
    t2.start()
    t3.start()
    t1.join()
    t2.join()
    t3.join()

def dep2():
    t4 = threading.Thread(target=task4)
    t5 = threading.Thread(target=task5)
    t4.start()
    t5.start()
    t4.join()
    t5.join()

def dep3():
    d1 = threading.Thread(target=dep1)
    d2 = threading.Thread(target=dep2)
    d1.start()
    d2.start()
    d1.join()
    d2.join()

d3 = threading.Thread(target=dep3)
d3.start()
d3.join()
Alternatively to join you can use Queue.join to wait for the threads to end.
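A minimal sketch of that approach, assuming one queue item per task (the names here are illustrative, not from the answer above):

import threading
try:
    from queue import Queue  # Python 3
except ImportError:
    from Queue import Queue  # Python 2

q = Queue()

def worker():
    while True:
        item = q.get()
        # ... run the task for this item ...
        q.task_done()  # mark this unit of work as finished

t = threading.Thread(target=worker)
t.daemon = True  # let the program exit even though the worker loops forever
t.start()

for item in range(6):  # e.g. one item per task
    q.put(item)

q.join()  # blocks until task_done() has been called once per put()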
If you are willing to give external libraries a shot, you can express tasks and their dependencies elegantly with Ray. This works well on a single machine; the advantage here is that parallelism and dependencies can be easier to express with Ray than with Python multiprocessing, and it doesn't have the GIL (global interpreter lock) problem that often prevents multithreading from working efficiently. In addition, it is very easy to scale the workload up on a cluster if you need to in the future.
The solution looks like this:
import ray

ray.init()

@ray.remote
def task1():
    pass

@ray.remote
def task2():
    pass

@ray.remote
def task3():
    pass

@ray.remote
def dependent1(x1, x2, x3):
    pass

@ray.remote
def task4():
    pass

@ray.remote
def task5():
    pass

@ray.remote
def task6():
    pass

@ray.remote
def dependent2(x1, x2, x3):
    pass

@ray.remote
def dependent3(x, y):
    pass

id1 = task1.remote()
id2 = task2.remote()
id3 = task3.remote()
dependent_id1 = dependent1.remote(id1, id2, id3)

id4 = task4.remote()
id5 = task5.remote()
id6 = task6.remote()
dependent_id2 = dependent2.remote(id4, id5, id6)

dependent_id3 = dependent3.remote(dependent_id1, dependent_id2)
ray.get(dependent_id3)  # This is optional; you can get the results if the tasks return an object
You can also pass actual Python objects between the tasks by using them as arguments to the tasks and returning the results (for example, saying "return value" instead of the "pass" above).
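For example, a small sketch of values flowing between tasks (the function names are illustrative):

@ray.remote
def produce():
    return 41  # return an actual Python object instead of pass

@ray.remote
def consume(x):
    # x arrives as the produced value, not as an object ID
    return x + 1

print(ray.get(consume.remote(produce.remote())))  # prints 42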
Using "pip install ray" the above code works out of the box on a single machine, and it is also easy to parallelize applications on a cluster, either in the cloud or your own custom cluster, see https://ray.readthedocs.io/en/latest/autoscaling.html and https://ray.readthedocs.io/en/latest/using-ray-on-a-cluster.html). That might come in handy if your workload grows later on.
Disclaimer: I'm one of the developers of Ray.
Look at Gevent.
Example Usage:
import gevent
from gevent import socket

def destination(jobs):
    gevent.joinall(jobs, timeout=2)
    print [job.value for job in jobs]

def task1():
    return gevent.spawn(socket.gethostbyname, 'www.google.com')

def task2():
    return gevent.spawn(socket.gethostbyname, 'www.example.com')

def task3():
    return gevent.spawn(socket.gethostbyname, 'www.python.org')

jobs = []
jobs.append(task1())
jobs.append(task2())
jobs.append(task3())
destination(jobs)
Hope this is what you have been looking for.
