Restart a process if running longer than x amount of minutes - python

I have a program that creates a multiprocessing pool to handle a web-extraction job. Essentially, a list of product IDs is fed into a pool of 10 processes that handle the queue. The code is pretty simple:
import multiprocessing
import time

num_procs = 10
products = ['92765937', '20284759', '92302047', '20385473', ...etc]

def worker():
    for workeritem in iter(q.get, None):
        time.sleep(10)
        get_product_data(workeritem)
        q.task_done()
    q.task_done()  # account for the None sentinel

q = multiprocessing.JoinableQueue()
procs = []
for i in range(num_procs):
    procs.append(multiprocessing.Process(target=worker))
    procs[-1].daemon = True
    procs[-1].start()

for product in products:
    time.sleep(10)
    q.put(product)

q.join()

for p in procs:
    q.put(None)

q.join()

for p in procs:
    p.join()
The get_product_data() function takes the product, opens an instance of Selenium, navigates to a site, logs in, collects the details of the product, and outputs them to a CSV file. The problem is that, randomly (literally: it happens at different points of the site's navigation or extraction process), Selenium will stop doing whatever it's doing and just sit there, stuck. No exceptions are thrown or anything. I've done everything I can in the get_product_data() function to prevent this, but it seems to just be a problem with Selenium (I've tried Firefox, PhantomJS, and Chrome as the driver, and still run into the same problem no matter what).
Essentially, the process should never run for longer than, say, 10 minutes. Is there any way to kill a process and restart it with the same product id if it has been running for longer than the specified time?
This is all running on a Debian Wheezy box with Python 2.7.

You could write your code using multiprocessing.Pool and the timeout() approach suggested by @VooDooNOFX. This is not tested; consider it executable pseudo-code:
#!/usr/bin/env python
import signal
from contextlib import closing
from multiprocessing import Pool

class Alarm(Exception):
    pass

def alarm_handler(*args):
    raise Alarm("timeout")

def mp_get_product_data(id, timeout=10, nretries=3):
    signal.signal(signal.SIGALRM, alarm_handler)  # XXX: could move it to an initializer
    for i in range(nretries):
        signal.alarm(timeout)
        try:
            return id, get_product_data(id), None
        except Alarm as e:
            timeout *= 2  # retry with an increased timeout
        except Exception as e:
            break
        finally:
            signal.alarm(0)  # disable the alarm; no need to restore the handler
    return id, None, str(e)

if __name__ == "__main__":
    with closing(Pool(num_procs)) as pool:
        for id, result, error in pool.imap_unordered(mp_get_product_data, products):
            if error is not None:  # report and/or reschedule
                print("error: {} for {}".format(error, id))
    pool.join()

You need to ask Selenium to wait an explicit amount of time, or to wait for some DOM element to become available. Take a quick look at the Selenium docs on waits.
From the link, here's an example that waits 10 seconds for the DOM element myDynamicElement to appear.
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait  # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC  # available since 2.26.0

ff = webdriver.Firefox()
ff.get("http://somedomain/url_that_delays_loading")
try:
    element = WebDriverWait(ff, 10).until(EC.presence_of_element_located((By.ID, "myDynamicElement")))
except TimeoutException as why:
    # Do something to reject this item, possibly by re-adding it to the worker queue.
    pass
finally:
    ff.quit()
If nothing is available in the given time period, a selenium.common.exceptions.TimeoutException is raised, which you can catch in a try/except block like the one above.
EDIT
Another option is to ask multiprocessing to time out the process after some amount of time. This is done using the built-in signal library. Here's an excellent example of doing this; however, it's still up to you to add the item back into the work queue when you detect that a process has been killed. You can do this in the def handler section of the code.
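For illustration, here is a minimal sketch of that idea (not the linked example): a SIGALRM handler raises an exception, and the caller puts the timed-out item back on the queue. The names work and q are stand-ins for your get_product_data-style function and the work queue.
import signal

class Timeout(Exception):
    pass

def _alarm_handler(signum, frame):
    raise Timeout()

def process_with_timeout(item, q, work, timeout=600):
    # Run work(item); if it takes longer than `timeout` seconds, requeue the item.
    signal.signal(signal.SIGALRM, _alarm_handler)
    signal.alarm(timeout)            # deliver SIGALRM after `timeout` seconds
    try:
        work(item)
    except Timeout:
        q.put(item)                  # reschedule the item that timed out
    finally:
        signal.alarm(0)              # cancel any pending alarm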

Related

Can't attach to detached selenium window in python

I can't send commands to the Selenium webdriver in a detached session because the link http://localhost:port has died.
But if I put in breakpoint 1, the link stays alive.
import multiprocessing
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def create_driver_pool(q):
    options = Options()
    driver = webdriver.Chrome(options=options)
    pass  # breakpoint 1
    return driver.command_executor._url

windows_pool = multiprocessing.Pool(processes=1)
result = windows_pool.map(create_driver_pool, [1])
print(result)
pass  # breakpoint 2 for testing the link
Why is this happening, and what can I do about it?
After some research I finally found the reason for this behavior.
Thanks to https://bentyeh.github.io/blog/20190527_Python-multiprocessing.html and some googling about signals.
It is not about signals at all.
I found this code in selenium.webdriver.common.service:
def __del__(self):
    print("del detected")
    # `subprocess.Popen` doesn't send a signal on `__del__`,
    # so we attempt to close the launched process when `__del__`
    # is triggered.
    try:
        self.stop()
    except Exception:
        pass
This is the method the garbage collector ends up calling; stop() then kills the driver subprocess via SIGTERM:
self.process.terminate()
self.process.wait()
self.process.kill()
self.process = None
But if you are in debug mode at a breakpoint, the garbage collector won't collect this object, so __del__ never runs.
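A hypothetical workaround sketch (not tested against this exact setup): keep a reference to the driver inside the worker process, for example in a module-level list, so the Service object is never garbage-collected and __del__ never stops chromedriver. This only helps for as long as the pool worker process stays alive.
import multiprocessing
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

_drivers = []  # module-level; lives as long as the pool worker process does

def create_driver_pool(q):
    options = Options()
    driver = webdriver.Chrome(options=options)
    _drivers.append(driver)  # keep a reference so the Service is never collected
    return driver.command_executor._url

windows_pool = multiprocessing.Pool(processes=1)
result = windows_pool.map(create_driver_pool, [1])  # the returned URL should now stay alive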

Closing Selenium Browser that was Opened in a Child Process

Here's the situation:
I create a child process which opens and deals with a webdriver. The child process is finicky and might error, in which case it would close immediately, and control would be returned to the main function. In this situation, however, the browser would still be open (as the child process never completely finished running). How can I close a browser that is initialized in a child process?
Approaches I've tried so far:
1) Initializing the webdriver in the main function and passing it to the child process as an argument.
2) Passing the webdriver between the child and parent process using a queue.
The code:
import multiprocessing
from selenium import webdriver

def foo(queue):
    driver = webdriver.Chrome()
    queue.put(driver)
    # Do some other stuff
    # If finicky stuff happens, this driver.close() will not run
    driver.close()

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=foo, name='foo', args=(queue,))
    p.start()
    # Wait for process to finish
    p.join()
    # Try to close the browser if still open
    try:
        driver = queue.get()
        driver.close()
    except:
        pass
I found a solution:
In foo(), get the process ID of the webdriver when you open a new browser, and add that process ID to the queue. Then, in the main function, add time.sleep(60) to wait for a minute, get the process ID from the queue, and use a try/except to try to kill that particular process ID.
If foo(), running in a separate process, hangs, the browser will then be closed in the main function after one minute.
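A rough sketch of that approach (illustrative only; it assumes the chromedriver PID is reachable via driver.service.process.pid):
import multiprocessing
import os
import signal
import time
from selenium import webdriver

def foo(queue):
    driver = webdriver.Chrome()
    queue.put(driver.service.process.pid)   # hand the driver's PID to the parent
    # ... finicky work that might crash or hang ...
    driver.quit()

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=foo, args=(queue,))
    p.start()
    time.sleep(60)                           # wait a minute, as described above
    driver_pid = queue.get()
    try:
        os.kill(driver_pid, signal.SIGTERM)  # close the driver if it is still running
    except OSError:
        pass                                 # process already exited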

How to break time.sleep() in a python concurrent.futures

I am playing around with concurrent.futures.
Currently my future calls time.sleep(secs).
It seems that Future.cancel() does less than I thought.
If the future is already executing, then time.sleep() does not get cancelled by it.
The same goes for the timeout parameter of wait(): it does not cancel my time.sleep().
How to cancel time.sleep() which gets executed in a concurrent.futures?
For testing I use the ThreadPoolExecutor.
If you submit a function to a ThreadPoolExecutor, the executor will run the function in a thread and store its return value in the Future object. Since the number of concurrent threads is limited, you have the option to cancel the pending execution of a future, but once control in the worker thread has been passed to the callable, there's no way to stop execution.
Consider this code:
import concurrent.futures as f
import time
T = f.ThreadPoolExecutor(1) # Run at most one function concurrently
def block5():
time.sleep(5)
return 1
q = T.submit(block5)
m = T.submit(block5)
print q.cancel() # Will fail, because q is already running
print m.cancel() # Will work, because q is blocking the only thread, so m is still queued
In general, whenever you want to have something cancellable you yourself are responsible for making sure that it is.
There are some off-the-shelf options available, though. For example, consider using asyncio; it also has an example using sleep. The concept circumvents the issue: whenever a potentially blocking operation is about to be called, control is instead returned to an event loop running in the outermost context, together with a note that execution should continue whenever the result is available (or, in your case, after n seconds have passed).
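A minimal asyncio sketch of that idea (nothing here comes from the question; it is just an illustration of a cancellable sleep):
import asyncio

async def work():
    try:
        await asyncio.sleep(5)        # cooperative sleep: the event loop controls it
    except asyncio.CancelledError:
        print("cancelled while sleeping")
        raise

async def main():
    task = asyncio.create_task(work())
    await asyncio.sleep(0.1)
    task.cancel()                     # unlike time.sleep(), this interrupts the wait
    try:
        await task
    except asyncio.CancelledError:
        pass

asyncio.run(main())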
I do not know much about concurrent.futures, but you can use this logic to break out of the sleep. Use a loop of short sleeps instead of one long time.sleep() or wait():
for i in range(secs):
    sleep(1)
A flag check with break (or an interrupt) can then be used to leave the loop early.
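For instance, a small sketch using a threading.Event as the stop flag (the names are illustrative):
import threading
import time

stop = threading.Event()

def interruptible_sleep(secs):
    # sleep in one-second slices so a stop request is noticed quickly
    for _ in range(secs):
        if stop.is_set():
            break
        time.sleep(1)
(stop.wait(secs) would achieve the same thing in a single call, since Event.wait returns as soon as the flag is set or the timeout elapses.)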
I figured it out.
Here is an example:
from concurrent.futures import ThreadPoolExecutor
import queue
import time

class Runner:
    def __init__(self):
        self.q = queue.Queue()
        self.exec = ThreadPoolExecutor(max_workers=2)

    def task(self):
        while True:
            try:
                self.q.get(block=True, timeout=1)
                break
            except queue.Empty:
                pass
            print('running')

    def run(self):
        self.exec.submit(self.task)

    def stop(self):
        self.q.put(None)
        self.exec.shutdown(wait=False, cancel_futures=True)

r = Runner()
r.run()
time.sleep(5)
r.stop()
As written in the documentation, you can use a with statement to ensure threads are cleaned up promptly, like the example below:
import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
I've faced this same problem recently. I had 2 tasks to run concurrently and one of them had to sleep from time to time. In the code below, suppose task2 is the one that sleeps.
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(max_workers=2)
executor.submit(task1)
executor.submit(task2)
executor.shutdown(wait=True)
In order to avoid the endless sleep I extracted task2 to run synchronously. I don't know whether it's good practice, but it's simple and fits perfectly in my scenario.
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(max_workers=1)
executor.submit(task1)
task2()
executor.shutdown(wait=True)
Maybe it's useful to someone else.

Why does this Selenium test fail UNLESS I insert a pdb breakpoint?

I have a Selenium LiveServerTestCase (django project) that tests an AJAXy page. After the main page loads, there's a delay before another item loads, and I need to test for that second item. I'm trying to include a smart way to wait without doing time.sleep(too_long), but the test always fails unless I insert a pdb breakpoint.
def some_test_thing(self):
    # loads a page with some ajaxy stuff, so there's a
    # delay that needs to be accounted for
    url = "something..."
    self.browser.get(url)
    # import ipdb; ipdb.set_trace()  # this test only passes with this statement, wtf^^?
    self.assertWithWait(self.assert_, args=(some_args,))
and I use the convenience function assertWithWait, which tries the assertion in a loop, catching the AssertionError if the test fails, until a timeout.
def assertWithWait(self, assertion, timeout=10, args=(), kwargs={}):
    total_time = 0
    while total_time < timeout:
        try:
            assertion(*args, **kwargs)
        except AssertionError:
            pass
        else:  # assertion passed
            return
        time.sleep(0.2)
        total_time += 0.2
    assertion(*args, **kwargs)  # final try, will cause test failure if unsuccessful
The problem is that the test always fails if I don't have that pdb breakpoint - the loop runs until the timeout at 10 seconds. However, if I include that pdb breakpoint, and even if I enter "c" and continue immediately (like within 1 second), the test will pass. So clearly it's not a timing issue, because the test completes successfully well within the 10 second timeout in the breakpoint case, but fails after waiting 10 seconds in the no-breakpoint case. So it seems like something related to multiprocessing or multithreading that Selenium is doing, that maybe gets released by the pdb breakpoint? I'm grabbing at straws a bit here.
Help!
EDIT
I'm using this assertWithWait not just to wait for things to load in the page, but to wait for the results of some AJAXy server calls to check that things changed in the database too. So I can't just use Selenium's built-in WebDriverWait functions. Sorry, should have been more clear about that.
You are basically reinventing what an Explicit Wait was made for. Use it and handle TimeoutException:
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def some_test_thing(self):
    try:
        element = WebDriverWait(self.browser, 10).until(
            EC.presence_of_element_located((By.ID, "myDynamicElement"))
        )
    except TimeoutException:
        self.fail("Element not found")
This waits up to 10 seconds before throwing a TimeoutException, or returns the element if it is found within 0 to 10 seconds. WebDriverWait by default calls the ExpectedCondition every 500 milliseconds until it returns successfully. A successful return is a Boolean true for ExpectedConditions of Boolean type, and a not-null return value for all other ExpectedCondition types.
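Regarding the EDIT about database checks: WebDriverWait.until also accepts any callable that takes the driver and is polled until it returns a truthy value, so a sketch like the one below could wait on server-side state as well (SomeModel is a hypothetical model name):
WebDriverWait(self.browser, 10).until(
    lambda driver: SomeModel.objects.filter(some_field="some value").exists()
)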
DUH, my bad. Objects! Let's walk through the calling signature of assertWithWait and the way that assertion later gets executed, and the problem will become clear. So here's the relevant stuff:
def assertWithWait(self, assertion, timeout=10, args=(), kwargs={}):
    ...
    assertion(*args, **kwargs)
    ...
Let's suppose the assertion is
self.assertEqual(SomeModel.objects.get(pk=1).some_field,
                 "some value")
So we call assertWithWait like this:
self.assertWithWait(self.assertEqual,
                    args=(SomeModel.objects.get(pk=1).some_field,
                          "some value"))
Can you spot my dumb mistake yet??
The args are being evaluated at the time that assertWithWait is called, and so at each iteration of the loop inside assertWithWait the assertEqual is being called with exactly the same arguments (objects!).
The fix is to change the calling signature of assertWithWait to use lambda:
def assertWithWait(self, assertion_func, timeout=6):
    """
    Tries the assertion in a loop until a timeout.
    Usage:
        self.assertWithWait(lambda: assertion_statement)
    e.g.
        self.assertWithWait(lambda: self.assertEqual(something, something_else))
    """
    ...
    try:
        assertion_func()
    except AssertionError:
        pass
    ...
Now we have an all-purpose assertWithWait that can check browser elements OR database values (or whatever else you like). And as a bonus, the calling syntax got easier to read - python ftw!

Python use of signal alarm in order to limit function execution time doesn't always stop the method

I need to limit function execution time, so I followed Josh Lee's answer:
try:
    with time_limit(10):
        long_function_call()
except TimeoutException, msg:
    print "Timed out!"
where long_function_call() is a Selenium webdriver function that interacts with a page and does some operations.
def long_function_call(self, userName, password):
    driver = self.initDriver()
    try:
        driver.get("https://yyyy.com")
        time.sleep(2)
        if not self.isHttps(driver.current_url):
            isHttps = False
        driver.find_element_by_id("i015516").clear()
        time.sleep(5)
        if 'https://yyy.com' not in driver.current_url:
            self.raiseFailedToLogin('yyy')
    except Exception as e:
        self.raiseException('yyy', e)
    finally:
        driver.close()
        driver.quit()
    return 'yyyy'
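For reference, time_limit in the first snippet is roughly the following signal-based context manager (a sketch based on the referenced answer, not code taken from this question):
import signal
from contextlib import contextmanager

class TimeoutException(Exception):
    pass

@contextmanager
def time_limit(seconds):
    def handler(signum, frame):
        raise TimeoutException("Timed out!")
    signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)   # raise TimeoutException in the main thread after `seconds`
    try:
        yield
    finally:
        signal.alarm(0)     # always disarm the alarm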
In most cases, when the function's execution time exceeded the timeout, the signal was sent and the method stopped. But in some cases the method exceeded the timeout and didn't stop; it seems that Selenium hangs (Firefox is open and nothing is happening in it).
I tried to pause the debugger in these cases, but the pause didn't show me where it hangs.
If I close the Selenium Firefox window, the debugger pause stops on this method:
_read_status [httplib.py:366]
begin [httplib.py:407]
getresponse [httplib.py:1030]
do_open [urllib2.py:1180]
http_open [urllib2.py:1207]

def _read_status(self):
    # Initialize with Simple-Response defaults
    line = self.fp.readline()
    if self.debuglevel > 0:  ################ hangs here
Any idea why, in some cases, the signal alarm doesn't work with Selenium? (I don't think they catch the interrupt.)
This is a very interesting problem you are facing. I've created an example which demonstrates some of the issues you may face when using with time_limit. If you run the code below you might expect that after 1 second a TimeoutException will be raised, and that python, the running thread, and the xterm would then all exit. What actually happens is that after one second the TimeoutException is raised and you will see "Timed out!" in the terminal, but both the thread and xterm continue running. This may be the type of scenario you are facing with Selenium. I don't know how Selenium is implemented, but the firefox process must be spawned in a way similar to how xterm is in the example.
simple approach
import time, os, sys, subprocess, threading

def worker():
    for i in range(30):
        print i
        time.sleep(0.1)

try:
    with time_limit(1):
        threading.Thread(target=worker).start()
        subprocess.check_output('/usr/bin/xterm')
except TimeoutException, msg:
    print "Timed out!"
A potential solution to this problem is the following code. It will kill all children of this python program (which would kill xterm in this case) and then kill itself. This is of course a forceful way to exit, but it guarantees everything ends on timeout.
kill them all
subprocess.call('pkill -P %s' % os.getpid(), shell=True)
subprocess.call('kill -9 %s' % os.getpid(), shell=True)
Taking into account your comments below, another approach would be to have a second thread that performs the kill operations if an operation in the main thread exceeds the specified timeout. This is an implementation of that approach, with an example call.
thread that watches and kills on timeout
import time, os
from subprocess import check_output, call
from threading import Thread
from contextlib import contextmanager

def wait_thread(active, duration):
    start = time.time()
    while active[0] and time.time() - start < duration:
        time.sleep(0.1)
    if active[0]:
        call('pkill -P %s' % os.getpid(), shell=True)
        call('kill -9 %s' % os.getpid(), shell=True)

@contextmanager
def wait(duration):
    active = [True]
    Thread(target=wait_thread, args=(active, duration)).start()
    yield
    active[0] = False

with wait(1):
    time.sleep(0.5)
    print 'finished safely before timeout'

with wait(1):
    call('/usr/bin/xterm')
