I have written a Python test script to test the functionality of a website, such as logging into the web page. To maximize testing, I have tried to implement multithreading to speed up the test process (so I could run two test cases concurrently). I found that when I run the script, two browsers open (which is correct); however, only one of the browsers performs the actions I scripted (such as clicking an element). browser.get(link) works correctly, but browser.find_element_by_xpath(xpath).click() does not.
import threading

thread1 = threading.Thread(target=runTC, args=(argument1,))
thread2 = threading.Thread(target=runTC, args=(argument2,))
# Will execute both in parallel
thread1.start()
thread2.start()
runTC() consists of the test functions I wrote.
To benefit from multithreading, each of your threads has to work with a different WebDriver object, since a WebDriver object is essentially a binding to a particular session/instance of your web browser.
A web driver is a service that, on one side, exposes a REST interface (the W3C WebDriver standard) to your language bindings and, on the other, translates the calls from your test script into what a particular browser understands (which is why there are different web drivers for different browsers).
When you create an instance of WebDriver in your test script, it establishes a session associated with a running browser. So if your multiple threads use the same object, they will be acting within the same session, using the same cookies, and hence impacting each other.
If your threads each use their own instance of WebDriver, they will run in isolation, in parallel, each in its own browser.
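For illustration, a minimal sketch of the one-driver-per-thread pattern, assuming Chrome; runTC here is a stand-in for your own test function and the URLs are placeholders:

import threading
from selenium import webdriver

def runTC(url):
    driver = webdriver.Chrome()  # each thread creates and owns its own browser session
    try:
        driver.get(url)
        # ... perform the scripted actions against this thread's driver only ...
    finally:
        driver.quit()

thread1 = threading.Thread(target=runTC, args=("https://example.com/a",))
thread2 = threading.Thread(target=runTC, args=("https://example.com/b",))
thread1.start()
thread2.start()
thread1.join()
thread2.join()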
I am trying to build a browser-based debugging tool for Python, but I am having trouble combining the Python internals with user interaction. My setup looks like this (jsfiddle link). Essentially, I am using sys.settrace to inspect a function one opcode at a time. My goal is for the user to be able to step forward one instruction at a time by pressing the Step button.
The problem I am having is that the tracer function cannot be asynchronous, because Python does not allow that, and if it is synchronous there is no real way to stop and wait for input/interaction. If I just use an infinite while loop, it freezes the browser's main thread and the page becomes unresponsive.
Does anyone have advice on how I can structure this to allow interaction between the UI and the tracer function?
I managed to get a workaround based on a service worker, using an example implementation from here. It exploits the fact that you can make a synchronous thread wait on an HTTP request: we intercept that request with a service worker and keep it pending for as long as we need, and when we are done we can even send data back with the response.
Just point me in the right direction, please. I have no idea how to deal with this.
So, I want to connect my scraping script, written in Python, to a front-end Android app. I have already written the script, and the front end is ready as well. However, I don't know how these two should communicate with each other, such that the script constantly listens for requests from the Android app (through Firebase, maybe?).
There is one more thing, however. Since multiple users would use the app at the same time, there will be parallel requests sent from the app as well. How do I let the script process the requests concurrently, without waiting for the first one to complete? All the scraping is done through the requests library. I researched a bit and found some hints related to threading, queues, async, etc.
Kindly tell me which way I should go.
In the given scenario, you can put the script in an Azure Python function and call it whenever required. However, as you mentioned, there will be multiple parallel requests, which might pose a problem due to the single-threaded architecture of the Python language worker.
How to handle such scenarios is documented in our Python Functions developer reference: functions-reference-python (also check asyncio-eventloop).
Here are two methods to handle this:
Use async calls (a minimal sketch follows below).
Add more language worker processes per host; this can be done with the application setting FUNCTIONS_WORKER_PROCESS_COUNT, up to a maximum value of 10.
[Please note that each new language worker is spawned every 10 seconds until they are warm.]
Here is a GitHub issue that discusses this in detail: https://github.com/Azure/azure-functions-python-worker/issues/236
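For illustration, a minimal async HTTP-triggered function sketch; scrape_page and the url query parameter are hypothetical, and it uses aiohttp rather than requests, since requests is synchronous and would block the worker's event loop:

import aiohttp
import azure.functions as func

async def scrape_page(url: str) -> str:
    # Fetch the page without blocking the single-threaded worker.
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            return await resp.text()

async def main(req: func.HttpRequest) -> func.HttpResponse:
    url = req.params.get("url")
    if not url:
        return func.HttpResponse("Missing 'url' parameter", status_code=400)
    body = await scrape_page(url)
    return func.HttpResponse(body, status_code=200)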
I have a script that uses a lot of headless Selenium automation and looped HTTP requests. It's very important that I implement a threading/worker queue for this script. I've done that.
My question is: should I be using multithreading or multiprocessing? A ThreadPool or a ProcessPool? I know that:
"If your program spends more time waiting on file reads or network requests or any type of I/O task, then it is an I/O bottleneck and you should be looking at using threads to speed it up."
and...
"If your program spends more time in CPU based tasks over large datasets then it is a CPU bottleneck. In this scenario you may be better off using multiple processes in order to speed up your program. I say may as it’s possible that a single-threaded Python program may be faster for CPU bound problems, it can depend on unknown factors such as the size of the problem set and so on."
Which is the case when it comes to Selenium? Am I right to think that all CPU-bound tasks related to Selenium are executed separately, via the web driver, or would my script benefit from multiple processes?
Or, to be more concise: when I thread Selenium in my script, is the web driver limited to one CPU core, the same core the script's threads are running on?
A web driver is just a driver; a driver cannot drive without a car.
For example, when you use ChromeDriver to communicate with the browser, you are launching Chrome, and ChromeDriver itself does no computation; Chrome does.
So, to clarify: the web driver is a tool to control the browser, but it is not itself the browser. The browser's CPU-bound work (rendering, executing JavaScript) happens in the browser's own processes, not in your Python threads.
Based on this, you should definitely choose a thread pool instead of a process pool, as the work left in your Python script is surely I/O-bound.
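As a minimal sketch, assuming headless Chrome; fetch_title and the URLs are placeholder examples, with one driver created per task so no session is shared between threads:

from concurrent.futures import ThreadPoolExecutor
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def fetch_title(url):
    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)  # one driver per task, never shared
    try:
        driver.get(url)
        return driver.title  # the browser, not this thread, does the heavy lifting
    finally:
        driver.quit()

urls = ["https://example.com", "https://example.org"]
with ThreadPoolExecutor(max_workers=2) as pool:
    for title in pool.map(fetch_title, urls):
        print(title)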
I have created a test suite that has 2 test cases recorded using Selenium in Firefox. The test cases are in separate classes, each with its own setup and teardown functions, because of which each test case opens the browser and closes it during its execution.
I am not able to use the same web browser instance for every test case called from my test suite. Is there a way to achieve this?
This is how it is supposed to work.
Tests should be independent; otherwise they can influence each other.
You generally want a clean browser for each test rather than having to clear the session/cookies yourself each time. Maybe you don't care now, but once you have a larger suite you certainly will.
Each scenario will start the browser and close it at the end. To change that, you would have to find the methods responsible and override them, which is not recommended at all.
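That said, if you really want one browser shared across the tests of a class, a common (but, as above, not recommended) sketch uses class-level fixtures; this assumes unittest and Firefox, and the test bodies and URLs are placeholders:

import unittest
from selenium import webdriver

class SharedBrowserTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.browser = webdriver.Firefox()  # opened once for the whole class

    @classmethod
    def tearDownClass(cls):
        cls.browser.quit()  # closed once, after all tests in the class have run

    def test_login(self):
        self.browser.get("https://example.com/login")  # placeholder URL
        # ... assertions; note that cookies/session leak into the next test ...

    def test_logout(self):
        self.browser.get("https://example.com/logout")  # placeholder URL

Sharing a browser across separate classes, as in your suite, would need a module-level or session-level fixture instead, which is exactly the kind of overriding discouraged above.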
My code runs several instances of threading.Thread for some long asynchronous tasks.
This prevents me from running my Django unit tests with the sqlite backend, because sqlite cannot handle multiple connections across threads. I am therefore successfully mocking Thread with a FakeThread class that I wrote (it simply runs the target synchronously).
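For reference, a minimal sketch of what such a FakeThread could look like (hypothetical, since the actual implementation is not shown):

class FakeThread:
    # Drop-in stand-in for threading.Thread that runs the target synchronously.
    def __init__(self, target=None, args=(), kwargs=None, **extra):
        self._target = target
        self._args = args
        self._kwargs = kwargs or {}

    def start(self):
        if self._target:
            self._target(*self._args, **self._kwargs)

    def join(self, timeout=None):
        pass  # nothing to wait for; start() already ran the target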
However, the mock does not seem to work for selenium tests. I do:
from unittest import mock

from django.test import LiveServerTestCase

from tests.stubs import FakeThread

# ...

class FunctionalTest(LiveServerTestCase):
    @mock.patch('accounts.models.user_profile.Thread', new=FakeThread)
    def test_register_agency(self):
        self.browser.get("%s%s" % (self.live_server_url, "/register"))
        # .. fill in form, submit, eventually calls something in user_profile
        # using an instance of Thread. Thread seems to still be threading.Thread
Any idea how to mock Thread in the code that selenium runs when serving my browser calls? Thank you!