Selenium reusing browser session - python

I would like to have a single long-running browser session that is reused between separate runs of my script, so that I can avoid logging in every time the script runs. Using other answers I have a working solution:
session_info = load_from_json()
options = webdriver.ChromeOptions()
driver = webdriver.Remote(
    command_executor=session_info["executor_url"],
    desired_capabilities={},
    options=options)
driver.session_id = session_info["session_id"]
This has the unwanted side effect of leaving an orphaned Chrome WebDriver session lying around on top of the already existing browser session. What can I do to avoid the extra orphaned session?

Prior to loading a new session try to clear both session and local storage.
driver.getSessionStorage().clear();
driver.getLocalStorage().clear();
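The two calls above are written in Java style. A rough Python equivalent, as a sketch only, is to run the standard Web Storage `clear()` calls through the driver's `execute_script` method. The helper name `clear_web_storage` is hypothetical, and `driver` is assumed to be a Selenium WebDriver that has already navigated to the site in question.

```python
def clear_web_storage(driver):
    """Clear sessionStorage and localStorage for the page currently loaded.

    `driver` is assumed to be a Selenium WebDriver (any object with an
    `execute_script` method works). Web storage is scoped per origin, so the
    browser must already be on a page of the site you want to clear.
    """
    driver.execute_script("window.sessionStorage.clear();")
    driver.execute_script("window.localStorage.clear();")
```

Note that this only clears web storage; it does not by itself remove the extra WebDriver session described in the question.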

Related

Python selenium share second browser between test

I'm quite new to Selenium and may be doing something wrong. Please correct me!
I'm creating some E2E tests, and some of them require a second account.
Each time I open a new browser I have to go through the login procedure, which takes time.
So I decided to keep the second browser open between tests and reuse it.
But I can't pass the newly created Selenium object to the second test. What am I doing wrong?
class RunTest(unittest.TestCase):
    @classmethod
    def setUpClass(self):
        # main browser that I always use
        self.driver = webdriver.Chrome(...)

    def init_second_driver(self):
        # second browser that could be used by some tests
        self.second_driver = webdriver.Chrome(...)

    def test_case1(self):
        if not self.second_driver:
            self.init_second_driver()
        # some tests for case 1

    def test_case2(self):
        if not self.second_driver:  # this check always fails! WHY?
            self.init_second_driver()
        # some tests for case 2
Thank you in advance
Every time you create your chromedriver object, its default behaviour is to create a new Chrome profile. Think of a profile as your local cookie store and cache.
You want this to happen. Selenium is designed for testing, and logging in each time without history ensures your tests always start from the same state (not logged in and no cookies).
If you have a lot of tests and want your suite to run faster, consider running tests in parallel.
For now, if you want to try sharing state between tests (i.e. staying logged in), you can instruct Chrome to reuse a profile with the following option:
options = webdriver.ChromeOptions()
options.add_argument('--user-data-dir=C:/Path/To/Your/Testing/User Data')
driver = webdriver.Chrome(options=options)
That should remove the need for a second browser to hold your logged-in state.
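On the question of why the check in `test_case2` never sees the driver created in `test_case1`: `unittest` builds a fresh `TestCase` instance for every test method, so an attribute assigned to `self` inside one test is not visible in the next. A minimal sketch of one way around this is to keep the second driver on the class and create it lazily. The `make_driver` factory below is a hypothetical stand-in for `webdriver.Chrome(...)` so that the example runs without a browser.

```python
import unittest


def make_driver():
    """Hypothetical factory; real tests would return webdriver.Chrome(...) here."""
    return object()


class RunTest(unittest.TestCase):
    # Class attribute: shared by every test instance, empty until first use.
    second_driver = None

    @classmethod
    def get_second_driver(cls):
        # Create the second browser once and store it on the class, not on self.
        if cls.second_driver is None:
            cls.second_driver = make_driver()
        return cls.second_driver

    @classmethod
    def tearDownClass(cls):
        # A real driver has quit(); the placeholder object does not.
        quit_method = getattr(cls.second_driver, "quit", None)
        if quit_method is not None:
            quit_method()
        cls.second_driver = None

    def test_case1(self):
        self.assertIsNotNone(self.get_second_driver())

    def test_case2(self):
        # Returns the object stored on the class, whichever test created it.
        self.assertIs(self.get_second_driver(), type(self).second_driver)
```

Sharing a browser this way trades test isolation for speed, which is the same trade-off the answer above describes for a reused profile.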

Working with Selenium and multiple Chrome Browsers on Linux

I have developed a python application for Selenium using Chrome/ChromeDriver.
The application seems to work pretty well on my Windows laptop, but when I moved everything to my Linux server I started to notice strange behavior when running multiple browser instances in parallel.
The approach I am currently using is pretty simple: I have one daemon that launches a separate process with Popen for each sub-process, as follows:
sub_call = subprocess.Popen(["python", "my_script.py"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True, universal_newlines=True)
Ideally each sub-process should run in parallel in a separate Chrome session, with its own browser parameters (cookies, proxies, etc.).
No sync mechanisms are implemented among the sub-processes.
For now I would prefer to avoid Selenium Grid because I only plan to run a few instances in parallel (let's say fewer than 5), and I have hardware constraints since everything runs in a Docker container on a NAS.
This is the code I use to instantiate every ChromeDriver object:
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument('--disable-blink-features=AutomationControlled')
chrome_options.add_argument('--user-agent={}'.format(constants.BROWSER_AGENT_HEADER))
chrome_options.add_argument("--headless")
#chrome_options.add_argument("--remote-debugging-port={}".format(CHROME_DRIVER_DEBUG_PORT))
chrome_options.add_argument("--window-size={}".format(CHROME_DRIVER_RESOLUTION))
if ON_POSIX:
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
# chrome_options.add_argument('--remote-debugging-address={}'.format(CHROME_DRIVER_DOCKER_BIND))
chrome_options.add_argument('--proxy-server=http://{}:{}'.format(self.proxy_ip, self.proxy_port))
cap = webdriver.DesiredCapabilities.CHROME
cap['proxy'] = {
    "httpProxy": proxy_address_str,
    "httpsProxy": proxy_address_str,
    "ftpProxy": proxy_address_str,
    "sslProxy": proxy_address_str,
    "proxyType": "MANUAL",
}
cap['goog:loggingPrefs'] = {'performance': 'ALL'}
service_args = ["--log-path={}/chromedriver.log".format(LOGFILE_PATH),
                "--whitelisted-ips=", "--profile-directory={}".format(username)]
self.driver = webdriver.Chrome(executable_path=constants.CHROME_DRIVER_PATH,
                               service_args=service_args, desired_capabilities=cap, options=chrome_options)
Here I tried to separate each user environment with the --profile-directory arg (NOTE: I just used my system user names, without actually setting up any Chrome profile).
However, it seems that while on Windows each process opens its own Chrome window, on Linux there are conflicts.
The major issue is that when a process ends, executing any one of the lines below affects the other processes' browsers in some way:
driver.quit()
driver.close()
driver.find_element_by_tag_name('body').send_keys(Keys.CONTROL + 'w')
I have temporarily commented out each of the above lines. It seems to work perfectly (10 hours running at normal duty, no issues).
Now my concern is that all the open Chrome instances will end up consuming all my memory.
How can I clean up the browser without affecting my other running processes?
Thank you in advance for any support I will receive.
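One thing that may be worth trying, offered as an unverified guess rather than a diagnosis: as far as I know `--profile-directory` is a Chrome switch that selects a profile inside a user data directory, not a ChromeDriver service argument, so the processes may still end up sharing one user data directory. The sketch below gives each process a private `--user-data-dir` and guarantees cleanup. The names `isolated_profile_args` and `managed_driver`, and the `make_driver` callable, are hypothetical; `make_driver` is assumed to add the given arguments to `ChromeOptions` and return the driver.

```python
import shutil
import tempfile
from contextlib import contextmanager


def isolated_profile_args(prefix="chrome-profile-"):
    """Return (profile_dir, chrome_args) for a private Chrome user data directory."""
    profile_dir = tempfile.mkdtemp(prefix=prefix)
    return profile_dir, ["--user-data-dir={}".format(profile_dir)]


@contextmanager
def managed_driver(make_driver):
    """Create a driver via make_driver(chrome_args); always quit it and delete its profile.

    `make_driver` is a hypothetical callable supplied by the caller, for example
    one that passes each argument to ChromeOptions.add_argument and returns
    webdriver.Chrome(options=...).
    """
    profile_dir, chrome_args = isolated_profile_args()
    driver = None
    try:
        driver = make_driver(chrome_args)
        yield driver
    finally:
        if driver is not None:
            driver.quit()  # ends the browser and driver started by this process
        shutil.rmtree(profile_dir, ignore_errors=True)
```

Each sub-process would wrap its work in `with managed_driver(factory) as driver:` so that `quit()` runs even when the script fails partway through.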

(Selenium) Running many firefox browser with less performance [duplicate]

I am using selenium with Firefox to automate some tasks on Instagram. It basically goes back and forth between user profiles and notifications page and does tasks based on what it finds.
It has one infinite loop that makes sure that the task keeps on going. I have sleep() function every few steps but the memory usage keeps increasing. I have something like this in Python:
while True:
    expected_conditions()
    ...doTask()
    driver.back()
    expected_conditions()
    ...doAnotherTask()
    driver.forward()
    expected_conditions()
I never close the driver because that would slow the program down a lot, as it has a lot of queries to process. Is there any way to keep the memory usage from increasing over time without closing or quitting the driver?
EDIT: Added explicit conditions but that did not help either. I am using headless mode of Firefox.
Well, this is a serious problem that I had been struggling with for some days, but I have found a solution. You can add some flags to optimize your memory usage.
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
options.add_argument('--no-sandbox')
options.add_argument('--disable-application-cache')
options.add_argument('--disable-gpu')
options.add_argument("--disable-dev-shm-usage")
These are the flags I added. Before I added them, RAM usage kept increasing, and once it crossed 4 GB (my machine has 8 GB) the machine froze. After I added these flags, memory usage did not cross 500 MB. And as DebanjanB's answer says, if you are running a for loop or while loop, try sleeping for a few seconds after each iteration; it gives some time to kill the unused thread.
To start with, Selenium has very little control over the amount of RAM used by Firefox. As you mentioned, the browser client (Mozilla Firefox) goes back and forth between user profiles and the notifications page on Instagram and does tasks based on what it finds, which is too broad as a single use case. So the first and foremost task would be to break up the infinite loop pertaining to your use case into smaller tests.
time.sleep()
Inducing time.sleep() virtually puts a blanket over the underlying issue. However while using Selenium and WebDriver to execute tests through your Automation Framework, using time.sleep() without any specific condition defeats the purpose of automation and should be avoided at any cost. As per the documentation:
time.sleep(secs) suspends the execution of the current thread for the given number of seconds. The argument may be a floating point number to indicate a more precise sleep time. The actual suspension time may be less than that requested because any caught signal will terminate the sleep() following execution of that signal’s catching routine. Also, the suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
You can find a detailed discussion in How to sleep webdriver in python for milliseconds
Analysis
There were previous instances when Firefox consumed about 80% of the RAM.
However, as per this discussion, some users feel that the more memory is used the better, because it means you don't have RAM wasted. Firefox uses RAM to make its processes faster, since application data is transferred much faster in RAM.
Solution
You can implement either/all of the generic/specific steps as follows:
Upgrade Selenium to current levels Version 3.141.59.
Upgrade GeckoDriver to GeckoDriver v0.24.0 level.
Upgrade Firefox version to Firefox v65.0.2 levels.
Clean your Project Workspace through your IDE and Rebuild your project with required dependencies only.
If your base Web Client version is too old, then uninstall it and install a recent GA and released version of Web Client.
Some extensions allow you to block such unnecessary content, as an example:
uBlock Origin allows you to hide ads on websites.
NoScript allows you to selectively enable and disable all scripts running on websites.
To open the Firefox client with an extension you can download the extension i.e. the XPI file from https://addons.mozilla.org and use the add_extension(extension='webdriver.xpi') method to add the extension in a FirefoxProfile as follows:
from selenium import webdriver
profile = webdriver.FirefoxProfile()
profile.add_extension(extension='extension_name.xpi')
driver = webdriver.Firefox(firefox_profile=profile, executable_path=r'C:\path\to\geckodriver.exe')
If your tests don't require CSS, you can disable it by following this discussion.
Use Explicit Waits or Implicit Waits.
Use driver.quit() to close all the browser windows and terminate the WebDriver session. If you do not call quit() at the end of the program, the WebDriver session will not be closed properly and its files will not be cleared from memory, which may result in memory leak errors.
Creating a new Firefox profile once and reusing it every time you run test cases in Firefox should eventually improve execution performance. Otherwise a new profile is created on every run and caching is done there, and if driver.quit() somehow does not get called before a failure, you end up with new profiles full of cached information that consume memory.
// ------------ Creating a new firefox profile -------------------
1. If Firefox is open, close Firefox.
2. Press Windows +R on the keyboard. A Run dialog will open.
3. In the Run dialog box, type in firefox.exe -P
Note: You can use -P or -ProfileManager(either one should work).
4. Click OK.
5. Create a new profile and set its location to the RAM Drive.
// ----------- Associating Firefox profile -------------------
ProfilesIni profile = new ProfilesIni();
FirefoxProfile myprofile = profile.getProfile("automation_profile");
WebDriver driver = new FirefoxDriver(myprofile);
Please share execution performance with community if you plan to implement this way.
There is no fix for that as of now.
I suggest you use the driver.close() approach.
I was also struggling with the RAM issue. What I did was count the loop iterations, and when the count reached a certain number (200 for me) I called driver.close(), started the driver again, and reset the count.
This way I did not need to close the driver on every iteration, and it has less effect on performance too.
Try this. Maybe it will help in your case too.
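The counting approach described above could be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual code: `make_driver` and `do_iteration` are hypothetical callables supplied by the caller, and the sketch calls `quit()` rather than `close()` so that the driver process is ended along with the window, which is a substitution on my part.

```python
def run_with_recycling(make_driver, do_iteration, total_iterations, recycle_every=200):
    """Run do_iteration(driver) repeatedly, restarting the browser every N iterations.

    `make_driver` and `do_iteration` are hypothetical callables supplied by the
    caller; `make_driver` would typically also log in after starting the browser.
    Returns the number of drivers that were created.
    """
    drivers_created = 0
    driver = None
    try:
        for i in range(total_iterations):
            if driver is None:
                driver = make_driver()
                drivers_created += 1
            do_iteration(driver)
            if (i + 1) % recycle_every == 0:
                driver.quit()  # release the memory held by this browser
                driver = None  # a fresh browser is created on the next iteration
    finally:
        if driver is not None:
            driver.quit()  # make sure the last browser is not left running
    return drivers_created
```

The value 200 is simply the number quoted in the answer; a suitable interval depends on how quickly memory grows for a given site.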

Python Selenium send request and avoid "Waiting for (website) ...."

I am launching several requests on different tabs. While one tab loads I will iteratively go to other tabs and see whether they have loaded correctly. The mechanism works great except for one thing: a lot of time is wasted "Waiting for (website)..."
The way in which I go from one tab to the other is launching an exception whenever a key element that I have to find is missing. But, in order to check for this exception (and therefore to proceed on other tabs, as it should do) what happens is that I have to wait for the request to end (so for the message "Waiting for..." to disappear).
Would it be possible not to wait? That is, would it be possible to launch the request via browser.get(..) and then immediately change tab?
Yes, you can do that. You need to change the pageLoadStrategy of the driver. Below is an example for Firefox:
import time
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium import webdriver
cap = DesiredCapabilities.FIREFOX
cap["pageLoadStrategy"] = "none"
print(DesiredCapabilities.FIREFOX)
driver = webdriver.Firefox(capabilities=cap)
driver.get("http://tarunlalwani.com")
#execute code for tab 2
#execute code for tab 3
Nothing will wait now and it is up to you to do all the waiting. You can also use eager instead of none
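Since all waiting becomes the script's responsibility with this strategy, one way to do it, sketched here under assumptions, is to cycle through the tabs and ask each page for `document.readyState`. The helper name `wait_for_tabs` and its parameters are hypothetical; `driver` is assumed to be a Selenium WebDriver and `handles` a list of its window handles.

```python
import time


def wait_for_tabs(driver, handles, timeout=30.0, poll_interval=0.5):
    """Poll each tab until document.readyState is 'complete' or the timeout expires.

    Returns the set of handles that finished loading. `driver` is assumed to be
    a Selenium WebDriver started with pageLoadStrategy "none".
    """
    pending = list(handles)
    done = set()
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        for handle in list(pending):
            driver.switch_to.window(handle)
            state = driver.execute_script("return document.readyState")
            if state == "complete":
                pending.remove(handle)
                done.add(handle)
        if pending:
            time.sleep(poll_interval)  # give the remaining tabs time to load
    return done
```

A check for a specific key element could replace the `readyState` test if only part of the page matters.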

Multi-threading in selenium python

I am working on a project that needs a bit of automation and web scraping, for which I am using Selenium and BeautifulSoup (python2.7).
I want to open only one instance of a web browser and log in to a website. Keeping that session, I am trying to open new tabs that will be independently controlled by threads, each thread controlling one tab and performing its own task. How should I do it? Example code would be nice. Here is my code:
def threadFunc(driver, tabId):
    if tabId == 1:
        #open a new tab and do something in it
    elif tabId == 2:
        #open another new tab with some different link and perform some task
    .... #other cases

class tabThreads(threading.Thread):
    def __init__(self, driver, tabId):
        threading.Thread.__init__(self)
        self.tabID = tabId
        self.driver = driver

    def run(self):
        print "Executing tab ", self.tabID
        threadFunc(self.driver, self.tabID)

def func():
    # Created a main window
    driver = webdriver.Firefox()
    driver.get("...someLink...")
    # This is the part where i am stuck, whether to create threads and send
    # them the same web-driver to stick with the current session by using the
    # javascript call "window.open('')" or use a separate for each tab to
    # operate on individual pages, but that will open a new browser instance
    # everytime a driver is created
    thread1 = tabThreads(driver, 1)
    thread2 = tabThreads(driver, 2)
    ...... #other threads
I am open to suggestions for using any other module, if needed
My understanding is that Selenium drivers are not thread-safe. In the WebDriver spec, the Thread Safety section is empty...which I take to mean they have not addressed the topic at all. https://www.w3.org/TR/2012/WD-webdriver-20120710/#thread-safety
So while you could share the driver reference with multiple threads and make calls to the driver from multiple threads, there is no guarantee that the driver will be able to handle multiple asynchronous calls correctly.
Instead, you must either synchronize calls from multiple threads to ensure one is completed before the next starts, or you should have just one thread making Selenium API calls...potentially handling commands from a queue that is filled by multiple other threads.
Also, see Can Selenium use multi threading in one browser?
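The "one thread making Selenium API calls, fed from a queue" idea could look roughly like the sketch below (written for Python 3). `DriverWorker` and its methods are hypothetical names; the driver is only ever touched inside the worker thread, and other threads hand it callables to run.

```python
import queue
import threading


class DriverWorker(threading.Thread):
    """Owns the driver; other threads submit callables instead of touching it."""

    _STOP = object()  # sentinel that tells the worker loop to exit

    def __init__(self, driver):
        super().__init__(daemon=True)
        self._driver = driver
        self._tasks = queue.Queue()

    def submit(self, func):
        """Queue func(driver); returns a one-slot queue that receives (ok, value)."""
        reply = queue.Queue(maxsize=1)
        self._tasks.put((func, reply))
        return reply

    def stop(self):
        """Ask the worker to exit once the tasks queued before this call are done."""
        self._tasks.put((self._STOP, None))

    def run(self):
        while True:
            func, reply = self._tasks.get()
            if func is self._STOP:
                break
            try:
                reply.put((True, func(self._driver)))
            except Exception as exc:  # hand the error back to the submitting thread
                reply.put((False, exc))
```

A thread that wants a page title would call `worker.submit(lambda d: d.title).get()` and receive `(True, title)`. Because commands run one at a time, the tabs are not truly parallel; the threads only interleave their requests.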
If you are using the script to automatically submit forms (put simply, doing GET and POST requests), I would recommend that you look at requests. You can easily capture POST requests from your browser (Network tab in the developer pane of both Firefox and Chrome) and submit them. Something like:
session = requests.session()
response = session.get('https://stackoverflow.com/')
soup = BeautifulSoup(response.text, 'html.parser')  # name the parser explicitly
and even POST data like:
postdata = {'username': 'John', 'password': password}
response = session.post('https://example.com', data=postdata, allow_redirects=True)  # requests needs the URL scheme
It can easily be threaded and is many times faster than using Selenium. The only problem is that there is no JavaScript or form support, so you need to do it the old-fashioned way.
EDIT:
Also take a look at ThreadPoolExecutor
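As a minimal sketch of how `ThreadPoolExecutor` might be applied here: the helper `fetch_all` and the `fetch` callable are hypothetical, with `fetch` standing in for something like `lambda url: session.get(url).text`. Whether a single requests session can safely be shared across threads is not addressed in the answer, so one session per thread is the cautious assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch, urls, max_workers=4):
    """Call fetch(url) for every URL in a thread pool; results keep the input order.

    `fetch` is a hypothetical callable supplied by the caller, for example a
    function that performs a GET request and returns the response body.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

If `fetch` raises for one URL, the exception surfaces when the results are collected, so callers that need partial results should catch errors inside `fetch`.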
