Run python SCRIPT on multiple browsers at the same time using selenium - python

I would like to run my script on multiple browsers at the same time using Selenium.
As of now, I can perform the operation by opening one browser at a time.
For example: registering on Amazon.
I want to be able to register two users on Amazon at the same time.
This is the code I have so far:
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.select import Select
driver = webdriver.Chrome()  # start the browser session used below
driver.get("https://www.amazon.com/ap/register?openid.pape.max_auth_age=0&openid.return_to=https%3A%2F%2Fwww.amazon.com%2F%3Fref_%3Dnav_signin&prevRID=VBHFJ50CPKFJ3PGG7RDY&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.assoc_handle=usflex&openid.mode=checkid_setup&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&prepopulatedLoginId=&failedSignInCount=0&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&pageId=usflex&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0")
driver.find_element_by_xpath("""//*[@id="s2id_ID_form4a8055de_guest_register_sponsor_lookup"]/a/span[2]/b""").click()
driver.find_element_by_xpath("""//*[@id="s2id_autogen1_search"]""").send_keys(v1)
With this I can register one user at a time, but I want to register two or more users, up to n, at the same time.
Hence the question about multiple windows.

You could create multiple instances of the webdriver. You can then manipulate each individually. For example,
from selenium import webdriver
driver1 = webdriver.Chrome()
driver2 = webdriver.Chrome()
driver1.get("http://google.com")
driver2.get("http://yahoo.com")

This question is a bit old at this point, but I still found it applicable to something I was having trouble with today.
To run browsers in parallel you can use multiprocessing. Each process gets its own Python interpreter, and therefore its own GIL, so a function that creates and drives its own browser instance in one process does not block the others. Start one process per browser from your main code and they will all execute in parallel.
If you need an explanation on how to do this, a great video can be found here
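A minimal sketch of the multiprocessing approach, assuming Chrome and its driver are available on the machine. The names `register_user`, `build_processes` and `run_in_parallel`, as well as the `url` and `username` arguments, are hypothetical and only serve the illustration; the form-filling steps are left as a comment.

```python
from multiprocessing import Process


def register_user(url, username):
    """Drive one dedicated browser through the registration page for one user."""
    # Imported inside the worker so every process loads Selenium on its own.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # ... locate the form fields and type the details for `username` ...
    finally:
        driver.quit()  # always close the browser, even if a step fails


def build_processes(target, url, usernames):
    """Create (but do not start) one Process per user."""
    return [Process(target=target, args=(url, name)) for name in usernames]


def run_in_parallel(processes):
    """Start every process, then wait until all of them have finished."""
    for proc in processes:
        proc.start()
    for proc in processes:
        proc.join()
```

On platforms that spawn new processes (Windows, macOS), the call to `run_in_parallel(build_processes(register_user, url, usernames))` should sit under an `if __name__ == "__main__":` guard.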

Related

Should I use Selenium grid for the following scenario?

So basically I have a list of URLs. I want to open each URL using webdriver simultaneously so the task can be achieved in a short span of time (instead of looping through each URL in the list).
Should I use Selenium Grid or is there a simpler way?
My code looks as follows:
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
import time
from selenium.webdriver.common.keys import Keys
list = ['www.link1.com', 'www.link2.com', 'www.link3.com', ...]
for i in list:
    driver2 = webdriver.Chrome()
    driver2.get(i)
    time.sleep(1)
    try:
        finallinks = []
        all_links = driver2.find_elements(By.XPATH, "/html/body/main/section[1]/div/section[2]/div/div[1]/div/div/div[1]/div/div/section/main/div[2]/form/div[2]/div/div/a")
        print("HOLAAAAAA")
        for a in all_links:
            if str(a.get_attribute('href')).startswith("https://something/view") and a.get_attribute('href') not in finallinks:
                finallinks.append(a.get_attribute('href'))
        print(finallinks)
    except NoSuchElementException:
        print("Didn't exist")
If you want to run multiple WebDriver instances in parallel, you can use Selenium Grid. It lets you distribute your tests across multiple machines and run them in parallel, which can significantly reduce the time it takes to complete a suite of tests.
However, if you are working on a small scale, you can also use multi-threading or multi-processing to run multiple webdriver instances simultaneously in your code. This approach may be simpler than setting up a Selenium Grid, but it will not scale as well if you need to run tests on many machines or if you need to run tests in different environments.
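The multi-threading variant could look roughly like the sketch below. It assumes one Chrome instance per URL and that each thread uses only its own driver; `collect_links` and `visit_all` are hypothetical names, and collecting every `<a>` element stands in for whatever the real script extracts.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_links(url):
    """Open one browser, load `url`, and return the href of every link on the page."""
    # Imported lazily so visit_all() can also be used with other workers.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return [a.get_attribute("href") for a in driver.find_elements(By.TAG_NAME, "a")]
    finally:
        driver.quit()


def visit_all(urls, worker=collect_links, max_workers=4):
    """Run worker(url) for each URL on a thread pool and map each URL to its result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(urls, pool.map(worker, urls)))
```

`max_workers` caps how many browsers are open at once, which matters because every browser instance costs memory and CPU.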
Selenium Grid
Selenium Grid allows the execution of automation scripts on remote machines by routing commands sent by the client to remote browser instances using the WebDriver. Selenium Grid enables us to run tests in parallel on multiple machines and also allows testing on different browser versions enabling cross platform testing.
This use case
A lot depends on the size of the list of URLs.
Case A: 5-10 URLs:
list = ['www.link1.com', 'www.link2.com', 'www.link3.com', ..., 'www.link10.com']
I would still go with a single node, just to save the cost of maintaining a Selenium Grid.
Case B: more than 10 URLs:
list = ['www.link1.com', 'www.link2.com', 'www.link3.com', ..., 'www.link20.com']
It would be advisable to set up Selenium Grid and distribute the URLs among the available Grid nodes as follows:
Node 1:
list = ['www.link1.com', 'www.link2.com', 'www.link3.com', ..., 'www.link10.com']
Node 2:
list = ['www.link11.com', 'www.link12.com', 'www.link13.com', ..., 'www.link20.com']
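The distribution of URLs among nodes can be sketched as follows. `split_among_nodes` and `open_remote_browser` are hypothetical helpers, `grid_url` is a placeholder for the address of your own Grid, and the `webdriver.Remote` call assumes the Selenium 4 Python bindings.

```python
def split_among_nodes(urls, node_count):
    """Split `urls` into `node_count` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(urls), node_count)
    chunks, start = [], 0
    for i in range(node_count):
        # The first `extra` chunks take one additional URL each.
        end = start + base + (1 if i < extra else 0)
        chunks.append(urls[start:end])
        start = end
    return chunks


def open_remote_browser(grid_url):
    """Start a Chrome session on the Grid reachable at `grid_url` (placeholder address)."""
    from selenium import webdriver  # imported lazily: only needed when a Grid is reachable

    return webdriver.Remote(command_executor=grid_url, options=webdriver.ChromeOptions())
```

With 20 URLs and two nodes, `split_among_nodes` yields the two lists of ten shown above; each chunk can then be processed by its own session from `open_remote_browser`.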

Creating a script that takes live data from a website (for now) and displays it

This isn't really a specific question, sorry for that. I'm trying to create a script that takes real-time data from another site (from a table tag, to be exact), turns it into an array, and displays it somewhere. I've created a simple Python script:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import requests
import time
driver = webdriver.Chrome('C:/drivers/chromedriver.exe')
driver.set_page_load_timeout(10)  # timeout in seconds
driver.get("link to the site")
driver.find_element_by_id("username-real").send_keys("login")
driver.find_element_by_id("pass-real").send_keys("pwd")
driver.find_element_by_xpath('//input[@class="button-login"]').submit()
# here potentially a loop that would refresh every second:
for elem in driver.find_elements_by_xpath('//*[@class="table-body#"]'):
    pass  # do something
As you can see, it's pretty simple: open the Chrome webdriver, log in to the website, and do something with the table. I haven't tried to properly get the data yet because I don't like this method.
I was wondering if there's another way to do it without running the webdriver, such as a console-like application. I'm pretty lost as to what I should look into in order to create a script like that. Another programming language? Some kind of framework or method?
If you want to use Selenium, you have to use a WebDriver. Think of it as the connection between your program and Google Chrome. If you can use Safari, its driver ships with macOS, so you can use Selenium without installing a driver manually.
If you want to use other tools, I can recommend Beautiful Soup. It is an HTML parser for Python that works on the HTML source of the page, so no drivers need to be installed. It does not download pages itself, so you fetch the HTML first (for example with requests, which your script already imports), and it only sees what the server sends, not content that JavaScript adds later.
Another method I can think of is downloading the HTML of the page and searching through the file locally, but I wouldn't recommend it.
For web pages, Selenium is really the way to go. I often use it for my own projects.
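To illustrate the parser-based route without a browser, here is a minimal sketch that turns an HTML table into a list of rows. It deliberately uses Python's standard-library `html.parser` as a dependency-free stand-in for Beautiful Soup; `TableParser` and `table_to_rows` are hypothetical names, and fetching the page (including the login) is outside the scope of the sketch.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_rows(html):
    """Return the table cells found in `html` as a list of rows."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

This only works when the table is present in the HTML the server returns; if the page builds the table with JavaScript, a browser-driven tool such as Selenium is still needed.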

How to login to a website using Python/Selenium?

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC # available since 2.26.0
browser = webdriver.Chrome('C:/Users/xyz/Downloads/chromedriver.exe')
# Define all variables required
urlErep = browser.get('http://www.erepublik.com')
xPathToSubmitButton = "//*[@id='login_form']/div[1]/p[3]/button"
urlAlerts = 'https://www.erepublik.com/en/main/messages-alerts/1'
one = 1
xPathToAlerts = "//*[@id='deleteAlertsForm']/table/tbody/tr[%d]/td[3]/p" %one
def logintoerep():
    email = browser.find_element_by_id("citizen_email")
    password = browser.find_element_by_id("citizen_password")
    email.send_keys('myemail')
    password.send_keys('mypassword')
    browser.find_element_by_xpath(xPathToSubmitButton).click()
logintoerep()
The text above is code I wrote using Selenium to login to erepublik.com.
My main goal is to verify some information on eRepublik.com whenever someone fills in a Google Form, and then complete an action based on the form data. I'm trying to log in to eRepublik using Selenium. The script needs to run 24/7 so that it runs whenever the form gets a new response, and every run opens a new browser window. After I've logged in 10-20 times, the site asks for a CAPTCHA, which Selenium can't complete. In my existing browser window I'm already logged in, so there I don't have to worry about the CAPTCHA and could just run my code.
How can I get around this problem? I need the script to be able to log in on its own every time, but the CAPTCHA won't allow that. The best solution would be to use Selenium on my existing browser window, but it doesn't allow that.
Is it possible to copy some settings from my normal browser to the Selenium-run browser so that it logs in automatically every time?
I'm open to any suggestions as long as they can get me to verify and complete a few minor actions in the website I've linked.
You can attach your Chrome profile to Selenium tests
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir=C:\\Path") #Path to your chrome profile
# Selenium 3 call; Selenium 4 takes the driver path through a Service object instead of executable_path
browser = webdriver.Chrome(executable_path="C:\\Users\\chromedriver.exe", options=options)
First off, CAPTCHAs are meant to do exactly that: repel robots and scripts from brute-forcing or repeating actions on certain app features (e.g. login/register flows, sending messages, purchase flows). So you can only go around them, never through.
That being said, you can simulate the logged-in state by doing one of the following:
- loading the authentication cookies required for the user to be logged in (usually it's only one cookie with a token of some sort);
- loading a custom profile in the browser that already has that user logged in;
- using some form of basic auth when navigating to that specific URL (if the web app has any logic to support this).
Recommended approach: in most companies (at least in my experience), there is a specific cookie or flag that you can set to disable CAPTCHAs for testing purposes. If this is not the case, talk to your PM/devs about creating such a feature so that your web app can be tested.
I don't want to advertise my own content, but I think I best tackled this topic HERE. Maybe it can further help.
Hope you solve the problem. Cheers!
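The cookie option from the list above could be sketched like this. `save_cookies` and `load_cookies` are hypothetical helpers and the JSON file path is an assumption; they rely only on WebDriver's `get_cookies()` and `add_cookie()` methods, so any driver instance can be passed in.

```python
import json


def save_cookies(driver, path):
    """Write the cookies of the current session to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(driver.get_cookies(), fh)


def load_cookies(driver, path):
    """Add previously saved cookies to the session and return how many were added.

    The browser must already be on a page of the cookies' domain,
    otherwise WebDriver rejects them.
    """
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

One possible flow: log in once by hand (solving the CAPTCHA if needed) and call `save_cookies`; on later runs open the site, call `load_cookies`, and refresh the page. Whether the site accepts the restored session depends on how long its authentication cookie stays valid.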

Is it advisable to speed up scraping using selenium by starting multiple webdrivers?

I have over 19,000 links which I need to visit to scrape data from. Each takes about 5 seconds to fully load, which means that I will need slightly more than 26 hours to scrape everything on a single webdriver.
To me, it seems that a solution is simply to start another webdriver (or few others) in a separate python notebook which goes through another portion of the links in parallel. i.e:
In first iPython notebook:
from selenium import webdriver
driver1 = webdriver.Firefox()
... scraping code looping over links 0-9500 using driver1...
In second iPython notebook:
from selenium import webdriver
driver2 = webdriver.Firefox()
... scraping code looping over links 9501-19000 using driver2...
I'm fairly new to scraping so this question may be completely elementary/ridiculous(?). However, I've tried searching for this and haven't seen anything on the topic, so I would appreciate any advice on this matter. Or any recommendations for a better/more correct way to implement this.
I've heard of multi-threading using the thread module (http://www.tutorialspoint.com/python/python_multithreading.htm), but wonder whether implementing it in this manner would have any advantage over simply creating multiple webdrivers as in the aforementioned code.
Do you really need Selenium for this?
Have a look at Scrapy: with this framework you can easily send lots of requests and scrape the data. Selenium is mainly useful when you need browser automation.
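The estimate from the question generalises to several drivers. The helper below is a hypothetical, idealised calculation: it assumes the links are split evenly and that extra browsers do not slow each other down or trigger rate limiting on the target site.

```python
import math


def estimated_hours(link_count, seconds_per_link, drivers=1):
    """Wall-clock hours if the links are split evenly over `drivers` browsers."""
    links_per_driver = math.ceil(link_count / drivers)
    return links_per_driver * seconds_per_link / 3600
```

For 19,000 links at 5 seconds each, one driver needs about 26.4 hours and two drivers about 13.2 hours; in practice the gain levels off once the machine's memory or CPU becomes the bottleneck.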

Python django: How to call selenium.set_speed() with django LiveServerTestCase

To run my functional tests I use LiveServerTestCase.
I want to call set_speed (and other methods; set_speed is just an example) that aren't in the webdriver but are in the selenium object.
http://selenium.googlecode.com/git/docs/api/py/selenium/selenium.selenium.html#module-selenium.selenium
My subclass of LiveServerTestCase:
from django.test import LiveServerTestCase
from selenium import webdriver
class SeleniumLiveServerTestCase(LiveServerTestCase):
    @classmethod
    def setUpClass(cls):
        cls.driver = webdriver.Firefox()
        cls.driver.implicitly_wait(7)
        cls.driver.maximize_window()
        # how to call selenium.selenium.set_speed() from here? how to get the ref to the selenium object?
        super(SeleniumLiveServerTestCase, cls).setUpClass()
How can I get that? I can't call the constructor on selenium, I think.
You don't. Setting the speed is not possible in WebDriver, because you generally shouldn't need to: the waiting is now done at a different level.
Previously, you could tell Selenium not to run at normal speed but at a slower speed, to allow more things to become available on page load for slow-loading or AJAX-heavy pages.
Now you do away with that altogether. Example:
I have a login page. I log in, and once logged in I see a "Welcome" message. The problem is that the Welcome message is not displayed instantly but after a time delay (using jQuery).
Pre-WebDriver code would tell Selenium: run this test, but slow down here so we can wait until the Welcome message appears.
Newer WebDriver code would tell Selenium: run this test, but after we log in, wait up to 20 seconds for the Welcome message to appear, using explicit waits.
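The difference between slowing every command down and waiting for a condition can be sketched without a browser. The hypothetical helper `wait_until` below polls a condition until it holds or a timeout expires, which is in essence what an explicit wait does; it is an illustration of the idea, not Selenium's own implementation.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

In Selenium itself this role is played by `WebDriverWait(driver, 20).until(...)` together with a condition from `expected_conditions`, so the test continues as soon as the Welcome message shows up instead of always pausing for a fixed time.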
Now, if you really want to set Selenium's speed, I'd first of all recommend against it, but the solution would be to dive into the older, now deprecated code.
If you already use WebDriver heavily, you can use WebDriverBackedSelenium, which gives you access to the older Selenium methods while keeping WebDriver as the backend, so much of your code would stay the same.
https://groups.google.com/forum/#!topic/selenium-users/6E53jIIT0TE
The second option is to dive into the old Selenium code and use it; this will change a lot of your existing code (because it predates the WebDriver concept).
The code for both Selenium RC & WebDriverBackedSelenium lives here, for the curious:
https://code.google.com/p/selenium/source/browse/py/selenium/selenium.py
Something along the lines of:
from selenium import webdriver
from selenium import selenium
driver = webdriver.Firefox()
sel = selenium('localhost', 4444, '*webdriver', 'http://www.google.com')
sel.start(driver = driver)
You'd then get access to do this:
sel.set_speed(5000)
