How to close the browser after completing a download? - python

How can I make the browser close only after a download has completed?
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get(any_url)
browser.find_element_by_xpath('//input[@value="Download"]').click()
# The program starts downloading now.
# HERE WHAT IS THE CODE?
browser.quit()
I want to close the browser only after completing the download.

You may want to add the following right before you close the browser.
import time
# Give the download time to complete before closing the browser. Increase
# the value to 10 or 15 seconds if the download needs longer; without the
# sleep, the script moves on to the next step immediately.
time.sleep(5)
browser.quit()
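A fixed sleep is either too short for a slow download or longer than necessary for a fast one. As a rough alternative, the sketch below polls the download directory until it contains at least one file and no temporary files. It assumes that the browser was configured to save into a known, initially empty directory, and that unfinished downloads carry a .part (Firefox) or .crdownload (Chrome) suffix; wait_for_downloads and its parameters are hypothetical names, not part of Selenium.

```python
import os
import time


def wait_for_downloads(directory, timeout=60.0, poll=0.5,
                       temp_suffixes=(".part", ".crdownload")):
    """Block until `directory` holds at least one file and no temp files.

    Returns True when the download looks finished, False on timeout.
    The suffixes are an assumption about how the browser names
    unfinished downloads.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = os.listdir(directory)
        in_progress = [n for n in names if n.endswith(temp_suffixes)]
        if names and not in_progress:
            return True
        time.sleep(poll)
    return False
```

With this helper, the script would call wait_for_downloads(download_dir) after the click and call browser.quit() only once it returns.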

You can use the pause command from Selenium Core:
pause ( waitTime )
Wait for the specified amount of time (in milliseconds)
http://release.seleniumhq.org/selenium-core/1.0/reference.html#pause
pause is a Selenese command, not a Python function, so the script below uses time.sleep() for the same effect:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get(any_url)
browser.find_element_by_xpath('//input[@value="Download"]').click()
# The program starts downloading now.
time.sleep(10) # sleeps for 10 seconds, the equivalent of pause(10000)
browser.quit()

This is an alternative approach that I used in C#. You may be able to apply the same technique in Python.
public static string GetRequest(string url, bool isBinary = false) {
// binary is the file that will be downloaded
// Here you perform asynchronous get request and download the binary
// Python guide for GetRequest -> http://docs.python-requests.org/en/latest/user/quickstart/
}
IWebDriver browser = new FirefoxDriver();
browser.Navigate().GoToUrl(anyUrl);
IWebElement elem = browser.FindElement(By.XPath("locator")); // "locator" is a placeholder
GetRequest(elem.GetAttribute("href"), true); // when this method returns, the GET request is done
browser.Quit();
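A minimal Python sketch of the same idea, using only the standard library: fetch the file with a blocking HTTP request so that the script knows exactly when the download has finished. download_file is a hypothetical helper. It assumes that the download element exposes a direct URL (for example in an href attribute) and that the server does not require the browser's session cookies, which may not hold for the page in the question.

```python
import shutil
import urllib.request


def download_file(url, destination, timeout=30):
    """Download `url` to `destination`.

    The call returns only after the whole response has been written,
    so it is safe to close the browser afterwards.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```

In the script, the URL would come from something like elem.get_attribute('href'), and browser.quit() would follow the download_file call.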

The trick I used was to open the download manager page and wait for an element that indicates the download has finished. Here is the Python code:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
...
# Wait until the download finishes. This code only works for a single download at a time, on Firefox.
# browser.execute_script('window.open();')
# ActionChains(browser).key_down(Keys.COMMAND).send_keys('t').key_up(Keys.COMMAND).perform()
browser.get('about:downloads')
# files = browser.find_elements_by_class_name('download-state')
WebDriverWait(browser, URL_LOAD_TIMEOUT).until(EC.presence_of_element_located((By.CLASS_NAME, 'downloadIconShow')))
# 'downloadIconCancel'
browser.close()
browser.quit()
The drawback of this approach is that it may depend on the Firefox version: it will break if Mozilla changes the download manager page.

Related

Why is the Selenium output different if run interactively vs in a Python script?

I am using Selenium in Python 3 to get the page source of a site that uses JavaScript. When I run it interactively in an iPython shell, it works as I expect it to. However, when the exact same script is executed non-interactively, the page source is not fully rendered (the JavaScript components aren't rendered). What could be the reason for this? I am running the exact same code on the exact same machine (a headless Linux server).
#!/usr/bin/python3
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
WINDOW_SIZE = "1920,1080"
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size={0}".format(WINDOW_SIZE))
chrome_options.add_argument("--no-sandbox")
service = Service('/usr/local/bin/chromedriver')
driver = webdriver.Chrome(service=service, options=chrome_options)
driver.get("https://www.stakingrewards.com/staking/?page=1&sort=rank_ASC")
src = driver.page_source
# Check page source length
print(len(src))
# Quit all windows related to the driver instance
driver.quit()
The output from the iPython shell is 220101, which is expected, while the output from the command line executed script ($ python script.py) is 38265. Thus, I am not effectively rendering the JavaScript components when I invoke the script from the command line. Why?!
The problem is not whether you run the code interactively or as a script.
Your code does not give the driver any time to render all the elements, so the page source is incomplete. It just so happened that the interactive run left the page more time to render before page_source was read, which resulted in a longer page source. However, I was able to get a much longer page source (around 650k characters) using waits.
Waits can be used to wait for required elements to be visible, present, and so on. In your case I'm assuming it's the main table.
The code below waits for the table to be visible and then prints the length of the page source.
Code snippet:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
driver.get("https://www.stakingrewards.com/staking/?page=1&sort=rank_ASC")
try:
    # Wait for the table data to be visible
    delay = 20  # seconds
    WebDriverWait(driver, delay).until(EC.visibility_of_element_located((By.CLASS_NAME, 'rt-tbody')))
    print(len(driver.page_source))
except TimeoutException:
    # Raised if the element is not visible within the delay
    print("Timeout!!!")
driver.quit()
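When there is no single element that reliably marks the end of rendering, a cruder option is to poll page_source until its length stops changing. The sketch below is one way to do that. wait_for_stable_source and its parameters are hypothetical names, and "unchanged for a few consecutive polls" is only a heuristic, not a guarantee that all JavaScript has finished.

```python
import time


def wait_for_stable_source(driver, timeout=30.0, poll=0.5, stable_checks=3):
    """Return page_source once its length is unchanged for
    `stable_checks` consecutive polls.

    Raises TimeoutError if the page keeps changing until `timeout` expires.
    """
    deadline = time.monotonic() + timeout
    last_length = -1
    unchanged = 0
    while time.monotonic() < deadline:
        source = driver.page_source
        if len(source) == last_length:
            unchanged += 1
            if unchanged >= stable_checks:
                return source
        else:
            # Length changed: remember it and restart the count
            last_length = len(source)
            unchanged = 0
        time.sleep(poll)
    raise TimeoutError("page source did not stabilise within %s seconds" % timeout)
```

The function only reads driver.page_source, so it works with any WebDriver instance; an explicit wait on a known element, as in the answer above, remains the more precise choice when such an element exists.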

How to retrieve the Ping Download and Upload time from browser tests?

I am trying to automate speed tests in different browsers, and the main part of the test runs inside a loop. The problem is that an element that was selected successfully in one iteration sometimes cannot be selected by Selenium in a later iteration, in the same loop and on the same page, with no change to the XPath; only the measured number is different. As a result, I cannot repeat the test as many times as I want.
Most of the time I have this problem with Edge. One reason may be that I found the XPaths with the help of Chrome or Firefox (I could not find a way to get the XPath in Edge, although I searched a lot).
Below are the different XPaths that I use. I want to get the numeric or string values of ping, download, upload, location, and server.
How can I solve this issue? I tried different sleep times and two different XPaths. The script always raises an error when I try to select the element by class_name or css_selector.
firefox:
"/html/body/div[3]/div[2]/div/div/div/div[3]/div[1]/div[3]/div/div[3]/div/div[1]/div[2]/div[1]/div/div[2]/span"
chrome:
"//*[@id='container']/div[2]/div/div/div/div[3]/div[1]/div[3]/div/div[3]/div/div[1]/div[2]/div[1]/div/div[2]/span"
chrome:
"//div[@class='result-item result-item-ping updated']/div[2]/span"
My other question is how to wait for a page to load completely. WebDriverWait(driver, some seconds) does not work for me, so I have to use time.sleep().
Error:
selenium.common.exceptions.NoSuchElementException: Message: No such element
element = driver.find_element_by_xpath("/html/body/div[3]/div[2]/div/div/div/div[3]/div[1]/div[3]/div/div[3]/div/div[1]/div[2]/div[1]/div/div[2]/span")
To automate the speedtests you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Edge(executable_path=r'C:\WebDrivers\MicrosoftWebDriver.exe')
driver.get("https://www.speedtest.net/")
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.js-start-test.test-mode-multi"))).click()
WebDriverWait(driver, 45).until(EC.url_contains("result"))
print("Ping :"+driver.find_element_by_css_selector("div[title='Reaction Time'] div.result-data.u-align-left>span").get_attribute("innerHTML"))
print("Download: "+driver.find_element_by_css_selector("div[title='Receiving Time'] div.result-data.u-align-left>span").get_attribute("innerHTML"))
print("Upload :"+driver.find_element_by_css_selector("div[title='Sending Time'] div.result-data.u-align-left>span").get_attribute("innerHTML"))
#driver.quit()
Console Output:
Ping :35
Download: 21.53
Upload :3.46
Use the following CSS locators to identify the values:
Download: .result-data-large.number.result-data-value.download-speed
Upload: .result-data-large.number.result-data-value.upload-speed
Ping: .result-data-large.number.result-data-value.ping-speed
Making use of getText(), you can retrieve their values. Wait for an element in the page to be visible to make sure the page is loaded successfully.
Try with: element = driver.find_element_by_xpath("/html/body/div[3]/div[2]/div/div/div/div[3]/div[1]/div[3]/div/div[3]/div/div[1]/div[2]/div[1]/div/div[2]/")
You may also need to catch NoSuchElementException.
I've tested these CSS selectors and they work in both Chrome and Edge.
span.ping-speed # ping
span.download-speed # download
span.upload-speed # upload
div.server-current > div.result-label # server
If you want to know when the page is done loading, you can wait until the URL changes from https://www.speedtest.net to https://www.speedtest.net/results/<some number>. I would just use WebDriverWait and url_contains("results") , e.g.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
WebDriverWait(driver, 10).until(EC.url_contains("results"))
There are some other approaches in this question.
WebDriverWait driverWait = new WebDriverWait(driver, 30000);
driver.get("https://www.speedtest.net/");
WebElement goLink = driver.findElement(By.cssSelector(".js-start-test.test-mode-multi"));
driverWait.until(ExpectedConditions.elementToBeClickable(goLink));
goLink.click();
By download = By.cssSelector(".result-data-large.number.result-data-value.download-speed");
By upload = By.cssSelector(".result-data-large.number.result-data-value.upload-speed");
By ping = By.cssSelector(".result-data-large.number.result-data-value.ping-speed");
driverWait.until(ExpectedConditions.urlMatches("https://www.speedtest.net/result/[0-9]"));
String downloadSpeed = driver.findElement(download).getText();
String uploadSpeed = driver.findElement(upload).getText();
String pingValue = driver.findElement(ping).getText();
System.out.println("Download: "+downloadSpeed + "\nUpload: "+ uploadSpeed + "\n Ping: "+pingValue);
Output
Download: 78.82
Upload: 45.93
Ping: 23
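For a test that runs in a loop, it can help to collect the three values in one place and convert them to numbers. The sketch below uses the short CSS selectors quoted in one of the answers above; whether they still match the live page is an assumption, and read_results is a hypothetical helper. It relies only on driver.find_element(by, value) and the element's text attribute.

```python
# Selectors quoted in the answer above; they may not match the current page.
RESULT_SELECTORS = {
    "ping": "span.ping-speed",
    "download": "span.download-speed",
    "upload": "span.upload-speed",
}


def read_results(driver, selectors=RESULT_SELECTORS):
    """Return {name: float} with the text of each result element."""
    results = {}
    for name, css in selectors.items():
        # "css selector" is the string value of selenium's By.CSS_SELECTOR
        element = driver.find_element("css selector", css)
        results[name] = float(element.text.strip())
    return results
```

Calling read_results(driver) after the wait on the results URL would return a dictionary with the keys ping, download, and upload for each loop iteration.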

Reopen same browser window using selenium python and Firefox

Hey, I'm trying to write a program that sends WhatsApp messages automatically.
I'm currently using Python, Firefox, and Selenium to achieve that.
The problem is that every time I call driver.get(url), it opens a new instance of the Firefox browser, blank and with no memory of the last run. It makes me scan the QR code every time I run it.
from selenium import webdriver
from selenium.webdriver.firefox.webdriver import FirefoxProfile
cp_profile = webdriver.FirefoxProfile("/Users/Hodai/AppData/Roaming/Mozilla/Firefox/Profiles/v27qat5d.whatsapp_profile")
driver = webdriver.Firefox(executable_path="/Users/Hodai/Desktop/geckodriver",firefox_profile=cp_profile)
driver.get('http://web.whatsapp.com')
#Scan the code before proceeding further
input('Enter anything after scanning QR code')
I've tried to use a profile, as shown above, but it seems to have no effect.
In the end I used chromedriver to achieve my goal.
I tried saving cookies with pickle, but it was a bit tricky because the cookies were remembered only for the same domain.
So I used a Chrome user data directory instead.
Now it works like a charm. Thank you all.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("user-data-dir=C:/Users/Designer1/AppData/Local/Google/Chrome/User Data/Profile 1")
driver = webdriver.Chrome(chrome_options=options,executable_path=r"C:\webdrivers\chromedriver.exe")
I think the easiest way is to save your cookies after scanning the QR code and add them to Selenium manually.
# Load page to be able to set cookies
driver.get('http://web.whatsapp.com')
# Set saved cookies
cookies = {'name1': 'value1', 'name2': 'value2'}
for name in cookies:
    driver.add_cookie({
        'name': name,
        'value': cookies[name],
    })
# Load page using cookies
driver.get('http://web.whatsapp.com')
To get your cookies you can use the console (F12), Network tab, right click on the request, Copy => Copy Request Headers.
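Instead of copying cookies by hand, they can be written to a file at the end of one run and read back at the start of the next. The sketch below does this with JSON; save_cookies and load_cookies are hypothetical helper names, and cookies.json is an arbitrary file name. It assumes that the page of the target domain is already loaded when load_cookies is called, and it cannot restore a session that the site keeps outside cookies, which may be why the cookie approach was reported as tricky here.

```python
import json


def save_cookies(driver, path):
    """Write all cookies of the current session to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(driver.get_cookies(), fh)


def load_cookies(driver, path):
    """Add the cookies stored in `path` to the driver; returns how many were added.

    The driver must already be on a page of the cookies' domain.
    """
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

A run would call save_cookies(driver, 'cookies.json') after the QR code has been scanned; the next run would load the page, call load_cookies(driver, 'cookies.json'), and load the page again.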
It should not behave like that. A new window is opened only when a new driver is initialized or the program starts again. Here is the code for Chrome. No matter how many times you call driver.get(url), it opens the URL in the same browser window.
from selenium import webdriver
import selenium.webdriver.support.ui as ui
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
import time
driver = webdriver.Chrome(executable_path=r"C:\new\chromedriver.exe")
driver.get('https://www.olx.com.pk/lahore/apple/q-iphone-6s/?search%5Bfilter_float_price%3Afrom%5D=40000&search%5Bfilter_float_price%3Ato%5D=55000')
time.sleep(10)
driver.get('https://www.olx.com.pk/lahore/apple/q-iphone-6s/?search%5Bfilter_float_price%3Afrom%5D=40000&search%5Bfilter_float_price%3Ato%5D=55000')
time.sleep(10)
driver.get('https://www.olx.com.pk/lahore/apple/q-iphone-6s/?search%5Bfilter_float_price%3Afrom%5D=40000&search%5Bfilter_float_price%3Ato%5D=55000')
time.sleep(10)
Let me know if the issue is resolved or you are trying to do something else.

Selenium : Python script taking a lot of time in Quora [duplicate]

So I'm trying to login to Quora using Python and then scrape some stuff.
I'm using Selenium to login to the site. Here's my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
username = driver.find_element_by_name('email')
password = driver.find_element_by_name('password')
username.send_keys('email')
password.send_keys('password')
password.send_keys(Keys.RETURN)
driver.close()
Now the questions:
It took ~4 minutes to find and fill the login form, which is painfully slow. Is there something I can do to speed up the process?
When it did login, how do I make sure there were no errors? In other words, how do I check the response code?
How do I save cookies with selenium so I can continue scraping once I login?
If there is no way to make selenium faster, is there any other alternative for logging in? (Quora doesn't have an API)
I had a similar problem with very slow find_elements_xxx calls in Python selenium using the ChromeDriver. I eventually tracked down the trouble to a driver.implicitly_wait() call I made prior to my find_element_xxx() calls; when I took it out, my find_element_xxx() calls ran quickly.
Now, I know those elements were there when I made the find_elements_xxx() calls, so I cannot imagine why implicitly_wait() should have affected the speed of those operations, but it did.
I have been there, selenium is slow. It may not be as slow as 4 min to fill a form. I then started using phantomjs, which is much faster than firefox, since it is headless. You can simply replace Firefox() with PhantomJS() in the webdriver line after installing latest phantomjs.
To check that you are logged in, you can assert on some element that is displayed only after login.
As long as you do not quit your driver, the cookies will be available for following links.
You can try using urllib and post directly to the login link. You can use cookiejar to save cookies. You can even simply save the cookie yourself; after all, a cookie is just a string in the HTTP header.
You can speed up your form filling by using your own setAttribute method. Here is Java code for it:
public void setAttribute(By locator, String attribute, String value) {
((JavascriptExecutor) getDriver()).executeScript("arguments[0].setAttribute('" + attribute
+ "',arguments[1]);",
getElement(locator),
value);
}
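The urllib suggestion above can be sketched with the standard library as follows. The form field names (email, password) and any login URL are hypothetical; a real login form may need extra hidden fields or tokens, or JavaScript, in which case this approach will not work as-is.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar); the opener stores any cookies it receives in jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, email, password):
    """Build a POST request carrying the form fields.

    The field names are assumptions, not taken from any real site.
    """
    form = urllib.parse.urlencode({"email": email, "password": password})
    return urllib.request.Request(login_url, data=form.encode("utf-8"), method="POST")
```

Calling opener.open(build_login_request(...)) sends the form; the returned response object has a status attribute holding the HTTP response code, and later requests made through the same opener reuse the cookies collected in jar.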
Running the web driver headlessly should improve its execution speed to some degree.
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.add_argument('-headless')
browser = webdriver.Firefox(firefox_options=options)
browser.get('https://google.com/')
browser.close()
For Windows 7 and IEDRIVER with Python Selenium, Ending the Windows Command Line and restarting it cured my issue.
I was having trouble with find_element..clicks. They were taking 30 seconds plus a little bit. Here's the type of code I have including capturing how long to run.
timeStamp = time.time()
elem = driver.find_element_by_css_selector(clickDown).click()
print("1 took:",time.time() - timeStamp)
timeStamp = time.time()
elem = driver.find_element_by_id("cSelect32").click()
print("2 took:",time.time() - timeStamp)
That was recording about 31 seconds for each click. After ending the command line and restarting it (which does end any IEDRIVERSERVER.exe processes), it was 1 second per click.
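The timing pattern used above can be wrapped in a small context manager so that each measured step needs only one extra line. This is a convenience sketch; timed and its sink parameter are hypothetical names.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, sink=print):
    """Report how long the enclosed block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        sink("%s took: %.3f s" % (label, time.perf_counter() - start))
```

A click would then be measured by placing the find_element call and its click() inside a with timed("1"): block.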
I have changed locators and this works fast. Also, I have added working with cookies. Check the code below:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys
import pickle
driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
wait = WebDriverWait(driver, 5)
username = wait.until(EC.presence_of_element_located((By.XPATH, '//div[@class="login"]//input[@name="email"]')))
password = wait.until(EC.presence_of_element_located((By.XPATH, '//div[@class="login"]//input[@name="password"]')))
username.send_keys('email')
password.send_keys('password')
password.send_keys(Keys.RETURN)
wait.until(EC.presence_of_element_located((By.XPATH, '//span[text()="Add Question"]'))) # checking that user logged in
pickle.dump( driver.get_cookies() , open("cookies.pkl","wb")) # saving cookies
driver.close()
We have saved cookies and now we will apply them in a new browser:
driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
cookies = pickle.load(open("cookies.pkl", "rb"))
for cookie in cookies:
    driver.add_cookie(cookie)
driver.get('http://www.quora.com/')
Hope, this will help.
