Selenium : Python script taking a lot of time in Quora [duplicate] - python

So I'm trying to login to Quora using Python and then scrape some stuff.
I'm using Selenium to login to the site. Here's my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
username = driver.find_element_by_name('email')
password = driver.find_element_by_name('password')
username.send_keys('email')
password.send_keys('password')
password.send_keys(Keys.RETURN)
driver.close()
Now the questions:
It took ~4 minutes to find and fill the login form, which is painfully slow. Is there something I can do to speed up the process?
When it did log in, how do I make sure there were no errors? In other words, how do I check the response code?
How do I save cookies with Selenium so I can continue scraping once I log in?
If there is no way to make Selenium faster, is there any alternative for logging in? (Quora doesn't have an API)

I had a similar problem with very slow find_element_xxx() calls in Python Selenium using ChromeDriver. I eventually tracked the trouble down to a driver.implicitly_wait() call I made prior to my find_element_xxx() calls; when I took it out, the find_element_xxx() calls ran quickly.
Now, I know those elements were there when I made the find_element_xxx() calls, so I cannot imagine why implicitly_wait() should have affected the speed of those operations, but it did.
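One way to check whether an implicit wait is behind slow lookups is to time the same lookup with and without it. The helper below is a minimal sketch: timed_find is a hypothetical name, and it only assumes a driver object exposing Selenium's implicitly_wait(seconds) and find_elements(by, value) methods. An implicit wait normally costs time only when a locator matches nothing, because the driver keeps polling until the timeout expires.

```python
import time


def timed_find(driver, by, value, implicit_wait=0.0):
    """Run one find_elements() lookup and report how long it took.

    `driver` is assumed to be a Selenium WebDriver (or any object with the
    same implicitly_wait/find_elements methods).
    """
    driver.implicitly_wait(implicit_wait)  # 0 disables implicit waiting
    start = time.perf_counter()
    elements = driver.find_elements(by, value)
    elapsed = time.perf_counter() - start
    return elements, elapsed
```

For example, timed_find(driver, "name", "email") times a lookup by name ("name" is the string value of By.NAME); repeating it with implicit_wait=30 shows whether the wait changes anything for that locator.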

I have been there; Selenium is slow, though it may not be as slow as 4 minutes to fill a form. I then started using PhantomJS, which is much faster than Firefox since it is headless. You can simply replace Firefox() with PhantomJS() in the webdriver line after installing the latest PhantomJS.
To check that you have logged in, you can assert on some element that is displayed only after login.
As long as you do not quit your driver, the cookies will remain available for following links.
You can also try using urllib to post directly to the login link, with cookiejar to save the cookies. You can even save the cookie yourself; after all, a cookie is simply a string in an HTTP header.
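The urllib route could look roughly like the sketch below. It uses only the standard library; build_session and build_login_request are hypothetical helper names, and the login URL and the form field names (email, password) are assumptions that would have to match the real login form. Sites that rely on JavaScript or CSRF tokens may not accept a plain POST like this.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_session(cookie_file):
    """Return an opener that stores cookies in a jar backed by cookie_file."""
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, email, password):
    """Build a POST request; the field names are assumptions about the form."""
    form = urllib.parse.urlencode({"email": email, "password": password})
    return urllib.request.Request(login_url, data=form.encode("utf-8"))


# Hypothetical usage (URL and credentials are placeholders):
# opener, jar = build_session("cookies.txt")
# response = opener.open(build_login_request("https://example.com/login", "me@example.com", "secret"))
# print(response.status)          # HTTP status code of the login response
# jar.save(ignore_discard=True)   # also keep session cookies
```

Unlike Selenium, this approach exposes the HTTP status code directly, which addresses the question about checking the response code.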

You can speed up your form filling by using your own setAttribute method. Here is Java code for it:
public void setAttribute(By locator, String attribute, String value) {
((JavascriptExecutor) getDriver()).executeScript("arguments[0].setAttribute('" + attribute
+ "',arguments[1]);",
getElement(locator),
value);
}
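A Python counterpart of the Java method might look like the sketch below. set_attribute is a hypothetical helper name; it assumes a Selenium WebDriver, whose execute_script() passes extra arguments to the script as arguments[0], arguments[1], and so on, and an element that has already been located.

```python
def set_attribute(driver, element, attribute, value):
    """Set an attribute on an already-located element via JavaScript.

    `driver` is assumed to be a Selenium WebDriver and `element` a
    WebElement returned by one of its find_element calls.
    """
    driver.execute_script(
        "arguments[0].setAttribute(arguments[1], arguments[2]);",
        element,
        attribute,
        value,
    )
```

Setting a field this way skips the keyboard events that send_keys() produces, so pages that react to input events may not notice the change.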

Running the web driver headlessly should improve its execution speed to some degree.
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.add_argument('-headless')  # start Firefox without a visible window
browser = webdriver.Firefox(options=options)  # older Selenium releases spelled this firefox_options=
browser.get('https://google.com/')
browser.close()

For Windows 7 and IEDriver with Python Selenium, closing the Windows command line and restarting it cured my issue.
I was having trouble with find_element...click() calls. They were taking a little over 30 seconds each. Here's the type of code I have, including timing of each call:
import time
timeStamp = time.time()
driver.find_element_by_css_selector(clickDown).click()
print("1 took:", time.time() - timeStamp)
timeStamp = time.time()
driver.find_element_by_id("cSelect32").click()
print("2 took:", time.time() - timeStamp)
That recorded about 31 seconds for each click. After closing the command line and restarting it (which also ends any IEDriverServer.exe processes), it took 1 second per click.

I have changed the locators and this works fast. I have also added cookie handling. Check the code below:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys
import pickle
driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
wait = WebDriverWait(driver, 5)
username = wait.until(EC.presence_of_element_located((By.XPATH, '//div[@class="login"]//input[@name="email"]')))
password = wait.until(EC.presence_of_element_located((By.XPATH, '//div[@class="login"]//input[@name="password"]')))
username.send_keys('email')
password.send_keys('password')
password.send_keys(Keys.RETURN)
wait.until(EC.presence_of_element_located((By.XPATH, '//span[text()="Add Question"]'))) # checking that user logged in
with open("cookies.pkl", "wb") as f:  # saving cookies
    pickle.dump(driver.get_cookies(), f)
driver.close()
We have saved cookies and now we will apply them in a new browser:
driver = webdriver.Firefox()
driver.get('http://www.quora.com/')
with open("cookies.pkl", "rb") as f:
    cookies = pickle.load(f)
for cookie in cookies:
    driver.add_cookie(cookie)
driver.get('http://www.quora.com/')
Hope this helps.
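The same save-and-restore idea can be wrapped in two small helpers. This is only a sketch: save_cookies and load_cookies are hypothetical names, JSON is swapped in for pickle so the file stays human-readable, and the driver is assumed to be a Selenium WebDriver offering get_cookies() and add_cookie().

```python
import json


def save_cookies(driver, path):
    """Write the driver's current cookies to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(driver.get_cookies(), f)


def load_cookies(driver, path):
    """Add cookies from a JSON file to the driver and return how many were added.

    The driver should already be on a page of the cookies' domain,
    otherwise add_cookie() rejects them.
    """
    with open(path, encoding="utf-8") as f:
        cookies = json.load(f)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

As in the answer's code, the page has to be loaded once before load_cookies() and once more afterwards so that the restored cookies are sent.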

Related

driver.find_element(By.XPATH, "xpath") not working

I am trying to learn Selenium in Python and I have run into a problem that has stopped me from progressing.
As you might know, previous versions of Selenium have a different syntax compared to the latest one. I have tried everything to fill the form with my code, but nothing happens. I am trying to find an element by XPath on [https://demo.seleniumeasy.com/basic-first-form-demo.html], but whatever I do, I cannot type my message into the message field.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service as ChromeService
import time
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
service = ChromeService(executable_path="C:/Users/SnappFood/.cache/selenium/chromedriver/win32/110.0.5481.77/chromedriver.exe")
driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://demo.seleniumeasy.com/basic-first-form-demo.html")
time.sleep(1000)
message_field = driver.find_element(By.XPATH,'//*[@id="user-message"]')
message_field.send_keys("Hello World")
show_message_button = driver.find_element(By.XPATH,'//*[@id="get-input"]/button')
show_message_button.click()
With this code, I expect to fill the message field in the form and click the "SHOW MESSAGE" button to print my typed text, but what happens is that my code only opens a new Chrome window with an empty field.
I should mention that PyCharm reports no errors and the code runs without errors.
I would really appreciate it if you could help me understand what I am doing wrong.
You need to wait for the page to load completely.
The best approach for this is WebDriverWait:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://demo.seleniumeasy.com/basic-first-form-demo.html")
# Wait for the message field to be present
wait = WebDriverWait(driver, 10)
wait.until(lambda driver: driver.find_element(By.XPATH, '//*[@id="user-message"]'))
message_field = driver.find_element(By.XPATH,'//*[@id="user-message"]')
message_field.send_keys("Hello World")
show_message_button = driver.find_element(By.XPATH,'//*[@id="get-input"]/button')
show_message_button.click()
Tested; it works fine.
You can also use time.sleep() to wait for the page to load if you know the loading times:
import time
driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://demo.seleniumeasy.com/basic-first-form-demo.html")
time.sleep(5)
message_field = driver.find_element(By.XPATH,'//*[@id="user-message"]')
message_field.send_keys("Hello World")
time.sleep(5)
show_message_button = driver.find_element(By.XPATH,'//*[@id="get-input"]/button')
show_message_button.click()
time.sleep(5)
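The practical difference between a fixed sleep and an explicit wait is polling: a sleep always costs the full delay, while a wait returns as soon as the condition holds. The stand-alone sketch below illustrates that idea with a hypothetical wait_until helper. It is a simplified stand-in, not Selenium's implementation; for one thing, WebDriverWait also swallows NoSuchElementException between polls, which this version does not.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

If the page is ready after one second, a loop like this finishes after roughly one second, whereas time.sleep(5) always takes five.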
This XPath, //*[@id="user-message"], matches two elements, so you need to make it unique. Try the XPath below; it will work as expected.
message_field = driver.find_element(By.XPATH,'//input[@id="user-message"]')
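A quick way to catch an ambiguous locator like this is to count the matches before using it. The helper below is a sketch under the assumption that driver is a Selenium WebDriver; find_unique is a hypothetical name, and "xpath" is the string value of By.XPATH.

```python
def find_unique(driver, xpath):
    """Return the single element matching `xpath`, or raise ValueError.

    `driver` is assumed to be a Selenium WebDriver; "xpath" is the string
    value of By.XPATH.
    """
    matches = driver.find_elements("xpath", xpath)
    if len(matches) != 1:
        raise ValueError(f"{xpath!r} matched {len(matches)} elements, expected 1")
    return matches[0]
```

Calling it with the original locator would raise an error reporting two matches, instead of silently typing into whichever element comes first.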

python selenium submit works interactively but not in a script, not even with time.sleep

I'm trying to login to a site from a python script using the selenium webdriver in order to download some data that is only available to registered users.
I have the following code:
browser = webdriver.Firefox()
browser.get('https://shop.biogast.at/store15/customer/account/login')
emailElem = browser.find_element_by_id('email')
emailElem.send_keys('12345')
passwordElem = browser.find_element_by_id('pass')
passwordElem.send_keys('12345')
passwordElem.submit()
It works fine when the commands are typed into the python3 shell one by one. However, when I run the code as a script, the username and password are filled in correctly, but instead of logging in, both fields are blanked. The script finishes and there is no error message.
According to what others have suggested in similar situations, this could be a timing issue, as the script runs much faster than when the commands are typed one by one. So I expanded the script to include some sleep values.
browser = webdriver.Firefox()
browser.get('https://shop.biogast.at/store15/customer/account/login')
time.sleep(15)
emailElem = browser.find_element_by_id('email')
emailElem.send_keys('12345')
time.sleep(15)
passwordElem = browser.find_element_by_id('pass')
passwordElem.send_keys('12345')
time.sleep(15)
passwordElem.submit()
Unfortunately, the result is still the same. The fields are blanked and the script finishes with no error. When I run the commands one by one, it works well even when the breaks between the commands are less than 15 seconds, so it really doesn't seem to be a timing problem.
Do you have any idea how I should find the cause? Thank you very much.
Use WebDriverWait to wait until the elements are clickable.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
browser = webdriver.Firefox()
wait = WebDriverWait(browser,40)
browser.get('https://shop.biogast.at/store15/customer/account/login')
emailElem = wait.until(EC.element_to_be_clickable((By.ID, 'email'))) #browser.find_element_by_id('email')
emailElem.send_keys('12345')
passwordElem = wait.until(EC.element_to_be_clickable((By.ID, 'pass'))) #browser.find_element_by_id('pass')
passwordElem.send_keys('12345')
passwordElem.submit()

Reopen same browser window using selenium python and Firefox

Hey, I'm trying to write a program that sends WhatsApp messages automatically.
I'm currently using Python, Firefox and Selenium to achieve that.
The problem is that every time I call driver.get(url) it opens a new instance of the Firefox browser, blank, with no memory of the last run. It makes me scan the QR code every time I run it.
from selenium import webdriver
from selenium.webdriver.firefox.webdriver import FirefoxProfile
cp_profile = webdriver.FirefoxProfile("/Users/Hodai/AppData/Roaming/Mozilla/Firefox/Profiles/v27qat5d.whatsapp_profile")
driver = webdriver.Firefox(executable_path="/Users/Hodai/Desktop/geckodriver",firefox_profile=cp_profile)
driver.get('http://web.whatsapp.com')
#Scan the code before proceeding further
input('Enter anything after scanning QR code')
I've tried to use a profile, but it seems to have no effect.
cp_profile = webdriver.FirefoxProfile("/Users/Hodai/AppData/Roaming/Mozilla/Firefox/Profiles/v27qat5d.whatsapp_profile")
driver = webdriver.Firefox(executable_path="/Users/Hodai/Desktop/geckodriver",firefox_profile=cp_profile)
In the end I used chromedriver to achieve my goal.
I tried cookies with pickle, but it was a bit tricky because it remembered the cookies only for the same domain.
So I used the Chrome user data directory instead.
Now it works like a charm. Thank you all.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("user-data-dir=C:/Users/Designer1/AppData/Local/Google/Chrome/User Data/Profile 1")
driver = webdriver.Chrome(options=options, executable_path=r"C:\webdrivers\chromedriver.exe")  # raw string keeps the backslashes literal
I think the easiest way is to save your cookies after scanning the QR code and push them to Selenium manually.
# Load page to be able to set cookies
driver.get('http://web.whatsapp.com')
# Set saved cookies
cookies = {'name1': 'value1', 'name2': 'value2'}
for name in cookies:
    driver.add_cookie({
        'name': name,
        'value': cookies[name],
    })
# Load page using cookies
driver.get('http://web.whatsapp.com')
To get your cookies, open the browser console (F12), go to the Network tab, right-click the request and choose Copy => Copy Request Headers.
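The copied request headers contain a Cookie line of the form "name1=value1; name2=value2". A small parser can turn the value of that line into the dictionaries that add_cookie() expects. This is a sketch using the standard library's SimpleCookie; cookies_from_header is a hypothetical helper name.

```python
from http.cookies import SimpleCookie


def cookies_from_header(header_value):
    """Turn 'name1=value1; name2=value2' into a list of add_cookie() dicts."""
    parsed = SimpleCookie()
    parsed.load(header_value)
    return [{"name": key, "value": morsel.value} for key, morsel in parsed.items()]
```

Each dictionary in the returned list can then be passed to driver.add_cookie() after the page has been loaded once.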
It should not behave like that. A new window opens only when the driver is initialized again in a new variable or the program is restarted. Here is the code for Chrome. No matter how many times you call driver.get(url), it opens the URL in the same browser window:
from selenium import webdriver
import selenium.webdriver.support.ui as ui
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
import time
driver = webdriver.Chrome(executable_path=r"C:\new\chromedriver.exe")
driver.get('https://www.olx.com.pk/lahore/apple/q-iphone-6s/?search%5Bfilter_float_price%3Afrom%5D=40000&search%5Bfilter_float_price%3Ato%5D=55000')
time.sleep(10)
driver.get('https://www.olx.com.pk/lahore/apple/q-iphone-6s/?search%5Bfilter_float_price%3Afrom%5D=40000&search%5Bfilter_float_price%3Ato%5D=55000')
time.sleep(10)
driver.get('https://www.olx.com.pk/lahore/apple/q-iphone-6s/?search%5Bfilter_float_price%3Afrom%5D=40000&search%5Bfilter_float_price%3Ato%5D=55000')
time.sleep(10)
Let me know whether the issue is resolved or whether you are trying to do something else.

How to close the browser after completing a download?

How do I close the browser after a download completes?
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get(any_url)
browser.find_element_by_xpath('//input[@value="Download"]').click()
# The program starts downloading now.
# HERE WHAT IS THE CODE?
browser.quit()
I want to close the browser only after completing the download.
You may want to use the piece of code below right before you close the browser.
import time
# Give the download time to finish before closing the browser. You may increase
# the delay to 10 or 15 seconds, basically the amount of time the download needs;
# otherwise the script moves on to the next step immediately.
time.sleep(5)
browser.quit()
You can pause before quitting. Selenium Core has a pause command:
pause ( waitTime )
Wait for the specified amount of time (in milliseconds)
http://release.seleniumhq.org/selenium-core/1.0/reference.html#pause
pause is a Selenium Core command rather than a Python function, so the script below uses time.sleep(), which takes seconds, to the same effect:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get(any_url)
browser.find_element_by_xpath('//input[@value="Download"]').click()
# The program starts downloading now.
time.sleep(10) # sleeps for 10 seconds
browser.quit()
This is an alternative approach I used in C#. Maybe you can apply the same technique in Python.
public static string GetRequest(string url, bool isBinary = false) {
// binary is the file that will be downloaded
// Here you perform asynchronous get request and download the binary
// Python guide for GetRequest -> http://docs.python-requests.org/en/latest/user/quickstart/
}
browser.webdriver.Firefox();
browser.get(any_url);
elem = browser.findElement("locator");
GetRequest(elem.getAttribute('href'), true); // when this method is done, you expect the get request is done
browser.quit();
The trick I used was to open the download manager page and wait for an element that indicates the download has finished. Here is the Python code I used:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
...
# Wait until the download finishes. This code only works for a single download at a time on Firefox.
# browser.execute_script('window.open();')
# ActionChains(browser).key_down(Keys.COMMAND).send_keys('t').key_up(Keys.COMMAND).perform()
browser.get('about:downloads')
# files = browser.find_elements_by_class_name('download-state')
WebDriverWait(browser, URL_LOAD_TIMEOUT).until(EC.presence_of_element_located((By.CLASS_NAME, 'downloadIconShow')))
# 'downloadIconCancel'
browser.close()
browser.quit()
The problem with this approach is that it may depend on the Firefox version, if Mozilla changes the download manager page.
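A browser-independent alternative is to watch the download directory instead of the browser UI. The sketch below polls a folder until it holds at least one file and none of them looks like a partial download. wait_for_downloads is a hypothetical helper; it assumes the folder was empty before the download started and that in-progress files carry the .part (Firefox) or .crdownload (Chrome) suffix.

```python
import time
from pathlib import Path


def wait_for_downloads(download_dir, timeout=60.0, poll_interval=0.5,
                       partial_suffixes=(".part", ".crdownload")):
    """Block until download_dir holds at least one file and no partial files.

    Returns the sorted list of finished file names, or raises TimeoutError.
    """
    folder = Path(download_dir)
    deadline = time.monotonic() + timeout
    while True:
        names = sorted(p.name for p in folder.iterdir() if p.is_file())
        partial = [n for n in names if n.endswith(partial_suffixes)]
        if names and not partial:
            return names
        if time.monotonic() >= deadline:
            raise TimeoutError(f"downloads in {folder} did not finish in {timeout} seconds")
        time.sleep(poll_interval)
```

Calling wait_for_downloads() with the browser's download folder right before browser.quit() keeps the browser open only for as long as the download actually takes.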

