Iterating Google search results using Python Selenium

I want to iterate over the Google search results, clicking each one and copying the menus of each site. So far I can copy the menus and return to the results page, but I can't iterate over the results. For now I would just like to learn how to iterate over the search results, but I'm stuck at a StaleElementReferenceException. I did look at a few other sources, but no luck.
from selenium import webdriver

chrome_path = r"C:\Users\Downloads\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get('https://www.google.com?q=python#q=python')
weblinks = driver.find_elements_by_xpath("//div[@class='g']//a[not(@class)]")
for links in weblinks[0:9]:
    print(links.get_attribute("href"))
    links.click()
    driver.back()

StaleElementReferenceException means that the element you are referring to no longer exists. That usually happens when the page is redrawn. In your case you navigate away and back, so the elements are guaranteed to be redrawn.
The standard solution is to repeat the element search inside the loop, on every iteration.
If you want to be sure the list is the same on every iteration, you need to add some additional check (compare texts, etc.).
If you are using this code for scraping, you probably don't need the back navigation at all. Just open every page directly with driver.get(href).
Here you can find a code example: How to open a link in new tab (chrome) using Selenium WebDriver?
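For illustration, a minimal sketch of that direct-navigation approach (the XPath is the one from the question; collecting the hrefs up front avoids the stale references entirely):

weblinks = driver.find_elements_by_xpath("//div[@class='g']//a[not(@class)]")
hrefs = [link.get_attribute("href") for link in weblinks[0:9]]  # read all hrefs before leaving the results page
for href in hrefs:
    driver.get(href)  # open each result directly, no back() needed
    # ... copy the menus of the site here ...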

Related

TypeError: 'FirefoxWebElement' object is not iterable error cycling through pages on a dynamic webpage with Selenium

This is the site I want to scrape: http://www.nigeriatradehub.gov.ng/Organizations
I want to scrape all the information in the table on the first page, then click on the second page and do the same, and so on until the 51st page. I know how to use Selenium to click on page two:
from selenium import webdriver

link = "http://www.nigeriatradehub.gov.ng/Organizations"
driver = webdriver.Firefox()
driver.get(link)
xpath = '/html/body/form/div[3]/div[4]/div[1]/div/div/div[1]/div/div/div/div/div/div[2]/div[2]/span/a[1]'
driver.find_element_by_xpath(xpath).click()
But I don't know how to set the code up so that it cycles through each page. Getting the XPath is a manual process in the first place (I go into Firefox, inspect the item and copy the XPath into the code), so I don't know how to automate that step itself, let alone the ones that follow.
I tried going a level higher in the page HTML, selecting the entire section of the page containing the elements I want, and cycling through it, but that doesn't work because it's a Firefox web object (see the error below). I call the XPath of the higher-level class like so:
path = '//*[@id="dnn_ctr454_View_OrganizationsListViewDataPager"]'
driver.find_element_by_xpath(path)
and try to see if I can cycle through it:
for i in driver.find_element_by_xpath(path):
    i.click()
I get the following error:
TypeError: 'FirefoxWebElement' object is not iterable
Any advice would be greatly appreciated.
This error message...
...implies that you are trying to iterate through a single WebElement, whereas only list objects are iterable.
Solution
To iterate through elements within the for() loop you need a list, so instead of find_element* you need to use find_elements*. Your effective code block will be:
for i in driver.find_elements_by_xpath(path):
    i.click()
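As a rough sketch of the full paging loop (hypothetical details: it assumes the pager container holds one a element per page and that the link text is the page number; the link is re-located on every iteration because each click re-renders the pager and would otherwise leave a stale reference):

for page_number in range(2, 52):  # pages 2 through 51
    # re-locate the link each time; references from before the click go stale
    page_link = driver.find_element_by_xpath(path + '//a[text()="%d"]' % page_number)
    page_link.click()
    # ... scrape the table on the current page here ...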

Unable to locate/click pop-up button with Selenium in Python

I'm using Selenium in Python 3 to access webpages, and I want to click on a pop-up button, but I am unable to locate it with Selenium.
What I'm describing below applies to a number of sites with a pop-up, so I'll use a simple example.
from selenium import webdriver

url = "https://www.google.co.uk"
driver = webdriver.Firefox()
driver.implicitly_wait(10)
driver.get(url)
The page has a pop-up for agreeing to cookies.
I want the script to click on the "I agree" button, but I'm unable to locate it.
I've found a few questions and posts about this online (including on Stack Overflow), but all the suggestions seem to fall into one of the following categories, and none of them works for me.
Wait longer for the pop-up to actually load.
I've tried adding delays, and in fact, I'm testing this interactively, so I can wait all I want for the page to load before I try to locate the button, but it doesn't make any difference.
Use something like driver.switch_to.alert
I get a NoAlertPresentException. The pop-up doesn't seem to be an alert.
Locate the element using driver.find_element.
This doesn't work either, regardless of which approach I use (XPath, class name, text, etc.). I can find elements from the page under the pop-up, but nothing from the pop-up itself. For example,
# Elements in main page (under pop-up)
driver.find_element_by_partial_link_text("Sign in") # returns FirefoxWebElement
driver.find_element_by_class_name("gb_g") # returns FirefoxWebElement
# Elements on the pop-up
driver.find_element_by_partial_link_text("I agree") # NoSuchElementException
driver.find_element_by_class_name("RveJvd snByac") # NoSuchElementException
The pop-up just doesn't seem to be in the page source. In fact, if I look at the loaded page source from the browser, I can't find anything related to the pop-up. I understand that many sites use client-side scripts to load elements dynamically, so many elements wouldn't show up in the raw source, but that was the whole point of using Selenium: to load the page, run the scripts, and access the end result.
So, what am I doing wrong? Where is the pop-up coming from, and how can I access it?
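One avenue worth checking, as a hedged sketch: consent pop-ups like this are often served inside an iframe, and an iframe's contents never show up in the parent page's DOM or page_source; you have to switch into the frame first. The frame index and button text below are assumptions, but the pattern itself is standard Selenium:

# list the frames on the page to see whether the pop-up lives in one of them
frames = driver.find_elements_by_tag_name("iframe")
print([f.get_attribute("src") for f in frames])
driver.switch_to.frame(frames[0])  # assumption: the first frame holds the consent pop-up
driver.find_element_by_xpath('//*[contains(text(), "I agree")]').click()
driver.switch_to.default_content()  # return to the main document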

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page

driver.get("https://www.zacks.com/")
driver.find_element_by_xpath("//*[#id='search-q']")
i am trying to find search box on zacks website with selenium but I am getting StaleElementReferenceException
The reason you're getting this error is simple: the element has been removed from the DOM. There are several possible causes:
The page itself is destroying/recreating the element on the fly, maybe even rapidly.
Parts of the page have been updated (replaced), but you're still holding an old reference.
You navigated to a new page while holding an old reference.
To avoid this, keep the lifetime of element references as short as possible. If the content is changing rapidly, perform the operation directly via JavaScript, without the round trip to the client:
driver.execute_script("document.getElementById('search-q').click();")
Maybe you're trying to find the element while the page, and this exact search box, are still loading. Try to implement a wait mechanism for the element, something like this:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, timeout_in_seconds)
wait.until(EC.visibility_of_element_located(locator))  # e.g. locator = (By.ID, "search-q")

How to prevent page updates after load with Python Selenium Webdriver (Firefox)

I'm using Python Selenium with Firefox to save data from a webpage into a spreadsheet, but the page continually updates its data, causing errors related to stale elements. How do I resolve this?
I've tried turning off JavaScript, but that doesn't seem to do anything. Any suggestions would be great!
If you want to save the page's data at a specific moment in time, you can:
get the current page HTML source using the WebDriver.page_source property
write it into a file
open the file from disk using the WebDriver.get() function
That's it; you can then work with the local copy of the page, which will never change.
Example code:
driver.get("http://seleniumhq.org")
with open("mypage.html", "w") as mypage:
mypage.write(driver.page_source)
mypage.close()
driver.get(os.getcwd() + "/" + (mypage.name))
#do what you need with the page source
Another approach is to call the WebDriver.find_element function wherever you need to interact with the element. So instead of:

myelement = driver.find_element_by_xpath("//your_selector")
# some other action
myelement.get_attribute("interestingAttribute")

perform the find every time you need to interact with the element:

driver.find_element_by_xpath("//your_selector").get_attribute("interestingAttribute")
Or, even better, go for an Explicit Wait on the element you need (this assumes the usual imports: By, WebDriverWait, and expected_conditions as EC):
WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.XPATH, "//your/selector"))).get_attribute("href")

How do I loop through these web pages with selenium?

I am new to programming but am getting familiar with web scraping.
I wish to write code that clicks on each link on the page.
In my attempted code I work with a sample of just two links, to speed things up. However, my current code only gets the first link clicked, not the second.
from selenium import webdriver
import csv

driver = webdriver.Firefox()
driver.get("https://www.betexplorer.com/baseball/usa/mlb-2018/results/?stage=KvfZSOKj&month=all")
matches = driver.find_elements_by_xpath('//td[@class="h-text-left"]')
m_samp = matches[0:2]  # sample of the first two links
for i in m_samp:
    i.click()
    driver.get("https://www.betexplorer.com/baseball/usa/mlb-2018/results/?stage=KvfZSOKj&month=all")
Ideally, I would like it to click the first link, then go back to the previous page, then click the second link, then go back to the previous page.
Any help is appreciated.
First collect all the clickable URLs into one list, then iterate over the list:

list_urls = ["url1", "url2"]
for i in list_urls:
    driver.get(i)

Save all the URLs first; otherwise going back and clicking will not work, because you have only one driver instance, not multiple, and its old element references go stale once you navigate.
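A minimal sketch of that approach for the page above (assuming each matched td contains an a child whose href points at the match page):

matches = driver.find_elements_by_xpath('//td[@class="h-text-left"]/a')
urls = [m.get_attribute("href") for m in matches[0:2]]  # collect the hrefs before navigating away
for url in urls:
    driver.get(url)  # open each match page directly; no back-navigation needed
    # ... scrape the match page here ...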
