How do I loop through these web pages with selenium? - python

I am new to programming but am getting familiar with web scraping.
I want to write code that clicks on each link on the page.
In my attempted code, I have made a sample of just two links to click on to speed things up. However, my current code only clicks the first link, not the second.
from selenium import webdriver
import csv

driver = webdriver.Firefox()
driver.get("https://www.betexplorer.com/baseball/usa/mlb-2018/results/?stage=KvfZSOKj&month=all")
matches = driver.find_elements_by_xpath('//td[@class="h-text-left"]')
m_samp = matches[0:2]
for i in m_samp:
    i.click()
    driver.get("https://www.betexplorer.com/baseball/usa/mlb-2018/results/?stage=KvfZSOKj&month=all")
Ideally, I would like it to click the first link, then go back to the previous page, then click the second link, then go back to the previous page.
Any help is appreciated.

First collect all the clickable URLs into one list, then iterate over that list:
list_urls = ["url1", "url2"]
for i in list_urls:
    driver.get(i)
Save all the URLs first; otherwise going back and clicking will not work, because you have only one instance of the driver, not multiple, and the element references go stale as soon as you navigate away.
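Here is a minimal sketch of that approach against the page from the question, assuming each h-text-left cell contains an anchor tag (the trailing /a in the XPath is that assumption):

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://www.betexplorer.com/baseball/usa/mlb-2018/results/?stage=KvfZSOKj&month=all")

# Read every match URL before navigating anywhere, so no element goes stale
cells = driver.find_elements_by_xpath('//td[@class="h-text-left"]/a')
list_urls = [cell.get_attribute("href") for cell in cells]

for url in list_urls[0:2]:  # sample of two, as in the question
    driver.get(url)
    # ... scrape the match page here ...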

Related

How to work with links that use javascript:window.location using Selenium in Python

I have dabbled with bits of simple code over the years. I am now interested in automating some repetitive steps in a web-based CRM used at work. I tried a few automation tools. I was not able to get AutoIt to work with the Chrome webdriver. I then tried WinTask and did not make meaningful progress. I started exploring Python and Selenium last week.
I have now automated the first few steps of my project by Googling each step I wanted to achieve, learning from pages on Stack Overflow and other sites. Where I need help is that most of the links in the CRM are some sort of JavaScript links. Most of the text links or images have links that are formatted like this...
javascript:window.location = 'Reports/ResponseTimes.aspx?from=1%2f14%2f2021&to=1%2f14%2f2021&target=gn';
It looks like the various find_element_by_* functions in Selenium do not interact with these JavaScript links. Tonight I found a page that directed me to use driver.execute_script(javaScript). Eventually I found an example that made it clear I should pass the JavaScript link into that function. This works...
driver.execute_script("window.location = 'Reports/ResponseTimes.aspx?from=1%2f14%2f2021&to=1%2f14%2f2021&target=gn';")
My issue is that I now see the JavaScript links are actually dynamically generated. In the code above, the link gets updated with dates based on the current date, so I can't reuse the driver.execute_script() call above as-is; the dates have to be updated.
My hope is to find a way to code so that I can locate the javascript links I need based on some part of the link that does not change. The link above always has "target=gn" at the end and that is unique enough that if I could find and pull the current version of the link into a variable and then run it in driver.execute_script(), I believe that would solve my current issue.
I expect a solution could then be used in the next step I need to perform, where there is a list of new leads that all need to be updated in a manner that tells the system a human has reviewed the lead and "stopped the clock". To view each lead, there are more JavaScript links. Each link is unique since it includes a value that is the record number for the lead. Here are the first two...
javascript:top.viewItem(971244899);
javascript:top.viewItem(971312602);
I imagine that being able to search the page for some or all of... javascript:top.viewItem( ...in order to create a variable for... javascript:top.viewItem(971244899); ...so that it can be placed in... driver.execute_script() ...is the approach that is needed.
Thanks for any suggestions. I have made many searches on this site and Google for phrases that might teach me more about working with javascript links. I am asking for guidance since I have not been able to move forward on my own. Here's my current code...
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time

PATH = r"C:\Program Files (x86)\chromedriver.exe"  # raw string so the backslashes are kept literally
driver = webdriver.Chrome(PATH)
driver.get("https://apps.vinmanager.com/cardashboard/login.aspx")
# log in
time.sleep(1)
search = driver.find_element_by_name("username")
search.send_keys("xxx")
search.send_keys(Keys.RETURN)
time.sleep(2)
search = driver.find_element_by_name("password")
search.send_keys("xxx")
search.send_keys(Keys.RETURN)
time.sleep(1)
# close news pop-up
driver.find_element_by_link_text("Close").click()
time.sleep(2)
# Nav to left pane
driver.switch_to.frame('leftpaneframe')
# Leads at No Contact link
driver.execute_script("window.location = 'Reports/ResponseTimes.aspx?from=1%2f14%2f2021&to=1%2f14%2f2021&target=gn';")
Eventually I found enough info online to recognize that I needed to replace the "//a" tag in the xpath find method with the proper tag, which was "//area" in my case and then extract the href so that I could execute it...
## click no contact link ##
print('click no contact link...')
cncl = driver.find_element_by_xpath("//area[contains(@href,'target=gn')]").get_attribute('href')
time.sleep(2)
driver.execute_script(cncl)
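A similar pattern should work for the viewItem leads. The sketch below rests on assumptions: that the lead links sit in <a> tags with an href attribute (swap //a for //area or whatever tag the CRM actually uses), and that driver.back() restores the lead list afterwards:

# Gather every lead link whose href contains the viewItem call
leads = driver.find_elements_by_xpath("//a[contains(@href, 'top.viewItem(')]")
lead_hrefs = [lead.get_attribute('href') for lead in leads]
for href in lead_hrefs:
    # href looks like "javascript:top.viewItem(971244899);" - drop the
    # "javascript:" scheme before handing the rest to execute_script()
    driver.execute_script(href.replace('javascript:', '', 1))
    time.sleep(2)  # crude pause while the lead view loads
    driver.back()  # assumption: back returns to the lead list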

TypeError: 'FirefoxWebElement' object is not iterable error cycling through pages on a dynamic webpage with Selenium

This is the site I want to scrape.
I want to scrape all the information in the table on the first page, then click on the second page and do the same, and so on until the 51st page. I know how to use Selenium to click on page two:
from selenium import webdriver

link = "http://www.nigeriatradehub.gov.ng/Organizations"
driver = webdriver.Firefox()
driver.get(link)
xpath = '/html/body/form/div[3]/div[4]/div[1]/div/div/div[1]/div/div/div/div/div/div[2]/div[2]/span/a[1]'
driver.find_element_by_xpath(xpath).click()
But I don't know how to set the code up so that it cycles through each page. The process of me getting the xpath is a manual one in the first place (I go on to Firefox, inspect the item and copy it into the code), so I don't know how to automate that step in and of itself and then the following ones.
I tried going a level higher in the webpage HTML and choosing the entire section of the page with the elements I want, and cycling through them, but that doesn't work because it's a Firefox web object (see the error below).
By calling the xpath of the higher class like so:
path = '//*[@id="dnn_ctr454_View_OrganizationsListViewDataPager"]'
driver.find_element_by_xpath(path)
and trying to see if I can cycle through it:
for i in driver.find_element_by_xpath(path):
    i.click()
I get the following error:
TypeError: 'FirefoxWebElement' object is not iterable
Any advice would be greatly appreciated.
This error message implies that you are trying to iterate through a single WebElement, whereas only list objects are iterable.
Solution
To give the for() loop a list whose elements it can iterate through, use find_elements* instead of find_element*. So your effective code block will be:
for i in driver.find_elements_by_xpath(path):
    i.click()
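As a sketch of the full paging loop: re-find the pager anchors on every pass, because the page redraws after each click and old references go stale. The pager id is taken from the question; matching the anchor text against the page number is an assumption about this site's markup:

pager = '//*[@id="dnn_ctr454_View_OrganizationsListViewDataPager"]//a'
for page in range(2, 52):  # pages 2 through 51
    # ... scrape the current page's table here ...
    # re-find the links each time so they are never stale
    links = [a for a in driver.find_elements_by_xpath(pager)
             if a.text.strip() == str(page)]
    if not links:
        break  # page number not shown in the pager
    links[0].click()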

Iterating google search results using python selenium

I want to iterate through the Google search results, clicking each result and copying the menus of each site. So far I can copy the menus and return to the results page, but I couldn't iterate over clicking the results. For now, I would like to learn how to iterate through the search results alone, but I'm stuck at a stale element reference exception. I did look at a few other sources, but no luck.
from selenium import webdriver

chrome_path = r"C:\Users\Downloads\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get('https://www.google.com?q=python#q=python')
weblinks = driver.find_elements_by_xpath("//div[@class='g']//a[not(@class)]")
for links in weblinks[0:9]:
    print(links.get_attribute("href"))
    links.click()
    driver.back()
StaleElementReferenceException means that the elements you are referring to no longer exist. That usually happens when the page is redrawn. In your case, you change the page and navigate back, so the elements are guaranteed to be redrawn.
The default solution is to search for the list inside the loop, on every iteration.
If you want to be sure the list is the same on every iteration, you need to add some additional check (compare texts, etc.).
If you use this code for scraping, you probably don't need back navigation at all. Just open every page directly with driver.get(href).
Here you can find a code example: How to open a link in new tab (chrome) using Selenium WebDriver?
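As a minimal sketch of that last suggestion, reusing the selectors from the question: read all the hrefs once, then open each result directly, so no element reference ever goes stale:

weblinks = driver.find_elements_by_xpath("//div[@class='g']//a[not(@class)]")
hrefs = [link.get_attribute("href") for link in weblinks[0:9]]
for href in hrefs:
    driver.get(href)  # no back navigation needed
    # ... copy the menus from the opened site here ...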

Getting to the last page in a website when the 'go to last page' button doesn't work

I need to get to the last page in the following site link. This is because when I right-click on a row, I can export all the previous rows along with it as a CSV file. That way, I can download the complete data present on the website.
But the problem is that the Go to last page option doesn't work, so I am currently using Selenium to click through the next page button until I eventually reach the last page. But it takes a lot of time.
This is the code I am currently using.
from selenium import webdriver
import time

url = "https://mahabocw.in/essential-kit-benefits-distribution/"
driver = webdriver.Chrome()
driver.get(url)
next_button = '/html/body/div[1]/div[6]/div/article/div/div/div/div/div[2]/div/div/div[2]/div/div[4]/span[2]/div[3]/button'
for i in range(1000000):
    click_next = driver.find_element_by_xpath(next_button)
    click_next.click()
Is there any way I could modify the code on the website and maybe make that particular button, Go to last page, work so I can jump to the last page and download all the data? Or is there some better technique I could adopt using Selenium?
Any help or suggestions would be really helpful. Thanks a lot in advance.
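One small improvement while a real fix for the broken button is found: let the clicking loop stop on its own instead of running a fixed million iterations. This is only a sketch; it assumes the next button disappears or stops being clickable on the last page, which would need to be verified on this site:

from selenium.common.exceptions import (ElementNotInteractableException,
                                        NoSuchElementException)

next_button = '/html/body/div[1]/div[6]/div/article/div/div/div/div/div[2]/div/div/div[2]/div/div[4]/span[2]/div[3]/button'
while True:
    try:
        driver.find_element_by_xpath(next_button).click()
    except (NoSuchElementException, ElementNotInteractableException):
        break  # assumption: the button stops working once the last page is reached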

Why does trying to click with Selenium bring up "ElementNotInteractableException"?

I'm trying to click on the webpage "https://2018.navalny.com/hq/arkhangelsk/" from the website's main page. However, I get this error
selenium.common.exceptions.ElementNotInteractableException: Message:
There's nothing after "Message:"
My code
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
browser = webdriver.Firefox()
browser.get('https://2018.navalny.com/')
time.sleep(5)
linkElem = browser.find_element_by_xpath("//a[contains(@href,'arkhangelsk')]")
linkElem.click()
I think xpath is necessary for me because, ultimately, my goal is to click not on a single link but on 80 links on this webpage. I've already managed to print all the relevant links using this :
driver.find_elements_by_xpath("//a[contains(@href,'hq')]")
However, for starters, I'm trying to make it click at least a single link.
Thanks for your help,
The best way to figure out issues like this is to look at the page source using the developer tools of your preferred browser. For instance, when I go to this page, open the HTML tab of Firebug, and search for //a[contains(@href,'arkhangelsk')], I can see that the link is located within a div which is currently not visible (in fact, the entire sub-section starting from the div with id="hqList" is hidden). Selenium will not allow you to click on invisible elements, although it will allow you to inspect them. Hence getting the element works; clicking on it does not.
What you do with it depends on what your expectations are. In this particular case it looks like you need to click on <label class="branches-map__toggle-label" for="branchesToggle">Список</label> ("List") to make that link visible. So add this:
browser.find_element_by_link_text("Список").click()
After that you can click on any link in the list.
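Put together, a minimal version of the fix looks like this (the one-second pause while the hidden section is revealed is an assumption; an explicit wait would be more robust):

browser.find_element_by_link_text("Список").click()  # reveal the hidden list
time.sleep(1)  # crude wait for the section to become visible
browser.find_element_by_xpath("//a[contains(@href,'arkhangelsk')]").click()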
