I am trying to extract data from this page using Python and Selenium. The table is rendered by Tableau; I need to input some data and then use the download button.
Interestingly, I can't access the elements inside the table from Selenium. I have tried locating them by id, class name, and XPath, but I keep getting NoSuchElementException. However, these elements are rendered in the HTML, and I can see them with the inspect tool. Does anyone know why this happens, and how I can make them visible to Selenium?
EDIT 1: It's not a problem of loading time. I tried time.sleep(), and I am also interacting directly with the page.
I can see your tables are inside an iframe. First switch inside it, and then try to scrape the table data.
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[contains(@src, 'zika_Weekly_Agg_tben')]")))
# Code here to scrape data
driver.switch_to.default_content()  # To come out of the frame
You will need these imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
This was quite challenging, since the page has two iframes followed by a shadow element. And it does not stop there: when you switch to the iframe, you no longer have the iframe reference available to access the shadow element. You can refer to the code below; it manages to get the Tableau chart heading.
# Get first iframe and switch to it
root1 = driver.find_element_by_xpath("//div[@itemprop='articleBody']//iframe")
driver.switch_to.frame(root1)
# Grab the shadow element
shadow = driver.execute_script('return document')
# Get the iframe inside shadow element of first iframe
iframe2 = shadow.find_element_by_xpath("//body/iframe")
# switch to 2nd iframe
driver.switch_to.frame(iframe2)
print("selected 2nd iframe")
shadow_doc2 = driver.execute_script('return document')
print("second iframe")
heading = shadow_doc2.find_element_by_xpath("//div[@class='tab-textRegion-content']/span//span[text()='Cases of Zika Virus Disease']/ancestor::div[2]").text
print(heading)
Output -
Let's say I want to scroll a scrollbar in some element.
As an example, let's take the link "https://www.w3schools.com/howto/howto_css_table_responsive.asp". If I run:
element = driver.find_element_by_xpath("//table")
I get the table element. But now I want to scroll the scrollbar horizontally, the one controlling the view of this table. How can I do this?
I already tried something like:
driver.execute_script("arguments[0].scrollLeft = 200;",element)
But I wasn't successful. I also tried sending keys, but that didn't work either.
You do not need to actually be able to see elements for Selenium to target them.
Just proceed to target the elements that are 'off screen' as you would if they were on screen.
Selenium targets the DOM, which (as a loose analogy that I will probably be roasted for in the comments) is like a screen of infinite length and width, so no scrolling is necessary: Selenium can already "see" everything on that infinite screen.
If you are having trouble selecting specific elements within the table in question (or if you need a specific code snippet), please feel free to comment; I am rather familiar with Selenium.
EDIT:
In your specific case, since the rows are loaded dynamically, you are 100% correct that you will need to scroll to get the data. Let's take care of the vertical scrolling by hitting the More Financial Data button at the bottom, which expands the list.
Now for scrolling to the right: we can simply scroll all the way to the right, so that all the data we need to access is visible (assuming you don't have a Premium account).
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
driver = webdriver.Firefox()
url = "https://www.morningstar.com/stocks/xnas/tsla/financials"
driver.get(url)
driver.maximize_window()
# Click on Income Statement
xpath = '//*[@id="__layout"]/div/div[2]/div[3]/main/div[2]/div/div/div[1]/sal-components/section/div/div/div/div/div[2]/div/div/div[2]/div[2]/div/div[2]/div/div[2]/div[2]/div[1]/div[1]/a'
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, xpath))).click()
# click on More Financials Detail Data
xpath = '//*[@id="__layout"]/div/div[2]/div[3]/main/div[2]/div/div/div[1]/sal-components/section/div/div/div/div/div[2]/div/div[2]/div/div[2]/div/div[3]/div[2]/div/a'
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, xpath))).click()
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
xpath='/html/body/div[2]/div/div/div[2]/div[3]/main/div[2]/div/div/div[1]/sal-components/section/div/div/div/div/div[2]/div/div[2]/div/div[2]/div/div[2]/div[4]/div/div[3]/div[2]/div[2]'
slider = driver.find_element_by_xpath(xpath)
horizontal_bar_width = slider.rect['width']
ActionChains(driver).click_and_hold(slider).move_by_offset(horizontal_bar_width/2, 0).release().perform()
I am currently using Python and Selenium to scrape data, export it to a CSV, and then manipulate it as needed. I am having trouble grasping how to build XPath statements to access specific text elements on a dynamically generated page.
https://dutchie.com/embedded-menu/revolutionary-clinics-somerville/menu
From the above page I would like to export the category (not part of each product, but a parent element), followed by all the text fields associated with a product card.
The following statement lets me pull all the titles (sort of) under the "Flower" category, but from that I am unable to access the child text elements within each product, only a weird variation of the title. The XPath approach seems ideal, as it lets me pull this data without having to scroll the page with key presses/JavaScript.
products = driver.find_elements_by_xpath("//div[text()='Flower']/following-sibling::div/div")
for product in products:
    print("Flower", product.text)
What would I add to the above statement if I wanted to pull the full set of text-bearing child elements within the 'consumer-product-card__InViewContainer', within each category, such as Flower, Pre-Rolls, and so on? I experimented last night with different approaches and different paths/nodes/predicates to access this information, building off the above code, but ultimately failed.
Also, is there a way for me to test or visualize "where I am" in terms of the scope of a given XPath statement?
Thank you in advance!
I have tried some code for you; please take a look and let me know if it resolves your problem.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 60)
driver.get('https://dutchie.com/embedded-menu/revolutionary-clinics-somerville/menu')
All_Heading = wait.until(
    EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class, 'products-grid__ProductGroupTitle')]")))
for heading in All_Heading:
    driver.execute_script("return arguments[0].scrollIntoView(true);", heading)
    print("------------- " + heading.text + " -------------")
    ChildElement = heading.find_elements_by_xpath("./../div/div")
    for child in ChildElement:
        driver.execute_script("return arguments[0].scrollIntoView(true);", child)
        print(child.text)
Please find the output of the above code -
Hope this is what you are looking for. If it solves your query, please mark it as the answer.
I am writing a Python script that will call a webpage and select an option from a drop-down to download a file. For this task I am using ChroPath, a browser extension that gives you the relative XPath or id of any button or field on a webpage, which you can then use in a Python Selenium script.
The above image shows the drop-down menu in which I have to select 2019 as the year and then download the file. In the lower part of the image, you can see that I used ChroPath to get the relative XPath of the drop-down menu, which is //select[@id='rain'].
Below is the code I am using:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("<URL>")
driver.maximize_window()
grbf = driver.find_element_by_xpath("//select[@id='rain']")
grbf.send_keys('2019')
grbf_btn = (By.XPATH, "//form[1]//input[1]")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable(grbf_btn)).click()
From the above code, you can see that I am using an XPath to select the drop-down, grbf = driver.find_element_by_xpath("//select[@id='rain']"), then sending the keys 2019, i.e. grbf.send_keys('2019'), and after that calling the download button. But for some reason it always selects the year 1999 from the drop-down. I am not able to understand what is wrong. Is this the correct approach? Please help. Thanks.
I had the same problem a while ago. Try this:
from selenium.webdriver.support.ui import Select
grbf = Select(driver.find_element_by_xpath("//select[#id='rain']"))
grbf.select_by_value('2019')
In select_by_value() you have to use the value attribute of the option in the dropdown.
By the way, if an element has an id, use it:
grbf = Select(driver.find_element_by_id('rain'))
Try the code below:
select = Select(driver.find_element_by_xpath("//select[#id='rain']"))
select.select_by_visible_text('2019')
Other approaches for dealing with a dropdown:
Using the index of the option:
select.select_by_index(index)
Using the value of the option:
select.select_by_value('value of element')
Using the visible text of the option:
select.select_by_visible_text('element_text')
In my opinion, this is not the correct approach. You are trying to select an option from a dropdown (a select element, not a text box), so the send-keys command does not work.
What you need to do is inspect how the HTML changes when you click the dropdown, and then write an XPath for the option that you want to select.
If you are still stuck on this problem, I recommend Katalon Recorder, a Chrome extension that lets you record and run UI tests.
I've been using Selenium to scrape a site to retrieve some information.
The information is hidden behind a "see more" tab whose content is revealed by JavaScript when I click it. I have tried many different methods to make the information visible, and none of them seem to work.
I have tried using action chains along with the regular XPath methods to chain the functionality together, but it still does not seem to click: all the other info is pulled, and the button text is printed to the console instead of the button being clicked.
def grabDetails(self):
    facts = self.browser.find_elements_by_xpath("//section[@id='hdp-content']/main/div/div[4]")
    for fact in facts:
        details = fact.text
        print(details)

def moreFeatures(self):
    view_more_elements = WebDriverWait(self.browser, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(text(),'See More Facts and Features')]")))
    # features.click()
    ActionChains(view_more_elements).double_click().preform()
    # self.browser.execute_script('arguments[0].scrollIntoView(true);', features)
I'm trying to get the information from this page printed out!
Here is the Zillow page that I'm trying to scrape.
It's the "see more sections" part below it.
You are setting view_more_elements as a WebDriverWait object rather than a WebElement, which prevents the object from being clickable. You just need to run WebDriverWait as its own call, then .click() the element.
view_more_xpath = '//div[@class="read-more zsg-centered"]/a'
view_more_elements = WebDriverWait(self.browser, 20).until(EC.visibility_of_element_located((By.XPATH, view_more_xpath)))
self.browser.find_element_by_xpath(view_more_xpath).click()
EDIT: you don't actually need to assign the WebDriverWait result to view_more_elements. You can just do:
view_more_xpath = '//div[@class="read-more zsg-centered"]/a'
WebDriverWait(self.browser, 20).until(EC.visibility_of_element_located((By.XPATH, view_more_xpath)))
self.browser.find_element_by_xpath(view_more_xpath).click()
I'm trying to web-scrape through this webpage https://www.sigmaaldrich.com/. So far I have managed to use the requests method to drive the search bar. Next, I want to look up the prices of the compounds. The HTML that includes the prices is not visible until the Price dropdown has been clicked, which I achieved by using Selenium to click all the dropdowns with the desired class. After that, however, I do not know how to get the HTML of the page as it exists after clicking the dropdowns, where the price appears.
Here's my code so far:
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from time import sleep
#get the desired search terms by input
name=input("Reagent: ")
CAS=input("CAS: ")
#search using the name of the compound
data_name= {'term':name, 'interface':'Product%20Name', 'N':'0+',
'mode':'mode%20matchpartialmax', 'lang':'es','region':'ES',
'focus':'product', 'N':'0%20220003048%20219853286%20219853112'}
#search using the CAS of the compound
data_CAS={'term':CAS, 'interface':'CAS%20No.', 'N':'0','mode':'partialmax',
'lang':'es', 'region':'ES', 'focus':'product'}
#get the link of the name search
r=requests.post("https://www.sigmaaldrich.com/catalog/search/", params=data_name.items())
#get the link of the CAS search
n=requests.post("https://www.sigmaaldrich.com/catalog/search/", params=data_CAS.items())
#use selenium to click in the dropdown(only for the name search)
driver=webdriver.Chrome(executable_path=r"C:\webdrivers\chromedriver.exe")
driver.get(r.url)
dropdown=driver.find_elements_by_class_name("expandArrow")
for arrow in dropdown:
    arrow.click()
As I said, after this I need a way to get the HTML as it exists after opening the dropdowns, so that I can look for the price class. I have tried different things, but I can't seem to find a working solution.
Thanks for your help.
You can try using Selenium's WebDriverWait:
wait = WebDriverWait(driver, 30)
element = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, css)))
First, you should use WebDriverWait, as Austen pointed out.
For your question try this:
from selenium import webdriver
driver=webdriver.Chrome(executable_path=r"C:\webdrivers\chromedriver.exe")
driver.get(r.url)
dropdown=driver.find_elements_by_class_name("expandArrow")
for arrow in dropdown:
    arrow.click()
html_source = driver.page_source
print(html_source)
Hope this helps you!
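Once html_source is in hand, extracting the prices is plain HTML parsing. Here is a stdlib-only sketch using html.parser; the "price" class name is an assumption, so check the real markup for the actual class:

```python
from html.parser import HTMLParser

class PriceExtractor(HTMLParser):
    """Collect the text inside every tag whose class list contains a target class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0       # > 0 while we are inside a matching tag
        self.prices = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        if self.depth or self.target_class in classes.split():
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.prices.append(data.strip())

# Tiny demonstration on made-up markup ("price" is a hypothetical class):
parser = PriceExtractor("price")
parser.feed('<div><span class="price">12,30 &#8364;</span><span>other</span></div>')
print(parser.prices)  # ['12,30 €']
```

Feed it `html_source` from the answer above instead of the demo string, with the target class taken from the page's actual price markup.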