First of all, I'm completely new to web scraping, HTML and Selenium, so my question might not seem meaningful to you.
What I'm trying to do: automate a click on an image on a webpage. Manually clicking this image displays new information in a part of the webpage.
How I tried to do it: I found something specific to this image, for example:
<img src="images/btn_qmj.gif" border="0">
So I just entered in python:
xpath = '//img[@src="images/btn_qmj.gif"]'
driver.find_elements_by_xpath(xpath)
Problem: this returns an empty list. I partially understand why, but I don't have a solution.
image of the html code inspector
I included here an image of the html tree I obtain on the web inspector. As one can see, the tree to reach my line of interest gives "/html/frameset/frame1/#document/html/body/center/table/tbody/tr[2]/td/a/img".
The issue is that I cannot access, via an XPath, anything that comes after the #document. If I try to copy the XPath of my line of interest, it tracks back only up to the second "/html". It is as if the page is subdivided, compartmentalized, with inner parts I cannot access. I suppose there is a way to access the #document content; maybe it refers to another webpage, but I'm stuck at that point.
I would be very grateful if anyone could help me on this.
Have you tried switching to the frame first?
driver.switch_to.frame('gauche')
You need to switch to the frame first to handle the image inside it. Please see the code below for reference:
driver.switch_to.frame(frame)
WebDriverWait(driver, 20).until(
    EC.visibility_of_element_located((By.XPATH, "//img[@src='images/btn_qmj.gif']"))).click()
driver.switch_to.default_content()
Note: you need to add the imports below.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
The element is present inside an iframe. In order to access the element you need to switch to the frame first.
Induce WebDriverWait() and frame_to_be_available_and_switch_to_it()
Induce WebDriverWait() and visibility_of_element_located()
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.NAME,"gauche")))
WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH,"//img[@src='images/btn_qmj.gif']"))).click()
You need to import the following libraries.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
The code I want to extract the "ul" element from
After trying the suggestion, I want to get "title" in the following "li" tag
I'm trying to extract the following "ul" element from a webpage using Selenium with Python. I can't figure out what the XPath should be; I've tried everything I could think of. I also tried a CSS selector. I'm getting back nothing.
I want to iterate over the "li" elements within that specific "ul" element. If anybody could help, it would be appreciated; I've literally tried everything I can think of and search for.
//ul[@aria-label='Chat content']/li[1]/div[1]/div[1]
To me it looks like you could just grab the ul by its aria-label and then select the first li from there. You could also inspect the element and copy the XPath from the developer console.
It also seems to be in an iframe.
wait = WebDriverWait(driver, 30)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[@class='embedded-electron-webview embedded-page-content']")))
Imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Check whether the element is inside an iframe; maybe that is the problem. If that doesn't work, install a Selenium recorder extension in your browser and try to recreate the flow: it will give you all the XPaths. It's an approach I use when nothing else works for me.
Using Selenium and Python, I'm opening a document by clicking on a link which opens in the same tab. When I take a screenshot without using '--headless', it captures the document, but when I activate '--headless' the screenshot shows the previous page (the one where I click the link that opens the document).
I have tried different browsers and approaches (opening in a new tab and switching), but nothing works when using '--headless'.
Any idea why '--headless' is behaving like this?
While taking a screenshot, ideally you need to induce WebDriverWait for visibility_of_element_located() of some static, visible element, e.g. an <h1> / <h2> element on the desired page, and then take the screenshot as follows:
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "css_visible_element")))
driver.save_screenshot("Federico Albrieu.png")
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
I solved the problem.
I was using the action
ActionChains(driver).move_by_offset(x_coordinate, y_coordinate).click().perform()
to open the desired document. Apparently, while using headless mode, Selenium can't click at the document's coordinates, but it can when not using headless.
I tried instead clicking with WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, 'XPATH'))).click() and it works when using headless.
Thanks!
I'm having a really hard time locating elements on this website. My ultimate aim here is to scrape the scorecard data for each country. The way I've envisioned doing that is to have Selenium click list view (as opposed to the default country view) and loop in and out of all the country profiles, scraping the data for each country along the way. But methods to locate elements on the page that worked for me in the past have been fruitless with this site.
Here's some sample code outlining my issue.
from selenium import webdriver
driver = webdriver.Chrome(executable_path= "C:/work/chromedriver.exe")
driver.get('https://www.weforum.org/reports/global-gender-gap-report-2021/in-full/economy-profiles#economy-profiles')
# click the `list` view option
driver.find_element_by_xpath('//*[@id="root"]/div/div[1]/div[2]/div[2]/svg[2]')
As you can see, I've only gotten as far as step 1 of my plan. I've tried what other questions have suggested as far as adding waits, but to no avail. I see the site fully loaded in my DOM, but no xpaths are working for any element on there I can find. I apologize if this question is posted too often, but do know that any and all help is immensely appreciated. Thank you!
The element is inside an iframe; you need to switch to it to access the element.
Use WebDriverWait() to wait for frame_to_be_available_and_switch_to_it():
driver.get("https://www.weforum.org/reports/global-gender-gap-report-2021/in-full/economy-profiles#economy-profiles")
wait=WebDriverWait(driver,10)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "iFrameResizer0")))
wait.until(EC.element_to_be_clickable((By.XPATH,"//*[name()='svg' and @class='sc-gxMtzJ bRxjeC']"))).click()
You need following imports.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
Another update:
driver.get("https://www.weforum.org/reports/global-gender-gap-report-2021/in-full/economy-profiles#economy-profiles")
wait=WebDriverWait(driver,10)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, "iFrameResizer0")))
driver.find_element_by_xpath("//*[name()='svg' and @class='sc-gxMtzJ bRxjeC']").click()
You are clicking List view incorrectly; your locator has to be stable (I checked yours and it did not work).
So, in order to click the icon, use WebDriverWait:
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
wait = WebDriverWait(driver, timeout=30)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".sc-gzOgki.ftxBlu>.background")))
list_view = driver.find_element_by_css_selector(".sc-gzOgki.ftxBlu>.background")
list_view.click()
Next, to get the unique row locator, use the following css selector:
.sc-jbKcbu.kynSUT
It will give you the list of all countries.
I want to scrape the data on the website https://www.climatechangecommunication.org/climate-change-opinion-map/. I am somewhat familiar with selenium. But the data I need which is below the map and the tooltip on the map is not visible in the source file. I have read some posts about using PhantomJS and others. However, I am not sure where and how to start. Can someone please help get me started.
Thanks,
Rexon
You can use this sample code:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://www.climatechangecommunication.org/climate-change-opinion-map/")
# switch to iframe
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[@src = 'https://environment.yale.edu/ycom/factsheets/MapPage/2017Rev/?est=happening&type=value&geo=county']")))
# do your stuff
united_states = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[@id='document']/div[4]//*[name()='svg']")))
print(united_states.text)
# switch back to default content
driver.switch_to.default_content()
Output:
50%
No
12%
Yes
70%
United States
Screenshot of the element:
Explanation: first of all, to be able to interact with elements below the map you have to switch to the iframe content; otherwise it is not possible to interact with these elements. Then, the data below the map is in svg tags, which are also not trivial to handle. To deal with this, use the sample I have provided.
PS: I have used WebDriverWait in my code. With WebDriverWait your code becomes quicker and more stable, since Selenium waits only as long as needed for a particular condition, such as visibility or clickability of a particular element. In the sample code the driver waits up to 10 seconds for the expected condition to be satisfied.
I'm writing a piece of python code to test whether users are able to click on dynamic tabs.
I'm using find_elements_by_xpath() to search for attributes of the tab. Now, this is a dynamic set of tabs, so they will never be viewable in the page source. But they do exist when I click on inspect element.
I'm using something like this to identify the tabs:
elem = self.browser.find_elements_by_xpath("//a[contains(@role, 'tab')]")
for i in elem:
    print(i.text)
I tried the get_attribute feature but that did not work. Is there any way I can use python selenium to click on a dynamic tab (that does not appear on page source)?
You can wait some time until the required element is present in the DOM and clickable:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC
wait(self.browser, 10).until(EC.element_to_be_clickable((By.XPATH, '//a[contains(@role, "tab")]'))).click()