I'm currently using Python to import data from an Excel sheet and then use that information to fill out a form on a web page.
The problem I'm having is selecting a profile from a drop-down menu.
I've been using the Selenium library, and I can select the element using find_element_by_xpath, but only if I already know the data value. The data value is auto-generated for each new profile that's added, so I can't use it as a reliable means.
Profile = Browser.find_element_by_xpath("/html/something/something/.....")
Profile.click()
time.sleep(0.75) #allowing time for link to be clickable
The_Guy = Browser.find_element_by_xpath("/html/something/something/...")
The_Guy.click()
This works only on known paths. I would like to do something like this instead:
Profile = Browser.find_element_by_xpath("/html/something/something/.....")
Profile.click()
time.sleep(0.75) #allowing time for link to be clickable
The_Guy = Browser.find_element_by_id("Caption.A")
The_Guy.click()
Example of the HTML:
<ul class="list">
  <li class="option" data-value="XXXXX-XXXXX-XXXXX-XX-XXX">
    Thor
  </li>
  <li class="option" data-value="XXXXX-XXXXX-XXXXX-XX-XXX">
    IronMan
  </li>
  <li class="option" data-value="XXXXX-XXXXX-XXXXX-XX-XXX">
    Caption.A
  </li>
  ....
</ul>
What I'd like to be able to do is search by name (like Caption.A) and then step back to select the parent <li>. Thanks in advance.
Try using the following XPath to find the li containing the desired text, and then click on it. Sample code:
driver.find_element(By.XPATH, "//li[contains(text(), 'Caption.A')]").click()
Hope it helps :)
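As a side note, newer Selenium releases (4+) removed the `find_element_by_*` helpers in favour of `find_element(By..., ...)`. A minimal sketch of the same text-based lookup, assuming `driver` is an already-running WebDriver on the page in question (not started here):

```python
def option_xpath(name: str) -> str:
    # Build an XPath that matches an <li class="option"> by its visible text,
    # ignoring the auto-generated data-value entirely.
    return f"//li[@class='option'][normalize-space()='{name}']"

xpath = option_xpath("Caption.A")
print(xpath)  # //li[@class='option'][normalize-space()='Caption.A']

# With a live driver (Selenium 4 style):
#   from selenium.webdriver.common.by import By
#   driver.find_element(By.XPATH, xpath).click()
```

`normalize-space()` guards against the leading/trailing whitespace visible in the markup above.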
Related
Say I have an e-commerce site I want to scrape, and I am interested in the top ten trending products. When I dig into the HTML, the element looks like this:
<div>
  <div>
    <span>
      <a href='www.mysite/products/1'>
        Product 1
      </a>
    </span>
  </div>
  <div>
    <span>
      <a href='www.mysite/products/2'>
        Product 2
      </a>
    </span>
  </div>
  <div>
    <span>
      <a href='www.mysite/products/3'>
        Product 3
      </a>
    </span>
  </div>
  <div>
    <span>
      <a href='www.mysite/products/4'>
        Product 4
      </a>
    </span>
  </div>
</div>
My first solution was to extract the href attributes and store them in a list, then open a browser instance for each attribute. But that comes at a cost: I have to close and reopen the browser, and every time I open it I have to authenticate. I then tried a second solution. In it, the outer div is the parent, and the Selenium way of doing things would mean the products are stored as follows:
product_1 = driver.find_element_by_xpath("//div/div[1]")
product_2 = driver.find_element_by_xpath("//div/div[2]")
product_3 = driver.find_element_by_xpath("//div/div[3]")
product_4 = driver.find_element_by_xpath("//div/div[4]")
So my objective is to search for a product, and after getting the list, target each box's a tag and click it, extract more details on the product, and then go back, without closing the browser, until my list is finished. Below is my solution:
for i in range(10):
    try:
        num = i + 1
        path = f"//div/div[{num}]/span/a"
        product_click = driver.find_element_by_xpath(path)
        driver.execute_script("arguments[0].click();", product_click)
        scrape_product_detail()  # function that scrapes the whole product detail
        driver.execute_script("window.history.go(-1)")  # go backwards to continue looping
    except NoSuchElementException:
        print('Element not found')
The problem is that it works only for the first product: it scrapes all the detail and then goes back. Despite returning to the product page, the program fails to find the second element and those coming afterwards, and I am failing to understand what the problem may be. Could you kindly assist? Thanks.
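A likely cause (an assumption, since the full page isn't shown) is that after `driver.execute_script("window.history.go(-1)")` the product grid is re-rendered, so the next lookup runs before the element exists again. One way to sketch a fix is to precompute the per-index XPaths and explicitly wait for each one after navigating back:

```python
def product_xpath(index: int) -> str:
    # 1-based position of the product box inside the outer <div>, as in the question
    return f"//div/div[{index}]/span/a"

paths = [product_xpath(i + 1) for i in range(10)]
print(paths[0])  # //div/div[1]/span/a

# In the loop, before clicking (requires Selenium's wait imports):
#   from selenium.webdriver.common.by import By
#   from selenium.webdriver.support.ui import WebDriverWait
#   from selenium.webdriver.support import expected_conditions as EC
#   WebDriverWait(driver, 10).until(
#       EC.presence_of_element_located((By.XPATH, paths[i]))).click()
```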
Thanks @Debenjan, you did help me a lot there. Your solution is working like a charm. For those who would like to know how I went about it, here is the code:
article_elements = self.find_elements_by_class_name("s-card-image")
collection = []
for news_box in article_elements:
    # Pulling the hotel name
    slug = news_box.find_element_by_tag_name('a').get_attribute('href')
    collection.append(slug)

for i in range(len(collection)):
    self.execute_script("window.open()")
    self.switch_to.window(self.window_handles[i + 1])
    url = collection[i]
    self.get(url)
    print(self.title, url, self.current_url)
@A D thanks so much, your solution is working too. I will just have to test and see what's the best strategy and go with it. Thanks a lot, guys.
I'm fairly new to the web scraping world but I really need to do some web scraping on the Thesaurus website for a project I'm working on. I have successfully created a program using beautifulsoup4 that asks the user for a word, then returns the most likely synonyms based on Thesaurus. However, I would like to not only have those synonyms but also the synonyms of every sense of the word (which is depicted on Thesaurus by a list of buttons above the synonyms). I noticed that when clicking a button, the name of the classes also change, so I did a little digging and decided to go with Selenium instead of beautifulsoup.
I have now a code that writes a word on the search bar and clicks it, however, I'm unable to get the synonyms or the said buttons, simply because the find_element finds nothing, and being new to this, I'm afraid I'm using the wrong syntax.
This is my code at the moment (it looks for synonyms of "good"):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
import time
PATH = r"C:\Program Files (x86)\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://thesaurus.com")
search = driver.find_element_by_id("searchbar_input")
search.send_keys('good')
search.send_keys(Keys.RETURN)
try:
    headword = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "headword"))
    )
    print(headword.text)
    #buttons = headword.find_element_by_class_name("css-bjn8wh e1br8a1p0")
    #print(buttons.text)

    meanings = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "meanings"))
    )
    print(meanings.text)
    #words = meanings.find_elements_by_class_name("css-1kg1yv8 eh475bn0")
    #print(words.text)
except:
    print('failed')

driver.quit()
For the first part, I want to access the buttons. The headword is simply the element that contains all the buttons I want to press. This is the headword element according to the inspect tool:
<div id="headword" class="css-bjn8wh e1br8a1p0">
<div class="css-vw3jp5 e1ibdjtj4">
*unnecessary stuff*
<div class="css-bjn8wh e1br8a1p0">
<div class="postab-container css-cthfds ew5makj3">
<ul class="css-gap396 ew5makj2">
<li data-test-pos-tab="true" class="active-postab css-kgfkmr ew5makj4">
<a class="css-sc11zf ew5makj1">
<em class="css-1v93s5a ew5makj0">adj.</em>
<strong>pleasant, fine</strong>
</a>
</li>
<li data-test-pos-tab="true" class=" css-1ha4k0a ew5makj4">
*similar stuff*
<li data-test-pos-tab="true" class=" css-1ha4k0a ew5makj4">
...
where each one of these <li data-test-pos-tab="true" class=" css-1ha4k0a ew5makj4"> is a button I want to click. So far I have tried a bunch of things like the one shown in the code, and also things like:
buttons = headword.find_elements_by_class_name("css-1ha4k0a ew5makj4")
buttons = headword.find_elements_by_css_selector("css-1ha4k0a ew5makj4")
buttons = headword.find_elements_by_class_name("postab-container css-cthfds ew5makj3")
buttons = headword.find_elements_by_css_selector("postab-container css-cthfds ew5makj3")
but in every case Selenium cannot find these elements.
For the second part I want the synonyms. Here is the meaning element:
<div id="meanings" class="css-16lv1yi e1qo4u831">
<div class="css-1f3egm3 efhksxz0">
*unnecessary stuff*
<div data-testid="word-grid-container" class="css-ixatld e1cc71bi0">
<ul class="css-1ngwve3 e1ccqdb60">
<li>
<a font-weight="inherit" href="/browse/acceptable" data-linkid="nn1ov4" class="css-1kg1yv8 eh475bn0">
</a>
</li>
<li>
<a font-weight="inherit" href="/browse/bad" data-linkid="nn1ov4" class="css-1kg1yv8 eh475bn0">
...
where each of these elements is a synonym I want to get. Similarly to the previous case I tried several things such as:
synGrid = meanings.find_element_by_class_name("css-ixatld e1cc71bi0")
synGrid = meanings.find_element_by_css_selector("css-ixatld e1cc71bi0")
words = meanings.find_elements_by_class_name("css-1kg1yv8 eh475bn0")
words = meanings.find_elements_by_css_selector("css-1kg1yv8 eh475bn0")
And again Selenium cannot find these elements...
I would really appreciate some help in order to achieve this, even if it is just a push in the right direction instead of giving a full solution.
Hope I wrote all the needed information, if not, please let me know.
If you use a CSS selector, then you have to use a dot for a class:
css_selector(".css-ixatld.e1cc71bi0")
and a hash for an id:
css_selector("#headword")
just as you would in .css files. In a CSS selector you can also use the other matching methods available in CSS.
See CSS selectors on w3schools.com
Selenium converts class_name to a CSS selector, but class_name() expects a single name, and Selenium has problems when there are two or more names. When it converts class_name to css_selector it adds a dot only before the first name, but it needs a dot before the second and later names too. So you have to add the second dot manually:
class_name("css-ixatld.e1cc71bi0")
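The rule above can be sketched as a tiny helper that turns a multi-name class attribute into a valid CSS selector (a sketch; the class names come from the question's HTML and look auto-generated, so they may change):

```python
def css_for_classes(class_attr: str) -> str:
    # "css-ixatld e1cc71bi0" is two class names; a CSS selector
    # needs a leading dot before each of them.
    return "." + ".".join(class_attr.split())

print(css_for_classes("css-ixatld e1cc71bi0"))  # .css-ixatld.e1cc71bi0

# With a driver (not started here):
#   from selenium.webdriver.common.by import By
#   driver.find_element(By.CSS_SELECTOR, css_for_classes("css-ixatld e1cc71bi0"))
```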
See if this works:
meanings = driver.find_elements_by_xpath(".//div[@id='meanings']/div[@data-testid='word-grid-container']/ul/li")
for e in meanings:
    e.find_element_by_tag_name("a").click()
    # add an implicit wait here if you need one
    driver.back()
I am trying to automate JIRA tasks but am struggling to access the bulk edit option after a JQL filter. After reaching the correct screen I am stuck at this point:
(screenshot of the Bulk Change dropdown)
HTML code:
<div class="aui-list">
<h5>Bulk Change:</h5>
<ul class="aui-list-section aui-first aui-last">
<li class="aui-list-item active">
<a class="aui-list-item-link" id="bulkedit_all" href="/secure/views/bulkedit/BulkEdit1!default.jspa?reset=true&tempMax=4">all 4 issue(s)</a>
</li>
</ul>
</div>
My Python code:
bulkDropdown = browser.find_elements_by_xpath("//div[@class='aui-list']//aui-list[@class='aui-list-item.active']").click()
Try the following XPath (note find_element, singular, since find_elements returns a list that has no .click()):
bulkDropdown = browser.find_element_by_xpath("//li/a[@id='bulkedit_all']").click()
The link you want has an ID, you should use that unless you find that it's not unique on the page.
browser.find_element_by_id("bulkedit_all").click()
You will likely need to add a wait for clickable since from the screenshot it looks like a popup or tooltip of some kind. See the docs for more info on the different waits available.
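A minimal sketch of that wait, assuming the id from the question's HTML is unique on the page (the driver is not started here, so the Selenium calls are shown commented):

```python
# Locator for the bulk-edit link, by its id.
BULK_EDIT_LOCATOR = ("id", "bulkedit_all")
print(BULK_EDIT_LOCATOR)

# With a live driver, an explicit wait-for-clickable looks like:
#   from selenium.webdriver.common.by import By
#   from selenium.webdriver.support.ui import WebDriverWait
#   from selenium.webdriver.support import expected_conditions as EC
#   WebDriverWait(browser, 10).until(
#       EC.element_to_be_clickable((By.ID, "bulkedit_all"))).click()
```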
<li id="add-to-cart" class="">
<input type="button" value="Add to Cart" class="primary" name="add-to-cart">
</li>
I want to print the value.
Output: Add to Cart
Here is my solution.
First get the input elements inside the <li> (there may be more than one):
elements = browser.find_elements_by_xpath("//li[@id='add-to-cart']//input")
for e in elements:
    print(e.get_attribute("value"))
From Selenium docs:
find the button element using one of the find_element_by... methods on a driver (or on a parent WebElement)
read the name of the button or value of an input using .get_attribute(attribute_name)
read the text using property .text
I am working with a web page that needs some automation and having trouble interacting with certain elements due to their structure. Brief example:
<ul>
<li data-title="Search" data-action="search">
<li class="disabled" data-title="Ticket Grid" data-action="ticket-grid">
<li data-title="Create Ticket" data-action="create">
<li data-title="Settings" data-action="settings">
</ul>
I am aware of all the locator strategies like id and name listed here:
http://selenium-python.readthedocs.org/en/latest/locating-elements.html
However, is there a way to find an element by a custom attribute like "data-title" in this example?
You can use a CSS selector to match on any attribute. The general form is:
element[attribute='value']
where the plain = can be replaced by one of the operators *=, ^=, $= or ~= (substring, prefix, suffix, and word match respectively).
Per your example, it would be:
li[data-title='Ticket Grid']
(source http://ddavison.io/css/2014/02/18/effective-css-selectors.html)
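Spelled out against the question's markup, the operator variants look like this (selector strings only, no browser involved):

```python
# Each operator changes how the data-title value is matched.
exact     = "li[data-title='Ticket Grid']"   # value equals exactly
prefix    = "li[data-title^='Ticket']"       # value starts with
suffix    = "li[data-title$='Grid']"         # value ends with
substring = "li[data-title*='cket']"         # value contains
word      = "li[data-title~='Grid']"         # value contains the whole word
print(exact)

# With a driver: browser.find_element_by_css_selector(exact).click()
```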
If there are multiple possibilities, it is also worth knowing the following option:
from selenium.webdriver import Firefox

driver = Firefox()
driver.get(<your_html>)
li_list = driver.find_elements_by_tag_name('li')
for li in li_list:
    if li.get_attribute('data-title') == '<wanted_value>':
        <do_your_thing>
You can use:
"//li[#data-title='Ticket Grid']"