Finding a button with a similar href - Python

A particular button (which allows me to jump to the second page) has this href:
inputHref = /letsdeal?sectionLoadingID=m_timeline_loading_div_1485935999_0_36_timeline_unit%3A1%3A00000000001483240170%3A04611686018427387904%3A09223372036854775803%3A04611686018427387904&unit_cursor=timeline_unit%3A1%3A00000000001483240170%3A04611686018427387904%3A09223372036854775803%3A04611686018427387904&timeend=1485935999&timestart=0&tm=AQBwkKKSIKOhqAju&refid=17
and if I click on this button, a second page opens up where the button that takes me to the third page has this href:
inputHref = /letsdeal?sectionLoadingID=m_timeline_loading_div_1485935999_0_36_timeline_unit%3A1%3A00000000001482227114%3A04611686018427387904%3A09223372036854775798%3A04611686018427387904&unit_cursor=timeline_unit%3A1%3A00000000001482227114%3A04611686018427387904%3A09223372036854775798%3A04611686018427387904&timeend=1485935999&timestart=0&tm=AQBwkJZSIKOhqAju&refid=17
Both hrefs differ at the end but share the same beginning. How can I locate both of these buttons with a single XPath expression, along the lines of the following code?
extendButton = driver.wait.until(EC.presence_of_element_located(
    (By.XPATH, "//a[contains(@href, '" + inputHref + "')]")))

You can apply a partial match using contains():
//a[contains(@href, "letsdeal")]
Or:
//a[contains(@href, "/letsdeal")]
Or, with a CSS selector:
driver.find_element_by_css_selector("a[href*=letsdeal]")
Note that I don't know how unique the "letsdeal" substring is on your page and whether it is used in other href attribute values.
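For completeness, here is a minimal sketch combining the partial-match XPath with the explicit wait from the question (it assumes driver is already on the page containing the button, and uses the '/letsdeal?sectionLoadingID=' prefix taken from the hrefs above):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for a link whose href contains the shared prefix, then click it.
next_button = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located(
        (By.XPATH, "//a[contains(@href, '/letsdeal?sectionLoadingID=')]")
    )
)
next_button.click()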

Related

How can I select a specific element on a page when there are so many of that element? Selenium WebDriver, Python

Essentially what I am trying to do is select just the "Reply" box which is circled in red, but there are many of these on the page overall. My aim is to be able to select the first "Reply" box on every page. How can I select just the first Reply box for every post (with this link just being an example)?
Currently this doesn't seem to work:
reply = driver.find_element_by_xpath("//*[@id='content']/div/div[2]/div/div/div/div[1]/article/div/aside/ul/li[1]/div/div[2]/div/ul/li[7]/button/span/img")
reply.click()
Many thanks.
First way:
The XPath to locate any of these Reply buttons is
//button[@title="Reply"]
So the XPath to locate the first Reply button is
(//button[@title="Reply"])[1]
So you can simply do
driver.find_element_by_xpath('(//button[@title="Reply"])[1]').click()
Second way:
With the XPath above you can retrieve a list of all the Reply buttons, then take the first element in the list and click on it, as follows:
reply_buttons = driver.find_elements_by_xpath('//button[@title="Reply"]')
reply_buttons[0].click()
You can use a CSS selector instead of XPath here as well:
reply_buttons = driver.find_elements_by_css_selector('button[title="Reply"]')
reply_buttons[0].click()
Inspecting the page, I saw that the class name of this button is: <button class="Button Button--link">.
Since compound class names (with a space) don't work with find_elements_by_class_name, use a CSS selector instead:
driver.find_elements_by_css_selector('button.Button.Button--link')
which returns a list of all these buttons.
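If the page loads these buttons dynamically, a minimal sketch along these lines (assuming the same title="Reply" attribute and an already-created driver) waits for the first Reply button to become clickable before clicking it:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for the first Reply button to be clickable, then click it.
first_reply = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, '(//button[@title="Reply"])[1]'))
)
first_reply.click()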

Finding an Add To Cart button with Selenium using find_element_by_xpath

I'm trying to create a function that locates any add to cart button on a given website by searching for the text on the button. For example, Amazon says "Add To Cart" so I am using this function to try to locate the button. Unfortunately I'm getting:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//div[.='Add To Cart']"}
def GetElementByText(driver, url, text):
    driver.get(url)
    element = driver.find_element_by_xpath("//div[.='" + text + "']")
    print(element)
    return element
element = GetElementByText(driver, 'https://www.amazon.com/gp/product/B00ZG9U0KA?pf_rd_r=AQC5SP1PPERA8C37YCC8&pf_rd_p=5ae2c7f8-e0c6-4f35-9071-dc3240e894a8', 'Add To Cart')
I've also tried using this function, which works on other websites but not on Amazon.
def GetButtons(driver, url):
    driver.get(url)
    html = driver.page_source
    driver.quit()
    soup = BeautifulSoup(html, 'html.parser')
    buttons = soup.find_all('button')
    return buttons
GetButtons(driver, 'https://www.amazon.com/gp/product/B00ZG9U0KA?pf_rd_r=AQC5SP1PPERA8C37YCC8&pf_rd_p=5ae2c7f8-e0c6-4f35-9071-dc3240e894a8')
Is there an easier, more dynamic way to accomplish this that would be easy to apply to other websites? My concern is that some websites use buttons and some use links, and returning all the links or tags with BeautifulSoup produces too many results to sort through practically.
Any ideas for how to accomplish this? The function wouldn't necessarily have to find the button automatically on its own (though that would be great), but if I could narrow it down enough to search through 10-20 possible results, that would be perfect.
Two things.
On Amazon, the text is "Add to Cart" with a lowercase "to"; you have it with an uppercase "To".
By changing your XPath to "//*[.='" + text + "']", I was able to find it (after correcting the To/to error). It might be too broad for general application, but it's worth a try.
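Building on that, here is a rough sketch (my own generalisation, not something from the answer) that does a case-insensitive partial match on the visible text or the value attribute, so "Add To Cart", "Add to cart" and "ADD TO CART" would all match. It usually narrows things down to a short list you can review:
def find_by_text(driver, text):
    # XPath 1.0 has no lower-case(), so translate() lowercases the candidate
    # string before comparing. The candidate string combines the element's
    # text and its value attribute, since some sites use <input> controls
    # whose label lives in the value attribute rather than in the text.
    lowered = text.lower()
    xpath = ("//*[self::button or self::a or self::input]"
             "[contains(translate(concat(normalize-space(.), ' ', @value), "
             "'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), "
             "'" + lowered + "')]")
    return driver.find_elements_by_xpath(xpath)

candidates = find_by_text(driver, 'add to cart')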

Problem clicking on link in header in selenium (python)

Hello, I have an issue clicking on the "Assortiment" link
on this page: https://www.colruyt.be/fr
In order to click it I use:
element = browser.find_element_by_xpath('.//li[#class = "first leaf menu-mlid-9143"]')
element.click()
There are two elements that match that XPath. The first one is hidden in the side pull-out menu for the mobile version, and the second is the one you want. Try scoping your XPath to the div with the main-navigation class.
browser.find_element_by_xpath('.//div[#class = "main-navigation"]//li[#class = "first leaf menu-mlid-9143"]')
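If the click still fails because the menu renders late, a sketch like this (reusing the class names from the answer above) waits for the scoped element to become clickable first:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
browser.get('https://www.colruyt.be/fr')
# Wait until the "Assortiment" entry inside the main navigation is clickable, then click it.
assortiment = WebDriverWait(browser, 10).until(
    EC.element_to_be_clickable((By.XPATH,
        '//div[@class="main-navigation"]//li[@class="first leaf menu-mlid-9143"]'))
)
assortiment.click()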

Click link in Kickstarter using Selenium

I am attempting to scrape Kickstarter based on the project names alone. Using the project name and the base URL I can get to the search page. In order to scrape the project page, I need to use Selenium to click on the URL. However, I cannot point Selenium to the correct element to click on. I would also like this to be dynamic, so I do not need to enter the project name each time.
<div class="type-18 clamp-5 navy-500 mb3">
  <a href="https://www.kickstarter.com/projects/1980119549/knife-block-designed-by-if-and-red-dot-winner-jle?ref=discovery&term=Knife%20block%20-%20Designed%20by%20IF%20and%20Red%20dot%20winner%20JLE%20Design"
     class="soft-black hover-text-underline">Knife block - Designed by IF and Red dot winner JLE Design
  </a>
</div>
driver = webdriver.Chrome(chrome_path)
url = 'https://www.kickstarter.com/discover/advanced?ref=nav_search&term=Knife block - Designed by IF and Red dot winner JLE Design'
driver.get(url)
elem = driver.find_elements_by_link_text('Knife block - Designed by IF and Red dot winner JLE Design')
elem.click()
How can I get the elem to point to the correct link?
Regarding your attempt, your code has a typo: find_elements... returns a list of elements, so the .click() method will not work on it. You mean to use find_element.
To dynamically click links, use an XPath instead. The resulting code would be:
elem = driver.find_element_by_xpath('//div[contains(#class, "type-18")]/a')
elem.click()
This would grab the first match. You could do find_elements and iterate over the elements, but that would be a bad approach here: since you are clicking the links, each click renders the previously found elements stale. If there is more than one link, use the same XPath but indexed:
first_elem = driver.find_element_by_xpath('(//div[contains(#class, "type-18")]/a)[1]')
first_elem.click()
# ...
second_elem = driver.find_element_by_xpath('(//div[contains(#class, "type-18")]/a)[2]')
second_elem.click()
# And so forth...
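Alternatively, if you need to visit every search result, you can sidestep the staleness problem entirely by collecting the href values first and then navigating to each project page directly; a rough sketch:
# Collect the hrefs up front so navigating away doesn't invalidate the elements.
links = driver.find_elements_by_xpath('//div[contains(@class, "type-18")]/a')
project_urls = [link.get_attribute('href') for link in links]
for project_url in project_urls:
    driver.get(project_url)
    # ... scrape the project page here ...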

Selenium (python): can't switch to iframe (name is dynamically generated)

I'm having a problem selecting the iframe and accessing the different elements inside it. The iframe name is dynamically generated (e.g. frame11424758092173, frame0005809321 or frame32138092173). The problem is that Selenium can't find the iframe no matter what I do.
Switching to the most recent frame doesn't work:
iframe = driver.find_elements_by_tag_name('iframe')[0]
driver.switch_to_frame(iframe)
Waiting for frame gets a timeout exception:
try:
    iframe = WebDriverWait(driver, 5).until(EC.frame_to_be_available_and_switch_to_it(By.TAG_NAME('iframe')))
except:
    logger.error(traceback.format_exc())
The following lines of code also time out:
try:
    iframe = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.TAG_NAME, u"iframe")))
    driver.switch_to_frame(iframe)
except:
    logger.error(traceback.format_exc())
I have also tried iterating through the frames, but the returned list is empty:
iframes = driver.find_elements_by_tag_name('iframe')
# iframes is empty
I really need some help.
Have you tried locating the iframe by its XPath, using the contains() method?
iframe = driver.find_element_by_xpath('//iframe[contains(#name, "frame")]')
driver.switch_to_frame(iframe)
Now you can access elements within the iframe.
To exit the iframe use:
driver.switch_to_default_content()
The contains method lets you get an element by a partial attribute value. Pretty useful for dynamically generated IDs, names, etc. You can search by other attributes as well using XPath. For example, say your iframe element has the attribute value = "3". You could use:
iframe = driver.find_element_by_xpath('//iframe[contains(#name, "frame")][#value = "3"]')
driver.switch_to_frame(iframe)
This approach can be used with any number of attributes as well.
You could also try getting the element by its selector. Keep in mind that this limits what you can do with it:
driver.execute_script('document.querySelector("INSERT SELECTOR HERE").doSomething();')
To get the selector and/or XPath, you're going to want to inspect the element using your browser (Chrome in my case). Right-click on the element and click Inspect. Then right-click on the HTML element and click Copy > Copy XPath or Copy > Copy selector.
If that doesn't work, my last resort is to go to the URL of the iframe. To get that, right-click on the area of the webpage where the iframe is and click View Frame Source. That leads you to a new page whose URL is shown at the top of the browser after view-source:. You can then simply navigate to that URL:
driver.get('insert url of iframe here')
And now you have access to the elements within the iframe. I do not recommend this approach if you are manipulating elements within the iframe and then exiting it, because your changes will get lost. It only works if you are scraping info from the iframe, not if you are manipulating its elements. Finding the iframe element and switching into it is usually better and safer.
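Putting the first suggestion together with an explicit wait, a minimal sketch (assuming the generated names really do all start with "frame") would be:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait for the dynamically named iframe and switch into it in one step.
WebDriverWait(driver, 10).until(
    EC.frame_to_be_available_and_switch_to_it(
        (By.XPATH, '//iframe[contains(@name, "frame")]')
    )
)
# ... interact with elements inside the iframe here ...
driver.switch_to_default_content()  # switch back out when done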
