python selenium how to get link from google - python

How do I get a link from Google? I tried the following ways, but none of them worked:
find_elements_by_xpath("//*[#class='r']/#href")
driver.find_element_by_xpath("//a").get_attribute("href")
driver.find_element_by_xpath("//a").get_attribute(("href")[2])
I am receiving "none"
I need to get google link, not link to the site (e.g. not www.stackoverflow.com). It's highlighted on this image:

You may have multiple issues here:
The first option is not valid XPath, and the third passes ("href")[2] (i.e. the single character 'e') to get_attribute. The second option may match more than one element, so it will return the first that fits, which is not necessarily the one you want. So I suggest:
Make find specific enough to locate a proper element. I'd suggest find_element_by_link_text if you know the name of the link you are going to choose:
link = driver.find_element_by_link_text('Stack Overflow')
Given you chose the right link, you should be able to get the attribute:
href = link.get_attribute('href')
If the first statement throws an exception (most likely NoSuchElementException), you may need to wait for the element to appear on the page, as described here. For example:
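A minimal sketch of such an explicit wait, reusing the link text from the example above:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# wait up to 10 seconds for the link to appear, then read its href
link = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.LINK_TEXT, 'Stack Overflow'))
)
href = link.get_attribute('href')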

Related

Can't find element - xPath correct

So, I have an XPath (I've verified it works and matches 1 unique value via Google Chrome's dev tools).
I've tried various methods to get this XPath. Initially, using right click > Copy XPath in Chrome gave me:
//*[@id="hdr_f0f7cdb71b9a3f44782b87386e4bcb3e"]/th[2]/span/a
However, this ID changes on every reload.
So, I eventually got it down to:
//th[@name="name"]/span/a/text()
element = driver.find_element_by_xpath("//th[@name='name']/span/a/text()")
print(element)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//th[@name='name']/span/a/text()"}
Check this:
len(driver.find_elements_by_xpath('//*[contains(@id, "hdr_")]'))
If you don't get too many elements, you're done with this:
driver.find_elements_by_xpath('//*[contains(@id, "hdr_")]')
You should not be using /text() with a WebElement. Use "//th[@name='name']/span/a" as the XPath and print the text using element.text. (Not sure about the exact method for Python, but in Java it is element.getText().)
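A minimal sketch of that suggestion in Python, using the XPath from the question:
# locate the anchor itself (no /text() step), then read its text property
element = driver.find_element_by_xpath("//th[@name='name']/span/a")
print(element.text)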
I suggest using an absolute XPath rather than a relative XPath; that might resolve it if the id changes with every load. Please see below how an absolute XPath differs from a relative one, using the Google search bar as an example.
Relative XPath - //*[@id="tsf"]/div[2]/div/div[1]/div/div[1]/input
Absolute XPath - /html/body/div/div[3]/form/div[2]/div/div[1]/div/div[1]/input
I understand that, as you said, you cannot share a link, but people here can help if you share an inspect-element snapshot showing the DOM, so that if there is an issue in the XPath it can be rectified. Thanks :-)

Selenium Python: Census ACS Data- unable to select Download button in window

I am attempting to scrape the Census website for ACS data. I have scripted the whole process using Selenium except the very last click. I am using Python. I need to click a download button in a window that pops up when the data is zipped and ready, but I can't seem to identify this button. It also seems that the button might change names based on when it was last run, for example yui-gen2, yui-gen3, etc., so I am thinking I might need to account for this somehow, although I normally only see yui-gen2.
Also, the tag seems to be in a "span", which might be adding to my difficulty honing in on the button I need to click.
Please help if you can shed any light on this for me.
code snippet:
#Refine search results to get tables
driver.find_element_by_id("prodautocomplete").send_keys("S0101")
time.sleep(2)
driver.find_element_by_id("prodsubmit").click()
driver.implicitly_wait(100)
time.sleep(2)
driver.find_element_by_id("check_all_btn_above").click()
driver.implicitly_wait(100)
time.sleep(2)
driver.find_element_by_id("dnld_btn_above").click()
driver.implicitly_wait(100)
driver.find_element_by_id("yui-gen0-button").click()
time.sleep(10)
driver.implicitly_wait(100)
driver.find_element_by_id("yui-gen2-button").click()
Instead of using the element id, which as you pointed out varies, you can use XPath as Nogoseke mentioned, or a CSS selector. Be careful not to make the XPath/selector too specific or reliant on changing values, in this case the element id. Rather than using the id in the XPath, try expressing the XPath in terms of the DOM structure (tags):
//*/div/div/div/span/span/span/button[contains(text(),'Download')]
TIL you can validate your XPath by using the search function rather than by running it in Selenium. I right-clicked the webpage, chose "Inspect element", pressed Ctrl+F, and typed in the above XPath to validate that it matches the Download button.
For posterity, if the above XPath is too specific, i.e. reliant on too many levels of the DOM structure, you can do something shorter, like
//button[contains(text(),'Download')]
although, this may not be specific enough and may require an additional field, since there may be multiple buttons on the page with the 'Download' text.
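Since the pop-up only appears once the data has finished zipping, an explicit wait is safer than clicking right away. A minimal sketch, assuming the shortened XPath above (the 120-second timeout is an arbitrary choice):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# wait until the pop-up's Download button is clickable, then click it
btn = WebDriverWait(driver, 120).until(
    EC.element_to_be_clickable((By.XPATH, "//button[contains(text(),'Download')]"))
)
btn.click()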
Given the HTML you provided, you should be able to use
driver.find_element_by_id("yui-gen2-button")
I know you said you tried it, but you didn't say whether it works at all or what error message you are getting. If it never works, you likely have an IFRAME that you need to switch to, as sketched below.
If it works sometimes but not consistently due to changing ID, you can use something like
driver.find_element_by_xpath("//button[.='Download']")
In Chrome's inspection view you can right-click the item you want to find and copy its XPath. You can then find your element by XPath in Selenium.

Search for an Xpath returning results outside an element with a specified attribute in scrapy

I'm using the scrapy shell to grab all of the links in the subcategories section of this site: https://www.dmoz.org/Computers/Programming/Languages/Python/.
There's probably a more efficient XPath, but the one I came up with was:
//div[@id="subcategories-div"]/section/div/div/a/@href
As far as I can tell from the page source, there is only one div element with an [@id="subcategories-div"] attribute, so from there I narrow down until I find the link's href. This works when I search for this XPath in Chrome.
But when I run
response.xpath('//div[@id="subcategories-div"]/section/div/div/a/@href').extract()
in scrapy, it gives me back the links I'm looking for, but then for some reason, it also returns links from //*[@id="doc"]/section[8]/div/div[2]/a
Why is this happening, since nowhere in this path is there a div element with an [@id="subcategories-div"] attribute?
I can't seem to find any id with the name doc on the page you are trying to scrape. You might not have anchored your response.xpath at the right starting point. Do you get the same result if you change it, like so:
response.xpath('//*[@id="subcategories-div"]/section/div/div/a/@href').extract()
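Another way to rule out a bad anchor is to select the container first and then take only the links relative to it; a minimal sketch for the scrapy shell:
# grab the single container div, then only the anchors inside it
subcats = response.xpath('//div[@id="subcategories-div"]')
links = subcats.xpath('.//a/@href').extract()
print(len(links))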

Selenium and Python 3 – unable to find element

First off, apologies for a commonly asked question. I've looked through all the earlier examples but none of the answers seem to work in my situation.
I'm trying to locate the username and password fields from this website: http://epaper.bt.com.bn/
I've had no problems locating the "myprofile" element and clicking on it. It then loads a page into an iframe. Here's my problem: I've tried all the various methods like find_element_by_id('input_username'), find_element_by_name('username'), etc., and none of them work. Would appreciate it if someone could point me down the right path.
Try this first (you should switch to the iframe):
driver.switch_to.frame("iframe_login")
then you can find your elements. For example:
driver.find_element_by_id("input_username").send_keys("username")
To move back out of the iframe:
driver.switch_to.default_content()
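Putting it together with an explicit wait, in case the iframe takes a moment to load (the password field id input_password is a guess - check the actual DOM):
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# wait for the login iframe, switch into it, fill the form, then switch back out
WebDriverWait(driver, 10).until(
    EC.frame_to_be_available_and_switch_to_it("iframe_login")
)
driver.find_element_by_id("input_username").send_keys("username")
driver.find_element_by_id("input_password").send_keys("password")  # hypothetical id
driver.switch_to.default_content()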

Need to click on an xpath using selenium webdriver with python

I have created a list of elements matching an xpath and would like to click through each successively. However, if I use the get_attribute("href") command I get a 'unicode' object has no attribute 'click' error. This is because href is a string. If I don't use get_attribute and simply use this command:
driver.find_elements_by_xpath(".//div/div/div[3]/table//tr[12]/td/table//tr/td/a")
I get a list full of elements. I can successfully click on the first link in the list; however, when I click on the second I get this error: 'Element not found in the cache - perhaps the page has changed since it was looked up'
I imagine the reason is that the page links I am trying to iterate through are generated via a search query in JavaScript (this is one of the href links:
javascript:__doPostBack('ctl00$Content$listJobsByAll1$GridView2','Page$3') )
One more piece of relevant information: there are only two attributes at this XPath location: the href and the text.
So, given that I am dealing with a JavaScript-driven website and only those two attributes, I am hoping someone can tell me which webdriver commands I can use to get a series of clickable static links. Beyond a specific answer, any advice on how I could have figured this out myself would be helpful.
If you click on a link with Selenium, you are changing the current page. The page that you are directed to doesn't have the next element.
To get the links as strings instead of elements, use an XPath of the form:
'.//tag/@href'
you can try:
for elem in elems:
    elem.click()
    print(browser.current_url)
    browser.back()
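Note that after browser.back() the elements collected earlier usually go stale - that's exactly the 'not found in the cache' error above. A minimal sketch that re-finds the list by index on every pass, using the XPath from the question:
xpath = ".//div/div/div[3]/table//tr[12]/td/table//tr/td/a"
count = len(browser.find_elements_by_xpath(xpath))
for i in range(count):
    # re-locate the links each time; old references go stale after back()
    link = browser.find_elements_by_xpath(xpath)[i]
    link.click()
    print(browser.current_url)
    browser.back()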
