I have created a list of elements matching an xpath and would like to click through each successively. However, if I use the get_attribute("href") command I get a 'unicode' object has no attribute 'click' error. This is because href is a string, not a clickable element. If I don't use get_attribute and simply use this command:
driver.find_elements_by_xpath(".//div/div/div[3]/table//tr[12]/td/table//tr/td/a")
I get a list full of elements. I can successfully click on the first link in the list; however, when I click on the second I get this error: 'Element not found in the cache - perhaps the page has changed since it was looked up'.
I imagine the reason is that the page links I am trying to iterate through are generated by a search query through JavaScript (this is one of the href values:
javascript:__doPostBack('ctl00$Content$listJobsByAll1$GridView2','Page$3') )
One more piece of relevant information: there are only two attributes at this xpath location: href and the text.
So, given that I am dealing with a JavaScript-driven website and only those two attributes, I am hoping someone can tell me which webdriver commands I can use to get a series of clickable static links. Beyond a specific answer, any advice on how I could have figured this out myself would be helpful.
If you click on a link with selenium, you are changing the current page, and the page you are directed to doesn't have the next element, so the references in your list go stale.
To get the link URLs, use an XPath of the form:
'.//tag/@href'
You can try:

elems = browser.find_elements_by_xpath(".//div/div/div[3]/table//tr[12]/td/table//tr/td/a")
for elem in elems:
    elem.click()
    print(browser.current_url)
    browser.back()
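Note that after browser.back() the DOM is rebuilt, so the elements found before the click go stale. A minimal sketch (assuming the xpath from the question) that re-locates the list on every pass avoids the cache error:

xpath = ".//div/div/div[3]/table//tr[12]/td/table//tr/td/a"

# Re-locate the links on each iteration so we never hold a stale reference.
count = len(browser.find_elements_by_xpath(xpath))
for i in range(count):
    link = browser.find_elements_by_xpath(xpath)[i]
    link.click()
    print(browser.current_url)
    browser.back()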
This is the site I want to scrape.
I want to scrape all the information in the table on the first page:
then click on the second and do the same:
And so on until the 51st page. I know how to use selenium to click on page two:
link = "http://www.nigeriatradehub.gov.ng/Organizations"
driver = webdriver.Firefox()
driver.get(link)
xpath = '/html/body/form/div[3]/div[4]/div[1]/div/div/div[1]/div/div/div/div/div/div[2]/div[2]/span/a[1]'
driver.find_element_by_xpath(xpath).click()
But I don't know how to set the code up so that it cycles through each page. Getting the xpath is a manual process in the first place (I go into Firefox, inspect the element and copy its xpath into the code), so I don't know how to automate that step in and of itself, let alone the ones that follow.
I tried going a level higher in the webpage html, choosing the entire section of the page with the elements I want, and cycling through them, but that doesn't work because it's a Firefox web object (see below). Here's a snapshot of the relevant part of the page source:
I call the xpath of the higher-level element like so:
path = '//*[@id="dnn_ctr454_View_OrganizationsListViewDataPager"]'
driver.find_element_by_xpath(path)
and trying to see if I can cycle through it:
for i in driver.find_element_by_xpath(path):
    i.click()
I get the following error:
Any advice would be greatly appreciated.
This error message...
...implies that you are trying to iterate through a WebElement, whereas only list objects are iterable.
Solution
To give the for() loop a list whose elements it can iterate through, you need to use find_elements* instead of find_element*. So your effective code block will be:
for i in driver.find_elements_by_xpath(path):
    i.click()
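To cycle through all 51 pages rather than clicking just once, one rough sketch (assuming each pager entry is rendered as a link whose visible text is the page number, which is a guess about this site's markup) would be:

from selenium.common.exceptions import NoSuchElementException

for page in range(2, 52):  # the question mentions 51 pages
    # ... scrape the table on the current page here ...
    try:
        # Re-locate the pager link each time; old references go stale
        # after the page updates.
        driver.find_element_by_link_text(str(page)).click()
    except NoSuchElementException:
        break  # ran out of pager links earlier than expected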
How do I get a link from Google? I tried the following ways, but none of them worked:
find_elements_by_xpath("//*[#class='r']/#href")
driver.find_element_by_xpath("//a").get_attribute("href")
driver.find_element_by_xpath("//a").get_attribute(("href")[2])
I am receiving None.
I need to get the Google link, not the link to the site itself (e.g. not www.stackoverflow.com). It's highlighted in this image:
You may have multiple issues here:
The first and third options are not valid XPath. The second option may find more than one match, so it will return the first that fits, which is not necessarily the one you want. So I suggest:
Make the find specific enough to locate the proper element. I'd suggest find_element_by_link_text if you know the text of the link you are going to choose:
link = driver.find_element_by_link_text('Stack Overflow')
Given you chose the right link, you should be able to get the attribute:
href = link.get_attribute('href')
If the first statement throws an exception (most likely element not found), you may need to wait for the element to appear on the page, as described here.
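Putting both steps together with an explicit wait (a sketch; 'Stack Overflow' is just the example link text from above, and the 10-second timeout is arbitrary):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait for the result link to appear, then read its href attribute.
link = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.LINK_TEXT, 'Stack Overflow'))
)
href = link.get_attribute('href')
print(href)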
I'm using the scrapy shell to grab all of the links in the subcategories section of this site: https://www.dmoz.org/Computers/Programming/Languages/Python/.
There's probably a more efficient XPath, but the one I came up with was:
//div[@id="subcategories-div"]/section/div/div/a/@href
As far as I can tell from the page source, there is only one div element with an [@id="subcategories-div"] attribute, so from there I narrow down until I find the link's href. This works when I search for this XPath in Chrome.
But when I run
response.xpath('//div[@id="subcategories-div"]/section/div/div/a/@href').extract()
in scrapy, it gives me back the links I'm looking for, but then for some reason, it also returns links from //*[@id="doc"]/section[8]/div/div[2]/a
Why is this happening, since nowhere in this path is there a div element with an [@id="subcategories-div"] attribute?
I can't seem to find any id with the name doc on the page you are trying to scrape. You might not have anchored your response.xpath at the right starting point. Do you get the same result if you change it like so:
response.xpath('//*[@id="subcategories-div"]/section/div/div/a/@href').extract()
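One way to see what is actually being matched (a sketch for the scrapy shell): select the subcategories div first, then run a relative XPath against that selector, so nothing outside the div can leak into the results:

# Anchor on the div first, then go relative to it.
subcats = response.xpath('//div[@id="subcategories-div"]')
print(len(subcats))  # should be 1 if the id really is unique on the page

# The leading './' keeps the search inside the selected div.
links = subcats.xpath('./section/div/div/a/@href').extract()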
Here is an example page with pagination controlling dynamically loaded results.
http://www.rehabs.com/local/jacksonville-fl/
All that I presently know to try is:
curButton = 1
driver.find_element_by_css_selector('ul[class="pagination"]').find_elements_by_tag_name('li')[curButton].click()
Nothing seems to happen (also when trying to access and click the a tag, or when passing the href of the a element to driver.get()).
Is there another way to access the hidden elements? For instance, when reading the html of the entire page, the elements of the other pagination pages are shown, but are apparently inaccessible with BeautifulSoup.
Pagination was added for humans. Maybe you used the wrong xpath or css. Check it.
Use this xpath:
//div[@id="listing-basic"]/article/div[@class="h3"]/a/@href
You can click on the pagination button using:
driver.find_elements_by_css_selector('.pagination li a')[1].click()
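A sketch tying the two together, walking each pagination button and collecting the listing links on every page (the fixed sleep is a crude stand-in for a proper wait, and the assumption that every li after the first is a page button may not hold for this site):

import time

hrefs = []
page_index = 1
while True:
    # Collect the listing links on the current page (XPath from above).
    for a in driver.find_elements_by_xpath(
            '//div[@id="listing-basic"]/article/div[@class="h3"]/a'):
        hrefs.append(a.get_attribute('href'))

    # Re-locate the pagination buttons each pass to avoid stale references.
    buttons = driver.find_elements_by_css_selector('.pagination li a')
    if page_index >= len(buttons):
        break
    buttons[page_index].click()
    time.sleep(2)  # crude; an explicit wait for new results would be better
    page_index += 1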
I'm writing test scripts for a web page in python using Selenium's remote control interface.
I'm writing it like this:
elem = browser.find_element_by_link_text("foo")
elem.click()
elem = browser.find_element_by_name("goo")
elem.send_keys("asdf")
elem = browser.find_element_by_link_text("foo2")
elem.click()
It then needs to select an item in a list. The list becomes visible when the mouse hovers over it, but selenium cannot find the element if it's hidden. The list also shows options based on who is logged in. The list is implemented in CSS, so trying to run it in javascript and using gettext() does not work.
I've tried searching for the link based on name, class and xpath, but it always reports that it is not visible. I've verified from browser.page_source() that the link is in the source code, so it's reading the correct page.
How do I select the link inside the list? Any help is appreciated.
Selenium and :hover css suggests that this can't be done using the Selenium RC interface; it must instead be done using the WebDriver API.
Try move_to_element(). Check out the API http://readthedocs.org/docs/selenium-python/en/latest/api.html
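A minimal move_to_element() sketch with ActionChains (the URL and both locators are placeholders; substitute whatever identifies your hover menu and the item inside it):

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains

driver = webdriver.Firefox()
driver.get("http://example.com")  # placeholder URL

menu = driver.find_element_by_id("menu")          # placeholder locator
item = driver.find_element_by_link_text("Item")   # placeholder locator

# Hover over the element that reveals the list, then click the revealed item.
ActionChains(driver).move_to_element(menu).click(item).perform()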