I am trying to select all elements of a dropdown.
The site I am testing on is: http://jenner.com/people
The dropdown (checkbox list) I am trying to access is the "locations" list.
I am using Python. I am getting the following error: Message: u'Element is not currently visible and so may not be interacted with'
The code I am using is:
from selenium import webdriver

url = "http://jenner.com/people"
driver = webdriver.Firefox()
driver.get(url)
page = driver.page_source
element = driver.find_element_by_xpath("//div[@class='filter offices']")
elements = element.find_elements_by_tag_name("input")
counter = 0
while counter <= len(elements) - 1:
    driver.get(url)
    element = driver.find_element_by_xpath("//div[@class='filter offices']")
    elements1 = element.find_elements_by_tag_name("input")
    elements1[counter].click()
    counter = counter + 1
I have tried a few variations, including clicking the initial element before clicking on the dropdown options, but that didn't work. Any ideas on how to make elements visible in Selenium? I have spent the last few hours searching for an answer online. I have seen a few posts about moving the mouse in Selenium, but haven't found a solution that works for me yet.
Thanks a lot.
The input checkboxes are not visible in the initial state; they become visible after you click the "filter offices" option. The class name also changes from "filter offices" to "filter offices open", as you can observe in Firebug. The code below works for me, but it is in Java. You should be able to work out the Python equivalent, as it contains only really basic code.
driver.get("http://jenner.com/people");
driver.findElement(By.xpath("//div[@class='filter offices']/div")).click();
Thread.sleep(2000L);
WebElement element = driver.findElement(By.xpath("//div[@class='filter offices open']"));
Thread.sleep(2000L);
List<WebElement> elements = element.findElements(By.tagName("input"));
for (int i = 0; i <= elements.size() - 1; i++) {
    elements.get(i).click();
    Thread.sleep(2000L);
    elements = element.findElements(By.tagName("input"));
}
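For reference, a rough Python translation of the Java above (an untested sketch that assumes the same class names and page structure; the fixed sleeps could be replaced with explicit waits in real code):

```python
import time
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://jenner.com/people")

# Click the dropdown header first so the checkboxes become visible.
driver.find_element_by_xpath("//div[@class='filter offices']/div").click()
time.sleep(2)

# Once open, the container's class changes to 'filter offices open'.
element = driver.find_element_by_xpath("//div[@class='filter offices open']")

inputs = element.find_elements_by_tag_name("input")
for i in range(len(inputs)):
    inputs[i].click()
    time.sleep(2)
    # Re-find the checkboxes, mirroring the Java, in case the DOM refreshed.
    inputs = element.find_elements_by_tag_name("input")
```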
I know this is an old question, but I came across it when looking for other information. I don't know if you were doing QA on the site to see if the proper cities were showing in the drop down, or if you were actually interacting with the site to get the list of people who should be at each location. (Side note: selecting a location then un-selecting it returns 0 results if you don't reset the filter - possibly not desired behavior.)
If you were trying to get a list of users at each location on this site, I would think it easier to not use Selenium. Here is a pretty simple solution to pull the people from the first city "Chicago." Of course, you could make a list of the cities that you are supposed to look for and sub them into the "data" variable by looping through the list.
import requests
from bs4 import BeautifulSoup
url = 'http://jenner.com/people/search'
data = 'utf8=%E2%9C%93&authenticity_token=%2BayQ8%2FyDPAtNNlHRn15Fi9w9OgXS12eNe8RZ8saTLmU%3D&search_scope=full_name' \
'&search%5Bfull_name%5D=&search%5Boffices%5D%5B%5D=Chicago'
r = requests.post(url, data=data)
soup = BeautifulSoup(r.content, 'html.parser')
people_results = soup.find_all('div', attrs={'class': 'name'})
for p in people_results:
    print(p.text)
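As suggested above, substituting cities into the payload could look like this sketch; the form field names are decoded from the captured payload string above, and the authenticity_token value is a placeholder that would need to be scraped fresh from the page:

```python
from urllib.parse import urlencode

def build_payload(city, token="PLACEHOLDER_TOKEN"):
    # Field names decoded from the captured payload above; the token
    # is a placeholder, not a real value.
    fields = [
        ("utf8", "\u2713"),                # the checkmark, encoded as %E2%9C%93
        ("authenticity_token", token),
        ("search_scope", "full_name"),
        ("search[full_name]", ""),
        ("search[offices][]", city),
    ]
    return urlencode(fields)

for city in ["Chicago", "New York", "Washington, DC"]:
    payload = build_payload(city)
    # r = requests.post(url, data=payload)  # then parse r.content as above
    print(payload)
```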
Scraping links should be a simple feat, usually just grabbing the href value of the a tag.
I recently came across this website (https://sunteccity.com.sg/promotions) where the href value of the a tags of each item cannot be found, but the redirection still works. I'm trying to figure out a way to grab the items and their corresponding links. My typical Python Selenium code looks something like this:
all_items = bot.find_elements_by_class_name('thumb-img')
for promo in all_items:
    a = promo.find_elements_by_tag_name("a")
    print("a[0]: ", a[0].get_attribute("href"))
However, I can't seem to retrieve any href or onclick attributes, and I'm wondering if this is even possible. I noticed that I couldn't do a right-click, open link in new tab either.
Are there any ways around getting the links of all these items?
Edit: Are there any ways to retrieve all the links of the items on the pages?
i.e.
https://sunteccity.com.sg/promotions/724
https://sunteccity.com.sg/promotions/731
https://sunteccity.com.sg/promotions/751
https://sunteccity.com.sg/promotions/752
https://sunteccity.com.sg/promotions/754
https://sunteccity.com.sg/promotions/280
...
Edit:
Adding an image of one such anchor tag for better clarity:
Reverse-engineering the JavaScript that takes you to the promotion pages (seen in https://sunteccity.com.sg/_nuxt/d4b648f.js) gives you a way to get all the links, which are based on the HappeningID. You can verify this by running the following in the JS console, which gives you the first promotion:
window.__NUXT__.state.Promotion.promotions[0].HappeningID
Based on that, you can create a Python loop to get all the promotions:
items = driver.execute_script("return window.__NUXT__.state.Promotion;")
for item in items["promotions"]:
    base = "https://sunteccity.com.sg/promotions/"
    happening_id = str(item["HappeningID"])
    print(base + happening_id)
That generated the following output:
https://sunteccity.com.sg/promotions/724
https://sunteccity.com.sg/promotions/731
https://sunteccity.com.sg/promotions/751
https://sunteccity.com.sg/promotions/752
https://sunteccity.com.sg/promotions/754
https://sunteccity.com.sg/promotions/280
https://sunteccity.com.sg/promotions/764
https://sunteccity.com.sg/promotions/766
https://sunteccity.com.sg/promotions/762
https://sunteccity.com.sg/promotions/767
https://sunteccity.com.sg/promotions/732
https://sunteccity.com.sg/promotions/733
https://sunteccity.com.sg/promotions/735
https://sunteccity.com.sg/promotions/736
https://sunteccity.com.sg/promotions/737
https://sunteccity.com.sg/promotions/738
https://sunteccity.com.sg/promotions/739
https://sunteccity.com.sg/promotions/740
https://sunteccity.com.sg/promotions/741
https://sunteccity.com.sg/promotions/742
https://sunteccity.com.sg/promotions/743
https://sunteccity.com.sg/promotions/744
https://sunteccity.com.sg/promotions/745
https://sunteccity.com.sg/promotions/746
https://sunteccity.com.sg/promotions/747
https://sunteccity.com.sg/promotions/748
https://sunteccity.com.sg/promotions/749
https://sunteccity.com.sg/promotions/750
https://sunteccity.com.sg/promotions/753
https://sunteccity.com.sg/promotions/755
https://sunteccity.com.sg/promotions/756
https://sunteccity.com.sg/promotions/757
https://sunteccity.com.sg/promotions/758
https://sunteccity.com.sg/promotions/759
https://sunteccity.com.sg/promotions/760
https://sunteccity.com.sg/promotions/761
https://sunteccity.com.sg/promotions/763
https://sunteccity.com.sg/promotions/765
https://sunteccity.com.sg/promotions/730
https://sunteccity.com.sg/promotions/734
https://sunteccity.com.sg/promotions/623
You are using the wrong locator; it matches a lot of irrelevant elements. Instead of find_elements_by_class_name('thumb-img'), please try find_elements_by_css_selector('.collections-page .thumb-img'), so your code will be:
all_items = bot.find_elements_by_css_selector('.collections-page .thumb-img')
for promo in all_items:
    a = promo.find_elements_by_tag_name("a")
    print("a[0]: ", a[0].get_attribute("href"))
You can also get the desired links directly with the .collections-page .thumb-img a locator, so your code could be:
links = bot.find_elements_by_css_selector('.collections-page .thumb-img a')
for link in links:
    print(link.get_attribute("href"))
I am attempting to get a list of games on
https://www.xbox.com/en-US/live/gold#gameswithgold
According to Firefox's dev console, it seems that I found the correct class: https://i.imgur.com/M6EpVDg.png
In fact, since there are 3 games, I am supposed to get a list of 3 objects with this code: https://pastebin.com/raw/PEDifvdX (the wait is so Selenium can load the page).
But in fact, Selenium says it does not exist: https://i.imgur.com/DqsIdk9.png
I do not get what I am doing wrong. I even tried css selectors like this
listOfGames = driver.find_element_by_css_selector("section.m-product-placement-item f-size-medium context-game gameDiv")
Still nothing. What am I doing wrong?
You are trying to get three different games, so you need to give a different element path for each, or you can use a loop like this one:
i = 1
while i < 4:
    link = f"//*[@id='ContentBlockList_11']/div[2]/section[{i}]/a/div/h3"
    listGames = str(driver.find_element_by_xpath(link).text)
    print(listGames)
    i += 1
You can use this kind of loop wherever there is a slight difference in the XPath, CSS selector, or class name: it iterates over the web elements one by one and gets the list of games. Since you are trying to get the names, you need .text, which gets you only the name and nothing else.
Another option with a selector that isn't looped over and changed-- also one that's less dependent on the page structure and a little easier to read:
//a[starts-with(@data-loc-link,'keyLinknowgame')]//h3
Here's sample code:
from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException

driver = webdriver.Chrome()
url = "https://www.xbox.com/en-US/live/gold#gameswithgold"
driver.get(url)
driver.implicitly_wait(10)

listOfGames = driver.find_elements_by_xpath("//a[starts-with(@data-loc-link,'keyLinknowgame')]//h3")
for game in listOfGames:
    try:
        print(game.text)
    except StaleElementReferenceException:
        pass
If you're after more than just the title, remove the //h3 selection:
//a[starts-with(@data-loc-link,'keyLinknowgame')]
And add whatever additional Xpath you want to narrow things down to the content/elements that you're after.
I sadly couldn't find any resources online for my problem. I'm trying to store elements found by XPath in a list and then loop over those elements to search within each one. But instead of searching within the given element, it seems that Selenium is always searching the whole page again.
Anyone with good knowledge about this? I've seen that:
// Selects nodes in the document from the current node that matches the selection no matter where they are
But I've also tried "/" and it didn't work either.
Instead of giving me the text for each div, it gives me the text from all divs.
My Code:
from selenium import webdriver

driver = webdriver.Chrome()
result_text = []

# I'm looking for all divs with a specific class and store them in a list
divs_found = driver.find_elements_by_xpath("//div[@class='a-fixed-right-grid-col a-col-left']")

# Here seems to be the problem: instead of searching within "divs_found[1]", it behaves like "driver" and looks at the whole site
hrefs_matching_in_div = divs_found[1].find_elements_by_xpath("//a[contains(@href, '/gp/product/')]")

# Now I'm looking in the found href matches to store the text from them
for href in hrefs_matching_in_div:
    result_text.append(href.text)

print(result_text)
You need to prefix the XPath with . so it is evaluated relative to the current element rather than the whole document. Try:
hrefs_matching_in_div = divs_found[1].find_elements_by_xpath(".//a[contains(@href, '/gp/product/')]")
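The scoping rule is easiest to see outside Selenium; here is a minimal sketch using Python's built-in ElementTree, where a leading . likewise restricts the search to the element (note that Selenium's full XPath engine additionally treats a leading // as document-wide, which is exactly the problem above):

```python
import xml.etree.ElementTree as ET

# Two divs, each containing one matching link.
doc = ET.fromstring(
    "<html>"
    "<div id='a'><a href='/gp/product/1'>one</a></div>"
    "<div id='b'><a href='/gp/product/2'>two</a></div>"
    "</html>"
)

divs = doc.findall(".//div")

# ".//a" evaluated on divs[1] only searches inside that div,
# so exactly one link is found.
links_in_second_div = divs[1].findall(".//a")
print(len(links_in_second_div))  # 1

# The same search from the root finds both links.
all_links = doc.findall(".//a")
print(len(all_links))  # 2
```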
I am new to programming and need some help with my web-crawler.
At the moment, I have my code opening up every web-page in the list. However, I wish to extract information from each one it loads. This is what I have.
from selenium import webdriver
import csv

driver = webdriver.Firefox()
driver.get("https://www.betexplorer.com/baseball/usa/mlb-2018/results/?stage=KvfZSOKj&month=all")

links_code = driver.find_elements_by_xpath('//a[@class="in-match"]')
first_two = links_code[0:2]
first_two_links = []
for i in first_two:
    link = i.get_attribute("href")
    first_two_links.append(link)

odds = []
for i in first_two_links:
    driver.get(i)
    o = driver.find_element_by_xpath('//span[@class="table-main__detail-odds--hasarchive"]')
    odds.append(o)
**Error:** NoSuchElementException: Message: Unable to locate element: //span[@class="table-main__detail-odds--hasarchive"]
I am just looking to scrape the first two links at the moment so that it is easier to manage. However, I can't seem to figure out a way around this error.
It seems to me as if the error indicates that it is searching for the XPath on the home page, rather than on the page the link leads to.
Any help is appreciated.
Please help. I am trying to fetch data from a website and then count the occurrences of certain text. Unfortunately, I cannot provide the actual website, but the basics are this.
The web page is loaded and I am presented with a list of values, which are located in the table (the code below reflects this). The page would look something like this.
Header
Table 1
A00001
A00002
A00003
A00004
......
A00500
Each of the above rows (A00001-A00500) represents a link that I need to click on. Furthermore, each of the links leads to a unique page that I need to extract information from.
I am using selenium to fetch the information and store it as variable data, as you can see in the code below. Here's my problem though- the number of links/rows that I will need to click on will depend on the timeframe that my user selects in the GUI. As you can see from my code, a time frame from 5/1/2011 to 5/30/2011 produces a list of 184 different links that I need to click on.
from selenium import selenium
import unittest, time, re

class Untitled(unittest.TestCase):
    def setUp(self):
        self.verificationErrors = []
        self.selenium = selenium("localhost", 4444, "*chrome", "https://www.example.com")
        self.selenium.start()

    def test_untitled(self):
        sel = self.selenium
        sel.open("https://www.example.com")
        sel.click("link=Reports")
        sel.wait_for_page_to_load("50000")
        sel.click("link=Cases")
        sel.wait_for_page_to_load("50000")
        sel.remove_selection("office", "label=")
        sel.add_selection("office", "label=San Diego")
        sel.remove_selection("chapter", "label=")
        sel.add_selection("chapter", "label=9")
        sel.add_selection("chapter", "label=11")
        sel.type("StartDate", "5/1/2011")
        sel.type("EndDate", "5/30/2011")
        sel.click("button1")
        sel.wait_for_page_to_load("30000")
        case_1 = sel.get_table("//div[@id='cmecfMainContent']/center[2]/table.1.0")
        case_2 = sel.get_table("//div[@id='cmecfMainContent']/center[2]/table.2.0")
        case_3 = sel.get_table("//div[@id='cmecfMainContent']/center[2]/table.184.0")

    def tearDown(self):
        self.selenium.stop()
        self.assertEqual([], self.verificationErrors)

if __name__ == "__main__":
    unittest.main()
I am confused about 2 things.
1) What is the best way to get selenium to click on ALL of the links on the page without knowing the number of links ahead of time? The only way I know how to do this is to have the user select the number of links in a GUI, which would be assigned to a variable, which could then be included in the following method:
number_of_links = input("How many links are on the page? ")
sel.get_table("//div[@id='cmecfMainContent']/center[2]/number_of_links")
2) I am also confused about how to count the occurrences of certain data that appear on the pages that the links lead to.
i.e.
A00001 leads to a page that contains the table value "Apples"
A00002 leads to a page that contains the table value "Oranges"
A00003 leads to a page that contains the table value "Apples"
I know selenium can store these as variables, but I'm unsure as to whether or not I can save these as a sequence type, with each additional occurrence being appended to the original list (or added to a dictionary), which could then be counted with the len() function.
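For the counting part, a minimal sketch with collections.Counter (the values here are hypothetical stand-ins for the scraped table text):

```python
from collections import Counter

# Hypothetical values scraped from the pages the links lead to.
scraped_values = ["Apples", "Oranges", "Apples"]

counts = Counter(scraped_values)
print(counts["Apples"])     # 2
print(counts["Oranges"])    # 1
print(len(scraped_values))  # 3 pages visited in total
```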
Thanks for your help
I'm not familiar with the Python API, so sorry for that, but in Java I know there is a function to get the number of occurrences of an XPath match. So you could write an XPath selector to find the elements you want, then get the number of occurrences of that path.
Then, to click each one, you can affix your XPath with an index selector like [1]: if your XPath was //somexpath/something, use //somexpath/something[1] to get the first.
Hope that helps.
Here's an example: I wrote a rough API in Java to do jQuery-like operations on collections of XPath matches. My constructor matches the XPath, gets the count, then creates a list of all matches so I can do things like .clickAll().
public SelquerySelector(String selector, Selenium selenium) {
    super("xpath=(" + selector + ")[" + 1 + "]", selenium);
    this.xpath = selector;
    this.selenium = selenium;
    // find out how many elements match
    this.length = selenium.getXpathCount(this.xpath).intValue();
    // make an array of selected elements
    for (int i = 2; i <= this.length; i++) {
        elements.add(new SelquerySelectedElement("xpath=(" + this.xpath + ")[" + i + "]", this.selenium));
    }
}
Here's the whole code in case you want to see it:
http://paste.zcd.me/Show.h7m1?id=8002
So I guess, to answer your question (without knowing how XPath matches the table), you can probably match //div[@id='cmecfMainContent']/center[2]/table and get the number of matches to find the total number of links, then loop over them. If you can't do that with XPath, keep assuming there is another link until you get an exception:
cases = []
for i in range(1, xpathmatchcount):
    cases.append(sel.get_table("//div[@id='cmecfMainContent']/center[2]/table." + str(i) + ".0"))