Using Selenium to grab specific information in Python

So I am quite new to using Selenium and thus quite unsure how to do this, or even how to word it, for that matter.
But what I am trying to do is use Selenium to grab the following values and then store them in a list.
Image provided of the Firefox inspector window, to show what I am trying to grab (highlighted):
https://i.stack.imgur.com/rHk9R.png

In Selenium, you access elements using the functions find_element(s)_by_xxx(), where xxx is, for example, the tag name, element name or class name (and more). The find_element_... functions return the first element that matches the argument, while find_elements_... returns all matching elements.
Selenium has good documentation; in the "Getting Started" section you can find several examples of basic usage.
As to your question, the following code should collect the values you want:
from selenium import webdriver

driver = webdriver.Firefox()  # driver for the browser you use
select_elem = driver.find_element_by_name('ctl00_Content...')  # full name of the element
options = select_elem.find_elements_by_tag_name('option')

values = []
for option in options:
    val = option.get_attribute('value')
    values.append(val)
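Note that on Selenium 4 the find_element_by_* helpers have been removed in favour of find_element(By..., ...). A minimal sketch of the same idea in that style, keeping the truncated element name as a placeholder:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
# 'ctl00_Content...' is a placeholder; use the full name attribute of your <select>
select_elem = driver.find_element(By.NAME, 'ctl00_Content...')
options = select_elem.find_elements(By.TAG_NAME, 'option')
values = [option.get_attribute('value') for option in options]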

Related

Detect all names and get their link with Selenium Python

I want to make a search system: when we enter a word in a variable, it searches among all the link names of this page (all the games), a little like Ctrl+F, and displays the results (names + links) using Selenium (Python).
I don't know how to make a system like that! If you can help, that would be great!
Have a nice code!
You are attempting to locate specific elements on a page and then sort through them for a key search term. Selenium can identify elements on a page through a number of methods; see the documentation for a guide. Once you have located all the elements, you can filter them for the search term of interest.
Finding ALL the elements of interest:
I would use the XPath of your elements to find them on the page and build a list that you can then search through based on your keyword. In your case they are all identifiable by this XPath:
//div[@class="blog-content"]//a
Extracting the required information:
Once you have the list of elements, you will need to iterate over them to extract the href attribute (the game's URL) and the innerHTML text (the name of the game).
I have used a dictionary comprehension in the example below to do this, which creates a dictionary {url: name, ...} that you can filter your specific items from.
Example Code:
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.firefox import GeckoDriverManager

website_url = 'https://steamunlocked.net/all-games-2/'
game_xpaths = '//div[@class="blog-content"]//a'

driver = webdriver.Firefox(service=Service(GeckoDriverManager().install()))
driver.get(website_url)

game_elements = driver.find_elements(By.XPATH, game_xpaths)
games = {g.get_attribute('href'): g.get_attribute('innerHTML') for g in game_elements}
games
"""
Outputs:
{'https://steamunlocked.net/red-tether-free-download/': '—Red—Tether–> Free Download (v1.006)',
'https://steamunlocked.net/hack-g-u-last-recode-free-download/': '.hack//G.U. Last Recode Free Download (v1.01)',
'https://steamunlocked.net/n-verlore-verstand-free-download/': '‘n Verlore Verstand Free Download',
'https://steamunlocked.net/0-n-0-w-free-download/': '0°N 0°W Free Download',
'https://steamunlocked.net/007-legends-free-download/': '007 Legends Free Download', ...
"""
Finding SPECIFIC items (i.e. CTRL+F):
To identify and filter only specific items, search your dictionary for the occurrence of the word/string you are interested in:
def search(myDict, search_term):
    return [[v, k] for k, v in myDict.items() if search_term.lower() in v.lower()]
>>> search(games, 'Ninja')
[['10 Second Ninja Free Download','https://steamunlocked.net/10-second-ninja-free-download/'],
['10 Second Ninja X Free Download','https://steamunlocked.net/10-second-ninja-x-free-download/']]

Selenium only finding certain elements in Python

I'm having some trouble finding elements with Selenium in Python. It works fine for every element on all the other websites I have tested, yet on a game website it can only find certain elements.
Here is the code I'm using:
from selenium import webdriver
import time
driver = webdriver.Chrome("./chromedriver")
driver.get("https://www.jklm.fun")
passSelf = input("Press enter when in game...")
time.sleep(1)
syllable = driver.find_element_by_xpath("/html/body/div[2]/div[2]/div[2]/div[2]/div").text
print(syllable)
Upon running the code, the element /html/body/div[2]/div[2]/div[2]/div[2]/div isn't found. In the image you can see the element it is trying to find:
(Image: the element the code is trying to find)
However, running the same code but replacing the XPath with something outside of the main game (for example the room code in the top right), it successfully finds the element:
(Image: output of the code being run on a different element)
I've tried using the class name, name, selector and XPath to find the original element, but to no avail. The only things I can think of that might be affecting it are:
The elements are changing periodically (not sure if this affects it)
The elements are in the "canvas area" and it is somehow blocking them.
I'm not certain whether these things matter, as I'm new to using Selenium; any help is appreciated. The website the game is on is https://www.jklm.fun/ if you want to have a look through the elements.
The element you are trying to access is inside an iframe. Switch to the frame first, like this:
driver.switch_to.frame(driver.find_element_by_xpath("//div[@class='game']/iframe[contains(@src,'jklm.fun')]"))
driver.get("https://jklm.fun/JXUS")
WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.XPATH, "//button[#class='styled']"))).click()
time.sleep(10)
driver.switch_to.frame(0)
while True:
Get_Text = driver.find_element_by_xpath("//div[#class='round']").text
print(Get_Text)
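If you also need elements outside the game frame later (for example the room code in the top right), you can switch back out with driver.switch_to.default_content(). A minimal sketch combining the question's script with the iframe fix (Selenium 4 style; the iframe and round XPaths come from the answer above):
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.jklm.fun")
input("Press enter when in game...")

# the game UI lives inside an iframe, so switch into it before locating game elements
driver.switch_to.frame(driver.find_element(By.XPATH, "//div[@class='game']/iframe[contains(@src,'jklm.fun')]"))
print(driver.find_element(By.XPATH, "//div[@class='round']").text)

# switch back to the top-level document to reach elements outside the game frame
driver.switch_to.default_content()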

Selenium not finding list of sections with classes?

I am attempting to get a list of games on
https://www.xbox.com/en-US/live/gold#gameswithgold
According to Firefox's dev console, it seems that I found the correct class: https://i.imgur.com/M6EpVDg.png
In fact, since there are 3 games, I am supposed to get a list of 3 objects with this code: https://pastebin.com/raw/PEDifvdX (the wait is so Selenium can load the page)
But in fact, Selenium says it does not exist: https://i.imgur.com/DqsIdk9.png
I do not get what I am doing wrong. I even tried css selectors like this
listOfGames = driver.find_element_by_css_selector("section.m-product-placement-item f-size-medium context-game gameDiv")
Still nothing. What am I doing wrong?
You are trying to get three different games, so you need to give a different element path for each, or you can use some sort of loop like this one:
i = 1
while i < 4:
    link = f"//*[@id='ContentBlockList_11']/div[2]/section[{i}]/a/div/h3"
    listGames = str(driver.find_element_by_xpath(link).text)
    print(listGames)
    i += 1
You can use this kind of loop in places where there is only a slight difference in the XPath, CSS selector or class; it will loop over the web elements one by one and get the list of games.
As you are trying to get the names, you need .text, which will get you only the name and nothing else.
Another option is a selector that isn't looped over and changed, one that is less dependent on the page structure and a little easier to read:
//a[starts-with(@data-loc-link,'keyLinknowgame')]//h3
Here's sample code:
from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException

driver = webdriver.Chrome()
url = "https://www.xbox.com/en-US/live/gold#gameswithgold"
driver.get(url)
driver.implicitly_wait(10)

listOfGames = driver.find_elements_by_xpath("//a[starts-with(@data-loc-link,'keyLinknowgame')]//h3")
for game in listOfGames:
    try:
        print(game.text)
    except StaleElementReferenceException:
        pass
If you're after more than just the title, remove the //h3 selection:
//a[starts-with(@data-loc-link,'keyLinknowgame')]
and add whatever additional XPath you need to narrow things down to the content/elements that you're after.
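For instance, a short sketch, continuing from the sample code above and reusing its driver, that keeps the whole anchor and pulls out both the link and the title (the nested //h3 for the title is the same structure assumed by the selectors above):
anchors = driver.find_elements_by_xpath("//a[starts-with(@data-loc-link,'keyLinknowgame')]")
games = []
for a in anchors:
    games.append({
        'url': a.get_attribute('href'),                   # link target of the anchor
        'title': a.find_element_by_xpath('.//h3').text,   # title text from the nested <h3>
    })
print(games)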

Selenium Python - Store XPath in var and extract deeper-hierarchy XPath from var

I sadly couldn't find any resources online for my problem. I'm trying to store elements found by XPath in a list and then loop over the elements in that list to search within each object. But instead of searching within the given object, it seems that Selenium is always looking at the whole site again.
Anyone with good knowledge about this? I've seen that:
// selects nodes in the document from the current node that match the selection, no matter where they are
But I've also tried "/" and it didn't work either.
Instead of giving me the text for each div, it gives me the text from all divs.
My Code:
from selenium import webdriver

driver = webdriver.Chrome()
result_text = []

# I'm looking for all divs with a specific class and store them in a list
divs_found = driver.find_elements_by_xpath("//div[@class='a-fixed-right-grid-col a-col-left']")

# Here seems to be the problem: instead of searching within "divs_found[1]",
# it behaves like "driver" and looks at the whole site
hrefs_matching_in_div = divs_found[1].find_elements_by_xpath("//a[contains(@href, '/gp/product/')]")

# Now I'm looking in the found href matches to store the text from them
for href in hrefs_matching_in_div:
    result_text.append(href.text)

print(result_text)
You need to prefix the XPath with . so that it is evaluated relative to the current element rather than the whole document. Try now:
hrefs_matching_in_div = divs_found[1].find_elements_by_xpath(".//a[contains(@href, '/gp/product/')]")
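A minimal sketch of the corrected loop with the relative XPath in place, using the same class name and href pattern as the question and assuming the page is already loaded in driver:
result_text = []
divs_found = driver.find_elements_by_xpath("//div[@class='a-fixed-right-grid-col a-col-left']")
for div in divs_found:
    # the leading "." scopes the search to this particular div instead of the whole page
    for link in div.find_elements_by_xpath(".//a[contains(@href, '/gp/product/')]"):
        result_text.append(link.text)
print(result_text)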

Looping through drop downs using selenium in python

I am trying to simulate clicking through multiple options on an online data tool that ends with downloading an Excel sheet given your filters.
I am currently using Selenium and identifying XPaths.
I am able to get through a single iteration and get a single Excel sheet, but I need to do it for every possible permutation of drop-down choices. Doing it by hand is unrealistic, as there are thousands of options.
The website for context: https://data.cms.gov/mapping-medicare-disparities
Does anyone know of a function in Selenium that will work?
My current strategy is to create lists with the XPaths and then use a permutation function to get all the combinations. However, this has not worked because b.find_element_by_xpath only accepts one XPath at a time.
Examples of lists (geography: county, state/territory):
G1 = '//select[@id="geography"]//option[@value="c"]'
G2 = '//select[@id="geography"]//option[@value="s"]'
Geo = [G1, G2]
Creating a pool of combinations:
from itertools import product

for perm in product(Geo, Adjust, Analysis, Domain):
    print(perm)
actual code to use selenium
**from** selenium **import** webdriver
**from** selenium.webdriver.common.keys **import** Keys
b = webdriver.Firefox()
code to click through a popup
pop_up = b.find_element_by_xpath('/html/body/div[1]/button')
pop_up.click()
Code trying to use XPath to select all options at once:
b.find_element_by_xpath(('//select[@id="geography"]//option[@value="c"]',
                         '//select[@id="adjust"]//option[@value="1"]',
                         '//select[@id="analysis"]//option[@value="base"]',
                         '//select[@id="domain"]//option[@value="d1"]'))
Error message: InvalidArgumentException: Message: invalid type: sequence, expected a string at line 1 column 28
This is because find_element_by_xpath (I am assuming) will only look at one XPath at a time.
Your syntax in the "code trying to use XPath to select all options at once" section is wrong anyway, but you could just put all the XPaths in a list and loop through it:
xpathlist = ['//select[@id="geography"]//option[@value="c"]', '//select[@id="adjust"]//option[@value="1"]', .....]
for xp in xpathlist:
    b.find_element_by_xpath(xp)
    # then add code to click or download or whatever
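To actually walk every permutation, you can combine that loop with itertools.product from the question and Selenium's Select helper. A sketch under stated assumptions: the select ids (geography, adjust, analysis, domain) and option values come from the question, while the download-button XPath is a hypothetical placeholder you would need to replace with the real locator:
from itertools import product

from selenium import webdriver
from selenium.webdriver.support.ui import Select

b = webdriver.Firefox()
b.get('https://data.cms.gov/mapping-medicare-disparities')

# option values per drop-down (ids and values taken from the question; extend as needed)
choices = {
    'geography': ['c', 's'],
    'adjust': ['1'],
    'analysis': ['base'],
    'domain': ['d1'],
}

# every permutation of one value per drop-down
for combo in product(*choices.values()):
    for select_id, value in zip(choices.keys(), combo):
        Select(b.find_element_by_xpath(f'//select[@id="{select_id}"]')).select_by_value(value)
    # hypothetical download button; replace with the page's real locator
    b.find_element_by_xpath('//button[@id="download"]').click()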
