I am trying to get a specific element (Minimum Amount) with Selenium, but it returns an empty string:
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
browser = webdriver.Firefox(options=options)
browser.get('https://www.huobi.com/en-us/trade-rules/exchange')
time.sleep(5)
name = browser.find_element(by=By.CSS_SELECTOR, value='.dt-wrap+ .exchange-item span:nth-child(4)')
print(name.text)  # returns an empty string
How can I do it with Selenium or BeautifulSoup?
The data is populated from an external source via an API, so you can easily pull everything you need using only the requests module.
Example:
import requests

api_url = 'https://www.huobi.com/-/x/pro/v2/beta/common/symbols?r=mhzzzd&x-b3-traceid=6c1acdfbf0a19d63cc05c62de990a55c'
req = requests.get(api_url).json()

for item in req['data']:
    print(item['display_name'])
Output:
REN/HUSD
NKN/HT
BRWL/USDT
NSURE/BTC
ITC/USDT
SPA/USDT
CTC/USDT
EVX/BTC
EUL/USDT
USTC/USDT
SUKU/USDT
KAN/BTC
NFT/USDC
LOOKS/USDT
IOI/USDT
DORA/USDT
BAT/USDT
QSP/ETH
WXT/USDT
RING/ETH
NEAR/ETH
SWFTC/BTC
LINK/HUSD
RUFF/USDT
EFI/USDT
DIO/USDT
AVAX/USDC
GSC/ETH
RAD/BTC
INSUR/USDT
NODL/USDT
H2O/USDT
BTC/HUSD
FIRO/ETH
KCASH/BTC
XPNT/USDT
STPT/BTC
XCN/USDT
ETC/BTC
OCN/ETH
BTC/EUR
MAN/BTC
OP/USDC
OXT/BTC
DASH/USDT
KSM/USDT
SD/USDT
YGG/BTC
... so on
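If the symbols payload also carries per-pair minimum amounts, the same loop can collect them. The field name 'min-order-amt' below is hypothetical (inspect the real response for the actual key), so this sketch runs on a local stand-in for the JSON:

```python
# Sketch: mapping each pair to its minimum amount from the payload.
# 'min-order-amt' is an assumed key, not confirmed from the live API.
sample = {  # stands in for requests.get(api_url).json()
    "data": [
        {"display_name": "BTC/USDT", "min-order-amt": 0.0001},
        {"display_name": "ETH/USDT", "min-order-amt": 0.001},
    ]
}

minimums = {item["display_name"]: item["min-order-amt"] for item in sample["data"]}
print(minimums["BTC/USDT"])  # 0.0001
```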
I think your CSS_SELECTOR is wrong. Maybe grab the elements as a list and take the one you want?
Something like:
exchange_items: list = driver.find_elements(By.XPATH, '//div[@class="exchange-item"]')
target = exchange_items[3]
print(target.text)
Here we take all the items and choose the fourth one (index 3).
In that specific case, use .get_attribute('innerText') or .get_attribute('innerHTML') to reach your goal with Selenium:
browser.find_element(By.CSS_SELECTOR, '.exchange-item span:nth-of-type(4)').get_attribute('innerText')
I am trying to extract all available elements for the XPath. I tried with find_element and find_elements and can't seem to make it work: with find_element it's alright but it only returns the first result, and with find_elements it gives me an error: "AttributeError: 'list' object has no attribute 'find_elements'".
My code :
from selenium.webdriver.common.by import By
from selenium import webdriver

url = 'https://automira.ro/dealeri-autorizati/lista'
PATH = 'C:\\Users\\czoca\\AppData\\Roaming\\Microsoft\\Windows\\Start Menu\\Programs\\Python 3.6\\chromedriver.exe'
driver = webdriver.Chrome(PATH)
driver.get(url)
driver.maximize_window()  # maximize the window
driver.implicitly_wait(100)  # implicit wait of up to 100 seconds
dealers = driver.find_elements(By.XPATH, '/html/body/div[4]/div/div[3]/div/div[1]')
for dealer in dealers:
    name = dealer.find_elements(By.XPATH, "/html/body/div[4]/div/div[3]/div/div[1]/div/div/h4/a").text
    email = dealer.find_elements(By.XPATH, '/html/body/div[4]/div/div[3]/div/div[2]/div/div/div[3]/a').text
    phone = dealer.find_elements(By.XPATH, '/html/body/div[4]/div/div[3]/div/div[2]/div/div/div[2]/a').text
    print(name, email, phone)
Any ideas?
Thanks!
The find_elements method returns a list object; you can iterate over the list to get each element you need:
for n in name:
    print(n.text)
In your code, dealers is a list of WebElements, so find_elements is fine there. But inside the for loop, dealer is a single WebElement per iteration, so you have to use find_element:
dealers = driver.find_elements(By.XPATH, '/html/body/div[4]/div/div[3]/div/div[1]')
for dealer in dealers:
    # use 'find_element' for name, email and phone
    name = dealer.find_element(By.XPATH, "/html/body/div[4]/div/div[3]/div/div[1]/div/div/h4/a").text
    email = dealer.find_element(By.XPATH, '/html/body/div[4]/div/div[3]/div/div[2]/div/div/div[3]/a').text
    phone = dealer.find_element(By.XPATH, '/html/body/div[4]/div/div[3]/div/div[2]/div/div/div[2]/a').text
    print(name, email, phone)
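As a side note, the absolute XPaths above are brittle; the usual pattern is to search relative to each container element (a leading `.` in the XPath, e.g. `.//h4/a`). That scoping idea can be illustrated with the standard library's ElementTree on a toy document; treat this as an analogy, not the Selenium API:

```python
import xml.etree.ElementTree as ET

# Each .//h4/a search is scoped to one dealer container, mirroring
# dealer.find_element(By.XPATH, './/h4/a') in Selenium.
html = """
<root>
  <div class="dealer"><h4><a>Dealer One</a></h4></div>
  <div class="dealer"><h4><a>Dealer Two</a></h4></div>
</root>
"""
root = ET.fromstring(html)
names = [div.find(".//h4/a").text for div in root.findall(".//div")]
print(names)  # ['Dealer One', 'Dealer Two']
```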
I want to extract text from multiple pages. Currently I am able to extract data from the first page, but I want to follow the pagination and append the data from the other pages. I have written this simple code which extracts data from the first page; I am not able to extract data from the remaining pages, whose number is dynamic.
element_list = []
opts = webdriver.ChromeOptions()
opts.headless = True
driver = webdriver.Chrome(ChromeDriverManager().install())
base_url = "XYZ"
driver.maximize_window()
driver.get(base_url)
driver.set_page_load_timeout(50)
element = WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.ID, 'all-my-groups')))
l = []
l = driver.find_elements_by_xpath("//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]")
for i in l:
    print(i.text)
I have shared images of the class in case that helps with the pagination. If we could automate extraction across all the pages, that would be awesome. Also, I am new, so please pardon me for asking silly questions. Thanks in advance.
You have only provided the code for the previous-page button. I guess you need to go to the next page until no next page exists. As I don't know which site we are talking about, I can only guess its behavior; I'm assuming the 'next' button disappears when there is no next page. If so, it can be done like this:
element_list = []
opts = webdriver.ChromeOptions()
opts.headless = True
driver = webdriver.Chrome(ChromeDriverManager().install())
base_url = "XYZ"
driver.maximize_window()
driver.get(base_url)
driver.set_page_load_timeout(50)
element = WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.ID, 'all-my-groups')))
l = []
l = driver.find_elements_by_xpath("//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]")
while True:
    try:
        next_page = driver.find_element(By.XPATH, '//button[@label="Next page"]')
    except NoSuchElementException:
        break
    next_page.click()
    l.extend(driver.find_elements(By.XPATH, "//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]"))
for i in l:
    print(i.text)
To be able to catch the exception this import has to be added:
from selenium.common.exceptions import NoSuchElementException
Also note that the method find_elements_by_xpath is deprecated and it would be better to replace this line:
l = driver.find_elements_by_xpath("//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]")
by this one:
l = driver.find_elements(By.XPATH, "//div[contains(@class, 'alias-wrapper sim-ellipsis sim-list--shortId')]")
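The "click next until it disappears" control flow can also be factored out of the browser calls, which makes it testable on its own. A minimal sketch, with the Selenium interactions replaced by plain callables (the fakes below stand in for the driver):

```python
# Sketch: the pagination loop, with the browser interaction abstracted
# away so the control flow can be exercised without Selenium.
def scrape_all_pages(get_items, go_next):
    """get_items() returns the current page's items; go_next() returns
    False when there is no next page (NoSuchElementException in Selenium)."""
    items = list(get_items())
    while go_next():
        items.extend(get_items())
    return items

# Fake three-page site standing in for the driver calls.
pages = [["a", "b"], ["c"], ["d", "e"]]
state = {"i": 0}

def fake_items():
    return pages[state["i"]]

def fake_next():
    if state["i"] + 1 < len(pages):
        state["i"] += 1
        return True
    return False

all_items = scrape_all_pages(fake_items, fake_next)
print(all_items)  # ['a', 'b', 'c', 'd', 'e']
```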
I want to get all the links in the class tag, like in the image below.
for a in driver.find_elements_by_xpath('/html/body/div[2]/div[2]/div[2]/div/div[2]/div[2]/div/div[1]/div[1]/div/div'):
    url_video = a.get_property('href')
    print(url_video)
I get None as the result. I used the 'a' tag to get all the links, but I just want the links in the specified class. Please help me.
This is my code:
from selenium import webdriver
import time

browser = webdriver.Chrome()
time.sleep(6)
elements = browser.find_elements_by_xpath('/html/body/div[2]/div[2]/div[2]/div/div[2]/div[2]/div/div[1]/div[1]/div/div')
for element in elements:
    videoUrl = element.get_attribute('href')
    print(videoUrl)
----> The result is None
.find_elements_by_xpath() returns a list of WebElements; iterate over the list and read each element's href with .get_attribute('href'). Note that the XPath must target the <a> tags themselves, otherwise there is no href attribute to read:
elements = driver.find_elements_by_xpath('...')
for element in elements:
    videoUrl = element.get_attribute('href')
    print(videoUrl)
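If the links are already present in the static HTML, you don't necessarily need a browser at all. A sketch using only the standard library's html.parser to collect href attributes inside a given class (the class name and markup below are made up for illustration):

```python
from html.parser import HTMLParser

# Sketch: collect hrefs from <a> tags nested inside a div with a
# target class, tracking div nesting depth to know when we leave it.
class LinkCollector(HTMLParser):
    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0          # >0 while inside the target container
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth:
                self.depth += 1
            elif self.target_class in attrs.get("class", "").split():
                self.depth = 1
        if self.depth and tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

html = '''
<div class="video-feed">
  <div class="item"><a href="/video/1">one</a></div>
  <div class="item"><a href="/video/2">two</a></div>
</div>
<a href="/elsewhere">outside</a>
'''
p = LinkCollector("video-feed")
p.feed(html)
print(p.links)  # ['/video/1', '/video/2']
```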
I'm trying to get a value from a class name, but the only thing I could get till now is a [] output.
So what am I supposed to do in the following code?
from selenium import webdriver
import time
options = webdriver.ChromeOptions()
options.add_argument('lang=pt-br')
driver = webdriver.Chrome(
    executable_path=r'./chromedriver.exe', options=options)
driver.get('https://economia.uol.com.br/cotacoes/cambio/')
time.sleep(5)
dolar = driver.find_elements_by_class_name('currency2')
time.sleep(5)
print(dolar)
You don't need Selenium to get that information; try:
import requests
u = "https://api.cotacoes.uol.com/currency/intraday/list/?format=JSON&fields=bidvalue,askvalue,maxbid,minbid,variationbid,variationpercentbid,openbidvalue,date&currency=1"
j = requests.get(u).json()
dolar = j['docs'][0]['bidvalue']
# 5.5916
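For reference, navigating that payload shape can be sketched on a local sample that mimics the response (the values below are illustrative, not live quotes):

```python
# Sketch: stands in for j = requests.get(u).json(); docs[0] is the
# most recent quote, per the code above.
j = {
    "docs": [
        {"bidvalue": 5.5916, "askvalue": 5.5922, "variationpercentbid": -0.31},
        {"bidvalue": 5.6001, "askvalue": 5.6010, "variationpercentbid": 0.12},
    ]
}
latest = j["docs"][0]
dolar = latest["bidvalue"]
variation = latest["variationpercentbid"]
print(dolar, variation)  # 5.5916 -0.31
```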
Notes:
If you need other info, like the daily variation (variationpercentbid), search the JSON for that key.
Change the currency value at the end of the URL for a different currency; for example, currency=5 will give you the EUR values.
You can use get_attribute to retrieve the value of a web element. Also, in your code there is no element with class name currency2; the page has an input with the name currency2 instead.
Please find an example below:
<input class="field normal" name="currency2" value="5,59" data-audience-click='{"reference":"ativar-campo-texto","component":"currency-converter"}' xpath="1">
Code to retrieve the value:
driver.get('https://economia.uol.com.br/cotacoes/cambio/')
currency = driver.find_element_by_name('currency2')
print(currency.get_attribute("value"))
Output:
5,59
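Since the page formats the number Brazilian-style ("5,59"), you would still need to normalize the decimal comma before doing arithmetic. A sketch using only the standard library, fed the HTML snippet shown above:

```python
from html.parser import HTMLParser

# Sketch: read the value attribute of the currency2 input from raw
# HTML and convert the "5,59" string to a float.
class InputValue(HTMLParser):
    def __init__(self, name):
        super().__init__()
        self.name = name
        self.value = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name") == self.name:
            self.value = attrs.get("value")

p = InputValue("currency2")
p.feed('<input class="field normal" name="currency2" value="5,59">')
rate = float(p.value.replace(",", "."))
print(rate)  # 5.59
```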
I am trying to scrape data from a website that has a multilevel drop-down menu; every time an item is selected, it changes the sub-items of the dependent drop-downs.
The problem is that on every loop iteration it extracts the same sub-items from the drop-downs: the selection happens, but the items are not updated to reflect the new selection made in the loop.
Can anyone help me understand why I am not getting the desired results?
Perhaps this is because my drop-down list is populated by JavaScript or something.
For instance, like the menu in the picture below.
I have gone this far:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
import csv
#from selenium.webdriver.support import Select
import time
print("opening chrome....")
driver = webdriver.Chrome()
driver.get('https://www.wheelmax.com/')
time.sleep(10)
csvData = ['Year', 'Make', 'Model', 'Body', 'Submodel', 'Size']
# variables
yeart = []
make= []
model=[]
body = []
submodel = []
size = []
Yindex = Mkindex = Mdindex = Bdindex = Smindex = Sindex = 0
print ("waiting for program to set variables....")
time.sleep(20)
print ("initializing and setting variables....")
# initializing Year
Year = Select(driver.find_element_by_id("icm-years-select"))
Year.select_by_value('2020')
yr = driver.find_elements(By.XPATH, '//*[#id="icm-years-select"]')
time.sleep(15)
# initializing Make
Make = Select(driver.find_element_by_id("icm-makes-select"))
Make.select_by_index(1)
mk = driver.find_elements(By.XPATH, '//*[#id="icm-makes-select"]')
time.sleep(15)
# initializing Model
Model = Select(driver.find_element_by_id("icm-models-select"))
Model.select_by_index(1)
mdl = driver.find_elements(By.XPATH, '//*[#id="icm-models-select"]')
time.sleep(15)
# initializing body
Body = Select(driver.find_element_by_id("icm-drivebodies-select"))
Body.select_by_index(1)
bdy = driver.find_elements(By.XPATH, '//*[#id="icm-drivebodies-select"]')
time.sleep(15)
# initializing submodel
Submodel = Select(driver.find_element_by_id("icm-submodels-select"))
Submodel.select_by_index(1)
sbm = driver.find_elements(By.XPATH, '//*[#id="icm-submodels-select"]')
time.sleep(15)
# initializing size
Size = Select(driver.find_element_by_id("icm-sizes-select"))
Size.select_by_index(0)
siz = driver.find_elements(By.XPATH, '//*[#id="icm-sizes-select"]')
time.sleep(5)
Cyr = Cmk = Cmd = Cbd = Csmd = Csz = ""
print("fetching data from variables....")
for y in yr:
    obj1 = driver.find_element_by_id("icm-years-select")
    Year = Select(obj1)
    Year.select_by_index(++Yindex)
    obj1.click()
    #obj1.click()
    yeart.append(y.text)
    Cyr = y.text
    time.sleep(10)
    for m in mk:
        obj2 = driver.find_element_by_id("icm-makes-select")
        Make = Select(obj2)
        Make.select_by_index(++Mkindex)
        obj2.click()
        #obj2.click()
        make.append(m.text)
        Cmk = m.text
        time.sleep(10)
        for md in mdl:
            Mdindex = 0
            obj3 = driver.find_element_by_id("icm-models-select")
            Model = Select(obj3)
            Model.select_by_index(++Mdindex)
            obj3.click()
            #obj3.click(clickobj)
            model.append(md.text)
            Cmd = md.text
            time.sleep(10)
            Bdindex = 0
            for bd in bdy:
                obj4 = driver.find_element_by_id("icm-drivebodies-select")
                Body = Select(obj4)
                Body.select_by_index(++Bdindex)
                obj4.click()
                #obj4.click(clickobj2)
                body.append(bd.text)
                Cbd = bd.text
                time.sleep(10)
                Smindex = 0
                for sm in sbm:
                    obj5 = driver.find_element_by_id("icm-submodels-select")
                    Submodel = Select(obj5)
                    obj5.click()
                    Submodel.select_by_index(++Smindex)
                    #obj5.click(clickobj5)
                    submodel.append(sm.text)
                    Csmd = sm.text
                    time.sleep(10)
                    Sindex = 0
                    for sz in siz:
                        Size = Select(driver.find_element_by_id("icm-sizes-select"))
                        Size.select_by_index(++Sindex)
                        size.append(sz.text)
                        Csz = sz.text
                        csvData += [Cyr, Cmk, Cmd, Cbd, Csmd, Csz]
https://www.wheelmax.com has multilevel drop-down menus that depend on each other: for example, once you pick an option in the Select Year drop-down, the Select Make drop-down is enabled and shows options based on the selected year.
So basically you need Selenium to handle the dynamic options.
Install the Selenium web driver matching your browser.
Download chrome web driver :
http://chromedriver.chromium.org/downloads
Install web driver for chrome browser:
unzip ~/Downloads/chromedriver_linux64.zip -d ~/Downloads
chmod +x ~/Downloads/chromedriver
sudo mv -f ~/Downloads/chromedriver /usr/local/share/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/local/bin/chromedriver
sudo ln -s /usr/local/share/chromedriver /usr/bin/chromedriver
selenium tutorial
https://selenium-python.readthedocs.io/
Eg. using selenium to select multiple dropdown options
from selenium import webdriver
from selenium.webdriver.support.ui import Select
import time
driver = webdriver.Chrome()
driver.get('https://www.wheelmax.com/')
time.sleep(4)
selectYear = Select(driver.find_element_by_id("icm-years-select"))
selectYear.select_by_value('2019')
time.sleep(2)
selectMakes = Select(driver.find_element_by_id("icm-makes-select"))
selectMakes.select_by_value('58')
Update:
To list the drop-down option values or count the total number of options:
for option in selectYear.options:
    print(option.text)

print(len(selectYear.options))
See more:
How to extract data from a dropdown menu using python beautifulsoup
The page does a callback to populate the years. Simply mimic that.
If you actually need to change years and select from the dependent drop-downs, which becomes a different question, you need browser automation (e.g. Selenium), or you can perform the steps manually and inspect the network tab to see if there is an XHR request you can mimic to submit your choices.
import requests
r = requests.get('https://www.iconfigurators.com/json2/?returnType=json&bypass=true&id=13898&callback=yearObj').json()
years = [item['year'] for item in r['years']]
print(years)
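The same extraction can be checked offline on a stand-in payload with the shape the code above expects (the years listed are illustrative):

```python
# Sketch: stands in for r = requests.get(...).json(); the 'years' key
# holds a list of {'year': ...} objects, per the code above.
r = {"years": [{"year": 2022}, {"year": 2021}, {"year": 2020}]}
years = [item["year"] for item in r["years"]]
print(years)  # [2022, 2021, 2020]
```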
I guess the reason you can't parse the years with Beautiful Soup is that the select tag containing the option tags with all the years is not yet present (or is hidden) at the moment Beautiful Soup downloads the page; it is added to the DOM by additional JavaScript, I assume. If you look at the DOM of the loaded page using your browser's developer tools (for example F12 in Mozilla), you'll see that the tag containing the information you are looking for is <select id="icm-years-select">. If you try to parse for this tag in the document downloaded with Beautiful Soup, you get an empty list of tag objects:
from bs4 import BeautifulSoup
from requests import get

response = get('https://www.wheelmax.com/')
yourSoup = BeautifulSoup(response.text, "lxml")
print(len(yourSoup.select('div #vehicle-search')))  # length = 1 -> visible
print()
print(len(yourSoup.select('#icm-years-select')))  # length = 0 -> not visible
So if you want to get the years using Python by all means, you might try to click on the respective tag and then parse again, using some combination of requests, Beautiful Soup, and the selenium module, which will require a bit more digging :-)
Otherwise, if you just quickly need the years parsed, use JavaScript:
countYears = document.getElementById('icm-years-select').length;
yearArray = [];
for (i = 0; i < countYears; i++) {
  yearArray.push(document.getElementById('icm-years-select')[i].value);
}