I hope everyone is having a good day. I am trying to extract values from a website and print them out as a list, but I can't figure out how to do that. I have all the values printing as expected; I just can't figure out how to have them print one after another. I know this is a very basic question, but I can't figure it out. Any advice or information is appreciated! Thank you!
import time
from selenium import webdriver as wd
from selenium.webdriver.common.by import By

webdriver = wd.Chrome(executable_path=r"C:\Users\Stephanie\anaconda3\pkgs\python-chromedriver-binary-98.0.4758.48.0-py39hcbf5309_0\Lib\site-packages\chromedriver_binary\chromedriver.exe")
webdriver.implicitly_wait(1)
webdriver.maximize_window()
webdriver.get("https://pcpartpicker.com/user/stephwaters/saved/#view=HgH2xr")
time.sleep(2)

partname = webdriver.find_elements(By.CLASS_NAME, 'td__component')
for part in partname:
    print(part.text + ': ')

prices = webdriver.find_elements(By.CLASS_NAME, 'td__price')
for price in prices:
    print(price.text)
This is the output: all of the part names print first, followed by all of the prices. I would like it to print:
Case: $168.99
Power Supply: $182.00
and so on.
Instead of getting the part names and prices separately, you can iterate over the list of product rows, extracting each one's name and price.
Also, it's recommended to use Expected Conditions explicit waits rather than hardcoded pauses.
Your code could be something like this:
import time
from selenium import webdriver as wd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

webdriver = wd.Chrome(executable_path=r"C:\Users\Stephanie\anaconda3\pkgs\python-chromedriver-binary-98.0.4758.48.0-py39hcbf5309_0\Lib\site-packages\chromedriver_binary\chromedriver.exe")
wait = WebDriverWait(webdriver, 20)
webdriver.maximize_window()
webdriver.get("https://pcpartpicker.com/user/stephwaters/saved/#view=HgH2xr")

wait.until(EC.visibility_of_element_located((By.XPATH, "//tr[@class='tr__product']")))
time.sleep(0.3)  # short delay to make sure more than just the first product has loaded

products = webdriver.find_elements(By.XPATH, '//tr[@class="tr__product"]')
for product in products:
    name = product.find_element(By.XPATH, './/td[@class="td__component"]')
    price = product.find_element(By.XPATH, './/td[@class="td__price"]//a')
    print(name.text + ': ' + price.text)
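Alternatively, if you want to keep your two separate lookups, you can pair the lists with zip(). A minimal sketch, assuming both lists come from the same table rows and line up one-to-one:

partname = webdriver.find_elements(By.CLASS_NAME, 'td__component')
prices = webdriver.find_elements(By.CLASS_NAME, 'td__price')
# zip() pairs the i-th name with the i-th price; this assumes both
# lists have the same length and ordering (one entry per row).
for part, price in zip(partname, prices):
    print(part.text + ': ' + price.text)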
Currently the below code gets to the second page, and then I get a "NoSuchElementException" error. I have tried iterating using the href as well. The link text works slightly better, but I encounter issues when I reach the "..." portion of the pagination, so I would prefer to stick with the XPath. I feel like I have tried everything.
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
url = 'https://www.sec.state.ma.us/LobbyistPublicSearch/Default.aspx'
wait = WebDriverWait(driver, 10)
driver.get(url)

driver.find_element(By.ID, 'ContentPlaceHolder1_rdbSearchByType').click()
select = Select(driver.find_element(By.CLASS_NAME, 'p3'))
select.select_by_value('2020')
driver.find_element(By.ID, 'ContentPlaceHolder1_btnSearch').click()

find_table = driver.find_element(By.ID, 'ContentPlaceHolder1_ucSearchResultByTypeAndCategory_grdvSearchResultByTypeAndCategory')
xpath_string = """//*[@id="ContentPlaceHolder1_ucSearchResultByTypeAndCategory_grdvSearchResultByTypeAndCategory"]/tbody/tr[98]/td/table/tbody/tr/td[1]/a"""

all_xpath = []
for i in range(1, all_pages):  # all_pages: the total page count, defined elsewhere
    pn = str(i + 1)
    links_list_sliced = list(xpath_string)
    links_list_sliced[131] = pn  # index 131 is the digit inside td[1]
    joined_links = ''.join(links_list_sliced)
    all_xpath.append(joined_links)

for l in all_xpath:
    current_page_element = driver.find_element(By.XPATH, l)
    wait.until(EC.element_to_be_clickable(current_page_element)).click()
@David Kahn, it seems the number of elements on each page differs, so in your XPath tr[98] is not always correct. Make the following changes and it should work.
Your xpath_string should be the following, which removes the dependency on the page-number links sitting in the 98th row of the table:
xpath_string = """//*[@id="ContentPlaceHolder1_ucSearchResultByTypeAndCategory_grdvSearchResultByTypeAndCategory"]/tbody/tr[@align="center"]/td/table/tbody/tr/td[1]/a"""
For this to work you need to change the following line in your for loop:
links_list_sliced[131] = pn
to:
links_list_sliced[144] = pn
Let me know if it works.
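If editing the XPath at a fixed character index feels brittle, a sketch of the same idea using str.format is below. It assumes, as above, that the page-number links sit in consecutive td cells of the pager row, and that all_pages and wait are defined as in your code:

base_xpath = ('//*[@id="ContentPlaceHolder1_ucSearchResultByTypeAndCategory_grdvSearchResultByTypeAndCategory"]'
              '/tbody/tr[@align="center"]/td/table/tbody/tr/td[{n}]/a')
for i in range(1, all_pages):
    # Build the XPath for the (i+1)-th page link instead of
    # overwriting a single character at a fixed position.
    link_xpath = base_xpath.format(n=i + 1)
    wait.until(EC.element_to_be_clickable((By.XPATH, link_xpath))).click()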
I am learning how to use Selenium with Python and have been toying around with a few different things. I keep having an issue where I cannot locate any elements by class name. I am able to locate and print data by XPath, but I cannot locate the classes.
The goal of this script is to gather a number from the table on the website and the current time, then append the items to a CSV file.
Site: https://bitinfocharts.com/top-100-richest-dogecoin-addresses.html
Any advice or guidance would be greatly appreciated as I am new to python. Thank you.
Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
from datetime import datetime

# Open ChromeDriver
PATH = '/Users/brandon/Desktop/chromedriver'
driver = webdriver.Chrome(PATH)
driver.get("https://bitinfocharts.com/top-100-richest-dogecoin-addresses.html")
driver.implicitly_wait(10)
driver.maximize_window()

# Create the timestamp
now = datetime.now()
current_time = now.strftime("%m/%d/%Y, %H:%M:%S")

# Identify the section of the page
page = driver.find_element(By.CLASS_NAME, 'table table-condensed bb')
time.sleep(3)

# Gather the data
num_of_wallets = page.find_element(By.XPATH,
    "//html/body/div[5]/table[1]/tbody/tr[10]/td[2]").text
table_dict = {"Time": current_time,
              "Wallets": num_of_wallets}

file = open('dogedata.csv', 'a')
try:
    file.write(f'{current_time},{num_of_wallets}')
finally:
    file.close()
table table-condensed bb is actually three class names.
So the best way to locate an element based on multiple class names is to use a CSS selector or XPath, like:
page = driver.find_element(By.CSS_SELECTOR,'.table.table-condensed.bb')
or
page = driver.find_element(By.XPATH, "//*[@class='table table-condensed bb']")
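As a rough sketch, you can also pair this locator with an explicit wait instead of time.sleep, reusing the WebDriverWait import already in the question (the row/column XPath is copied from the question and assumes the page layout stays the same):

from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)
# Wait until the stats table is present rather than sleeping a fixed time.
page = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '.table.table-condensed.bb')))
num_of_wallets = page.find_element(By.XPATH, "//html/body/div[5]/table[1]/tbody/tr[10]/td[2]").text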
I am very new to Python and I am trying to get all store locations from the website of Muthoot. The following is the code I wrote, but I am not getting any output. Please let me know what is wrong and what I need to correct.
As I understand it, the code is not getting the search button clicked, and hence nothing is moving. But how do I do that?
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
import pandas as pd

driver = webdriver.Chrome(executable_path=r"D:\Chromedriverpath\chromedriver_win32\chromedriver.exe")
driver.get("https://www.muthootfinance.com/branch-locator")

# Saving this element in a variable
drp = Select(driver.find_element_by_id("statelist"))
slist = drp.options
for ele in slist:
    table = driver.select("table.table")
    columns = table.find("thead").find_all("th")
    column_names = [c.string for c in columns]
    table_rows = table.find("tbody").find_all("tr")
    l = []
    for tr in table_rows:
        td = tr.find_all('td')
        row = [str(tr.get_text()).strip() for tr in td]
        l.append(row)
    df = pd.DataFrame(l, columns=column_names)
    df.head()
I think this will work for you now. I copied your code and it seems to work!
from selenium import webdriver
import pandas as pd

driver = webdriver.Chrome(r"C:\Program Files (x86)\chromedriver.exe")
driver.get("https://www.muthootfinance.com/branch-locator")

# Save the branch list element in a variable
html_list = driver.find_element_by_id("state_branch_list")
items = html_list.find_elements_by_tag_name("li")

places = []
for item in items:
    places.append(item.text)
    print(item.text)

df = pd.DataFrame(places)
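If you do want the branches for every state rather than just the list shown on load, a rough sketch along the lines of your original attempt is below. It assumes the page repopulates the state_branch_list element after each selection in the statelist dropdown, which I have not verified:

from selenium.webdriver.support.ui import Select

# Copy the option labels first so we don't hold stale element references.
state_names = [o.text for o in Select(driver.find_element_by_id("statelist")).options]
rows = []
for state in state_names:
    # Re-locate the dropdown each time in case the page re-renders it.
    Select(driver.find_element_by_id("statelist")).select_by_visible_text(state)
    html_list = driver.find_element_by_id("state_branch_list")
    for item in html_list.find_elements_by_tag_name("li"):
        rows.append([state, item.text])
df = pd.DataFrame(rows, columns=["State", "Branch"])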
I am trying to scrape a website to get some company information. If the search result is there and matches the search term, I would like to continue; if not, I would like to move on to the next company.
Here is the code:
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://register.fca.org.uk/s/"
search_box_path = '//*[@id="search-form-search-section-main-input"]'
firm_checkbox_path = '//*[@id="search-form-search-options-radio-group"]/span[1]/label/span[1]'
searchterm = 'XXX Company'

driver = webdriver.Chrome(executable_path=r'C:\Users\XXXX\Chrome Webdriver\chromedriver.exe')
driver.get(url)

element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, firm_checkbox_path)))
driver.find_element_by_xpath(firm_checkbox_path).click()
driver.find_element_by_xpath(search_box_path).send_keys(searchterm)
driver.find_element_by_xpath(search_box_path).send_keys(Keys.RETURN)

element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="maincontent"]/div[4]/div/div[2]/h1/span[2]')))
element = driver.find_element_by_xpath('//*[@id="maincontent"]/div[4]/div/div[2]/h1/span[2]')

check_result()
The issue is with the check_result function. In this function I am just comparing the searchterm against the element.text of the element from the website.
def check_result():
    name = driver.find_element_by_xpath('//*[@id="maincontent"]/div[4]/div/div[2]/h1/span[2]')
    return name.text == searchterm
This logic on its own works fine, but together with the rest of the code it gives me False even though I know that the text I provide is equal to the element.text.
Any help is much appreciated.
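One thing worth ruling out is invisible whitespace: element.text often carries leading, trailing, or non-breaking spaces that break a strict equality check. A minimal sketch of a more forgiving comparison, using the same XPath as above (the cause here is only an assumption):

def check_result():
    name = driver.find_element_by_xpath('//*[@id="maincontent"]/div[4]/div/div[2]/h1/span[2]')
    # Collapse all runs of whitespace (including non-breaking spaces)
    # on both sides before comparing.
    return ' '.join(name.text.split()) == ' '.join(searchterm.split())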
I wrote some code (a piece of it is below) to scrape all products from a shop's website, but it doesn't find any products. I don't know what is wrong with this code. Can someone help me? I added a screenshot to show the HTML (product-tile represents a product box, so I think I should use this class to get the necessary information).
while True:
    # if True:
    try:
        prod = driver.find_elements_by_class_name("product-tile")
        for el in prod:
            name = el.find_element_by_class_name("product-name").text
            price = el.find_element_by_class_name("price-normal").text
            x = [name, price]
            product_list.append(x)
            print(x)
Try the below code and see if it helps.
prod = WebDriverWait(driver, 20).until(expected_conditions.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.product-tile.js-UA-product")))
for el in prod:
    name = el.find_element_by_class_name("product-name").get_attribute("innerHTML")
    price = el.find_element_by_class_name("price-normal").get_attribute("innerHTML")
Please use the following imports.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.by import By
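Putting the pieces together, a minimal end-to-end sketch could look like the following; the driver setup and product_list are assumptions carried over from your snippet, and the shop URL is left out since it isn't in the question:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes chromedriver is on PATH
# driver.get(...) with the shop URL goes here
product_list = []

# Wait until every product tile is visible instead of querying immediately.
prod = WebDriverWait(driver, 20).until(
    expected_conditions.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.product-tile.js-UA-product")))
for el in prod:
    name = el.find_element_by_class_name("product-name").get_attribute("innerHTML")
    price = el.find_element_by_class_name("price-normal").get_attribute("innerHTML")
    product_list.append([name, price])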