I want to extract all h2 elements of the div element. The code that I've used is this:
browser = webdriver.Chrome()
browser.get("https://www.mmorpg.com/play-now")
time.sleep(2)
item_list_new=[]
link = browser.find_element_by_xpath("//div[@class='freegamelist']")
names = link.find_element_by_tag_name('h2')
x = names.text
item_list_new.append(x)
print(item_list_new)
But when I run this, I only get the first 'h2' element of the div element.
Can somebody tell me what I am doing wrong and show me the correct way to do it?
Thanks in advance.
You need to use the plural method: names = link.find_elements_by_tag_name('h2')
Your code should be:
browser = webdriver.Chrome()
browser.get("https://www.mmorpg.com/play-now")
time.sleep(2)
item_list_new=[]
link = browser.find_element_by_xpath("//div[@class='freegamelist']")
names = link.find_elements_by_tag_name('h2')
# names is a list of elements, so read .text from each one
for name in names:
    item_list_new.append(name.text)
print(item_list_new)
find_element_by_tag_name gives the first element and find_elements_by_tag_name gives all the matching elements
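The same first-match versus all-matches distinction can be tried without a browser. The sketch below uses the standard library's ElementTree rather than Selenium, and the markup is made up for illustration, so treat it as an analogy for find_element / find_elements rather than as Selenium code.

```python
import xml.etree.ElementTree as ET

# Made-up markup standing in for the real page (assumption, not the site's HTML).
SAMPLE = """
<body>
  <div class="freegamelist">
    <h2>Game A</h2>
    <h2>Game B</h2>
    <h2>Game C</h2>
  </div>
</body>
""".strip()

root = ET.fromstring(SAMPLE)
container = root.find(".//div[@class='freegamelist']")

first = container.find("h2")      # singular lookup: only the first match
every = container.findall("h2")   # plural lookup: a list of all matches

first_title = first.text
all_titles = [h2.text for h2 in every]

print(first_title)
print(all_titles)
```

The singular call hands back one element, while the plural call hands back a list that has to be iterated to read each title.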
Try to get all header values as below:
link = browser.find_element_by_xpath("//div[@class='freegamelist']")
names = link.find_elements_by_tag_name('h2')
item_list_new = [x.text for x in names]
print(item_list_new)
or you can simplify
names = browser.find_elements_by_xpath("//div[@class='freegamelist']//h2")
item_list_new = [x.text for x in names]
print(item_list_new)
You actually want the similarly named function find_elements_by_tag_name, which returns every matching element instead of only the first.
I am trying to get the values I added to a list with Selenium and print them out, but I am only getting this: <generator object at 0x000001B924EC7990>. How can I print the values in the list?
I also tried to shorten the XPath with "//tr[@class= 'text3'][11]/td" but it didn't work.
As you can see, I tried to loop through the list and convert it to text, but that didn't work either.
Would this work range(driver.find_elements(By.XPATH,"//table[2]/tbody/tr/td[2]/table[1]/tbody/tr/td[3]/table/tbody/tr[2]/td[position() >= last()]"))?
Can you guys help me out?
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.implicitly_wait(10)
website = "https://langerball.de/"
driver.get(website)
for i in range(7):
    xpath_test = "//table[2]/tbody/tr/td[2]/table[1]/tbody/tr/td[3]/table/tbody/tr[2]/td[position() >= last()]"
    a = driver.find_elements(By.XPATH, xpath_test)
    test_li = []
    test_li.append(a)
print(b.text for b in test_li)
The driver.find_elements method returns a list of web elements, while you are looking for their text values. A web element's text is obtained by reading its .text attribute.
So, you should iterate over the returned list of web elements and extract the text from each one.
Also, test_li = [] should be defined outside the loop; otherwise it is reset on every iteration.
So your code could be something like this:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.implicitly_wait(10)
website = "https://langerball.de/"
driver.get(website)
test_li = []
for i in range(7):
    xpath_test = "//table[2]/tbody/tr/td[2]/table[1]/tbody/tr/td[3]/table/tbody/tr[2]/td[position() >= last()]"
    a_list = driver.find_elements(By.XPATH, xpath_test)
    for a in a_list:
        test_li.append(a.text)
# test_li already holds strings, so print the list directly
print(test_li)
P.S.
I'm not sure about the rest of your code: the for i in range(7) loop and the xpath_test XPath expression
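The generator-object output from the question can be reproduced without a browser. This is a minimal sketch in which SimpleNamespace objects stand in for web elements (an assumption for illustration; only their .text attribute matters), and the sample texts are made up.

```python
from types import SimpleNamespace

# Stand-ins for web elements; the texts are invented sample data.
cells = [SimpleNamespace(text="Mon"), SimpleNamespace(text="Tue")]

# A generator expression is lazy: printing it shows the generator object,
# not the values it would produce.
gen = (c.text for c in cells)
print(gen)

# A list comprehension is evaluated immediately, so the values are printed.
texts = [c.text for c in cells]
print(texts)

# Joining also consumes the generator and yields one printable string.
joined = ", ".join(c.text for c in cells)
print(joined)
```

Wrapping the expression in square brackets instead of parentheses is the smallest change that turns the lazy generator into a list that print can display.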
This worked
test_li = []
xpath_test = "//table[2]/tbody/tr/td[2]/table[1]/tbody/tr/td[3]/table/tbody/tr[2]/td[position() <= last()]"
a_list = driver.find_elements(By.XPATH, xpath_test)
for a in a_list:
    test_li.append(a.text)
print(test_li)
I am using the below code to get data from http://www.bddk.org.tr/BultenHaftalik. Two table elements have the same class name. How can I get just one of them?
from selenium import webdriver
import time
driver_path = "C:\\Users\\Bacanli\\Desktop\\chromedriver.exe"
browser = webdriver.Chrome(driver_path)
browser.get("http://www.bddk.org.tr/BultenHaftalik")
time.sleep(3)
Krediler = browser.find_element_by_xpath("//*[@id='tabloListesiItem-253']/span")
Krediler.click()
elements = browser.find_elements_by_css_selector("td.ortala")
for element in elements:
    print(element.text)
browser.close()
If you want to select all rows for one column only that match a specific css selection, then you can use :nth-child() selector.
Simply, the code will be like this:
elements = browser.find_elements_by_css_selector("td.ortala:nth-child(2)")
In this way, you will get the "Krediler" column rows only. You can also select the first child if you want to by applying the same idea.
I guess what you want to do is to extract the text and not the numbers, try this:
elements = []
for i in range(1,21):
    css_selector = f'#Tablo > tbody:nth-child(2) > tr:nth-child({i}) > td:nth-child(2)'
    element=browser.find_element_by_css_selector(css_selector)
    elements.append(element)
for element in elements:
    print(element.text)
browser.close()
I made this XPath
alo1 = driver.find_element(By.XPATH, "//div[@class='txt-block']/span/a/span").text
print(alo1)
The problem is that I'm getting only the first element, but there are 3 or 4 elements matching the same XPath, and I want them all.
The number of elements changes from page to page, from 0 to 4.
How can I do it?
One other thing: do you think it is possible to write a different XPath? I'm trying to get the names of the producers of the films.
EDIT:
I have a second difficulty. I'm writing this result to an Excel sheet, but it needs to be on one line to be written there; otherwise only the last value is written. How can this be done?
wb = xlwt.Workbook()
ws = wb.add_sheet("A Test Sheet")
driver = webdriver.Chrome()
driver.get('http://www.imdb.com/title/tt4854442/?ref_=wl_li_tt')
labels = driver.find_elements_by_xpath("//div[@class='txt-block']/span/a/span")
for label in labels:
    print (label.text)
    ws.write(x-1,1,label.text)
wb.save("sinopses.xls")
The website for reference: http://www.imdb.com/title/tt4854442/?ref_=wl_li_tt
You can get them all at once, and then get text for each element:
alos = driver.find_elements(By.XPATH, "//div[@class='txt-block']/span/a/span")
for alo in alos:
    print(alo.text)
For the first question:
findElement always returns a single result; even if the locator matches more than one element, it takes the first one.
If the locator matches more than one element and you want all of them, use findElements.
For the second question:
labels = driver.find_elements_by_xpath("//div[@class='txt-block']/span/a/span")
result = ''
# build one string from all labels, then write it to the sheet once
for label in labels:
    result += label.text
print (result)
ws.write(x-1,1,result)
wb.save("sinopses.xls")
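Concatenating with += runs the names together with nothing between them. A small sketch of joining them with a separator before writing the cell follows; join_texts is a hypothetical helper name, the SimpleNamespace objects stand in for web elements, and the sample names are made up.

```python
from types import SimpleNamespace


def join_texts(elements, sep=", "):
    """Join the non-empty .text values of the given elements into one string."""
    cleaned = (e.text.strip() for e in elements if e.text)
    return sep.join(t for t in cleaned if t)


# Stand-ins for the elements returned by find_elements (invented sample data).
labels = [
    SimpleNamespace(text="Studio One"),
    SimpleNamespace(text=" Studio Two "),
    SimpleNamespace(text=""),
]

cell_value = join_texts(labels)
print(cell_value)

# A page with no matching elements gives an empty string rather than an error.
empty_value = join_texts([])
print(repr(empty_value))
```

The resulting single string can then be passed to ws.write in place of result, so each page fills exactly one cell even when the number of matches varies from 0 to 4.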
On a typical eBay search query that returns more than 50 listings, eBay displays the results in a grid format (whether you have it set to display as a grid or a list).
I'm using the class name to pull out the prices using WebDriver:
prices = webdriver.find_elements_by_class_name("bidsold")
The challenge: although all prices on the page look identical in structure, the ones that are crossed out (where Buy It Now is not available and it's Best offer accepted) are actually contained within a child span of the above span:
I could pull these out separately by repeating the find_elements_by_class_name call with the class sboffer, but (i) I will lose track of the order, and more importantly (ii) it will roughly double the time it takes to extract the prices.
The CSS selectors for the two types of prices also differ, as do the XPaths.
How do we catch all prices in one go?
Try this:
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('http://www.ebay.com/sch/i.html?rt=nc&LH_Complete=1&_nkw=Columbia+Hiking+Pants&LH_Sold=1&_sacat=0&LH_BIN=1&_from=R40&_sop=3&LH_ItemCondition=1000&_pgn=2')
prices_list = driver.find_elements_by_css_selector('span.amt')
prices_on_page = []
for span in prices_list:
    unsold_item = span.find_elements_by_css_selector('span.bidsold.bold')
    sold_item = span.find_elements_by_css_selector('span.sboffer')
    if len(sold_item):
        prices_on_page.append(sold_item[0].text)
    elif len(unsold_item):
        prices_on_page.append(unsold_item[0].text)
    elif span.text:
        prices_on_page.append(span.text)
print(prices_on_page)
driver.quit()
In this case, you will have track of the order and you will only query the specific span element instead of the entire page. This should improve performance.
I would go for XPath; the code below worked for me. It grabbed 50 prices!
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('http://www.ebay.com/sch/i.html?rt=nc&LH_Complete=1&_nkw=Columbia+Hiking+Pants&LH_Sold=1&_sacat=0&LH_BIN=1&_from=R40&_sop=3&LH_ItemCondition=1000&_pgn=2')
my_prices = []
itms = driver.find_elements_by_xpath("//div[@class='bin']")
for itm in itms:
    # collect every price span inside this listing
    prices = itm.find_elements_by_xpath(".//span[contains(text(),'$')]")
    val = ','.join(p.text for p in prices)
    my_prices.append([val])
print(my_prices)
driver.quit()
Result is
[[u'$64.95'], [u'$59.99'], [u'$49.95'], [u'$46.89,$69.99'], [u'$44.98'], [u'$42.95'], [u'$39.99'], [u'$39.99'], [u'$37.95'], [u'$36.68'], [u'$35.96,$44.95'], [u'$34.99'], [u'$34.99'], [u'$34.95'], [u'$30.98'], [u'$29.99'], [u'$29.99'], [u'$29.65,$32.95'], [u'$29.00'], [u'$27.96,$34.95'], [u'$27.50'], [u'$27.50'], [u'$26.99,$29.99'], [u'$26.95'], [u'$26.55,$29.50'], [u'$24.99'], [u'$24.99'], [u'$24.99'], [u'$24.99'], [u'$24.98'], [u'$24.98'], [u'$24.98'], [u'$24.98'], [u'$24.98'], [u'$22.00'], [u'$22.00'], [u'$22.00'], [u'$22.00'], [u'$18.00'], [u'$18.00'], [u'$17.95'], [u'$11.99'], [u'$9.99'], [u'$6.00']]
I'm trying to get text $27.5 inside tag <div>, I located the element by id and the element is called "price".
The snippet of html is as follows:
<div id="PPP,BOSSST,NYCPAS,2015-04-26T01:00:00-04:00,2015-04-26T05:20:00-04:00,_price" class="price inlineBlock strong mediumText">$27.50</div>
Here is what I've tried
price.text
price.get_attribute('value')
Neither of the above works.
Update:
Thanks to everyone who tried to help.
I combined your answers together and got the solution:)
price = driver.find_element_by_xpath("//div[@class='price inlineBlock strong mediumText']")
price_content = price.get_attribute('innerHTML')
print(price_content.strip())
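Can't you use a regular expression or Beautiful Soup to find the contents of the element in the HTML? Note that the pattern needs the element's outer HTML, because innerHTML does not include the <div> tag itself:
re.search(r'<div.*?>(.*?)</div>', price.get_attribute('outerHTML')).group(1)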
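The regular-expression approach can be checked against the HTML snippet from the question without a browser. This is a minimal sketch that assumes the element's outer HTML is already available as a string; regular expressions are fragile on real-world HTML, so it only illustrates the idea.

```python
import re

# The <div> from the question, as it might be returned by get_attribute('outerHTML').
html = (
    '<div id="PPP,BOSSST,NYCPAS,2015-04-26T01:00:00-04:00,'
    '2015-04-26T05:20:00-04:00,_price" '
    'class="price inlineBlock strong mediumText">$27.50</div>'
)

# [^>]* skips the attributes of the opening tag; (.*?) lazily captures the content.
match = re.search(r"<div[^>]*>(.*?)</div>", html, re.DOTALL)
price_text = match.group(1).strip() if match else None

print(price_text)
```

Checking match before calling group avoids an AttributeError when the pattern does not match.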
Your element is hidden. The last time I worked with Selenium, you could not get the text of hidden elements. That said, you can always execute JavaScript. I don't usually write Python, but it should be something like:
val = driver.execute_script("return document.getElementById('locator').innerHTML")
Change the css selector to
div[id$='_price']
Complete code
price = fltright.find_element(By.CSS_SELECTOR, "div[id$='_price']")
price.text
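In an attribute selector, the $= operator matches when the attribute value ends with the given string, which is why div[id$='_price'] finds the element despite its long generated id. The sketch below emulates that suffix test with the standard library instead of Selenium; the wrapper div and the second child are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The first child is the <div> from the question; the rest is invented markup.
SAMPLE = """
<div id="results">
  <div id="PPP,BOSSST,NYCPAS,2015-04-26T01:00:00-04:00,2015-04-26T05:20:00-04:00,_price" class="price inlineBlock strong mediumText">$27.50</div>
  <div id="made-up-id_seats" class="seats">12</div>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# Rough equivalent of the CSS selector div[id$='_price']:
# keep every <div> whose id attribute ends with "_price".
price_divs = [d for d in root.iter("div") if d.get("id", "").endswith("_price")]
price_texts = [d.text.strip() for d in price_divs]

print(price_texts)
```

Matching on the stable suffix avoids hard-coding the route and timestamp portion of the id, which presumably changes from result to result.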
I tried your edited solution, but it only gets one div with that class. So I tried the following to print a list of all divs with the same class.
Changing element to elements returns a list:
price = driver.find_elements_by_xpath('//div[@class = "price inlineBlock strong mediumText"]')
Use for ... in range() to print the list:
num = len (price)
for i in range (num):
    print (price[i].text)
browser.find_element_by_xpath("//form[@id='workQueueTaskListForm']/div[1]/p").text