Python Selenium Web Scraping - Hidden Text / Javascript? - python

I'm trying to extract some text from a page using Python and Selenium. The text is visible to me, but I can't work out how to extract it. I think the text was generated by JavaScript.
I'm on the URL "https://sellercentral.amazon.co.uk/hz/fba/profitabilitycalculator/index?lang=en_GB" and have entered the product ID 'B00FRJ1R4M' as an example, pressed search, then entered '20' in the Amazon Fulfilment Item Price box and pressed calculate.
I'm trying to extract the '-5.59', but to no avail.
The closest I think I've got is the following code:
cost = driver.find_element_by_xpath("//*[@id='afn-fees']/dl/dd[15]/input")
print(cost.get_attribute('innerHTML'))
print(driver.execute_script("return arguments[0].innerHTML", cost))
But this returns 'None'.
Any help would be much appreciated.

You need to use .get_attribute("value"), since this is an input, and simplify your locator:
cost = driver.find_element_by_css_selector("input.cost-total")
print(cost.get_attribute("value"))
Here the input.cost-total CSS selector matches an input element that has the cost-total class, which makes for a readable and reliable locator in this case.
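A minimal sketch of the same idea, written with the find_element(strategy, selector) call form that newer Selenium releases use in place of the find_element_by_* helpers. The function name read_input_value is hypothetical, and driver is assumed to be a WebDriver that is already on the calculator page with the result displayed. The string "css selector" is the value behind Selenium's By.CSS_SELECTOR constant, spelled out here so the snippet needs no imports.

```python
def read_input_value(driver, css="input.cost-total"):
    """Return the current value of the first <input> matching `css`.

    An <input> element has no child nodes, so its innerHTML is empty;
    the text shown in the box is held in its `value`.
    """
    # "css selector" is the string behind selenium's By.CSS_SELECTOR.
    element = driver.find_element("css selector", css)
    return element.get_attribute("value")
```

With a live driver this would be called as print(read_input_value(driver)).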

Related

Python Selenium Web-Driver not finding text element

I am trying to get the product's seller yet it will not get the text. I assume this is some weird thing since the text is also a link. Any help?
Python Code:
self.sold_by = driver.find_element_by_css_selector('#sellerProfileTriggerId').text
HTML Element:
SKUniverse
Try like this:
self.sold_by = driver.find_element_by_css_selector('#sellerProfileTriggerId')
text_element=self.sold_by.text
print(text_element)
Also, why aren't you using xpath or id selectors! Just asking :)

Want to extract decimal number from a page with xpath, selenium webdriver in python

I have a page showing an item price, as in the attached image. I want to extract this price as 64.99. What would be the XPath to get this number? I'm using Selenium WebDriver to find the price.
I have tried a lot of XPath permutations, but the page has many such products, so it is difficult to find a unique XPath for that price, e.g.:
//li[@class = 'price-current'] (gives 13 results on the page)
//*[@id = 'landingpage-price' and @class = 'price-current'] (gives no result)
Any help will be appreciated. Thanks
Since you mentioned there are a lot of such products, the question is framed the wrong way. You first need to locate the product you are interested in, and then find its price within it. You are trying to find the price directly.
Now, the issue with the XPath below
//*[@id = 'landingpage-price' and @class = 'price-current'] (gives no result)
is that it puts both conditions on the same element: it requires the element with id landingpage-price to also carry the class price-current, whereas the price is a descendant of that container. I would suggest doing this with CSS, but I will show both XPath and CSS.
XPath
elem = driver.find_element_by_xpath("//div[@id = 'landingpage-price']//li[@class = 'price-current']")
print (elem.text.replace("$",""))
CSS
elem = driver.find_element_by_css_selector("#landingpage-price .price-current")
print (elem.text.replace("$",""))
Your XPath would break if the developers add more classes to the price element, because @class = 'price-current' compares the whole attribute string. The CSS selector matches on the class token, so it is the better choice, and it uniquely identifies the element.
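Both snippets above print the price as a string. If the goal is a number, a small helper along these lines could convert the element text. This is only a sketch: parse_price is a hypothetical name, and it assumes prices use a dot as the decimal separator and commas only as thousands separators.

```python
import re
from decimal import Decimal

# An optional minus sign, digits, and an optional fractional part.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_price(text):
    """Extract the first decimal number from text such as '$64.99'."""
    # Drop thousands separators so '1,299.00' is read as one number.
    match = _NUMBER.search(text.replace(",", ""))
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return Decimal(match.group())
```

It would be used as parse_price(elem.text). Decimal is chosen over float so that currency amounts keep their exact digits.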

Selenium can't find elements by this XPath expression

I'm trying to extract some odds from a page using Selenium ChromeDriver, since the data is dynamic. The "find elements by XPath expression" approach usually works for me with these kinds of websites, but this time it can't seem to find the element in question, nor any element that belongs to the section of the page that shows the relevant odds.
I'm probably making a simple error - if anyone has time to check the page out I'd be very grateful! Sample page: Nordic Bet NHL Odds
driver.get("https://www.nordicbet.com/en/odds#?cat=&reg=&sc=50&bgi=36")
time.sleep(5)
dayElems = driver.find_elements_by_xpath("//div[@class='ng-scope']")
print(len(dayElems))
Output:
0
It was a problem I used to face...
It is in another frame whose id is SportsbookIFrame. You need to navigate into the frame:
driver.switch_to.frame("SportsbookIFrame")
dayElems = driver.find_elements_by_xpath("//div[@class='ng-scope']")
len(dayElems)
Output:
26
Iframes are ordinary elements, so you can search for them like any other:
iframes = driver.find_elements_by_xpath("//iframe")
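The switch into the frame stays in effect until the driver is switched back, so later lookups on the top-level page would silently run inside the iframe. A minimal sketch of one way to keep that contained follows. The helper name texts_in_frame is hypothetical, and the string "xpath" is the value behind Selenium's By.XPATH constant. The texts are read while still inside the frame, because element references obtained in a frame may not be usable after switching out of it.

```python
def texts_in_frame(driver, frame, xpath):
    """Return the text of every element matching `xpath` inside an iframe."""
    # switch_to.frame accepts a frame name or id, an index, or a WebElement.
    driver.switch_to.frame(frame)
    try:
        # "xpath" is the string behind selenium's By.XPATH.
        return [el.text for el in driver.find_elements("xpath", xpath)]
    finally:
        # Always point the driver back at the top-level document.
        driver.switch_to.default_content()
```

For the page in the question this would be called as texts_in_frame(driver, "SportsbookIFrame", "//div[@class='ng-scope']").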

Scraping webpage with selenium (python)

Hi, I would like to scrape what is selected in the following image:
Image Code
I know I could use the following code to get the text:
cell = driver.find_elements_by_xpath(".//*[@id='ip_selection1233880116name']")
print cell.text
But my problem is that ip_selection1233880116name should be dynamic, given that it changes every time as you can see from the image.
How could I do it?
Thanks a lot for your help!!!!
Use contains to match just the stable parts of the id, presuming the numbers are all that change. For a single element you should also use find_element as opposed to find_elements:
find_element_by_xpath("//*[contains(@id,'ip_selection') and contains(@id,'name')]")
You could also use starts-with, and ends-with where the XPath engine supports it (ends-with is an XPath 2.0 function, and browsers generally implement only XPath 1.0):
find_element_by_xpath("//*[starts-with(@id,'ip_selection') and ends-with(@id,'name')]")
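XPath 1.0 provides starts-with() but no ends-with(), so a stricter match than contains() can be written by comparing the tail of the id using substring(). The helper below is a sketch of that workaround. The name dynamic_id_xpath is hypothetical, and it assumes the prefix and suffix contain no single quotes.

```python
def dynamic_id_xpath(prefix, suffix):
    """Build an XPath 1.0 expression for ids that start with `prefix`
    and end with `suffix`, whatever comes in between."""
    for part in (prefix, suffix):
        if "'" in part:
            raise ValueError("prefix and suffix must not contain single quotes")
    # Emulate ends-with(): cut the last len(suffix) characters of the id
    # with substring() and compare them to the suffix.
    tail = f"substring(@id, string-length(@id) - string-length('{suffix}') + 1)"
    return f"//*[starts-with(@id, '{prefix}') and {tail} = '{suffix}']"
```

The result of dynamic_id_xpath("ip_selection", "name") can then be passed to the driver's XPath lookup in place of the hard-coded id.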

How to parse Selenium driver elements?

I'm new in Selenium with Python. I'm trying to scrape some data but I can't figure out how to parse outputs from commands like this:
driver.find_elements_by_css_selector("div.flightbox")
I was trying to google some tutorial but I've found nothing for Python.
Could you give me a hint?
find_elements_by_css_selector() would return you a list of WebElement instances. Each web element has a number of methods and attributes available. For example, to get an inner text of the element, use .text:
for element in driver.find_elements_by_css_selector("div.flightbox"):
    print(element.text)
You can also make a context-specific search to find other elements inside the current element. Since I know which site you are working with, here is example code to get the departure and arrival times for the first-way flight in a result box:
for result in driver.find_elements_by_css_selector("div.flightbox"):
    departure_time = result.find_element_by_css_selector("div.departure p.p05 strong").text
    arrival_time = result.find_element_by_css_selector("div.arrival p.p05 strong").text
    print([departure_time, arrival_time])
Make sure you study Getting Started, Navigating and Locating Elements documentation pages.
