Selenium CSS selector (Works in Scrapy but not Selenium) Python

I have tried probably every kind of selector and cannot get this element's text out.
Id, CSS selector, and XPath all return no result, but the same reference in Scrapy shell returns the desired output.
Any idea why the Selenium selector does not work?
I am trying to return the text in masterBody_trSalesDate:
発売予定日 ： 7月(2021/4/21予約開始)
https://www.example.co.jp/10777687
try:
hatsubai = driver.find_element_by_id('#masterBody_trSalesDate').text
I have honestly tried every possible combination elements and selectors I can think of with no luck, but as mentioned Scrapy shell DOES return the correct data so I am not sure what is going wrong.
Is there any way to test Selenium selectors like scrapy shell without running the script?
Thank you if you have any advice.
image shows working in scrapy shell

With find_element_by_id you pass the bare id, without the # character. The # prefix belongs to CSS selector syntax only.
hatsubai = driver.find_element_by_id('masterBody_trSalesDate').text
That's all.
Minimal working code which works for me
from selenium import webdriver
url = 'https://www.1999.co.jp/10777687'
#driver = webdriver.Firefox()
driver = webdriver.Chrome()
driver.get(url)
hatsubai = driver.find_element_by_id('masterBody_trSalesDate').text
print(hatsubai)
hatsubai = driver.find_element_by_xpath('//*[@id="masterBody_trSalesDate"]').text
print(hatsubai)
hatsubai = driver.find_element_by_css_selector('#masterBody_trSalesDate').text
print(hatsubai)
BTW:
The same applies to find_element_by_class_name: it takes the bare class name, without the leading dot.
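The prefix rule above can be sketched as a small helper. This is only an illustration: `to_locator` is a hypothetical name, not part of Selenium, and the strategy strings are the values behind Selenium's `By.ID`, `By.CLASS_NAME` and `By.CSS_SELECTOR` constants.

```python
import re

# A lone "#name" or ".name" with nothing else attached.
_SIMPLE = re.compile(r'^([#.])([A-Za-z_][\w-]*)$')


def to_locator(selector):
    """Turn '#some-id' or '.some-class' into a (strategy, value) pair.

    Hypothetical helper: id and class-name lookups take the bare name,
    while anything more complex stays a CSS selector with its prefix.
    """
    match = _SIMPLE.match(selector)
    if match is None:
        return ('css selector', selector)
    prefix, name = match.groups()
    strategy = 'id' if prefix == '#' else 'class name'
    return (strategy, name)


print(to_locator('#masterBody_trSalesDate'))  # ('id', 'masterBody_trSalesDate')
print(to_locator('.value'))                   # ('class name', 'value')
print(to_locator('div.plan > span'))          # ('css selector', 'div.plan > span')
```

With a live driver the pair could be unpacked into the generic lookup, e.g. `driver.find_element(*to_locator('#masterBody_trSalesDate'))`.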

You need to use css selector for this one:
hatsubai = driver.find_element_by_css_selector('#masterBody_trSalesDate').text
print(hatsubai)
Output:
発売予定日 ： 7月(2021/4/21予約開始)

Related

Web Scraping: how to extract this kind of div tag?

I am looking at a tag :
.
When I run this code,
message = soup.find("div", {"class": "text-msg-container"})
it returns None. What are _ngcontent-vex-c62 and data-e2e-text-message-content? Do I need to include them too? How should I write the call to get the div tag?

You can't, because the div isn't there when you send a GET request to fetch the page source.
The page is built with the Angular framework, which produces a SPA (Single Page Application), so the HTML returned by a plain GET request does not contain the data.
The data is generated by JavaScript code, which has to run first to add it to the page.
You need a tool that lets the JavaScript run first, and only then try to get the data you want.
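One way to confirm this diagnosis is to check whether the class appears at all in the HTML that a plain GET returns. The sketch below uses only the standard library; `has_class` and both HTML strings are made-up stand-ins, not the real page.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Record whether any start tag in the fed HTML carries the wanted class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.found = False

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == 'class' and value and self.wanted in value.split():
                self.found = True


def has_class(html, wanted):
    finder = ClassFinder(wanted)
    finder.feed(html)
    return finder.found


# Assumed shapes: what a GET might return vs. what the browser shows after JS ran.
raw_html = '<html><body><app-root></app-root><script src="main.js"></script></body></html>'
rendered_html = ('<html><body><app-root>'
                 '<div _ngcontent-vex-c62 class="text-msg-container">hi</div>'
                 '</app-root></body></html>')

print(has_class(raw_html, 'text-msg-container'))       # False
print(has_class(rendered_html, 'text-msg-container'))  # True
```

If the check is False for the downloaded source but the element shows up in the browser's inspector, the element is being added by JavaScript and BeautifulSoup alone will not see it.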

To find an element with class text-msg-container, try Selenium. It drives a real browser, so elements generated by JavaScript are available to its locators.
import unittest
from selenium import webdriver
class PythonSearch(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()
    def test_search(self):
        driver = self.driver
        driver.get("http://www.yoursite.com")
        elem = driver.find_element_by_css_selector(".text-msg-container")
    def tearDown(self):
        self.driver.close()
if __name__ == "__main__":
    unittest.main()
Use driver = webdriver.Chrome('/path/to/chromedriver') if you are testing Chrome. Look here for more info https://chromedriver.chromium.org/getting-started .
Getting started for Selenium https://selenium-python.readthedocs.io/getting-started.html#simple-usage

Try this, please (the keyword argument is class_, with a trailing underscore):
message = soup.find("div", class_="text-msg-container")

I hope this works:
from selenium import webdriver
# Path to the chromedriver you downloaded; change it to match your PC.
path = "C:/chromedriver.exe"
# Use a different driver class here if you are not using Chrome.
driver = webdriver.Chrome(path)
driver.get("website link")
out = driver.find_element_by_class_name("text-msg-container")
print(out.text)

Get html of inspect element source with selenium

I'm working in selenium with Chrome.
The webpage I'm accessing updates dynamically.
I need the html that shows the results, I can access it when I do 'inspect element'.
I don't get how I need to access that html from my code. I always get the original html.
I tried this: Get HTML Source of WebElement in Selenium WebDriver using Python
browser.get('http://bijsluiters.fagg-afmps.be/?localeValue=nl')
searchform = browser.find_element_by_class_name('iceInpTxt')
searchform.send_keys('cefuroxim')
button = browser.find_element_by_class_name('iceCmdBtn').click()
element = browser.find_element_by_class_name('contentContainer')
html = element.get_attribute('innerHTML')
browser.close()
print(html)

It seems to work after some delay. If I were you, I would experiment with the delay time.
from selenium import webdriver
import time
browser = webdriver.Chrome()
browser.get('http://bijsluiters.fagg-afmps.be/?localeValue=nl')
searchform = browser.find_element_by_class_name('iceInpTxt')
searchform.send_keys('cefuroxim')
button = browser.find_element_by_class_name('iceCmdBtn').click()
time.sleep(10)
element = browser.find_element_by_class_name('contentContainer')
html = element.get_attribute('innerHTML')
browser.close()
print(html)
Addition: a nicer way is to let the script proceed as soon as a specific element is available, since JavaScript may take some time to add it to the DOM. In your example the element to wait for appears to be the table with id iceDatTbl (from a quick look).
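The idea of proceeding as soon as the element exists can be sketched as a polling loop. This is a hand-rolled illustration, not Selenium's own mechanism: `wait_until` is a hypothetical helper, and the fake probe below stands in for a real browser lookup.

```python
import time


def wait_until(probe, timeout=10.0, interval=0.5):
    """Call probe() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} s')
        time.sleep(interval)


# With a live browser the probe could be, for example:
#   wait_until(lambda: browser.find_elements_by_id('iceDatTbl'))
# find_elements_* returns an empty (falsy) list while the table is missing.

# Stand-in probe that only "finds" something on the third attempt.
attempts = []


def fake_probe():
    attempts.append(1)
    return 'table' if len(attempts) >= 3 else None


print(wait_until(fake_probe, timeout=2, interval=0.01))  # table
```

Selenium ships this pattern ready-made: `WebDriverWait(browser, 10).until(expected_conditions.presence_of_element_located((By.ID, 'iceDatTbl')))`, with `WebDriverWait` and `expected_conditions` imported from `selenium.webdriver.support` and `By` from `selenium.webdriver.common.by`. Either approach replaces the fixed `time.sleep(10)` with a wait that ends as soon as the table appears.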

Selenium Driver - Webscraping

I'm using the Selenium module to scrape a page, but when I print an element, the output looks like a location where the data is stored on the Selenium server rather than the data itself. I'm not exactly sure how this works. Here's my code. Can someone tell me what I'm doing wrong?
from selenium import webdriver
browser = webdriver.Firefox()
browser.get('https://caribeexpress.com.do/') #get method
elem2 = browser.find_elements_by_css_selector('div.plan:nth-child(3) > div:nth-child(2) > span:nth-child(2)')
print(elem2)
elems3 = browser.find_elements_by_class_name('value')
print(elems3)
elem4 = browser.find_element_by_xpath('//*[@id="content-wrapper"]/div[2]/div[3]/div/span[2]')
print(elem4)
For some reason, what displays in my Python IDE doesn't display here, I included it in my gist.
https://gist.github.com/jtom343

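If you want the text between the span tags, print the element's text instead of the element object.
elem2 comes from find_elements_by_css_selector, which returns a list, so replace this:
print(elem2)
with:
for elem in elem2:
    print(elem.text.strip())
and replace this: print(elem4)
with:
print(elem4.text.strip())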
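Printing a WebElement shows its object representation (a session id and an element id), which is presumably the "location" the question describes. A minimal sketch of pulling the text out of a `find_elements_*` result follows; `element_texts` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real WebElements.

```python
from types import SimpleNamespace


def element_texts(elements):
    """Return the stripped .text of every element in a find_elements_* result."""
    return [element.text.strip() for element in elements]


# Stand-ins with made-up text; real elements would come from
# browser.find_elements_by_class_name('value').
fake_elements = [
    SimpleNamespace(text='  first value '),
    SimpleNamespace(text='second value\n'),
]

print(element_texts(fake_elements))  # ['first value', 'second value']
```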

Python selenium unable to find element neither by class name nor xpath

I'm a newbie in Selenium and am learning it from a book, but I'm struggling with some unclear behavior. For educational purposes I use this site:
http://magento-demo.lexiconn.com/ - I'm trying to find the search button by its class name (class='button search-button') or by its XPath:
search_button = self.driver.find_element_by_xpath('/html/body/div/div[2]/header/div/div[4]/form/div[1]/button')
or
search_button = self.driver.find_element_by_class_name('button')
but each time Selenium is unable to find it. Please help me understand the reason for this behavior. Thank you.
I used Selenium IDE and it shows me the XPath //button[@type='submit']
When I tried to find the element by that XPath, I got the same error, which is strange. Please advise.
My code is:
import unittest
from selenium import webdriver
class HomePageTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # create a new Firefox session
        cls.driver = webdriver.Firefox()
        cls.driver.implicitly_wait(30)
        cls.driver.maximize_window()
        # navigate to the application home page
        cls.driver.get('http://magento-demo.lexiconn.com/')
    def test_search__text_field_max_length(self):
        # get the search text box
        search_field = self.driver.find_element_by_id("search")
        # check that the maxlength attribute is set to 128
        self.assertEqual("128", search_field.get_attribute("maxlength"))
    def test_search_button_enabled(self):
        # get the Search button
        search_button = self.driver.find_element_by_class_name('button')
        # check that the Search button is enabled
        self.assertTrue(search_button.is_enabled())
    @classmethod
    def tearDown(self):
        # close the browser window
        self.driver.quit()
if __name__ == '__main__':
    unittest.main(verbosity=2)

Try this:
search_button = self.driver.find_element_by_xpath('//button[@class="button search-button"]')

Try downloading the Selenium IDE plugin, install it, and start recording. Click the button you want and see how its target is recorded in the IDE. Programmatically, Selenium accepts the same XPaths and other selectors as the IDE. After the click has been recorded, a pull-down on the target field lets you see all the different ways you can select that element, e.g. by XPath or by class.
http://www.seleniumhq.org/projects/ide/
You might try:
css=button.button.search-button
//button[@type='submit']
//form[@id='search_mini_form']/div/button

I think the issue is that your locator isn't specific enough. There is more than one button on the page and more than one element with class=button on the page. This CSS selector is working for me.
self.driver.find_element_by_css_selector("button[title='Search']")

Try it this way, using an XPath locator.
Explanation: use the title attribute of the <button> tag.
self.driver.find_element_by_xpath("//button[@title='Search']")
OR
Explanation: use the title and type attributes of the <button> tag.
self.driver.find_element_by_xpath("//button[@title='Search'][@type='submit']")
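Attribute predicates like these can be tried offline before running a browser test. The sketch below uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath (paths must start with `.//` here), and the HTML fragment is an assumed simplification of the search form, not the site's real markup.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed fragment with two buttons that share the "button" class.
fragment = ET.fromstring(
    '<form id="search_mini_form">'
    '<div>'
    '<button type="button" title="Menu" class="button">Menu</button>'
    '<button type="submit" title="Search" class="button search-button">Search</button>'
    '</div>'
    '</form>'
)

# @name tests an attribute; chained predicates must all match.
by_title = fragment.findall(".//button[@title='Search']")
by_both = fragment.findall(".//button[@title='Search'][@type='submit']")

print(by_title[0].text)  # Search
print(len(by_both))      # 1
```

Both expressions pick out exactly one button, which is what makes them safer than a bare class lookup when several elements share a class.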

Find table elements to fill forms selenium python

My code so far is:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('http://moodle.tau.ac.il/')
driver.find_element_by_xpath("id('page-content')//form[@id='login']//input[@type='submit']").click()
Now I'm trying to fill in the login form. I managed to find the div with id=content, easy to see in the image:
This is the line I used:
elem = driver.find_element_by_xpath("id('content')")
but it doesn't recognize anything inside it and I can't get any further. What should I do to locate the input element?

It doesn't recognize anything because the form is inside an iframe. Therefore, you first have to switch to the iframe and then search for the login form.
Switch to the iframe:
frame = driver.find_element_by_id('credentials')
driver.switch_to.frame(frame)
Or:
driver.switch_to.frame('credentials')
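After working inside the iframe, the driver stays there until it is switched back with `driver.switch_to.default_content()`. One way to keep the two calls paired is a small context manager. This is a sketch: `inside_frame` is a hypothetical helper, and the fake driver below only records calls so the example can run without a browser.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block, then switch back."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Return to the top-level document even if the body raised.
        driver.switch_to.default_content()


class FakeSwitchTo:
    """Stand-in for driver.switch_to that records what was called."""

    def __init__(self):
        self.calls = []

    def frame(self, reference):
        self.calls.append(('frame', reference))

    def default_content(self):
        self.calls.append(('default_content',))


class FakeDriver:
    def __init__(self):
        self.switch_to = FakeSwitchTo()


fake_driver = FakeDriver()
with inside_frame(fake_driver, 'credentials'):
    pass  # with a real driver: locate and fill the login inputs here

print(fake_driver.switch_to.calls)  # [('frame', 'credentials'), ('default_content',)]
```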

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.
