Selenium/Python - Extract dynamically generated HTML after submitting form

The web page I am trying to access uses JavaScript to dynamically generate an HTML form (this one: https://imgur.com/a/rhmXB ). When I run print(driver.page_source), the table appears in the HTML output.
However, after I fill in the input field and submit the form, another input field with a CAPTCHA image appears (as shown here: https://imgur.com/a/xVfBS ). When I run print(driver.page_source) again, the input form with the CAPTCHA is not in the HTML.
My question is: how can I use Selenium to access this dynamically generated HTML, which contains the input field and the CAPTCHA image?
Here is my code:
from selenium import webdriver
driver = webdriver.Chrome("/var/chromedriver/chromedriver")
URL = 'http://nap.bg/link?id=104'
driver.get(URL)
input_field = driver.find_element_by_name('ipID')
input_field.send_keys('0000000000')
driver.find_element_by_id('idSubmit').click()
print(driver.page_source)

After you click the button, the page takes some time to load the CAPTCHA and other content, so you need to wait for that to finish loading. You can do that with Selenium's explicit waits.
Here is an example of what you can do:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
URL = 'http://nap.bg/link?id=104'
driver.get(URL)
input_field = driver.find_element(By.NAME, 'ipID')
input_field.send_keys('0000000000')
driver.find_element(By.ID, 'idSubmit').click()
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.NAME, 'ipResponse')))
print(driver.page_source)

Related

Cannot locate form-control object to send_keys using python Selenium

I am trying to navigate a scheduling website to eventually auto populate a schedule using the following script:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
# Create a Chrome webdriver
driver = webdriver.Chrome(r'C:\Users\chromedriver_win32\chromedriver.exe')
# Navigate to https://www.qgenda.com/
driver.get('https://www.qgenda.com/')
# Wait for the page to load
driver.implicitly_wait(5) # 5 seconds
# You can now interact with the page using the webdriver
# Locate the sign in button
sign_in_button = driver.find_element(By.XPATH,'/html/body/div[1]/div/header[3]/div/div[3]/div/div/div/div/a')
# Click the sign in button
sign_in_button.click()
# Find the input element
input_email = driver.find_element(By.XPATH,'//*[@id="Input_Email"]')
# Send text
input_email.send_keys('Josh')
However, I cannot seem to find the Input_Email object. I've tried all the XPaths and IDs that make sense, and I've also tried waiting until the object is clickable, with no luck. I would really appreciate some guidance on this.
I expected Selenium to find the HTML form box and pass in the text, but instead I get an error:
NoSuchElementException: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="Input_Email"]"}
even though the XPath definitely exists.

The XPath seems fine. My guess is that you need an explicit or implicit wait to ensure the page is fully loaded before locating the element.
Another thing I would like to point out: since the login URL is available, locating the sign-in button is redundant. You can go to the login page directly via driver.get('https://login.qgenda.com/').
For instance:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver.get('https://login.qgenda.com/')
input_email = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, '//*[@id="Input_Email"]'))
)
input_email.send_keys('Josh')

can't get page source from selenium

Purpose: use Selenium to get the entire page source.
Problem: the loaded page does not contain the content, only JavaScript and CSS files.
Target site: https://www.warcraftlogs.com
Test code (requires 'pip install selenium'):
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://www.warcraftlogs.com/zone/rankings/29#boss=2512&metric=hps&difficulty=3&class=Priest&spec=Discipline")
pageSource = driver.page_source
fileToWrite = open("page_source.html", "w",encoding='utf-8')
fileToWrite.write(pageSource)
fileToWrite.close()
Things I have tried:
I tried the Python requests library, with the same result: the response did not contain the content, only JS and CSS.
In my personal opinion, this site deliberately hides its content data.
I want to scrape this site's data.
How can I do that?

Here is a way of getting the page source after all elements have loaded:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time as t
[...]
wait = WebDriverWait(driver, 5)
url='https://www.warcraftlogs.com/zone/rankings/29#boss=2512&metric=hps&difficulty=3&class=Priest&spec=Discipline'
driver.get(url)
stuffs = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//div[@class="top-100-details-number kill"]')))
t.sleep(5)
print(driver.page_source)
You can then write page source to file, etc. Selenium documentation: https://www.selenium.dev/documentation/
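
Writing the page source to a file, as the question's code does, could look like the sketch below. The helper name save_page_source is hypothetical. It only relies on the driver exposing a page_source attribute and should be called after the wait has finished.

```python
from pathlib import Path


def save_page_source(driver, path="page_source.html"):
    """Write the driver's current page source to `path` as UTF-8.

    Hypothetical helper; returns the Path that was written.
    """
    target = Path(path)
    # write_text opens, writes, and closes the file in one step.
    target.write_text(driver.page_source, encoding="utf-8")
    return target
```

Using write_text avoids the manual open/close pair and keeps the explicit UTF-8 encoding from the question.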

Trying to log into website with selenium throws either TimeoutException or ElementNotInteractableException

I am having trouble logging into the website (memodo.de/login) because the login form is not interactable. With my code, I get either a TimeoutException or an ElementNotInteractableException.
Below is my current code. I would appreciate it if someone could help me with the issue.
I already tried the execute_script command, without any luck.
Thank you!
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
import time
import os
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.memodo.de/login')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='email']"))).send_keys("email")
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='passwort']"))).send_keys("password")
button = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[type='submit']"))).click()

The XPaths you are trying match several elements in the DOM. The email and password XPaths each match 2 elements, and the submit button selector matches 5.
A locator should be unique: the match counter shown next to the XPath in the DOM search should read 1/1. See "How do I write good locators" for more detail.
You can try something like the following:
driver.get("https://www.memodo.de/login")
wait = WebDriverWait(driver,30)
email = wait.until(EC.element_to_be_clickable((By.XPATH,"//div[@class='register--login-email']/input")))
email.send_keys("Email@email.com")
password = wait.until(EC.element_to_be_clickable((By.XPATH,"//div[@class='register--login-password']/input")))
password.send_keys("password")
submit = wait.until(EC.element_to_be_clickable((By.XPATH,"//form[@id='login--form']//button")))
submit.click()
You can use these xpaths too.
(//input[@id='email'])[2] # For Email Field
(//input[@id='passwort'])[2] # For password Field
//form[@id='login--form']//button # For submit button
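
To see why an id-only XPath can match more than one element, here is a small offline illustration. It uses Python's built-in xml.etree.ElementTree rather than Selenium, and the markup is a hypothetical, simplified stand-in for the real page (the header--login class is invented). In Selenium, the same count could be obtained with len(driver.find_elements(By.XPATH, xpath)).

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: the same id appears in two places.
PAGE = """
<body>
  <div class="header--login"><input id="email"/></div>
  <form id="login--form">
    <div class="register--login-email"><input id="email"/></div>
  </form>
</body>
"""

root = ET.fromstring(PAGE)

# The id-only path matches both inputs.
broad = root.findall(".//input[@id='email']")
# Scoping the path to the parent div narrows it to one element.
scoped = root.findall(".//div[@class='register--login-email']/input")

print(len(broad))   # 2
print(len(scoped))  # 1
```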

How do I retrieve the link of an image through Selenium

I'm trying to make my program fetch the link of an image and then store it as a string in a variable.
This is the XPath of the image. I need to do it through XPaths because the XPaths on the website are very similar apart from the "/article[x]" part. This allows me to increase the number with a variable so that I can go through all the XPaths on the page.
/html/body/div[2]/div[2]/div[3]/div[2]/div[2]/div[1]/div/article[1]/div[2]/div[1]/a/img
My code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import tkinter
import time
Anime = input("Enter Anime:")
driver = webdriver.Chrome(executable_path=r"C:\Users\amete\Documents\chromedriver.exe")
driver.get("https://myanimelist.net/search/all?q=one%20piece&cat=all")
search = driver.find_element_by_xpath('//input[@name="q"]')
wait = WebDriverWait(driver, 20)
wait.until(EC.element_to_be_clickable((By.XPATH, '//input[@name="q"]')))
# Clears the field
search.send_keys(Keys.CONTROL, 'a')
search.send_keys(Keys.DELETE)
# The field is now cleared and the program can type whatever it wants
search.send_keys(Anime)
search.send_keys(Keys.RETURN)
# Accept the cookies
wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="qc-cmp2-ui"]/div[2]/div/button[3]'))).click()
# Added this wait
wait.until(EC.element_to_be_clickable((By.XPATH,'//h2[@id="anime"]//ancestor::div[@class="content-left"]//article[1]/div[contains(@class, "list")][1]/div[contains(@class, "information")]/a[1]')))
link = driver.find_element_by_xpath('//h2[@id="anime"]//ancestor::div[@class="content-left"]//article[1]/div[contains(@class, "list")][1]/div[contains(@class, "information")]/a[1]').text
piclink = driver.('/html/body/div[2]/div[2]/div[3]/div[2]/div[2]/div[1]/div/article[1]/div[2]/div[1]/a/img')
print (piclink)

You can get it like this (specify the attribute):
piclink = driver.find_element_by_xpath('/html/body/div[2]/div[2]/div[3]/div[2]/div[2]/div[1]/div/article[1]/div[2]/div[1]/a/img').get_attribute('src')
print(piclink)
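
To go through every article by changing the number in the XPath, as the question describes, a loop along these lines might work. The names ARTICLE_IMG_XPATH, image_xpath, and collect_image_links are made up for illustration. The sketch passes the plain string "xpath" (the value of By.XPATH) so the block has no imports, and it assumes a driver that provides find_elements(by, value).

```python
# The question's XPath with the article index turned into a placeholder.
ARTICLE_IMG_XPATH = (
    "/html/body/div[2]/div[2]/div[3]/div[2]/div[2]/div[1]/div"
    "/article[{n}]/div[2]/div[1]/a/img"
)


def image_xpath(n):
    """Build the XPath for the image inside the n-th article (1-based)."""
    return ARTICLE_IMG_XPATH.format(n=n)


def collect_image_links(driver, count):
    """Return the src of the images in articles 1..count, skipping missing ones."""
    links = []
    for n in range(1, count + 1):
        # find_elements returns an empty list when nothing matches,
        # so a missing article does not raise an exception.
        for img in driver.find_elements("xpath", image_xpath(n)):
            links.append(img.get_attribute("src"))
    return links
```

Each entry in the returned list is a plain string, so it can be stored in a variable or written out directly.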

how to click the login button using selenium xpath

I am trying to log in to a website using Python so that I can get some text from it.
Here is my code. There is always an error at the end, after the ID and password lines.
import os
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Chrome(executable_path=r"C:\chromedriver\chromedriver.exe")
driver.get('https://www.saramin.co.kr/zf_user/auth')
driver.implicitly_wait(3)
driver.find_element_by_name('id').send_keys('<<my_id>>')
driver.find_element_by_name('password').send_keys('<<my_password>>')
driver.find_element_by_xpath('//*[@id="frmNIDLogin"]/fieldset/input').click()

Eventually I figured it out! Thanks for your answer though.
Here is the final code:
driver = webdriver.Chrome(executable_path=r"C:\chromedriver\chromedriver.exe")
driver.get('https://www.saramin.co.kr/zf_user/auth')
driver.implicitly_wait(3)
driver.find_element_by_name('id').send_keys('ID')
driver.find_element_by_name('password').send_keys('PW')
driver.find_element_by_xpath('//*[@class="btn-login"]').click()

The XPath is incorrect; it doesn't match anything in the page. Try
driver.find_element_by_xpath('//form[@id="login_frm"]//button[@class="btn-login"]').click()
or simply call submit() on the <form>:
form = driver.find_element_by_id('login_frm')
form.submit()

In the first case you used the id ('//*[@id="frmNIDLogin"]') to click the button. Because the id changes every time the page loads, it raised an error. In the second case you used the class ('//*[@class="btn-login"]'), which worked because the class stays the same on every page load. Also, as mentioned above, the id value in the first case was wrong.
