Staying logged in with Selenium in Python

I am trying to log in to a website with Selenium and then, while remaining logged in, navigate to a different page on the site. However, when I navigate to the other page, I find I have been logged out.
I believe this is because I do not understand exactly how the webdriver.Firefox().get() function works.
My code:
from selenium import webdriver
from Code.Other import XMLParser
#Initialise driver and go to webpage
driver = webdriver.Firefox()
URL = 'http://www.website.com'
driver.get(URL)
#Login
UserName = XMLParser.XMLParse('./Config.xml','UserName')
Password = XMLParser.XMLParse('./Config.xml','Password')
element = driver.find_elements_by_id('UserName')
element[0].send_keys(UserName)
element = driver.find_elements_by_id('Password')
element[0].send_keys(Password)
element = driver.find_elements_by_id('Submit')
element[0].click()
#Go to new page
URL = 'http://www.website.com/page1'
driver.get(URL)
Unfortunately I am taken to the new page, but I am no longer logged in. How do I fix this?

It looks like the website doesn't have enough time to react to your submission of the authorization form: you click Submit but open another URL without waiting for the response.
Wait for some event that follows the login (a cookie being set, a change in the DOM, or even just time.sleep) and only then go to the other page.
P.S.: if that doesn't help, inspect your cookies after logging in and again after opening the new URL; the problem may be with the authorization backend or the webdriver.
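As a minimal sketch of the "wait for some event after login" idea, here is a stdlib-only polling helper; the 'sessionid' cookie name in the usage comment is only an assumption — use whatever cookie your site actually sets after login:

```python
import time

def wait_for(condition, timeout=10.0, poll=0.5):
    """Poll a zero-argument callable until it returns a truthy value or
    `timeout` seconds elapse; return that value, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise TimeoutError(f"condition not met within {timeout}s")

# Hypothetical usage after clicking Submit (assumes `driver` is the
# WebDriver and 'sessionid' is the cookie set on successful login):
#   wait_for(lambda: driver.get_cookie('sessionid'))
#   driver.get('http://www.website.com/page1')
```

Selenium's own WebDriverWait does the same kind of polling with richer expected conditions; this helper just makes the mechanism explicit.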

Related

Issue in web scraping using Selenium and driver.get()

I am trying to scrape this URL, but the URL I pass to driver.get() changes when the program runs and the Chrome page is opened.
What could be causing it to change?
I want to open this link and extract specific things, but the URL changes and an error is raised because this class does not exist on the changed URL.
Here's my code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

s = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=s)
driver.get("https://www.booking.com/hotel/pk/one-bahawalpur.html?aid=378266&label=bdot-Os1%2AaFx2GVFdW3rxGd0MYQS541115605091%3Apl%3Ata%3Ap1%3Ap22%2C563%2C000%3Aac%3Aap%3Aneg%3Afi%3Atikwd-334108349%3Alp1011080%3Ali%3Adec%3Adm%3Appccp%3DUmFuZG9tSVYkc2RlIyh9YYriJK-Ikd_dLBPOo0BdMww&sid=0a2d2e37ba101e6b9547da95c4a30c48&all_sr_blocks=41645002_248999805_0_1_0;checkin=2022-11-10;checkout=2022-11-11;dest_id=-2755460;dest_type=city;dist=0;group_adults=2;group_children=0;hapos=1;highlighted_blocks=41645002_248999805_0_1_0;hpos=1;matching_block_id=41645002_248999805_0_1_0;no_rooms=1;req_adults=2;req_children=0;room1=A%2CA;sb_price_type=total;sr_order=popularity;sr_pri_blocks=41645002_248999805_0_1_0__1120000;srepoch=1668070223;srpvid=e9da3e27383900b4;type=total;ucfs=1&#hotelTmpl")
print(driver.find_element(by=By.CLASS_NAME, value="d2fee87262"))
The url after chrome page is opened is as follows:
https://www.booking.com/searchresults.html?aid=378266&label=bdot-Os1%2AaFx2GVFdW3rxGd0MYQS541115605091%3Apl%3Ata%3Ap1%3Ap22%2C563%2C000%3Aac%3Aap%3Aneg%3Afi%3Atikwd-334108349%3Alp1011080%3Ali%3Adec%3Adm%3Appccp%3DUmFuZG9tSVYkc2RlIyh9YYriJK-Ikd_dLBPOo0BdMww&sid=63b32ef8c1d53ae0613d71baf62c3e56&checkin=2022-11-10&checkout=2022-11-11&city=-2755460&group_adults=2&group_children=0&highlighted_hotels=416450&hlrd=with_av&keep_landing=1&no_rooms=1&redirected=1&source=hotel&srpvid=e9da3e27383900b4&room1=A,A,;#hotelTmpl
The URL is changed by the site; you cannot change that behavior. It redirects users with new sessions to the search results instead of the hotel's page.
As a workaround, I can suggest clicking the hotel name in the search results:
driver.find_element(By.XPATH, '//div[text()="Hotel One Bahawalpur"]').click()
Put this line between the line that opens the page and the line that finds the element.
I also recommend adding a line driver.implicitly_wait(10) after the line driver = webdriver.Chrome(service=s). This improves your scraper's stability, as it makes Selenium wait for elements to appear on the page.
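The exact-text locator used in the workaround above can be built generically. Here is a small, purely illustrative helper for composing such XPath expressions (the tag and hotel name in the usage comment are assumptions from this thread):

```python
def text_xpath(tag: str, text: str) -> str:
    """Build an XPath that matches a `tag` element by its exact text.
    XPath 1.0 has no string escapes, so embedded double quotes are
    handled by splitting the text and rejoining it with concat()."""
    if '"' not in text:
        return f'//{tag}[text()="{text}"]'
    parts = text.split('"')
    pieces = ", '\"', ".join(f'"{p}"' for p in parts)
    return f'//{tag}[text()=concat({pieces})]'

# Hypothetical usage with Selenium:
#   driver.find_element(By.XPATH, text_xpath('div', 'Hotel One Bahawalpur')).click()
```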

How can I access the same website twice without losing the settings, using Selenium?

I access a website and log in; then, instead of going through the process of finding the site's search field and typing into it, I thought I would simply re-access the website through a URL that carries the search query I want.
The problem is that when I access the website with the second "driver.get" (the last line of the code below), it behaves as though it has forgotten that I logged in previously, as though it were a totally new session.
I have this code structure:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
path = Service("C://chromedriver.exe")
driver = webdriver.Chrome(service=path)
driver.get('https://testwebsite.com/')
login_email_button = driver.find_element(By.XPATH,'XXXXX')
login_email_button.click()
username = driver.find_element(By.ID, 'email')
password = driver.find_element(By.ID, 'password')
username.send_keys('myuser')
password.send_keys('mypassword')
driver.get('https://testwebsite.com/search?=televisions')
when you do
driver.get('https://testwebsite.com/search?=televisions')
you're opening a new session with no cookies or data from the previous session. To stay logged in, you can try duplicating the tab instead, using driver.execute_script:
url = driver.current_url
driver.execute_script(f"window.open('{url}');")
driver.switch_to.window(driver.window_handles[1])
# if you want to give the tab a name, pass it as the second parameter:
driver.execute_script(f"window.open('{url}', 'second_tab_name');")
driver.switch_to.window('second_tab_name')
Remember to switch back if you want to return to the main tab.

Python webscraping Selenium fails to navigate page and get data after login

I am using Selenium to automate a few checkboxes inside an internal web portal. I can successfully log in to the page with credentials using Selenium, but if I then try to navigate to the required page to get its content, even the login stops happening: the browser stays on the login page without ever logging in to the portal.
Here is the approach which I used:
driver.get('<myLoginURL>')
# maximize window
driver.maximize_window()
# wait for username input, scroll to it, enter username
username = WebDriverWait(driver, DELAY).until(EC.presence_of_element_located((By.ID, "inputUsername")))
driver.execute_script("arguments[0].scrollIntoView();", username)
username.send_keys("<userID>")
...... similarly with password.....And then submit...
submit = WebDriverWait(driver, DELAY).until(EC.element_to_be_clickable((By.CLASS_NAME, "btn")))
driver.execute_script("arguments[0].scrollIntoView();", submit)
submit.click()
WebDriverWait(driver,DELAY)
driver.get("<URL_To be Navigated to>")
# WebDriverWait(driver,DELAY).until(driver.get(desired_URL))
url = driver.current_url
driver.get(url)
source = driver.page_source
Problems that occur:
The page does not log in with the credentials provided.
The scraped content contains only login-page data (i.e., enter username, enter password).
Can someone suggest a good tutorial on how to use Selenium effectively to:
scrape data by tags (currently I am very confused about how to use 'EC.presence_of_element_located' and 'find_element_by_id');
reach the page I want to scrape after logging in;
extract data from that page?
Can anyone point out the mistake I am making while navigating to the page and getting the data?
Many thanks in advance
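On the EC.presence_of_element_located confusion mentioned above: an expected condition is just a callable that WebDriverWait polls with the driver until it returns something truthy, whereas find_element looks exactly once. A stdlib-only sketch of that pattern (the FakeDriver below is purely illustrative, standing in for a page whose element appears only after a login redirect):

```python
def presence_of(locator):
    """Mini re-implementation of the EC.presence_of_element_located idea:
    return a predicate that yields the element once it exists, else False."""
    def predicate(driver):
        found = driver.find_elements(*locator)
        return found[0] if found else False
    return predicate

def wait_until(driver, predicate, attempts=10):
    """Crude WebDriverWait stand-in: poll `predicate(driver)` a fixed
    number of times and return its first truthy result."""
    for _ in range(attempts):
        result = predicate(driver)
        if result:
            return result
    raise TimeoutError('element never appeared')

# Illustrative stand-in for a WebDriver: the element "appears" on the
# third poll, as it might after a slow login redirect.
class FakeDriver:
    def __init__(self):
        self.polls = 0
    def find_elements(self, by, value):
        self.polls += 1
        return ['<element>'] if self.polls >= 3 else []

element = wait_until(FakeDriver(), presence_of(('id', 'inputUsername')))
```

With the real library, WebDriverWait(driver, DELAY).until(...) plays the role of wait_until here; a bare WebDriverWait(driver, DELAY) without .until does nothing.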

Python and seleniumrequests getting request headers

I need to get a cookie from a specific request. The problem is that it is generated out of my sight, so I need Selenium to simulate opening the browser so I can generate it myself. The second problem is that I can't access the request cookie: the cookie I need is in the request, not the response.
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary('/usr/bin/firefox')
driver = webdriver.Firefox(firefox_binary=binary)
driver.get('http://www.princess.com/find/searchResults.do')
driver.find_elements_by_xpath('//*[@id="LSV010"]/div[3]/div[1]/div[1]/button')[0].click()
This code block opens the page and, on the second result, clicks the "View all dates and pricing" link. The cookie is sent there, but by the browser, not as a response. I need to get my hands on that cookie. Other libraries are fine if they can do the job.
If you go to the page manually, this is the thing I need:
I have selected the request and the cookie I need, and as shown it is in the request, not the response. Is this possible to achieve?
I found out how this is done. Using the Selenium library, I managed to get this working:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

def fix_header():
    browser = webdriver.Firefox(executable_path='geckodriver.exe', firefox_profile=profile)
    browser.get('https://www.princess.com/find/searchResults.do')
    browser.find_element_by_xpath(
        "//*[@class='expand-table view-all-link util-link plain-text-btn gotham-bold']")
    WebDriverWait(browser, 60).until(EC.visibility_of_any_elements_located(
        (By.XPATH, "//*[@class='expand-table view-all-link util-link plain-text-btn gotham-bold']")))
    try:
        browser.find_element_by_xpath(
            "//*[@class='expand-table view-all-link util-link plain-text-btn gotham-bold']").click()
    except Exception:
        browser.find_element_by_class_name('mfp-close').click()
        browser.find_element_by_xpath(
            "//*[@class='expand-table view-all-link util-link plain-text-btn gotham-bold']").click()
    cookies = browser.get_cookies()
    browser.close()
    chrome_cookie = ''
    for c in cookies:
        chrome_cookie += c['name']
        chrome_cookie += '='
        chrome_cookie += c['value']
        chrome_cookie += '; '
    return chrome_cookie[:-2]
Selenium actually goes to the page, "clicks" the URL I need with a real browser, and collects the needed cookies.
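The header-building loop at the end of fix_header can be written more idiomatically with a join. A small sketch; the requests call in the comment is a hypothetical illustration of reusing the cookies outside the browser:

```python
def cookie_header(cookies):
    """Collapse the list of dicts returned by get_cookies() into a single
    Cookie header value, e.g. 'name1=value1; name2=value2'."""
    return '; '.join(f"{c['name']}={c['value']}" for c in cookies)

# Hypothetical follow-up: replay the browser's cookies in plain requests.
#   import requests
#   headers = {'Cookie': cookie_header(cookies)}
#   requests.get('https://www.princess.com/find/searchResults.do', headers=headers)
```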

How to access elements of page after login selenium?

I can successfully log in using the Selenium web driver, but I don't know how to access the frame on the next page. I have tried switching to the new frame, but Selenium does not find that element, I think because it is looking through the elements of the login page instead of the page that follows. This is the URL after the login is successful: https://homeaccess.katyisd.org/HomeAccess/Classes/Classwork
Path of frame on page AFTER login page
browser = webdriver.Chrome(executable_path = path_to_chromedriver)
url = "https://homeaccess.katyisd.org/HomeAccess/Account/LogOn?ReturnUrl=%2fHomeAccess"
browser.get(url)
browser.find_element_by_id('LogOnDetails_UserName')
browser.find_element_by_id('LogOnDetails_Password')
browser.find_element_by_id('LogOnDetails_UserName').clear()
browser.find_element_by_id('LogOnDetails_Password').clear()
browser.find_element_by_id('LogOnDetails_UserName').send_keys('******')
browser.find_element_by_id('LogOnDetails_Password').send_keys('******')
This is all on the page after the login:
frame = browser.find_element_by_xpath('//*[@id="sg-legacy-iframe"]')  # raises "no such element"
browser.switch_to_frame(frame)
browser.find_element_by_xpath('//*[@id="SignInSectionContainer"]/div[2]/button').click()
Try to wait for frame to appear:
frame = WebDriverWait(browser, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, 'sg-legacy-iframe')))
After clicking Log In, please wait some time for the page to load.
You can use WebDriverWait to wait for the required element.
If that fails, in Java one would generally fall back to Thread.sleep(5000);
As per the provided HTML code, if id fails you can try switching to the frame by other locators:
driver.switchTo().frame("sg-legacy-iframe"); // by id
driver.switchTo().frame(driver.findElement(By.xpath("//div[@id='MainContent']/iframe"))); // by XPath
Thank you,
Murali
