How to open an existing instance of Opera in Selenium with Python

I'm trying to use BeautifulSoup on a site that assigns a unique identifier to the session when a user logs in (e.g. d83c231c-a1a0-4b9b-85bb-7cc36aa6b66c) as well as a "serial number": serial:"ABzepITkLfRuQ+4hCMTXfA.
Upon examination via WebDriver, it seems that I receive a newly generated GUID and serial number every time I log in to the site through WebDriver. Is there a way to open the same instance of a browser programmatically so that the site sees the same GUID each time?
This is what I am using now to open the browser:
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome import service
from selenium.webdriver.support.ui import WebDriverWait
import pprint
webdriver_service = service.Service('operadriver')
webdriver_service.start()
options = webdriver.ChromeOptions()
options.binary_location = r"C:\Program Files (x86)\Opera\launcher.exe"
driver = webdriver.Opera(opera_options=options) # success!
driver.get('http://somesite.com')
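One way to make the site see the same session each run is either to point WebDriver at a persistent profile directory, or to attach to a browser you started yourself with a remote-debugging port (operadriver speaks the Chrome protocol, so the `debuggerAddress` mechanism applies). This is only a sketch; the helper names, port, and profile path are assumptions, not part of the original code:

```python
# Two ways to keep the same browser identity across runs (names hypothetical):
# 1) reuse one profile directory, so cookies and session state survive;
# 2) attach to an Opera you started manually with a debugging port, e.g.:
#      launcher.exe --remote-debugging-port=9222 --user-data-dir=C:\opera-profile
# With selenium installed, these would be applied as:
#   for arg in persistent_profile_args(r"C:\opera-profile"):
#       options.add_argument(arg)
# or:
#   options.add_experimental_option("debuggerAddress", attach_target())

def persistent_profile_args(profile_dir):
    """Arguments for options.add_argument(...): reuse one profile each run."""
    return [f"--user-data-dir={profile_dir}"]

def attach_target(host="127.0.0.1", port=9222):
    """Format the debuggerAddress value chromedriver/operadriver expects."""
    return f"{host}:{port}"

print(persistent_profile_args(r"C:\opera-profile"))
print(attach_target())
```

With the second approach the browser keeps running between script invocations, so the site never sees a fresh session at all.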

Related

Is it possible to maintain login session in selenium-python?

I use Selenium with the following method:
open Chrome using chromedriver with Selenium
log in manually
get the information from the web page
However, after doing this, Selenium seems to get the HTML as if I were not logged in.
Is there a solution?
Try this code:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options
options = Options()
# path of the chrome's profile parent directory - change this path as per your system
options.add_argument(r"user-data-dir=C:\Users\User\AppData\Local\Google\Chrome\User Data")
# name of the directory - change this directory name as per your system
options.add_argument("--profile-directory=Default")
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
You can find the Chrome profile directory by navigating to chrome://version/ in the Chrome browser.
Add your login code after the code block above; from the second execution onwards you will see that the account is already logged in.
Before running the code, close all Chrome browser windows.
Instead of storing and maintaining the login session, another easy approach is to use the pickle library to store the cookies after login.
As an example, to store the cookies from Instagram after logging in and then reuse them, you can use the following solution:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pickle

# first login
driver = webdriver.Chrome()
driver.get('http://www.instagram.org')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='username']"))).send_keys("_SelmanFarukYılmaz_")
driver.find_element(By.CSS_SELECTOR, "input[name='password']").send_keys("Selman_Faruk_Yılmaz")
driver.find_element(By.CSS_SELECTOR, "button[type='submit'] div").click()
pickle.dump(driver.get_cookies(), open("cookies.pkl", "wb"))
driver.quit()

# future logins
driver = webdriver.Chrome()
driver.get('http://www.instagram.org')
cookies = pickle.load(open("cookies.pkl", "rb"))
for cookie in cookies:
    driver.add_cookie(cookie)
driver.get('http://www.instagram.org')
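One wrinkle with the pickle approach: cookies saved in an earlier run may have expired by the time they are re-added, and `add_cookie` can fail on them. A small filter (a sketch; the helper name is mine, not from the answer) makes the replay step more robust:

```python
import time

def fresh_cookies(cookies, now=None):
    """Drop cookies whose 'expiry' timestamp has already passed.

    Cookies from driver.get_cookies() are plain dicts; persistent cookies
    carry an 'expiry' key (a Unix timestamp), session cookies do not.
    """
    now = time.time() if now is None else now
    return [c for c in cookies if c.get("expiry", now + 1) > now]

cookies = [
    {"name": "sessionid", "value": "abc", "expiry": 2_000_000_000},
    {"name": "old", "value": "x", "expiry": 1},          # long expired
    {"name": "no_expiry", "value": "y"},                 # session cookie
]
print([c["name"] for c in fresh_cookies(cookies, now=1_700_000_000)])
# → ['sessionid', 'no_expiry']
```

In the replay loop above, `for cookie in fresh_cookies(cookies): driver.add_cookie(cookie)` would then skip the stale entries instead of raising.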

Specific web page doesn't load its HTML and CSS (empty page) with Selenium

I started working with Selenium. It works for every website I tried except one (myvisit.com), which doesn't load the page.
Chrome opens but the page is empty. I tried adding a number of delays, but it still doesn't load.
When I go to the website in regular Chrome (without Selenium), it loads everything.
Here is my simple code; I'm not sure how to continue from here:
import os
import random
import time
# selenium libraries
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import ChromiumOptions

def delay():
    time.sleep(random.randint(2, 3))

driver = webdriver.Chrome(os.getcwd() + "\\webdriver\\chromedriver.exe")
driver.get("https://myvisit.com")
delay()
delay()
delay()
delay()
I also tried to use ChromiumOptions with flags like --no-sandbox but it didn't help:
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--disable-blink-features=AutomationControlled')
driver = webdriver.Chrome(os.getcwd()+"\\webdriver\\chromedriver.exe",options=options)
Simply add the --disable-blink-features=AutomationControlled argument to stop the site from detecting that the browser is automated.
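Detection-related options are often combined. Only --disable-blink-features=AutomationControlled comes from this answer (and --no-sandbox from the question); the rest below are common additions listed as assumptions, and they may or may not help on a given site:

```python
# Arguments and experimental options commonly combined to reduce automation
# detection. With selenium installed, they would be applied as:
#   for arg in STEALTH_ARGUMENTS:
#       options.add_argument(arg)
#   options.add_experimental_option("excludeSwitches", ["enable-automation"])
#   options.add_experimental_option("useAutomationExtension", False)

STEALTH_ARGUMENTS = [
    "--disable-blink-features=AutomationControlled",  # from the answer above
    "--no-sandbox",                                   # from the question
    "--disable-dev-shm-usage",
]

for arg in STEALTH_ARGUMENTS:
    print(arg)
```

The first flag is usually the decisive one: it stops Blink from exposing `navigator.webdriver = true`, which is the simplest check sites run.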

Download PDF file from .cfm URL

I'm trying to download a PDF file from this address: https://aisweb.decea.mil.br/inc/notam/gerar-boletim/reports/report-notam.cfm
I wrote some code that first fills out some information on this page (correctly) https://aisweb.decea.mil.br/?i=notam and then clicks a button that opens a new tab with the generated PDF file. The problem is that when it tries to save the PDF file at the end, it downloads directly from the .cfm address, resulting in an empty PDF template (you can see this by clicking the first link).
How can I download the PDF that is currently being shown to me on the page, instead of accessing the first URL directly?
This is my code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver import ActionChains
from selenium.webdriver.common.print_page_options import PrintOptions
from urllib import request
from bs4 import BeautifulSoup
import re
import os
import urllib
import time
import requests
from urllib.parse import urljoin
aerodromos = "SBNT,SBJP,SBFZ,SBRF" #TEST
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(r'C:\Windows\chromedriver.exe', options=options)
driver.get("https://aisweb.decea.mil.br/?i=notam")
driver.maximize_window()
caixaTexto = driver.find_element(By.XPATH, '//*[@id="icaocode"]')
caixaTexto.send_keys(aerodromos)
botao = driver.find_element(By.XPATH, '//*[@id="a"]/form/div/div[3]/div/input[2]')
botao.click()
botao = driver.find_element(By.XPATH, '//*[@id="select-all"]')
botao.click()
botao = driver.find_element(By.XPATH, '/html/body/div/div/div/div/div[2]/div/div/form/input[3]')
botao.click()
response = urllib.request.urlretrieve('https://aisweb.decea.mil.br/inc/notam/gerar-boletim/reports/report-notam.cfm', filename='relatorio1.pdf')
I did it! When I tried changing the settings in Chrome to download PDFs instead of opening them, it made no difference, but I ended up finding a solution while searching for another way to do it:
Unable to access the modal elements to download pdf with selenium
I changed the Chrome experimental options profile in my code and it worked! Now it opens the tab, immediately downloads the file, and closes the tab!
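The "experimental options profile" change amounts to setting Chrome preferences so PDFs are downloaded immediately instead of rendered in the built-in viewer. A sketch of that preferences dict (the helper name and download directory are illustrative assumptions; the preference keys are standard Chrome prefs):

```python
# Chrome preferences that make a PDF opened in a tab download straight to disk
# instead of rendering in the viewer. With selenium installed, this would be
# applied before creating the driver:
#   options.add_experimental_option("prefs", pdf_download_prefs(r"C:\downloads"))

def pdf_download_prefs(download_dir):
    return {
        "download.default_directory": download_dir,   # where files land
        "download.prompt_for_download": False,        # no save dialog
        "plugins.always_open_pdf_externally": True,   # skip the PDF viewer
    }

prefs = pdf_download_prefs(r"C:\Users\me\Downloads")
print(prefs["plugins.always_open_pdf_externally"])  # → True
```

This explains the symptom in the question: fetching the .cfm URL with urllib happens outside the browser session, so the server returns an empty template; letting the already-authenticated tab download the file keeps the session that generated the report.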

Selenium chrome remote webdriver cookies

I've looked over numerous questions and the Selenium docs explaining how to force Chrome to honor cookie data from pre-existing profiles. I've set Chrome to allow all cookies, and I tried to use Python pickling to persist cookie data across sessions. Still, I'm getting a guest or Profile 1 session instead of the signed-in session I'm looking for. Note that I'm using an implementation that must use Selenium Remote, as I need it to work over a server/client relationship rather than locally. In my old environment, Firefox profiles were much easier to implement. I'm on a Linux/Ubuntu (Jammy) setup with the latest Chrome.
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import numpy as np
import pandas as pd
import pickle
import re
fp = webdriver.ChromeOptions()
fp.add_argument("home/dharkko/.config/google-chrome/default")
fp.add_argument('--headless')
fp.add_argument("--profile-directory=Profile 1")
#fp.add_cookie({"name":"hash_name","value":"hash_value"})
#fp.add_argument("user-data-dir=selenium")
driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    options=fp
)
pickle.dump(driver.get_cookies(), open("/home/dharkko/.config/google-chrome/Default/cookies.pkl", "wb"))
cookies = pickle.load(open("/home/dharkko/.config/google-chrome/Default/cookies.pkl", "rb"))
for cookie in cookies:
    driver.add_cookie(cookie)
As you are using the default Chrome profile, you need to pass the user-data-dir parameter as follows:
fp.add_argument("user-data-dir=/path/to/Google/Chrome/User Data")
Additionally, in case google-chrome is installed in the default location, you can remove the following line:
fp.add_argument("home/dharkko/.config/google-chrome/default")
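The fix above can be sketched as a small helper that splits a full profile path into the two pieces Chrome expects: user-data-dir (the parent directory) and --profile-directory (the leaf name). The helper name is hypothetical, introduced only for illustration:

```python
import os

def profile_arguments(profile_path):
    """Split a full Chrome profile path into the two arguments Selenium needs.

    Chrome wants the *parent* directory as user-data-dir and only the leaf
    name (e.g. "Default" or "Profile 1") as --profile-directory; passing the
    full path as one argument is a common mistake.
    """
    parent, leaf = os.path.split(profile_path.rstrip("/"))
    return [f"user-data-dir={parent}", f"--profile-directory={leaf}"]

args = profile_arguments("/home/dharkko/.config/google-chrome/Profile 1")
print(args)
# → ['user-data-dir=/home/dharkko/.config/google-chrome', '--profile-directory=Profile 1']
```

Each returned string would then be passed to `fp.add_argument(...)` before creating the Remote driver.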

How do I import cookies from any browser (preferably IE) to selenium Chrome driver?

I want to add my cookies for a particular website from IE to the Selenium browser by importing them.
Here are the imports I currently have, if they are of any help to you:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time
if __name__ == '__main__':
Thanks in advance
You can save the current cookies as a Python object using pickle. For example:
import pickle
import selenium.webdriver
driver = selenium.webdriver.Firefox()
driver.get("http://www.google.com")
pickle.dump(driver.get_cookies(), open("cookies.pkl", "wb"))
And later to add them back:
import pickle
import selenium.webdriver
driver = selenium.webdriver.Firefox()
driver.get("http://www.google.com")
cookies = pickle.load(open("cookies.pkl", "rb"))
for cookie in cookies:
    driver.add_cookie(cookie)
Edit: Be careful when you pickle things. This is a great way to have a deserialization vulnerability introduced into your application.
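Since the dicts returned by `get_cookies()` contain only JSON-serializable values, one way to avoid that pickle risk entirely is to store them as JSON: loading a JSON file can never execute code. A sketch of the drop-in replacement:

```python
# Store Selenium cookies as JSON instead of pickle. json.load cannot run
# arbitrary code on deserialization, unlike pickle.load.
import json
import os
import tempfile

def save_cookies(cookies, path):
    """Write the list of cookie dicts (from driver.get_cookies()) to disk."""
    with open(path, "w") as f:
        json.dump(cookies, f)

def load_cookies(path):
    """Read cookies back; feed each dict to driver.add_cookie(...)."""
    with open(path) as f:
        return json.load(f)

path = os.path.join(tempfile.gettempdir(), "cookies.json")
save_cookies([{"name": "sid", "value": "abc123"}], path)
print(load_cookies(path))  # → [{'name': 'sid', 'value': 'abc123'}]
```

The earlier examples only need `pickle.dump`/`pickle.load` swapped for these two calls; everything else stays the same.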
