How to automatically accept cookies for a website with Selenium in Python

I am using Selenium to automate some tests on many websites. Every time, I get the cookie-wall popup.
I know I can search for the XPath of the accept-cookies button and then click on it with Selenium. This solution is not convenient for me because I need to search for the button manually. I want a script that accepts cookies for all sites automatically.
What I tried to do is get a cookie jar by making a request to the website with Python requests and then set the cookies in Selenium, but that did not work and produced many errors.
I found this on Stack Overflow:
fp = webdriver.FirefoxProfile()
fp.set_preference("network.cookie.cookieBehavior", 2)  # 2 = block all cookies
driver = webdriver.Firefox(firefox_profile=fp, executable_path="./geckodriver")
This worked for google.com (no accept-cookies popup appeared), but it failed with facebook.com and instagram.com.
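One generic approach (not from the question, just a common heuristic) is to search each page for a visible button whose text matches a common consent phrase and click the first match. A minimal sketch, assuming an illustrative phrase list; real consent banners often live inside iframes and would need an extra switch_to.frame() step:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Illustrative phrases; extend this list for the sites you test
CONSENT_PHRASES = ["Accept all", "Accept", "I agree", "Allow all"]

def accept_cookies(driver):
    for phrase in CONSENT_PHRASES:
        # Look for any button whose text contains the phrase
        for button in driver.find_elements(By.XPATH, f"//button[contains(., '{phrase}')]"):
            if button.is_displayed():
                button.click()
                return True
    return False

driver = webdriver.Firefox()
driver.get("https://www.google.com")
accept_cookies(driver)

This is best-effort only: sites like facebook.com render the banner differently (or inside an iframe), which is why no single script works everywhere.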

Related

How can I get cookie request headers for web scraping

I am trying to scrape a site that is blocking the scraper based on cookies. When I go incognito, open devtools, and copy the request-header cookies from the Network tab, the scraping works until the cookies get blocked. Using undetected-chromedriver I can't access the site from Python, which is why I am having to input the cookies manually. I've tried all the recommended options for additional settings and headers to get undetected-chromedriver to work, but it will not.
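As an illustration of the manual-cookie workflow described above, the string copied from devtools can be sent with requests either verbatim as a Cookie header or parsed into a dict; the URL and cookie values here are placeholders:

import requests

# Cookie string copied from the Network tab (placeholder values)
raw_cookie = "sessionid=abc123; csrftoken=xyz789"
url = "https://example.com/page"

# Option 1: send the string verbatim as a header
r = requests.get(url, headers={"Cookie": raw_cookie})

# Option 2: parse it into a dict and let requests build the header
cookies = dict(pair.split("=", 1) for pair in raw_cookie.split("; "))
r = requests.get(url, cookies=cookies)
print(r.status_code)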

How to navigate using selenium Webdriver without being logged out?

I have managed to log into a website using webdriver. Now that I am logged in, I would like to navigate to a new URL on the same site using driver.get(). However, often (though not every time) in doing so I am logged out of the website. I have tried to re-add the cookies after navigating to the new URL, but I still get the same problem. I am unsure whether this method should work or whether I am doing it correctly.
cookies = driver.get_cookies()  # capture cookies from the logged-in page
driver.get(link)
timer(time_limit)               # user-defined wait helper
for i in cookies:
    driver.add_cookie(i)
How can I navigate to a different part of the website (without clicking links on the screen) whilst maintaining my log-in session?
I just had to refresh the page after adding the cookies: driver.refresh()
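Putting that together, a minimal sketch of the working order of operations, reusing the driver, link, and timer names from the question:

def navigate_keeping_session(driver, link):
    cookies = driver.get_cookies()  # capture the logged-in session cookies
    driver.get(link)                # navigate; the session may drop here
    for cookie in cookies:
        driver.add_cookie(cookie)   # restore each cookie on the new page
    driver.refresh()                # reload so the site sees the cookies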

Selenium doesn't keep cache valid

I'm working on Python software that uses Selenium. The problem is that I want my script to save cookies after logging in. I save cookies using both the pickle module and the argument below:
opts.add_argument("user-data-dir=cachedD")
But when I quit the browser, launch it again, and go to the same URL where it left off, the website redirects to the login page again. The website runs Moodle, and I guess its cookies expire when the browser quits. How can I save the cookies and continue where I left off? I should add that there is at most a 15-second gap between the two launches.
You're potentially not using the flag correctly.
With this flag you specify a folder path. If you review this page:
--user-data-dir
Directory where the browser stores the user profile.
That link may not look right, but the Chromium page says that's the right list.
Historically, I've had success with:
.add_argument(r"user-data-dir=C:\Temp")
If that is still not working as you expect, there are a few other things you can look at.
Review this page as well: cookies can be deleted when you close your browser (Chrome's "Clear cookies and site data when you close all windows" setting). You'll want to verify the value of that option.
Another check is to open your ChromeDriver session via Selenium and go to chrome://version/. From there you can review what you're running, and you'll see there are a lot more flags enabled by default. Check that these match how you want your browser to behave.
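As a concrete version of that check (the path and file names are examples): launch Chrome with an absolute user-data-dir and open chrome://version/ to confirm the "Profile Path" is the directory you intended. A bare name like "cachedD" resolves against the current working directory, which can silently change between runs.

import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

opts = Options()
# Absolute path so the profile lands in the same place every run
opts.add_argument(f"user-data-dir={os.path.abspath('cachedD')}")

driver = webdriver.Chrome(options=opts)
driver.get("chrome://version/")             # "Profile Path" shown here should match
driver.save_screenshot("version_check.png")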

Selenium - use session from current Chrome instance

I'm trying to automate some form filling for a web app. Users have to login to the application and then start filling up pages of forms. I have the following Python script using Selenium that can open a window to my application:
from selenium import webdriver

driver = webdriver.Chrome("C:\\Python\\Selenium\\Chrome\\chromedriver.exe")
# driver.add_cookie() needs a cookie dict, e.g. {"name": ..., "value": ...}
driver.set_page_load_timeout(30)
driver.get("myurl/formpages")
driver.maximize_window()
driver.quit()
However, when Selenium starts the Chrome window, the user is not logged in. I want to bypass the need to log in every time.
On another Chrome window, I am already logged in as my test user. So whenever I go to the url on my Chrome window, I am already logged in and don't have to log in again.
Is there any way to pass this data into my Selenium script so that it uses the session currently on my existing Chrome instance, therefore not having to log in via the script?
Historically, this has not been possible, unfortunately (all the more frustrating once you realize the effort it would involve, repeated for each browser!).
I've written code before that reads the username and password from a CSV. This is bad because they sit in plaintext, but you can also encrypt or obfuscate the information if you like and handle that in your code.
So to recap: there are mediocre workarounds, but no native way to handle this in Selenium :(
Selenium by default creates a temporary Chrome profile each time you start a new instance and deletes that profile once you quit the instance. If you change the Chrome options in the Selenium driver to use a custom Chrome profile and allow that profile to save cookies, you will be able to log in without typing your login details each time.
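A minimal sketch of that setup, with an example profile path; the URL is a placeholder:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

opts = Options()
# Dedicated, persistent profile directory (example path); cookies saved
# here survive driver.quit() instead of vanishing with a temp profile
opts.add_argument(r"user-data-dir=C:\selenium-profile")

driver = webdriver.Chrome(options=opts)
driver.get("https://example.com")  # log in once on the first run
driver.quit()

# Subsequent runs with the same profile should start out logged in,
# unless the site only issues session cookies that expire on close.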

Scrape StreetEasy Page with Login Requirement

I am currently working with real-estate data and wanted to scrape some data from StreetEasy, specifically the closing price hidden behind the "Register to see what it closed for" link (the listing closed about 2 months ago, below the listed price).
Example URL:
http://streeteasy.com/sale/1220187
The data I need requires login, but the login mechanism is unusual: there is no login page; the login is a popup. Is there any way I can use Python to authenticate and access the page after login?
With Selenium and PhantomJS, you get a powerful combination when it comes to scraping data.
from selenium import webdriver
host = "http://streeteasy.com/sale/1220187"
driver = webdriver.PhantomJS()
# Set the "window" wide enough so PhantomJS can "see" the right panel
driver.set_window_size(1280, 800)
driver.get(host)
driver.find_element_by_link_text("Register to see what it closed for").click()
driver.save_screenshot("output.jpg")
What you see is a small snippet of how Selenium can get you to the webpage login (verified via the JPG screencap). From there, it's a matter of toggling the login box, providing the credentials and click()ing your way in.
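Continuing the sketch with hypothetical locators (the real IDs have to be read from the popup's markup), using the same legacy find_element_by_* API as above:

# Hypothetical field locators -- inspect the actual popup to confirm
driver.find_element_by_id("user_email").send_keys("you@example.com")
driver.find_element_by_id("user_password").send_keys("hunter2")
driver.find_element_by_css_selector("button[type='submit']").click()
driver.save_screenshot("after_login.jpg")  # verify the logged-in state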
Oh, and be mindful of the TOS. Good luck!
