Selenium Add Cookies From CookieJar - python

I am trying to add python requests session cookies to my selenium webdriver.
I have tried this so far
for c in self.s.cookies :
driver.add_cookie({'name': c.name, 'value': c.value, 'path': c.path, 'expiry': c.expires})
This code is working fine for PhantomJS whereas it's not for Firefox and Chrome.
My Questions:
Is there any special iterating of cookiejar for Firefox and Chrome?
Why it is working for PhantomJS?

for cookie in s.cookies: # session cookies
# Setting domain to None automatically instructs most webdrivers to use the domain of the current window
# handle
cookie_dict = {'domain': None, 'name': cookie.name, 'value': cookie.value, 'secure': cookie.secure}
if cookie.expires:
cookie_dict['expiry'] = cookie.expires
if cookie.path_specified:
cookie_dict['path'] = cookie.path
driver.add_cookie(cookie_dict)
Check this for a complete solution https://github.com/cryzed/Selenium-Requests/blob/master/seleniumrequests/request.py

Related

get cookies for www subdomain, or a particular domain?

I'm calling get_cookies() on my selenium web driver. Of course we know this fetches the cookies for the current domain. However, many popular sites set cookies on both example.com and www.example.com.
Technically, it's not really a "separate domain" or even sub domain. I think nearly every website on the internet has the same site at the www sub domain as it does the root.
So is it still impossible to save cookies for the two domains, since one is a sub domain? I know the answer is complicated if you want to save cookies for all domains, but I figured this is kind of different since they really are the same domain.
Replicate it with this code:
from selenium import webdriver
import requests
driver = webdriver.Firefox()
driver.get("https://www.instagram.com/")
print(driver.get_cookies())
output:
[{'name': 'ig_did', 'value': 'F5FDFBB0-7D13-4E4E-A100-C627BD1998B7', 'path': '/', 'domain': '.instagram.com', 'secure': True, 'httpOnly': True, 'expiry': 1671083433}, {'name': 'mid', 'value': 'X9hOqQAEAAFWnsZg8-PeYdGqVcTU', 'path': '/', 'domain': '.instagram.com', 'secure': True, 'httpOnly': False, 'expiry': 1671083433}, {'name': 'ig_nrcb', 'value': '1', 'path': '/', 'domain': '.instagram.com', 'secure': True, 'httpOnly': False, 'expiry': 1639547433}, {'name': 'csrftoken', 'value': 'Yy8Bew6500BinlUcAK232m7xPnhOuN4Q', 'path': '/', 'domain': '.instagram.com', 'secure': True, 'httpOnly': False, 'expiry': 1639461034}]
Then load the page in a fresh browser instance and check yourself. You'll see www is there.
The main domain looks fine though:
My idea is to use requests library and get all cookies via REST query?
import requests
# Making a get request
response = requests.get('https://www.instagram.com/')
# printing request cookies
print(response.cookies)
Domain
To host your application on the internet need a domain name. Domain names act as a placeholder for the complex string of numbers known as an IP address. As an example,
https://www.instagram.com/
With the latest firefox v84.0 accessing the Instagram application the following cookies are observed within the https://www.instagram.com domain:
Subdomain
A subdomain is an add-on to your primary domain name. For example, when using the sites e.g. Craigslist, you are always using a subdomain like reno.craigslist.org, or sfbay.craigslist.org. You will be automatically be forwarded to the subdomain that corresponds to your physical location. Essentially, a subdomain is a separate part of your website that operates under the same primary domain name.
Reusing cookies
If you have stored the cookie from domain example.com, these stored cookies can't be pushed through the webdriver session to any other different domanin e.g. example.edu. The stored cookies can be used only within example.com. Further, to automatically login an user in future, you need to store the cookies only once, and that's when the user have logged in. Before adding back the cookies you need to browse to the same domain from where the cookies were collected.
Demonstration
As an example, you can store the cookies once the user have logged in within an application as follows:
from selenium import webdriver
import pickle
driver = webdriver.Chrome()
driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
driver.find_element_by_name("username").send_keys("abc123")
driver.find_element_by_name("password").send_keys("123xyz")
driver.find_element_by_name("submit").click()
# storing the cookies
pickle.dump( driver.get_cookies() , open("cookies.pkl","wb"))
driver.quit()
Later at any point of time if you want the user automatically logged-in, you need to browse to the specific domain /url first and then you have to add the cookies as follows:
from selenium import webdriver
import pickle
driver = webdriver.Chrome()
driver.get('http://demo.guru99.com/test/cookie/selenium_aut.php')
# loading the stored cookies
cookies = pickle.load(open("cookies.pkl", "rb"))
for cookie in cookies:
# adding the cookies to the session through webdriver instance
driver.add_cookie(cookie)
driver.get('http://demo.guru99.com/test/cookie/selenium_cookie.php')
Reference
You can find a detailed discussion in:
org.openqa.selenium.InvalidCookieDomainException: Document is cookie-averse using Selenium and WebDriver

Move cookies from requests to selenium

I would like to move the cookies in a python requests session to my selenium browser. At the moment, I am doing this:
cookies = session.cookie_jar
for cookie in cookies: # Add success cookies
driver.add_cookie({'name': cookie.name, 'value': cookie.value, 'path': cookie.path, 'expiry': cookie.expires})
However, I get some errors like
AttributeError: 'Morsel' object has no attribute 'path'
How can I fix that?
Thanks.

Python: iterate transfer cookies from requests session to Selenium

I have a problem regarding the transfer of cookies from a requests session to the Selenium WebDriver.
As WebDriver I use chromedriver.
for c in r.cookies:
driver.add_cookie({'name': c.name, 'value': c.value,'path': c.path, 'expiry': c.expires})
driver.get("https://www.bstn.com/de/cart")
Now it seems like the iteration doesn't transfer all cookies. I can see this because my cart at bstn.com is empty.
When I code it like following:
for c in r.cookies:
driver.add_cookie({'name': c.name, 'value': c.value,'path': c.path, 'expiry': c.expires})
driver.get("https://www.bstn.com/de/cart")
The browser calls the website approx 10 times. In the end, I can access my cart and see the carted item.
Could you please let me know what am I doing wrong with the iteration? In my opinion, the first code example is the right one. Which is weird. Maybe I need to call the website first?
Thanks for any suggestions.
Max.
Just do the refresh of the page after setting up cookies:
driver.refresh()
And you should see the changes.
Your code will look like this:
for c in r.cookies:
driver.add_cookie({'name': c.name, 'value': c.value,'path': c.path, 'expiry': c.expires})
driver.refresh()
Hope it helps you!

Pythons requests session to open browser using selenium

Im looking to use requests.session and beautifulsoup. If a specific status of 503 is identified I want to then open that session in a web browser. The problem is I have no idea how to move a python requests session into a browser using selenium. Any guidance would be appreciated.
Requests sessions have CookieJar objects that you can use to import into Selenium.
For example:
driver = webdriver.Firefox()
s = requests.Session()
s.get('http://example.com')
for cookie in s.cookies:
driver.add_cookie({
'name': cookie.name,
'value': cookie.value,
'path': '/',
'domain': cookie.domain,
})
driver should now have all of the cookies (and therefore sessions) that Requests has.

Python scrapy login using session cookie

I'm trying to scrape from sites after authentication. I was able to take the JSESSIONID cookie from an authenticated browser session and download the correct page using urlopener like below.
import cookielib, urllib2
cj = cookielib.CookieJar()
c1 = cookielib.Cookie(None, "JSESSIONID", SESSIONID, None, None, DOMAIN,
True, False, "/store",True, False, None, False, None, None, None)
cj.set_cookie(c1)
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
fh = opener.open(url)
But when I use this code for creating scrapy requests (tried both dict cookies and cookiejar), the downloaded page is the non-authenticated version. Anyone know what the problem is?
cookies = [{
'name': 'JSESSIONID',
'value': SESSIONID,
'path': '/store',
'domain': DOMAIN,
'secure': False,
}]
request1 = Request(url, cookies=self.cookies, meta={'dont_merge_cookies': False})
request2 = Request(url, meta={'dont_merge_cookies': True, 'cookiejar': cj})
You were able to get the JSESSIONID from your browser.
Why not let Scrapy simulate a user login for you?
Then, I think your JSESSIONID cookie will stick to subsequent requests given that :
Scrapy uses a single cookie jar (as opposed to Multiple cookie sessions per spider) for the entire spider
lifetime containing all your scraping steps,
the COOKIES_ENABLED setting for the cookie middleware defaults to
true,
dont_merge_cookies defaults to false :
When some site returns cookies (in a response) those are stored in the
cookies for that domain and will be sent again in future requests.
That’s the typical behaviour of any regular web browser. However, if,
for some reason, you want to avoid merging with existing cookies you
can instruct Scrapy to do so by setting the dont_merge_cookies key to
True in the Request.meta.
Example of request without merging cookies:
request_with_cookies = Request(url="http://www.example.com",
cookies={'currency': 'USD', 'country': 'UY'},
meta={'dont_merge_cookies': True})

Categories