I want to create a requests.Session() and add a cookie before making the first request. I expected a cookies argument or something similar for this:
def session_start(self):
    self.session = requests.Session(cookies=['session-id', 'xxx'])
def req1(self):
    self.session.get('https://example.org')
def req2(self):
    self.session.get('https://example2.org')
But this won't work; I can only provide cookies in the .get() method. Do I need to do a "dummy request" in session_start(), or is there a way to prepare the cookie before making the actual request?
From the documentation:
Note, however, that method-level parameters will not be persisted across requests, even if using a session. This example will only send the cookies with the first request, but not the second:
s = requests.Session()
r = s.get('https://httpbin.org/cookies', cookies={'from-my': 'browser'})
print(r.text)
# '{"cookies": {"from-my": "browser"}}'
r = s.get('https://httpbin.org/cookies')
print(r.text)
# '{"cookies": {}}'
session.cookies is a RequestsCookieJar and you can add cookies to that after constructing the session:
self.session = requests.Session()
self.session.cookies.set('session-id', 'xxx')
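As a rough sketch of how this behaves (not part of the original answer): the set() call also accepts a domain and path, and you can prepare a request without sending it to see which Cookie header it would carry. The helper name make_session is made up for illustration, and the hosts are the ones from the question.

```python
import requests


def make_session(cookie_name, cookie_value, domain):
    """Return a Session whose jar already holds one cookie scoped to `domain`."""
    session = requests.Session()
    session.cookies.set(cookie_name, cookie_value, domain=domain, path="/")
    return session


session = make_session("session-id", "xxx", "example.org")

# Prepare (but do not send) two requests to inspect the Cookie header of each.
same_site = session.prepare_request(requests.Request("GET", "https://example.org/"))
other_site = session.prepare_request(requests.Request("GET", "https://example2.org/"))

print(same_site.headers.get("Cookie"))   # cookie is attached for example.org
print(other_site.headers.get("Cookie"))  # no cookie for a different host
```

If you leave out the domain, the cookie is not tied to a host, so scoping it is probably the safer choice when the session talks to more than one site.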
So session objects persist any cookies that the server sets in its responses, but a cookie you pass as an argument to a single request will not persist to the next request.
You can add cookies in manually that persist too though, from the documentation:
"If you want to manually add cookies to your session, use the Cookie utility functions to manipulate Session.cookies."
https://requests.readthedocs.io/en/master/user/advanced/#session-objects
That section links to the cookie API reference, which explains how to manipulate cookies: https://requests.readthedocs.io/en/master/api/#api-cookies
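For illustration, here is a minimal sketch using two of those utility functions, add_dict_to_cookiejar and dict_from_cookiejar from requests.utils. The cookie names and values are placeholders.

```python
import requests
from requests.utils import add_dict_to_cookiejar, dict_from_cookiejar

session = requests.Session()

# Add several cookies at once from a plain dict.
add_dict_to_cookiejar(session.cookies, {"session-id": "xxx", "lang": "en"})

# Take a plain-dict snapshot of the jar, e.g. for logging.
snapshot = dict_from_cookiejar(session.cookies)
print(snapshot)

# Remove everything again when the cookies should no longer be sent.
session.cookies.clear()
print(len(session.cookies))
```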
My purpose is to log in to my application through Python requests. I was able to get the token I expected, but passing it by GET isn't enough. So I want to store the token from the response in a cookie and pass it along, so that maybe the browser can log in.
So, let me summarize what I did (this is pseudocode):
session = requests.Session()
session.get('<url>salt')
# the parameter names below are placeholders
r = session.get('<url>login', params={'username': username, 'password': password})
token = r.headers['token']
I discovered this by looking at the requests made while logging in. The token is passed to the application afterwards. So, how can I store "r" as a cookie?
You can access your session cookies like this:
client = requests.session()
# ... make your login requests with client first ...
cook = client.cookies
# assumes the server set a cookie named 'token'
extracted_token_value = client.cookies['token']
# this will print your cookies and the token
print(cook.get_dict())
print(extracted_token_value)
# updating your headers now:
client.headers.update({'New Header': extracted_token_value})
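In the question the token arrives in a response header rather than in a cookie, so one possible approach is to copy it into the session's cookie jar yourself. This is only a sketch: the helper remember_token is hypothetical, the login response is faked so the code runs offline, and whether the application actually accepts the token as a cookie is an assumption.

```python
import requests


def remember_token(session, response, header_name="token"):
    """Copy a token from a response header into the session's cookie jar."""
    token = response.headers[header_name]  # raises KeyError if the header is missing
    session.cookies.set(header_name, token)
    return token


# Stand-in for the login response, built by hand so the sketch runs offline.
fake_login = requests.Response()
fake_login.headers["token"] = "abc123"

session = requests.Session()
token = remember_token(session, fake_login)

# Prepare (without sending) a follow-up request to see what would go out.
follow_up = session.prepare_request(requests.Request("GET", "https://example.org/"))
print(follow_up.headers.get("Cookie"))
```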
I am doing a simple GET request to get a website's cookies:
import requests
with requests.Session() as s:
    session = requests.Session()
    response = session.get("http://www.dailymail.co.uk/home/index.html")
    ncooks = session.cookies.get_dict()
    print(ncooks)
But when ncooks is printed, it is empty: {}
Why is this? How can I get the cookies for the website?
Your code works; the website you're using just doesn't set a cookie. Try it with http://nytimes.com and the dict won't be empty.
Unrelated, but you're creating an unnecessary extra Session object; it should look like this:
import requests
with requests.Session() as s:
    response = s.get("http://nytimes.com")
    cooks = s.cookies.get_dict()
    print(cooks)
I have been experimenting with sessions in requests. One thing confuses me: when I reuse a session, the cookies are empty on the second request.
This short example boils it down, and the result is the same with every host I try.
import requests
import time
# ==== First Request ====
session = requests.Session()
response = session.get(url="http://www.example.com")
print(response.cookies)
# <RequestsCookieJar[<Cookie UID=759854d4058cf52df60bbbe2a19d1402f5aee (...)
time.sleep(2)
# ==== Second Request ====
response = session.get(url="http://www.example.com")
print(response.cookies)
# <RequestsCookieJar[]> (EMPTY!)
But according to documentation:
The Session object allows you to persist certain parameters across
requests. It also persists cookies across all requests made from the
Session instance (...)
What am I missing?
Edit: the answer explained what I was doing wrong, and dir(session) made me realize that the cookies are stored in session.cookies.
This is because you are checking the HTTP headers of the response instead of those of the request.
Your first request creates the session on the server for the first time and the server responds to your request with the Set-Cookie HTTP header. This is what you see in the printout of the first response.
In your second request, the session is already created, therefore the server doesn't need to include the cookie in its response.
Try examining your requests (or session.cookies) instead of the responses; the session still sends the cookie on the second request.
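To make this concrete, here is a self-contained sketch that starts a tiny local server which sets a cookie only when the client did not send one, then makes two requests through the same session. The server, its handler name and the cookie value are all made up for the demo; a real site may behave differently.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class OneTimeCookieHandler(BaseHTTPRequestHandler):
    """Sets a cookie only when the client did not send one."""

    def do_GET(self):
        self.send_response(200)
        if "Cookie" not in self.headers:
            self.send_header("Set-Cookie", "UID=abc123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), OneTimeCookieHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local demo
first = session.get(url)
second = session.get(url)
server.shutdown()
server.server_close()

print(first.cookies.get_dict())              # cookies set by the first response
print(second.cookies.get_dict())             # cookies set by the second response
print(session.cookies.get_dict())            # cookies the session has stored
print(second.request.headers.get("Cookie"))  # what the second request sent
```

The point is that response.cookies only reflects the Set-Cookie headers of that one response, while session.cookies and response.request.headers show what the session keeps and sends.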
I am trying to log in with a post request using the python requests module on a MediaWiki page:
import requests
s = requests.Session()
s.auth = ('....', '....')
url = '.....'
values = {'wpName' : '....',
'wpPassword' : '.....'}
req = s.post(url, values)
print(req.content)
I can't tell from the return value of the post request whether the login attempt was successful. Is there something I can do to check this? Thanks.
Under normal circumstances I would advise you to go the mechanize way and make things easy for yourself, but since you insist on requests, let us use that.
You obviously have the values right, but I personally don't use the auth attribute. So, try this instead.
import requests
url = 'https://example.com/wiki/index.php?title=Special:UserLogin'
values = {
    'wpName': 'myc00lusername',
    'wpPassword': 'Myl33tPassw0rd12'
}
session = requests.session()
r = session.post(url, data=values)
print(r.cookies)
This is what I used to solve this.
After getting a successful login, I read the texts from
response.text
and compared it to the text I got when submitting incorrect information.
The reason I did this is that validation is done on the server side and Requests will get a 200 OK response whether it was successful or not.
So I ended up adding this line.
logged_in = "Incorrect Email or password" not in response.text
Typically such an authentication mechanism is implemented using HTTP cookies. You might be able to check for the existence of a session cookie after you've authenticated successfully. You can find the cookie in the HTTP response headers or in the session's cookies attribute, s.cookies.
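A minimal sketch of that check might look like the following. The cookie name my_wiki_session is hypothetical (look at what your wiki actually sets after a successful login), and the login is simulated by adding the cookie by hand so the example runs offline.

```python
import requests


def has_cookie(session, cookie_name):
    """Return True if the session's jar holds a cookie with this name."""
    return any(cookie.name == cookie_name for cookie in session.cookies)


session = requests.Session()
before = has_cookie(session, "my_wiki_session")

# Stand-in for what a successful login response would normally do.
session.cookies.set("my_wiki_session", "abc123", domain="example.com")
after = has_cookie(session, "my_wiki_session")

print(before, after)
```

Keep in mind that some sites set a session cookie even for anonymous visitors, so this check may need to be combined with the response text comparison described above.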
I'm studying the requests library (http://docs.python-requests.org/en/latest/),
and I ran into a problem fetching a page with cookies using requests.
For example:
url2= 'https://passport.baidu.com'
parsedCookies={'PTOKEN': '412f...', 'BDUSS': 'hnN2...', ...} # the cookie values are replaced by ... for privacy
req = requests.get(url2, cookies=parsedCookies)
text=req.text.encode('utf-8','ignore')
f=open('before.html','w')
f.write(text)
f.close()
req.close()
When I use the code above to fetch the page, it just saves the login page to 'before.html' instead of the logged-in page, which suggests that I haven't actually logged in successfully.
But if I use urllib2 to fetch the page, it works as expected.
parsedCookies="PTOKEN=412f...;BDUSS=hnN2...;..." # different format, but the same content as the cookies above
req = urllib2.Request(url2)
req.add_header('Cookie', parsedCookies)
ret = urllib2.urlopen(req)
f=open('before_urllib2.html','w')
f.write(ret.read())
f.close()
ret.close()
When I use this code, it saves the logged-in page to before_urllib2.html.
--
Are there any mistakes in my code?
I would be grateful for any reply.
You can use a Session object to get what you want:
import requests
url2 = 'http://passport.baidu.com'
session = requests.Session()  # create a Session object
cookie = requests.utils.cookiejar_from_dict(parsedCookies)
session.cookies.update(cookie)  # set the cookies of the Session object
req = session.get(url2, allow_redirects=True)
In my tests, the requests.get function did not send the cookies to the redirected page, whereas Session().get maintains and sends cookies for every HTTP request it makes; this is exactly what the concept of a "session" means.
Let me elaborate on what happens here:
When I sent the cookies to http://passport.baidu.com/center with allow_redirects=False, the returned status code was 302 and one of the response headers was 'location': '/center?_t=1380462657' (this value is generated dynamically by the server, so replace it with what you get from the server):
url2 = 'http://passport.baidu.com/center'
req = requests.get(url2, cookies=parsedCookies, allow_redirects=False)
print(req.status_code)  # output 302
print(req.headers)
But when I set allow_redirects=True, it still did not reach the redirected page (http://passport.baidu.com/center?_t=1380462657), and the server returned the login page. The reason is that requests.get did not send the cookies to the redirected page, here http://passport.baidu.com/center?_t=1380462657, so the login did not succeed. That is why we need the Session object.
If I set url2 = http://passport.baidu.com/center?_t=1380462657, it returns the page you want. One solution is to use the code above to get the dynamic location value and build the path to your account, such as http://passport.baidu.com/center?_t=1380462657; then you can get the desired page.
url2= 'http://passport.baidu.com' + req.headers.get('location')
req = session.get(url2, cookies=parsedCookies, allow_redirects=True )
But this is cumbersome, so when dealing with cookies, the Session object does an excellent job for us!
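Since the urllib2 version in the question starts from a raw Cookie header string, here is a small sketch that turns such a string into a Session with the same cookies. The helper names are made up, the cookie values are placeholders, and the parsing is deliberately naive (it just splits on ";" and "=").

```python
import requests


def cookies_from_header(raw_header):
    """Parse a 'name=value;name2=value2' Cookie header string into a dict."""
    cookies = {}
    for part in raw_header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:  # skip empty or malformed pieces
            cookies[name] = value
    return cookies


def session_from_header(raw_header):
    """Build a Session whose cookie jar is pre-filled from a Cookie header string."""
    session = requests.Session()
    session.cookies.update(cookies_from_header(raw_header))
    return session


# Placeholder values; use your real cookie string here.
session = session_from_header("PTOKEN=aaa;BDUSS=bbb")
print(session.cookies.get_dict())
```

After that, session.get(url2) should send these cookies on every request the session makes, which is what the Session-based code above relies on.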