My goal is to log in to my application through Python requests. I was able to get a token, as expected, but passing it via GET isn't enough. So I want to store the result of the request in a cookie and pass the token along, so that the browser may be able to log in.
Here is a summary of what I did (this is pseudocode):
session = requests.Session()
session.get('<url>salt')
r = session.get('<url>login', params={'username': username, 'password': password})
token = r.headers['token']
I discovered this by watching the requests made during login. The token is passed to the application afterwards. So, how can I store "r" as a cookie?
You can access your session cookies like this:
client = requests.session()
# ... perform the login requests with this session first ...
cook = client.cookies
extracted_token_value = client.cookies['token']
# this will print your cookie jar and the token
print(cook)
print(extracted_token_value)
# now update the session headers with the token:
client.headers.update({'New Header': extracted_token_value})
BR
I want to start a requests.Session() and add a cookie before making the first request. I expected a cookies argument or something similar for this:
def session_start(self):
    self.session = requests.Session(cookies=['session-id', 'xxx'])
def req1(self):
    self.session.get('example.org')
def req2(self):
    self.session.get('example2.org')
But this won't work; I can only provide cookies in the .get() method. Do I need to do a "dummy request" in session_start(), or is there a way to prepare the cookie before making the actual request?
From the documentation:
Note, however, that method-level parameters will not be persisted across requests, even if using a session. This example will only send the cookies with the first request, but not the second:
s = requests.Session()
r = s.get('https://httpbin.org/cookies', cookies={'from-my': 'browser'})
print(r.text)
# '{"cookies": {"from-my": "browser"}}'
r = s.get('https://httpbin.org/cookies')
print(r.text)
# '{"cookies": {}}'
session.cookies is a RequestsCookieJar and you can add cookies to that after constructing the session:
self.session = requests.Session()
self.session.cookies.set('session-id', 'xxx')
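The difference between the two approaches can be checked without touching the network, by preparing requests and inspecting the Cookie header they would carry. This is only a minimal sketch: the URL is a placeholder and nothing is actually sent.

```python
import requests

session = requests.Session()
url = "https://example.org/"  # placeholder URL; no request is sent

# Method-level cookies apply to this one request only.
first = session.prepare_request(
    requests.Request("GET", url, cookies={"from-my": "browser"})
)
second = session.prepare_request(requests.Request("GET", url))
print(first.headers.get("Cookie"))   # cookie passed for this request
print(second.headers.get("Cookie"))  # nothing was persisted on the session

# Cookies stored in the session jar are attached to every later request.
session.cookies.set("session-id", "xxx")
third = session.prepare_request(requests.Request("GET", url))
print(third.headers.get("Cookie"))
```

The first and third prepared requests should carry a Cookie header, while the second should not, which matches the documentation excerpt quoted above.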
So session objects persist any cookies that the responses themselves set, but a cookie passed as an argument to a single request will not persist to the next request.
You can also add cookies manually so that they persist. From the documentation:
"If you want to manually add cookies to your session, use the Cookie utility functions to manipulate Session.cookies."
https://requests.readthedocs.io/en/master/user/advanced/#session-objects
That passage links to the cookie API reference, which describes how to manipulate cookies: https://requests.readthedocs.io/en/master/api/#api-cookies
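As a rough illustration of those cookie utility functions, the sketch below builds a cookie scoped to one domain with requests.cookies.create_cookie and stores it in the session jar. The cookie name, value, and domains are made up for the example, and the requests are only prepared, never sent.

```python
import requests
from requests.cookies import create_cookie

session = requests.Session()

# Hypothetical cookie, scoped to a single domain and path.
cookie = create_cookie("session-id", "xxx", domain="example.org", path="/")
session.cookies.set_cookie(cookie)

# Prepare (but do not send) requests to see where the cookie would go.
matching = session.prepare_request(
    requests.Request("GET", "https://example.org/page")
)
other = session.prepare_request(
    requests.Request("GET", "https://example2.org/page")
)
print(matching.headers.get("Cookie"))
print(other.headers.get("Cookie"))
```

Scoping the cookie to a domain keeps it from being sent to unrelated hosts, which a cookie set without a domain would be.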
I have a quick question regarding HTTP Basic Authentication after a redirect.
I am trying to log in to a website which, for operational reasons, immediately redirects me to a central login site using an HTTP 302 response. In my testing, it appears that the Requests module does not send my credentials to the central login site after the redirect. As seen in the code snippet below, I am forced to extract the redirect URL from the response object and attempt the login again.
My question is simply this:
is there a way to force Requests to re-send login credentials after a redirect off-host?
For portability reasons, I would prefer not to use a .netrc file. Also, the provider of the website has made url_login static but has made no such claim about url_redirect.
Thanks for your time!
CODE SNIPPET
import requests
url_login = '<url_login>'
myauth = ('<username>', '<password>')
login1 = requests.request('get', url_login, auth=myauth)
# this login fails; response object contains the login form information
url_redirect = login1.url
login2 = requests.request('get', url_redirect, auth=myauth)
# this login succeeds; response object contains a welcome message
UPDATE
Here is a more specific version of the general code above.
The first request() returns an HTTP 200 response and has the form information in its text field.
The second request() returns an HTTP 401 response with 'HTTP Basic: Access denied.' in its text field.
(Of course, the login succeeds when provided with valid credentials.)
Again, I am wondering whether I can achieve my desired login with only one call to requests.request().
import requests
url_login = 'http://cddis-basin.gsfc.nasa.gov/CDDIS_FileUpload/login'
myauth = ('<username>', '<password>')
with requests.session() as s:
login1 = s.request('get', url_login, auth=myauth)
url_earthdata = login1.url
login2 = s.request('get', url_earthdata, auth=myauth)
My solution would be to use a Session. Here is how you can implement it:
import requests
s = requests.session()
url_login = "<loginUrl>"
payload = {
"username": "<user>",
"password": "<pass>"
}
req1 = s.post(url_login, data=payload)
# To avoid the "Access denied" response, reuse the same session for the next request.
# url_earthdata is the redirect URL from the question (login1.url there).
req2 = s.get(url_earthdata)
This should solve your problem.
This isn't possible with Requests, by design. The behavior stems from a security concern: if an attacker could modify the redirect URL and the credentials were automatically sent to it, the credentials would be compromised. So credentials are stripped when a redirect goes to a different host.
There's a thread about this on github:
https://github.com/psf/requests/issues/2949
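To see this decision in isolation, reasonably recent Requests releases expose it as Session.should_strip_auth(old_url, new_url). The sketch below assumes such a release and uses made-up URLs; it only asks the session what it would do and sends nothing.

```python
import requests

session = requests.Session()

# Hypothetical URLs, for illustration only.
same_host = session.should_strip_auth(
    "https://login.example.org/start", "https://login.example.org/next"
)
other_host = session.should_strip_auth(
    "https://app.example.org/login", "https://sso.example.net/login"
)
print(same_host)   # redirect stays on the same host
print(other_host)  # redirect goes off-host
```

A redirect to a different hostname should report that the Authorization header is dropped, which is the situation described in the question.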
I am trying to log in with a post request using the python requests module on a MediaWiki page:
import requests
s = requests.Session()
s.auth = ('....', '....')
url = '.....'
values = {'wpName' : '....',
'wpPassword' : '.....'}
req = s.post(url, values)
print(req.content)
I can't tell from the return value of the POST request whether the login attempt was successful. Is there something I can do to check this? Thanks.
Under normal circumstances I would advise you to go the mechanize way and make things much easier for yourself, but since you insist on requests, let us use that.
You obviously have the values right, but I personally don't use the auth attribute. So try this instead:
import requests
url = 'https://example.com/wiki/index.php?title=Special:UserLogin'
values = {
'wpName': 'myc00lusername',
'wpPassword': 'Myl33tPassw0rd12'
}
session = requests.session()
r = session.post(url, data=values)
print(r.cookies)
This is what I used to solve this.
After a successful login, I read the text from
response.text
and compared it to the text I got when submitting incorrect information.
I did this because validation happens on the server side, and Requests gets a 200 OK response whether the login succeeded or not.
So I ended up adding this line:
logged_in = "Incorrect Email or password" not in response.text
Typically such an authentication mechanism is implemented using HTTP cookies. You might be able to check for the existence of a session cookie after you've authenticated successfully. You can find the cookie in the HTTP response headers or in the session's cookies attribute, s.cookies.
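A cookie check along those lines might look like the sketch below. The cookie name "example_session" is purely hypothetical; the real name depends on the site and has to be found by inspecting a successful login. The cookie is added by hand here so the check can be shown offline.

```python
import requests

def has_cookie(session, name):
    """Return True if the session's cookie jar holds a cookie with this name."""
    return any(cookie.name == name for cookie in session.cookies)

session = requests.Session()

# In real use the server would set the cookie during session.post(url, data=values).
before = has_cookie(session, "example_session")
session.cookies.set("example_session", "abc123")  # stand-in for a server-set cookie
after = has_cookie(session, "example_session")
print(before, after)
```

Some sites also set cookies for anonymous visitors, so it is worth comparing the jar after a failed login and after a successful one before relying on a particular name.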
I'm just studying the requests library (http://docs.python-requests.org/en/latest/),
and ran into a problem fetching a page with cookies using requests.
For example:
url2= 'https://passport.baidu.com'
parsedCookies={'PTOKEN': '412f...', 'BDUSS': 'hnN2...', ...} # cookie values are replaced by ... for privacy
req = requests.get(url2, cookies=parsedCookies)
text=req.text.encode('utf-8','ignore')
f=open('before.html','w')
f.write(text)
f.close()
req.close()
When I use the code above to fetch the page, it just saves the login page to 'before.html' instead of the logged-in page, which means I haven't actually logged in successfully.
But if I use urllib2 to fetch the page, it works as expected.
parsedCookies="PTOKEN=412f...;BDUSS=hnN2...;..." # different format, but same content as the cookies above
req = urllib2.Request(url2)
req.add_header('Cookie', parsedCookies)
ret = urllib2.urlopen(req)
f=open('before_urllib2.html','w')
f.write(ret.read())
f.close()
ret.close()
When I use this code, it saves the logged-in page in before_urllib2.html.
--
Are there any mistakes in my code?
I would be grateful for any reply.
You can use a Session object to get what you want:
url2='http://passport.baidu.com'
session = requests.Session() # create a Session object
cookie = requests.utils.cookiejar_from_dict(parsedCookies)
session.cookies.update(cookie) # set the cookies of the Session object
# headers is assumed to be a dict of request headers defined elsewhere
req = session.get(url2, headers=headers, allow_redirects=True)
If you use the requests.get function, it doesn't send cookies for the redirected page. If you instead use Session().get, the session maintains and sends cookies for all HTTP requests; this is exactly what the concept of a "session" means.
Let me try to explain what happens here:
When I sent cookies to http://passport.baidu.com/center with allow_redirects set to False, the returned status code was 302 and one of the response headers was 'location': '/center?_t=1380462657' (this is a dynamic value generated by the server; replace it with what you get from the server):
url2= 'http://passport.baidu.com/center'
req = requests.get(url2, cookies=parsedCookies, allow_redirects=False)
print(req.status_code) # output 302
print(req.headers)
But when I set allow_redirects to True, it still doesn't end up at the page (http://passport.baidu.com/center?_t=1380462657), and the server returns the login page. The reason is that requests.get doesn't send cookies for the redirected page, here http://passport.baidu.com/center?_t=1380462657, so the login fails. That is why we need the Session object.
If I set url2 = http://passport.baidu.com/center?_t=1380462657, it returns the page you want. One solution is to use the code above to get the dynamic location value and build the URL to your account, like http://passport.baidu.com/center?_t=1380462657, then request it to get the desired page.
url2= 'http://passport.baidu.com' + req.headers.get('location')
req = session.get(url2, cookies=parsedCookies, allow_redirects=True )
But this is cumbersome, so when dealing with cookies, the Session object does an excellent job for us!
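Since the question has the cookies in two formats (a dict for requests and a "name=value;..." string for urllib2), here is a small sketch of converting the string form into a dict and loading it into a session. The helper name cookie_string_to_dict and the cookie values are invented for the example; they are not real Baidu cookies, and no request is sent.

```python
from http.cookies import SimpleCookie

import requests

def cookie_string_to_dict(cookie_string):
    """Parse a 'name=value;name2=value2' Cookie header string into a dict."""
    parsed = SimpleCookie()
    parsed.load(cookie_string)
    return {name: morsel.value for name, morsel in parsed.items()}

# Placeholder values standing in for the real cookie string.
raw = "PTOKEN=aaa111;BDUSS=bbb222"
cookies = cookie_string_to_dict(raw)

# Load the cookies into a session so every later request sends them.
session = requests.Session()
session.cookies.update(requests.utils.cookiejar_from_dict(cookies))
print(session.cookies.get_dict())
```

After this, session.get(url2) would send both cookies on the initial request and on any redirects the session follows.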
I am trying to scrape some selling data using the StubHub API. An example of this data can be seen here:
https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata
You'll notice that if you try to visit that URL without logging in to stubhub.com, it won't work. You need to log in first.
Once I've signed in via my web browser, I open the URL which I want to scrape in a new tab, then use the following command to retrieve the scraped data:
r = requests.get('https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata')
However, once the browser session expires after ten minutes, I get this error:
<FormErrors>
<FormField>User Auth Check</FormField>
<ErrorMessage>
Either is not active or the session might have expired. Please login again.
</ErrorMessage>
I think that I need to implement the session ID via cookie to keep my authentication alive and well.
The Requests library documentation is pretty terrible for someone who has never done this sort of thing before, so I was hoping you folks might be able to help.
The example provided by Requests is:
s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")
print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
I honestly can't make heads or tails of that. How do I preserve cookies between POST requests?
I don't know how stubhub's api works, but generally it should look like this:
s = requests.Session()
data = {"login":"my_login", "password":"my_password"}
url = "http://example.net/login"
r = s.post(url, data=data)
Now your session contains the cookies set by the login response. To access the cookies of this session, simply use
s.cookies
Any further requests made with this session will send these cookies.
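One possible way to cope with the session expiring is to watch for the error text and log in again when it appears. The sketch below assumes the expiry can be recognised by the phrase from the error message quoted in the question. The names get_with_relogin and example_login, the login URL, and the form fields are all hypothetical, since the real login flow depends on the site.

```python
EXPIRED_MARKER = "session might have expired"  # phrase from the error message above

def get_with_relogin(session, url, login):
    """GET url with the session; if the body reports an expired session,
    call login(session) and retry once."""
    response = session.get(url)
    if EXPIRED_MARKER in response.text:
        login(session)
        response = session.get(url)
    return response

def example_login(session):
    # Hypothetical endpoint and form fields; the real ones depend on the site.
    session.post(
        "http://example.net/login",
        data={"login": "my_login", "password": "my_password"},
    )

# Usage with a real session:
#   import requests
#   s = requests.Session()
#   example_login(s)
#   r = get_with_relogin(s, "<protected url>", example_login)
```

Because the same Session object is reused, the cookies obtained by the repeated login are sent automatically with the retried request.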