I have a quick question regarding HTTP Basic Authentication after a redirect.
I am trying to log in to a website which, for operational reasons, immediately redirects me to a central login site using an HTTP 302 response. In my testing, it appears that the Requests module does not send my credentials to the central login site after the redirect. As seen in the code snippet below, I am forced to extract the redirect URL from the response object and attempt the login again.
My question is simply this:
is there a way to force Requests to re-send login credentials after a redirect off-host?
For portability reasons, I would prefer not to use a .netrc file. Also, the provider of the website has made url_login static but has made no such claim about url_redirect.
Thanks for your time!
CODE SNIPPET
import requests
url_login = '<url_login>'
myauth = ('<username>', '<password>')
login1 = requests.request('get', url_login, auth=myauth)
# this login fails; response object contains the login form information
url_redirect = login1.url
login2 = requests.request('get', url_redirect, auth=myauth)
# this login succeeds; response object contains a welcome message
UPDATE
Here is a more specific version of the general code above.
The first request() returns an HTTP 200 response and has the form information in its text field.
The second request() returns an HTTP 401 response with 'HTTP Basic: Access denied.' in its text field.
(Of course, the login succeeds when provided with valid credentials.)
Again, I am wondering whether I can achieve my desired login with only one call to requests.request().
import requests
url_login = 'http://cddis-basin.gsfc.nasa.gov/CDDIS_FileUpload/login'
myauth = ('<username>', '<password>')
with requests.session() as s:
    login1 = s.request('get', url_login, auth=myauth)
    url_earthdata = login1.url
    login2 = s.request('get', url_earthdata, auth=myauth)
My solution to this would be to use a Session. Here is how you can implement it.
import requests
s = requests.session()
url_login = "<loginUrl>"
payload = {
    "username": "<user>",
    "password": "<pass>"
}
req1 = s.post(url_login, data=payload)
# Now to make sure you do not get the "Access denied", use the same session variable for the request.
req2 = s.get(url_earthdata)
This should solve your problem.
Requests does not do this by default, by design. Automatically re-sending credentials would be a security vulnerability: if an attacker could modify the redirect URL, the credentials would be sent to a host the attacker controls. So Requests strips the credentials when a redirect leads to a different host.
There's a thread about this on GitHub:
https://github.com/psf/requests/issues/2949
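One workaround that is sometimes used when the redirect target is known and trusted is to subclass `requests.Session` and override its `rebuild_auth` method, which is where Requests drops the Authorization header on a cross-host redirect. The following is only a minimal sketch under that assumption; the class name `TrustedRedirectSession`, the `trusted_hosts` parameter, and the host name in the usage comment are hypothetical.

```python
from urllib.parse import urlparse

import requests


class TrustedRedirectSession(requests.Session):
    """Session that keeps the Authorization header on redirects to trusted hosts."""

    def __init__(self, trusted_hosts):
        super().__init__()
        # Hosts that are allowed to receive the credentials after a redirect.
        self.trusted_hosts = set(trusted_hosts)

    def rebuild_auth(self, prepared_request, response):
        redirect_host = urlparse(prepared_request.url).hostname
        if redirect_host in self.trusted_hosts:
            # Leave the copied Authorization header untouched.
            return
        # Any other host gets the default behaviour (header is stripped off-host).
        super().rebuild_auth(prepared_request, response)


# Usage (hypothetical host name, performs a real request when uncommented):
# with TrustedRedirectSession({"login.example.org"}) as s:
#     response = s.get(url_login, auth=("<username>", "<password>"))
```

Only list hosts you actually trust, since the credentials will be sent to them over whatever scheme the redirect uses.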
Related
I have a parser for products on the STEPN marketplace. To receive a JSON response, the request must carry the session cookie of a logged-in account.
# how the parser works
cookies = {'SESSIONIDD2': '7951767220820838781:1658220355588:1400231'} # cookies received from the developer tools in the browser
r = requests.get('https://api.stepn.com/run/orderlist?order=2001&chain=103&refresh=true&page=0&type=600&gType=&quality=&level=0&bread=0', cookies=cookies)
# get a JSON response with the necessary data
But after some time the session in the cookies expires, and I have to log in through the browser again.
I tried to log in via requests.Session (passing all the headers and cookies), but received 'Incorrect username/password' in response:
with requests.Session() as session:
    r = session.get('https://m.stepn.com/')
    r = session.get('https://api.stepn.com/run/login?account={email}&password={password}&type=3') # I also got the string for the request in the developer tools
    # get {"code":201003,"msg":"Incorrect username/password"}
I recently reverse-engineered STEPN's web authentication (email and password encryption). Here is my solution in Rust: https://github.com/Numenorean/stepn-password. You can wrap it as a library using Python (or C) FFI and call the needed function from your code; after that you only need to send the correct auth request.
I am trying to log in with a post request using the python requests module on a MediaWiki page:
import requests
s = requests.Session()
s.auth = ('....', '....')
url = '.....'
values = {'wpName' : '....',
'wpPassword' : '.....'}
req = s.post(url, values)
print(req.content)
I can't tell from the return value of the post request whether the login attempt was successful. Is there something I can do to check this? Thanks.
Under normal circumstances I would advise you to go the mechanize way and make things much easier for yourself, but since you insist on Requests, let us use that.
You obviously have the values right, but I personally don't use the auth attribute. So, try this instead.
import requests
url = 'https://example.com/wiki/index.php?title=Special:UserLogin'
values = {
    'wpName': 'myc00lusername',
    'wpPassword': 'Myl33tPassw0rd12'
}
session = requests.session()
r = session.post(url, data=values)
print(r.cookies)
This is what I used to solve this.
After getting a successful login, I read the texts from
response.text
and compared it to the text I got when submitting incorrect information.
The reason I did this is that validation is done on the server side and Requests will get a 200 OK response whether it was successful or not.
So I ended up adding this line.
logged_in = "Incorrect Email or password" not in response.text
Typically such an authentication mechanism is implemented using HTTP cookies. You might be able to check for the existence of a session cookie after you've authenticated successfully. You can find the cookie in the HTTP response headers or in the session's cookies attribute, s.cookies.
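As a rough illustration of that cookie check, here is a minimal sketch. The helper name `has_cookie` and the cookie name in the usage comment are hypothetical; the real cookie name depends on the wiki's configuration, so inspect `s.cookies` after a known-good login to find it.

```python
import requests


def has_cookie(session, cookie_name):
    """Return True if the session's cookie jar holds a cookie with this name."""
    return any(cookie.name == cookie_name for cookie in session.cookies)


# Usage (hypothetical cookie name, after posting the login form):
# s = requests.Session()
# s.post(url, data=values)
# logged_in = has_cookie(s, "examplewiki_session")
```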
Python newbie here, so I'm sure this is a trivial challenge...
I am using the Requests module to make a POST request to the Instagram API in order to obtain a code, which is used later in the OAuth process to get an access token. The code is usually accessed on the client side, as it is provided at the end of the redirect URL.
I have tried using Requests' response history attribute, like this (client ID is altered for this post):
OAuthURL = "https://api.instagram.com/oauth/authorize/?client_id=cb0096f08a3848e67355f&redirect_uri=https://www.smashboarddashboard.com/whathappened&response_type=code"
OAuth_AccessRequest = requests.post(OAuthURL)
ResHistory = OAuth_AccessRequest.history
for resp in ResHistory:
    print(resp.status_code, resp.url)
print(OAuth_AccessRequest.status_code, OAuth_AccessRequest.url)
But the URLs this returns are not revealing the code number string, instead, the redirect just looks like this:
302 https://api.instagram.com/oauth/authorize/?client_id=cb0096f08a3848e67355f&redirect_uri=https://www.dashboard.com/whathappened&response_type=code
200 https://instagram.com/accounts/login/?force_classic_login=&next=/oauth/authorize/%3Fclient_id%cb0096f08a3848e67355f%26redirect_uri%3Dhttps%3A//www.smashboarddashboard.com/whathappened%26response_type%3Dcode
If you do this on the client side, using a browser, code would be replaced with the actual number string.
Is there a method or approach I can add to the POST request that will allow me to have access to the actual redirect URL string that appears in the web browser?
It should work in a browser if you are already logged in at Instagram. If you are not logged in you are redirected to a login page:
https://instagram.com/accounts/login/?force_classic_login=&next=/oauth/authorize/%3Fclient_id%3Dcb0096f08a3848e67355f%26redirect_uri%3Dhttps%3A//www.smashboarddashboard.com/whathappened%26response_type%3Dcode
Your Python client is not logged in and so it is also redirected to Instagram's login page as shown by the value of OAuth_AccessRequest.url :
>>> import requests
>>> OAuthURL = "https://api.instagram.com/oauth/authorize/?client_id=cb0096f08a3848e67355f&redirect_uri=https://www.smashboarddashboard.com/whathappened&response_type=code"
>>> OAuth_AccessRequest = requests.get(OAuthURL)
>>> OAuth_AccessRequest
<Response [200]>
>>> OAuth_AccessRequest.url
u'https://instagram.com/accounts/login/?force_classic_login=&next=/oauth/authorize/%3Fclient_id%3Dcb0096f08a3848e67355f%26redirect_uri%3Dhttps%3A//www.smashboarddashboard.com/whathappened%26response_type%3Dcode'
So, to get to the next step, your Python client needs to log in. This requires that the client extract and set fields to be posted back to the same URL. It also requires cookies and that the Referer header be properly set. There is a hidden CSRF token that must be extracted from the page (you could use BeautifulSoup, for example), and the form fields username and password must be set. So you would do something like this:
import requests
from bs4 import BeautifulSoup
OAuthURL = "https://api.instagram.com/oauth/authorize/?client_id=cb0096f08a3848e67355f&redirect_uri=https://www.smashboarddashboard.com/whathappened&response_type=code"
session = requests.session() # use session to handle cookies
OAuth_AccessRequest = session.get(OAuthURL)
soup = BeautifulSoup(OAuth_AccessRequest.content, 'html.parser') # name the parser explicitly
form = soup.form
login_data = {form.input.attrs['name'] : form.input['value']}
login_data.update({'username': 'your username', 'password': 'your password'})
headers = {'Referer': OAuth_AccessRequest.url}
login_url = 'https://instagram.com{}'.format(form.attrs['action'])
r = session.post(login_url, data=login_data, headers=headers)
>>> r
<Response [400]>
>>> r.json()
{u'error_type': u'OAuthException', u'code': 400, u'error_message': u'Invalid Client ID'}
This looks like it will work once a valid client ID is provided.
As an alternative, you could look at mechanize which will handle the form submission for you, including the hidden CSRF field:
import mechanize
OAuthURL = "https://api.instagram.com/oauth/authorize/?client_id=cb0096f08a3848e67355f&redirect_uri=https://www.smashboarddashboard.com/whathappened&response_type=code"
br = mechanize.Browser()
br.open(OAuthURL)
br.select_form(nr=0)
br.form['username'] = 'your username'
br.form['password'] = 'your password'
r = br.submit()
response = r.read()
However, this doesn't work because the Referer header is not being set. You could use this method if you can figure out a solution to that.
I have a problem with a simple authorization and upload API script.
When authorized, client receives several cookies, including PHPSESSID cookie (in browser).
I use requests.post method with form data for authorization:
r = requests.post(url, headers = self.headers, data = formData)
self.cookies = requests.utils.dict_from_cookiejar(r.cookies)
Headers are used for custom User-Agent only.
Authorization is 100% fine (there is a logout link on the page).
Later, I try to upload data using the authorized session cookies:
r = requests.post(url, files = files, data = formData, headers = self.headers, cookies = self.cookies)
But the site rejects the request. Comparing the requests from the script and from Google Chrome (using Wireshark) shows no difference in the request body.
The only difference is that the script sends 2 cookies, while Google Chrome sends 7.
Update: I double-checked, and the first request does receive 7 cookies. The post method just ignores half of them...
My mistake was that I assigned the cookies from each subsequent API response to the session cookies dictionary. On every request after login, the stored cookies were overwritten by the cookies of the latest response. Since the auth cookies are only set by the login response, they were lost on the next request.
Now, after each authorized request, I call update() instead of assigning:
self.cookies.update( requests.utils.dict_from_cookiejar(r.cookies) )
Solves my issue, upload works fine!
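To make the difference between assigning and updating concrete, here is a small sketch that needs no network access. The helper name `remember_cookies` and the cookie names other than PHPSESSID are made up for illustration.

```python
import requests


def remember_cookies(stored, jar):
    """Merge the cookies from a response's jar into the stored dict and return it."""
    stored.update(requests.utils.dict_from_cookiejar(jar))
    return stored


# Simulate the cookie jars of two consecutive responses.
login_jar = requests.cookies.RequestsCookieJar()
login_jar.set("PHPSESSID", "abc123")
login_jar.set("auth_token", "secret")   # hypothetical auth cookie

later_jar = requests.cookies.RequestsCookieJar()
later_jar.set("last_visit", "today")    # hypothetical cookie set by a later response

cookies = {}
remember_cookies(cookies, login_jar)
remember_cookies(cookies, later_jar)
# cookies still contains PHPSESSID and auth_token, plus last_visit.
# Plain assignment from later_jar would have kept only last_visit.
```

A `requests.Session` keeps its own cookie jar across requests, which may avoid this bookkeeping altogether.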
I am pretty new to using urllib and requests module in python. I am trying to access a wikipage in my company's website which requires me to provide my login credentials through a pop up window when I try to access it through a browser.
I was able to write the following script to successfully access the webpage and read it using the following piece of code:
import sys
import urllib.parse
import urllib.request
import getpass
import http.cookiejar
wiki_page = 'http://wiki.company.com/wiki_page'
top_level_url = 'http://login.company.com/'
username = input("Enter Username: ")
password = getpass.getpass('Enter Password: ')
# Authenticate with login server and fetch the wiki page
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
cj = http.cookiejar.CookieJar()
password_mgr.add_password(None, top_level_url, username, password)
handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj),handler)
opener.open(wiki_page)
urllib.request.install_opener(opener)
with urllib.request.urlopen(wiki_page) as response:
    page = response.read()  # Do something with the page
But now I need to use the requests module to do the same. I tried several methods, including sessions, but could not get it to work. The following is the piece of code which I think is closest to the actual solution, but it gives Response 200 in the first print and Response 401 in the second print:
s = requests.Session()
print(s.post('http://login.company.com/', auth=(username, password))) # I have tried s.post() as well as s.get() in this line
print(s.get('http://wiki.company.com/wiki_page'))
The site uses the Basic Auth authorization scheme; you'll need to send the login credentials with each request.
Set the Session.auth attribute to a tuple with the username and password on the session:
s = requests.Session()
s.auth = (username, password)
response = s.get('http://wiki.company.com/wiki_page')
print(response.text)
The urllib.request.HTTPPasswordMgrWithDefaultRealm() object would normally only respond to challenges on URLs that start with http://login.company.com/ (so any deeper path will do too), and not send the password elsewhere.
If the simple approach (setting Session.auth) doesn't work, you'll need to find out what response is returned by accessing http://wiki.company.com/wiki_page directly, which is what your original code does. If the server redirects you to a login page, where you then use the Basic Auth information, you can replicate that:
s = requests.Session()
response = s.get('http://wiki.company.com/wiki_page', allow_redirects=False)
if response.status_code in (302, 303):
    target = response.headers['location']
    authenticated = s.get(target, auth=(username, password))
    # continue on to the wiki again
    response = s.get('http://wiki.company.com/wiki_page')
You'll have to investigate carefully what responses you get from the server. Open up an interactive console and see what responses you get back. Look at response.status_code and response.headers and response.text for hints. If you leave allow_redirects to the default True, look at response.history to see if there were any intermediate redirections.
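For the last point, a small helper along these lines can make the redirect chain easier to read. It is only a sketch; the function name `redirect_chain` is made up, and it relies on nothing beyond the `history`, `status_code` and `url` attributes of the response.

```python
import requests


def redirect_chain(response):
    """Return (status_code, url) pairs for every hop, ending with the final response."""
    hops = list(response.history) + [response]
    return [(hop.status_code, hop.url) for hop in hops]


# Usage (performs a real request when uncommented):
# response = requests.get('http://wiki.company.com/wiki_page')
# for status, url in redirect_chain(response):
#     print(status, url)
```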