Jupyterhub Custom Authenticator

I am a little stuck with writing a custom authenticator for jupyterhub. Most probably because I do not understand the inner workings of the available REMOTE_USER authenticator. I am not sure if it is applicable in my case... anyhow... this is what I'd like to do:
My general idea: I have a server that authenticates a user with his or her institutional login. After the user logs into the institution's server/website, the user's data (only a few details that identify the user) are encoded. The user is then redirected to the jupyterhub domain in the following way:
https://<mydomain>/hub/login?data=<here go the encrypted data>
Now, if a request gets sent like this to my jupyterhub-domain, I'd like to decrypt the submitted data, and authenticate the user.
My trial:
I tried it with the following code. But it seems I am too nooby... :D
So please, pedantic comments are welcome :D
from tornado import gen
from jupyterhub.auth import Authenticator

class MyAuthenticator(Authenticator):
    login_service = "my service"
    authenticator_login_url = "authentication url"

    @gen.coroutine
    def authenticate(self, handler, data=None):
        # some verifications go here
        # if data is verified the username is returned
        pass  # placeholder: the verification is not written yet
My first problem... clicking the button on the login page, doesn't redirect me to my Authentication URL... it seems the variable authenticator_login_url from the login template is set somewhere else...
Second problem... a request made to .../hub/login?data=... is not evaluated by the authenticator (it seems...)
So: Has somebody any hints for me how to go about this?
As you see I followed the tutorials here:
https://universe-docs.readthedocs.io/en/latest/authenticators.html

So the following code does the job, however, I am always open to improvements.
So, what I did was redirect an empty login attempt to the login-url and deny access. If data is presented, check the validity of the data. If verified, user can login.
from tornado import gen
from jupyterhub.auth import Authenticator

class MyAuthenticator(Authenticator):
    login_service = "My Service"

    @gen.coroutine
    def authenticate(self, handler, data=None):
        # If we receive no data, redirect to the login page and deny access.
        rawd = handler.get_argument("data", None)
        if rawd is None:
            handler.redirect("<The login URL>")
            return None
        # Do some verification and get the data here.
        # Get the data from the parameters sent to your hub from the login page,
        # say username, access_token and email. Wrap everything neatly in a
        # dictionary and return it.
        # (username, verify and email stand for the values obtained from rawd.)
        userdict = {"name": username}
        userdict["auth_state"] = auth_state = {}
        auth_state['access_token'] = verify
        auth_state['email'] = email
        # Return the dictionary.
        return userdict
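The "do some verification" step is left open above, and the format of the data parameter is not specified in the post. As a minimal sketch only, assume the institutional server sends a base64url-encoded JSON payload followed by an HMAC-SHA256 signature computed with a shared secret. The names sign_payload, verify_payload and SHARED_SECRET are hypothetical and not part of JupyterHub. Note that this scheme signs the data rather than encrypting it, and it does not check freshness or expiry.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret; in practice it would be loaded from configuration.
SHARED_SECRET = b"change-me"


def sign_payload(payload, secret=SHARED_SECRET):
    """Login-server side: encode payload as '<base64url JSON>.<hex HMAC-SHA256>'."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    sig = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return body + "." + sig


def verify_payload(data, secret=SHARED_SECRET):
    """Hub side: return the decoded payload if the signature matches, else None."""
    body, sep, sig = data.rpartition(".")
    if not sep:
        return None  # no signature part at all
    expected = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison of the received and expected signatures.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    try:
        return json.loads(base64.urlsafe_b64decode(body.encode("utf-8")))
    except ValueError:
        return None
```

Inside authenticate, the result of verify_payload(rawd) could then supply the username and email, with None treated as a failed login.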
Simply add the file to the Python path so that Jupyterhub is able to find it, and make the necessary configuration changes in your jupyterhub_config.py file.
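The configuration itself is not shown in the answer. A plausible fragment, assuming the class lives in a module named myauthenticator (a hypothetical file name), might look like this:

```python
# jupyterhub_config.py
# "myauthenticator" is a hypothetical module name: the file that holds MyAuthenticator.
c.JupyterHub.authenticator_class = "myauthenticator.MyAuthenticator"

# The auth_state returned by authenticate() is only persisted when this is enabled;
# JupyterHub also expects an encryption key in the JUPYTERHUB_CRYPT_KEY
# environment variable.
c.Authenticator.enable_auth_state = True
```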

Related

How do I authorize a Google user in Python backend with ID token coming from iOS application?

Solution
So I don't think it's a surprise to anyone, but Google's documentation is god-awful. It's so scattered, and the Python docs still reference their old deprecated library. Anyway.
So what I really needed to look at was this link "Enabling Server Side Access for your App". This is not linked to anywhere. Keep in mind this is entirely different than "Authenticating with a Backend Server"
This was a start. On the iOS side of things, we need to specify the server or backend's client_id.
...
GIDSignIn.sharedInstance().clientID = SBConstants.Google.IOS_CLIENT_ID
GIDSignIn.sharedInstance().serverClientID = SBConstants.Google.SERVER_CLIENT_ID
...
And capture serverAuthCode from the sign method inside your sign-in delegate.
...
self.googleUser.userID = user.userID
self.googleUser.token = user.authentication.idToken
self.googleUser.serverAuthCode = user.serverAuthCode
...
Now when you want to perform some action in the backend on behalf of the frontend, we pass the captured serverAuthCode and send it as a parameter.
That was the easy part. In the backend, Google seems to have 13 different OAuth2 libraries for Python documented. Their example uses oauth2client which of course is deprecated.
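What we want to use is their 'new' library google-api-python-client. (The Credentials and Flow classes used below come from its companion packages google-auth and google-auth-oauthlib.)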
When the auth code (the serverAuthCode captured above) is passed to the backend, we need to check whether the user already has an access token in our database. If so, we need to refresh it. Otherwise, we need to request a new access token based on the auth code. After much trial and error, here is the code to do so:
# Imports inferred from the names used below (assumption).
import json
from google.auth.transport import requests
from google.oauth2.credentials import Credentials
from google_auth_oauthlib import flow

# we have record of this user
if user.exists:
    # create new credentials, and refresh
    credentials = Credentials(
        token=user.token,
        refresh_token=user.refresh_token,
        client_id=CLIENT_ID,
        client_secret=CLIENT_SECRET,
        token_uri='https://oauth2.googleapis.com/token')
    # now we have an access token
    credentials.refresh(requests.Request())
else:
    # get the auth code sent by the iOS app
    token_obj = json.loads(request.body)
    code = token_obj.get('auth_code')
    # request access token given the auth code
    auth_flow = flow.Flow.from_client_secrets_file(creds, scopes=scopes)
    auth_flow.fetch_token(code=code)
    # now have access token
    credentials = auth_flow.credentials
A warning: pass or fail, the auth code is only good for one request. This totally burned me. It also means that once you have a successful backend interaction, you must store the user's token information so that you can later request a refresh, not a new access token.
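The bookkeeping this warning implies can be sketched without any Google calls. The sketch below is an illustration under assumptions: TokenStore is an in-memory stand-in for the database, and StoredTokens and next_step are hypothetical names. It only decides which call is needed, so a single-use auth code is never spent when stored tokens already exist.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredTokens:
    access_token: str
    refresh_token: str


class TokenStore:
    """In-memory stand-in for the database table that holds user tokens."""

    def __init__(self) -> None:
        self._rows: Dict[str, StoredTokens] = {}

    def get(self, user_id: str) -> Optional[StoredTokens]:
        return self._rows.get(user_id)

    def save(self, user_id: str, tokens: StoredTokens) -> None:
        self._rows[user_id] = tokens


def next_step(store: TokenStore, user_id: str, auth_code: Optional[str]) -> str:
    """Decide which kind of token request this backend call needs."""
    if store.get(user_id) is not None:
        return "refresh"        # tokens on record: refresh them
    if auth_code:
        return "exchange"       # first contact: spend the one-time auth code
    return "sign_in_again"      # nothing stored and no code: the app must sign in again
```

After a successful "exchange", the tokens would be saved with store.save so that every later request takes the "refresh" branch.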
Hope this helps someone.
Original Post
Following the documentation here, I am trying to authenticate a user in my iOS app and pass their ID token to my backend. The backend handles the Google API interactions for the iOS app.
I am missing how to actually authenticate that user in the backend. I read over the docs here regarding ID tokens but I am confused on where the service account comes into play.
Current endpoint:
@api_view(['POST'])
@authentication_classes([TokenAuthentication])
@permission_classes([IsAuthenticated])
def google_token_info(request):
    try:
        token_obj = json.loads(request.body)
        token = token_obj['id_token']
        id_info = id_token.verify_oauth2_token(token, requests.Request(), settings.IOS_CLIENT_ID)
        # create session here - how?
    except ValueError:
        # invalid token; the error handling was not shown in the original post
        raise
This is all working fine. The ID info returns the expected decoded JWT contents, and I have the user's unique Google ID at this point.
While testing I had authentication set up via my backend. I had code like this:
def google_auth(request):
    web_flow = flow.Flow.from_client_secrets_file(creds, scopes=scopes)
    web_flow.redirect_uri = request.build_absolute_uri(reverse('api.auth:oauth_callback'))
    auth_url, state = web_flow.authorization_url(access_type='offline', include_granted_scopes='true', prompt='consent')
    request.session['state'] = state
    return redirect(auth_url)

def oauth_callback(request):
    success_flow = flow.Flow.from_client_secrets_file(creds, scopes=scopes, state=request.session.get('state'))
    success_flow.redirect_uri = request.build_absolute_uri(reverse('api.auth:oauth_callback'))
    auth_response = request.build_absolute_uri()
    success_flow.fetch_token(authorization_response=auth_response)
    credentials = success_flow.credentials
    if not request.session.get('google_credentials'):
        request.session['google_credentials'] = _credentials_to_dict(credentials)
    return redirect(reverse('api.auth:success'))
This set up session credentials for the user. I'm assuming I need something similar, but I am unsure how to create a session without actual credentials.


Missing needed parameter state Python social auth Email Validation

I am using python social auth with a Django app. During email validation, the link received in the mail works fine in the same browser from which authentication was initiated, but it shows "Missing needed parameter state" in a different browser. Has anybody fixed this issue?
The issue is discussed here: Issue #577

This is because there's no partial pipeline data in the other browser!
Christopher Keefer has worked on a monkey-patch for this that fetches the session by its session_key from the Django session table. There's also a blog article on this here; refer to Step 3 of that article.
# Monkey patching - an occasionally necessary evil.
from social import utils
from social.exceptions import InvalidEmail
from django.core import signing
from django.core.signing import BadSignature
from django.contrib.sessions.models import Session
from django.conf import settings


def partial_pipeline_data(backend, user=None, *args, **kwargs):
    """
    Monkey-patch utils.partial_pipeline_data to enable us to retrieve session data by signature key in request.
    This is necessary to allow users to follow a link in an email to validate their account from a different
    browser than the one they were using to sign up for the account, or after they've closed/re-opened said
    browser and potentially flushed their cookies. By adding the session key to a signed base64 encoded signature
    on the email request, we can retrieve the necessary details from our Django session table.
    We fetch only the needed details to complete the pipeline authorization process from the session, to prevent
    nefarious use.
    """
    data = backend.strategy.request_data()
    if 'signature' in data:
        try:
            signed_details = signing.loads(data['signature'], key=settings.EMAIL_SECRET_KEY)
            session = Session.objects.get(pk=signed_details['session_key'])
        # Both exception types must be in a tuple; "except A, B:" would catch only A.
        except (BadSignature, Session.DoesNotExist):
            raise InvalidEmail(backend)

        session_details = session.get_decoded()
        backend.strategy.session_set('email_validation_address', session_details['email_validation_address'])
        backend.strategy.session_set('next', session_details.get('next'))
        backend.strategy.session_set('partial_pipeline', session_details['partial_pipeline'])
        backend.strategy.session_set(backend.name + '_state', session_details.get(backend.name + '_state'))
        backend.strategy.session_set(backend.name + 'unauthorized_token_name',
                                     session_details.get(backend.name + 'unauthorized_token_name'))

    partial = backend.strategy.session_get('partial_pipeline', None)
    if partial:
        idx, backend_name, xargs, xkwargs = \
            backend.strategy.partial_from_session(partial)
        if backend_name == backend.name:
            kwargs.setdefault('pipeline_index', idx)
            if user:  # don't update user if it's None
                kwargs.setdefault('user', user)
            kwargs.setdefault('request', backend.strategy.request_data())
            xkwargs.update(kwargs)
            return xargs, xkwargs
        else:
            backend.strategy.clean_partial_pipeline()

utils.partial_pipeline_data = partial_pipeline_data
This fixes the problem to a large extent, but it's still not perfect. It will fail if the session_key gets deleted or changed in the database. Django can change the session_key during the session's lifetime (for example, it cycles the key when a user logs in). So if another user logs in from the same browser, the session_key changes and the user can't verify with the email link.
Omab mentioned in a comment on the issue:
I see the problem now, and even if I think that this could be solved with a re-write of the email validation pipeline, this affects all the pipeline functions that use the partial mechanism, so, I'm already working on a restructure of the pipeline serialization functionality that will improve this behavior. Basically the pipeline data will be dumped to a DB table and a hash code will be used to identify the processes which can be stopped and continue later, removing the dependency of the session.
I'm looking forward to an update on this.
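The table-backed approach Omab describes could be sketched roughly as follows. This is not python-social-auth's implementation: it is a minimal illustration with sqlite3, where the table name partial_pipeline and the helpers create_table, save_partial and load_partial are all hypothetical. The point is that the email link carries a random token instead of depending on the browser session.

```python
import json
import secrets
import sqlite3


def create_table(conn):
    """Create the table that holds paused pipeline data."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS partial_pipeline "
        "(token TEXT PRIMARY KEY, data TEXT NOT NULL)"
    )


def save_partial(conn, data):
    """Store pipeline data and return the token to embed in the email link."""
    token = secrets.token_urlsafe(32)
    conn.execute(
        "INSERT INTO partial_pipeline (token, data) VALUES (?, ?)",
        (token, json.dumps(data)),
    )
    conn.commit()
    return token


def load_partial(conn, token):
    """Return the stored data and delete the row, or None if the token is unknown."""
    row = conn.execute(
        "SELECT data FROM partial_pipeline WHERE token = ?", (token,)
    ).fetchone()
    if row is None:
        return None
    # Delete on use so that a validation link works only once.
    conn.execute("DELETE FROM partial_pipeline WHERE token = ?", (token,))
    conn.commit()
    return json.loads(row[0])
```

Because the lookup key travels in the link itself, the validation works from any browser, and a changed session_key no longer matters.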

Tornado Websocket "error: Authentication missing"

So I've got redis feature and tornado running on my server and whenever I open my websocket chat through a login, the terminal displays the following message
Error: Authentication missing
I'm not sure why this is happening because there are cookies in the authentication part of the app,
# Save user when authentication was successful.
def on_user_find(result, user=user):
##todo: We should check if email is given even though we can assume.
if result == "null" or not result:
# If user does not exist, create a new entry.
self.application.client.set("user:" + user["email"], tornado.escape.json_encode(user))
else:
# Update existing user.
# #todo: Should use $set to update only needed attributes?
dbuser = tornado.escape.json_decode(result)
dbuser.update(user)
user = dbuser
self.application.client.set("user:" + user["email"], tornado.escape.json_encode(user))
# Save user id in cookie.
self.set_secure_cookie("user", user["email"])
self.application.usernames[user["email"]] = user.get("name") or user["email"]
And in websocket.py (the script I run), I've made the websocket handler check whether the cookie is available before the user accesses the app:
class ChatSocketHandler(tornado.websocket.WebSocketHandler):
    def open(self, user):
        self.login = self.get_secure_cookie("user")
        if not self.login:
            # self.login = "anonymous"
            print "Not authorized"
            self.disconnect()
            return
Yet it's still displaying the error. I've searched online and checked several SO answers, but they don't show any solid solution to this question. So far the most I've gotten is that I have to access the websocket header and put the above code inside it, but I have no clue how I would do that. Help?

Pyramid authentication: Why does it work?

I'm just now getting into authentication in my app, and all of the pyramid examples that I can find explain the straightforward parts very well, but handwave over the parts that don't make any sense to me.
Most of the examples look something like this:
login = request.params['login']
password = request.params['password']
if USERS.get(login) == password:
    headers = remember(request, login)
    return HTTPFound(location = came_from,
                     headers = headers)
And from init:
session_factory = UnencryptedCookieSessionFactoryConfig(
    settings['session.secret']
    )
authn_policy = SessionAuthenticationPolicy()
authz_policy = ACLAuthorizationPolicy()
Trying to track down the point in which the login actually happens, I'm assuming it's this one:
headers = remember(request, login)
It appears to me that what is going on is we're storing the username in the session cookie.
If I put this line in my app, the current user is magically logged in, but why?
How does pyramid know that I'm passing a username? It looks like I'm just passing the value of login. Further, this variable is named differently in different examples.
Even if it does know that it's a username, how does it connect it with the user ID? If I run authenticated_userid(request) afterwards, it works, but how has the system connected the username with the userid? I don't see any queries as part of the remember() documentation.

Pyramid's security system revolves around principals; your login value is that principal. It is up to your code to provide remember() with a valid principal name; if your login name filled in the form is used as your principal, then that's great. If you are using an email address but use a database primary key as the principal string, then you'd have to map that yourself.
What exactly remember() does depends on your authentication policy; it is up to the policy to 'know' from request to request what principal you asked it to remember.
If you are using the AuthTktAuthenticationPolicy policy, then the principal value is stored in a cryptographically signed cookie; your next response will have a Set-Cookie header added. The next time a request comes in with that cookie, provided it is still valid and the signature checks out, the policy now 'knows' what principal is making that request.
When that request then tries to access a protected resource, Pyramid sees that a policy is in effect, and asks that policy for the current authenticated principal.
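To make the signed-cookie idea concrete, here is a toy sketch using only the standard library. It is not Pyramid's implementation: toy_remember and toy_authenticated_userid are hypothetical stand-ins, and the cookie format (principal, "!", hex HMAC) is an assumption chosen for illustration. What it shows is that no database query is needed: the principal travels in the cookie, and the signature proves that the server issued it.

```python
import hashlib
import hmac

SECRET = b"hypothetical-secret"  # stand-in for the policy's secret


def _sign(value):
    """Return a hex HMAC-SHA256 signature of value under SECRET."""
    return hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()


def toy_remember(principal):
    """Return response headers that store the principal in a signed cookie."""
    cookie = principal + "!" + _sign(principal)
    return [("Set-Cookie", "auth_tkt=" + cookie + "; Path=/; HttpOnly")]


def toy_authenticated_userid(cookie_value):
    """Return the principal if the signature checks out, else None."""
    principal, sep, signature = cookie_value.rpartition("!")
    if not sep:
        return None  # cookie carries no signature
    expected = _sign(principal)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None  # tampered or forged cookie
    return principal
```

Whatever string is passed to the remember step is what comes back on later requests, which is why the variable can be called login, username, or anything else.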

Flickr API automated login using Python library flickrapi

I have a web application that I want to sync with Flickr. I don't want the users to have to log into Flickr so I plan to use a single login. I believe I'll need to do something like this:
import flickrapi
flickr = flickrapi.FlickrAPI(myKey, mySecret)
# A positional argument cannot follow a keyword argument, so both are passed positionally.
(token, frob) = flickr.get_token_part_one('write', my_auth_callback)
flickr.get_token_part_two((token, frob,))
flickr.what_have_you(...)
I don't know what my_auth_callback should look like though. I suspect it will have to post my login information to flickr. Could I do the get_token_part_one step just once manually perhaps and then re-use it in get_token_part_two?
Edit
Wooble has it. Here are some explicit directions that I wrote down using the Django shell and the flickrapi library.
import flickrapi
api_key = "xxxx...xxxx"
api_secret = "xxxx...xxxx"
_flickr = flickrapi.FlickrAPI(api_key, api_secret)
_flickr.web_login_url("write")
# Go to that url.
# That sends you back to the callback url you set by "editing the
# authentication workflow" on your Flickr admin page located on the site.
# This callback url will contain a frob in the form
# xxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxx-xxxxxxxx
_flickr.get_token("xxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxx-xxxxxxxx")
# That returns the token. Then test with
import flickrapi
from django.conf import settings
_flickr = flickrapi.FlickrAPI(api_key, api_secret, token=api_token)
_flickr.groups_pools_getGroups()

If you don't want your users to authenticate with Flickr, you don't need to use the token-getting code at all. Just get a token for yourself once and include it with your code.
Note that "syncing" other users' photos with your own account probably breaks Flickr's TOS.
