Generating unique and opaque user IDs in Google App Engine

I'm working on an application that lets registered users create or upload content, and allows anonymous users to view that content and browse registered users' pages to find that content - this is very similar to how a site like Flickr, for example, allows people to browse its users' pages.
To do this, I need a way to identify the user in the anonymous HTTP GET request. A user should be able to type http://myapplication.com/browse/<userid>/<contentid> and get to the right page. The <userid> should be unique, but mustn't be something like the user's email address, for privacy reasons.
Through Google App Engine, I can get the email address associated with the user, but like I said, I don't want to use that. I can have users of my application pick a unique user name when they register, but I would like to make that optional if at all possible, so that the registration process is as short as possible.
Another option is to generate some random cookie (a GUID?) during the registration process and use that, but I don't see an obvious way of guaranteeing uniqueness of such a cookie without a trip to the database.
Is there a way, given an App Engine user object, of getting a unique identifier for that object that can be used in this way?
I'm looking for a Python solution - I forgot that GAE also supports Java now. Still, I expect the techniques to be similar, regardless of the language.

Your timing is impeccable: Just yesterday, a new release of the SDK came out, with support for unique, permanent user IDs. They meet all the criteria you specified.

I think you should distinguish between two types of users:
1) users that have logged in via Google Accounts or that have already registered on your site with a non-google e-mail address
2) users that opened your site for the first time and are not logged in in any way
For the second case, I can see no other way than to generate some random string (e.g. via uuid.uuid4() or from this user's session cookie key), as an anonymous user does not carry any unique information with himself.
For users that are logged in, however, you already have a unique identifier -- their e-mail address. I agree with your privacy concerns -- you shouldn't use it as an identifier. Instead, how about generating a string that seems random, but is in fact generated from the e-mail address? Hashing functions are perfect for this purpose. Example:
>>> import hashlib
>>> email = 'user@host.com'
>>> salt = 'SomeLongStringThatWillBeAppendedToEachEmail'
>>> key = hashlib.sha1('%s$%s' % (email, salt)).hexdigest()
>>> print key
f6cd3459f9a39c97635c652884b3e328f05be0f7
hashlib.sha1 is not a random function: for given data it always returns the same result. It is also considered practically irreversible, so you can present the hashed key on the website without exposing the user's e-mail address, provided the salt stays secret (otherwise anyone could hash a guessed address and compare). You can also safely assume that no two distinct e-mails will produce the same hash (they can, but the probability of it happening is very, very small). For more information on hashing functions, consult the Wikipedia entry.
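For readers on Python 3, here is a minimal sketch of the same idea. It swaps the salted SHA-1 shown above for a keyed HMAC-SHA256, which is a substitution and not the method from the answer. The names `SECRET_KEY` and `opaque_user_id` are hypothetical, and lowercasing the address is an assumption about how you want addresses compared.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice load it from configuration
# and never publish it.
SECRET_KEY = b"replace-with-a-long-random-server-side-secret"


def opaque_user_id(email: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable, opaque identifier from an e-mail address."""
    # Assumption: addresses are compared case-insensitively.
    normalized = email.strip().lower().encode("utf-8")
    # Same input and key always give the same 64-character hex string.
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()
```

Calling `opaque_user_id('user@host.com')` returns the same string on every request, so it can go into a URL such as /browse/<userid>/<contentid> without a lookup table, as long as the key never changes.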

Do you mean session cookies?
Try http://code.google.com/p/gaeutilities/
What DzinX said. The only way to create an opaque key that can be authenticated without a database roundtrip is using encryption or a cryptographic hash.
Give the user a random number and hash it or encrypt it with a private key. You still run the (tiny) risk of collisions, but you can avoid this by touching the database on key creation, changing the random number in case of a collision. Make sure the random number comes from a cryptographically secure generator, and add a long server-side random number to prevent chosen plaintext attacks.
You'll end up with a token like the Google Docs key, basically a signature proving the user is authenticated, which can be verified without touching the database.
However, given the pricing of GAE and the speed of bigtable, you're probably better off using a session ID if you really can't use Google's own authentication.
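The signed-token idea described in this answer can be sketched roughly as follows, using only the Python 3 standard library. This is an illustration under assumptions, not a vetted design: `SERVER_SECRET`, `issue_token` and `verify_token` are hypothetical names, and HMAC-SHA256 is one possible choice for the keyed hash.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret. It is generated at import time here only
# to keep the sketch self-contained; a real deployment would persist it.
SERVER_SECRET = secrets.token_bytes(32)


def issue_token(secret: bytes = SERVER_SECRET) -> str:
    """Return 'nonce.signature', where the nonce is a fresh random value."""
    nonce = secrets.token_hex(16)  # cryptographically secure randomness
    signature = hmac.new(secret, nonce.encode("ascii"), hashlib.sha256).hexdigest()
    return nonce + "." + signature


def verify_token(token: str, secret: bytes = SERVER_SECRET) -> bool:
    """Check the signature without any database lookup."""
    nonce, sep, signature = token.partition(".")
    if not sep:
        return False
    expected = hmac.new(secret, nonce.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison; bytes are used so odd input cannot raise.
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii"))
```

Verification only recomputes the signature, so no datastore read is needed. Uniqueness of the nonce is still probabilistic, which is why the answer suggests checking the database once at creation time.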

Related

Is it safe to hash a password two separate times with two separate salts and save both hashes on the same server

I hope I didn't create a duplicate question.
I tried to look for already existing questions, but I didn't find anything.
I have successfully set up a database with username, salt and hashed password for logging in.
For checking the password, I compare the generated hash with the one of the database, see code below.
password_hashed_from_user = res[0][0]
salt = res[0][1]
key_generated = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), base64.b64decode(salt.encode('utf-8')), 100000, dklen=128)
key_encoded = base64.b64encode(key_generated).decode('utf-8')
if key_encoded != password_hashed_from_user:
    logging.debug("Password was wrong:\n{}\n{}".format(key_encoded, password_hashed_from_user))
    return "Username and/or password incorrect", False
The problem now is that I want the user to be able to act completely anonymously, which means I want the user to be able to use a generated token for identification, which cannot be traced back to his account.
Therefore I would need to store the token in a separate table, not correlated to the one with the credentials.
In order for the user to not be able to cheat and just ask the server for a new token every time he logs in (and therefore act as a new user), I wanted to compute the token based on the credentials.
So I figured, I could just have a separate salt and create a new hash based on the password (with the same method as in the code example).
Since the password itself is not stored on the server, this hash could not be generated without the password of the user itself.
This way, the generated token is always the same, as long as the salt doesn't change.
So I can make sure that a specific user is always identified as the same one, while the user can make sure that I cannot trace back his actions.
Background
The background is that I need to create a voting environment, where people have to register and identify themselves in order to prevent double voting, but the vote results, as well as the participation etc should not be traced back to the specific user.
As this is a project in my studies, I cannot just use existing frameworks/libraries.
Now my question:
Is it safe to store two separate hashes of the same password with different salts on the same server or would the duplication make it feasible to recreate the actual password? Both salts would be stored together, together with one of the hashes. The other hash would be in a separate, unrelated table.
I always struggled a bit with encryption on that level.

Is it safe to store two separate hashes of the same password with different salts on the same server or would the duplication make it feasible to recreate the actual password?
Yes, it is safe.
The basic idea behind that statement is that the salt "injects" sufficient uniqueness into the process that the password hash can work with to ensure that two different salts yield unrelated-looking hashes. A real-world example of this would be the worry of two different users having the same password (but different salts) - which also doesn't leak anything about the password and was one of the main motivations to introduce salts.
The more cryptographic argument goes one of two ways. Either you assume your hash acts like a random oracle, which yields unrelated random outputs for unique inputs, in which case the uniqueness of the salt hides all output. Or you use the weaker assumption that your password hash is a randomness extractor combined with a pseudo-random function (not unreasonable for a cryptographic hash-based password hash) with the key in the password input. In that case, assuming the password is unknown and sufficiently random, all unique salts will be mapped to strings that are indistinguishable from random output and therefore cannot yield any information about the output.
Alternatively you can also use Bellare, Ristenpart and Tessaro's definition for password hashing security which essentially says "breaking a password hash is as hard as guessing the password if said hash is good".
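To make the setup concrete, here is a small sketch of deriving two values from one password with two independent salts, using the same PBKDF2 call as in the question. The function name `derive`, the example password and the 32-byte output length are assumptions chosen for the illustration (the question uses dklen=128).

```python
import hashlib
import os


def derive(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """PBKDF2-HMAC-SHA256 as in the question, with a 32-byte output."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


password = "correct horse battery staple"  # example value only
login_salt = os.urandom(16)  # stored next to the login hash
token_salt = os.urandom(16)  # stored with login_salt, as the question describes

login_hash = derive(password, login_salt)  # goes into the credentials table
vote_token = derive(password, token_salt)  # goes into the unrelated table
```

The two outputs share no visible structure, yet `derive(password, token_salt)` reproduces `vote_token` on every login, which is the property the question relies on.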

Where does flask store token for password recovery?

I need to provide a password recovery token in order to test its functionality with an integration test. But I can't trace the place where it's stored.

Apparently it doesn't. It hashes the user's current password [hash] and their id and sends that as token. Which is entirely reasonable, since that's already user-specific information stored in the database, no need to generate yet another token. And it will even invalidate itself once the password has been changed. I'd probably add a timestamp somewhere in there though so the link isn't valid forever.
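As a rough illustration of that scheme, including the suggested timestamp, the sketch below builds a token from the user id, the current password hash and an issue time. It is not the actual implementation of Flask or of any extension; `APP_SECRET`, `make_recovery_token` and `check_recovery_token` are hypothetical names, and the one-hour lifetime is an arbitrary assumption.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical application secret.
APP_SECRET = b"replace-with-a-long-random-secret"


def _sign(user_id: str, password_hash: str, issued_at: int) -> str:
    message = "{}:{}:{}".format(user_id, password_hash, issued_at).encode("utf-8")
    return hmac.new(APP_SECRET, message, hashlib.sha256).hexdigest()


def make_recovery_token(user_id: str, password_hash: str,
                        now: Optional[int] = None) -> str:
    """Return 'issued_at.signature'; nothing is stored server-side."""
    issued_at = int(time.time()) if now is None else now
    return "{}.{}".format(issued_at, _sign(user_id, password_hash, issued_at))


def check_recovery_token(token: str, user_id: str, password_hash: str,
                         max_age: int = 3600, now: Optional[int] = None) -> bool:
    """Valid only while the password hash is unchanged and the token is fresh."""
    now = int(time.time()) if now is None else now
    issued_text, sep, signature = token.partition(".")
    if not sep:
        return False
    try:
        issued_at = int(issued_text)
    except ValueError:
        return False
    if not 0 <= now - issued_at <= max_age:
        return False
    expected = _sign(user_id, password_hash, issued_at)
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii"))
```

For an integration test this means the token can be recomputed from the user's row in the database instead of being read back from some storage location.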

I can't send Emails using Django non-rel on GAE

I'm trying to send a simple email for a user's password recovery; the only input is an email address to send the new password to.
But I can't. I get this error:
SMTPServerDisconnected: please run connect() first
I already tried a few examples, such as https://bitbucket.org/andialbrecht/appengine_emailbackends/overview, but I get the same error.
I really need this, maybe someone can tell me how to use an alternative to code in my view to send an email... Also, I changed the backend to
EMAIL_BACKEND = 'djangoappengine.mail.EmailBackend'
but nothing changed, and I don't know how to use this backend anyway :(
Please help :(

maybe someone can tell me how to use an alternative to code in my view to send an email...
I can help with this. It seems that the repository you're trying to use may be based on an earlier version of App Engine and is throwing the error because of a code change that the library needs somewhere. Alternatively, the trouble may come from the backend string: you set 'djangoappengine.mail.EmailBackend', which differs from what the author of the repository directs you to use ('appengine_emailbackend.EmailBackend').
Whenever possible, I'd recommend seeing if there is an "app-engine-y" way to do something, before going to a third-party library or deploying a module somebody else wrote to hack in third-party capabilities, or looking for an advanced/experimental feature (for example, use Datastore first, rather than remotely connecting to a MySQL VM, unless you need MySQL). If you absolutely need that library, this is a different story, but if you just want to send emails, the Mail API is what you need. It's a convenient way to send emails on App Engine.
I'm going to assume in the following that you are storing your users' usernames and hashed passwords in custom-defined User-kind entities in your Datastore. (If your users sign into your site with simple OAuth, there is never any reason to "reset/recover password".) The steps are:
1. Create the <form action="/some/route" method="POST"> element on the page where the user requests password recovery.
2. Put the code responsible for handling this form submission (they will input their email, or whatever account info your code needs to find their User entity in the Datastore) in a handler that will respond on that route.
3. In the handler, generate a unique token and store it in the Datastore. Send the token in the email that you generate and send using the Mail API (see the example code in the link to the docs I provided). This will allow your user to return to your site, authenticate with the token from the email, and then fill out a form to create a new password. You will then hash this password (with a salt) and store it in their User entity in your Datastore.
I'm skipping over the details of how to implement a "password recovery form", given what I said about OAuth and that you are probably really only concerned with how to send mail. In the email you send, for example, you can insert a hyperlink to your site with the token already inserted as a query param, so that the user doesn't have to copy and paste, etc.
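The token part of these steps could look roughly like the sketch below. It deliberately avoids App Engine APIs: a plain dict stands in for the Datastore, `create_reset_token`, `redeem_reset_token` and `reset_link` are hypothetical helpers, example.com and the /reset route are placeholders, and storing only a hash of the token is an extra precaution that the answer does not ask for.

```python
import hashlib
import secrets
import time

# Stand-in for the Datastore: maps SHA-256(token) -> (user_key, expiry time).
_reset_tokens = {}


def create_reset_token(user_key, ttl_seconds=3600, now=None):
    """Create a single-use token and remember only its hash."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode("ascii")).hexdigest()
    _reset_tokens[digest] = (user_key, now + ttl_seconds)
    return token


def redeem_reset_token(token, now=None):
    """Return the user key for a valid token, else None; a token works once."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    entry = _reset_tokens.pop(digest, None)
    if entry is None:
        return None
    user_key, expires_at = entry
    return user_key if now <= expires_at else None


def reset_link(token):
    # Placeholder host and route; put this URL into the email body.
    return "https://example.com/reset?token=" + token
```

The handler for the recovery form would call `create_reset_token`, put `reset_link(token)` into the message sent through the Mail API, and the handler behind the link would call `redeem_reset_token` before showing the new-password form.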

Django : How to count number of people viewed

I'm making a simple BBS application in Django and I want it so that whenever someone sees a post, the number of views on that post (post_view_no) is increased.
At the moment, I face two difficulties:
I need to limit the increase in post_view_no so that one user can only increase it once regardless of how many times the user refreshes/clicks on the post.
I also need to be able to track the users that are not logged in.
Regarding the first issue, it seems pretty easy as long as I create a model called 'View' and check the db, but I have a feeling this may be overkill.
In terms of the second issue, all I can think of is using cookies / IP addresses to track the users, but an IP is hardly unique and I cannot figure out how to use cookies.
I believe this is a common feature in forum/BBS solutions, but a Google search only turned up plugins or 'dumb' solutions that increase the view count each time the post is viewed.
What would be the best way to go about this?

I think you can do both things via cookies. For example, when a user visits a page, you can:
1. Check if they have the “viewed_post_%s” (where %s is the post ID) key set in their session.
2. If they have, do nothing. If they don't, increase the view_count numeric field of your corresponding Post object by one, and set the key (cookie) “viewed_post_%s” in their session (so that it won't count in future).
This would work with both anonymous and registered users, however by clearing cookies or setting up browser to reject them user can game the view count.
Now using cookies (sessions) with Django is quite easy: to set a value for current user, you just invoke something like
request.session['viewed_post_%s' % post.id] = True
in your view, and done. (Check the docs, and especially examples.)
Disclaimer: this is off the top of my head; I haven't done this personally. Usually, when there's a need for page view / activity tracking (so that you see what drives more traffic to your website, when users are more active, etc.), there's a point in using a specialized system (e.g., Google Analytics, StatsD). But for some specific use case, or as an exercise, this should work.
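The check-then-set logic from this answer can be written as a small helper. To keep it runnable without Django, the sketch assumes `session` is any dict-like object (Django's request.session supports the same `get` and item assignment) and uses a plain dict with `id` and `view_count` keys in place of the Post model; `register_view` is a hypothetical name.

```python
def register_view(session, post):
    """Count a view once per session; return True if it was counted."""
    key = "viewed_post_%s" % post["id"]
    if session.get(key):
        # This session has already been counted for this post.
        return False
    post["view_count"] += 1
    session[key] = True
    return True
```

In a real view you would pass request.session and update the model instead of a dict. An atomic update such as Post.objects.filter(pk=post.pk).update(view_count=F('view_count') + 1) is one way to avoid lost increments under concurrent requests.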

Just to offer a secondary solution, which I think would work but is also prone to gaming (if coming by proxy or from different devices). I haven't tried this either, but I think it should work and wouldn't require thinking about cookies, plus you aggregate some extra data, which is nice.
I would make a model called TrackedPost.
from django.contrib.auth.models import User
from django.db import models

class TrackedPost(models.Model):
    post = models.ForeignKey(Post, on_delete=models.CASCADE)
    ip = models.CharField(max_length=16)  # only accounting for IPv4
    # null for anonymous visitors, set for logged-in users
    user = models.ForeignKey(User, null=True, blank=True, on_delete=models.CASCADE)
Then when you view a post, you would take the request's IP.
def my_post_view(request, post_id):
    # untested sketch: anonymous visitors are tracked by IP only
    user = request.user if request.user.is_authenticated else None
    tracked_post, created = TrackedPost.objects.get_or_create(
        post_id=post_id, ip=request.META.get('REMOTE_ADDR', ''), user=user)
    if created:
        # first time this (post, ip, user) combination has been seen
        tracked_post.post.count += 1
        tracked_post.post.save()
    return render_to_response('')

Google App Engine using UserProperty to link data

I am writing an application in GAE. In order to link the currently logged in user to data in my application I have been relying on equality between users.get_current_user() and members.user (see model below). Where I run into trouble is when the user signs in with an email address using different capitalization than his/her initial login (janeDoe@example.com != janedoe@example.com). What is the most reliable way to link the current user to application specific data?
class members(db.Model):
    firstName=db.StringProperty(verbose_name='First Name',required=True)
    lastName=db.StringProperty(verbose_name='Last Name',required=True)
    user=db.UserProperty()

Don't use the username - call user.user_id(), and compare on that. It's guaranteed to remain the same, even if nickname or email address change.

Always convert username to lowercase and then do operations on it: when storing the first time and on later comparisons.
