How to give a unique ID to each anonymous user in Django (Python)

I want to create a table (Postgres) that stores which items were viewed by which user. Authenticated users are no problem, but how can I tell one anonymous user from another? This is needed for analysis purposes.
Maybe store their IP address as a unique ID? How can I do this?

I think you should use cookies.
When a user who is not authenticated makes a request, look for a cookie with a name of your choice ("nonuserid" in this case). If the cookie is not present, it is a new user, so set the cookie with a random id. If it is present, use the id in it to identify the anonymous user.
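The cookie check described above can be sketched in plain Python. This is only an illustration: the function name is hypothetical, and the `cookies` argument is assumed to be a dict-like mapping such as Django's request.COOKIES.

```python
import uuid

COOKIE_NAME = "nonuserid"  # cookie name taken from the answer above


def get_or_create_anonymous_id(cookies):
    """Return (anonymous_id, is_new) for a dict-like cookie mapping.

    `cookies` stands in for Django's request.COOKIES (an assumption).
    """
    anonymous_id = cookies.get(COOKIE_NAME)
    if anonymous_id:
        return anonymous_id, False
    # No cookie yet: treat the visitor as new and mint a random id.
    return uuid.uuid4().hex, True


# First visit: no cookie, so a fresh id is generated.
new_id, is_new = get_or_create_anonymous_id({})
# Later visit: the browser sends the cookie back, so the id is reused.
same_id, is_new_again = get_or_create_anonymous_id({COOKIE_NAME: new_id})
```

In a Django view, when is_new is True you would presumably also call response.set_cookie(COOKIE_NAME, anonymous_id) on the response so the browser keeps the id.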

Option 1
Use the IP address. Check out this answer. There is no need to write the IP into the session, because you can always get the client IP when you receive a request.
Option 2
Generate a unique ID with uuid, as the docs describe, and store it in the session under a given name, say USER_ID.
When you get a request from a user, check whether USER_ID is in the session. If so, read its value and write a record to the database, such as "user id visited page X". If not, generate the ID and set it first.
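A minimal sketch of Option 2, assuming the session behaves like a dict (Django's request.session does). The function name and the list used in place of the database table are hypothetical stand-ins.

```python
import uuid

SESSION_KEY = "USER_ID"  # name suggested in Option 2


def record_view(session, page, log):
    """Append a (user_id, page) record, creating the id on first use.

    `session` is any dict-like object; `log` is a plain list standing in
    for the database table of page views.
    """
    if SESSION_KEY not in session:
        session[SESSION_KEY] = str(uuid.uuid4())
    log.append((session[SESSION_KEY], page))
    return session[SESSION_KEY]


session, log = {}, []
first = record_view(session, "/items/1", log)
second = record_view(session, "/items/2", log)
```

Both calls share one session, so both records carry the same generated id.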

Related

Is using a UUID as the pk a good idea in microservices?

I am working on a microservice project with 4 services developed in Django. I use dj-rest-auth to handle login and registration. Each service has its own database; user information is kept in the account service, and the other 3 services get it via an API request to the account service. In each service I only have access to the logged-in user's pk (dj-rest-auth handles this), and when I need to save a record, for example the location of the logged-in user, I save a user object that only has the pk alongside the other info, so the record in the db looks like this:
user = request.user (saves the logged-in user, but I only see the pk)
lat = latitude number
lng = longitude number
Everything is fine, but if I lose the database of the account service and restore a backup, and the records somehow get different pks from the ones saved in the other services (for example, because some new records were added before the backup was restored), that causes a huge problem in all services. The solution I tried is to change the pk to a UUID field, but is that a good idea? Or would it be better to add a UUID field to the user model in the account database and, in the other services, save this UUID alongside the user's pk?
Answers to this question will be subjective and depend on perspective.
Here is my view on this:
There should be an id field of type INT which is a primary key that can auto-increment. Alongside that, you can add a UUID field, let's say uid.
Advantages:
Using id as a primary key makes your schema consistent with the rest of the database tables.
You can use the id field as a foreign key and this will take up less space than UUID.
In public URLs you can use the uid field, which does not expose guessable information. For example, if you use the id and the resource id in the URL is 5, an attacker can guess that there might be resources with ids 6 and 7. With the uid field, which is a UUID, you are not exposing information about the database.
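The layout described above (integer primary key plus a separate UUID column for public use) can be tried out with SQLite from the standard library. The table and column names here are illustrative assumptions, not taken from the question.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # internal key, used for FKs
    " uid TEXT NOT NULL UNIQUE,"              # public, non-guessable id
    " name TEXT NOT NULL)"
)


def create_account(name):
    """Insert a row and return its public UUID."""
    uid = str(uuid.uuid4())
    conn.execute("INSERT INTO account (uid, name) VALUES (?, ?)", (uid, name))
    return uid


def find_by_uid(uid):
    """Look up a row by its public UUID; returns (id, name) or None."""
    return conn.execute(
        "SELECT id, name FROM account WHERE uid = ?", (uid,)
    ).fetchone()


alice_uid = create_account("alice")
bob_uid = create_account("bob")
row = find_by_uid(bob_uid)
```

Other services would store the uid, which stays stable even if the integer ids are reassigned after a restore.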

Scraping ASPX after login with Python but every login gives you a different URL

I'm trying to get the exam result data from my college website for every Roll No. in my class.
Normally you can POST the login information to a URL (www.example.com/login.aspx) and then GET a fixed URL after login (www.example.com/home.aspx).
But the page I'm trying to get has a different URL for every Roll No. entered. The URL of the login page looks like this: "www.example.com/View.aspx". After login, the URL of the result page looks like: "www.example.com/ovengine.aspx?enc=BunchOfNumbersandAlphabets". Those numbers and letters are different for each roll number.
So I can't put a URL in my code to get the final result. I don't know how to get the page that comes up automatically after the login without specifying its URL.
"But the page I'm trying to get has a different URL for every Roll no. entered"
No, it is the same URL, and the URL has a parameter. You see this in URLs all the time.
So, for a temperature site it might look like
www.TheWeatherSite.com/?City=Rome
So the above URL is always the same page, but the "City" parameter is set to Rome. The code behind the page can read that parameter. That way we don't create a separate web page for the weather in EACH city.
So you create ONE page, and then PASS it a city value that the code behind can consume and use (say, to query temperature data from a database for city = the above value).
And thus you have to know ahead of time what city you want the weather for. Of course this approach is great since you don't have to create a new web site page to just show/display the weather in a given city.
You are in effect passing a value to some code behind that will run, and use that passed value.
The same goes for your example URL. You note there is ONE parameter called "enc".
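To make the "same page, different parameter" point concrete, here is a small Python sketch that splits such a URL into its page and its enc parameter. The enc value is a made-up placeholder, not a real one.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical result URL; the enc value is a made-up placeholder.
url = "http://www.example.com/ovengine.aspx?enc=AbC123xyz"

parts = urlparse(url)
params = parse_qs(parts.query)

page = parts.path        # the same page for every roll number
enc = params["enc"][0]   # the value that differs per request
```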
So, the web site code behind would:
Grab the user's ID. However, the user's ID would come from the security system and the authentication provider. Unless you are logged in as that particular user, you will not get that user ID.
So both a user ID (available only to the internal code) and the "enc" value from the URL parameter would be required.
So, note that in the SQL below, we VERY likely need both a StudentID and ALSO the "enc" value, which some OTHER code from another page gets from the database.
Now that funny "GUID" (please do google what a GUID is), from a programmer's point of view, WOULD be sufficient to pull this one row of data from the database. But by ALSO using the user's internal logged-on ID in the query?
Well, then only a given logged-on user would be able to see the values that belong to them.
In other words?
Only a drunken un-employed Rodeo clown would require JUST that GUID for pulling out that data. If that were the case, any user could type in that GUID and see other people's marks. However, there is "some" security in using a GUID, since a user could never guess that value.
If they used "city" like my first URL and parameter example? Then yes, you could guess and know the city value to type in. Or they could have used, say, student name or even student number: those you COULD guess with relative ease.
But for such data, no doubt the site adopted something MUCH harder to guess than a sequential number like a row number or PK id from a database. So, when the code added the results to that table? It also generated a GUID of some type and saved it in that row of the database.
So you do NOT need JUST the GUID: that URL will ONLY work for a given pair of values. The first is the student ID, which is ONLY internal to the code and pulled FROM the authentication provider. That is this line of code:
= Membership.GetUser.ProviderUserKey
So that above value is going to be the user's internal logon ID.
The second is the enc value, exposed (externally) as a parameter in the web URL. So the code behind (ASP.NET) would look something like this:
' Require BOTH the internal user ID and the GUID from the URL.
Dim strSQL As String
strSQL = "SELECT * FROM tblStudentMarks WHERE StudentID = @pID " &
         "AND TestResultsGID = @GID"
Dim cmdSQL As New SqlCommand(strSQL, GetCon)
' Internal ID of the logged-on user, never exposed in the URL.
cmdSQL.Parameters.Add("@pID", SqlDbType.Int).Value = Membership.GetUser.ProviderUserKey
' Public value taken from the ?enc=... query string parameter.
cmdSQL.Parameters.Add("@GID", SqlDbType.VarChar).Value = Request.QueryString("enc")
Dim dReader As New SqlDataAdapter(cmdSQL)
Dim rstData As New DataTable
dReader.Fill(rstData)
Note the code:
Request.QueryString("enc")
That allows the code behind to get the parameter (enc) from the URL. But, as I stated, it is highly unlikely that JUST the "enc" value is required here. It is possible that ONLY this value is required to pull the data from the row, but then that would be a security hole the size of an open barn door.
Think of your on-line banking.
www.mybank.com/?CustomerNumber=1234
Well, if we JUST use the above CustomerNumber as the means to pull bank data, then I could go to the site and type in YOUR number, or someone else's number.
So, for this to work?
You will need to obtain a list of enc values (that messy, funny long string). Without it, you will not be able to set the parameter in the URL.
However, as I stated, you ALSO very likely need some internal "user" logon ID that is NOT included in the publicly exposed URL to grab that one row of data from the database.
And, even more important? Such web pages usually cannot be hit UNLESS you are logged in as an authenticated user. In other words, that web page will ONLY be dished out to logged-in users: if you are not logged in, the server security will automatically refuse to dish out the web page.
So, for this to work, you need to contact the web site developers and obtain that list of "enc" values. Once you have that list, you can write some code to process it and insert the correct parameter in the URL. However, you also need to ask whether that URL and parameter value will work for you as ANY logged-in user, or whether they ONLY work for the one logged-in user they belong to. Without these values, and without knowing whether the URL and parameter will work for any user (which I doubt), just using a URL to get these values will not work.
It would be even BETTER to have the web site folks create a web service that you can call and in one command it would return all of the data you need anyway, as opposed to over and over having to send the "enc" value, which you don't have anyway.

Modify a Google App Engine entity id?

I'm using Google App Engine NDB. Sometimes I will want to get all users with a phone number in a specified list. Using queries is extremely expensive for this, so I thought I'll just make the id value of the User entity the phone number of the user so I can fetch directly by ids.
The problem is that the phone number field is optional, so initially a User entity is created without a phone number, and thus with no value for id. So it would be created as user = User() as opposed to user = User(id = phone_number).
So when a user later decides to add a phone number to their account, is there any way to change that User entity's id value to the new phone number?
The entity ID forms part of the primary key for the entity, so there's no way to change it. Changing it is identical to creating a new entity with the new key and deleting the old one - which is one thing you can do, if you want.
A better solution would be to create a PhoneNumber kind that provides a reference to the associated User, allowing you to do lookups with get operations, but not requiring every user to have exactly one phone number.
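The idea of a separate PhoneNumber kind can be illustrated without the datastore at all. The sketch below uses a plain dict as an in-memory stand-in, with hypothetical function names; in NDB this would presumably correspond to a model keyed by the phone number that holds a KeyProperty pointing at the User, fetched with ndb.get_multi.

```python
# In-memory stand-in for the datastore: maps a key (the phone number)
# to a PhoneNumber record holding a reference to the owning user.
phone_numbers = {}


def add_phone_number(user_id, number):
    """Attach a phone number to a user without touching the User record."""
    phone_numbers[number] = {"user_id": user_id}


def users_for_numbers(numbers):
    """Fetch by key, one get per number, instead of running a query."""
    found = (phone_numbers.get(n) for n in numbers)
    return {rec["user_id"] for rec in found if rec is not None}


add_phone_number("user-1", "+15550001")
add_phone_number("user-1", "+15550002")  # a user may own several numbers
add_phone_number("user-2", "+15550003")
matches = users_for_numbers(["+15550002", "+15550003", "+15559999"])
```

Because the User key never changes, a user can add, change, or remove numbers later without being recreated.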

How do I create a session variable in Python?

I use Python 3 as a serverside scripting language, and I want a way to keep users logged into my site. I don't use any framework, since I prefer to hand code pages, so how do I create session variables like in PHP in Python 3?
The logic of a session is to store a unique session id in the user's cookie (the uuid module will do a perfect job for that). You then store the session data in a file, a database, or another semi-permanent datastore.
The idea is to match the session id that you receive from the user's cookie to data stored somewhere on your server.
I assume that you know how to add the right header to set a cookie via the response header.
Otherwise there is more information here : http://en.wikipedia.org/wiki/List_of_HTTP_header_fields#Responses
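A framework-free sketch of this scheme, using only the standard library, might look like the following. The cookie name "sessionid", the function names, and the in-memory dict used as the datastore are all assumptions made for illustration.

```python
import uuid
from http.cookies import SimpleCookie

SESSIONS = {}  # session id -> data; a file or database in real use


def start_session(data):
    """Create a session and return the value for a Set-Cookie header."""
    session_id = uuid.uuid4().hex
    SESSIONS[session_id] = dict(data)
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    cookie["sessionid"]["httponly"] = True
    cookie["sessionid"]["path"] = "/"
    return cookie["sessionid"].OutputString()


def load_session(cookie_header):
    """Return the session data for a request's Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("sessionid")
    return SESSIONS.get(morsel.value) if morsel else None


set_cookie = start_session({"user": "alice"})
# The browser echoes back only the name=value pair on later requests.
request_cookie = set_cookie.split(";")[0]
restored = load_session(request_cookie)
```

The server sends the returned string in a Set-Cookie response header; on each later request it passes the incoming Cookie header to load_session.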

Generating unique and opaque user IDs in Google App Engine

I'm working on an application that lets registered users create or upload content, and allows anonymous users to view that content and browse registered users' pages to find that content - this is very similar to how a site like Flickr, for example, allows people to browse its users' pages.
To do this, I need a way to identify the user in the anonymous HTTP GET request. A user should be able to type http://myapplication.com/browse/<userid>/<contentid> and get to the right page. The <userid> should be unique, but mustn't be something like the user's email address, for privacy reasons.
Through Google App Engine, I can get the email address associated with the user, but like I said, I don't want to use that. I can have users of my application pick a unique user name when they register, but I would like to make that optional if at all possible, so that the registration process is as short as possible.
Another option is to generate some random cookie (a GUID?) during the registration process and use that, but I don't see an obvious way of guaranteeing uniqueness of such a cookie without a trip to the database.
Is there a way, given an App Engine user object, of getting a unique identifier for that object that can be used in this way?
I'm looking for a Python solution - I forgot that GAE also supports Java now. Still, I expect the techniques to be similar, regardless of the language.
Your timing is impeccable: Just yesterday, a new release of the SDK came out, with support for unique, permanent user IDs. They meet all the criteria you specified.
I think you should distinguish between two types of users:
1) users that have logged in via Google Accounts or that have already registered on your site with a non-google e-mail address
2) users that opened your site for the first time and are not logged in in any way
For the second case, I can see no other way than to generate some random string (e.g. via uuid.uuid4() or from this user's session cookie key), as an anonymous user does not carry any unique information with them.
For users that are logged in, however, you already have a unique identifier -- their e-mail address. I agree with your privacy concerns -- you shouldn't use it as an identifier. Instead, how about generating a string that seems random, but is in fact generated from the e-mail address? Hashing functions are perfect for this purpose. Example:
>>> import hashlib
>>> email = 'user@host.com'
>>> salt = 'SomeLongStringThatWillBeAppendedToEachEmail'
>>> key = hashlib.sha1(('%s$%s' % (email, salt)).encode('utf-8')).hexdigest()
>>> print(key)
f6cd3459f9a39c97635c652884b3e328f05be0f7
hashlib.sha1 is not a random function: for given data it always returns the same result. Because it is considered practically irreversible, you can present the hashed key on the website without exposing the user's e-mail address, provided the salt stays secret. Also, you can safely assume that no two hashes of distinct e-mails will be the same (they can be, but the probability of it happening is very, very small). For more information on hashing functions, consult the Wikipedia entry.
Do you mean session cookies?
Try http://code.google.com/p/gaeutilities/
What DzinX said. The only way to create an opaque key that can be authenticated without a database roundtrip is using encryption or a cryptographic hash.
Give the user a random number and hash it or encrypt it with a private key. You still run the (tiny) risk of collisions, but you can avoid this by touching the database on key creation and changing the random number in case of a collision. Make sure the random number is cryptographically secure, and add a long server-side random number to prevent chosen-plaintext attacks.
You'll end up with a token like the Google Docs key, basically a signature proving the user is authenticated, which can be verified without touching the database.
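One way to sketch such a token in Python is with an HMAC, which is a keyed cryptographic hash. Note that this is a substitution for the "hash it or encrypt it" wording above, not necessarily what the answer had in mind; the token format and function names are assumptions.

```python
import hashlib
import hmac
import secrets

SERVER_SECRET = secrets.token_bytes(32)  # kept server-side only


def _sign(random_part):
    """HMAC-SHA256 of the random part under the server secret."""
    return hmac.new(
        SERVER_SECRET, random_part.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def issue_token():
    """Return 'random.signature'; the random part is the opaque id."""
    random_part = secrets.token_hex(16)
    return f"{random_part}.{_sign(random_part)}"


def verify_token(token):
    """Recompute the signature; no database lookup is involved."""
    random_part, sep, signature = token.partition(".")
    if not sep:
        return False
    expected = _sign(random_part)
    return hmac.compare_digest(
        signature.encode("utf-8"), expected.encode("utf-8")
    )


token = issue_token()
# Flip the last character to simulate a tampered token.
forged = token[:-1] + ("0" if token[-1] != "0" else "1")
```

Only a holder of SERVER_SECRET can produce a token that verify_token accepts, so the check needs no stored state.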
However, given the pricing of GAE and the speed of Bigtable, you're probably better off using a session ID if you really can't use Google's own authentication.
