I am writing a web app on Google App Engine with Python. I am using Jinja2 as a templating engine.
I currently have it set up so that users can upvote and downvote posts, but right now they can vote on them as many times as they would like. I simply record each vote in a database and then recalculate the total right after that. How can I efficiently prevent users from casting multiple votes?
I suggest making a toggleVote method, which accepts the key of the item you want to toggle the vote on, and the key of the user making the vote.
I'd also suggest adding a table to record the votes, basically containing two fields:
"keyOfUserVoting", "keyOfItemBeingVotedOn"
That way you can simply run a query where the keys match, and if a record exists, then you know the user voted on that item. (Query where keyOfUserVoting = 'param1' and keyOfItemVoted = 'param2'; if result != None, then the user has voted.)
For the toggleVote() method the logic can be very simple:
toggleVote(keyOfUserVoting, keyOfItemToVoteOn):
    if a vote record exists for this user/item pair:
        // delete this record from the 'votes' table
    else:
        // add a record to the 'votes' table
That way you'll never have to keep track, on an individual basis, of how many times a user has voted.
Also this way, if you want to find out how many votes are on an item, you can do another query to quickly count where keyOfItemToVoteOn = paramKeyOfItem. Again, with GAE, this will be very fast.
In this setup, you can also quickly tell how many times a user has voted on one item (count where userKey = value and where itemKey = value), or how many times a user has voted in the entire system (count where userKey = value)...
Lastly, for best reliability, you can wrap the updates in the toggleVote() method in a transaction, especially if you'll be doing other things on the user or item being voted on.
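The whole scheme can be sketched with plain SQL as a stand-in for the GAE datastore (a minimal sketch; the table and function names are assumptions, not the asker's actual code):

```python
import sqlite3

# Plain-SQL stand-in for the datastore; the composite primary key
# guarantees at most one vote per (user, item) pair.
conn = sqlite3.connect(':memory:')
conn.execute('create table votes ('
             'keyOfUserVoting text, keyOfItemBeingVotedOn text, '
             'primary key (keyOfUserVoting, keyOfItemBeingVotedOn))')

def toggle_vote(user_key, item_key):
    exists = conn.execute(
        'select 1 from votes where keyOfUserVoting=? and keyOfItemBeingVotedOn=?',
        (user_key, item_key)).fetchone()
    if exists:
        # user had already voted: remove the vote
        conn.execute(
            'delete from votes where keyOfUserVoting=? and keyOfItemBeingVotedOn=?',
            (user_key, item_key))
    else:
        conn.execute('insert into votes values (?, ?)', (user_key, item_key))

def vote_count(item_key):
    return conn.execute(
        'select count(*) from votes where keyOfItemBeingVotedOn=?',
        (item_key,)).fetchone()[0]

toggle_vote('user1', 'item9')
toggle_vote('user2', 'item9')
toggle_vote('user1', 'item9')   # toggles user1's vote back off
print(vote_count('item9'))      # → 1
```

On GAE itself the same shape maps onto a Vote model with two properties and an ancestor or composite-key query in place of the SELECT.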
Hope this helps.
Store the voting user with the vote, and check for an existing vote by the current user, using your database.
You can perform the check either before you serve the page (and so disable your voting buttons), or when you get the vote attempt (and show some kind of message). You should probably write code to handle both scenarios if the voting really matters to you.
I'm using the following method to perform a bulk insert, and to optionally avoid inserting duplicates, with SQLAlchemy:
def bulk_insert_users(self, users, allow_duplicates=False):
    if not allow_duplicates:
        users_new = []
        for user in users:
            if not self.SQL_IO.db.query(User_DB.id).filter_by(user_id=user.user_id).scalar():
                users_new.append(user)
        users = users_new
    self.SQL_IO.db.bulk_save_objects(users)
    self.SQL_IO.db.commit()
Can the above functionality be implemented such that the function is faster?
You can load all user ids first, put them into a set, and then use user.user_id in existing_user_ids to determine whether to add a new user or not, instead of sending a SELECT query every time. Even with tens of thousands of users this will be quite fast, especially compared to contacting the database for each user.
How many users do you have? You're querying for the users one at a time, every single iteration of that loop. You might have more luck querying for ALL user IDs at once, putting them in a set, then checking against that set.
# one query for all existing IDs, held in a set for fast membership tests
existing_ids = {uid for (uid,) in self.SQL_IO.db.query(User_DB.user_id).all()}
for user in new_users:
    if user.user_id not in existing_ids:
        users_new.append(user)
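As an alternative sketch (plain sqlite3 rather than SQLAlchemy; the table and column names are assumptions), you can also let the database skip duplicates itself via a unique constraint and INSERT OR IGNORE, avoiding the pre-query entirely:

```python
import sqlite3

# The primary key on user_id makes duplicates a no-op under
# INSERT OR IGNORE, so no existence check is needed in Python.
conn = sqlite3.connect(':memory:')
conn.execute('create table users (user_id integer primary key, name text)')

def bulk_insert_users(rows):
    conn.executemany(
        'insert or ignore into users (user_id, name) values (?, ?)', rows)
    conn.commit()

bulk_insert_users([(1, 'a'), (2, 'b')])
bulk_insert_users([(2, 'bee'), (3, 'c')])  # duplicate id 2 is skipped
print(conn.execute('select count(*) from users').fetchone()[0])  # → 3
```

SQLAlchemy can express the same idea with a dialect-specific insert, but the constraint-based approach is the part that makes it fast.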
Let's assume I am developing a service that provides a user with articles. Users can favourite articles and I am using Solr to store these articles for search purposes.
However, when the user adds an article to their favourites list, I would like to be able to figure out which articles the user has added to favourites so that I can highlight the favourite button.
I am thinking of two approaches:
Fetch articles from Solr and then loop through each article to fetch the "favourite-status" of this article for this specific user from MySQL.
Whenever a user favourites an article, add this user's ID to a multi-valued column in Solr and check whether the ID of the current user is in this column or not.
I don't know the capacity of the multivalued column... and I also don't think the second approach would be a "good practice" (saving user-related data in index).
What other options do I have, if any? Is approach 2 a correct approach?
I'd go with a modified version of the first one - it'll keep user-specific data that's not going to be used for search out of the index for now (although if you foresee a case where you want to search for favourited articles, it would probably be an interesting field to have in the index). For display purposes like this, I'd take all the IDs returned from Solr, fetch them in one SQL statement from the database, and then set the UI values based on that. It's a fast and easy solution.
If you foresee that "search only in my fav'd articles" as a use case, I would try to get that information into the index as well (or other filter applications against whether a specific user has added the field as a favourite). I'd try to avoid indexing anything more than the user id that fav'd the article in that case.
Both solutions would work, however, although the latter requires more code - and the response from Solr could grow large if many users favourite an article, so I'd try to avoid returning a set of user IDs in that case (many favourites for a single article).
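The "one SQL statement" step of the first approach can be sketched like this (a hypothetical schema; table, column, and function names are assumptions):

```python
import sqlite3

# Stand-in MySQL table mapping users to favourited articles.
conn = sqlite3.connect(':memory:')
conn.execute('create table favourites (user_id integer, article_id integer)')
conn.executemany('insert into favourites values (?, ?)',
                 [(1, 10), (1, 30), (2, 20)])

def favourite_ids(user_id, article_ids):
    # single query: which of the Solr-returned ids has this user favourited?
    placeholders = ','.join('?' * len(article_ids))
    rows = conn.execute(
        'select article_id from favourites '
        'where user_id = ? and article_id in (%s)' % placeholders,
        [user_id] + list(article_ids))
    return {row[0] for row in rows}

solr_ids = [10, 20, 30]             # ids returned by the Solr search
faved = favourite_ids(1, solr_ids)
for aid in solr_ids:
    # drive the favourite-button state from set membership
    print(aid, aid in faved)
```

One round-trip to the database per page, regardless of how many articles the search returned.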
I'm making a simple BBS application in Django and I want it so that whenever someone sees a post, the number of views on that post (post_view_no) is increased.
At the moment, I face two difficulties:
I need to limit the increase in post_view_no so that one user can only increase it once regardless of how many times the user refreshes/clicks on the post.
I also need to be able to track the users that are not logged in.
Regarding the first issue, it seems pretty easy as long as I create a model called 'View' and check the DB, but I have a feeling this may be overkill.
In terms of the second issue, all I can think of is using cookies / IP addresses to track the users, but an IP is hardly unique and I cannot figure out how to use cookies.
I believe this is a common feature on forum/BBS solutions, but a Google search only turned up plugins or 'dumb' solutions that increase the view count each time the post is viewed.
What would be the best way to go about this?
I think you can do both things via cookies. For example, when user visits a page, you can
Check if they have “viewed_post_%s” (where %s is post ID) key set in their session.
If they have, do nothing. If they don't, increase view_count numeric field of your corresponding Post object by one, and set the key (cookie) “viewed_post_%s” in their session (so that it won't count in future).
This would work with both anonymous and registered users, however by clearing cookies or setting up browser to reject them user can game the view count.
Now using cookies (sessions) with Django is quite easy: to set a value for current user, you just invoke something like
request.session['viewed_post_%s' % post.id] = True
in your view, and done. (Check the docs, and especially examples.)
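The two steps above can be sketched framework-agnostically; here `session` is a plain dict standing in for request.session, `view_counts` stands in for the Post objects' counter field, and the helper name is an assumption:

```python
def record_view(session, view_counts, post_id):
    # step 1: check whether this session has already viewed the post
    key = 'viewed_post_%s' % post_id
    if not session.get(key):
        # step 2: count the view once, then mark the session
        view_counts[post_id] = view_counts.get(post_id, 0) + 1
        session[key] = True   # won't be counted again for this session

session = {}
view_counts = {}
record_view(session, view_counts, 42)
record_view(session, view_counts, 42)   # refresh: no double count
print(view_counts[42])                  # → 1
```

In a real Django view you'd replace the dict update with an F()-expression update on the Post row to avoid a read-modify-write race.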
Disclaimer: this is off the top of my head; I haven't done this personally. Usually when there's a need for page view / activity tracking (so that you can see what drives more traffic to your website, when users are more active, etc.), there's a point in using a specialized system (e.g., Google Analytics, StatsD). But for a specific use case like this, or as an exercise, this should work.
Just to offer a secondary solution, which I think would work but is also prone to gaming (by coming through a proxy or from different devices). I haven't tried this either, but I think it should work, wouldn't require thinking about cookies, and aggregates some extra data, which is nice.
I would make a model called TrackedPost.
class TrackedPost(models.Model):
    post = models.ForeignKey(Post)
    ip = models.GenericIPAddressField()  # handles IPv4 and IPv6
    user = models.ForeignKey(User, null=True, blank=True)  # null for anonymous visitors
Then when you view a post, you would take the request's IP.
def my_post_view(request, post_id):
    # anonymous visitors are stored with user=None
    user = request.user if request.user.is_authenticated() else None
    ip = request.META.get('REMOTE_ADDR')
    tracked_post, created = TrackedPost.objects.get_or_create(
        post_id=post_id, ip=ip, user=user)
    if created:
        tracked_post.post.post_view_no += 1
        tracked_post.post.save()
    return render_to_response('post.html', {'post': tracked_post.post})
Background:
I have the core functionality of a very simple vote-based site setup and working well in pyramid utilizing a sqlite database. The last requirement for this application is to allow only one vote per day, per user. It has been specified that this must be done via cookies, and that no users shall be allowed to vote on Saturdays or Sundays.
I am currently using UnencryptedCookieSessionFactoryConfig for session management and to handle flash messages.
Question:
I've identified that I need the following functionality, but can't determine what modules of pyramid might provide it (or if I should be looking elsewhere):
Create a cookie for each user that persists between browser sessions (I am aware this is insecure as a method of preventing multiple votes. That's fine.)
Allow a single vote to be placed per day, per user.
Give a new vote to a user once 24 hours has elapsed.
Prevent all voting if day of week = Saturday or Sunday (this should be trivial with a datetime() check placed prior to any cookie-checking logic).
Additional Info:
My current db schema is as follows, and must stay this way:
create table if not exists games (
id integer primary key autoincrement,
title char(100) not null,
owned bool not null,
created char(40) not null
);
create table if not exists votes (
gameId integer,
created char(40) not null,
FOREIGN KEY(gameId) REFERENCES games(id)
);
and the current vote function is as follows:
@view_config(route_name='usevote')
def usevote_view(request):
    game_id = int(request.matchdict['id'])
    # 'created' is a char(40) column, so store an ISO timestamp string
    now = datetime.datetime.now().isoformat()
    request.db.execute('insert into votes (gameId, created) values (?, ?)',
                       (game_id, now))
    request.db.commit()
    request.session.flash('Your vote has been counted. You can vote again in 24 hours.')
    return HTTPFound(location=request.route_url('list'))
Thanks!
session data on cookies only
To integrate cookie sessions in Pyramid, take a look at pyramid_beaker.
To guarantee integrity using only cookies (and keep the user from poking into the cookie data), you should use an encrypted cookie (take a look at the Session Based Cookie and Encryption Options sections of the docs).
Your main configuration will look somewhat like this:
[app:main]
...
session.type = cookie
session.key = SESSION
session.encrypt_key = R9RD9qx7uzcybJt1iBzeMoohyDUbZAnFCyfkWfxOoX8s5ay3pM
session.validate_key = pKs3JDwWiJmt0N0wQjJIqdG5c1XsHSlauM6T2DfB8FqOifsWZN
...
The session.key is just the name of the cookie. Change it to whatever you want.
The session.encrypt_key and session.validate_key above are just examples of big random strings. You should generate them yourself and keep them private.
Also, to encrypt the cookies properly you will need an AES cipher implementation. Installing pycrypto should do it:
pip install pycrypto
Also your main function that creates the wsgi application should be changed to something like this:
from pyramid_beaker import session_factory_from_settings
...
def main(global_config, **settings):
...
config = Configurator(settings=settings)
...
config.set_session_factory(session_factory_from_settings(settings))
Now you can store the cookie data directly in the client's browser and prevent data tampering. The simple solution to your problem is to set this cookie to never expire, store the date of the user's last vote inside it, and compare that date against today whenever they try to vote.
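A minimal sketch of that check, assuming the beaker session behaves like a dict and using calendar dates rather than a strict 24-hour window; the helper names and 'last_vote' key are assumptions:

```python
import datetime

def can_vote(session, today=None):
    today = today or datetime.date.today()
    if today.weekday() >= 5:          # 5 = Saturday, 6 = Sunday
        return False                  # no voting on weekends
    # allow one vote per calendar day
    return session.get('last_vote') != today.isoformat()

def record_vote(session, today=None):
    today = today or datetime.date.today()
    session['last_vote'] = today.isoformat()

session = {}
friday = datetime.date(2013, 3, 1)    # a Friday
saturday = datetime.date(2013, 3, 2)
print(can_vote(session, friday))      # → True
record_vote(session, friday)
print(can_vote(session, friday))      # → False (already voted today)
print(can_vote(session, saturday))    # → False (weekend)
```

In the real view you'd also call session.save() after record_vote so beaker writes the cookie back.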
the main issue
The main problem now is dealing with users who delete the cookie, use another browser, or simply use the browser's incognito window (Chrome) or private browsing (Firefox). Such a user appears to be a new user to your system and thus can vote again.
IMO, to solve that you will need some server-side control, or to penalize the user in a way that makes deleting the cookie painful enough that deleting it to gain a vote is no longer desirable.
Security is not about perfectly unhackable systems, but about building systems where the cost of bypassing them is higher than the benefit of doing so.
Using cookies for this kind of control will not prevent even the simplest attack (using a different browser, for example :)). But you seem to know that and not actually care, so it should be fine, I guess:
Every time a user votes, add a field to the cookie (you should also set its age limit to at least a week) with the value of the current date.
Next time the user tries to vote, check whether it's Saturday or Sunday (according to the user's time settings), whether that field exists, and whether its value is older than one day.
If you set the cookie validity to the next Saturday, you get an extra verification mechanism, as the cookie won't be valid anyway once Saturday arrives :)
It sounds like an odd one but it's a really simple idea. I'm trying to make a simple Flickr for a website I'm building. This specific problem comes when I want to show a single photo (from my Photo model) on the page but I also want to show the image before it in the stream and the image after it.
If I were only sorting these streams by date, or was only sorting by ID, that might be simpler... But I'm not. I want to allow the user to sort and filter by a whole variety of methods. The sorting is simple. I've done that and I have a result-set, containing 0-many Photos.
If I want a single Photo, I start off with that filtered/sorted/etc stream. From it I need to get the current Photo, the Photo before it and the Photo after it.
Here's what I'm looking at, at the moment.
prev = None
next = None
photo = None
for i in range(filtered_queryset.count()):
    if filtered_queryset[i].pk == desired_pk:
        if i > 0: prev = filtered_queryset[i-1]
        if i < filtered_queryset.count() - 1: next = filtered_queryset[i+1]
        photo = filtered_queryset[i]
        break
It just seems disgustingly messy. And inefficient. Oh my lord, so inefficient. Can anybody improve on it though?
Django queries are late-binding, so it would be nice to make use of that though I guess that might be impossible given my horrible restrictions.
Edit: it occurs to me that I can just chuck in some SQL to re-filter queryset. If there's a way of selecting something with its two (or one, or zero) closest neighbours with SQL, I'd love to know!
You could try the following:
Evaluate the filtered/sorted queryset and get the list of photo ids, which you hold in the session. These ids all match the filter/sort criteria.
Keep the current index into this list in the session too, and update it when the user moves to the previous/next photo. Use this index to get the prev/current/next ids to use in showing the photos.
When the filtering/sorting criteria change, re-evaluate the list and set the current index to a suitable value (e.g. 0 for the first photo in the new list).
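The steps above can be sketched with a plain dict standing in for the session; the helper and key names are assumptions:

```python
def neighbours(session, ids, pk):
    # cache the evaluated id list and current index in the session
    session['photo_ids'] = ids
    i = ids.index(pk)
    session['photo_index'] = i
    # prev/next come straight from the list, with bounds checks
    prev_pk = ids[i - 1] if i > 0 else None
    next_pk = ids[i + 1] if i < len(ids) - 1 else None
    return prev_pk, next_pk

session = {}
print(neighbours(session, [3, 8, 5], 8))   # → (3, 5)
print(neighbours(session, [3, 8, 5], 3))   # → (None, 8)
```

In Django, `ids` would come from one cheap query such as list(filtered_queryset.values_list('pk', flat=True)), so the full Photo rows are only fetched for the three pks you actually display.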
I see the following possibilities:
1. Your URL query parameters contain the sort/filtering information and some kind of 'item number', which is the item number within your filtered queryset. This is the simple case - previous and next are item number minus one and plus one respectively (plus some bounds checking).
2. You want the URL to be a permalink, and contain the photo primary key (or some unique ID). In this case, you are presumably storing the sorting/filtering in one of:
   - the URL as query parameters. In this case you don't have true permalinks, and so you may as well stick the item number in the URL as well, getting you back to option 1.
   - hidden fields in the page, using POSTs for links instead of normal links. In this case, stick the item number in the hidden fields as well.
   - session data/cookies. This will break if the user has two tabs open with different sorts/filtering applied, but that might be a limitation you don't mind - after all, you have envisaged that they will probably just be using one tab and clicking through the list. In this case, store the item number in the session as well. You might be able to do something clever to "namespace" the item number for the case where they have multiple tabs open.
In short, store the item number wherever you are storing the filtering/sorting information.