I am using Redis sorted sets to store user notifications. Since I have never built a notification system before, I would like feedback on my logic.
I need to save 4 things for each notification.
post_id
post_type - A/B
visible - Y/N
checked - Y/N
My question is how can I store this type of structure in sorted sets?
ZADD users_notifications:1 10 1_A_Y_Y
ZADD users_notifications:1 20 2_A_Y_N
....
Is there a better way to do this kind of thing in Redis? In the example above I am saving all four things in each element, and I need to split on the underscore in the server-side language.
It really depends on how you need to query the data.
The most common way to approach this problem is to use a sorted set for the order and a hash for each object.
So:
ZADD notifications:<user-id> <timestamp> <post-id>
HMSET notifications:<user-id>:<post-id> type <type> visible <visible> checked <checked>
You'd use ZRANGE (or ZREVRANGE, to get the newest first) to fetch the notification IDs in order, and then a pipelined series of HMGET calls to get the attributes for each object.
As I mentioned, it depends on how you need to access the data. If, for example, you always show visible and unchecked notifications to a user, then you probably want to store those IDs in a different sorted set, so that you don't have to query for the status.
Assuming you have such a sorted set, when a user dismisses a notification you'd do:
HSET notifications:<user-id>:<post-id> visible 0
ZREM notifications:<user-id>:visible <post-id>
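As a rough Python sketch of the sorted-set-plus-hash layout above: the functions below assume `client` is a redis-py `redis.Redis` instance created with `decode_responses=True` (so IDs come back as `str`). The function names and the `timestamp` parameter are made up for illustration; only the key layout comes from the answer.

```python
import time


def add_notification(client, user_id, post_id, post_type,
                     visible=True, checked=False, timestamp=None):
    """Store one notification: a hash for its fields, sorted sets for ordering."""
    score = time.time() if timestamp is None else timestamp
    client.hset(
        f"notifications:{user_id}:{post_id}",
        mapping={"type": post_type, "visible": int(visible), "checked": int(checked)},
    )
    client.zadd(f"notifications:{user_id}", {str(post_id): score})
    if visible:
        # Second sorted set so visible notifications can be listed without
        # reading every hash first.
        client.zadd(f"notifications:{user_id}:visible", {str(post_id): score})


def latest_visible(client, user_id, count=10):
    """Return (post_id, fields) pairs, newest first, using one pipelined batch."""
    post_ids = client.zrevrange(f"notifications:{user_id}:visible", 0, count - 1)
    pipe = client.pipeline()
    for post_id in post_ids:
        pipe.hgetall(f"notifications:{user_id}:{post_id}")
    return list(zip(post_ids, pipe.execute()))


def dismiss(client, user_id, post_id):
    """Mark a notification as hidden and drop it from the visible set."""
    client.hset(f"notifications:{user_id}:{post_id}", "visible", 0)
    client.zrem(f"notifications:{user_id}:visible", str(post_id))
```

With this layout nothing has to be split on underscores: each attribute is a separate hash field, and the sorted sets hold only post IDs.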
Using the Python SDK, I could not find how to get all the keys from one bucket in Couchbase.
Docs reference:
http://docs.couchbase.com/sdk-api/couchbase-python-client-2.2.0/api/couchbase.html#item-api-methods
https://github.com/couchbase/couchbase-python-client/tree/master/examples
https://stackoverflow.com/questions/27040667/how-to-get-all-keys-from-couchbase
Is there a simple way to get all the keys?
I'm a little concerned as to why you would want every single key. The number of documents can get very large, and I can't think of a good reason to want every single key.
That being said, here are a couple of ways to do it in Couchbase:
N1QL. First, create a primary index (CREATE PRIMARY INDEX ON bucketname), then select the keys: SELECT META().id FROM bucketname; In Python, you can use N1QLQuery and N1QLRequest to execute these.
Create a map/reduce view index. Literally the default map function when you create a new map/reduce view index is exactly that: function (doc, meta) { emit(meta.id, null); }. In Python, use the View class.
You don't need Python to do these things, by the way, but you can use it if you'd like. Check out the documentation for the Couchbase Python SDK for more information.
I'm a little concerned as to why you would want every single key. The number of documents can get very large, and I can't think of a good reason to want every single key.
There is a document for every customer with the key being the username for the customer. That username is only held as a one-way hash (along with the password) for authentication. It is not stored in its original form or in a form from which the original can be recovered. It's not feasible to ask the 100 million customers to provide their userids. This came from an actual customer on #seteam.
The image above gives an example of what I hope to achieve with flask.
For now I have a list of tuples such as [(B,Q), (A,B,C), (T,R,E,P), (M,N)].
The list can be any length, as can the tuples. When I submit or pass my form, I receive the data on the server side, all good.
However, I am now asked to remember the state of previously submitted and passed forms so that the user can go back to them and, if needed, modify the information.
What would be the best way to remember the state of the forms?
Python dictionary with the key being the form number as displayed at the bottom (1 to 4)
Store the result in an SQL table and query it every time I need to access a form again
Other ideas?
Notes: The raw data should be kept for max one day, however, the data are to be processed to generate meaningful information to be stored permanently. Hence, if a modification is made to the form, the final database should reflect it.
This will very much depend on how the application is built.
One option is to simply always return all the answers posted, with each request, but that won't work well if you have a lot of data.
You say the data needs to stay accessible for a day, though, so storing it in a database seems reasonable. The cost of a SELECT query on an indexed key is negligible in most cases.
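A minimal sketch of the database option, using the standard-library `sqlite3` module as a stand-in for whatever SQL database the application uses. The table name `form_state`, the `session_id` column and the function names are assumptions for illustration. Each form's tuple is stored as JSON under its form number, so resubmitting a form simply overwrites the earlier state.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) form_state table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS form_state ("
        " session_id TEXT NOT NULL,"
        " form_number INTEGER NOT NULL,"
        " answers TEXT NOT NULL,"
        " PRIMARY KEY (session_id, form_number))"
    )
    return conn


def save_form(conn, session_id, form_number, answers):
    """Insert or overwrite the answers (a tuple of strings) for one form."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO form_state (session_id, form_number, answers)"
            " VALUES (?, ?, ?)",
            (session_id, form_number, json.dumps(list(answers))),
        )


def load_form(conn, session_id, form_number):
    """Return the stored tuple, or None if the form was never submitted."""
    row = conn.execute(
        "SELECT answers FROM form_state WHERE session_id = ? AND form_number = ?",
        (session_id, form_number),
    ).fetchone()
    return tuple(json.loads(row[0])) if row else None
```

The lookup goes through the primary key, which is the indexed-key query mentioned above. A timestamp column could be added to purge rows older than one day.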
I'm trying to delete all entries in the cache store that contain (in this case start with) a substring of the cache key, but I don't see any easy way of doing this. I'm using Memcache as backend.
If I understand the code correctly, I need to pass the full cache key when calling delete or delete_many. Is there any other way of doing this?
I'll explain what I'm trying to do in case there is a better way: I need to clear the cache for certain users when they modify their settings. Clearing the cache with clear() will remove the cache entries for all the users, which are some 110K, so I don't want to use that.
I am generating key_prefix with the ID of the user, the request's path, and other variables. The cache keys always start with the ID of the authenticated user. So ideally I would use something like delete_many(user_id + ".*")
It's not supported because Memcache is designed to be a distributed hash. There's no index of keys stored to search in.
Ideally you should know what suffixes a key may have.
If not, you could maintain an index yourself in a special key for the user, such as user_id + '_keys', which contains a list of that user's cache keys.
This way you can cycle key by key and delete all the cache for the user.
You can override the .set function to manage this new key.
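The per-user index could look roughly like the sketch below. It assumes only that the cache object exposes `get(key)`, `set(key, value)` and `delete(key)`; the class name `UserKeyIndex` and its methods are hypothetical, not part of any caching library.

```python
class UserKeyIndex:
    """Tracks which cache keys belong to each user in a per-user index entry."""

    def __init__(self, cache):
        # Any object with get(key), set(key, value) and delete(key).
        self.cache = cache

    def _index_key(self, user_id):
        return f"{user_id}_keys"

    def set(self, user_id, key, value):
        """Store the value and record its key in the user's index."""
        self.cache.set(key, value)
        index = self.cache.get(self._index_key(user_id)) or []
        if key not in index:
            index.append(key)
            self.cache.set(self._index_key(user_id), index)

    def delete_user(self, user_id):
        """Delete every indexed key for the user, then the index itself."""
        index_key = self._index_key(user_id)
        for key in self.cache.get(index_key) or []:
            self.cache.delete(key)
        self.cache.delete(index_key)
```

Two caveats: the read-modify-write on the index is not atomic, so concurrent requests for the same user can lose an entry, and Memcache may evict the index itself, in which case the remaining entries simply live until they expire.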
I have a distributed application that sends and receives data from a specific service on the Internet. When a node receives data, sometimes it needs to validate that that data is correlated with data it or another node previously sent. The value also needs to be unique enough so that I can practically expect never to generate identical values within 24 hours.
In the current implementation I have been using a custom header containing a value of uuid.uuid1(). I can easily validate that that value comes from the one single node running by comparing the received uuid to uuid.getnode(), but this implementation was written before we had a requirement that this app should be multi-node.
I still think that some uuid version is the right answer, but I can't seem to figure out how to validate an incoming uuid value.
>>> received = uuid.uuid5(uuid.NAMESPACE_URL, 'http://example.org')
>>> received
UUID('c57c6902-3774-5f11-80e5-cf09f92b03ac')
Is there some way to validate that received was generated with 'http://example.org'?
Is uuid the right approach at all? If not, what is?
If so, should I even be using uuid5 here?
If the goal is purely to create a unique value across your nodes, couldn't you just give each node a unique name and append that to the UUID you are generating?
It wasn't clear to me whether you are trying to do this for security reasons or simply want a guaranteed unique value across the nodes.
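A small sketch of the node-name idea, using only the standard `uuid` module. The `<node_name>:<uuid>` format and the function names are assumptions for illustration. The last function covers the uuid5 question: a uuid5 value is a one-way hash of namespace and name, so it cannot be reversed, but it can be checked by recomputing it from a candidate name.

```python
import uuid


def make_correlation_id(node_name):
    """Return '<node_name>:<random uuid4>'; node_name must not contain ':'."""
    return f"{node_name}:{uuid.uuid4()}"


def sender_of(correlation_id, known_nodes):
    """Return the node name if the ID is well formed and from a known node, else None."""
    node_name, sep, raw_uuid = correlation_id.partition(":")
    if not sep or node_name not in known_nodes:
        return None
    try:
        uuid.UUID(raw_uuid)  # raises ValueError on a malformed UUID string
    except ValueError:
        return None
    return node_name


def matches_name(received, namespace, name):
    """Check a uuid5 value by recomputing it from the candidate name."""
    return uuid.uuid5(namespace, name) == received
```

This only guarantees uniqueness and lets a node recognise which node claims to have sent a value. Anyone who knows the format can forge the prefix, so it is not a security measure.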
I have a simple GAE system that contains models for Account, Project and Transaction.
I am using Django to generate a web page that has a list of Projects in a table that belong to a given Account and I want to create a link to each project's details page. I am generating a link that converts the Project's key to string and includes that in the link to make it easy to lookup the Project object. This gives a link that looks like this:
My Project Name
Is it secure to create links like this? Is there a better way? It feels like a bad way to keep context.
The key string shows up in the linked page and is ugly. Is there a way to avoid showing it?
Thanks.
There are a few examples in the GAE docs that use the same approach, and keys only use characters that are safe to include in URLs. So there is probably no problem.
BTW, I prefer to use the numeric ID (obj_key.id()) when my model uses a number as its identifier, just because it looks less ugly.
Whether or not this is 'secure' depends on what you mean by that, and how you implement your app. Let's back off a bit and see exactly what's stored in a Key object. Take your key, go to shell.appspot.com, and enter the following:
db.Key(your_key)
This returns something like the following:
datastore_types.Key.from_path(u'TestKind', 1234, _app=u'shell')
As you can see, the key contains the App ID, the kind name, and the ID or name (along with the kind/id pairs of any parent entities - in this case, none). Nothing here you should be particularly concerned about concealing, so there shouldn't be any significant risk of information leakage here.
You mention as a concern that users could guess other URLs - that's certainly possible, since they could decode the key, modify the ID or name, and re-encode the key. If your security model relies on them not guessing other URLs, though, you might want to do one of a couple of things:
Reconsider your app's security model. You shouldn't rely on 'secret URLs' for any degree of real security if you can avoid it.
Use a key name, and set it to a long, random string that users will not be able to guess.
A final concern is what else users could modify. If you handle keys by passing them to db.get, the user could change the kind name, and cause you to fetch a different entity kind to that which you intended. If that entity kind happens to have similarly named fields, you might do things to the entity (such as revealing data from it) that you did not intend. You can avoid this by passing the key to YourModel.get instead, which will check the key is of the correct kind before fetching it.
All this said, though, a better approach is to pass the key ID or name around. You can extract this by calling .id() on the key object (for an ID - .name() if you're using key names), and you can reconstruct the original key with db.Key.from_path('kind_name', id) - or just fetch the entity directly with YourModel.get_by_id.
After doing some more research, I think I can now answer my own question. I wanted to know if using GAE keys or ids was inherently unsafe.
It is, in fact, unsafe without some additional code, since a user could modify URLs in the returned webpage or visit URLs that they build manually. This would potentially let an authenticated user edit another user's data just by changing a key ID in a URL.
So for every resource that you allow access to, you need to ensure that the currently authenticated user has the right to be accessing it in the way they are attempting.
This involves writing extra queries for each operation, since it seems there is no built-in way to just say "Users only have access to objects that are owned by them".
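The per-request ownership check can be sketched in plain Python as below. A dictionary stands in for the Datastore here, and the `Project` fields, the `Forbidden` exception and `get_project_for_user` are hypothetical names. In a real handler the lookup would be a Datastore fetch followed by the same comparison.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when the current user may not access the requested resource."""


@dataclass
class Project:
    project_id: int
    owner_id: str
    name: str


def get_project_for_user(projects, project_id, current_user_id):
    """Fetch a project by ID, but only if it belongs to the current user."""
    project = projects.get(project_id)
    # Treat "missing" and "not yours" the same way so IDs cannot be probed.
    if project is None or project.owner_id != current_user_id:
        raise Forbidden(f"project {project_id} is not accessible")
    return project
```

Routing every lookup through one helper like this keeps the check from being forgotten in individual handlers.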
I know this is an old post, but I want to clarify one thing. Sometimes you NEED to work with KEYs.
When you have an entity with a #Parent relationship, you can't get it by its ID; you need to use the whole KEY to get it back from the Datastore. In these cases you need to work with the KEY all the time if you want to retrieve your entity.
They aren't simply increasing; I only have 10 entries in my Datastore and I've already reached 7001.
As long as there is some form of protection so users can't simply guess them, there is no reason not to do it.