The image above gives an example of what I hope to achieve with flask.
For now I have a list of tuples such as [(B,Q), (A,B,C), (T,R,E,P), (M,N)].
The list can be any length, as can the tuples. When I submit or pass my form, I receive the data on the server side, all good.
However, I am now asked to remember the state of previously submitted and passed forms in order to go back to them and eventually modify the information.
What would be the best way to remember the state of the forms?
Python dictionary with the key being the form number as displayed at the bottom (1 to 4)
Store the result in an SQL table and query it every time I need to access a form again
Other ideas?
Notes: The raw data should be kept for at most one day; however, the data are to be processed to generate meaningful information that is stored permanently. Hence, if a modification is made to a form, the final database should reflect it.
This will very much depend on how the application is built.
One option is to simply return all previously posted answers with each request, but that won't work well if you have a lot of data.
Since you say the data needs to be accessible for a day, it seems reasonable to store it in a database. The cost of select queries on an indexed key is insignificant in most cases.
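Since the question mentions keeping the raw data for at most one day, here is a minimal framework-free sketch of a server-side store keyed by form number with a one-day expiry. All names (`FormStore`, `MAX_AGE`) are illustrative, not from the question; a real app would back this with a database or session as the answer suggests.

```python
import time

# One day, per the question's retention note
MAX_AGE = 24 * 60 * 60

class FormStore:
    """Hypothetical in-memory store for submitted forms,
    keyed by form number (1 to 4 in the question)."""

    def __init__(self):
        self._forms = {}  # form_number -> (timestamp, data)

    def save(self, form_number, data):
        self._forms[form_number] = (time.time(), data)

    def load(self, form_number):
        entry = self._forms.get(form_number)
        if entry is None:
            return None
        ts, data = entry
        if time.time() - ts > MAX_AGE:
            del self._forms[form_number]  # raw data kept for max one day
            return None
        return data

store = FormStore()
store.save(1, [("B", "Q"), ("A", "B", "C")])
print(store.load(1))  # → [('B', 'Q'), ('A', 'B', 'C')]
```

Reloading a form for editing is then just `store.load(n)`; writing the processed result to the permanent database happens separately on each (re)submission.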
Related
Using the Python SDK, I could not find how to get all the keys from one bucket in Couchbase.
Docs reference:
http://docs.couchbase.com/sdk-api/couchbase-python-client-2.2.0/api/couchbase.html#item-api-methods
https://github.com/couchbase/couchbase-python-client/tree/master/examples
https://stackoverflow.com/questions/27040667/how-to-get-all-keys-from-couchbase
Is there a simple way to get all the keys?
I'm a little concerned as to why you would want every single key. The number of documents can get very large, and I can't think of a good reason to want every single key.
That being said, here are a couple of ways to do it in Couchbase:
N1QL. First, create a primary index (CREATE PRIMARY INDEX ON bucketname), then select the keys: SELECT META().id FROM bucketname; In Python, you can use N1QLQuery and N1QLRequest to execute these.
Create a map/reduce view index. Literally the default map function when you create a new map/reduce view index is exactly that: function (doc, meta) { emit(meta.id, null); }. In Python, use the View class.
You don't need Python to do these things, by the way, but you can use it if you'd like. Check out the documentation for the Couchbase Python SDK for more information.
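As a toy illustration (not the Couchbase SDK itself), this is what the default map function `function (doc, meta) { emit(meta.id, null); }` produces when run over a bucket - one emitted row per document, keyed by the document id. The bucket contents here are made up for the example.

```python
# Stand-in for a bucket's documents (keys are the document ids)
bucket = {
    "user::alice": {"name": "Alice"},
    "user::bob": {"name": "Bob"},
}

def default_map(doc_id, doc):
    # Mirrors emit(meta.id, null): one row per document, keyed by its id
    yield (doc_id, None)

emitted = [row for key, doc in bucket.items() for row in default_map(key, doc)]
all_keys = [k for k, _ in emitted]
print(all_keys)  # → ['user::alice', 'user::bob']
```

Querying the real view (or `SELECT META().id FROM bucketname` via N1QL) returns this same id-per-document result, computed server-side.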
I'm a little concerned as to why you would want every single key. The number of documents can get very large, and I can't think of a good reason to want every single key.
There is a document for every customer with the key being the username for the customer. That username is only held as a one-way hash (along with the password) for authentication. It is not stored in its original form or in a form from which the original can be recovered. It's not feasible to ask the 100 million customers to provide their userids. This came from an actual customer on #seteam.
I'm working on a web app in Python (Flask) that, essentially, shows the user information from a PostgreSQL database (via Flask-SQLAlchemy) in a random order, with each set of information being shown on one page. Hitting a Next button will direct the user to the next set of data by replacing all data on the page with new data, and so on.
My conundrum comes with making the presentation truly random - not showing the user the same information twice by remembering what they've seen and not showing them those already seen sets of data again.
The site has no user system, and the "already seen" sets of data should be forgotten when they close the tab/window or navigate away.
I should also add that I'm a total newbie to SQL in general.
What is the best way to do this?
The easiest way is to do the random number generation in javascript at the client end...
Tell the client the highest row number; the client page then keeps track of which ids it has requested (just a simple JS array). When the "request next random page" button is clicked, it generates a new random number less than the highest valid row id and, provided the number isn't in its list of previously viewed items, sends a request for that item.
This way, you (on the server) only have to have 2 database accessing views:
main page (which gives the js, and the highest valid row id)
display an item (by id)
You don't have any complex session tracking, and the user's browser only has to keep track of a simple list of numbers, which, even if they personally view several thousand different items, is still only going to be a meg or two of memory.
For performance reasons, you can even pre-fetch the next item as soon as the current item loads, so that it displays instantly and loads the next one in the background while they're looking at it. (jQuery .load() is your friend :-) )
If you expect a large number of items to be removed from the database (so that the highest number is not helpful), then you can instead generate a list of random ids, send that, and then request them one at a time. Pre-generate the random list, as it were.
Hope this helps! :-)
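The client-side selection logic described above can be sketched in Python (the answer proposes doing it in JS; this is just the same idea, with illustrative names). It picks an unseen id below the highest valid one and records it as viewed:

```python
import random

def next_random_item(highest_id, seen):
    """Pick an id in [1, highest_id] not previously viewed.

    Mirrors the logic above: track seen ids, choose among the rest.
    Returns None once everything has been viewed.
    """
    remaining = [i for i in range(1, highest_id + 1) if i not in seen]
    if not remaining:
        return None
    choice = random.choice(remaining)
    seen.add(choice)
    return choice

seen = set()
first = next_random_item(3, seen)
second = next_random_item(3, seen)
print(first != second)  # → True (an id is never repeated)
```

The "pre-generated random list" variant at the end of the answer is the same thing computed up front: shuffle the list of valid ids once and pop from it.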
You could stick the "already seen" data in a session cookie. Selecting random SQL data is explained here
I have several CouchDB databases. The largest is about 600k documents, and I am finding that queries are prohibitively long (several hours or more). The DB is updated infrequently (once a month or so), and only involves adding new documents, never updating existing documents.
Queries are of the type: Find all documents where key1='a' or multiple keys: key1='a', key2='b'...
I don't see that permanent views are practical here, so I have been using the CouchDB-Python 'query' method.
I have tried several approaches, and I am unsure what is most efficient, or why.
Method 1:
map function is:
map_fun = '''function(doc){
    if(doc.key1=='a'){
        emit(doc.A, [doc.B, doc.C, doc.D, doc.E]);
    }
}'''
The Python query is:
results = ui.db.query(map_fun, key2=user)
Then some operation with results.rows. This takes up the most time.
It takes about an hour for 'results.rows' to come back. If I change key2 to something else, it comes back in about 5 seconds. If I repeat the original user, it's also fast.
But sometimes I need to query on more keys, so I try:
map_fun = '''function(doc){
    if(doc.key1=='a' && doc.key2=='%s' && doc.key3=='something else' /* etc. */){
        emit(doc.A, [doc.B, doc.C, doc.D, doc.E]);
    }
}''' % user
and use the python query:
results = ui.db.query(map_fun)
Then some operation with results.rows
Takes a long time for the first query. When I change key2, it takes a long time again. If I change key2 back to the original value, it takes the same amount of time. (That is, nothing seems to be getting cached, B-tree'd, or whatever.)
So my question is: What's the most efficient way to do queries in couchdb-python, where the queries are ad hoc and involve multiple keys for search criteria?
The UI is Qt-based, using PyQt underneath.
There are two caveats for the couchdb-python db.query() method:
It executes a temporary view. This means processing blocks until every document has been run through the view, and this happens again and again on each call. Try saving the view and using the db.view() method instead, to get results on demand and benefit from incremental index updates.
It reads the whole result, no matter how big it is. Neither db.query() nor db.view() is lazy, so if the view result is a 100 MB JSON object, you have to fetch all of that data before you can use it. To query data in a more memory-efficient way, try applying the patch that adds a db.iterview() method - it lets you fetch data in paginated style.
I think the fix for your problem is to create an index for the keys you are searching. That is what you called a permanent view.
Note the difference between map/reduce and SQL queries in a B-tree based table:
a simple SQL query searching for a key (if you have an index for it) traverses a single path in the B+-tree from root to leaf,
a map function reads all the elements, even if it emits a small result.
What you are doing, for each query, is:
reading every document (most of the cost) and
searching for a key in the emitted result (a quick search in the B-tree),
so I think your solution has to be slow by design.
If you redesign database structure to make permanent views practical, (1.) will be executed once and only (2.) will be executed for each query. Each document will be read by a view after addition to DB and a query will search in B-tree storing emitted result. If emitted set is smaller than the total documents number, then the query searches smaller structure and you have the benefit over SQL databases.
Temporary views are far less efficient than permanent ones and are meant to be used only for development. CouchDB was designed to work with permanent views. To make map/reduce efficient, one has to implement caching or make the view permanent. I am not familiar with the details of the CouchDB implementation; perhaps the second query with a different key is faster because of some caching. If for some reason you have to use temporary views, then perhaps CouchDB is a mistake and you should consider a DBMS created and optimized for online queries, like MongoDB.
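The cost difference the answer describes can be shown with a toy Python model, where plain dicts stand in for CouchDB's B-trees and the documents are made up: a permanent view pays the read-every-document cost once, after which each query searches only the (smaller) emitted index.

```python
# Made-up documents; key1=='a' is the map function's condition
docs = {
    "d1": {"key1": "a", "key2": "u1", "B": 1},
    "d2": {"key1": "b", "key2": "u1", "B": 2},
    "d3": {"key1": "a", "key2": "u2", "B": 3},
}

# "View build": runs once (and incrementally for new docs),
# like a permanent view - this is step (1), the expensive part.
index = {}
for doc_id, doc in docs.items():
    if doc["key1"] == "a":
        index.setdefault(doc["key2"], []).append(doc_id)

def query(key2):
    # Step (2): each query touches only the emitted entries,
    # never re-reads the documents.
    return index.get(key2, [])

print(query("u2"))  # → ['d3']
```

With db.query(), the loop above effectively re-runs over every document on every call; with a saved view it runs once and only `query` runs per request.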
It sounds like an odd one but it's a really simple idea. I'm trying to make a simple Flickr for a website I'm building. This specific problem comes when I want to show a single photo (from my Photo model) on the page but I also want to show the image before it in the stream and the image after it.
If I were only sorting these streams by date, or was only sorting by ID, that might be simpler... But I'm not. I want to allow the user to sort and filter by a whole variety of methods. The sorting is simple. I've done that and I have a result-set, containing 0-many Photos.
If I want a single Photo, I start off with that filtered/sorted/etc stream. From it I need to get the current Photo, the Photo before it and the Photo after it.
Here's what I'm looking at, at the moment.
prev = None
next = None
photo = None
count = filtered_queryset.count()
for i in range(count):
    if filtered_queryset[i].pk == desired_pk:
        if i > 0: prev = filtered_queryset[i - 1]
        if i < count - 1: next = filtered_queryset[i + 1]
        photo = filtered_queryset[i]
        break
It just seems disgustingly messy. And inefficient. Oh my lord, so inefficient. Can anybody improve on it though?
Django queries are late-binding, so it would be nice to make use of that though I guess that might be impossible given my horrible restrictions.
Edit: it occurs to me that I can just chuck in some SQL to re-filter queryset. If there's a way of selecting something with its two (or one, or zero) closest neighbours with SQL, I'd love to know!
You could try the following:
Evaluate the filtered/sorted queryset and get the list of photo ids, which you hold in the session. These ids all match the filter/sort criteria.
Keep the current index into this list in the session too, and update it when the user moves to the previous/next photo. Use this index to get the prev/current/next ids to use in showing the photos.
When the filtering/sorting criteria change, re-evaluate the list and set the current index to a suitable value (e.g. 0 for the first photo in the new list).
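The index-into-the-session-list step can be sketched like this; `neighbours` and the sample ids are illustrative, not from the question's code:

```python
def neighbours(photo_ids, index):
    """Given the session-held list of photo ids and the current index,
    return (prev_id, current_id, next_id), with None at either end."""
    prev_id = photo_ids[index - 1] if index > 0 else None
    next_id = photo_ids[index + 1] if index < len(photo_ids) - 1 else None
    return prev_id, photo_ids[index], next_id

ids = [42, 7, 19, 3]  # e.g. pks from the evaluated, sorted/filtered queryset
print(neighbours(ids, 0))  # → (None, 42, 7)
print(neighbours(ids, 2))  # → (7, 19, 3)
```

Only three single-pk lookups are then needed per page view, instead of iterating the whole queryset.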
I see the following possibilities:
Your URL query parameters contain the sort/filtering information and some kind of 'item number', which is the item number within your filtered queryset. This is the simple case - previous and next are item number minus one and plus one respectively (plus some bounds checking)
You want the URL to be a permalink, and contain the photo primary key (or some unique ID). In this case, you are presumably storing the sorting/filtering in:
in the URL as query parameters. In this case you don't have true permalinks, and so you may as well stick the item number in the URL as well, getting you back to option 1.
hidden fields in the page, and using POSTs for links instead of normal links. In this case, stick the item number in the hidden fields as well.
session data/cookies. This will break if the user has two tabs open with different sorts/filtering applied, but that might be a limitation you don't mind - after all, you have envisaged that they will probably just be using one tab and clicking through the list. In this case, store the item number in the session as well. You might be able to do something clever to "namespace" the item number for the case where they have multiple tabs open.
In short, store the item number wherever you are storing the filtering/sorting information.
So, basically what I'm trying to do is a hockey pool application, and there are a ton of ways I should be able to filter to view the data. For example, filter by free agent, goals, assists, position, etc.
I'm planning on doing this with a bunch of query strings, but I'm not sure what the best approach would be to pass along the these query strings. Lets say I wanted to be on page 2 (as I'm using pagination for splitting the pages), sort by goals, and only show forwards, I would have the following query set:
?page=2&sort=g&position=f
But if I was on that page, and it was showing me all this corresponding info, if I was to click say, points instead of goals, I would still want all my other filters in tact, so like this:
?page=2&sort=p&position=f
Since HTTP is stateless, I'm having trouble working out the best approach. If anyone has good ideas they would be much appreciated, thanks ;)
Shawn J
Firstly, think about whether you really want to keep all the parameters each time. In the example you give, you change the sort order but preserve the page number. Does this really make sense, considering that page will now contain different elements? Even more, if you change the filters, the currently selected page number might not even exist.
Anyway, assuming that is what you want, you don't need to worry about state or cookies or any of that, seeing as all the information you need is already in the GET parameters. All you need to do is to replace one of these parameters as required, then re-encode the string. Easy to do in a template tag, since GET parameters are stored as a QueryDict which is basically just a dictionary.
Something like (untested):
@register.simple_tag
def url_with_changed_parameter(request, param, value):
    params = request.GET.copy()  # request.GET is immutable; copy() gives a mutable QueryDict
    params[param] = value
    return "%s?%s" % (request.path, params.urlencode())
and you would use it in your template:
{% url_with_changed_parameter request "page" 2 %}
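The same replace-one-parameter-and-re-encode idea can be shown framework-free with the standard library (the function name is illustrative; Django's QueryDict does this for you in the tag above):

```python
from urllib.parse import parse_qs, urlencode

def with_changed_parameter(query_string, param, value):
    # Parse the current query string, replace one parameter,
    # and re-encode the rest unchanged.
    params = {k: v[0] for k, v in parse_qs(query_string).items()}
    params[param] = value
    return urlencode(params)

print(with_changed_parameter("page=2&sort=g&position=f", "sort", "p"))
# → page=2&sort=p&position=f
```

This is exactly the question's scenario: switching the sort from goals to points while keeping the page and position filters intact.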
Have you looked at django-filter? It's really awesome.
Check out filter mechanism in the admin application, it includes dealing with dynamically constructed URLs with filter information supplied in the query string.
In addition - consider saving actual state information in cookies/sessions.
If you want to keep all the "parameters", I'd say they are resource identifiers and should normally be part of the URI.