I am currently using the nextval() function to generate insert IDs in PostgreSQL, but now I want to implement this function in Python to generate the ID, which will later be inserted into PostgreSQL. Any idea what the pseudocode for this function would look like, or how to write it in Python?
Thanks in Advance
Look at nextval_internal in src/backend/commands/sequence.c.
Essentially, you lock the object against concurrent access, then increment or decrement according to the sequence definitions, unlock the object and return the new value.
The hard part is to persist this modification efficiently, so that you can be sure that you don't deal out duplicate values if the application happens to crash.
Databases are particularly good at persisting this efficiently and reliably, so you are best off using database sequences rather than trying to re-invent the wheel.
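If you still want a rough Python equivalent, here is a minimal sketch of the idea described above (lock, step, persist, return). The class name and the file-based persistence are assumptions for illustration; it only works within a single process and is nowhere near as crash-safe as a real database sequence:

import threading

class SimpleSequence:
    """Naive stand-in for nextval(): lock, increment, persist, return."""

    def __init__(self, path, start=1, increment=1):
        self._path = path          # file used to persist the last value (illustrative)
        self._increment = increment
        self._lock = threading.Lock()
        try:
            with open(path) as f:
                self._value = int(f.read())
        except FileNotFoundError:
            self._value = start - increment

    def nextval(self):
        with self._lock:                       # lock against concurrent access
            self._value += self._increment     # step according to the sequence definition
            with open(self._path, "w") as f:   # persist before handing the value out
                f.write(str(self._value))
            return self._value

Even with the file write, a crash at the wrong moment can still repeat values, which is exactly the hard part mentioned above.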
A Python script I wrote generates IDs that I need to add to a list, while another script checks whether an ID already exists in the list. There are no other tables, relations or anything, just a huge, growing list of 6-letter strings. I need to add and read values as fast as possible, at the same time. What would be the database of choice in that case? A NoSQL database like Redis?
Redis is a perfect fit for this; you can use Redis hashes. Also make sure you shard the hashes for efficiency. See the link below for details on how Instagram reduced its total memory requirements by using Redis hashes:
https://instagram-engineering.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c
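For illustration, a minimal redis-py sketch of that idea; the key naming, the bucket-by-prefix scheme and the connection details are assumptions:

import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def bucket_for(id_str):
    # Shard the 6-letter IDs across many small hashes (Instagram-style bucketing)
    # so each bucket stays small enough for Redis's compact hash encoding.
    return "ids:" + id_str[:2]

def add_id(id_str):
    r.hset(bucket_for(id_str), id_str, 1)

def id_exists(id_str):
    return r.hexists(bucket_for(id_str), id_str)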
Yes, MongoDB or Redis can do the job.
I'm trying to export all data connected to a User instance to a CSV file. In order to do so, I need to get it from the DB first. Using something like
data = SomeModel.objects.filter(owner=user)
on every possible model seems very inefficient, so I want to use prefetch_related(). My question is: is there any way to prefetch, at once, all the different models' instances that have an FK pointing at my User?
Actually, you don't need to "prefetch everything" in order to create a CSV file (or anything else), and you really don't want to. Python's CSV support is of course designed to work "row by row," and that's what you want to do here: in a loop, read one row at a time from the database and write it one row at a time to the file.
Remember that Django is lazy. Functions like filter() specify what the filtration is going to be, but things really don't start happening until you start to iterate over the actual collection. That's when Django will build the query, submit it to the SQL engine, and start retrieving the data that's returned ... one row at a time.
Let the SQL engine, Python and the operating system take care of "efficiency." They're really good at that sort of thing.
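As an illustration, here is a minimal sketch of that row-by-row approach, reusing SomeModel and the owner filter from the question; the column names are made up:

import csv

def export_user_data(user, path):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "created_at"])  # illustrative columns
        # iterator() streams rows from the database instead of caching the
        # whole queryset; the query only runs when the loop starts.
        for obj in SomeModel.objects.filter(owner=user).iterator():
            writer.writerow([obj.name, obj.created_at])

Repeat the same loop for each model with an FK to User, appending to the same file or writing one file per model.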
In my project, we use Python to connect to Neo4j and run the Cypher queries. Are there any built-in auditing techniques in Neo4j? If not, could anyone please suggest methods to achieve this? I want to audit every single change to nodes, relationships and attributes.
Currently, we are thinking of querying Neo4j to determine whether the operation is a create or an update and to get a list of attributes with their new values. These values would later be inserted into a table in Cassandra. This seems expensive to implement and messy. If anyone could point me towards a more elegant method, it would be greatly appreciated.
Thanks in Advance,
Anusha
You can try using triggers from the APOC library. For example:
CALL apoc.trigger.add(
  'auditCreateNodes',
  'UNWIND {createdNodes} AS n
   ...audit actions...',
  {phase: 'after'}
)
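Since the question mentions connecting from Python, here is a rough sketch of registering such a trigger through the official neo4j driver. The URI, credentials and the audit action (stamping new nodes with a timestamp) are placeholders, and the parameter syntax inside the trigger ({createdNodes} vs $createdNodes) depends on your Neo4j/APOC version:

from neo4j import GraphDatabase

AUDIT_TRIGGER = """
CALL apoc.trigger.add(
  'auditCreateNodes',
  'UNWIND $createdNodes AS n SET n.audited_at = timestamp()',
  {phase: 'after'}
)
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    session.run(AUDIT_TRIGGER)  # APOC must be installed and triggers enabled on the server
driver.close()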
I am working heavily with a database, using python, and I am trying to write code that actually makes my life easier.
Most of the time, I need to run a query and get the results to process them; most of the time I get the same fields from the same table, so my idea was to collect the various results in an object and process them later.
I am using SQLAlchemy for the DB interaction. From what I can read, there is no direct way to just say "dump the result of this query into an object" so that I can access the various fields like
print object.fieldA
print object.fieldB
and so on. I tried dumping the results to JSON, but even that requires parsing and is not as straightforward as I hoped.
So at this point, is there anything else I can try? Or should I write a custom object that mimics the DB structure, and parse the result with for loops to put the data in the right place? I was hoping to find a way to do this automatically, but so far it seems that the only way to get something close to what I am looking for is to use JSON.
EDIT:
Found some info about serialization and the capabilities SQLAlchemy has for reading a table and reproducing a sort of 1:1 copy of it in an object, but I am not sure whether this will actually work with a query.
I found that the best way is to use a custom object.
You can use reflection through SQLAlchemy to extrapolate the structure, but if you are dealing with a small database with few tables, you can simply create the object that will host the data yourself. This gives you control over the object and over what you can put in it.
There are obviously other ways, but since nobody posted anything, I assume they are either too easy to be worth mentioning, or too hard and specific to each case.
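For example, here is a minimal sketch of that custom-object approach; it assumes an existing SQLAlchemy session and a mapped model called SomeTable, and reuses the fieldA/fieldB names from the question:

from dataclasses import dataclass

@dataclass
class Result:
    fieldA: str
    fieldB: int

# Each row returned by query() supports attribute access by column name.
rows = session.query(SomeTable.fieldA, SomeTable.fieldB).all()
results = [Result(fieldA=row.fieldA, fieldB=row.fieldB) for row in rows]

for r in results:
    print(r.fieldA)
    print(r.fieldB)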
I have a SQLAlchemy-based tool for selectively copying data between two different databases for testing purposes. I use the merge() function to take model objects from one session and store them in another session. I'd like to be able to store the source objects in some intermediate form and then merge() them at some later point in time.
It seems like there are a few options to accomplish this:
Exporting DELETE/INSERT SQL statements. This seems pretty straightforward; I think I can get SQLAlchemy to give me the INSERT statements, and maybe even the DELETEs.
Exporting the data to a SQLite database file with the same (or similar) schema, which could then be read in as a source at a later point in time.
Serializing the data in some manner and then reading it back into memory for the merge. I don't know whether SQLAlchemy has something like this built in, and I'm not sure what the challenges would be in rolling it myself.
Has anyone tackled this problem before? If so, what was your solution?
EDIT: I found a tool built on top of SQLAlchemy called dataset that provides the freeze functionality I'm looking for, but there seems to be no corresponding thaw functionality for restoring the data.
I haven't used it before, but the dogpile caching techniques described in the documentation might be what you want. This allows you to query to and from a cache using the SQLAlchemy API:
http://docs.sqlalchemy.org/en/rel_0_9/orm/examples.html#module-examples.dogpile_caching
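If the dogpile example turns out to be more than you need, option 3 from the question can also be sketched with plain pickle, since SQLAlchemy ORM instances are generally picklable; Record, source_session and target_session below are assumptions:

import pickle

# Detach the objects from the source session by serializing them.
objs = source_session.query(Record).all()
blob = pickle.dumps(objs)          # store this blob wherever is convenient

# ...later, possibly in another process...
restored = pickle.loads(blob)
for obj in restored:
    target_session.merge(obj)      # merge() copies the detached state into the target session
target_session.commit()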