I have this excel sheet where i can have as many as 3000-5000 or may be more data( email ids) and i need to insert each of these into my db one by one .I have been asked not to make these many write operations in one go .How should i go about solving this so that i don't do so many entries to database ?Any hint will be highly appreciated .
The solution i could think of is this .
https://www.caktusgroup.com/blog/2019/01/09/django-bulk-inserts/
I'd vote for doing bulk operations withithe django model itself
you can for example do the following with your model class :
... Entry.objects.bulk_create([
... Entry(headline='This is a test'),
... Entry(headline='This is only a test'),
... ])
(code from django website)
you can also you transaction dicorator were you can in sure that a database trasaction is atomic (All or Nothing), thisis handy when you want to do long processeing for your entities or transform them
from django.db import transaction
#transaction.atomic
decorate your method that inserts records into the database
I think that the link you included would work fairly well!
Just wanted to throw one other option out there. Many databases have batch/bulk upload capabilities. Depending on what DB you're using, you could potentially connect directly to it, and perform your writes without Django.
Related
I have a bigquery table about 200 rows, i need to insert,delete and update values in this through a web interface(the table cannot be migrated to any other relational or non-relational database).
The web application will be deployed in google-cloud on app-engine and the user who acts as admin and owner privileges on Bigquery will be able to create and delete records and the other users with view permissions on the dataset in bigquery will be able to view records only.
I am planning to use the scripting language as python,
server(django or flask or any other)-> not sure which one is better
The web application should be displayed as a data-grid like appearance with buttons create,delete or view visiblility according to their roles.
I have not done anything like this in python,bigquery and django. I am already familiar with calling bigquery from python-client but to call in a web interface and in a transactional way, i am totally new.
I am seeing examples only related to django with their inbuilt model and not with big-query.
Can anyone please help me and clarify whether this is possible to implement and how?
I was able to achieve all of "C R U D" on Bigquery with the help of SQLAlchemy, though I had make a lot of concessions like if i use sqlalchemy class i needed to use a false primary key as Bigquery does not use any primary key and for storing sessions i needed to use file based session On Django for updates and create sqlalchemy does not allow without primary key, so i used raw sql part of SqlAlchemy. Thanks to the #mhawke who provided the hint for me to carry out this exericse
No, at most you could achieve the "R" of "CRUD." BigQuery isn't a transactional database, it's for querying vast amounts of data and preparing the results as an immutable view.
It doesn't provide a method to modify the source data directly and even if you did you'd need to run the query again. Also important to note are that queries are asynchronous and require much longer to perform than traditional databases.
The only reasonable solution would be to export the table data to GCS and then import it into a normal database for querying. Alternatively if you can't use another database and since you said there are only 1,000 rows you could perform your CRUD actions directly on that exported CSV.
I have a huge amount of data in my db.
I cannot use .delete() method cause performance of Django ORM is insufficient in my case.
_raw_delete() method suits me cause I can do it python instead using raw SQL.
But I have problem I have no idead how can I delete relation tables using _raw_delete. They need to be deleted before models cause I have restrict in DB. Any ideas how can I achieve this?
I have found a solution.
You can operate on link model with this:
link_model = MyModel._meta.get_field('my_m2m_field').remote_field.through
qs = link_model.objects.filter(mymodel_id__in=mymodel_ids)
qs._raw_delete(qs.db)
Can you take form data and change database schema? Is it a good idea? Is there a downside to many migrations from a 'default' database?
I want users to be able to add / remove tables, columns, and rows. Making schema changes requires migrations, so adding in that functionality would require writing a view that takes form data and inserts it into a function that then uses Flask-Migrate.
If I manage to build this, don't migrations build the required separate scripts and everything that goes along with that each time something is added or removed? Is that practical for something like this, where 10 or 20 new tables might be added to the starting database?
If I allow users to add columns to a table, it will have to modify the table's class. Is that possible, or a safe idea? If not, I'd appreciate it if someone could help me out, and at least get me pointed in the right direction.
In a typical web application, the deployed database does not change its schema at runtime. The schema is only changed during an upgrade, and only the developers make these changes. Operations that users perform on the application can add, remove or modify rows, but not modify the tables or columns themselves.
If you need to offer your users a way to add flexible data structures, then you should design your database schema in a way that this is possible. For example, if you wanted your users to add custom key/value pairs, you could have a table with columns user_id, key_name and value. You may also want to investigate if a schema-less database fits your needs better.
I am working on a project which requires me to create a table of every user who registers on the website using the username of that user. The columns in the table are same for every user.
While researching I found this Django dynamic model fields. I am not sure how to use django-mutant to accomplish this. Also, is there any way I could do this without using any external apps?
PS : The backend that I am using is Mysql
An interesting question, which might be of wider interest.
Creating one table per user is a maintenance nightmare. You should instead define a single table to hold all users' data, and then use the database's capabilities to retrieve only those rows pertaining to the user of interest (after checking permissions if necessary, since it is not a good idea to give any user unrestricted access to another user's data without specific permissions having been set).
Adopting your proposed solution requires that you construct SQL statements containing the relevant user's table name. Successive queries to the database will mostly be different, and this will slow the work down because every SQL statement has to be “prepared” (the syntax has to be checked, the names of table and columns has to be verified, the requesting user's permission to access the named resources has to be authorized, and so on).
By using a single table (model) the same queries can be used repeatedly, with parameters used to vary specific data values (in this case the name of the user whose data is being sought). Your database work will move along faster, you will only need a single model to describe all users' data, and database management will not be a nightmare.
A further advantage is that Django (which you appear to be using) has an extensive user-based permission model, and can easily be used to authenticate user login (once you know how). These advantages are so compelling I hope you will recant from your heresy and decide you can get away with a single table (and, if you planning to use standard Django logins, a relationship with the User model that comes as a central part of any Django project).
Please feel free to ask more questions as you proceed. It seems you are new to database work, and so I have tried to present an appropriate level of detail. There are many pitfalls such as this if you cannot access knowledgable advice. People on SO will help you.
This page shows how to create a model and install table to database on the fly. So, you could use type('table_with_username', (models.Model,), attrs) to create a model and use django.core.management to install it to the database.
I'm starting a Django project and need to shard multiple tables that are likely to all be of too many rows. I've looked through threads here and elsewhere, and followed the Django multi-db documentation, but am still not sure how that all stitches together. My models have relationships that would be broken by sharding, so it seems like the options are to either drop the foreign keys of forgo sharding the respective models.
For argument's sake, consider the classic Authot, Publisher and Book scenario, but throw in book copies and users that can own them. Say books and users had to be sharded. How would you approach that? A user may own a copy of a book that's not in the same database.
In general, what are the best practices you have used for routing and the sharding itself? Did you use Django database routers, manually selected a database inside commands based on your sharding logic, or overridden some parts of the ORM to achive that?
I'm using PostgreSQL on Ubuntu, if it matters.
Many thanks.
In the past I've done something similar using Postgresql Table Partitioning, however this merely splits a table up in the same DB. This is helpful in reducing table search time. This is also nice because you don't need to modify your django code much. (Make sure you perform queries with the fields you're using for constraints).
But it's not sharding.
If you haven't seen it yet, you should check out Sharding Postgres with Instagram.
I agree with #DanielRoseman. Also, how many is too many rows. If you are careful with indexing, you can handle a lot of rows with no performance problems. Keep your indexed values small (ints). I've got tables in excess of 400 million rows that produce sub-second responses even when joining with other many million row tables.
It might make more sense to break user up into multiple tables so that the user object has a core of commonly used things and then the "profile" info lives elsewhere (std Django setup). Copies would be a small table referencing books which has the bulk of the data. Considering how much ram you can put into a DB server these days, sharding before you have too seems wrong.