Basically I have a program which scraps some data from a website, I need to either print it out to a django template or to REST API without using a database. How do I do this without a database?
Your best bet is to
a.) Perform the scraping in views themselves, and pass the info in a context dict to the template
or
b.) Write to a file and have your view pull info from the file.
Django can be run without a database, but it depends on what applications you enable. Some of the default functionality (auth, sites, contenttypes) requires a database. So you'd need to disable those. If you need to use them, you're SOL.
Other functionality (like sessions) usually uses a database, but you can configure it to use a cache or file or something else.
I've taken two approaches in the past:
1) Disable the database completely and disable the applications that require the database:
DATABASES = {}
2) Use a dummy sqlite database just so it works out of box with the default apps without too much tweaking, but don't really use it for anything. I find this method faster and good for setting up quick testing/prototyping.
And to actually get the data from the scraper into your view, you can take a number of approaches. Store the data in a cache, or just write it directly to your context variables, etc.
Related
I have a bigquery table about 200 rows, i need to insert,delete and update values in this through a web interface(the table cannot be migrated to any other relational or non-relational database).
The web application will be deployed in google-cloud on app-engine and the user who acts as admin and owner privileges on Bigquery will be able to create and delete records and the other users with view permissions on the dataset in bigquery will be able to view records only.
I am planning to use the scripting language as python,
server(django or flask or any other)-> not sure which one is better
The web application should be displayed as a data-grid like appearance with buttons create,delete or view visiblility according to their roles.
I have not done anything like this in python,bigquery and django. I am already familiar with calling bigquery from python-client but to call in a web interface and in a transactional way, i am totally new.
I am seeing examples only related to django with their inbuilt model and not with big-query.
Can anyone please help me and clarify whether this is possible to implement and how?
I was able to achieve all of "C R U D" on Bigquery with the help of SQLAlchemy, though I had make a lot of concessions like if i use sqlalchemy class i needed to use a false primary key as Bigquery does not use any primary key and for storing sessions i needed to use file based session On Django for updates and create sqlalchemy does not allow without primary key, so i used raw sql part of SqlAlchemy. Thanks to the #mhawke who provided the hint for me to carry out this exericse
No, at most you could achieve the "R" of "CRUD." BigQuery isn't a transactional database, it's for querying vast amounts of data and preparing the results as an immutable view.
It doesn't provide a method to modify the source data directly and even if you did you'd need to run the query again. Also important to note are that queries are asynchronous and require much longer to perform than traditional databases.
The only reasonable solution would be to export the table data to GCS and then import it into a normal database for querying. Alternatively if you can't use another database and since you said there are only 1,000 rows you could perform your CRUD actions directly on that exported CSV.
I have a method to load all ideas from database. There are few comments on each idea.
I have store users id who commented on ideas, in respective table.
I have a web service which contains all the data related to user id
When i load a page, it took time to fetch information for all users using web service.
I want to use databse caching to store that web service response and use that later on.
How to achieve this, to reduce page load timing?
when should i store that web service response in cache table?
The Django documentation on the cache framework is pretty easy to follow and will show you exactly how to set up a database cache for pages and other things in your views, including for how long you'd like the cache to exist, TIMEOUT, as well as other arguments a cache can take.
Another way to speed up accessing your DB information is to take advantage of CONN_MAX_AGE for your database in your settings.py, who's time can be dependent on how often the database needs to be accessed or how much traffic the site gets (as an example). This is basically letting your DB connection know how long to stay open, and can be found in the settings documentation.
You can store information in a cache when something on the site occurs (such as a new comment) or when that particular request is made for a particular page. It can be entirely up to the needs of your project.
So I have a site that on a per-user basis, and it is expected to query a very large database, and flip through the results. Due to the size of the number of entries returned, I run the query once (which takes some time...), store the result in a global, and let folks iterate through the results (or download them) as they want.
Of course, this isn't scalable, as the globals are shared across sessions. What is the correct way to do this in Django? I looked at session management, but I always ran into the "xyz is not serializeable on json" issue. Do I look into how I do this correctly using sessions, or is there another preferred way to do this?
If the user is flipping through the results, you probably don't want to pull back and render any more than you have to. Most SQL dialects have TOP and LIMIT clauses that will let you pull back a limited range of results, as long as your data is ordered consistently. Django's Pagination classes are a nice abstraction of this on top of Django Model classes: https://docs.djangoproject.com/en/dev/topics/pagination/
I would be careful of storing large amounts of data in user sessions, as it won't scale as your number of users grows, and user sessions can stay around for a while after the user has left the site. If you're set on this option, make sure you read about clearing the expired sessions. Django doesn't do it for you:
https://docs.djangoproject.com/en/1.7/topics/http/sessions/#clearing-the-session-store
I am working on a project which requires me to create a table of every user who registers on the website using the username of that user. The columns in the table are same for every user.
While researching I found this Django dynamic model fields. I am not sure how to use django-mutant to accomplish this. Also, is there any way I could do this without using any external apps?
PS : The backend that I am using is Mysql
An interesting question, which might be of wider interest.
Creating one table per user is a maintenance nightmare. You should instead define a single table to hold all users' data, and then use the database's capabilities to retrieve only those rows pertaining to the user of interest (after checking permissions if necessary, since it is not a good idea to give any user unrestricted access to another user's data without specific permissions having been set).
Adopting your proposed solution requires that you construct SQL statements containing the relevant user's table name. Successive queries to the database will mostly be different, and this will slow the work down because every SQL statement has to be “prepared” (the syntax has to be checked, the names of table and columns has to be verified, the requesting user's permission to access the named resources has to be authorized, and so on).
By using a single table (model) the same queries can be used repeatedly, with parameters used to vary specific data values (in this case the name of the user whose data is being sought). Your database work will move along faster, you will only need a single model to describe all users' data, and database management will not be a nightmare.
A further advantage is that Django (which you appear to be using) has an extensive user-based permission model, and can easily be used to authenticate user login (once you know how). These advantages are so compelling I hope you will recant from your heresy and decide you can get away with a single table (and, if you planning to use standard Django logins, a relationship with the User model that comes as a central part of any Django project).
Please feel free to ask more questions as you proceed. It seems you are new to database work, and so I have tried to present an appropriate level of detail. There are many pitfalls such as this if you cannot access knowledgable advice. People on SO will help you.
This page shows how to create a model and install table to database on the fly. So, you could use type('table_with_username', (models.Model,), attrs) to create a model and use django.core.management to install it to the database.
At my work, we use Oracle for our database. Which works great. I am not the main db admin, but I do work with it. One thing I like is that the DB has a built in logic layer using PL/SQL which ca handle logic related to saving the data and retrieve it. I really like this because it allows our MVC application (PHP/Zend Framework) to be lighter, and makes it easier to tie in another platform into the data, such as desktop or mobile.
Although, I have a personal project where I want to use couchdb or mongodb, and I want to try and accomplish a similar goal. outside of the mvc/framework, I want to have an API layer that the main applications talk to. they dont actually talk directly to the database. They specify the design document (couchdb) or something similar for mongo, to get the results. And that API layer will validate the incoming data and make sure that data itself is saved and updated properly. Such as saving a new user, in the framework I only need to send a json obejct with the keys/values that need to be saved and the api layer saves the data in the proper places where needed.
This API would probably have a UI, but only for administrative purposes and to make my life easier. In general it will always reply with json strings, or pre-rendered/cached html in some cases. Since each api layer would be specific to the application anyways.
I was wondering if anyone has done anything like this, or had any tips on nethods I could accomplish this. I am currently looking to write my application in python, and the front end will likely be something like Angularjs. Although I am also looking at node.js for a back end.
We do this exact thing at my current job. We have MongoDB on the back end, a RESTful API on top of it and then PHP/Zend on the front end.
Most of our data is read only, so we import that data into MongoDB and then the RESTful API (in Java) just serves it up.
Some things to think about with this approach:
Write generic sorting/paging logic in your API. You'll need this for lists of data. The user can pass in things like http://yourapi.com/entity/1?pageSize=10&page=3.
Make sure to create appropriate indexes in Mongo to match what people will query on. Imagine you are storing users. Make an index in Mongo on the user id field, or just use the _id field that is already indexed in all your calls.
Make sure to include all relevant data in a given document. Mongo doesn't do joins like you're used to in Oracle. Just keep in mind modeling data is very different with a document database.
You seem to want to write a layer (the middle tier API) that is database agnostic. That's a good goal. Just be careful not to let Mongo specific terminology creep into your exposed API. Mongo has specific operators/concepts that you'll need to mask with more generic terms. For example, they have a $set operator. Don't expose that directly.
Finally after having a decent amount of experience with CouchDB and Mongo, I'd definitely go with Mongo.