How do I control where SQLAlchemy places a sqlite db file - python

I plan to (attempt to) upload a Flask application to a web host that does not provide SSH access. I have confirmed that the web host will run Flask applications, and I can create one that works when it has no database, but I am getting errors when attempting to create the database. I can't work out how to control where it is trying to place the database file. My code looks like this:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

# create the extension
db = SQLAlchemy()
# create the app
app = Flask(__name__)
# configure the SQLite database, relative to the app instance folder
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///flaskapp.db"
# initialize the app with the extension
db.init_app(app)
On my development machine, using Geany, running db.create_all() places the database in "var/app-instance/". Using PyCharm on the same machine, it places it in "instance/".
Some variable presumably dictates this path, but so far I haven't worked out which, or how to influence it. My application works as expected on my development server, using either development environment (Geany or PyCharm), but it does not work on the web host I am trying to use, as described below.
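The comment in the code above says the URI is relative to the app instance folder; a quick way to see where that folder actually is (assuming Flask-SQLAlchemy 3.x, which resolves relative SQLite paths against app.instance_path):
from flask import Flask

app = Flask(__name__)
# Flask-SQLAlchemy 3.x prepends this folder to relative SQLite URLs
print(app.instance_path)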
As well as googling, I have grepped through the SQLAlchemy files, and I found def create_all(...) in schema.py, but I can't work out where it gets the information on what directory structure to create.
I am not able to use "os" on the web host, a suggestion made in other answers and tutorials.
I tried creating a path in various forms. On my development machine, for example, "sqlite:////tmp/flaskapp.db" works, but I don't have access to /tmp on the web host, and I was unable to find an absolute path that the web host would accept (i.e. without complaining that I don't have permission to write to the directory). I can't 'pwd' on the web host either.
Using "sqlite://instance/flaskapp.db" on my development machine produces an error pointing out that:
Valid SQLite URL forms are:
sqlite:///:memory: (or, sqlite://)
sqlite:///relative/path/to/file.db
sqlite:////absolute/path/to/file.db
However, if I try a relative path, for example "sqlite:///instance/flaskapp.db", I get "sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file (Background on this error at: https://sqlalche.me/e/20/e3q8)", even if I create the directory myself (i.e. relative to the app.py root directory). [In this case write permissions are the same as for all other parts of the project.]
That link, in the error output, says "This error is a DBAPI Error and originates from the database driver (DBAPI), not SQLAlchemy itself". Unfortunately I am not clear how to proceed from that.
If someone could help direct me to information that would help me understand and resolve the issue, that would be great, thanks!
I would like to be able to explicitly state where the database will be stored, relative to the root of my application directory.
I am using Linux (Arch), in case that is important, too.
The question linked by Sam below shows the use of "url.make_url()". When I use this as shown, I get
>>> import sqlalchemy.engine.url as url
>>> url.make_url('sqlite:///flaskcw.db')
sqlite:///flaskcw.db
which is what I would expect (and what I want). But this is not what happens when I run db.create_all()
>>> from main import db
database binding
sqlite:///flaskcw.db
Engine(sqlite:////home/user/PycharmProjects/cwflaskapp/instance/flaskcw.db)
whereas I would expect it to place the database in the root of the project (in this case cwflaskapp/, i.e. ..cwflaskapp/flaskcw.db), given that is where main.py is, rather than in 'projroot'/instance/, a directory that is created in the process. (Or, in the case of Geany, in 'projroot'/var/app-instance/, also created only when creating the database as above.)
What am I missing?

I found another question, "this code isnt creating my site.db file in directory", with an answer that allows me to specify a folder on my development server, without a dependency on "os".
To briefly repeat the relevant part of the answer, I can do this:
app.config['SQLALCHEMY_DATABASE_URI'] = f"sqlite:///{app.root_path}/mydir/site.db"
and this puts the database in 'projroot'/mydir/site.db. I am going to confirm that I can do the same on the web host, and will re-edit accordingly.
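For hosts where the os module is usable (it isn't on mine), another option is to pin the instance folder itself when creating the app. A sketch, keeping in mind that Flask requires instance_path to be absolute:
import os
from flask import Flask

# build an absolute path next to this file; instance_path must be absolute
here = os.path.dirname(os.path.abspath(__file__))
app = Flask(__name__, instance_path=os.path.join(here, "instance"))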

Related

Use alembic migration or docker volumes to populate docker postgres database?

I believe this question already shows that I am new to Docker and Alembic. I am building a Flask + SQLAlchemy app using Docker and Postgres. So far I am not using Alembic, but I am about to plug it in, and some questions came up. I will have to create a pg_trgm extension and also populate one of the tables with data I already have. Until now I have only created brand-new databases using SQLAlchemy for the tests. So here is what I am thinking/doing:
1. To create the extension I could simply add a volume to the Postgres Docker service like ./pg_dump.sql:/docker-entrypoint-initdb.d/pg_dump.sql. The extension does not depend on any specific db, so a simple "CREATE EXTENSION IF NOT EXISTS pg_trgm WITH SCHEMA public;" would do it, right?
2. If I use the same strategy to populate the tables, I need a pg_dump.sql that creates the complete db and tables. To accomplish that, I first created the brand-new database with SQLAlchemy, then I used a script to populate the tables with data I have in a JSON file. I then generated the complete pg_dump.sql, and now I can place this complete .sql file on the Docker service volume; when I run my docker-compose, the Postgres container will have the database ready to go.
3. Now I am starting with Alembic, and I am thinking I could just keep the pg_dump.sql to create the extension, and have an Alembic migration script populate the empty tables (dropping item 2 above).
Which way is better: 2, 3, or none of them? Thanks!
Create the extension in a /docker-entrypoint-initdb.d script (1). Load the data using your application's migration system (3).
Mechanically, one good reason to do this is that the database init scripts only run the very first time you create a database container on a given storage. If you add a column to a table and need to run migrations, the init-script sequence requires you to completely throw away and recreate the database.
Philosophically, I'd give you the same answer whether you were using Docker or something else. You could imagine running a database on a dedicated server, or using a cloud-hosted database. You'd have to ask your database administrator to install the extension for you, but they'd generally expect to give you credentials to an empty database and have you load the data yourself; or, in a cloud setup, you could imagine checking an "install this extension" checkbox in their console, but there wouldn't be a way to load the data without connecting to the database remotely.
So, a migration system will work anywhere you have access to the database, and will allow incremental changes to the schema. The init script setup is Docker-specific and requires deleting the database to make any change.
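A minimal sketch of what such an Alembic data migration could look like (the revision identifiers, the items table, and its columns are all placeholders, not from the question):
"""Populate the items table (hypothetical data migration)."""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic (placeholders)
revision = "xxxx_populate_items"
down_revision = "xxxx_create_tables"

def upgrade():
    # lightweight table construct, just enough metadata for the insert
    items = sa.table(
        "items",
        sa.column("name", sa.String),
        sa.column("score", sa.Integer),
    )
    # load the seed rows inside the migration transaction
    op.bulk_insert(
        items,
        [
            {"name": "first", "score": 1},
            {"name": "second", "score": 2},
        ],
    )

def downgrade():
    op.execute("DELETE FROM items")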

Should an embedded SQLite DB used by CLI app be uploaded to version-control (Git)?

I'm working on a Python CLI app that has to manage some data in a SQLite db (creating, updating, and deleting records). I want users to be able to install the app and use it right away. So my question is: can I just upload an empty SQLite db to GitHub? Or should I just upload a schema file and build the db in a build step during installation? I suppose that, going the second way, users would need SQLite preinstalled or else the installation would fail. What I want is for them to just install the app, without worrying about dependencies and such.
When it comes to SQLite, my understanding is that SQLite is generally used as an embedded DB, so users wouldn't need to have SQLite preinstalled. (Of course, it can be used as a standalone DB server, but it's mainly known for its ease of embeddability; it simply just runs.) Without any effort, in the embedded form, the client itself will create this db.
Using SQLite is just a one-liner:
import sqlite3

conn = sqlite3.connect('my.db')
or
conn = sqlite3.connect('/path/to/my.db')
Or even in-memory (as a cache):
conn = sqlite3.connect(':memory:')
When this line runs, it creates a connection by either opening the file (if it exists) or creating the file (as an empty DB) if it is not present. In short, the SQLite library will always read the existing file or create it if it doesn't exist, so you will always have a running DB out of the box. (The only time I can see it failing is if this db file is corrupt for some reason, or the SQLite library cannot create the file in a location due to permission issues.)
From a user perspective (or a developer perspective, for that matter), there is nothing that needs to be done to install SQLite. There are no external dependencies for the embedded DB, and nothing to be preinstalled; it simply works. If other applications share this database, they just need to open the particular db file, and that's it.
Therefore, coming back to your main question: the general best practice is for the application to instantiate the database (whatever the DB is, for that matter) on its first run by importing the SQL/schema (and initial data) file (SQL file, CSV, JSON, XML, from code, etc.). The SQL/schema file can be maintained along with the application source in GitHub (or whatever VCS), or packaged with the binary in the distribution format (zip, tar, etc.). So in your case, the second approach you have thought of might be better. This is also good from a code maintenance and review perspective.
It is best not to upload the "database" as a binary, rather instantiate it on the first run and populate it with data.
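A minimal sketch of that first-run pattern (the file names are placeholders; schema.sql would live in version control alongside the code):
import os
import sqlite3

DB_PATH = "my.db"           # created on first run
SCHEMA_PATH = "schema.sql"  # versioned schema file

def get_db():
    first_run = not os.path.exists(DB_PATH)
    conn = sqlite3.connect(DB_PATH)  # opens the file, creating it if absent
    if first_run:
        with open(SCHEMA_PATH) as f:
            conn.executescript(f.read())  # build tables (and seed data) once
        conn.commit()
    return conn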
If your SQLite db has some pre-built tables and records, you should upload it to version control so it can be used by the users. But if you need a clean db for each instance of your project, I suggest creating the db during the initialization process of your app.
Also, if your app needs some pre-loaded data in the db, one of the best practices is to put the data into a file like predata.json and, during initialization, create the db and import the data into it.
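A sketch of that initialization step (predata.json is the file suggested above; the items table and its columns are hypothetical):
import json
import sqlite3

def seed_db(conn, path="predata.json"):
    with open(path) as f:
        rows = json.load(f)  # e.g. a list of {"name": ..., "value": ...} objects
    # named placeholders map straight onto the JSON objects
    conn.executemany(
        "INSERT INTO items (name, value) VALUES (:name, :value)", rows
    )
    conn.commit()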

How can I add an sqlite database to Apache Superset?

I'm trying to add a Python sqlite3-generated database to Superset and am getting a strange error. Is there a way to work around it?
You have to modify the Superset configuration (the config.py file), adding this parameter:
PREVENT_UNSAFE_DB_CONNECTION = False
This is a link to a similar question in the Superset GitHub repository: https://github.com/apache/incubator-superset/issues/9748; it points to the request that added this security measure.
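If you'd rather not edit Superset's own config.py, the usual pattern (assuming a standard Superset install) is to put the override in a superset_config.py on your PYTHONPATH, which Superset loads on startup:
# superset_config.py -- picked up by Superset if importable from PYTHONPATH
PREVENT_UNSAFE_DB_CONNECTION = False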

Dockerized Django app not applying unique constraint

I have two issues both of which are inter-related
Issue #1
My app has an online Postgres database that it uses to store data. Because this is a Dockerized app, migrations that I create no longer appear on my local host; they are instead stored in the Docker container.
None of the questions I've seen so far seem to have issues with making migrations and adding the unique constraint to one of the fields within a table.
I have written shell code to run a Python script that returns the contents of the migrations file in the command prompt window. I was able to obtain the migrations file that was to be applied, and I added a row to the django_migrations table to record it. I then ran makemigrations and migrate, but it said there were no changes to apply (which leads me to believe that the row I added to the database should only have been created automatically by Django after it had detected the migrations on its own, instead of me specifying the migrations file and asking it to make the changes). The issue is that the new migrations still detect the following change:
Migrations for 'mdp':
db4mdp/mdp/migrations/0012_testing.py
- Alter field mdp_name on languages
Despite detecting this apparent 'change', I get the following error:
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: relation "mdp_mdp_mdp_fullname_281e4228_uniq" already exists
I have already checked on my Postgres server, using pgAdmin 4, whether the constraint has actually been applied, and it has, with the name next to "relation" as specified above. So why does Django detect this as a change still to be made? The thing is, if I now remove the new migrations file that I created in my Python directory, it will probably run (since the changes have 'apparently' been made in the database), but I won't have the migrations file to keep track of the changes. I don't know if I need to keep the migrations around now that I'm using an online database, though. I will not be rolling back any changes I make, nor will I be making changes often. This is just a one- or two-time thing, but I want to resolve the error.
Issue #2
The reason I used 'apparently' in the issue above is that even though the constraints section in my public schema shows that the constraints have been applied, for some reason, when I try to create a new entry in my table with a non-unique string in the field I've defined as unique, it allows the creation anyway.
You never add anything manually to the django_migrations table. Let Django do it. If it is not doing it, no matter what, your code is not production-ready.
I understand that you are doing your development inside Docker. When you do that, you mount a local directory into the container as a Docker volume. Since you have not mounted one, your migrations will not show up locally.
Refer to Volumes. It should resolve your issues.
For anyone trying to find an alternate solution to this problem, other than mounting volumes, due to time constraints, this answer might help, but deosha's is still the correct way to go about it. I fixed the problem by deleting all my tables and the rows corresponding to migrations for my specific app (it isn't necessary to delete the auth tables etc., because you won't be deleting the rows corresponding to those in the django_migrations table). Following this, I used the following within the shell script called by my Dockerfile:
python manage.py makemigrations --name testing
python testing_migrations.py
The migration needs to be named for the next step. After this line of code, I ran the Python script testing_migrations, which contains the following code:
import os

# locate the project root relative to this script
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# path to the migration file that makemigrations just generated inside the container
migrations_file = os.path.join(BASE_DIR, '<path_to_new_migration_file>')
with open(migrations_file, 'r') as file:
    print(file.read())  # dump the file so it can be copied out of the container output
Typically the first migration created for the app will be /0001_testing.py (which is why naming it was necessary earlier). Since the contents of this file are visible to the container while it is up and running, you will be able to print them. Then run the migrate command. This creates a row in the django_migrations table that makes it appear to Django that the migration has been applied. However, on your local machine this migrations file doesn't exist, so copy the contents of the file from the print statement above, put them into a .py file with the same name as mentioned above, and save it in the migrations folder of the app on your local device.
You can follow this method for all successive migrations by repeating the process and incrementing the number in the testing_migrations file as required.
Squashing the migrations once you're done making the table will help. If you're doing this all in development and have no requirement to roll back the changes to the database schema, then just put this into production after deleting all the migration files and the rows in your django_migrations table corresponding to your app, as was done initially. Drop your tables, let your first new migrations file re-create them, and then import your data once again.
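For reference, squashing is done with a stock Django management command; the app label and migration name below are taken from the question and would need adjusting to your project (a sketch, not verified against this setup):
python manage.py squashmigrations mdp 0012_testing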
This is not the recommended method. Use deosha's if you're not on a time crunch.

Relative import of a package in a Python Flask application

I am new to Python and Flask and am trying to make a sample Flask application more modular. I plan to maintain the folder structure of the application as shown below,
where the packages are described as follows:
config ---> database configuration details
flaskApp
1. model ---> contains the MongoDB schema
2. viewController ---> the endpoints to be accessed
static ---> contains the single HTML page that I just need to serve (not render)
The code repo for the same is in github
https://github.com/dhanalakshmiZendynamix/python-Flask-relative-module.git
I am facing the following problems:
1. I am not finding an easy way to access one package from another within this folder structure (i.e., the models inside viewController, where the endpoints are present).
2. I am not sure how to serve the HTML page inside the static folder.
I have tried reading many sources:
https://exploreflask.com/en/latest/preface.html
http://pyvideo.org/pycon-us-2014/writing-restful-web-services-with-flask.html
but I am still not sure how to get it working.
Please help me adapt to the above folder structure and access the endpoints; I'm really not sure how to go about it.
Any suggestions and pointers would help a lot. Thank you.
Check the registered url endpoints in your app.
Start Python on the command line in your project directory and execute the following commands:
>>> import flaskApp
>>> app = flaskApp.create_app()
>>> app.url_map
Please add the output to your question.
And I really think you need to read up a bit on Python and Flask first; here is a list with some great resources on Flask: https://github.com/humiaozuzu/awesome-flask
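On the two specific problems, here is a minimal sketch of how the pieces could fit together, assuming flaskApp, model, and viewController each contain an __init__.py so they are importable packages (the module and identifier names below follow the question's folder structure but are otherwise hypothetical):
# flaskApp/viewController/views.py
from flask import Blueprint, send_from_directory

from flaskApp.model import schema  # absolute import of the sibling package

views = Blueprint("views", __name__)

@views.route("/")
def index():
    # serve the single HTML page from the static folder without rendering it
    return send_from_directory("static", "index.html")
Registering the blueprint in your app factory and keeping imports absolute (rooted at flaskApp) avoids most relative-import headaches.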
