Python sqlite: Preventing SQL injection in table names [duplicate]

This question already has answers here:
Being that string substitution is frowned upon with forming SQL queries, how do you assign the table name dynamically?
Closed 1 year ago.
Pretty basic question here. In Python, using sqlite3, I'm able to prevent most SQL injection with ? placeholders. However, this option doesn't exist for table names.
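For example (with a made-up table), a ? placeholder works for a value, but sqlite3 rejects one in place of a table name:
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('CREATE TABLE users (id INTEGER, name TEXT)')

# Placeholders work fine for values:
c.execute('SELECT * FROM users WHERE id = ?', (1,))

# ...but not for identifiers; this raises sqlite3.OperationalError:
c.execute('SELECT * FROM ? WHERE id = ?', ('users', 1))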
In most scenarios it's pretty easy for me to just check whether the table exists; however, I also need tables to be created with names that are user-supplied.
How would I go about doing this?
edit: Honestly, the associated question really doesn't help in this situation... The issue here is that I want a user to supply their own table name. Regardless, I decided to use the scheme described in my own answer below, so I don't really care about this question anymore.

Well, it might not be best practice, but my solution is to name the actual tables and columns Table1, Column1, etc., and store the user-supplied names separately in a lookup table.
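A minimal sketch of that scheme (the helper names and layout here are my own invention):
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()

# Mapping table from user-supplied names to generated internal names.
c.execute('CREATE TABLE name_map (user_name TEXT UNIQUE, internal_name TEXT)')

def create_user_table(user_name):
    # The internal name is generated, never taken from user input,
    # so it is safe to interpolate into the DDL statement.
    c.execute('SELECT COUNT(*) FROM name_map')
    internal = f'Table{c.fetchone()[0] + 1}'
    c.execute('INSERT INTO name_map VALUES (?, ?)', (user_name, internal))
    c.execute(f'CREATE TABLE {internal} (id INTEGER PRIMARY KEY, value TEXT)')
    return internal

def resolve(user_name):
    # Only the mapping lookup ever sees the raw user string.
    c.execute('SELECT internal_name FROM name_map WHERE user_name = ?',
              (user_name,))
    row = c.fetchone()
    if row is None:
        raise ValueError(f'unknown table: {user_name!r}')
    return row[0]

create_user_table('my table; DROP TABLE users')   # harmless
print(resolve('my table; DROP TABLE users'))      # Table1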

Related

Query an identical table across all schemas on Postgres

I'm using Postgres and I have multiple schemas with identical tables, which are dynamically added by the application code:
foo, bar, baz, abc, xyz, ...
I want to be able to query all the schemas as if they were a single table.
!!! I don't want to query all the schemas one by one and combine the results
I want to "combine" the tables across schemas (not sure if this would be considered a huge join) and then run the query.
For example, an ORDER BY query shouldn't return results like
1. schema_A.result_1
2. schema_A.result_3
3. schema_B.result_2
4. schema_B.result_4
but instead it should be
1. schema_A.result_1
2. schema_B.result_2
3. schema_A.result_3
4. schema_B.result_4
If possible, I don't want to generate a query like
SELECT schema_A.table_X.field_1, schema_B.table_X.field_1 FROM schema_A.table_X, schema_B.table_X
Instead, I want that to be taken care of in PostgreSQL, in the database itself.
Generating a query with all the schemas (namespaces) appended can make my queries HUGE with ~50 fields and ~50 schemas.
Since these tables are generated I also cannot inherit them from some global table and query that instead.
I'd also like to know if this is simply not possible at a reasonable speed.
EXTRA:
I'm using Django and django-tenants, so I'd also accept any answer that helps me generate the entire query and run it to get a global queryset, EVEN THOUGH it would be really slow.
Your question isn't as much a question as it is an admission that you've got a really terrible database and application design. It's as if you partitioned something that didn't need to be partitioned, or partitioned it in the wrong way.
Since you're doing something awkward, the database itself won't provide you with any elegant solution. Instead, you'll have to get more and more awkward until the regret becomes too much to bear and you redesign your database and/or your application.
I urge you to repent now, the sooner the better.
After that giant caveat based on a haughty moral position, I acknowledge that the only reason we answer questions here is to get imaginary internet points. And so, my answer is this: use a view that UNIONs all of the values together and presents them as if they came from one table. I can't make sense of the "order by query" example, so I'll ignore it for now. Maybe you mean that you want the results in a certain order; if so, you can add a constant to each SELECT operand of the UNION ALL and ORDER BY that constant column coming out of the union. But if the order of the rows matters, I'd assert that you are showing yet another symptom of a poor database design.
You can programmatically update the view whenever you update or create new schemas and their catalogs.
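A rough sketch of that regeneration step (assuming psycopg2 and a per-tenant table named baz; the connection string and view name are placeholders, adapt them to your setup):
import psycopg2

conn = psycopg2.connect('dbname=mydb')  # hypothetical connection string
cur = conn.cursor()

# Find every schema that contains the per-tenant table.
cur.execute(
    "SELECT table_schema FROM information_schema.tables "
    "WHERE table_name = 'baz'"
)
schemas = [row[0] for row in cur.fetchall()]

# Build one SELECT per schema and UNION them all into a single view.
# The schema names come from the catalog, not from user input, so
# interpolating them here is acceptable.
selects = [
    f"SELECT '{s}' AS source_schema, * FROM {s}.baz"
    for s in schemas
]
cur.execute('CREATE OR REPLACE VIEW all_of_them AS '
            + ' UNION ALL '.join(selects))
conn.commit()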
A working example is here: http://sqlfiddle.com/#!17/c09265/1
with this schema creation and population code:
CREATE SCHEMA Fooey;
CREATE SCHEMA Junk;
CREATE TABLE Fooey.Baz (SomeInteger INT);
CREATE TABLE Junk.Baz (SomeInteger INT);
INSERT INTO Fooey.Baz (SomeInteger) VALUES (17), (34), (51);
INSERT INTO Junk.Baz (SomeInteger) VALUES (13), (26), (39);
CREATE VIEW AllOfThem AS
SELECT 'FromFooey' AS SourceSchema, SomeInteger FROM Fooey.Baz
UNION ALL
SELECT 'FromJunk' AS SourceSchema, SomeInteger FROM Junk.Baz;
and this query:
SELECT *
FROM AllOfThem
ORDER BY SourceSchema;
Why are per-tenant schemas a bad design?
This design favors laziness over scalability. If you don't want to make changes to your application, you can simply point connections at a particular schema and keep working without any code changes. Adding more tenants means adding more schemas, which it sounds like you've automated. Adding many schemas will eventually make database management cumbersome (what if you have thousands or millions of tenants?), and even with only a few, the dynamic nature of the list and the difficulty of writing system-wide queries are problems you've already discovered.
Consider instead combining everything and adding the tenant ID as part of a key on each table. In that case, adding more tenants means adding more rows. Any summary queries trivially come from single tables, and all of the features and power of the database implementation and its query language are at your fingertips without any fuss whatsoever.
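A minimal sketch of that shape (illustrated with sqlite3 for brevity; the table and column names are made up):
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()

# One shared table; the tenant ID is part of the key instead of
# being encoded in a schema name.
c.execute('''
    CREATE TABLE orders (
        tenant_id INTEGER NOT NULL,
        order_id  INTEGER NOT NULL,
        amount    REAL,
        PRIMARY KEY (tenant_id, order_id)
    )
''')
c.executemany('INSERT INTO orders VALUES (?, ?, ?)',
              [(1, 1, 9.99), (1, 2, 4.50), (2, 1, 20.00)])

# Per-tenant queries just filter; cross-tenant summaries need no UNIONs.
c.execute('SELECT tenant_id, SUM(amount) FROM orders GROUP BY tenant_id')
print(c.fetchall())  # [(1, 14.49), (2, 20.0)]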
It's simply false that a database design can't be changed, even in an existing and busy system. It takes a lot of effort to do it, but it can be done and people do it all the time. That's why getting the database design right as early as possible is important.
The README of the django-tenants package you're using describes their decision to trade off towards laziness, and cites a whitepaper that outlines many of the shortcomings of, and alternatives to, that method.

In Python, can we set the table name as a variable using sqlite3?

For example:
import sqlite3
def ChangeTable(c, a):
    c.execute('''DELETE FROM MY_TABLE WHERE id = ?''', (a,))
This way I can change the value of a and process the database with a function in Python.
But how can I do something similar with table names?
That way I could use one function to handle different tables.
Passing in table names is intentionally made not to work, since dynamically using table names in SQL queries is generally a bad idea. If you find yourself wanting to do things with table names dynamically, you should think about redesigning your database and making it relational. It's hard to give specific advice about how to do this in your case, since we don't know how your database is arranged.
The other answer involves using string formatting in SQL queries. This is very bad coding practice, since it makes your code vulnerable to SQL injection. In fact, the sqlite documentation says never to do this. You don't want to have a Bobby Tables situation on your hands.
You can't use variables for table names. You have to perform string concatenation or substitution. As an example, using an f-string (Python >= 3.6):
def change_table(c, table_name, row_id):
    c.execute(f'DELETE FROM {table_name} WHERE id = ?', (row_id,))
with more meaningful variable names...
Triple quoting is not required here but useful for multiline statements.
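If you do go this route, one precaution worth adding (my own suggestion, not part of the answer above) is to check the table name against the tables that actually exist before interpolating it:
def change_table_safely(c, table_name, row_id):
    # Allow only names of tables that already exist in the database.
    c.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    allowed = {row[0] for row in c.fetchall()}
    if table_name not in allowed:
        raise ValueError(f'unknown table: {table_name!r}')
    # Quoting the identifier also guards against names with odd characters.
    c.execute(f'DELETE FROM "{table_name}" WHERE id = ?', (row_id,))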

Python: Efficient lookup in a list of dicts

I have a huge list of dicts (or objects) that all have exactly the same fields. I need to search this collection to retrieve a single object based on given criteria (there could be many matches, but I take only the first one).
The search criteria don't involve all the fields, and the fields involved differ from search to search, so turning the list into a dictionary keyed on hashed values is impossible.
This looks like a job for a database, so I'm thinking about using an in-memory sqlite database for it. The problem is that I need to write some kind of wrapper that will translate SQL queries into Python API, which makes me think that perhaps there's already a solution for that somewhere.
Maybe someone already had a similar problem, and there's already a tool that will help me with that? Or sqlite is the only way?
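For reference, the plain-Python baseline is a generator expression, and an in-memory sqlite table is one step up from that (a sketch with invented field names):
import sqlite3

records = [
    {'name': 'alice', 'age': 30, 'city': 'Oslo'},
    {'name': 'bob', 'age': 25, 'city': 'Lima'},
    # ... thousands more
]

# Plain-Python baseline: first match or None.
def find_first(records, **criteria):
    return next(
        (r for r in records if all(r[k] == v for k, v in criteria.items())),
        None,
    )

print(find_first(records, age=25, city='Lima'))  # bob's record

# In-memory sqlite alternative: load once, then query with SQL.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (name TEXT, age INTEGER, city TEXT)')
conn.executemany('INSERT INTO t VALUES (:name, :age, :city)', records)
row = conn.execute('SELECT * FROM t WHERE age = ? AND city = ? LIMIT 1',
                   (25, 'Lima')).fetchone()
print(row)  # ('bob', 25, 'Lima')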

How is SQLAlchemy's column documentation (Column.doc) actually used? [duplicate]

This question already has answers here:
How to add comments to table or column in mysql using SQLAlchemy?
Closed 4 months ago.
When creating a Column object with SQLAlchemy, you can specify a documentation string. Once set, how can this documentation be accessed in Python?
In addition, is using doc good practice for documenting SQL columns, or would it be better to use comment, or perhaps Sphinx's standard for documenting general instance variables?
The doc parameter creates Python docstrings for the mapped classes and fields of your ORM (see also PEP 257). You can access them through the __doc__ attribute or through the built-in help() function. Some IDEs will probably use them for hints as well.
See this answer for some examples.
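A small sketch (the model and column names are invented for illustration):
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True, doc='Surrogate primary key.')
    name = Column(String, doc="The user's display name.")

# The doc string becomes the docstring of the mapped attribute:
print(User.name.__doc__)  # The user's display name.
help(User.name)           # surfaces the same text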

Does a one-to-one Relationship Mean Unnecessary Tables When Designing An SQL Database? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
My question is more conceptual than anything else. I'm designing the schema for a database that I am about to create using SQLAlchemy (a Python toolkit for SQL databases, http://www.sqlalchemy.org/), and I've come across a situation where there exists a one-to-one relationship between two tables of the database. To make this concrete, I have a table called "Identification", which contains two columns: 1) an internal id (used as the primary key), and 2) an object ID. This table has a one-to-one relationship to another table we can call "Analysis", which represents a unique look-up table to data which I have personally generated on each of these objects (through my research).
My question is: does the existence of a one-to-one relationship mean that I should combine "Identification" and "Analysis" into a single table? Every schema I've looked at for SQL databases shows one-to-many or many-to-many scenarios. Is there something conceptually wrong or bad about having a one-to-one relationship between tables of a database? The reason I'm even pursuing this is that this analysis might be one of many types of analysis I'd like to do on each object, and it would be nice to have the identification table separated from the data I get back from each analysis.
Any opinions or comments are appreciated!
This could easily go both ways.
In the case you've described, where it seems it's one-to-one and the first table contains only identifiers, I would say it's best to merge the tables (combine them), because this removes a join from what would otherwise be required in effectively every query.
In general, I would call this true any time there is an effective 1-to-1 relationship. That said, there is another variation of this: the 0..1-to-1. In that case, a row in table A may have either one row in table B or none at all. There it's advantageous to keep them separate, because otherwise you have columns that may not apply to every row and just sit empty.
Generally, in designing your database, ask yourself the question: "What am I gaining by keeping these separate, versus what am I losing if I don't merge them?" In your case, you gain little and lose performance.
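If you do keep them separate, a one-to-one mapping in SQLAlchemy looks roughly like this (a sketch; the column names beyond what the question gives are invented):
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Identification(Base):
    __tablename__ = 'identification'
    id = Column(Integer, primary_key=True)
    object_id = Column(String, unique=True)
    # uselist=False makes this one-to-one instead of one-to-many.
    analysis = relationship('Analysis', back_populates='identification',
                            uselist=False)

class Analysis(Base):
    __tablename__ = 'analysis'
    id = Column(Integer, primary_key=True)
    # The unique foreign key enforces at most one Analysis per Identification.
    identification_id = Column(Integer, ForeignKey('identification.id'),
                               unique=True, nullable=False)
    result = Column(String)
    identification = relationship('Identification', back_populates='analysis')

Base.metadata.create_all(create_engine('sqlite://'))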
