I am making a script that should create a schema for each customer. I fetch all the metadata from a database that defines how each customer's schema should look, and then create it. Everything is well defined: the types, the names of the tables, and so on. A customer has many tables (e.g. address, customers, contact, item), and each table has the same kind of metadata.
My procedure now:
1. Get everything I need from the metadata database.
2. In a for loop, create each table, then run ALTER TABLE once per piece of metadata to add its columns (this is done for every table).
Right now my script takes about a minute per customer, which I think is too slow. I suspect it is because of that loop, where I alter each table column by column.
I think that instead of altering (which might not be such a clever approach), I should do something like the following. Note that this is just a stupid but valid example:
for table in tables:
    con.execute("CREATE TABLE IF NOT EXISTS tester.%s (%s, %s);", (table, "last_seen date", "valid_from timestamp"))
But it gives me this error (it seems to read the table name as a string inside a string):
psycopg2.errors.SyntaxError: syntax error at or near "'billing'"
LINE 1: CREATE TABLE IF NOT EXISTS tester.'billing' ('last_seen da...
Consider creating each table with just a serial (i.e. autonumber) ID column, then adding all other columns with ALTER TABLE. Use sql.Identifier for identifiers (schema names, table names, column names, function names, etc.) and regular string formatting for the data types, which are not literals in the SQL statement.
from psycopg2 import sql
# CREATE TABLE
query = """CREATE TABLE IF NOT EXISTS {shm}.{tbl} (ID serial)"""
cur.execute(sql.SQL(query).format(shm=sql.Identifier("tester"),
                                  tbl=sql.Identifier("table")))
# ALTER TABLE
items = [("last_seen", "date"), ("valid_from", "timestamp")]
query = """ALTER TABLE {shm}.{tbl} ADD COLUMN {col} {typ}"""
for item in items:
    # KEEP IDENTIFIER PLACEHOLDERS; only the type is interpolated here
    final_query = query.format(shm="{shm}", tbl="{tbl}", col="{col}", typ=item[1])
    cur.execute(sql.SQL(final_query).format(shm=sql.Identifier("tester"),
                                            tbl=sql.Identifier("table"),
                                            col=sql.Identifier(item[0])))
Alternatively, use str.join with a list comprehension to build a single CREATE TABLE:
query = """CREATE TABLE IF NOT EXISTS {shm}.{tbl} (
"id" serial,
{vals}
)"""
items = [("last_seen", "date"), ("valid_from", "timestamp")]
val = ",\n ".join(["{{}} {typ}".format(typ=i[1]) for i in items])
# KEEP IDENTIFIER PLACEHOLDERS
pre_query = query.format(shm="{shm}", tbl="{tbl}", vals=val)
final_query = sql.SQL(pre_query).format(*[sql.Identifier(i[0]) for i in items],
                                        shm=sql.Identifier("tester"),
                                        tbl=sql.Identifier("table"))
cur.execute(final_query)
The SQL sent to the database:
CREATE TABLE IF NOT EXISTS "tester"."table" (
"id" serial,
"last_seen" date,
"valid_from" timestamp
)
However, the ALTER TABLE route is still heavy, as every added column costs another server round trip.
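To cut round trips further, one option (a sketch, not tested against the original metadata; the tables dict below is hypothetical) is to compose one CREATE TABLE per table as above, join the composed statements with sql.SQL().join, and send them all in a single execute() call:
from psycopg2 import sql

# Hypothetical metadata: {table_name: [(column, type), ...]}
tables = {
    "billing": [("last_seen", "date"), ("valid_from", "timestamp")],
    "address": [("street", "text"), ("valid_from", "timestamp")],
}

statements = []
for tbl, items in tables.items():
    # "{{}} {typ}" renders to "{} date" etc., keeping a placeholder per column
    cols = ",\n    ".join("{{}} {typ}".format(typ=t) for _, t in items)
    pre_query = """CREATE TABLE IF NOT EXISTS {shm}.{tbl} (
    "id" serial,
    %s
)""" % cols
    statements.append(sql.SQL(pre_query).format(
        *[sql.Identifier(c) for c, _ in items],
        shm=sql.Identifier("tester"),
        tbl=sql.Identifier(tbl)))

# One execute, one round trip; all statements run in a single transaction.
cur.execute(sql.SQL(";\n").join(statements))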
How many tables with how many columns are you creating, that this is so slow? Could you SSH to a machine closer to your server and run the Python there?
I don't get that error; rather, I get an SQL syntax error. A VALUES list is for conveying data, but ALTER TABLE is not about data, it is about metadata, so you can't use a VALUES list there. The column names and types need double quotes (or no quotes) rather than single quotes; there must be no comma between a name and its type, and no parentheses around each pair; and each pair needs to be introduced with its own ADD, you can't have it just once. You are using the wrong tool for the job. execute_batch is almost the right tool, except it will use single quotes rather than double quotes around the identifiers. Perhaps you could add a flag to it to tell it to use quote_ident.
Not only is execute_values the wrong tool for the job, but I think Python in general might be as well. Why not just load from a .sql file?
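(A minimal sketch of that approach, assuming the generated DDL has been exported to a hypothetical file named customer_schema.sql:)
# customer_schema.sql is a hypothetical file holding the full generated DDL
with open("customer_schema.sql") as f:
    cur.execute(f.read())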
I have a Python API that inserts data into a Postgres table using a stored procedure that takes a jsonb argument and does an insert into the table. Example:
create function insert_new_employee(i jsonb) returns json as
$$
begin
    insert into newhire(name, payload) values (i ->> 'name', i);
    return i::json;
end;
$$ language plpgsql;
When I do a test insert from a SQL client such as DataGrip, it inserts successfully:
select insert_new_employee('{"name":"alfred"}')
However, when I use the Python API, the payload gets transformed into:
"{\"name\":\"alfred\"}"
Because of the escaping, the name doesn't get parsed and stored into the name column, although the jsonb payload column takes it in.
Is there a way for Postgres to clean up and deal with character escaping when requests are passed through the API?
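For what it's worth, output like "{\"name\":\"alfred\"}" usually means the value was JSON-encoded twice before it reached Postgres. A minimal sketch of the difference on the psycopg2 side (the connection string and call site are assumptions; the actual API code isn't shown):
import json
import psycopg2

conn = psycopg2.connect("dbname=test")  # hypothetical connection string
cur = conn.cursor()
payload = {"name": "alfred"}

# Encoded once: Postgres receives {"name": "alfred"} and i ->> 'name' works.
cur.execute("select insert_new_employee(%s)", (json.dumps(payload),))

# Encoded twice: Postgres receives "{\"name\":\"alfred\"}", a jsonb *string*,
# so i ->> 'name' finds nothing while the payload column still stores it.
cur.execute("select insert_new_employee(%s)", (json.dumps(json.dumps(payload)),))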
I need to recreate the indexes on a table, as I have to insert a lot of data into it.
I am trying to get the definition of an index in Postgres using
SELECT pg_get_indexdef('start_date_sr_index_its'::regclass);
It works, but when I try to run this same command from psycopg2 it says the relation does not exist:
psycopg2.ProgrammingError: relation "start_date_sr_index_its" does not exist
LINE 1: SELECT pg_get_indexdef('start_date_sr_index_its'::regclass);
^
I have tried replacing ' with ", but it says the same.
An easier way to get an index definition in Postgres is to read it from the pg_indexes view instead of using the utility function pg_get_indexdef().
You can simply query:
SELECT indexdef FROM pg_indexes WHERE indexname = 'start_date_sr_index_its'
You can also get schemaname, tablename and tablespace from this view.
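From psycopg2 that query can take the index name as an ordinary parameter, which also sidesteps the regclass lookup entirely (a sketch; the connection string is an assumption):
import psycopg2

conn = psycopg2.connect("dbname=test")  # hypothetical connection string
cur = conn.cursor()

# The index name is passed as data, so no regclass/search_path resolution
# is involved; an empty result just means no such index is visible.
cur.execute("SELECT indexdef FROM pg_indexes WHERE indexname = %s",
            ("start_date_sr_index_its",))
row = cur.fetchone()
index_ddl = row[0] if row else None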
I'm having a tricky little issue with Python and MySQL. To keep it simple, the following code returns whatever is in the variable 'field', which is a string such as 'username' or 'password'.
options = [field, userID]
entries = cursor.execute('select (?) from users where id=(?)', options).fetchall()
print(entries)
This code works correctly if I remove the first (?) and use the actual name (like 'username') instead. Can anyone provide some input?
Your query is effectively executed as:
select 'field' from users where id='value'
which returns the literal string 'field' instead of the value of that column, because a bound parameter is always treated as data.
You cannot parameterize column and table names (docs):
Parameter placeholders can only be used to insert column values. They
can not be used for other parts of SQL, such as table names,
statements, etc.
Use string formatting for that part:
options = [userID]
query = 'select {field} from users where id=(?)'.format(field=field)
cursor.execute(query, options).fetchall()
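Since format() splices field into the SQL unescaped, it is worth validating it against a whitelist of real column names first. A minimal sketch (the users table and its columns are taken from the question; the driver is assumed to be sqlite3, given the ? placeholders):
import sqlite3

ALLOWED_FIELDS = {"username", "password"}  # known columns of the users table

def fetch_field(cursor, field, user_id):
    # Refuse anything that is not a known column before it reaches the SQL.
    if field not in ALLOWED_FIELDS:
        raise ValueError("unknown column: %r" % field)
    query = "select {field} from users where id = ?".format(field=field)
    return cursor.execute(query, (user_id,)).fetchall()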
Related threads with some more explanations:
pysqlite: Placeholder substitution for column or table names?
Python MySQLdb: Query parameters as a named dictionary
The following query causes Python to crash ('python.exe has encountered a problem ...'):
Process terminated with an exit code of -1073741819
The query is:
create temp table if not exists MM_lookup2 as
select lower(Album) || lower(SongTitle) as concat, ID
from MM.songs
where artist like '%;%' collate nocase
If I change "like" to = it runs as expected, e.g.
create temp table if not exists MM_lookup2 as
select lower(Album) || lower(SongTitle) as concat, ID
from MM.songs
where artist = '%;%' collate nocase
I am running Python v2.7.2, with whatever version of SQLite ships with it.
The problem query runs without problems outside Python.
You didn't say which database system/driver you are using. I suspect that your SQL is the problem: the % characters need to be escaped. Possibly the DB driver module tries to interpret % and %) as format characters, and it cannot convert a non-existent parameter value into something acceptable to the database backend.
Can you please give us the concrete Python code? And can you try running the same query, but putting the value '%;%' into a parameter and passing it to the cursor as a parameter?
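For example, with the sqlite3 module the pattern can be bound as a parameter, so no bare % ever appears in the statement text (a sketch; the file name is an assumption, and it presumes the MM database is already attached as in the question):
import sqlite3

conn = sqlite3.connect("music.db")  # hypothetical database file
cur = conn.cursor()

# Bind the LIKE pattern as a parameter instead of embedding '%;%' literally.
pattern = "%;%"
cur.execute("""create temp table if not exists MM_lookup2 as
               select lower(Album) || lower(SongTitle) as concat, ID
               from MM.songs
               where artist like ? collate nocase""", (pattern,))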