I'm creating tables using a sqlalchemy engine, but even though my create statements execute without error, the tables don't show up in the database when I try to set the role beforehand.
url = 'postgresql://{}:{}@{}:{}/{}'
url = url.format(user, password, host, port, db)
engine = sqlalchemy.create_engine(url)
# works fine
engine.execute("CREATE TABLE testpublic (id int, val text); \n\nINSERT INTO testpublic VALUES (1,'foo'), (2,'bar'), (3,'baz');")
r = engine.execute("select * from testpublic")
r.fetchall() # returns expected tuples
engine.execute("DROP TABLE testpublic;")
# appears to succeed/does NOT throw any error
engine.execute("SET ROLE read_write; CREATE table testpublic (id int, val text);")
# throws error "relation testpublic does not exist"
engine.execute("select * FROM testpublic")
For context, I am on python 3.6, sqlalchemy version 1.2.17 and postgres 11.1 and the role "read_write" absolutely exists and has all necessary permissions to create a table in public (I have no problem running the exact sequence above in pgadmin).
Does anyone know why this is the case and how to fix?
The issue here is how sqlalchemy decides whether to issue a commit after each statement.
If a plain string is passed to engine.execute, sqlalchemy attempts to determine whether the text is DML or DDL using the following regex. You can find it in the sources here:
AUTOCOMMIT_REGEXP = re.compile(
r"\s*(?:UPDATE|INSERT|CREATE|DELETE|DROP|ALTER)", re.I | re.UNICODE
)
This only detects the keywords at the start of the text, ignoring any leading whitespace. So, while your first attempt # works fine, in the second example sqlalchemy fails to recognize that a commit needs to be issued after the statement executes, because the first word is SET.
Instead, sqlalchemy issues a rollback, which is why it # appears to succeed/does NOT throw any error.
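To see the heuristic in action, here is a minimal sketch (mine, not part of the original answer) that runs the same pattern against both statements:
import re

AUTOCOMMIT_REGEXP = re.compile(
    r"\s*(?:UPDATE|INSERT|CREATE|DELETE|DROP|ALTER)", re.I | re.UNICODE
)

# matches, so sqlalchemy commits after executing it
print(bool(AUTOCOMMIT_REGEXP.match("CREATE TABLE testpublic (id int, val text);")))   # True
# does not match because the text starts with SET, so no commit is issued
print(bool(AUTOCOMMIT_REGEXP.match("SET ROLE read_write; CREATE TABLE testpublic (id int, val text);")))  # False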
The simplest solution is to commit manually. For example:
engine.execute("SET ROLE read_write; CREATE table testpublic (id int, val text); COMMIT;")
Or, wrap the SQL in text() and set autocommit=True, as shown in the documentation:
from sqlalchemy import text

stmt = text('set role read_write; create table testpublic (id int, val text);').execution_options(autocommit=True)
engine.execute(stmt)
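Another option (my addition rather than part of the original answer) is to run the statements inside an explicit transaction, so the commit no longer depends on the autocommit heuristic at all:
with engine.begin() as conn:
    # both statements run in one transaction, committed when the block exits cleanly
    conn.execute("SET ROLE read_write;")
    conn.execute("CREATE TABLE testpublic (id int, val text);")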
TL;DR / Summary:
How do I convert/cast a remote database (and its constituent tables) fetched via asyncpg into sqlite3?
I have a remote database (the wrapper connecting to it is asyncpg) and I want to clone it locally into sqlite3 such that both the table structure and the data are copied completely.
My initial idea was to do something like :
CREATE TABLE new_table LIKE ?;
INSERT INTO new_table SELECT * FROM ?;
but this runs into the problem of what object should be passed as the argument to the sqlite3 cursor. Since I couldn't resolve that dilemma myself, my next idea was to fetch the table name and the column metadata, reconstruct the SQL query from them, and pass that on to sqlite3. If this works correctly we could at least clone the table structure and then load in the appropriate data.
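To make the dilemma concrete (an illustration of mine, not part of the original post): DB-API placeholders can only bind values, never identifiers such as table names, so the parametrised CREATE TABLE above cannot work as written:
import sqlite3

conn = sqlite3.connect(":memory:")
# raises sqlite3.OperationalError, because "?" cannot stand in for a table name
conn.execute("CREATE TABLE ? (id INTEGER)", ("new_table",))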
To proceed with this notion of thinking, I first got all the tables in the remote database,
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
(Admittedly this also returns some tables that I never defined, but it works.)
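If the system tables are unwanted, one refinement (assuming the user-defined tables live in the default public schema) is to filter on the schema as well:
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
  AND TABLE_SCHEMA = 'public'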
Now that we have the name of every table, let's say one of the fetched tables is named foo so that we can work with it. We want to clone this table locally into sqlite3, and we attempt that by first getting the column info:
SELECT column_name, data_type, character_maximum_length, column_default, is_nullable
FROM INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'foo'
Here is the code for "reconstructing" the table:
import asyncpg
import asyncio
import os

from asyncpg import Record
from typing import List


def make_sql(table_name: str, column_data: List[Record]) -> str:
    sql = f"CREATE TABLE {table_name} (\n"
    for i in column_data:
        sql += f"{i['column_name']} {i['data_type'].upper()} {'NOT NULL' if i['is_nullable'].upper() == 'NO' else ''} {'' if i['column_default'] is None else i['column_default']},\n"
    sql = sql[:-2] + "\n);"
    return sql


async def main():
    con = await asyncpg.connect(os.getenv("DATABASE_URL"))
    # In the actual case we will be getting every table name from the database;
    # for now let's only focus on the table foo
    response = await con.fetch(
        "SELECT * FROM information_schema.columns WHERE table_name = 'foo'"
    )
    print(make_sql("foo", response))
    await con.close()


asyncio.get_event_loop().run_until_complete(main())
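The data-loading step mentioned earlier is not shown above; a rough sketch of how it might look (my illustration, assuming the reconstructed table already exists in the sqlite3 file and that every value converts cleanly) would be:
import sqlite3

async def copy_rows(con, table_name: str, sqlite_path: str) -> None:
    # pull all rows from the remote table and bulk-insert them locally
    rows = await con.fetch(f"SELECT * FROM {table_name}")
    if not rows:
        return
    placeholders = ", ".join("?" for _ in rows[0].keys())
    with sqlite3.connect(sqlite_path) as lite:
        lite.executemany(
            f"INSERT INTO {table_name} VALUES ({placeholders})",
            [tuple(r) for r in rows],
        )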
Here is how the table foo was actually created:
CREATE TABLE foo (
bar bigint NOT NULL,
baz text NOT NULL,
CONSTRAINT foo_pkey PRIMARY KEY (bar)
);
Here is how my code attempted to reconstruct the previous query:
CREATE TABLE foo (
bar BIGINT NOT NULL ,
baz TEXT NOT NULL
);
This has very clearly lost part of its structure (the constraints) while reconstructing the query, and I am sure there are other edge cases it doesn't cover. Sewing and stitching strings together like this feels extremely fragile; I think some other approach would be better, because this seems fundamentally wrong.
Which leads me to the question: how do I clone the exact table structure and data over to sqlite3 properly?
I'd like to execute a DDL statement (for example: create table test(id int, str varchar)) in different DB schemas.
In order to execute this DDL i was going to use the following code:
from sqlalchemy import DDL, text, create_engine
engine = create_engine(...)
ddl_cmd = "create table test(id int, str varchar)"
DDL(ddl_cmd).execute(bind=engine)
How can I specify in which DB schema to execute this DDL statement, not changing the DDL command itself?
I don't understand why such a basic parameter like schema is missing in the DDL().execute() method. I guess I'm missing some important concept, but I couldn't figure it out.
UPD: I've found the "schema_translate_map" execution option, but it didn't work for me - the table is still created in the default schema.
Here are my attempts:
conn = engine.connect().execution_options(schema_translate_map={None: "my_schema"})
then i tried different variants:
# variant 1
conn.execute(ddl_cmd)
# variant 2
conn.execution_options(schema_translate_map={None: "my_schema"}).execute()
# variant 3
DDL(ddl_cmd).compile(bind=conn).execute()
# variant 4
DDL(ddl_cmd).compile(bind=conn).execution_options(schema_translate_map={None: "my_schema"})
but every time the table is created in the default schema. :(
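For what it's worth, schema_translate_map is documented to work at compile time on constructs such as Table, not on raw SQL strings, which would explain why the attempts above have no effect. A sketch under that assumption (mine, not a verified fix for the raw-DDL case):
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

engine = create_engine("postgresql://user:password@host:5432/db")  # placeholder URL

metadata = MetaData()
test = Table(
    "test", metadata,
    Column("id", Integer),
    Column("str", String),
)

conn = engine.connect().execution_options(schema_translate_map={None: "my_schema"})
metadata.create_all(bind=conn)  # should emit CREATE TABLE my_schema.test (...)
conn.close()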
I am making a script that should create a schema for each customer. I’m fetching all metadata from a database that defines how each customer’s schema should look, and then creating it. Everything is well defined: the types, the names of tables, etc. A customer has many tables (e.g. address, customers, contact, item, etc.), and each table has the same metadata.
My procedure now:
get everything I need from the metadataDatabase.
In a for loop, create a table, and then ALTER TABLE to add each piece of metadata (this is done for each table).
Right now my script runs in about a minute for each customer, which I think is too slow. It has something to do with me having a loop, and in that loop, I’m altering each table.
I think that instead of me altering (which might be not so clever approach), I should do something like the following:
Note that this is just a stupid but valid example:
for table in tables:
    con.execute("CREATE TABLE IF NOT EXISTS tester.%s (%s, %s);", (table, "last_seen date", "valid_from timestamp"))
But it gives me this error (it seems like it reads the table name as a string in a string..):
psycopg2.errors.SyntaxError: syntax error at or near "'billing'"
LINE 1: CREATE TABLE IF NOT EXISTS tester.'billing' ('last_seen da...
Consider creating the tables with a serial type (i.e., autonumber) ID field and then using ALTER TABLE for all other fields, combining sql.Identifier for identifiers (schema names, table names, column names, function names, etc.) with regular string formatting for the data types, which are not literals in the SQL statement.
from psycopg2 import sql

# CREATE TABLE
query = """CREATE TABLE IF NOT EXISTS {shm}.{tbl} (ID serial)"""
cur.execute(sql.SQL(query).format(shm=sql.Identifier("tester"),
                                  tbl=sql.Identifier("table")))

# ALTER TABLE
items = [("last_seen", "date"), ("valid_from", "timestamp")]
query = """ALTER TABLE {shm}.{tbl} ADD COLUMN {col} {typ}"""

for item in items:
    # KEEP IDENTIFIER PLACEHOLDERS
    final_query = query.format(shm="{shm}", tbl="{tbl}", col="{col}", typ=item[1])
    cur.execute(sql.SQL(final_query).format(shm=sql.Identifier("tester"),
                                            tbl=sql.Identifier("table"),
                                            col=sql.Identifier(item[0])))
Alternatively, use str.join with list comprehension for one CREATE TABLE:
query = """CREATE TABLE IF NOT EXISTS {shm}.{tbl} (
"id" serial,
{vals}
)"""
items = [("last_seen", "date"), ("valid_from", "timestamp")]
val = ",\n ".join(["{{}} {typ}".format(typ=i[1]) for i in items])
# KEEP IDENTIFIER PLACEHOLDERS
pre_query = query.format(shm="{shm}", tbl="{tbl}", vals=val)
final_query = sql.SQL(pre_query).format(*[sql.Identifier(i[0]) for i in items],
shm = sql.Identifier("tester"),
tbl = sql.Identifier("table"))
cur.execute(final_query)
SQL (sent to database)
CREATE TABLE IF NOT EXISTS "tester"."table" (
"id" serial,
"last_seen" date,
"valid_from" timestamp
)
However, this becomes heavy as there are too many server roundtrips.
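One usage note (my addition, not part of the answer above): psycopg2 opens a transaction implicitly and does not autocommit, so after these CREATE TABLE / ALTER TABLE calls the changes still need an explicit commit unless autocommit was enabled on the connection:
# `conn` is assumed to be the psycopg2 connection the cursor `cur` came from
conn.commit()
# or, before issuing the DDL:
# conn.autocommit = True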
How many tables with how many columns are you creating that this is slow? Could you ssh to a machine closer to your server and run the python there?
I don't get that error. Rather, I get an SQL syntax error. A values list is for conveying data, but ALTER TABLE is not about data, it is about metadata; you can't use a values list there. You need the names of the columns and types in double quotes (or no quotes) rather than single quotes. And you can't have a comma between name and type. And you can't have parentheses around each pair. And each pair needs to be introduced with "ADD"; you can't have it just once. You are using the wrong tool for the job. execute_batch is almost the right tool, except it will use single quotes rather than double quotes around the identifiers. Perhaps you could add a flag to tell it to use quote_ident.
Not only is execute_values the wrong tool for the job, but I think python in general might be as well. Why not just load from a .sql file?
I wrote a stored procedure in SQL Server that gets passed 4 parameters. I want to check the first parameter @table_name to make sure it uses only whitelisted chars, and if not, write the attempt to a table in a different database (I have DML permissions on both).
If the name is good, it works fine, but if not, then python returns
TypeError: 'NoneType' object is not iterable
(which is expected and fine for me, as there is no such table), but it doesn't write to the table to which it was supposed to write, and the table gets stuck until I shut down the program.
When I run the stored procedure from SSMS with the same parameters, it works perfect, and writes successfully to the log table.
create_str = "CREATE PROC sp_General_GetAllData (#table_name AS NVARCHAR(MAX),\
#year AS INT, #month AS INT, #pd AS INT)
AS
BEGIN
SET NOCOUNT ON;
DECLARE #sql NVARCHAR(MAX) # for later uses
IF #table_name LIKE '%[^a-zA-Z0-9_]%'
BEGIN
# the 'log' table allows NULLs and the fields are used in other cases
INSERT INTO [myDB].[mySchema].[log]
VALUES (SUSER_SNAME(), NULL, NULL, NULL, NULL, NULL,
'INVALID TABLE NAME: ' + #table_name, GETDATE())
RAISERROR ('Bad table name! Contact your friendly DBA for details', 0, 0)
RETURN
END
# do some things if the #table_name is ok...
END"
cursor = sql_conn.execute(create_str)
cursor.commit()
# calling the SP from python - doesn't write to the log table which gets stuck
query = "{CALL sp_General_GetAllData (?, ?, ?, ?)}"
data = pd.read_sql(query, sql_conn, params=['just*testing', 2019, 7, 2])
# calling the SP from SSMS - works fine
EXEC sp_General_GetAllData 'just*testing', 2019, 7, 2
Because of the "*" inside the first parameter, it is expected to insert a line to [myDB].[mySchema].[log], which is happening only if I call the SP from SSMS, but not from python. Why?
SOLUTION:
With some luck I found out that the problem was that when the call to the SP was sent from python, the INSERT INTO statement was simply never committed; it just waited for a commit that never came. The solution was to add autocommit=True to the pyodbc.connect() call.
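For reference, a minimal sketch of that connection call (the driver and server names here are placeholders; note the keyword is spelled autocommit):
import pyodbc

sql_conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=myDB;Trusted_Connection=yes;",
    autocommit=True,  # each statement is committed, so the INSERT into the log table is not left pending
)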
I had the same issue; however, adding autocommit=True to pyodbc.connect() didn't solve my problem. I solved it by adding the following command after the insert statement:
commit
Try this link. Maybe your problem comes from the configuration of your MySQL connection. With SQLAlchemy, the MySQL connector, or other connectors, autocommit is disabled by default.
See MySQLdb's conn.autocommit(True) and Mika's answer.
For the record, I have looked into this, but cannot seem to figure out what is wrong.
So I'm doing the tutorial on web.py, and I've gotten to the database part (I can do everything above it). I wanted to use sqlite3 for various reasons. Since I couldn't figure out where to type the
sqlite3 test.db
line, I looked into the sqlite3 module and created a database with it. The code for that is:
import sqlite3
conn = sqlite3.connect("test.db")
print("Opened database successfully");
conn.execute('''CREATE TABLE todo
(id serial primary key,
title text,
created timestamp default now(),
done boolean default 'f');''')
conn.execute("INSERT INTO todo (title) VALUES ('Learn web.py')");
but I get the error
done boolean default 'f');''')
sqlite3.OperationalError: near "(": syntax error
I've tried looking into this, but cannot figure out for the life of me what the issue is.
I haven't had luck with other databases (I'm new to this, so I'm not sure on the subtleties). I wasn't able to just make the sqlite database directly, so it might be a python thing, but it matches the tester.py I made following the sqlite-with-python tutorial...
Thanks if anyone can help me!
The problem causing the error is that you can't use the MySQL now() function here. Try instead
created default current_timestamp
This works:
conn.execute('''CREATE TABLE todo
(id serial primary key,
title text,
created default current_timestamp,
done boolean default 'f');''')
You are using SQLite but are specifying data types from some other database engine. SQLite accepts only INT, TEXT, REAL, NUMERIC, and NONE. BOOLEAN is most likely being mapped to one of the number types, and therefore DEFAULT 'f' is not valid syntax (although I don't think it would be valid in any version of SQL that does support BOOLEAN as a datatype, since they generally use INTEGER for the underlying storage).
Rewrite the CREATE TABLE statement with SQLite datatypes and allowable default values and your code should work fine.
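For example, a rewrite along those lines might look like this (a sketch of mine, not taken from the answer; it keeps the intent of the original columns while using types and defaults SQLite understands):
conn.execute('''CREATE TABLE todo
             (id INTEGER PRIMARY KEY AUTOINCREMENT,
              title TEXT,
              created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
              done INTEGER DEFAULT 0);''')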
More details on the (somewhat unusual) SQLite type system: http://www.sqlite.org/datatype3.html