I'm using SQLAlchemy to insert a row into a table. The table is defined like this:
class MyTable(Base):  # Base is the declarative base
    __tablename__ = "my_table"
    id = Column(BigInteger, primary_key=True)
    stuff1 = Column(Numeric)
Here's the SQLAlchemy line:
MyTable(stuff1=100)
Here's the query it generates:
INSERT INTO my_table (id, stuff1) VALUES (null, 100)
And I get this error:
IntegrityError('(psycopg2.IntegrityError) null value in column \"id\" violates not-null constraint
Since the id is the primary key, I expected it to be generated automatically. But it seems like I have to manually apply a sequence to it? What am I doing wrong?
This has happened to me before, usually because I either didn't use SQLAlchemy to create the table or my SQLAlchemy table definition was incomplete or incorrect when I did. If you have a 'create table if not exists' setup, later changes to the model won't be synchronized with the existing table definition in Postgres. I would check the table definition with the psql command-line tool to verify whether a server default is set up for the primary key. The column should be a serial data type or use an external sequence.
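As a rough sketch of that check done from Python instead of psql: the query below reads PostgreSQL's information_schema.columns view, and the helper interprets the row it returns. The names COLUMN_DEFAULT_SQL and is_auto_generated are made up for this illustration, and the %(name)s placeholders assume a psycopg2-style driver.

```python
# Query against PostgreSQL's information_schema; run it with your driver
# of choice (the %(name)s placeholder style is the one psycopg2 uses).
COLUMN_DEFAULT_SQL = """
SELECT column_default, is_identity
FROM information_schema.columns
WHERE table_name = %(table)s AND column_name = %(column)s
"""


def is_auto_generated(column_default, is_identity):
    """Return True if the catalog row describes a server-generated column.

    Hypothetical helper: column_default and is_identity are the two values
    returned by COLUMN_DEFAULT_SQL for a single column.
    """
    # Identity columns (GENERATED ... AS IDENTITY) report is_identity = 'YES'.
    if is_identity == "YES":
        return True
    # serial / bigserial columns carry a nextval('..._seq') default.
    return column_default is not None and column_default.startswith("nextval(")
```

If the helper returns False for my_table.id, the table was most likely created without a serial type or sequence, which matches the NULL id in the generated INSERT.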
I am using SQLite v3.31.1 and python 3.7.9.
How do I identify which tables a specific id value exists in? I think something like this exists in MySQL as "INFORMATION_SCHEMA", but I have not found an SQLite alternative.
I have a table structure that follows what I believe is called "Class Table Inheritance". Basically, I have a product table that has attributes common to all products, plus tables that contain attributes specific to that class of products. The "Products" main table contains an id column that is a primary key; this is used as a foreign key in the child tables. So a specific item exists in both the product table and the child table, but nowhere else (other than some virtual tables for FTS).
My current solution is to get all table names like
SELECT tbl_name
FROM sqlite_master
WHERE type='table'
and then loop over the tables using
SELECT 1
FROM child_table
WHERE id = value
but it seems like there should be a better way.
Thank you.
While the answer I originally posted will work, it is slow. You could add a hash map in Python, but it will be slow while the hash map is rebuilt.
Instead, I am using triggers in my SQL setup to create a table with the relevant info. This is a bit more brittle, but much faster for a large number of search results, as the db is queried just once and the table is loaded when the database is initialized.
My example (there is a parent table called Products that has the id primary key):
CREATE TABLE IF NOT EXISTS ThermalFuse (
    id INTEGER NOT NULL,
    ratedTemperature INTEGER NOT NULL,
    holdingTemperature INTEGER NOT NULL,
    maximumTemperature INTEGER NOT NULL,
    voltage INTEGER NOT NULL,
    current INTEGER NOT NULL,
    FOREIGN KEY (id) REFERENCES Product (id)
);
CREATE TABLE IF NOT EXISTS Table_Hash (
    id INTEGER NOT NULL,
    table_name TEXT NOT NULL,
    FOREIGN KEY (id) REFERENCES Product (id)
);
CREATE TRIGGER ThermalFuse_ai AFTER INSERT ON ThermalFuse
BEGIN
    INSERT INTO Table_Hash (id, table_name)
    VALUES (new.id, 'ThermalFuse');
END;
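To show how the lookup table might be used from Python, here is a minimal sketch against an in-memory SQLite database. The ThermalFuse table is trimmed to one attribute, the Product table's name column and the helper tables_for_id are assumptions made for the example, and only an insert trigger is covered (deletes would need a matching trigger).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE ThermalFuse (
    id INTEGER NOT NULL,
    ratedTemperature INTEGER NOT NULL,
    FOREIGN KEY (id) REFERENCES Product (id)
);
CREATE TABLE Table_Hash (
    id INTEGER NOT NULL,
    table_name TEXT NOT NULL,
    FOREIGN KEY (id) REFERENCES Product (id)
);
CREATE TRIGGER ThermalFuse_ai AFTER INSERT ON ThermalFuse
BEGIN
    INSERT INTO Table_Hash (id, table_name) VALUES (new.id, 'ThermalFuse');
END;
""")

# Inserting into the child table fires the trigger, which records the mapping.
conn.execute("INSERT INTO Product (id, name) VALUES (1, 'fuse A')")
conn.execute("INSERT INTO ThermalFuse (id, ratedTemperature) VALUES (1, 72)")


def tables_for_id(connection, product_id):
    """Return the child table names recorded for product_id (hypothetical helper)."""
    rows = connection.execute(
        "SELECT table_name FROM Table_Hash WHERE id = ?", (product_id,)
    ).fetchall()
    return [name for (name,) in rows]


found = tables_for_id(conn, 1)    # id that was inserted into ThermalFuse
missing = tables_for_id(conn, 2)  # id that exists in no child table
```

The point of the design is that a single indexed query on Table_Hash replaces the loop over every table listed in sqlite_master.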
I test SQL commands from Python. Generally everything is okay, but in this case it doesn't work.
import sqlite3
conn = sqlite3.connect('Chinook_Sqlite.sqlite')
cursor = conn.cursor()
result = None
try:
    cursor.executescript("""CREATE TABLE <New>;""")
    result = cursor.fetchall()
except sqlite3.DatabaseError as err:
    print("Error: ", err)
else:
    conn.commit()
print(result)
conn.close()
The table name is written without <>, and it must be followed by the column definitions (name, type, default value) in parentheses.
https://www.sqlite.org/lang_createtable.html - thanks #deceze
The "CREATE TABLE" command is used to create a new table in an SQLite
database. A CREATE TABLE command specifies the following attributes of
the new table:
The name of the new table.
The database in which the new table is created. Tables may be created in the main database, the temp database, or in any attached database.
The name of each column in the table.
The declared type of each column in the table.
A default value or expression for each column in the table.
A default collation sequence to use with each column.
Optionally, a PRIMARY KEY for the table. Both single column and composite (multiple column) primary keys are supported.
A set of SQL constraints for each table. SQLite supports UNIQUE, NOT NULL, CHECK and FOREIGN KEY constraints.
Optionally, a generated column constraint.
Whether the table is a WITHOUT ROWID table.
cursor.executescript("""CREATE TABLE New ( AuthorId INT IDENTITY (1, 1) NOT NULL, AuthorFirstName NVARCHAR (20) NOT NULL, AuthorLastName NVARCHAR (20) NOT NULL, AuthorAge INT NOT NULL);""")
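A small sketch of the difference, run against an in-memory database rather than Chinook_Sqlite.sqlite so it can be tried safely. The Author table name and the SQLite-style column types are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the demo

# The placeholder-style statement fails: <New> is not a valid table name,
# and no column definitions follow it.
error_message = None
try:
    conn.execute("CREATE TABLE <New>;")
except sqlite3.Error as err:
    error_message = str(err)

# A valid statement: plain table name, then column definitions in parentheses.
conn.execute("""
    CREATE TABLE Author (
        AuthorId INTEGER PRIMARY KEY,
        AuthorFirstName TEXT NOT NULL,
        AuthorLastName TEXT NOT NULL,
        AuthorAge INTEGER NOT NULL DEFAULT 0
    )
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(Author)")]
conn.close()
```

Note that CREATE TABLE produces no result rows, so calling fetchall() after it returns an empty list even when the statement succeeds.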
(sorry, there are many similar questions on SO but none I could find that match well enough)
Attempting to upsert to a Postgres RDS table via a temp table...
import sqlalchemy as sa
# assume db_engine is already set up
with db_engine.connect() as conn:
    conn.execute(sa.text("DROP TABLE IF EXISTS temp_table"))
    build_temp_table = f"""
        CREATE TABLE temp_table (
            unique_id VARCHAR(40) NOT NULL,
            date TIMESTAMP,
            amount NUMERIC,
            UNIQUE (unique_id)
        );
    """
    conn.execute(sa.text(build_temp_table))
    upsert_sql_string = """
        INSERT INTO production_table(unique_id, date, amount)
        SELECT unique_id, date, amount FROM temp_table
        ON CONFLICT (unique_id)
        DO UPDATE SET
            date = excluded.date,
            amount = excluded.amount
    """
    conn.execute(sa.text(upsert_sql_string))
Note: production_table is configured identically to temp_table
Other methods I have tried include:
Specifying unique_id as PRIMARY KEY or UNIQUE in table definition
Running ALTER TABLE temp_table ADD PRIMARY KEY (unique_id) after creating temp_table
Regardless of what I do, I get the error:
psycopg2.errors.InvalidColumnReference: there is no unique or exclusion constraint matching the ON CONFLICT specification
Thanks
I have the following model, where TableA and TableB have a one-to-one relationship:
class TableA(db.Model):
    id = Column(db.BigInteger, primary_key=True)
    title = Column(String(1024))
    # back_populates must name the attribute on TableB, which is "a"
    table_b = relationship('TableB', uselist=False, back_populates="a")
class TableB(db.Model):
    id = Column(BigInteger, ForeignKey(TableA.id), primary_key=True)
    a = relationship('TableA', back_populates='table_b')
    name = Column(String(1024))
When I insert one record, everything works fine:
rec_a = TableA(title='hello')
rec_b = TableB(a=rec_a, name='world')
db.session.add(rec_b)
db.session.commit()
But when I try to do this for a batch of records:
bulk_ = []
for title, name in zip(titles, names):
    rec_a = TableA(title=title)
    bulk_.append(TableB(a=rec_a, name=name))
db.session.bulk_save_objects(bulk_)
db.session.commit()
I get the following exception:
sqlalchemy.exc.InternalError: (pymysql.err.InternalError) (1364, "Field 'id' doesn't have a default value")
Am I doing something wrong? Did I configure the model wrong?
Is there a way to bulk commit this type of data?
The error you see is raised by MySQL. It is complaining that the insert into table_b supplies no value for id, which has no default. bulk_save_objects does not process relationships, so the related TableA rows are never inserted and their keys are never copied into TableB.id.
One technique could be to write all the titles in one bulk statement, then write all the names in a second bulk statement. I've never passed relationships successfully to bulk operations, so this method relies on inserting simple values.
bulk_titles = [TableA(title=title) for title in titles]
session.bulk_save_objects(bulk_titles, return_defaults=True)
bulk_names = [TableB(id=title.id, name=name) for title, name in zip(bulk_titles, names)]
session.bulk_save_objects(bulk_names)
return_defaults=True is needed above because we need title.id in the second bulk operation. But this greatly reduces the performance gains of the bulk operation.
To avoid the performance degradation due to return_defaults=True, you could generate the primary keys in the application rather than the database, e.g. using UUIDs, or by fetching the max id in each table and generating a range from that start value.
Another technique might be to write your bulk insert statement using SQLAlchemy Core or plain text.
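The last two suggestions can be combined. The following is only a sketch using the standard-library sqlite3 module in place of MySQL, with table names table_a and table_b and sample titles/names chosen for the example. It reserves an id range starting after the current maximum, which assumes no other writer inserts into table_a at the same time.

```python
import sqlite3

titles = ["hello", "goodbye"]  # sample data, assumed for the demo
names = ["world", "moon"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE table_b (
    id INTEGER PRIMARY KEY REFERENCES table_a (id),
    name TEXT
);
""")

# Generate the primary keys in the application: take the current max id
# and build a contiguous range after it (not safe with concurrent writers).
(max_id,) = conn.execute("SELECT COALESCE(MAX(id), 0) FROM table_a").fetchone()
ids = range(max_id + 1, max_id + 1 + len(titles))

# Two plain-SQL bulk inserts; the shared ids link each table_b row to table_a.
conn.executemany("INSERT INTO table_a (id, title) VALUES (?, ?)", zip(ids, titles))
conn.executemany("INSERT INTO table_b (id, name) VALUES (?, ?)", zip(ids, names))
conn.commit()

joined = conn.execute(
    "SELECT a.title, b.name FROM table_a a "
    "JOIN table_b b ON a.id = b.id ORDER BY a.id"
).fetchall()
```

Because the keys are known before either insert runs, nothing has to be read back from the database between the two statements, which is what return_defaults=True would otherwise cost.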
I'm using the redshift-sqlalchemy package to connect SQLAlchemy to Redshift. In Redshift I have a simple "companies" table:
create table if not exists companies (
    id bigint identity primary key,
    name varchar(1024) not null
);
On the SQLAlchemy side I have mapped it like so:
Base = declarative_base()
class Company(Base):
    __tablename__ = 'companies'
    id = Column(BigInteger, primary_key=True)
    name = Column(String)
If I try to create a company:
company = Company(name = 'Acme')
session.add(company)
session.commit()
then I get this error:
sqlalchemy.exc.StatementError: (raised as a result of Query-invoked autoflush;
consider using a session.no_autoflush block if this flush is occurring prematurely)
(sqlalchemy.exc.ProgrammingError) (psycopg2.ProgrammingError)
relation "companies_id_seq" does not exist
[SQL: 'select nextval(\'"companies_id_seq"\')']
[SQL: u'INSERT INTO companies (id, name)
VALUES (%(id)s, %(name)s)'] [parameters: [{'name': 'Acme'}]]
The problem is surely that SQLAlchemy is expecting an auto-incrementing sequence, the standard technique with Postgres and other conventional DBs. But Redshift doesn't have sequences; instead, it offers "identity columns" for auto-generated unique values (not necessarily sequential). Any advice on how to make this work? To be clear, I don't care about auto-incrementing, I just need unique primary key values.
Just like you said, Redshift doesn't support sequences so you can remove this part:
select nextval(\'"companies_id_seq"\')
And your insert statement should simply be:
INSERT INTO companies
(name)
VALUES
('Acme')
In your table, you will see that 'Acme' has an id column with a unique value. You can't insert a value into the id column, so you don't specify it in the insert statement. It will be populated automatically.
Here is more explanation:
http://docs.aws.amazon.com/redshift/latest/dg/c_Examples_of_INSERT_30.html
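The same principle can be tried locally. This sketch uses an in-memory SQLite database purely as a stand-in for Redshift, with SQLite's INTEGER PRIMARY KEY playing the role of the identity column; the second company name is made up for the demo. It does not show how to configure SQLAlchemy itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the Redshift table: the database, not the client, assigns id.
conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The id column is left out of the INSERT entirely.
conn.execute("INSERT INTO companies (name) VALUES (?)", ("Acme",))
conn.execute("INSERT INTO companies (name) VALUES (?)", ("Globex",))

rows = conn.execute("SELECT id, name FROM companies ORDER BY id").fetchall()
ids = [row[0] for row in rows]
company_names = [row[1] for row in rows]
conn.close()
```

Each row ends up with its own id even though the statements never mention that column.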