How to check if a table exists in Python?

I'm currently going over a course in Web design. What I want to do is to check if a table named portafolio exists in my database, if not I want to create one.
I'm using Python (flask) and sqlite3 to manage my database.
So far I have some of the logic in SQL to create the table if it doesn't exist:
# db is my database variable
db.execute('''create table if not exists portafolio(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    stock TEXT,
    shares INTEGER,
    price FLOAT(2),
    date TEXT
)''')
But instead of using SQL commands I'd like to know how would I do the exact same checking in Python instead, since it would look a lot cleaner.
Any help would be appreciated.

Not sure which way is cleaner, but you can issue a simple SELECT and handle the exception:
try:
    cursor.execute("SELECT 1 FROM portafolio LIMIT 1;")
    exists = True
except sqlite3.OperationalError as e:
    message = e.args[0]
    if message.startswith("no such table"):
        print("Table 'portafolio' does not exist")
        exists = False
    else:
        raise
Note that here we have to check which kind of OperationalError it is, and we have to resort to this "not pretty" substring check on the message because there is currently no way to get the actual error code.
Or, a more SQLite specific approach:
table_name = "portafolio"
cursor.execute("""
    SELECT name
    FROM sqlite_master
    WHERE type='table' AND name=?;
""", (table_name,))
exists = bool(cursor.fetchone())
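If this check is needed in more than one place, it could be wrapped in a small helper. A minimal sketch, where table_exists is a hypothetical name rather than anything from the sqlite3 module:
import sqlite3

def table_exists(cursor, table_name):
    # query sqlite_master for a table with the given name
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?;",
        (table_name,),
    )
    return cursor.fetchone() is not None

db = sqlite3.connect("portfolio.db")  # hypothetical database file
if not table_exists(db.cursor(), "portafolio"):
    print("Table 'portafolio' does not exist yet")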

If you are looking for a neat way to handle database operations, I advise you to learn more about ORMs (Object-Relational Mapping).
I use Flask with SQLAlchemy. You can use Python classes to manage SQL operations like this:
class Portafolio(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    stock = db.Column(db.String(255), unique=True)
    shares = db.Column(db.Integer, unique=True)
It does all the database checks and migration easily.
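For the original question, the practical upshot is that Flask-SQLAlchemy can do the whole "create the table if it doesn't exist" step for you via db.create_all(). A minimal sketch, assuming a plain Flask app and a hypothetical database file name:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///portfolio.db"  # hypothetical path
db = SQLAlchemy(app)

class Portafolio(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    stock = db.Column(db.Text)
    shares = db.Column(db.Integer)
    price = db.Column(db.Float)
    date = db.Column(db.Text)

with app.app_context():
    db.create_all()  # creates missing tables only; existing tables are left untouched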

Related

Preventing duplicate child entries in an ORM relationship

Basically I have a service that reads from a spreadsheet and inserts into a database.
In SQLAlchemy I have the following relationship:
class Customer(Base):
    __tablename__ = 'customers'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    emails = relationship('Email')

class Email(Base):
    __tablename__ = 'emails'
    id = Column(Integer, primary_key=True)
    customer = Column(Integer, ForeignKey('customers.id'))
    email = Column(String)
    primary = Column(Boolean)
Is it possible for SQLAlchemy to check for a duplicate entry between a fetched resource and one created in the ORM?
For example let's say customer 123 has an email some_email, and we try to add it again:
email_object = Email(customer=123, email='some_email', primary=True)
cust = (
    connection.query(Customer)
    .options(joinedload(Customer.emails))
    .filter_by(id=123)
    .first()
)
cust.emails.append(email_object)
Ideally I would like SQLAlchemy to either notice that such a combination exists and merge/ignore it, or throw some kind of exception.
But instead I'm getting the following result if I print out cust.emails
[<Email(id=1, email='some_email', primary=True, customer=123)>,
 <Email(customer=192071, email='some_email', primary=True, customers=<Employee(id=123, name='John', emails=['some_email', 'some_email'])>)>]
and doing a merge and commit just seems to add an extra identical row in the database (except for the pk).
I think maybe it has to do with the unused primary key in Emails, but that is autogenerated when committing to the DB.
Any ideas?
Let me know if I need to clarify anything.
Setting the Email class to have two primary keys doesn't seem to stop SQLAlchemy from appending the extra email.
That's correct. Using a composite primary key on (customer_id, email) does not prevent SQLAlchemy from trying to insert a new object that duplicates an existing email, although it will warn you if an object with the same primary key already exists in the identity map. The INSERT will then fail with an exception and be rolled back because of the duplicate PK, which does keep the duplicate child record out of the database.
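For reference, the composite-key mapping described above might look like the sketch below; the customer_id column name is inferred from the session.get() call and printout later in this answer, and Base is the declarative base from the question:
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String

class Email(Base):
    __tablename__ = 'emails'
    # composite primary key: at most one row per (customer, address) pair
    customer_id = Column(Integer, ForeignKey('customers.id'), primary_key=True)
    email = Column(String, primary_key=True)
    primary = Column(Boolean)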
If you want to check whether an email exists before trying to add it, you can either use session.get() …
with Session(engine) as session:
    # retrieve John's object
    john = (
        session.execute(select(Customer).where(Customer.name == "John"))
        .scalars()
        .one()
    )
    print(john)  # <Customer(id=123, name='John')>

    # check if email already exists using .get()
    email = session.get(Email, (john.id, "some_email"))
    if email:
        print(f"email already exists: {email}")
        # email already exists: <Email(customer_id=123, email='some_email')>
    else:
        print("email does not already exist")
… or a relationship in Customer could provide the existing emails, allowing you to search for the one you want to add
# alternative (less efficient) method: check via relationship
e_list = [e for e in john.emails if e.email == "some_email"]
if e_list:  # list not empty
    print("email already exists")
else:
    print("email does not already exist")

Look for existing entry in database before trying to insert

I am currently working with Access 2013. I have built a database that revolves around applicants submitting for a job. The database is set up so that a person can apply for many different jobs; when the same person applies for a job through our website (uses JotForms), it automatically updates the database.
I have a Python script that pulls each applicant's submission information from an email and updates the database. The problem I am running into is that within the database the applicant's primary email is set to "no duplicates", so the same person cannot apply for several different jobs: the Python script tries to create a new record in the database, which causes an error.
Within my Access form (VBA) or in Python, what do I need to write to tell my database that if the primary emails are the same, it should only create a new record in the position-applied-for table that is related to the person's primary email?
Tables:

tblPerson_Information        tblPosition_Applied_for
---------------------        -----------------------
Personal_ID (PK)             Position_ID
First_Name                   Position_Personal_ID (FK)
Last_Name                    Date_of_Submission
Clearance_Type
Primary_Phone
Primary_email
Education_Level
Simply look up the email address in the [tblPerson_Information] table:
primary_email = 'gord@example.com'  # test data
crsr = conn.cursor()
sql = """\
SELECT Personal_ID FROM tblPerson_Information WHERE Primary_email=?
"""
crsr.execute(sql, (primary_email,))
row = crsr.fetchone()
if row is not None:
    personal_id = row[0]
    print('Email found: tblPerson_Information.Personal_ID = {0}'.format(personal_id))
else:
    print('Email not found in tblPerson_Information')
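Building on that lookup, the insert-or-reuse flow could look like the sketch below, using pyodbc; the connection string, the test data, and the SELECT @@IDENTITY call to fetch the newly generated Personal_ID are assumptions, not code from the original answer:
import pyodbc
from datetime import date

# hypothetical connection string and test data
conn = pyodbc.connect(r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
                      r'DBQ=C:\data\applicants.accdb')
crsr = conn.cursor()
primary_email = 'jane@example.com'
first_name, last_name = 'Jane', 'Doe'

crsr.execute("SELECT Personal_ID FROM tblPerson_Information WHERE Primary_email=?",
             (primary_email,))
row = crsr.fetchone()
if row is not None:
    personal_id = row[0]  # person already exists; reuse their ID
else:
    crsr.execute("INSERT INTO tblPerson_Information (First_Name, Last_Name, Primary_email) "
                 "VALUES (?, ?, ?)", (first_name, last_name, primary_email))
    crsr.execute("SELECT @@IDENTITY")  # AutoNumber value of the row just inserted
    personal_id = crsr.fetchone()[0]

# either way, record the application against that person
crsr.execute("INSERT INTO tblPosition_Applied_for (Position_Personal_ID, Date_of_Submission) "
             "VALUES (?, ?)", (personal_id, date.today()))
conn.commit()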

ProgrammingError, Flask with postgres and sqlalchemy

I am trying to get this setup to work. The database is created correctly, but when trying to insert data I get the following error:
On sqlite:
sqlalchemy.exc.OperationalError
OperationalError: (sqlite3.OperationalError) no such column: Author [SQL: u'SELECT count(*) AS count_1 \nFROM (SELECT Author) AS anon_1']
On postgres:
sqlalchemy.exc.ProgrammingError
ProgrammingError: (psycopg2.ProgrammingError) column "author" does not exist
LINE 2: FROM (SELECT Author) AS anon_1
^
[SQL: 'SELECT count(*) AS count_1 \nFROM (SELECT Author) AS anon_1']
edit: Perhaps this has to do with it: I don't understand why it says "anon_1", as I am clearly using credentials?
I have inspected postgres and sqlite and the tables are created correctly. It seems to be an ORM configuration error, as it only seems to happen when inspecting or creating entries. Any suggestion would be welcome!
class Author(CommonColumns):
    __tablename__ = 'author'
    author = Column(String(200))
    author_url = Column(String(2000))
    author_icon = Column(String(2000))
    comment = Column(String(5000))

registerSchema('author')(Author)

SETTINGS = {
    'SQLALCHEMY_TRACK_MODIFICATIONS': True,
    'SQLALCHEMY_DATABASE_URI': 'sqlite:////tmp/test.db',
    # 'SQLALCHEMY_DATABASE_URI': 'postgresql://xxx:xxx@localhost/test',
}

application = Flask(__name__)

# bind SQLAlchemy
db = application.data.driver
Base.metadata.bind = db.engine
db.Model = Base
db.create_all()

if __name__ == "__main__":
    application.run(debug=True)
What is the query you're using to insert data?
I think the error messages may be a bit more opaque than they need to be because you're using Author/author in three very similar contexts:
the Class name
the table name
the column name
For easier debugging, the first thing I'd do is temporarily make each one unique (AuthorClass, author_table, author_column) so you can check which 'Author' is actually being referred to by the error message.
Since you're using the ORM, I suspect the underlying issue is that your insert statement uses Author (the object) when it should actually be using Author.author (the attribute/column name). The SELECT statements are complaining that they can't find the column 'author', but because you use author for both the table and column name, it's unclear what's actually being passed into the SQL statement.
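To illustrate the class-versus-column distinction with plain SQLAlchemy (leaving aside the Eve/registerSchema setup from the question), a minimal self-contained sketch:
from sqlalchemy import Column, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Author(Base):
    __tablename__ = 'author'
    id = Column(Integer, primary_key=True)
    author = Column(String(200))  # column named like the class, as in the question

engine = create_engine('sqlite://')  # in-memory test database
Base.metadata.create_all(engine)

with Session(engine) as session:
    # insert via the mapped class; refer to the column as Author.author
    session.add(Author(author='Jane Doe'))
    session.commit()
    count = session.scalar(select(func.count(Author.id)).where(Author.author == 'Jane Doe'))
    print(count)  # 1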

Bulk inserts with Flask-SQLAlchemy

I'm using Flask-SQLAlchemy to do a rather large bulk insert of 60k rows. I also have a many-to-many relationship on this table, so I can't use db.engine.execute for this. Before inserting, I need to find similar items in the database, and change the insert to an update if a duplicate item is found.
I could do this check beforehand, and then do a bulk insert via db.engine.execute, but I need the primary key of the row upon insertion.
Currently, I am doing a db.session.add() and db.session.commit() on each insert, and I get a measly 3-4 inserts per second.
I ran a profiler to see where the bottleneck is, and it seems that the db.session.commit() is taking 60% of the time.
Is there some way that would allow me to make this operation faster, perhaps by grouping commits, but which would give me primary keys back?
This is what my models looks like:
class Item(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(1024), nullable=True)
    created = db.Column(db.DateTime())
    # 'tags' (the secondary) is the many-to-many association table, defined elsewhere
    tags_relationship = db.relationship(
        'Tag', secondary=tags, backref=db.backref('items', lazy='dynamic'))
    tags = association_proxy('tags_relationship', 'text')

class Tag(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.String(255))
My insert operation is:
for item in items:
    if duplicate:  # pseudocode: an existing similar item was found
        update_existing_item
    else:
        x = Item()
        x.title = "string"
        x.created = datetime.datetime.utcnow()
        for tag in tags:
            if not tag_already_exists:  # pseudocode
                y = Tag()
                y.text = "tagtext"
                x.tags_relationship.append(y)
                db.session.add(y)
                db.session.commit()
            else:
                x.tags_relationship.append(existing_tag)
        db.session.add(x)
        db.session.commit()
Perhaps you should try db.session.flush() to send the data to the server, which means any primary keys will be generated. At the end you can db.session.commit() to actually commit the transaction.
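A minimal sketch of that suggestion, assuming the Item model above; everything is added in one transaction, flushed once so the primary keys come back, and committed once at the end:
items_to_insert = [Item(title='item %d' % i) for i in range(60000)]  # hypothetical data
db.session.add_all(items_to_insert)
db.session.flush()  # emits the INSERTs and populates primary keys, without committing
print(items_to_insert[0].id)  # the PK is now available
db.session.commit()  # one commit for the whole batch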
I use the following code to quickly read the content of a pandas DataFrame into SQLite. Note that it circumvents the ORM features of SQLAlchemy. myClass in this context is a db.Model-derived class that has a __tablename__ assigned to it. As the comment in the snippet mentions, I adapted it from the linked gist.
l = df.to_dict('records')

# bulk save the dictionaries, circumventing the slow ORM interface
# c.f. https://gist.github.com/shrayasr/5df96d5bc287f3a2faa4
connection.engine.execute(
    myClass.__table__.insert(),
    l
)
from app import db

data = [{"attribute": "value"}, {...}, {...}, ... ]
db.engine.execute(YourModel.__table__.insert(), data)

For more information, refer to https://gist.github.com/shrayasr/5df96d5bc287f3a2faa4

sqlalchemy use of inheritance in postgres

In an attempt to learn SQLAlchemy (and Python), I am trying to duplicate an already existing project, but am having trouble figuring out SQLAlchemy and inheritance with Postgres.
Here is an example of what our Postgres database does (obviously, this is simplified):
CREATE TABLE system (
    system_id SERIAL PRIMARY KEY,
    system_name VARCHAR(24) NOT NULL);

CREATE TABLE file_entry (
    file_entry_id SERIAL,
    file_entry_msg VARCHAR(256) NOT NULL,
    file_entry_system_name VARCHAR(24) REFERENCES system(system_name) NOT NULL);

CREATE TABLE ops_file_entry (
    CONSTRAINT ops_file_entry_id_pkey PRIMARY KEY (file_entry_id),
    CONSTRAINT ops_system_name_check CHECK ((file_entry_system_name = 'ops'::bpchar))
) INHERITS (file_entry);

CREATE TABLE eng_file_entry (
    CONSTRAINT eng_file_entry_id_pkey PRIMARY KEY (file_entry_id),
    CONSTRAINT eng_system_name_check CHECK ((file_entry_system_name = 'eng'::bpchar))
) INHERITS (file_entry);

CREATE INDEX ops_file_entry_index ON ops_file_entry USING btree (file_entry_system_id);
CREATE INDEX eng_file_entry_index ON eng_file_entry USING btree (file_entry_system_id);
The inserts would then be done with a trigger, so that rows are properly routed into the child tables. Something like:
CREATE FUNCTION file_entry_insert_trigger() RETURNS "trigger"
AS $$
DECLARE
BEGIN
    IF NEW.file_entry_system_name = 'eng' THEN
        INSERT INTO eng_file_entry(file_entry_id, file_entry_msg, file_entry_type, file_entry_system_name)
        VALUES (NEW.file_entry_id, NEW.file_entry_msg, NEW.file_entry_type, NEW.file_entry_system_name);
    ELSEIF NEW.file_entry_system_name = 'ops' THEN
        INSERT INTO ops_file_entry(file_entry_id, file_entry_msg, file_entry_type, file_entry_system_name)
        VALUES (NEW.file_entry_id, NEW.file_entry_msg, NEW.file_entry_type, NEW.file_entry_system_name);
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
In summary, I have a parent table with a foreign key to another table. Then I have 2 child tables, and the inserts are done based upon a given value. In my example above, if file_entry_system_name is 'ops', the row goes into the ops_file_entry table; 'eng' goes into eng_file_entry. We have hundreds of child tables in our production environment, and considering the amount of data, it really speeds things up, so I would like to keep this same structure. I can query the parent, and as long as I give it the right system_name, it immediately knows which child table to look into.
My desire is to emulate this with SQLAlchemy, but I can't find any examples that go into this much detail. I have looked at the SQL generated by SQLAlchemy in examples, and I can tell it is not doing anything similar on the database side.
The best I can come up with is something like:
class System(_Base):
    __tablename__ = 'system'
    system_id = Column(Integer, Sequence('system_id_seq'), primary_key=True)
    system_name = Column(String(24), nullable=False)

    def __init__(self, name):
        self.system_name = name

class FileEntry(_Base):
    __tablename__ = 'file_entry'
    file_entry_id = Column(Integer, Sequence('file_entry_id_seq'), primary_key=True)
    file_entry_msg = Column(String(256), nullable=False)
    file_entry_system_name = Column(String(24), ForeignKey('system.system_name'), nullable=False)
    __mapper_args__ = {'polymorphic_on': file_entry_system_name}

    def __init__(self, msg, name):
        self.file_entry_msg = msg
        self.file_entry_system_name = name

class ops_file_entry(FileEntry):
    __tablename__ = 'ops_file_entry'
    ops_file_entry_id = Column(None, ForeignKey('file_entry.file_entry_id'), primary_key=True)
    # the identity must match the discriminator value stored in
    # file_entry_system_name, i.e. 'ops' rather than 'ops_file_entry'
    __mapper_args__ = {'polymorphic_identity': 'ops'}
In the end, what am I missing? How do I tell SQLAlchemy that anything inserted into FileEntry with a system name of 'ops' should go into the ops_file_entry table? Is my understanding way off?
Some insight into what I should do would be amazing.
You just create a new instance of ops_file_entry (shouldn't this be OpsFileEntry?), add it to the session, and upon flush one row will be inserted into table file_entry as well as into table ops_file_entry.
You don't need to set the file_entry_system_name attribute, nor the trigger.
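A short sketch of that flow, reusing the asker's classes; the 'ops' discriminator value and the engine variable are assumptions based on the schema above:
from sqlalchemy.orm import Session

with Session(engine) as session:  # engine: connection to the Postgres database
    entry = ops_file_entry(msg='nightly sync finished', name='ops')  # hypothetical values
    session.add(entry)
    session.flush()  # one INSERT into file_entry and one into ops_file_entry
    session.commit()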
I don't really know Python or SQLAlchemy, but I figured I'd give it a shot for old times' sake. ;)
Have you tried basically setting up your own trigger at the application level? Something like this might work:
from sqlalchemy import event

def my_after_insert_listener(mapper, connection, target):
    # route the row according to your constraints
    if target.file_entry_system_name == 'eng':
        pass  # do your child table insert
    elif target.file_entry_system_name == 'ops':
        pass  # do your child table insert

# associate the listener function with FileEntry,
# to execute during the "after_insert" hook
event.listen(FileEntry, 'after_insert', my_after_insert_listener)
I'm not positive, but I think target (or perhaps mapper) should contain the data being inserted.
Events (esp. after_create) and mapper will probably be helpful.
