Insertion With SQLAlchemy

So I have a class defined as follows:
class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(80), unique=True)
    email = db.Column(db.String(120), unique=True)
    hashed_password = db.Column(db.String(100))
    friends = db.relationship('User',
                              backref='user', lazy='dynamic')
    def __init__(self, username, email, hashed_password):
        self.username = username
        self.email = email
        self.hashed_password = hashed_password
    def __repr__(self):
        return '<User %r>' % self.username
Essentially, I want each User to have a list of Users (their friends). I can't work out whether this is the right way to model the relationship and, if it is, how I would insert a new User as a friend into another User's list.

You need to look at your problem from the database schema perspective and then write the SQLAlchemy code that generates that schema. This matters because SQLAlchemy (SA) abstracts very little of the schema away, and the ORM is a separate component layered on top of it (this is one of the best aspects of SQLAlchemy). Using the declarative extension does not make this distinction go away.
Specifically, you are declaring a relationship, which is an ORM construct, without any underlying column or foreign-key constraint. What you need to do is:
1. define the schema, with Tables, Columns and Constraints
2. define the relationship that ties the Python classes together
In this case, you want a many-to-many relationship between users, so you'll need a secondary table with (at least) two columns, both foreign keys to the users table. This case is well covered by the SQLAlchemy docs, including the tutorials.
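As a rough sketch of the two steps above, here is a self-referential many-to-many written with plain SQLAlchemy (1.4 or later) rather than Flask-SQLAlchemy, so it runs on its own against an in-memory SQLite database. The table name friendships and its column names user_id and friend_id are assumptions made for illustration; they are not from the original question.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Step 1: the schema. A secondary table with two foreign keys to users.id.
# (The names "friendships", "user_id" and "friend_id" are hypothetical.)
friendships = Table(
    "friendships",
    Base.metadata,
    Column("user_id", Integer, ForeignKey("users.id"), primary_key=True),
    Column("friend_id", Integer, ForeignKey("users.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    username = Column(String(80), unique=True)

    # Step 2: the ORM relationship. Both sides point at the same table, so
    # the join conditions have to be spelled out explicitly.
    friends = relationship(
        "User",
        secondary=friendships,
        primaryjoin=id == friendships.c.user_id,
        secondaryjoin=id == friendships.c.friend_id,
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice = User(username="alice")
    bob = User(username="bob")
    alice.friends.append(bob)  # inserting a friend is a plain list append
    session.add(alice)         # bob is saved too, via the default cascade
    session.commit()
    friend_names = [friend.username for friend in alice.friends]
    reverse_names = [friend.username for friend in bob.friends]
```

Note that the relationship as written is one-directional: bob appears in alice.friends, but alice does not appear in bob.friends unless a second row is added.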

Related

SQLAlchemy bulk create if not exists

I am trying to optimize my code by reducing the calls to the Database. I have the following models:
class PageCategory(Base):
    category_id = Column(Text, ForeignKey('category.category_id'), primary_key=True)
    page_id = Column(Text, ForeignKey('page.page_id'), primary_key=True)
class Category(Base):
    category_id = Column(Text, primary_key=True)
    name = Column(Text, nullable=False)
    pages = relationship('Page', secondary='page_category')
class Page(Base):
    page_id = Column(Text, primary_key=True)
    name = Column(Text, nullable=False)
    categories = relationship('Category', secondary='page_category')
The code receives a stream of Facebook likes, and each one comes with a Page, a Category, and the obvious relation between them, a PageCategory. I need to find a way to bulk create the different Pages, Categories and the relations between them if they do not already exist. Given that the code needs to be fast, I can't afford a round trip to the database for every object I create.
page = Page(page_id='1', name='1')
category = Category(category_id='2', name='2')
session.add(page)
session.add(category)
session.commit()
...same for PageCategory
Now, given that page_id and category_id are primary keys, the database will raise an IntegrityError if we try to insert duplicates, but that is still a round-trip dance. I would need a utility that receives a list of objects, say session.bulk_save_objects([page1, page2, category1, category2, page_category1, page_category2]), but creates only the objects that do not raise an IntegrityError and ignores the ones that do.
This way I would avoid database IO for every triple of objects. I don't know whether this is possible or whether it exceeds SQLAlchemy's capabilities.
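One possible direction, sketched here under assumptions rather than offered as the definitive answer: instead of ORM objects, send the rows through a dialect-specific INSERT that tells the database to skip conflicting keys. The example uses the SQLite dialect's insert() with on_conflict_do_nothing() (available in SQLAlchemy 1.4+, and needing SQLite 3.24+) so that it runs standalone; the PostgreSQL dialect offers a method of the same name. The sample rows are made up, and only the Page table is shown.

```python
from sqlalchemy import Column, Text, create_engine, func, select
from sqlalchemy.dialects.sqlite import insert  # dialect-specific INSERT
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Page(Base):
    __tablename__ = "page"
    page_id = Column(Text, primary_key=True)
    name = Column(Text, nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# Hypothetical batch taken from the stream; page_id "1" appears twice.
rows = [
    {"page_id": "1", "name": "first"},
    {"page_id": "2", "name": "second"},
    {"page_id": "1", "name": "duplicate of first"},
]

# INSERT ... ON CONFLICT (page_id) DO NOTHING: rows whose key already
# exists are skipped by the database instead of raising IntegrityError.
stmt = insert(Page.__table__).on_conflict_do_nothing(index_elements=["page_id"])

with engine.begin() as conn:
    conn.execute(stmt, rows)  # one executemany call for the whole batch
    page_count = conn.execute(
        select(func.count()).select_from(Page.__table__)
    ).scalar_one()
```

The same pattern would be repeated for the category and page_category tables, one statement per table per batch.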

SQLAlchemy Join to retrieve data from multiple tables

I'm trying to retrieve data from multiple tables with SQLAlchemy using the .join() method.
I expected the query to return a single object carrying all the data from the joined tables, so that I could use a.area_name and so on, where area_name is a column on one of the joined tables. Below are the query I am running and the table layout; any insight into how to achieve this would be greatly appreciated. I've been able to use the .join() method with this same syntax to match results and return them, so I assumed it would also return the extra data from the joined rows (perhaps I'm misunderstanding how the method works, or how to retrieve the information via the query object?).
If it helps with troubleshooting, I'm using MySQL as the database.
query:
a = User.query.filter(User.user_id==1).join(
    UserGroup, User.usergroup==UserGroup.group_id
).join(Areas, User.area==Areas.area_id).first()
and the tables:
class User(db.Model):
    user_id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(20), unique=True)
    usergroup = db.Column(db.Integer, db.ForeignKey('user_group.group_id'), nullable=False)
    area = db.Column(db.Integer, db.ForeignKey('areas.area_id'), nullable=False)
class UserGroup(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    group_id = db.Column(db.Integer, nullable=False, unique=True)
    group_name = db.Column(db.String(64), nullable=False, unique=True)
class Areas(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    area_id = db.Column(db.Integer, nullable=False, unique=True)
    area_name = db.Column(db.String(64), nullable=False, unique=True)

So it seems that I need to use a different approach to the query, and that it returns a tuple of objects which I then need to parse.
What worked is:
a = db.session.query(User, UserGroup, Areas
    ).filter(User.user_id==1
    ).join(UserGroup, User.usergroup==UserGroup.group_id
    ).join(Areas, User.area==Areas.area_id
    ).first()
Everything else stays the same. The query returns a tuple in which the User data is a[0], the UserGroup data is a[1], and the Areas data is a[2]. I can then access the group_name column with a[1].group_name, and so on.
Hopefully this helps someone else who's trying to work with this!
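For anyone who wants to try the tuple-returning query in isolation, here is a minimal sketch using plain SQLAlchemy (1.4+) and in-memory SQLite instead of Flask-SQLAlchemy and MySQL. The explicit __tablename__ values and the sample rows (alice, admins, north) are assumptions added so the example is self-contained.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class UserGroup(Base):
    __tablename__ = "user_group"
    id = Column(Integer, primary_key=True)
    group_id = Column(Integer, nullable=False, unique=True)
    group_name = Column(String(64), nullable=False, unique=True)


class Areas(Base):
    __tablename__ = "areas"
    id = Column(Integer, primary_key=True)
    area_id = Column(Integer, nullable=False, unique=True)
    area_name = Column(String(64), nullable=False, unique=True)


class User(Base):
    __tablename__ = "user"
    user_id = Column(Integer, primary_key=True)
    name = Column(String(20), unique=True)
    usergroup = Column(Integer, ForeignKey("user_group.group_id"), nullable=False)
    area = Column(Integer, ForeignKey("areas.area_id"), nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up sample data.
    session.add_all([
        UserGroup(group_id=10, group_name="admins"),
        Areas(area_id=20, area_name="north"),
        User(user_id=1, name="alice", usergroup=10, area=20),
    ])
    session.commit()

    # Listing three entities in query() makes each result row a 3-tuple.
    row = (
        session.query(User, UserGroup, Areas)
        .join(UserGroup, User.usergroup == UserGroup.group_id)
        .join(Areas, User.area == Areas.area_id)
        .filter(User.user_id == 1)
        .first()
    )
    user, group, area = row  # same objects as row[0], row[1], row[2]
    result = (user.name, group.group_name, area.area_name)
```

Unpacking the row into named variables tends to read better than indexing with a[0], a[1], a[2].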

Take a look at SQLAlchemy's relationship function:
http://docs.sqlalchemy.org/en/latest/orm/basic_relationships.html#one-to-many
You may want to add a new attribute to your User class like so:
group = db.relationship('UserGroup', back_populates='users')
This will automagically resolve the one-to-many relationship between UserGroup and User (assuming that a User can only be a member of one UserGroup at a time). You can then simply access the attributes of the UserGroup once you have queried a User (or set of Users) from your database:
a = User.query.filter(...).first()
print(a.group.group_name)
SQLAlchemy resolves the joins for you; you do not need to explicitly join the foreign tables when querying.
The reverse access is also possible: if you just query for a UserGroup, you can access its members directly. This works through the back_populates keyword argument, which requires a matching users relationship declared on UserGroup with back_populates='group'.
g = UserGroup.query.filter(...).first()
for u in g.users:
    print(u.name)
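To make the relationship approach concrete, here is a small standalone sketch with both sides of the back_populates pair declared. It uses plain SQLAlchemy (1.4+) with in-memory SQLite; the __tablename__ values and sample data are assumptions for illustration, and the Areas table is left out for brevity.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class UserGroup(Base):
    __tablename__ = "user_group"
    id = Column(Integer, primary_key=True)
    group_id = Column(Integer, nullable=False, unique=True)
    group_name = Column(String(64), nullable=False, unique=True)
    # "One" side: every user whose usergroup column points at this group.
    users = relationship("User", back_populates="group")


class User(Base):
    __tablename__ = "user"
    user_id = Column(Integer, primary_key=True)
    name = Column(String(20), unique=True)
    usergroup = Column(Integer, ForeignKey("user_group.group_id"), nullable=False)
    # "Many" side: SQLAlchemy derives the join from the foreign key above.
    group = relationship("UserGroup", back_populates="users")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    admins = UserGroup(group_id=10, group_name="admins")
    session.add(User(user_id=1, name="alice", group=admins))
    session.commit()

    user = session.query(User).filter(User.user_id == 1).first()
    group_name = user.group.group_name                 # no explicit join
    member_names = [u.name for u in user.group.users]  # reverse direction
```

Assigning group=admins also fills in the usergroup column at flush time, so the foreign key value never has to be set by hand.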

Custom operations when creating an SQLAlchemy object

Say I have a set of users, a set of games, and I track whether a user has finished a game in a separate table (name: 'game_progress'). I want it to be that whenever a user is created, the 'game_progress' table is auto-populated with her ID and a 'No' against all the available games. (I know that I can wait until she starts a game to create the record, but, I need this for an altogether different purpose.) How would I go about doing this?
I tried using the after_insert() event. But, then I can't retrieve the ID of the User to insert into 'game_progress'. I don't want to use after_flush (even if I can figure out how to do it) because it may be a bit of an overkill, as the user creation operation doesn't happen that often.
class Game(db.Model):
    __tablename__ = 'games'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.Unicode(30))
class User(db.Model):
    __tablename__ = 'users'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.Unicode(30))
class GameProgress(db.Model):
    __tablename__ = 'game_progress'
    user_id = db.Column(db.Integer, db.ForeignKey('users.id'), primary_key=True)
    game_id = db.Column(db.Integer, db.ForeignKey('games.id'), primary_key=True)
    game_finished = db.Column(db.Boolean)
@event.listens_for(User, "after_insert")
def after_insert(mapper, connection, target):
    progress_table = GameProgress.__table__
    user_id = target.id
    connection.execute(
        progress_table.insert().\
            values(user_id=user_id, game_id=1, game_finished=0)
    )
db.create_all()
game = Game(name='Solitaire')
db.session.add(game)
db.session.commit()
user = User(name='Alice')
db.session.add(user)
db.session.commit()

You don't need to do anything fancy with triggers or event listeners at all; you can just set up the relationships and then create the related objects in the constructor for User. As long as you have defined relationships (which you're not doing at present, since you've only added the foreign keys), you don't need User to have an id to set up the associated objects. Your constructor can do something like this:
class User(db.Model):
    def __init__(self, all_games, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)
        for game in all_games:
            self.game_progresses.append(
                GameProgress(game=game, user=self, game_finished=False))
When you commit the user, you'll also commit a list of GameProgress objects, one for each game. The above depends on relationships being set up on all your objects, so add the following to the GameProgress class:
    game = relationship("Game", backref="game_progresses")
    user = relationship("User", backref="game_progresses")
Then pass a list of games to User when you create it:
all_games = dbs.query(Game).all()
new_user = User(all_games=all_games, name="Iain")
Once that's done, you can just add GameProgress objects to the instrumented list user.game_progresses, and nothing needs to be committed before the first commit. SQLA will chase through all the relationships. Basically, any time you need to muck with an id directly, ask yourself whether you're using the ORM right; you rarely need to. The ORM tutorial in the SQLA docs goes through this very well. There are lots of options you can pass to relationships and backrefs to get cascading to do what you want.
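Putting the pieces of this answer together, a standalone sketch might look like the following. It uses plain SQLAlchemy (1.4+) with in-memory SQLite rather than Flask-SQLAlchemy, calls the default declarative constructor via super().__init__ instead of the setattr loop, and relies on the list append alone (rather than also passing user=self) to link each GameProgress to the new user. The second game, Chess, is made up for the example.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, Unicode, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Game(Base):
    __tablename__ = "games"
    id = Column(Integer, primary_key=True)
    name = Column(Unicode(30))


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(Unicode(30))

    def __init__(self, all_games, **kwargs):
        super().__init__(**kwargs)  # default constructor sets name=...
        for game in all_games:
            # Appending sets progress.user through the backref; no id needed.
            self.game_progresses.append(
                GameProgress(game=game, game_finished=False)
            )


class GameProgress(Base):
    __tablename__ = "game_progress"
    user_id = Column(Integer, ForeignKey("users.id"), primary_key=True)
    game_id = Column(Integer, ForeignKey("games.id"), primary_key=True)
    game_finished = Column(Boolean)
    game = relationship("Game", backref="game_progresses")
    user = relationship("User", backref="game_progresses")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Game(name="Solitaire"), Game(name="Chess")])
    session.commit()

    all_games = session.query(Game).all()
    alice = User(all_games=all_games, name="Alice")
    session.add(alice)
    session.commit()  # inserts the user, then one progress row per game

    progress = sorted(
        (p.game.name, p.game_finished) for p in alice.game_progresses
    )
```

The user's id is assigned during the flush and copied into game_progress.user_id by the ORM, which is why the constructor never touches it.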

SQLALCHEMY: There is no unique constraint matching given keys for referenced table

I'm trying to create a relationship between my tables seller and item where each seller can sell any number of items but they can't sell the same item twice. Here's what I have:
sells = db.Table('sells',
    db.Column('seller_email', db.String(), db.ForeignKey('seller.email'), primary_key=True),
    db.Column('item_id', db.Integer, db.ForeignKey('item.id'), primary_key=True)
)
class Item(db.Model):
    __tablename__ = 'item'
    id = db.Column(db.Integer, primary_key=True)
    coverPhoto = db.Column(db.String())
    price = db.Column(db.Integer)
    condition = db.Column(db.Integer)
    title = db.Column(db.String())
    def __init__(self, title, coverPhoto, price, condition):
        self.coverPhoto = coverPhoto
        self.price = price
        self.condition = condition
        self.title = title
    def __repr__(self):
        return '<id {}>'.format(self.id)
class Seller(db.Model):
    __tablename__ = 'seller'
    email = db.Column(db.String(), primary_key=True)
    password = db.Column(db.String())
    firstName = db.Column(db.String())
    lastName = db.Column(db.String())
    location = db.Column(db.String())
    def __init__(self, email, password, firstName, lastName, location):
        self.email = email
        self.password = password
        self.firstName = firstName
        self.lastName = lastName
        self.location = location
    def __repr__(self):
        return "<Seller {email='%s'}>" % (self.email)
And I get the following error:
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) there is no unique constraint matching given keys for referenced table "seller"
[SQL: '\nCREATE TABLE sells (\n\tseller_email VARCHAR NOT NULL, \n\titem_id INTEGER NOT NULL, \n\tPRIMARY KEY (seller_email, item_id), \n\tFOREIGN KEY(item_id) REFERENCES item (id), \n\tFOREIGN KEY(seller_email) REFERENCES seller (email)\n)\n\n']
Both seller.email and item.id are primary keys, so shouldn't they inherently be unique?

You're creating the table sells using db.Table, a SQLAlchemy Core function. Nothing wrong with that. Then you create your other tables by inheriting from db.Model using the SQLAlchemy ORM's declarative syntax. (If you're not familiar with the difference between SQLAlchemy Core and ORM, the tutorial is a good place to start.)
You've got a couple of potential issues here:
You're using db.Model and the SQLAlchemy ORM's declarative syntax. When you do this, your model subclasses don't need an __init__ function, because the default constructor already accepts column names as keyword arguments. A custom __init__ is permitted and should not by itself break table creation, so it is probably not the root cause here, but dropping it removes boilerplate.
I suspect that the root cause here might actually be that you use SQLAlchemy Core to create a table with a foreign key reference to a SQLAlchemy ORM-managed table. Normally, when you call db.metadata.create_all(), SQLAlchemy collects all the table/model mappings, looks at the dependencies and figures out the correct ordering for emitting the CREATE TABLE / ADD CONSTRAINT commands to the database. This under-the-covers dependency resolution is what allows the app programmer to define a table that includes a foreign key to a table that is defined later. When you have a Core-based Table object that references an ORM db.Model-based object, it might prevent the dependency resolution from working correctly during table creation. I'm not 100% confident this is the issue, but it's worth experimenting with making all your tables either Table objects or db.Model subclasses. If you're not sure, I'd suggest the latter.
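If you want to try the "everything as a model class" experiment, a stripped-down sketch could look like this. It uses plain SQLAlchemy (1.4+) with in-memory SQLite rather than Flask-SQLAlchemy and PostgreSQL, so it cannot confirm or rule out the cause of the original error; it only shows the association table written as a declarative class (the class name Sells is an assumption) and the composite primary key rejecting a duplicate sale. Most columns are omitted.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Seller(Base):
    __tablename__ = "seller"
    email = Column(String, primary_key=True)


class Item(Base):
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    title = Column(String)


class Sells(Base):
    # Association table as a model class; the composite primary key is what
    # stops one seller from selling the same item twice.
    __tablename__ = "sells"
    seller_email = Column(String, ForeignKey("seller.email"), primary_key=True)
    item_id = Column(Integer, ForeignKey("item.id"), primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)  # emits CREATE TABLE in dependency order

with Session(engine) as session:
    session.add_all([Seller(email="a@example.com"), Item(id=1, title="Lamp")])
    session.add(Sells(seller_email="a@example.com", item_id=1))
    session.commit()

duplicate_rejected = False
with Session(engine) as session:
    session.add(Sells(seller_email="a@example.com", item_id=1))  # same pair
    try:
        session.commit()
    except IntegrityError:
        session.rollback()
        duplicate_rejected = True
```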

One-to-many relationships SQLAlchemy that depend on each other

I'm trying to get the following models working together. Firstly the scenario is as follows:
A user can have many email addresses, but each email address can only be associated with one user;
Each user can only have one primary email address (think of it like their current email address).
An email address is a user's id, so they must always have one, but when they change it, I want to keep track of the ones they've used in the past. So far the setup is a helper table user_emails that ties an email to a user, which I hear is not supposed to be set up as a class when using the declarative SQLAlchemy approach (though I don't know why). Also, am I right in thinking that I need use_alter=True because the users table won't know the foreign key email_id until it's inserted?
models.py looks like this:
"""models.py"""
user_emails = Table('user_emails', Base.metadata,
    Column('user_id', Integer, ForeignKey('users.id'),
           primary_key=True),
    Column('email', String(50), ForeignKey('emails.address'),
           primary_key=True))
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, Sequence('usr_id_seq', start=100, increment=1),
                primary_key=True)
    email_id = Column(String(50),
                      ForeignKey('emails.address', use_alter=True, name='fk_email_id'),
                      unique=True, nullable=False)
    first = Column(String(25), unique=True, nullable=False)
    last = Column(String(25), unique=True, nullable=False)
    def __init__(self, first, last):
        self.first = first
        self.last = last
class Email(Base):
    __tablename__ = 'emails'
    address = Column(String(50), unique=True, primary_key=True)
    user = relationship(User, secondary=user_emails, backref='emails')
    added = Column(DateTime, nullable=False)
    verified = Column(Boolean, nullable=False)
    def __init__(self, address, added, verified=False):
        self.address = address
        self.added = added
        self.verified = verified
Everything seems OK until I try and commit to the DB:
>>> user = models.User("first", "last")
>>> addy = models.Email("example@example.com", datetime.datetime.utcnow())
>>> addy
<Email 'example@example.com' (verified: False)>
>>> user
<User None (active: True)>
>>> user.email_id = addy
>>> user
<User <Email 'example@example.com' (verified: False)> (active: True)>
>>> Session.add_all([user, addy])
>>> Session.commit()
...
sqlalchemy.exc.ProgrammingError: (ProgrammingError) can't adapt type 'Email' "INSERT INTO users (id, email_id, first, last, active) VALUES (nextval('usr_id_seq'), %(email_id)s, %(first)s, %(last)s, %(active)s) RETURNING users.id" {'last': 'last', 'email_id': <Email 'example@example.com' (verified: False)>, 'active': True, 'first': 'first'}
So, I figure I'm doing something wrong/stupid, but I'm new to SQLAlchemy, so I'm not sure what I need to do to set up the models correctly.
Finally, assuming I get the models set up correctly, is it possible to add a relationship so that, after loading an arbitrary Email object, I can access the user who owns it from an attribute on that object?
Thanks!

You already have a pretty good solution, and a small fix will make your code work. Some quick feedback on your code:
Do you need use_alter=True? No, you do not. If the primary key of the Email table were computed at the database level (as with autoincrement-based primary keys), you might need it when two tables have foreign keys to each other. In your case you do not even have that, because there is a third table, so for any combination of relationships SA (SQLAlchemy) will work out the order: insert new Emails, then Users, then the rows relating them.
What is wrong with your code? You are assigning an instance of Email to User.email_id, which is supposed to hold only the email address value. There are two ways to fix it:
1. Assign the address directly: change the line user.email_id = addy to user.email_id = addy.address
2. Create a relationship and then assign through it (see code below).
Personally, I prefer option 2.
Other things: your current model does not check that User.email_id is actually one of the User.emails. This might be by design; if not, add a composite ForeignKey from [users.id, users.email_id] to [user_emails.user_id, user_emails.email].
Sample code for option 2:
""" models.py """
class User(Base):
    __tablename__ = 'users'
    # ...
    email_id = Column(String(50),
                      ForeignKey('emails.address', use_alter=True,
                                 name='fk_email_id'),
                      unique=True, nullable=False)
    default_email = relationship("Email", backref="default_for_user")
""" script """
# ... (all that you have below until next line)
# user.email_id = addy.address
user.default_email = addy
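Here is a cut-down, standalone version of the relationship-based option, as a sketch only. It uses plain SQLAlchemy (1.4+) with in-memory SQLite and leaves out the user_emails table, the sequence, use_alter and most columns, so it shows just the assignment through default_email and the reverse lookup from an Email to its user via the backref.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Email(Base):
    __tablename__ = "emails"
    address = Column(String(50), primary_key=True)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    first = Column(String(25), nullable=False)
    email_id = Column(String(50), ForeignKey("emails.address"),
                      unique=True, nullable=False)
    # Assign Email objects here; the ORM copies address into email_id.
    default_email = relationship("Email", backref="default_for_user")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    addy = Email(address="example@example.com")
    user = User(first="first", default_email=addy)  # an object, not a string
    session.add(user)  # addy is saved too, and inserted before the user
    session.commit()

    stored_email_id = user.email_id
    # The backref is a list on the Email side (one-to-many by default).
    owner_first = addy.default_for_user[0].first
```

Because email_id is unique, the list on the Email side can hold at most one user; passing backref=backref("default_for_user", uselist=False) would expose it as a single object instead.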

I'm not familiar with Python/SQLAlchemy, but here is one way to represent what you want in the database:
You'd either use deferred constraints (if your DBMS supports them), or leave USER.PRIMARY_EMAIL NULL-able (as shown in the model above) to break the data modification cycle.
Alternatively, you could do something like this:
E-mails belonging to the same user are ordered (note the alternate key on: {USER_ID, ORDER}), and whichever e-mail is on top of that ordering can be considered "primary". The nice thing about this approach is that it completely avoids the circular reference.

The examples in http://docs.sqlalchemy.org/en/latest/orm/tutorial.html are close to what you want. I never use an intermediate join table like your user_emails unless I need a many-to-many relationship. user-to-email should be a one-to-many.
To keep track of old email addresses, add an "obsolete" Boolean attribute to your Email class and filter on it to show current or old email addresses.
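A minimal sketch of that one-to-many layout with an obsolete flag, using plain SQLAlchemy (1.4+) and in-memory SQLite. The column name user_id, the default of False and the sample addresses are assumptions for illustration.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    emails = relationship("Email", back_populates="user")


class Email(Base):
    __tablename__ = "emails"
    address = Column(String(50), primary_key=True)
    # Plain one-to-many: each address belongs to exactly one user.
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
    obsolete = Column(Boolean, nullable=False, default=False)
    user = relationship("User", back_populates="emails")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    user = User(emails=[
        Email(address="old@example.com", obsolete=True),
        Email(address="new@example.com"),  # obsolete defaults to False
    ])
    session.add(user)
    session.commit()

    def addresses(obsolete):
        # Filter this user's addresses on the obsolete flag.
        query = session.query(Email).filter_by(user_id=user.id, obsolete=obsolete)
        return [email.address for email in query]

    current = addresses(obsolete=False)
    previous = addresses(obsolete=True)
```

Here email.user gives the owner of any address directly, which also covers the question about reaching the user from an Email object.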
