Using Python/Flask/SQLAlchemy/Heroku.
Want to store dictionaries of objects as properties of an object:
To clarify:
class SoccerPlayer(db.Model):
    name = db.Column(db.String(80))
    goals_scored = db.Column(db.Integer())
How can I store name and goals_scored together as one dictionary?
UPDATE: The user will input the name and goals_scored if that makes any difference.
Also, I am searching online for an appropriate answer, but as a noob, I haven't been able to understand/implement the stuff I find on Google for my Flask web app.
I would second the approach provided by Sean: following it, you get a properly
normalized DB schema and can more easily let the RDBMS do the hard work for you. If,
however, you insist on using a dictionary-like structure inside your DB, I'd
suggest trying the hstore
data type, which allows you to store key/value pairs as a single value in
Postgres. I'm not sure whether the hstore extension is created by default in the
Postgres DBs provided by Heroku; you can check by typing the \dx command inside
psql. If there are no lines with hstore in them, you can create it by
typing CREATE EXTENSION hstore;.
Since hstore support in SQLAlchemy is available in version 0.8 which is not
released yet (but hopefully will be in coming weeks), you need to install it
from its Mercurial repository:
pip install -e hg+https://bitbucket.org/sqlalchemy/sqlalchemy#egg=SQLAlchemy
Then define your model like this:
from sqlalchemy.dialects.postgresql import HSTORE
from sqlalchemy.ext.mutable import MutableDict
class SoccerPlayer(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), nullable=False, unique=True)
    stats = db.Column(MutableDict.as_mutable(HSTORE))

# Note that hstore only allows text for both keys and values (and None for
# values only).
p1 = SoccerPlayer(name='foo', stats={'goals_scored': '42'})
db.session.add(p1)
db.session.commit()
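Because hstore accepts only text, numeric stats have to be converted on the way in and back again on the way out. A minimal sketch of that conversion in plain Python; the helper names to_hstore and int_value are made up for illustration and are not part of SQLAlchemy:

```python
def to_hstore(stats):
    """Return a copy with text keys and text (or None) values."""
    return {str(k): (None if v is None else str(v)) for k, v in stats.items()}


def int_value(stats, key, default=0):
    """Read a text value back as an int, treating a missing or None value as default."""
    raw = stats.get(key)
    return default if raw is None else int(raw)


raw_stats = {'goals_scored': 42, 'assists': None}
stored = to_hstore(raw_stats)             # what would be assigned to the stats column
goals = int_value(stored, 'goals_scored')  # converted back for use in Python
```

The dictionary returned by to_hstore is what you would pass as stats= when creating the player.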
After that you can do the usual stuff in your queries:
from sqlalchemy import func, cast
q = db.session.query(
    SoccerPlayer.name,
    func.max(cast(SoccerPlayer.stats['goals_scored'], db.Integer))
).group_by(SoccerPlayer.name).first()
Check out the HSTORE docs for more examples.
If you are storing such information in a database I would recommend another approach:
class SoccerPlayer(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80))
    # Flask-SQLAlchemy derives snake_case table names ('team', 'soccer_player')
    team_id = db.Column(db.Integer, db.ForeignKey('team.id'))
    stats = db.relationship("Stats", uselist=False, backref="player")

class Team(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80))
    players = db.relationship("SoccerPlayer")

class Stats(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    player_id = db.Column(db.Integer, db.ForeignKey('soccer_player.id'))
    goals_scored = db.Column(db.Integer)
    assists = db.Column(db.Integer)
    # Add more stats as you see fit
With this model setup you can do crazy things like this:
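from sqlalchemy import and_
from sqlalchemy.sql import func

# Highest goal count per team.
max_goals_by_team = db.session.query(
        Team.id,
        func.max(Stats.goals_scored).label("goals_scored")
    ). \
    join(SoccerPlayer, SoccerPlayer.team_id == Team.id). \
    join(Stats, Stats.player_id == SoccerPlayer.id). \
    group_by(Team.id).subquery()

# Players whose goal count equals their team's maximum.
players = db.session.query(
        Team.name.label("Team Name"),
        SoccerPlayer.name.label("Player Name"),
        max_goals_by_team.c.goals_scored
    ). \
    select_from(SoccerPlayer). \
    join(Stats, Stats.player_id == SoccerPlayer.id). \
    join(Team, Team.id == SoccerPlayer.team_id). \
    join(max_goals_by_team, and_(
        SoccerPlayer.team_id == max_goals_by_team.c.id,
        Stats.goals_scored == max_goals_by_team.c.goals_scored))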
thus making the database do the hard work of pulling out the players with the highest goals per team, rather than doing it all in Python.
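To see what the database is being asked to do, here is a rough equivalent written as raw SQL against an in-memory SQLite database using the standard-library sqlite3 module instead of SQLAlchemy. The table and column names mirror the models above; the teams, players and goal counts are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE soccer_player (id INTEGER PRIMARY KEY, name TEXT,
                            team_id INTEGER REFERENCES team(id));
CREATE TABLE stats (id INTEGER PRIMARY KEY,
                    player_id INTEGER REFERENCES soccer_player(id),
                    goals_scored INTEGER);
INSERT INTO team VALUES (1, 'Reds'), (2, 'Blues');
INSERT INTO soccer_player VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2);
INSERT INTO stats VALUES (1, 1, 7), (2, 2, 3), (3, 3, 5);
""")

# The inner SELECT finds the best goal count per team; the outer query
# joins players back onto it to recover who scored that many.
top_scorers = conn.execute("""
SELECT t.name, p.name, s.goals_scored
FROM soccer_player p
JOIN stats s ON s.player_id = p.id
JOIN team t ON t.id = p.team_id
JOIN (SELECT p2.team_id AS team_id, MAX(s2.goals_scored) AS goals_scored
      FROM soccer_player p2
      JOIN stats s2 ON s2.player_id = p2.id
      GROUP BY p2.team_id) m
  ON m.team_id = p.team_id AND m.goals_scored = s.goals_scored
ORDER BY t.name
""").fetchall()
```

With a normalized schema this stays a single query no matter how many players there are, which is the point of the recommendation.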
Not even Django (a bigger Python web framework than Flask) supports this by default. But in Django you can install it; it's called a jsonfield (https://github.com/bradjasper/django-jsonfield).
What I'm trying to tell you is that not all databases know how to store binaries, but they do know how to store strings, and jsonfield for Django is actually a string that contains the JSON dump of a dictionary.
So, in short, you can do this in Flask:
import simplejson

class SoccerPlayer(db.Model):
    _data = db.Column(db.String(1024))

    @property
    def data(self):
        return simplejson.loads(self._data)

    @data.setter
    def data(self, value):
        self._data = simplejson.dumps(value)
But beware, this way you can only assign the entire dictionary at once:
player = SoccerPlayer()
player.data = {'name': 'Popey'}
print(player.data)  # Works as expected
# {'name': 'Popey'}
player.data['score'] = '3'
print(player.data)
# Does not show the score, because the setter is not invoked when assigning by key
# {'name': 'Popey'}
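One way around this limitation is to read the dictionary, change it, and assign the whole thing back so that the setter runs. A minimal sketch, using the standard json module and a plain class in place of the SQLAlchemy model (the class name JsonBackedPlayer is made up for illustration):

```python
import json


class JsonBackedPlayer:
    """Plain-Python stand-in for the model; _data plays the role of the string column."""

    def __init__(self):
        self._data = json.dumps({})

    @property
    def data(self):
        # Every read decodes a fresh dict, so in-place changes to it are not stored.
        return json.loads(self._data)

    @data.setter
    def data(self, value):
        self._data = json.dumps(value)


player = JsonBackedPlayer()
player.data = {'name': 'Popey'}

player.data['score'] = '3'   # mutates a temporary copy only
lost = player.data           # the score is missing here

current = player.data        # read ...
current['score'] = '3'       # ... modify ...
player.data = current        # ... and assign back through the setter
kept = player.data           # the score is present here
```

The same read, modify, assign pattern applies to the model above, since its setter is also the only place where the string column gets rewritten.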
I'm trying to create a frozen list in Cassandra so that I can use that column as a primary key. I can do that if I run the query manually,
some_field frozen <list<int>>
but I'm having a hard time figuring out how to do it using cqlengine in Python,
some_field = columns.List(columns.Integer(), primary_key=True)
How can I accomplish the same thing using cqlengine?
EDIT: Final code snippet looks like this,
from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model
class MyModel(Model):
    __table_name__ = 'my_table'
    __options__ = {'compaction': {'class': 'DateTieredCompactionStrategy',
                                  'base_time_seconds': 3600,
                                  'max_sstable_age_days': 1}}
    keys = columns.Set(columns.Integer(), primary_key=True)
    columns.BaseCollectionColumn._freeze_db_type(keys)
    time_stamp = columns.DateTime(primary_key=True, clustering_order="DESC")
    ...
As far as I can see in the source code, you need to call the function _freeze_db_type on the instance of the collection type after you have created it; this will change the type to frozen<...>.
I am using a Flask API as the REST endpoint for my Angular application. Currently I am testing the API. I tested my /users endpoint to make sure I got all the users.
# importing db, app, models, schema etc.
from flask import jsonify, request

@app.route('/users')
def get_users():
    # fetching from database
    users_objects = User.query.all()
    # transforming into JSON-serializable objects
    users_schema = UserSchema(many=True)
    result = users_schema.dump(users_objects)
    # serializing as JSON
    return jsonify(result.data)
That worked. However, now that I am trying to get other data (which has more than 9000 objects), it doesn't work when I try querying all of them. I first just grabbed the first item:
@app.route('/aggregated-measurements')
def get_aggregated_measurements():
    aggregated_measurements_objects = AggregatedMeasurement.query.first()
    # transforming into JSON-serializable objects
    aggregated_measurement_schema = AggregatedMeasurementSchema()
    result = aggregated_measurement_schema.dump(aggregated_measurements_objects)
    return jsonify(result.data)
That showed me the first AggregatedMeasurement. However, when I try to query all of them with aggregated_measurements_objects = AggregatedMeasurement.query.all(), nothing displays. I did the same thing in my Jupyter notebook and that displayed them. I then thought that maybe this was too much info, so I tried to limit the query like this: aggregated_measurements_objects = AggregatedMeasurement.query.all()[:5]. That works in the Jupyter notebook, but displays nothing when I hit the route.
I don't understand why I can see all of the users when I hit the /users endpoint, but get nothing when I try to do the same for aggregated-measurements (even when I limit the query). I am using flask_sqlalchemy with a SQLite db.
Update with model and schema:
from datetime import datetime
# ... import db
import pandas as pd
from marshmallow import Schema, fields
class AggregatedMeasurement(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    created = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)
    time = db.Column(db.DateTime, nullable=False)
    speed = db.Column(db.Float, nullable=False)
    direction = db.Column(db.Float, nullable=False)
    # related fields
    point_id = db.Column(db.Integer, db.ForeignKey('point.id'), nullable=False)
    point = db.relationship('Point', backref=db.backref('aggregated_measurements', lazy=True))

class AggregatedMeasurementSchema(Schema):
    id = fields.Int(dump_only=True)
    time = fields.DateTime()
    speed = fields.Number()
    direction = fields.Number()
    point_id = fields.Number()
Second update: found the error.
After verifying that it was indeed hitting the db (thank you @gbozee), I noticed that on the /aggregated-measurements route I had created the schema for just one object. I forgot to include many=True like I did in users_schema. That is why a single object appeared, but nothing did when I tried more than one. I was using marshmallow (an object serialization package).
I am using Flask, Python, and SQLAlchemy to connect to a huge DB where a lot of stats are saved. I need to create some useful insights from these stats, so I only need to read the data and never modify it.
The issue I have now is the following:
Before I can access a table I need to replicate the table in my models file. For example, I see the table Login_Data in the DB, so I go into my models and recreate the exact same table.
class Login_Data(Base):
    __tablename__ = 'login_data'
    id = Column(Integer, primary_key=True)
    date = Column(Date, nullable=False)
    new_users = Column(Integer, nullable=True)

    def __init__(self, date=None, new_users=None):
        self.date = date
        self.new_users = new_users

    def get(self, id):
        if self.id == id:
            return self
        else:
            return None

    def __repr__(self):
        return '<%s(%r, %r, %r)>' % (self.__class__.__name__, self.id, self.date, self.new_users)
I do this because otherwise I can't query it using:
some_data = Login_Data.query.limit(10)
But this feels unnecessary; there must be a better way. What's the point in recreating the models if the tables are already defined in the DB? What should I use here:
some_data = [SOMETHING HERE SO I DONT NEED TO RECREATE THE TABLE].query.limit(10)
It's a simple question, but I have not found a solution yet.
Thanks to Tryph for the right sources.
To access the data of an existing DB with SQLAlchemy you need to use automap. In the configuration file where you load/declare your DB type, use automap_base(). After that you can create your models and use the actual table names of the DB without specifying everything yourself:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
import stats_config
Base = automap_base()
engine = create_engine(stats_config.DB_URI, convert_unicode=True)
# reflect the tables
Base.prepare(engine, reflect=True)
# mapped classes are now created with names by default
# matching that of the table name.
LoginData = Base.classes.login_data
db_session = Session(engine)
After this is done you can now use all your known sqlalchemy functions on:
some_data = db_session.query(LoginData).limit(10)
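For anyone who wants to try automap without access to the real stats DB, here is a self-contained sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer, where reflection is requested with Base.prepare(autoload_with=engine) rather than the older reflect=True spelling used above; the table contents are made-up sample data.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

# In-memory SQLite stands in for the existing database (assumption for the demo).
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE login_data ("
        "id INTEGER PRIMARY KEY, date DATE NOT NULL, new_users INTEGER)"
    ))
    conn.execute(text(
        "INSERT INTO login_data (date, new_users) VALUES ('2020-01-01', 5)"
    ))

# Reflect the schema; automap only maps tables that have a primary key.
Base = automap_base()
Base.prepare(autoload_with=engine)
LoginData = Base.classes.login_data

with Session(engine) as session:
    rows = session.query(LoginData).limit(10).all()
    new_users = [row.new_users for row in rows]
```

No column is declared by hand here; the attributes on LoginData come entirely from the reflected table.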
You may be interested in reflection and automap.
Unfortunately, since I have never used either of those features, I am not able to tell you more about them. I just know that they allow you to use the database schema without explicitly declaring it in Python.
I have an existing, working Flask app that uses SQLAlchemy. Several of the models/tables in this app have columns that store raw HTML, and I'd like to inject a function on a column's setter so that the incoming raw html gets 'cleansed'. I want to do this in the model so I don't have to sprinkle "clean this data" all through the form or route code.
I can currently already do this like so:
from application import db, clean_the_data
from sqlalchemy.ext.hybrid import hybrid_property
class Example(db.Model):
    __tablename__ = 'example'

    normal_column = db.Column(db.Integer,
                              primary_key=True,
                              autoincrement=True)
    _html_column = db.Column('html_column', db.Text,
                             nullable=False)

    @hybrid_property
    def html_column(self):
        return self._html_column

    @html_column.setter
    def html_column(self, value):
        self._html_column = clean_the_data(value)
This works like a charm - except for the model definition the _html_column name is never seen, the cleaner function is called, and the cleaned data is used. Hooray.
I could of course stop there and just eat the ugly handling of the columns, but why do that when you can mess with metaclasses?
Note: the following all assumes that 'application' is the main Flask module, and that it contains two children: 'db' - the SQLAlchemy handle and 'clean_the_data', the function to clean up the incoming HTML.
So, I went about trying to make a new base Model class that spotted a column that needs cleaning when the class is being created, and juggled things around automatically, so that instead of the above code, you could do something like this:
from application import db
class Example(db.Model):
    __tablename__ = 'example'
    __html_columns__ = ['html_column']  # Our oh-so-subtle hint

    normal_column = db.Column(db.Integer,
                              primary_key=True,
                              autoincrement=True)
    html_column = db.Column(db.Text,
                            nullable=False)
Of course, the combination of trickery with metaclasses going on behind the scenes with SQLAlchemy and Flask made this less than straight-forward (and is also why the nearly matching question "Custom metaclass to create hybrid properties in SQLAlchemy" doesn't quite help - Flask gets in the way too). I've almost gotten there with the following in application/models/__init__.py:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property
# Yes, I'm importing _X stuff...I tried other ways to avoid this
# but to no avail
from flask_sqlalchemy import (Model as BaseModel,
                              _BoundDeclarativeMeta,
                              _QueryProperty)
from application import db, clean_the_data

class _HTMLBoundDeclarativeMeta(_BoundDeclarativeMeta):
    def __new__(cls, name, bases, d):
        # Move any fields named in __html_columns__ to a
        # _field/field pair with a hybrid_property
        if '__html_columns__' in d:
            for field in d['__html_columns__']:
                if field not in d:
                    continue
                hidden = '_' + field
                fget = lambda self: getattr(self, hidden)
                fset = lambda self, value: setattr(self, hidden,
                                                   clean_the_data(value))
                d[hidden] = d[field]  # clobber...
                d[hidden].name = field  # So we don't have to explicitly
                                        # name the column. Should probably
                                        # force a quote on the name too
                d[field] = hybrid_property(fget, fset)
            del d['__html_columns__']  # Not needed any more
        return _BoundDeclarativeMeta.__new__(cls, name, bases, d)

# The following copied from how flask_sqlalchemy creates its Model
Model = declarative_base(cls=BaseModel, name='Model',
                         metaclass=_HTMLBoundDeclarativeMeta)
Model.query = _QueryProperty(db)

# Need to replace the original Model in flask_sqlalchemy, otherwise it
# uses the old one, while you use the new one, and tables aren't
# shared between them
db.Model = Model
Once that's set, your model class can look like:
from application import db
from application.models import Model
class Example(Model):  # Or db.Model really, since it's been replaced
    __tablename__ = 'example'
    __html_columns__ = ['html_column']  # Our oh-so-subtle hint

    normal_column = db.Column(db.Integer,
                              primary_key=True,
                              autoincrement=True)
    html_column = db.Column(db.Text,
                            nullable=False)
This almost works, in that there are no errors, data is read and saved correctly, etc. However, the setter for the hybrid_property is never called. The getter is (I've confirmed with print statements in both), but the setter is ignored totally and the cleaner function is thus never called. The data is set though: changes are made quite happily with the un-cleaned data.
Obviously I've not quite completely emulated the static version of the code in my dynamic version, but I honestly have no idea where the issue is. As far as I can see, the hybrid_property should be registering the setter just like it has the getter, but it's just not. In the static version, the setter is registered and used just fine.
Any ideas on how to get that final step working?
Maybe use a custom type?
from sqlalchemy import TypeDecorator, Text

class CleanedHtml(TypeDecorator):
    impl = Text

    def process_bind_param(self, value, dialect):
        return clean_the_data(value)

Then you can just write your models this way:
class Example(db.Model):
    __tablename__ = 'example'
    normal_column = db.Column(db.Integer, primary_key=True, autoincrement=True)
    html_column = db.Column(CleanedHtml)
More explanations are available in the documentation here: http://docs.sqlalchemy.org/en/latest/core/custom_types.html#augmenting-existing-types
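To check that the custom type really cleans values on their way into the database, here is a small self-contained sketch using SQLAlchemy Core (1.4 or newer assumed) and in-memory SQLite. The clean_the_data below is only a stand-in that escapes HTML with the standard library; the real function from the question would be imported instead.

```python
import html

from sqlalchemy import Column, Integer, MetaData, Table, Text, create_engine, select
from sqlalchemy.types import TypeDecorator


def clean_the_data(value):
    # Hypothetical stand-in for the question's cleaner: just escape the markup.
    return html.escape(value)


class CleanedHtml(TypeDecorator):
    impl = Text
    cache_ok = True  # the type has no per-instance state, so caching is safe

    def process_bind_param(self, value, dialect):
        # Called for every value bound to this column type before it is sent to the DB.
        return None if value is None else clean_the_data(value)


metadata = MetaData()
example = Table(
    "example", metadata,
    Column("normal_column", Integer, primary_key=True),
    Column("html_column", CleanedHtml, nullable=False),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(example.insert(), {"html_column": "<b>hi</b>"})
    stored = conn.execute(select(example.c.html_column)).scalar_one()
```

Note that the cleaning happens when the statement is executed, not when the attribute is assigned, so an ORM object still holds the raw value until it is flushed and reloaded.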
I have recently decided to start using Pyramid (a Python web framework) for my projects from now on.
I have also decided to use SQLAlchemy, and I want to use raw MySQL (personal reasons) but still keep the ORM features.
The first part of the code in models.py reads:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
Base = declarative_base()
Now, from here, how do I execute a CREATE TABLE query using raw MySQL?
The traditional SQLAlchemy way would be:
class Page(Base):
    __tablename__ = 'pages'
    id = Column(Integer, primary_key=True)
    name = Column(Text, unique=True)
    data = Column(Text)

    def __init__(self, name, data):
        self.name = name
        self.data = data

DBSession.execute('CREATE TABLE ....')
Have a look at sqlalchemy.text() for parametrized queries.
My own biased suggestion would be to use http://pypi.python.org/pypi/khufu_sqlalchemy to set up the SQLAlchemy engine.
Then inside a pyramid view you can do something like:
from khufu_sqlalchemy import dbsession
db = dbsession(request)
db.execute("select * from table where id=:id", {'id':7})
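The :id placeholder style shown above can be tried without Pyramid or a MySQL server, because the standard-library sqlite3 module accepts the same named parameters. A rough sketch under that substitution; the pages table mirrors the question's model and the rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Raw DDL, executed as a plain string.
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, name TEXT UNIQUE, data TEXT)"
)

# Values are passed separately from the SQL text, never formatted into it.
conn.executemany(
    "INSERT INTO pages (id, name, data) VALUES (:id, :name, :data)",
    [
        {"id": 7, "name": "home", "data": "Welcome"},
        {"id": 8, "name": "about", "data": "About us"},
    ],
)

row = conn.execute(
    "SELECT name, data FROM pages WHERE id = :id", {"id": 7}
).fetchone()
```

Keeping the values in a separate dictionary lets the driver handle quoting, which is also what sqlalchemy.text() with bound parameters does.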
Inside views.py, if you are adding form elements, first create an instance of the model.
For your snippet, create it with
pg = Page(name, data)
and add it to the session with
DBSession.add(pg)
where name and data are the form values you want to store.
The final code would be similar to:
name = request.params['name']
data = request.params['data']
pg = Page(name, data)
DBSession.add(pg)