I'm trying to create a frozen list in Cassandra so that I can use that column as a primary key. I can do that if I run the query manually:

    some_field frozen<list<int>>

but I'm having a hard time figuring out how to do the same thing using cqlengine in Python:

    some_field = columns.List(columns.Integer(), primary_key=True)

How can I accomplish this with cqlengine?
EDIT: The final code snippet looks like this:
    from cassandra.cqlengine import columns
    from cassandra.cqlengine.models import Model

    class MyModel(Model):
        __table_name__ = 'my_table'
        __options__ = {'compaction': {'class': 'DateTieredCompactionStrategy',
                                      'base_time_seconds': 3600,
                                      'max_sstable_age_days': 1}}
        keys = columns.Set(columns.Integer(), primary_key=True)
        columns.BaseCollectionColumn._freeze_db_type(keys)
        time_stamp = columns.DateTime(primary_key=True, clustering_order="DESC")
        ...
As I see in the source code, you need to call the function _freeze_db_type on the instance of the collection type after you have created it - this changes the column's type to frozen<...>.
I was wondering whether it is possible to query documents in MongoDB by computed properties using mongoengine in Python.
Currently, my model looks like this:
    class SnapshotIndicatorKeyValue(db.Document):
        meta = {"collection": "snapshot_indicator_key_values"}

        snapshot_id = db.ObjectIdField(nullable=False)
        indicator_key_id = db.ObjectIdField(nullable=False)
        value = db.FloatField(nullable=False)
        created_at = db.DateTimeField()
        updated_at = db.DateTimeField()

        @property
        def snapshot(self):
            return Snapshot.objects(id=self.snapshot_id).first()

        @property
        def indicator_key(self):
            return IndicatorKey.objects(id=self.indicator_key_id).first()
When I do, for example, SnapshotIndicatorKeyValue.objects().first().snapshot, I can access the snapshot property.
But when I try to query it, it doesn't work. For example:
    SnapshotIndicatorKeyValue.objects(snapshot__date_time__lte=current_date_time)
I get the error `mongoengine.errors.InvalidQueryError: Cannot resolve field "snapshot"`.
Is there any way to get this working with queries?
I need to query SnapshotIndicatorKeyValue based on a property of snapshot.
To query the snapshot data directly through mongoengine, you can reference the related Snapshot object rather than storing snapshot_id in your SnapshotIndicatorKeyValue document definition.
An amended model using a ReferenceField would look like this:

    from mongoengine import Document, ReferenceField

    class Snapshot(Document):
        property_abc = RelevantPropertyHere()  # anything you need

    class SnapshotIndicatorKeyValue(Document):
        snapshot = ReferenceField(Snapshot)
You would then successively save an instance of Snapshot and an instance of SnapshotIndicatorKeyValue like this:

    sample_snapshot = Snapshot(property_abc=relevant_value_here)  # anything you need
    sample_snapshot.save()

    sample_indicatorkeyvalue = SnapshotIndicatorKeyValue()
    sample_indicatorkeyvalue.snapshot = sample_snapshot
    sample_indicatorkeyvalue.save()
You can then refer to any of the snapshot's properties through:

    SnapshotIndicatorKeyValue.objects.first().snapshot.property_abc
I am using Flask, Python and SQLAlchemy to connect to a huge DB where a lot of stats are saved. I need to create some useful insights from these stats, so I only need to read the data, never modify it.
The issue I have now is the following:
Before I can access a table, I need to replicate it in my models file. For example, I see the table Login_Data in the DB, so I go into my models file and recreate the exact same table:
    class Login_Data(Base):
        __tablename__ = 'login_data'

        id = Column(Integer, primary_key=True)
        date = Column(Date, nullable=False)
        new_users = Column(Integer, nullable=True)

        def __init__(self, date=None, new_users=None):
            self.date = date
            self.new_users = new_users

        def get(self, id):
            if self.id == id:
                return self
            else:
                return None

        def __repr__(self):
            return '<%s(%r, %r, %r)>' % (self.__class__.__name__,
                                         self.id, self.date, self.new_users)
I do this because otherwise I can't query it using:

    some_data = Login_Data.query.limit(10)

But this feels unnecessary; there must be a better way. What's the point in recreating the models if they are already defined? What should I use here instead:

    some_data = [SOMETHING HERE SO I DONT NEED TO RECREATE THE TABLE].query.limit(10)
Simple question but I have not found a solution yet.
Thanks to Tryph for pointing to the right sources.
To access the data of an existing DB with SQLAlchemy you need to use automap. In your configuration file, where you load/declare your DB type, use automap_base(). After that you can create your models and use the correct table names of the DB without specifying everything yourself:
    from sqlalchemy.ext.automap import automap_base
    from sqlalchemy.orm import Session
    from sqlalchemy import create_engine

    import stats_config

    Base = automap_base()
    engine = create_engine(stats_config.DB_URI, convert_unicode=True)

    # Reflect the tables; the mapped classes are created with names
    # matching the table names by default.
    Base.prepare(engine, reflect=True)

    LoginData = Base.classes.login_data
    db_session = Session(engine)
After this is done, you can use all the SQLAlchemy functions you already know on:

    some_data = db_session.query(LoginData).limit(10)
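To make the reflection step concrete, here is a self-contained sketch against an in-memory SQLite database (the table and column names are made up to mirror the example; `Base.prepare(autoload_with=...)` is the newer SQLAlchemy spelling of the reflect call):

```python
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

# An in-memory SQLite DB stands in for the existing stats DB.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE login_data (id INTEGER PRIMARY KEY, new_users INTEGER)"))
    conn.execute(text("INSERT INTO login_data (new_users) VALUES (5)"))

# Reflect every table; a table needs a primary key to be mapped.
Base = automap_base()
Base.prepare(autoload_with=engine)

LoginData = Base.classes.login_data  # class name matches the table name
session = Session(engine)
rows = session.query(LoginData).limit(10).all()
print(rows[0].new_users)  # 5
```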
You may be interested by reflection and automap.
Unfortunately, since I have never used either of those features, I cannot tell you more about them. I just know that they allow you to use the database schema without explicitly declaring it in Python.
I have a PostgreSQL table in which I want to update an attribute of type timestamp with time zone (I also tried without time zone, but it does not work). I'm using an SQLAlchemy session for that purpose.
I fetch an existing record and update it with the current timestamp:
    from datetime import datetime  # needed for datetime.now()
    from model import Table
    from dbconf import session

    t = session.query(Table).filter(Table.id == 1).first()
    t.available = datetime.now()
    session.add(t)
    session.commit()
After this command, nothing changes in the database. What am I doing wrong?
I assume you have a model for this table; you could add a new update method to it like this:
    class Table(Base):
        __tablename__ = 'table'

        id = Column(Integer, primary_key=True)
        available = Column(DateTime)
        asd = Column(Unicode(255))

        def update(self, available=None, asd=None):  # etc.
            if available:
                self.available = available
            if asd:
                self.asd = asd
and updating then happens like this:

    import transaction

    with transaction.manager:
        # fetch the object you want to update
        t = session.query(Table).filter(Table.id == 1).first()
        # you can update one or several fields this way
        t.update(available=datetime.now())
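For the original problem, the plain session pattern from the question should already persist the change. Here is a minimal self-contained sketch (SQLite instead of Postgres, and a made-up Record model) showing that mutating an attribute on a fetched object and committing is enough, even without session.add():

```python
from datetime import datetime
from sqlalchemy import create_engine, Column, Integer, DateTime
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Record(Base):  # hypothetical stand-in for the questioner's model
    __tablename__ = "record"
    id = Column(Integer, primary_key=True)
    available = Column(DateTime)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

session = Session(engine)
session.add(Record(id=1))
session.commit()

# Fetched objects are tracked by the session: changing an attribute and
# committing persists the change; session.add() is not required here.
t = session.query(Record).filter(Record.id == 1).first()
t.available = datetime.now()
session.commit()

print(session.get(Record, 1).available is not None)  # True
```

If nothing changes in your setup, the things to check are whether the session in dbconf is actually bound to the same database you are inspecting and whether an outer transaction is being rolled back.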
Using Python/Flask/SQLAlchemy/Heroku.
Want to store dictionaries of objects as properties of an object:
TO CLARIFY
    class SoccerPlayer(db.Model):
        name = db.Column(db.String(80))
        goals_scored = db.Column(db.Integer())
^ How can I set name and goals_scored as one dictionary?
UPDATE: The user will input the name and goals_scored if that makes any difference.
Also, I am searching online for an appropriate answer, but as a noob I haven't been able to understand or implement the things I find on Google for my Flask web app.
I would second the approach provided by Sean: following it, you get a properly normalized DB schema and can more easily let the RDBMS do the hard work for you. If, however, you insist on using a dictionary-like structure inside your DB, I'd suggest trying out the hstore data type, which allows you to store key/value pairs as a single value in Postgres. I'm not sure whether the hstore extension is created by default in the Postgres DBs provided by Heroku; you can check that by typing the \dx command inside psql. If no line mentions hstore, you can create the extension by typing CREATE EXTENSION hstore;.

Since hstore support in SQLAlchemy comes with version 0.8, which is not released yet (but hopefully will be in the coming weeks), you need to install SQLAlchemy from its Mercurial repository:

    pip install -e hg+https://bitbucket.org/sqlalchemy/sqlalchemy#egg=SQLAlchemy
Then define your model like this:
    from sqlalchemy.dialects.postgresql import HSTORE
    from sqlalchemy.ext.mutable import MutableDict

    class SoccerPlayer(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        name = db.Column(db.String(80), nullable=False, unique=True)
        stats = db.Column(MutableDict.as_mutable(HSTORE))

    # Note that hstore only allows text for both keys and values
    # (and None for values only).
    p1 = SoccerPlayer(name='foo', stats={'goals_scored': '42'})
    db.session.add(p1)
    db.session.commit()
After that you can do the usual stuff in your queries:
    from sqlalchemy import func, cast

    q = db.session.query(
        SoccerPlayer.name,
        func.max(cast(SoccerPlayer.stats['goals_scored'], db.Integer))
    ).group_by(SoccerPlayer.name).first()
Check out the HSTORE docs for more examples.
If you are storing such information in a database I would recommend another approach:
    class SoccerPlayer(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        name = db.Column(db.String(80))
        # ForeignKey targets are table names (lowercase by default
        # in Flask-SQLAlchemy), not class names.
        team_id = db.Column(db.Integer, db.ForeignKey('team.id'))
        stats = db.relationship("Stats", uselist=False, backref="player")

    class Team(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        name = db.Column(db.String(80))
        players = db.relationship("SoccerPlayer")

    class Stats(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        player_id = db.Column(db.Integer, db.ForeignKey('soccer_player.id'))
        goals_scored = db.Column(db.Integer)
        assists = db.Column(db.Integer)
        # Add more stats as you see fit
With this model setup you can do crazy things like this:
    from sqlalchemy.sql import func
    from sqlalchemy import and_

    # Highest goals_scored per team.
    max_goals_by_team = db.session.query(
            Team.id.label("team_id"),
            func.max(Stats.goals_scored).label("goals_scored")
        ).join(SoccerPlayer, SoccerPlayer.team_id == Team.id
        ).join(Stats, Stats.player_id == SoccerPlayer.id
        ).group_by(Team.id).subquery()

    # Players whose stats match their team's maximum.
    players = db.session.query(
            Team.name.label("team_name"),
            SoccerPlayer.name.label("player_name"),
            max_goals_by_team.c.goals_scored
        ).join(SoccerPlayer, SoccerPlayer.team_id == Team.id
        ).join(Stats, Stats.player_id == SoccerPlayer.id
        ).join(max_goals_by_team,
               and_(SoccerPlayer.team_id == max_goals_by_team.c.team_id,
                    Stats.goals_scored == max_goals_by_team.c.goals_scored))
thus making the database do the hard work of pulling out the players with the highest goals per team, rather than doing it all in Python.
Not even Django (a bigger Python web framework than Flask) supports this by default, but in Django you can install a package that does: it's called jsonfield (https://github.com/bradjasper/django-jsonfield).
What I'm trying to tell you is that not all databases know how to store arbitrary objects, but they do know how to store strings, and Django's jsonfield is actually a string that contains the JSON dump of a dictionary.
So, in short, you can do this in Flask:

    import simplejson

    class SoccerPlayer(db.Model):
        _data = db.Column(db.String(1024))

        @property
        def data(self):
            return simplejson.loads(self._data)

        @data.setter
        def data(self, value):
            self._data = simplejson.dumps(value)
But beware: this way you can only assign the entire dictionary at once:

    player = SoccerPlayer()
    player.data = {'name': 'Popey'}

    print player.data  # works as expected
    {'name': 'Popey'}

    player.data['score'] = '3'
    print player.data
    # does not show the score, because the setter doesn't know how to assign by key
    {'name': 'Popey'}
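A workaround for that caveat is read-modify-reassign: take a copy via the getter, change it, and assign the whole dict back so the setter re-serializes it. A minimal sketch without the database layer (PlayerRecord is a made-up stand-in for the model above, using the stdlib json module):

```python
import json

class PlayerRecord:
    """Stand-in for the SoccerPlayer model above: a JSON string
    behind a dict-valued property, with no database involved."""
    def __init__(self):
        self._data = "{}"

    @property
    def data(self):
        return json.loads(self._data)

    @data.setter
    def data(self, value):
        self._data = json.dumps(value)

player = PlayerRecord()
player.data = {'name': 'Popey'}

d = player.data        # the getter returns a fresh dict each time
d['score'] = '3'
player.data = d        # reassign so the setter serializes the update

print(player.data)  # {'name': 'Popey', 'score': '3'}
```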
I have the Channel class in one file and the Item class in another file in the same module. Because they are in different files, when I define a new Channel I get an error, since Item is not in the same file. How can I solve this problem? If both classes are in the same file, I don't get any error.
ChannelTest.py
    from ItemTest import Item

    metadata = rdb.MetaData()

    channel_items = Table(
        "channel_items",
        metadata,
        Column("channel_id", Integer, ForeignKey("channels.id")),
        Column("item_id", Integer, ForeignKey("items.id"))
    )

    class Channel(rdb.Model):
        """ Set up channels table in the database """
        rdb.metadata(metadata)
        rdb.tablename("channels")

        id = Column("id", Integer, primary_key=True)
        title = Column("title", String(100))
        items = relation("Item",
                         secondary=channel_items, backref="channels")
Item.py (a different file, but in the same module):
    class Item(rdb.Model):
        """ Set up items table in the database """
        rdb.metadata(metadata)
        rdb.tablename("items")

        id = Column("id", Integer, primary_key=True)
        title = Column("title", String(100))
Thanks in advance!
The error is: "NoReferencedTableError: Could not find table 'items' with which to generate a foreign key"
All your table definitions should share the same MetaData object.
So you should do metadata = rdb.MetaData() in a separate module and then use that single metadata instance in ALL your Table() definitions.
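A runnable sketch of that advice using plain SQLAlchemy Core (SQLite in memory; in a real project the shared metadata line would live in its own module and be imported by both files):

```python
from sqlalchemy import (MetaData, Table, Column, Integer, String,
                        ForeignKey, create_engine)

# The single shared MetaData instance; every table definition uses this one.
metadata = MetaData()

items = Table(
    "items", metadata,
    Column("id", Integer, primary_key=True),
    Column("title", String(100)),
)

channels = Table(
    "channels", metadata,
    Column("id", Integer, primary_key=True),
    Column("title", String(100)),
)

channel_items = Table(
    "channel_items", metadata,
    Column("channel_id", Integer, ForeignKey("channels.id")),
    Column("item_id", Integer, ForeignKey("items.id")),
)

# Because every table is registered on the same MetaData, the
# ForeignKey("items.id") lookup resolves and no NoReferencedTableError occurs.
engine = create_engine("sqlite://")
metadata.create_all(engine)
print(sorted(metadata.tables))  # ['channel_items', 'channels', 'items']
```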
The string method should work, but if it doesn't, there is also the option of simply importing the module.
And if that gives you import loops, you can still add the property after the class has been defined, like this:
    import item

    Channel.items = relation(item.Item,
                             secondary=item.channel_items,
                             backref='channels')