How to store a dictionary with SQLAlchemy and SQLite3 [duplicate] - python

Is it possible to create JSON type Column in SQLite with sqlalchemy?
I've tried
import sqlalchemy.types as types
...
myColumn = Column(types.JSON())
and
from sqlalchemy import JSON
...
mycolumn = Column(JSON)
Both get error message:
Compiler can't render element of type
Wondering if there is any solution in sqlalchemy or I should just change into SQL instead. Thanks in advance.
[Updates] SQLite version 3.16.0

My solution is:
import json
from sqlalchemy import TypeDecorator, types
class Json(TypeDecorator):
#property
def python_type(self):
return object
impl = types.String
def process_bind_param(self, value, dialect):
return json.dumps(value)
def process_literal_param(self, value, dialect):
return value
def process_result_value(self, value, dialect):
try:
return json.loads(value)
except (ValueError, TypeError):
return None
...
myColumn = Column("name", Json)
For more information: TypeDecorator

JSON was not added to SQLite until version 3.9. You'll either need to upgrade your SQLite or convert your json to a string and save it as such, while converting it back to a json object when you pull it out.

SQLAlchemy 1.3 includes support for SQLite JSON extension, so don't forget to upgrade:
pip install --user -U SQLAlchemy
The dialect specific type sqlite.JSON implements JSON member access, usable through the base type types.JSON as well.

Related

How can I write a distinct on query in sqlalchemy? [duplicate]

I'd really like to be able to print out valid SQL for my application, including values, rather than bind parameters, but it's not obvious how to do this in SQLAlchemy (by design, I'm fairly sure).
Has anyone solved this problem in a general way?
In the vast majority of cases, the "stringification" of a SQLAlchemy statement or query is as simple as:
print(str(statement))
This applies both to an ORM Query as well as any select() or other statement.
Note: the following detailed answer is being maintained on the sqlalchemy documentation.
To get the statement as compiled to a specific dialect or engine, if the statement itself is not already bound to one you can pass this in to compile():
print(statement.compile(someengine))
or without an engine:
from sqlalchemy.dialects import postgresql
print(statement.compile(dialect=postgresql.dialect()))
When given an ORM Query object, in order to get at the compile() method we only need access the .statement accessor first:
statement = query.statement
print(statement.compile(someengine))
with regards to the original stipulation that bound parameters are to be "inlined" into the final string, the challenge here is that SQLAlchemy normally is not tasked with this, as this is handled appropriately by the Python DBAPI, not to mention bypassing bound parameters is probably the most widely exploited security holes in modern web applications. SQLAlchemy has limited ability to do this stringification in certain circumstances such as that of emitting DDL. In order to access this functionality one can use the 'literal_binds' flag, passed to compile_kwargs:
from sqlalchemy.sql import table, column, select
t = table('t', column('x'))
s = select([t]).where(t.c.x == 5)
print(s.compile(compile_kwargs={"literal_binds": True}))
the above approach has the caveats that it is only supported for basic
types, such as ints and strings, and furthermore if a bindparam
without a pre-set value is used directly, it won't be able to
stringify that either.
To support inline literal rendering for types not supported, implement
a TypeDecorator for the target type which includes a
TypeDecorator.process_literal_param method:
from sqlalchemy import TypeDecorator, Integer
class MyFancyType(TypeDecorator):
impl = Integer
def process_literal_param(self, value, dialect):
return "my_fancy_formatting(%s)" % value
from sqlalchemy import Table, Column, MetaData
tab = Table('mytable', MetaData(), Column('x', MyFancyType()))
print(
tab.select().where(tab.c.x > 5).compile(
compile_kwargs={"literal_binds": True})
)
producing output like:
SELECT mytable.x
FROM mytable
WHERE mytable.x > my_fancy_formatting(5)
Given that what you want makes sense only when debugging, you could start SQLAlchemy with echo=True, to log all SQL queries. For example:
engine = create_engine(
"mysql://scott:tiger#hostname/dbname",
encoding="latin1",
echo=True,
)
This can also be modified for just a single request:
echo=False – if True, the Engine will log all statements as well as a repr() of their parameter lists to the engines logger, which defaults to sys.stdout. The echo attribute of Engine can be modified at any time to turn logging on and off. If set to the string "debug", result rows will be printed to the standard output as well. This flag ultimately controls a Python logger; see Configuring Logging for information on how to configure logging directly.
Source: SQLAlchemy Engine Configuration
If used with Flask, you can simply set
app.config["SQLALCHEMY_ECHO"] = True
to get the same behaviour.
This works in python 2 and 3 and is a bit cleaner than before, but requires SA>=1.0.
from sqlalchemy.engine.default import DefaultDialect
from sqlalchemy.sql.sqltypes import String, DateTime, NullType
# python2/3 compatible.
PY3 = str is not bytes
text = str if PY3 else unicode
int_type = int if PY3 else (int, long)
str_type = str if PY3 else (str, unicode)
class StringLiteral(String):
"""Teach SA how to literalize various things."""
def literal_processor(self, dialect):
super_processor = super(StringLiteral, self).literal_processor(dialect)
def process(value):
if isinstance(value, int_type):
return text(value)
if not isinstance(value, str_type):
value = text(value)
result = super_processor(value)
if isinstance(result, bytes):
result = result.decode(dialect.encoding)
return result
return process
class LiteralDialect(DefaultDialect):
colspecs = {
# prevent various encoding explosions
String: StringLiteral,
# teach SA about how to literalize a datetime
DateTime: StringLiteral,
# don't format py2 long integers to NULL
NullType: StringLiteral,
}
def literalquery(statement):
"""NOTE: This is entirely insecure. DO NOT execute the resulting strings."""
import sqlalchemy.orm
if isinstance(statement, sqlalchemy.orm.Query):
statement = statement.statement
return statement.compile(
dialect=LiteralDialect(),
compile_kwargs={'literal_binds': True},
).string
Demo:
# coding: UTF-8
from datetime import datetime
from decimal import Decimal
from literalquery import literalquery
def test():
from sqlalchemy.sql import table, column, select
mytable = table('mytable', column('mycol'))
values = (
5,
u'snowman: ☃',
b'UTF-8 snowman: \xe2\x98\x83',
datetime.now(),
Decimal('3.14159'),
10 ** 20, # a long integer
)
statement = select([mytable]).where(mytable.c.mycol.in_(values)).limit(1)
print(literalquery(statement))
if __name__ == '__main__':
test()
Gives this output: (tested in python 2.7 and 3.4)
SELECT mytable.mycol
FROM mytable
WHERE mytable.mycol IN (5, 'snowman: ☃', 'UTF-8 snowman: ☃',
'2015-06-24 18:09:29.042517', 3.14159, 100000000000000000000)
LIMIT 1
We can use compile method for this purpose. From the docs:
from sqlalchemy.sql import text
from sqlalchemy.dialects import postgresql
stmt = text("SELECT * FROM users WHERE users.name BETWEEN :x AND :y")
stmt = stmt.bindparams(x="m", y="z")
print(stmt.compile(dialect=postgresql.dialect(),compile_kwargs={"literal_binds": True}))
Result:
SELECT * FROM users WHERE users.name BETWEEN 'm' AND 'z'
Warning from docs:
Never use this technique with string content received from untrusted
input, such as from web forms or other user-input applications.
SQLAlchemy’s facilities to coerce Python values into direct SQL string
values are not secure against untrusted input and do not validate the
type of data being passed. Always use bound parameters when
programmatically invoking non-DDL SQL statements against a relational
database.
So building on #zzzeek's comments on #bukzor's code I came up with this to easily get a "pretty-printable" query:
def prettyprintable(statement, dialect=None, reindent=True):
"""Generate an SQL expression string with bound parameters rendered inline
for the given SQLAlchemy statement. The function can also receive a
`sqlalchemy.orm.Query` object instead of statement.
can
WARNING: Should only be used for debugging. Inlining parameters is not
safe when handling user created data.
"""
import sqlparse
import sqlalchemy.orm
if isinstance(statement, sqlalchemy.orm.Query):
if dialect is None:
dialect = statement.session.get_bind().dialect
statement = statement.statement
compiled = statement.compile(dialect=dialect,
compile_kwargs={'literal_binds': True})
return sqlparse.format(str(compiled), reindent=reindent)
I personally have a hard time reading code which is not indented so I've used sqlparse to reindent the SQL. It can be installed with pip install sqlparse.
This code is based on brilliant existing answer from #bukzor. I just added custom render for datetime.datetime type into Oracle's TO_DATE().
Feel free to update code to suit your database:
import decimal
import datetime
def printquery(statement, bind=None):
"""
print a query, with values filled in
for debugging purposes *only*
for security, you should always separate queries from their values
please also note that this function is quite slow
"""
import sqlalchemy.orm
if isinstance(statement, sqlalchemy.orm.Query):
if bind is None:
bind = statement.session.get_bind(
statement._mapper_zero_or_none()
)
statement = statement.statement
elif bind is None:
bind = statement.bind
dialect = bind.dialect
compiler = statement._compiler(dialect)
class LiteralCompiler(compiler.__class__):
def visit_bindparam(
self, bindparam, within_columns_clause=False,
literal_binds=False, **kwargs
):
return super(LiteralCompiler, self).render_literal_bindparam(
bindparam, within_columns_clause=within_columns_clause,
literal_binds=literal_binds, **kwargs
)
def render_literal_value(self, value, type_):
"""Render the value of a bind parameter as a quoted literal.
This is used for statement sections that do not accept bind paramters
on the target driver/database.
This should be implemented by subclasses using the quoting services
of the DBAPI.
"""
if isinstance(value, basestring):
value = value.replace("'", "''")
return "'%s'" % value
elif value is None:
return "NULL"
elif isinstance(value, (float, int, long)):
return repr(value)
elif isinstance(value, decimal.Decimal):
return str(value)
elif isinstance(value, datetime.datetime):
return "TO_DATE('%s','YYYY-MM-DD HH24:MI:SS')" % value.strftime("%Y-%m-%d %H:%M:%S")
else:
raise NotImplementedError(
"Don't know how to literal-quote value %r" % value)
compiler = LiteralCompiler(dialect, statement)
print compiler.process(statement)
I would like to point out that the solutions given above do not "just work" with non-trivial queries. One issue I came across were more complicated types, such as pgsql ARRAYs causing issues. I did find a solution that for me, did just work even with pgsql ARRAYs:
borrowed from:
https://gist.github.com/gsakkis/4572159
The linked code seems to be based on an older version of SQLAlchemy. You'll get an error saying that the attribute _mapper_zero_or_none doesn't exist. Here's an updated version that will work with a newer version, you simply replace _mapper_zero_or_none with bind. Additionally, this has support for pgsql arrays:
# adapted from:
# https://gist.github.com/gsakkis/4572159
from datetime import date, timedelta
from datetime import datetime
from sqlalchemy.orm import Query
try:
basestring
except NameError:
basestring = str
def render_query(statement, dialect=None):
"""
Generate an SQL expression string with bound parameters rendered inline
for the given SQLAlchemy statement.
WARNING: This method of escaping is insecure, incomplete, and for debugging
purposes only. Executing SQL statements with inline-rendered user values is
extremely insecure.
Based on http://stackoverflow.com/questions/5631078/sqlalchemy-print-the-actual-query
"""
if isinstance(statement, Query):
if dialect is None:
dialect = statement.session.bind.dialect
statement = statement.statement
elif dialect is None:
dialect = statement.bind.dialect
class LiteralCompiler(dialect.statement_compiler):
def visit_bindparam(self, bindparam, within_columns_clause=False,
literal_binds=False, **kwargs):
return self.render_literal_value(bindparam.value, bindparam.type)
def render_array_value(self, val, item_type):
if isinstance(val, list):
return "{%s}" % ",".join([self.render_array_value(x, item_type) for x in val])
return self.render_literal_value(val, item_type)
def render_literal_value(self, value, type_):
if isinstance(value, long):
return str(value)
elif isinstance(value, (basestring, date, datetime, timedelta)):
return "'%s'" % str(value).replace("'", "''")
elif isinstance(value, list):
return "'{%s}'" % (",".join([self.render_array_value(x, type_.item_type) for x in value]))
return super(LiteralCompiler, self).render_literal_value(value, type_)
return LiteralCompiler(dialect, statement).process(statement)
Tested to two levels of nested arrays.
To log SQL queries using Python logging instead of the echo=True flag:
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
per the documentation.
Just a simple colored example with ORM's Query and pygments.
import sqlparse
from pygments import highlight
from pygments.formatters.terminal import TerminalFormatter
from pygments.lexers import SqlLexer
from sqlalchemy import create_engine
from sqlalchemy.orm import Query
engine = create_engine("sqlite+pysqlite:///db.sqlite", echo=True, future=True)
def format_sql(query: Query):
compiled = query.statement.compile(
engine, compile_kwargs={"literal_binds": True})
parsed = sqlparse.format(str(compiled), reindent=True, keyword_case='upper')
print(highlight(parsed, SqlLexer(), TerminalFormatter()))
Or version without sqlparse (without sqlparse there are less new lines in output)
def format_sql(query: Query):
compiled = query.statement.compile(
engine, compile_kwargs={"literal_binds": True})
print(highlight(str(compiled), SqlLexer(), TerminalFormatter()))
This is my approach
# query is instance of: from sqlalchemy import select
def raw_query(query):
q = str(query.compile())
p = query.compile().params
for k in p.keys():
v = p.get(k)
if isinstance(v, (int, float, complex)):
q = q.replace(f":{k}", f"{v}")
else:
q = q.replace(f":{k}", f"'{v}'")
print(q)
How to use it:
from sqlalchemy import select
select_query = select([
any_model_table.c["id_account"],
any_model_table.c["id_provider"],
any_model_table.c["id_service"],
func.sum(any_model_table.c["items"]).label("items"),
# #eaf
func.date_format(func.now(), "%Y-%m-%d").label("some_date"),
func.date_format(func.now(), "%Y").label("as_year"),
func.date_format(func.now(), "%m").label("as_month"),
func.date_format(func.now(), "%d").label("as_day"),
]).group_by(
any_model_table.c.id_account,
any_model_table.c.id_provider,
any_model_table.c.id_service
).where(
any_model_table.c.id == 5
).where(
func.date_format(any_model_table.c.dt, "%Y-%m-%d") == datetime.utcnow().strftime('%Y-%m-%d')
)
raw_query(select_query)

How to create a SQLite Table with a JSON column using SQLAlchemy?

According to this answer SQLite supports JSON data since version 3.9. I use version 3.24 in combination with SQLALchemy (1.2.8) and Python 3.6, but I cannot create any tables containing JSON columns.
What am I missing or doing wrong? A minimal (not) working example is given below:
import sqlalchemy as sa
import os
import tempfile
metadata = sa.MetaData()
foo = sa.Table(
'foo',
metadata,
sa.Column('bar', sa.JSON)
)
tmp_dir = tempfile.mkdtemp()
dbname = os.path.join(tmp_dir, 'foo.db')
engine = sa.create_engine('sqlite:////' + dbname)
metadata.bind = engine
metadata.create_all()
This fails giving the following error:
sqlalchemy.exc.CompileError: (in table 'foo', column 'bar'): Compiler <sqlalchemy.dialects.sqlite.base.SQLiteTypeCompiler object at 0x7f1eae1dab70> can't render element of type <class 'sqlalchemy.sql.sqltypes.JSON'>
Thanks!
Use a TEXT column. Sqlite has a JSON extension with some functions for working with JSON data, but no dedicated JSON type.

sqlAlchemy to access blob via a hybrid propery?

I'm trying to add a block of text into a sqlAlchemy table, which I want to compress to save space with it. Looking through various answers I came up with what I think should be working, but is not. I'm working with a sqlite database.
Updated: Was pointed out I was attempting to use mysql on sqlite which I wasn't aware that was what was happening. I adjusted to use zlib instead and it works to a degree, which gives me a new error that I do not understand.
# proper imports and stuff to make this work
from sqlalchemy import func
class Data(Base):
__tablename__ = 'data'
# ...
text_blobbed = Column('text', BLOB)
#hybrid_property
def text(self):
# return func.decompress(self.text_blobbed)
return self.text_blobbed.decode("zlib")
#text.setter
def text(self, stuff):
# self.text_blobbed = func.compress(stuff)
self.text_blobbed = stuff.encode("zlib")
old error from func.
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such function: compress [SQL: ...... ]
I can now add in the text via Data.text = "a really big block of text"
But when I go to query for this like
session.query(Data.text).filter(Data.id.like(2)).first()
I get an error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Data.text_blobbed has an attribute 'decode'
Doing this is fine.
r = session.query(Data).filter(Data.id.like(2)).first()
print r.text
I've also looked at the text_blobbed which is a set(). And I can do this that works:
r = session.query(Data.text_blobbed.filter( ... ).first()[0].decode("zlib")
print r
But if I move that [0] into the hybrid_property for
...
return self.text_blobbed[0].decode("zlib")
and query:
r = session.query(Data.text).filter( ... ).first()
I get the error:
NotImplementedError: Operator 'getitem' is not supported on this expression
So, I'm a bit confused still.
I've been looking at these things:
SQLAlchemy - Writing a hybrid method for child count
mysql Compress() with sqlalchemy
SELECT UNCOMPRESS(text) FROM with sqlalchemy
http://docs.sqlalchemy.org/en/latest/orm/mapped_sql_expr.html?highlight=descriptor

Google API storage returns string, not credentials object

I'm trying to use the google-api for python.
I've managed to store the credentials in a CredentialsField (basically copying this) implementation.
I can get a storage object:
>>> storage = Storage(CredentialsModel, 'id', user, 'credential')
>>> storage
<oauth2client.django_orm.Storage object at 0x7f1f8f1260f0>
no problem. But when I try to get the credentials object:
>>> credential = storage.get()
>>> credential
I just get a massive string (7482 characters) instead of a credentials object. What gives? (I think the string might be a bytearray, since it begins with '\\x67414e6a6.
I'm also using Python 3.
Any thoughts on why I'm getting a string instead of a Credentials object?
I've actually found the answer if you still need it:
The problem is that the class CredentialFields in django_orm defined a metaclass variable, which isn't supported in python3 anymore.
Therefore, it is necessary to change it to something like this:
class CredentialsField(models.Field, metaclass=models.SubfieldBase):
someone has opened an issue in the github repo:
https://github.com/google/oauth2client/issues/168
My solution for python3, django 1.8 with postgres database:
There is a missing step right before saving the byte data to database, and after retrieving the data back from database: The byte data need to be converted to/from string first. You can convert byte to string and vice versa with decode("utf-8") and encode("utf-8").
You also do not need the __metaclass__, but need to have get_prep_value() and from_db_value() functions.
The entire class CredentialsField should be rewritten like so:
class CredentialsField(models.Field):
def __init__(self, *args, **kwargs):
if 'null' not in kwargs:
kwargs['null'] = True
super().__init__(*args, **kwargs)
def get_internal_type(self):
return "TextField"
def to_python(self, value):
if value is None:
return None
if isinstance(value, oauth2client.client.Credentials):
return value
value = value.encode("utf-8") # string to byte
return pickle.loads(base64.b64decode(value))
def from_db_value(self, value, expression, connection, context):
return self.to_python(value)
def get_db_prep_value(self, value, connection, prepared=False):
if value is None:
return None
byte_repr = base64.b64encode(pickle.dumps(value))
return byte_repr.decode("utf-8") # byte to string
def get_prep_value(self, value):
return self.get_db_prep_value(value)

Return MongoEngine Documents as JSON

Not too sure if this is really simple or not, but I can't really find anything on the topic. But, either using the regular MongoEngine library, or even Flask-MongoEngine for my Flask based website, would it be possible to return a MongoEngine document as straight JSON?
Thanks!
In 0.8 there are helpers see https://github.com/MongoEngine/mongoengine/issues/1
in the meantime you have to use pymongo's json_utils directly:
from bson import json_util
json_util.dumps(MyDoc._collection_obj.find(MyDoc.objects()._query))
Ross's and Jellyflower's workarounds don't work when field projection or ordering is used.
More general workaround:
from bson import json_util
json = json_util.dumps(query._cursor)
The correct workaround should probably be:
from bson import json_util
objects = MyDoc.objects()
json_util.dumps(objects._collection_obj.find(objects._query))
Update: thanks to Lo-Tan for to_mongo() method usage suggestion.
Eventually I came up with the following solution:
from json import JSONEncoder
from mongoengine.base import BaseDocument
class MongoEncoder(JSONEncoder):
def default(self, o):
if isinstance(o, BaseDocument):
data = o.to_mongo()
# might not be present if EmbeddedDocument
o_id = data.pop('_id', None)
if o_id:
data['id'] = str(o_id['$oid'])
data.pop('_cls', None)
return data
else:
return JSONEncoder.default(self, o)
# consider `obj` to be MongoEngine object
json_data = json.dumps(obj, cls=MongoEncoder)
It uses to_json() method, added as the response to the aforementioned issue.

Categories