I have some test code that creates tables and drops them for each test case. All tests after the first one fail, because my setup code calls sa.Table() first and only creates the tables if that call raises NoSuchTableError. After the tables are dropped for the first time, that error is no longer raised, even though the engine correctly reports that the tables do not exist, so they are never created again.
I've reproduced this behavior as follows:
>>> import sqlalchemy as sa
>>> from sqlalchemy import *
>>> eng = create_engine('postgresql://...')  # the engine used below; URL omitted here
>>> m = MetaData()
The table doesn't exist yet, so calling sa.Table here raises as expected:
>>> t = sa.Table('test', m, autoload_with=eng)
...
NoSuchTableError: test
But if I create the table and then drop it, sa.Table does not raise as expected:
>>> t = sa.Table('test', m, Column('t', String(2)))
>>> m.create_all(eng)
>>> eng.has_table('test')
True
>>> t = sa.Table('test', m, autoload_with=eng)
>>> eng.execute('drop table "test" cascade')
<sqlalchemy.engine.result.ResultProxy at 0x106a06150>
>>> eng.has_table('test')
False
>>> t = sa.Table('test', m, autoload_with=eng)
No error is thrown after that last call to sa.Table, even though the table does not exist.
What do I need to do to get sa.Table() to correctly error after the tables have been dropped? The engine object I am passing to it knows that the tables do not exist, but is there something else I need to do, like refreshing/reconnecting somehow, so that I get the expected behavior?
It turns out I need to refresh the MetaData object (create a new one) each time I call sa.Table if I expect the schema to change. MetaData acts as a registry of Table objects: once 'test' is in m.tables, a later sa.Table('test', m, autoload_with=eng) simply returns the cached Table and never queries the database, so no NoSuchTableError is raised. Creating a fresh MetaData solves the problem.
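A minimal sketch of the fix, continuing the session above (the table has already been dropped at this point):
>>> m = MetaData()  # fresh MetaData with an empty table registry
>>> t = sa.Table('test', m, autoload_with=eng)
...
NoSuchTableError: test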
I'm using Pony ORM version 0.7 with a Sqlite3 database on disk, and running into this issue: I am performing a select, then an update, then a select, then another update, and getting an error message of
pony.orm.core.UnrepeatableReadError: Value of Task.order_id for
Task[23654] was updated outside of current transaction (was: 1, now: 2)
I've reduced the problem to the minimal set of commands that reproduces it (removing any of them makes the error go away):
@db_session
def test_method():
    tasks = list(map(Task.to_dict, Task.select()))
    db.execute("UPDATE Task SET order_id=order_id*2")
    task_to_move = select(task for task in Task if task.order_id == 2).first()
    task_to_move.order_id = 1

test_method()
For completeness's sake, here is the definition of Task:
class Task(db.Entity):
    text = Required(unicode)
    heading = Required(int)
    create_timestamp = Required(datetime)
    done_timestamp = Optional(datetime)
    order_id = Required(int)
Also, if I remove the task.order_id == 2 constraint from my select, the problem no longer occurs, so I assume it has something to do with querying on a field that has changed since the transaction started. But I don't know why the error message says the value was changed by a different transaction (unless db.execute runs in a separate transaction because it is raw SQL?).
I've already looked at this similar question (Pony ORM reports record "was updated outside of current transaction" while there is no other transaction), but the problem there was different, and at this documentation (https://docs.ponyorm.com/transactions.html), but neither solved my problem.
Any ideas what might be going on here?
Pony uses optimistic concurrency control by default. For each attribute, Pony remembers its current value (potentially modified by application code) as well as the original value that was read from the database. During an UPDATE, Pony checks that the value of the column in the database is still the same. If the value has changed, Pony assumes that some concurrent transaction changed it and throws an exception in order to avoid a "lost update" situation.
If you execute a raw SQL query, Pony does not know what exactly was modified in the database. So when Pony sees that the order_id value has changed, it mistakenly concludes that the value was changed by another transaction.
In order to avoid the problem, you can mark the order_id attribute as volatile. Then Pony will assume that the value of the attribute can change at any time (by a trigger or a raw SQL update) and will exclude that attribute from optimistic checks:
class Task(db.Entity):
    text = Required(unicode)
    heading = Required(int)
    create_timestamp = Required(datetime)
    done_timestamp = Optional(datetime)
    order_id = Required(int, volatile=True)
Note that Pony caches the value of a volatile attribute and does not re-read it from the database until the object is saved, so in some situations you can get a stale value in Python.
Update:
Starting from release 0.7.4 you can also pass the optimistic=False option to db_session to turn off optimistic checks for a specific transaction that uses raw SQL queries:
with db_session(optimistic=False):
    ...
or
@db_session(optimistic=False)
def some_function():
    ...
It is also now possible to specify the optimistic=False option on an attribute instead of volatile=True. Then Pony will not perform optimistic checks for that attribute, but will still treat it as non-volatile.
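A sketch of that variant, assuming Pony 0.7.4 or later:

class Task(db.Entity):
    text = Required(unicode)
    heading = Required(int)
    create_timestamp = Required(datetime)
    done_timestamp = Optional(datetime)
    order_id = Required(int, optimistic=False)  # excluded from optimistic checks, still treated as non-volatile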
I'm inserting/updating objects into a MySQL database using the peewee ORM for Python. I have a model like this:
class Person(Model):
    person_id = CharField(primary_key=True)
    name = CharField()
I create the objects/rows in a loop, and each time through the loop I have a dictionary like:
pd = {"name":"Alice","person_id":"A123456"}
Then I try creating an object and saving it.
po = Person()
for key, value in pd.items():
    setattr(po, key, value)
po.save()
This takes a while to execute, and runs without errors, but it doesn't save anything to the database -- no records are created.
This works:
Person.create(**pd)
But it also throws an error (and terminates the script) when the primary key already exists. From reading the manual, I thought save() was the function I needed -- that peewee would perform the update or insert as required.
Not sure what I need to do here -- try getting each record first? Catch errors and try updating a record if it can't be created? I'm new to peewee, and would normally just write INSERT ... ON DUPLICATE KEY UPDATE or even REPLACE.
Call save(force_insert=True) on the instance:
po.save(force_insert=True)
With a non-integer primary key, peewee cannot tell from the instance whether the row already exists, so save() defaults to issuing an UPDATE; force_insert=True makes it issue an INSERT instead. It's documented: http://docs.peewee-orm.com/en/latest/peewee/models.html#non-integer-primary-keys-composite-keys-and-other-tricks
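A minimal sketch of that applied to the loop from the question (pd is the data dictionary from above):

po = Person(**pd)
po.save(force_insert=True)  # INSERT rather than the default UPDATE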
I've had a chance to re-test my answer, and I think it should be replaced. Here's the pattern I can now recommend: first, use get_or_create() on the model, which will create the database row if it doesn't exist. Then, if it was not created (the object was retrieved from the db instead), set all the attributes from the data dictionary and save the object.
po, created = Person.get_or_create(person_id=pd["person_id"], defaults=pd)
if not created:
    for key in pd:
        setattr(po, key, pd[key])
    po.save()
As before, I should mention that these are two distinct transactions, so this should not be used with multi-user databases requiring a true upsert in one transaction.
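If you at least want the read and the write to happen in a single transaction, one option is to wrap the pattern in peewee's atomic() context manager (a sketch, assuming the database object is named db; note this still is not a contention-proof upsert):

with db.atomic():
    po, created = Person.get_or_create(person_id=pd["person_id"], defaults=pd)
    if not created:
        for key in pd:
            setattr(po, key, pd[key])
        po.save()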
I think you might try get_or_create()? http://peewee.readthedocs.org/en/latest/peewee/querying.html#get-or-create
You may do something like:
po = Person()
for key, value in pd.items():
    setattr(po, key, value)
updated = po.save()             # save() returns the number of rows modified
if not updated:                 # 0 means the UPDATE matched no existing row
    po.save(force_insert=True)  # so issue an INSERT instead
Here's a simplified version of my query:
subquery = (select(['latitude'])
            .select_from(func.unnest(func.array_agg(Room.latitude))
                         .alias('latitude'))
            .limit(1)
            .as_scalar())
Room.query.with_entities(Room.building, subquery).group_by(Room.building).all()
When executing it I get an error deep inside SQLAlchemy:
File ".../sqlalchemy/sql/selectable.py", line 429, in columns
self._populate_column_collection()
File ".../sqlalchemy/sql/selectable.py", line 992, in _populate_column_collection
for col in self.element.columns._all_columns:
AttributeError: 'list' object has no attribute '_all_columns'
Inspecting it in a debugger shows me this:
>>> self.element
<sqlalchemy.sql.functions.Function at 0x7f72d4fcae50; unnest>
>>> str(self.element)
'unnest(array_agg(rooms.id))'
>>> self.element.columns
[<sqlalchemy.sql.functions.Function at 0x7f72d4fcae50; unnest>]
The problem started with SQLAlchemy 0.9.4; in 0.9.3 everything worked fine.
When running it in SQLAlchemy 0.9.3 the following query is executed (as expected):
SELECT rooms.building AS rooms_building,
(SELECT latitude
FROM unnest(array_agg(rooms.latitude)) AS latitude
LIMIT 1) AS anon_1
FROM rooms
GROUP BY rooms.building
Am I doing something wrong here or is it a bug in SQLAlchemy?
This turned out to be a bug in SQLAlchemy: https://bitbucket.org/zzzeek/sqlalchemy/issue/3137/decide-what-funcxyz-alias-should-do
func.foo().alias() should in fact be equivalent to func.foo().select().alias(), however in this case that will push out a second level of nesting here which you don't want. So to make that correction to the API probably needs to be a 1.0 thing, unless I can determine that func.foo().alias() is totally unusable right now.
The proper way to do it, according to the SQLAlchemy developer, is this:
subquery = (select(['*'])
            .select_from(func.unnest(func.array_agg(Room.latitude)))
            .limit(1)
            .as_scalar())
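With that subquery in place, the query from the question runs unchanged:

Room.query.with_entities(Room.building, subquery).group_by(Room.building).all()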
Most likely the next version (0.9.8 I assume) is going to have the old behavior restored:
I'm restoring the old behavior, but for now just use select(['*']). the column is unnamed. PG's behavior of assigning the column name based on the alias in the FROM is a little bit magic (e.g., if the function returned multiple columns, then it ignores that name and uses the ones the function reports?)
There's something I'm struggling to understand about SQLAlchemy from its documentation and tutorials.
I see how to autoload classes from a DB table, and I see how to design a class and create from it (declaratively or using mapper()) a table that is added to the DB.
My question is how does one write code that both creates the table (e.g. on first run) and then reuses it?
I don't want to have to create the database with one tool or one piece of code and have separate code to use the database.
Thanks in advance,
Peter
create_all() does not do anything if a table exists already, so just call it as soon as you set up your engine or connection.
(Note that if you change your table schema, create_all() will not update it! So you still need "another program" to do that.)
This is the usual pattern:
from sqlalchemy import create_engine, MetaData, Table, Column, Integer

metadata = MetaData()
# example table definition so the snippet is self-contained
mytable = Table('mytable', metadata, Column('id', Integer, primary_key=True))

def createEngine(metadata, dsn, **args):
    engine = create_engine(dsn, **args)
    metadata.create_all(engine)  # no-op for tables that already exist
    return engine

def doStuff(engine):
    res = engine.execute('select * from mytable')
    # etc etc

def main():
    engine = createEngine(metadata, 'sqlite:///:memory:')
    doStuff(engine)

if __name__ == '__main__':
    main()
I think you're perhaps over-thinking the situation. If you want to create the database afresh, you normally just call Base.metadata.create_all() or equivalent, and if you don't want to do that, you don't call it.
You could try calling it every time and handling the exception if it goes wrong, assuming that the database is already set up.
Or you could try querying for a certain table and if that fails, call create_all() to put everything in place.
Every other part of your app should work in the same way whether you perform the db creation or not.
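A sketch of that table-existence check, reusing has_table() (seen in the first question above) and the metadata from the earlier pattern:

if not engine.has_table('mytable'):
    metadata.create_all(engine)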
I have some SQLAlchemy models that have a GeometryColumn.
For caching purposes I use cPickle.
While this column is empty, everything is fine. However, if it has data I receive:
TypeError: buffer() takes at least 1 argument (0 given)
while doing cPickle.loads(data)
I don't really need this column in this query; I would be happy to exclude it.
But doing something like mymodel.geom = None before pickling still gives this error.
The only solution that comes to mind is to define another SQLAlchemy model that does not have this column. But if I set:
__tablename__ = 'same_table'
I receive:
Table 'my_model' is already defined for this MetaData instance. Specify 'extend_existing=True' to redefine options and columns on an existing Table object.
Is there any solution for either of these errors, or some workaround?
By the way, trying something like:
class MyModelNoGeom(MyModel):
    __tablename__ = MyModel.__tablename__
    __table__.extend_existing = True
    geom = None
also gives an error:
NameError: name '__table__' is not defined
The only solution I found is a bit ugly, but it works:
temp_geom = MyModel.geom   # stash the mapped column attribute
MyModel.geom = None        # hide it so the geometry data is not pickled
objects_MyModel.send_to_cache()
MyModel.geom = temp_geom   # restore the attribute afterwards
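A cleaner alternative may be to keep the column out of the query entirely with SQLAlchemy's deferred column loading, so the geometry value is never loaded onto the instances being pickled. A sketch, assuming a session object and that the mapped attribute is named geom:

from sqlalchemy.orm import defer

# the deferred attribute is absent from each instance's __dict__,
# so pickling the objects never touches the geometry buffer
objects = session.query(MyModel).options(defer('geom')).all()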