I want to duplicate a model instance (row) in SQLAlchemy using the orm. My first thought was to do this:
i = session.query(Model).first()
session.expunge(i)
old_id = i.id
i.id = None
session.add(i)
session.flush()
print i.id #New ID
However, apparently the detached object still "remembers" what id it had, even though I set the id to None while it was detached. Thus, session.flush() tries to execute an UPDATE changing the primary key to null.
Is this expected behavior? How can I remove the 'memory' of this attribute, and just treat the detached object as a new object upon re-adding it to the session? How, in general, does one clone an SQLAlchemy model instance?
This case is handled by the make_transient() helper function:
from sqlalchemy.orm import make_transient

inst = session.query(Model).first()
session.expunge(inst)
make_transient(inst)
inst.id = None
session.add(inst)
session.flush()
print inst.id  # New ID
Another approach is to give the model a duplicate() method that copies every non-primary-key, non-unique column into a new instance:
def duplicate(self):
    arguments = dict()
    for name, column in self.__mapper__.columns.items():
        if not (column.primary_key or column.unique):
            arguments[name] = getattr(self, name)
    return self.__class__(**arguments)
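A usage sketch, assuming Model mixes in the duplicate() method above (the names only follow the question's example; the copy gets its id once the flush hits the database):
original = session.query(Model).first()
copy = original.duplicate()  # transient instance with the data columns copied
session.add(copy)
session.flush()
print(copy.id)  # fresh primary key assigned by the database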
I have a special configuration where I've created two SQLAlchemy sessions from the sessionmaker:
db.read = sessionmaker(autoflush=False)
db.write = sessionmaker()
When calling db.write.add(obj), I sometimes have the following error (but not always):
sqlalchemy.exc.InvalidRequestError: Object '<User at 0x7f3fa9ebee50>' is already attached to session '14' (this is '15')
Is there a way to do the following:
"If session is db.write, then call db.write.add(obj)
Otherwise, add obj to the db.write session in order to save it, and (important), update both the object in db.write session AND db.read session"
(I've managed to save via db.write, but then obj.id is empty in db.read.)
Thank you for your help
We can approximate the desired behaviour by extending Session.add to detect and handle the case where an object is already in another session, and by adding an event listener to reinstate the object into the read session once it has been committed. An object cannot exist in two sessions simultaneously; if it is preferable that the object remain in the write session, the listener and the _prev_session attribute are not required.
import weakref
import sqlalchemy as sa
from sqlalchemy import orm
...
class CarefulSession(orm.Session):

    def add(self, object_):
        # Get the session that contains the object, if there is one.
        object_session = orm.object_session(object_)
        if object_session and object_session is not self:
            # Remove the object from the other session, but keep
            # a reference so we can reinstate it.
            object_session.expunge(object_)
            object_._prev_session = weakref.ref(object_session)
        return super().add(object_)
# The write session must use the Session subclass; the read
# session need not.
WriteSession = orm.sessionmaker(engine, class_=CarefulSession)
@sa.event.listens_for(WriteSession, 'after_commit')
def receive_after_commit(session):
    """Reinstate objects into their previous sessions."""
    # Materialise the list first: expunge() mutates the identity map.
    objects = [o for o in session.identity_map.values()
               if hasattr(o, '_prev_session')]
    for object_ in objects:
        prev_session = object_._prev_session()
        if prev_session:
            session.expunge(object_)
            prev_session.add(object_)
        delattr(object_, '_prev_session')
A complete implementation might add an after_rollback listener, and perhaps initialise _prev_session in the base model's __init__ to avoid the delattr/hasattr calls.
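As a rough sketch of that idea, an after_soft_rollback listener could mirror the commit hook; the event choice, the is_active guard, and how never-flushed (pending) objects should be handled are assumptions rather than part of the answer above:
@sa.event.listens_for(WriteSession, 'after_soft_rollback')
def receive_after_rollback(session, previous_transaction):
    """Hand borrowed objects back to their previous sessions after a rollback."""
    if not session.is_active:
        # Only act once the outermost transaction has unwound.
        return
    objects = [o for o in session.identity_map.values()
               if hasattr(o, '_prev_session')]
    for object_ in objects:
        prev_session = object_._prev_session()
        if prev_session:
            session.expunge(object_)
            prev_session.add(object_)
        delattr(object_, '_prev_session')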
I am building a web scraper and trying to assign an entity a UUID.
Since one entity may be scraped at different times, I want to store the initial UUID along with the id extracted from the webpage.
// example document
{
    "ent_eid_type": "ABC-123",
    "ent_uid_type": "123e4567-aaa-123e456"
}
Below is the code that runs for every id field found in a scraped item:
# if the current ent_eid_type is a key in mongo...
if db_coll.find({ent_eid_type: ent_eid}).count() > 0:
    # return the uid value
    ent_uid = db_coll.find({ent_uid_type: ent_uid})
else:
    # create a fresh uid
    ent_uid = uuid.uuid4()
    # store it with the current entity eid as key, and uid as value
    db_coll.insert({ent_eid_type: ent_eid, ent_uid_type: ent_uid})

# update the current item with the stored uid for later use
item[ent_uid_type] = ent_uid
The console is returning KeyError: <pymongo.cursor.Cursor object at 0x104d41710>. Not sure how to parse the cursor for the ent_uid.
Any tips/suggestions appreciated!
PyMongo's find returns a cursor object that you need to iterate or index into to get at the documents.
Access the first result (you already checked that one exists), and read the ent_uid field.
Presumably you want to search on the EID, i.e. with ent_eid rather than ent_uid; there is no reason to search for a value you already have.
ent_uid = db_coll.find({ent_eid_type: ent_eid })[0]['ent_uid']
or don't worry about the cursor and use the find_one command instead (http://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.find_one)
ent_uid = db_coll.find_one({ent_eid_type: ent_eid })['ent_uid']
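Putting that together, the lookup could be written with find_one roughly like this. This is a sketch that reuses the question's variable names, keys the result on ent_uid_type to match the example document, stores the UUID as a string for simplicity, and uses insert_one (PyMongo 3; with older drivers, insert works the same way here):
import uuid

doc = db_coll.find_one({ent_eid_type: ent_eid})
if doc is not None:
    # entity already known: reuse its stored UUID
    ent_uid = doc[ent_uid_type]
else:
    # first time this entity is seen: mint and persist a fresh UUID
    ent_uid = str(uuid.uuid4())
    db_coll.insert_one({ent_eid_type: ent_eid, ent_uid_type: ent_uid})

item[ent_uid_type] = ent_uid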
I am running a Flask application with SQLAlchemy (1.1.0b3) and Postgres.
With Flask I provide an API over which the client can GET all instances of a database model and POST them again to a clean instance of the Flask application, as a way of local backup. When the client posts them again, they should keep the same IDs they had when they were downloaded.
I don't want to disable the "increment" option for primary keys during normal operation, but if the client provides an ID with a POST and wants the new resource to have that ID, I would like to set it accordingly without breaking SQLAlchemy's id generation. How can I access/reset the current maximum value of the ids?
@app.route('/objects', methods=['POST'])
def post_object():
    if 'id' in request.json and MyObject.query.get(request.json['id']) is None:  #1
        object = MyObject()
        object.id = request.json['id']
    else:  #2
        object = MyObject()
    object.fillFromJson(request.json)
    db.session.add(object)
    db.session.commit()
    return jsonify(object.toDict()), 201
When adding a bunch of objects WITH an id (#1) and then trying to add one WITHOUT an id, or with an id that is already used (#2), I get:
duplicate key value violates unique constraint "object_pkey"
DETAIL: Key (id)=(2) already exists.
Usually the id is generated incrementally, but when an id is already taken there is no check for that. How can I get between the auto-increment and the INSERT?
After adding an object with a fixed ID, you have to make sure the normal incremental behavior doesn't cause any collisions with future insertions.
A possible solution I can think of is to set the next insertion ID to the maximum ID (+1) found in the table. You can do that with the following additions to your code:
@app.route('/objects', methods=['POST'])
def post_object():
    fixed_id = False
    if 'id' in request.json and MyObject.query.get(request.json['id']) is None:  #1
        object = MyObject()
        object.id = request.json['id']
        fixed_id = True
    else:  #2
        object = MyObject()
    object.fillFromJson(request.json)
    db.session.add(object)
    db.session.commit()
    if fixed_id:
        table_name = MyObject.__table__.name
        db.engine.execute("SELECT pg_catalog.setval(pg_get_serial_sequence('%s', 'id'), MAX(id)) FROM %s;" % (table_name, table_name))
    return jsonify(object.toDict()), 201
The next object (without a fixed id) inserted into the table will continue the id increment from the biggest id found in the table.
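If several models can receive fixed ids, the sequence reset can be factored into a small helper. This is only a sketch, not part of the answer above; it assumes the primary key column is named id and adds COALESCE so setval is not handed NULL when the table is empty:
def reset_id_sequence(model):
    """Point the serial sequence behind model.id at the current MAX(id)."""
    table_name = model.__table__.name
    db.engine.execute(
        "SELECT pg_catalog.setval(pg_get_serial_sequence('%s', 'id'), "
        "COALESCE(MAX(id), 1)) FROM %s;" % (table_name, table_name)
    )
The route would then call reset_id_sequence(MyObject) in place of the two lines guarded by fixed_id.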
I am using a sqlite database as my application file through sqlalchemy. I have a separate configuration file.
There are some classes whose information I persist on my application file that I would like to replicate on my configuration file. The thing is that I would load it alternatively from one or the other source depending on availability.
I saw this mention in the documentation, but I think it does not directly apply, as the secondary mapping will not persist the information. Also, the notion of which one would be the primary is blurry. Both databases would carry the same information, though maybe not the same version of it.
http://sqlalchemy.readthedocs.org/en/rel_1_0/orm/nonstandard_mappings.html#multiple-mappers-for-one-class
I will try to make it clearer with an example:
I have a class A which represents a multi-field user input. I save this on my application file.
A class B, also on my application file, is composed of an instance of class A.
The same instance from Class A may compose several suitable instances of Class B. These are all stored on my application file.
My problem is that in another session, with a brand new configuration file, I might want to reuse that class A instance. I cannot have it only in the application file, because if it gets updated the change should be reflected across all application files that use it.
On the other hand, it cannot live only in the configuration file, as a user might share his application file with someone else, and the latter might not have a suitable configuration and would have to recreate it manually.
I need to have it in both places, be able to choose which database will be the source at runtime and have all changes persist on both databases at once.
Can it be done in sqlalchemy+sqlite? Is it a good idea? Are there classic solutions for this?
EDIT:
I think I am describing something that looks like a cache, which sqlalchemy does not do. Does any other approach come to mind?
Does sqlalchemy allow me to map an instance to a database upon instance creation? This would allow for two instances of the same class to be mapped against different databases. Then I would listen for an update event by sqlalchemy and issue the same sql to the other database. I also do not know how to do this.
Another option: map my class against a union query. SQLAlchemy might allow that, as it does for arbitrary selects, BUT then there is the persistence issue.
Another option: add a layer to the engine so that it connects to two databases simultaneously, issuing the same commands to both for reading and writing. I could deal with the duplicated returns.
I came up with the mixin below. It does not handle expunge or rollback, as I do not use those in my application, nor do I know how to go about them.
It looks like it is working. I will proceed to expand it to handle collections.
import os

from sqlalchemy import Column, Float, String, Enum, Integer, event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import orm
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker


class ReplicateMixin:

    @classmethod
    def get_or_create(cls, prime_session, sessoes=None, **kwargs):
        if sessoes is None:
            sessoes = []
        if not isinstance(sessoes, list):
            sessoes = [sessoes]
        # They are passed separately just to make explicit that the first
        # might receive different treatment
        sessoes = [prime_session] + sessoes

        replicas = []
        for sessao in sessoes:  # Gets a result or creates a new instance from each database
            instance = sessao.query(cls).filter_by(**kwargs).first()
            if instance is None:
                instance = cls(**kwargs)
                setattr(instance, "__new", True)
                sessao.add(instance)
            instance.sessao = sessao
            replicas.append(instance)

        # Selects the instance whose data will prevail
        fittest = cls.__select_fittest(replicas)
        # Instance from the session we will be issuing commits to.
        # The others must simply follow.
        prime = replicas.pop(0)
        cls.__copy_data(fittest, prime, ReplicateMixin.__get_primary_keys(prime))
        # The object will carry references to its copies
        setattr(prime, "__replicas", replicas)
        return prime

    @staticmethod
    def __select_fittest(instances):
        """This method should contain logic for choosing the instance that has
        the most relevant information. It may be altered by child classes."""
        if getattr(instances[0], "__new", False):
            return instances[1]
        else:
            return instances[0]

    @staticmethod
    def __copy_data(source, dest, primary_keys=None):
        primary_keys = [] if primary_keys is None else primary_keys
        for prop in orm.class_mapper(type(source)).iterate_properties:
            if (isinstance(prop, orm.ColumnProperty)
                    and prop.key not in primary_keys):
                setattr(dest, prop.key, getattr(source, prop.key))

    @staticmethod
    def __replicate(mapper, connection, original_obj):
        # if it IS a replicant it will not have a __replicas attribute
        replicants = getattr(original_obj, "__replicas", [])
        primary_keys = ReplicateMixin.__get_primary_keys(original_obj)
        for objeto in replicants:
            ReplicateMixin.__copy_data(original_obj, objeto, primary_keys)
            objeto.sessao.commit()

    @staticmethod
    def __replicate_del(mapper, connection, original_obj):
        # if it IS a replicant it will not have a __replicas attribute
        replicants = getattr(original_obj, "__replicas", [])
        for objeto in replicants:
            if objeto in objeto.sessao.new:
                objeto.sessao.expunge(objeto)
            else:
                objeto.sessao.delete(objeto)
            objeto.sessao.commit()

    @staticmethod
    def __get_primary_keys(mapped_object):
        return [key.name for key in orm.class_mapper(type(mapped_object)).primary_key]

    @classmethod
    def __declare_last__(cls):
        """Binds certain events to functions."""
        event.listen(cls, "before_insert", cls.__replicate)
        event.listen(cls, "before_update", cls.__replicate)
        event.listen(cls, "before_delete", cls.__replicate_del)
        # FIXME: might not play well with rollback
Example:
DeclarativeBase = declarative_base()


class Datum(ReplicateMixin, DeclarativeBase):
    __tablename__ = "xUnitTestData"

    Key = Column(Integer, primary_key=True)
    Value = Column(Float)
    nome = Column(String(10))

    def __repr__(self):
        return "{}; {}; {}".format(self.Key, self.Value, self.nome)


end_local = os.path.join(os.path.expanduser("~"), "Desktop", "local.bd")
end_remoto = os.path.join(os.path.expanduser("~"), "Desktop", "remoto.bd")

src_engine = create_engine('sqlite:///' + end_local, echo=False)
dst_engine = create_engine('sqlite:///' + end_remoto, echo=False)

DeclarativeBase.metadata.create_all(src_engine)
DeclarativeBase.metadata.create_all(dst_engine)

SessionSRC = sessionmaker(bind=src_engine)
SessionDST = sessionmaker(bind=dst_engine)

session1 = SessionSRC()
session2 = SessionDST()

item = Datum.get_or_create(session1, session2, Value=0.5, nome="terceiro")
item.Value = item.Value / 2
print(item)

session1.delete(item)
session1.commit()
session1.close()
I'm really new to Python and just as new to Pyramid (this is the first thing I've written in Python), and am having trouble with a database query...
I have the following models (relevant to my question anyway):
MetadataRef (contains info about a given metadata type)
Metadata (contains actual metadata) -- this is a child of MetadataRef
User (contains users) -- this is linked to metadata. MetadataRef.model = 'User' and metadata.model_id = user.id
I need access to name from MetadataRef and value from Metadata.
Here's my code:
class User(Base):
    ...
    _meta = None

    def meta(self):
        if self._meta == None:
            self._meta = {}
            try:
                for item in DBSession.query(MetadataRef.key, Metadata.value).\
                        outerjoin(MetadataRef.meta).\
                        filter(
                            Metadata.model_id == self.id,
                            MetadataRef.model == 'User'
                        ):
                    self._meta[item.key] = item.value
            except DBAPIError:
                ##TODO: actually do something with this
                self._meta = {}
        return self._meta
The query SQLAlchemy is generating does return what I need (close enough anyway -- it needs to query model_id as part of the ON clause rather than the WHERE, but that's minor and I'm pretty sure I can figure that out myself):
SELECT metadata_refs.`key` AS metadata_refs_key, metadata.value AS metadata_value
FROM metadata_refs LEFT OUTER JOIN metadata ON metadata_refs.id = metadata.metadata_ref_id
WHERE metadata.model_id = %s AND metadata_refs.model = %s
However, when I access the objects I get this error:
AttributeError: 'KeyedTuple' object has no attribute 'metadata_value'
This leads me to think there's some other way I need to access it, but I can't figure out how. I've tried both .value and .metadata_value. .key does work as expected.
Any ideas?
You're querying separate attributes ("ORM-enabled descriptors" in SA docs):
DBSession.query(MetadataRef.key, Metadata.value)
In this case the query returns not full ORM-mapped objects, but KeyedTuples: a cross between a tuple and an object with attributes corresponding to the "labels" of the fields.
So, one way to access the data is by its index:
ref_key = item[0]
metadata_value = item[1]
Alternatively, to make SA use a specific name for a column, you can use the Column.label() method:
for item in DBSession.query(MetadataRef.key.label('ref_key'),
                            Metadata.value.label('meta_value'))...
    self._meta[item.ref_key] = item.meta_value
For debugging, you can use the Query.column_descriptions attribute, which will tell you the names of the columns returned by the query.
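For example, a quick way to see which names the labelled query will expose (a sketch following the query above):
query = DBSession.query(MetadataRef.key.label('ref_key'),
                        Metadata.value.label('meta_value'))
print([col['name'] for col in query.column_descriptions])
# ['ref_key', 'meta_value']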