Can an association class be implemented in Python?


I have just started learning software development and I am modelling my system in a UML Class diagram. I am unsure how I would implement this in code.
To keep things simple, let's assume the following example:
There is a Room class and a Guest class with a many-to-many association, Room(0..*)-Guest(0..*), and an association class RoomBooking, which contains booking details. How would I model this in Python if my system needs to list all room bookings made by a particular guest?

Most Python applications developed from a UML design are backed by a relational database, usually via an ORM. In which case your design is pretty trivial: your RoomBooking is a table in the database, and the way you look up all RoomBooking objects for a given Guest is just an ORM query. Keeping it vague rather than using a particular ORM syntax, something like this:
bookings = RoomBooking.select(Guest=guest)
With an RDBMS but no ORM, it's not much different. Something like this:
sql = 'SELECT Room, Guest, Charge, Paid FROM RoomBooking WHERE Guest = ?'
cur = db.execute(sql, (guest.id,))  # parameters must be a sequence, hence the one-element tuple
bookings = [RoomBooking(*row) for row in cur]
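The no-ORM variant above can be tried end to end with the sqlite3 module from the standard library. This is only a minimal sketch: the table layout, the sample rows, and the helper name bookings_for_guest are assumptions made for illustration, and RoomBooking is modelled as a plain namedtuple.

```python
import sqlite3
from collections import namedtuple

# Hypothetical record type; the field order follows the SELECT below.
RoomBooking = namedtuple("RoomBooking", "room guest charge paid")

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE RoomBooking (Room INTEGER, Guest INTEGER, Charge REAL, Paid INTEGER)"
)
# Made-up sample data: guest 1 has two bookings, guest 2 has one.
db.executemany(
    "INSERT INTO RoomBooking VALUES (?, ?, ?, ?)",
    [(101, 1, 80.0, 1), (102, 1, 95.0, 0), (101, 2, 80.0, 1)],
)


def bookings_for_guest(db, guest_id):
    """Return every booking of one guest as RoomBooking tuples."""
    sql = (
        "SELECT Room, Guest, Charge, Paid FROM RoomBooking "
        "WHERE Guest = ? ORDER BY Room"
    )
    # The query parameters are passed as a one-element tuple.
    return [RoomBooking(*row) for row in db.execute(sql, (guest_id,))]


guest_one_bookings = bookings_for_guest(db, 1)
```

Here guest_one_bookings holds the two rows for guest 1, and a guest without bookings simply yields an empty list.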
And this points to what you'd do if you're not using an RDBMS: any relation that would be stored as a table with a foreign key is instead stored as some kind of dict in memory.
For example, you might have a dict mapping guests to sets of room bookings:
bookings = guest_booking[guest]
Or, alternatively, if you don't have a huge number of hotels, you might have this mapping implicit, with each hotel having a 1-to-1 mapping of guests to bookings:
bookings = [hotel.bookings[guest] for hotel in hotels]
Since you're starting off with UML, you're probably thinking in strict OO terms, so you'll want to encapsulate this dict in some class, behind some mutator and accessor methods, so you can ensure that you don't accidentally break any invariants.
There are a few obvious places to put it—a BookingManager object makes sense for the guest-to-set-of-bookings mapping, and the Hotel itself is such an obvious place for the per-hotel-guest-to-booking that I used it without thinking above.
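A BookingManager along those lines could look roughly like the sketch below. The method names (add, remove, bookings_for) are made up for illustration; the point is only that the dict stays private and every change goes through one place.

```python
from collections import defaultdict


class BookingManager:
    """Owns the guest-to-bookings mapping so callers cannot corrupt it."""

    def __init__(self):
        self._by_guest = defaultdict(set)

    def add(self, guest, booking):
        self._by_guest[guest].add(booking)

    def remove(self, guest, booking):
        self._by_guest[guest].discard(booking)
        # Drop empty entries so the mapping only lists guests with bookings.
        if not self._by_guest[guest]:
            del self._by_guest[guest]

    def bookings_for(self, guest):
        # .get() avoids creating an entry on lookup; the frozenset copy
        # keeps callers from mutating the internal set.
        return frozenset(self._by_guest.get(guest, ()))


manager = BookingManager()
manager.add("alice", "booking-1")
manager.add("alice", "booking-2")
```

Guests and bookings are plain strings here only to keep the example short; any hashable objects would work the same way.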
But another place to put it, which is closer to the ORM design, is in a class attribute on the RoomBooking type, accessed by classmethods. This also allows you to extend things if you later need to, e.g., look things up by hotel—you'd then put two dicts as class attributes, and ensure that a single method always updates both of them, so you know they're always consistent.
So, let's look at that:
import collections

class RoomBooking:
    # Class-level indexes shared by all bookings.
    guest_mapping = collections.defaultdict(set)
    hotel_mapping = collections.defaultdict(set)

    def __init__(self, guest, room):
        self.guest, self.room = guest, room

    @classmethod
    def find_by_guest(cls, guest):
        return cls.guest_mapping[guest]

    @classmethod
    def find_by_hotel(cls, hotel):
        return cls.hotel_mapping[hotel]

    @classmethod
    def add_booking(cls, guest, room):
        # Update both indexes in one place so they stay consistent.
        booking = cls(guest, room)
        cls.guest_mapping[guest].add(booking)
        cls.hotel_mapping[room.hotel].add(booking)
        return booking
Of course your Hotel instance probably needs to add the booking as well, so it can raise an exception if two different bookings cover the same room on overlapping dates, whether that happens in RoomBooking.add_booking, or in some higher-level function that calls both Hotel.add_booking and RoomBooking.add_booking.
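The overlap check mentioned above might be sketched as follows. Everything here is an assumption for illustration: bookings carry a start and an end date (end exclusive), rooms are identified by number, and BookingConflict is a made-up exception name.

```python
from datetime import date


class BookingConflict(Exception):
    """Raised when a room is already booked for part of the requested stay."""


class Hotel:
    def __init__(self, name):
        self.name = name
        # room number -> list of (start, end, guest); end is exclusive
        self._bookings = {}

    def add_booking(self, room_number, guest, start, end):
        if start >= end:
            raise ValueError("start must be before end")
        for other_start, other_end, other_guest in self._bookings.get(room_number, []):
            # Two half-open date ranges overlap when each starts before the other ends.
            if start < other_end and other_start < end:
                raise BookingConflict(
                    "room %s is already booked by %s" % (room_number, other_guest)
                )
        self._bookings.setdefault(room_number, []).append((start, end, guest))


hotel = Hotel("Grand")
hotel.add_booking(101, "alice", date(2024, 5, 1), date(2024, 5, 4))
# A stay that starts on the day the previous one ends does not conflict.
hotel.add_booking(101, "bob", date(2024, 5, 4), date(2024, 5, 6))
```

A third booking of room 101 for 3 to 5 May would raise BookingConflict, while the same dates in another room would be accepted.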
And if this is multi-threaded (which seems like a good possibility, given that you're heading this far down the Java-inspired design path), you'll need a big lock, or a series of fine-grained locks, around the whole transaction.
For persistence, you probably want to store these mappings along with the public objects. But for a small enough data set, or for a server that rarely restarts, it might be simpler to just persist the public objects, and rebuild the mappings at load time by doing a bunch of add_booking calls as part of the load process.
If you want to make it even more ORM-style, you can have a single find method that takes keyword arguments and manually executes a "query plan" in a trivial way:
    @classmethod
    def find(cls, guest=None, hotel=None):
        if guest is None and hotel is None:
            return {booking for bookings in cls.guest_mapping.values()
                    for booking in bookings}
        elif hotel is None:
            return cls.guest_mapping[guest]
        elif guest is None:
            return cls.hotel_mapping[hotel]
        else:
            return {booking for booking in cls.guest_mapping[guest]
                    if booking.room.hotel == hotel}
But this is already pushing things to the point where you might want to go back and ask whether you were right not to use an ORM in the first place. If that sounds ridiculously heavy-duty for your simple toy app, take a look at sqlite3 for the database (which comes with Python, and which takes less work to use than coming up with a way to pickle or JSON-serialize all your data for persistence) and SQLAlchemy for the ORM. There's not much of a learning curve, and not much runtime overhead or coding-time boilerplate.

Sure, you can implement it in Python, but there is no single way to do it. Quite often you have a database layer where the association class is a table with two foreign keys (in your case to the primary keys of Room and Guest). To search, you would just write the corresponding SQL query and send it to the database. If you want to cache this table in memory, you could code it like this (or similarly) with an associative array:
from collections import defaultdict

class Room():
    def __init__(self, num):
        self.room_number = num

    def key(self):
        return str(self.room_number)

class Guest():
    def __init__(self, name):
        self.name = name

    def key(self):
        return self.name

def nested_dict(n, value_type):
    # Build a defaultdict nested n levels deep.
    if n == 1:
        return defaultdict(value_type)
    else:
        return defaultdict(lambda: nested_dict(n-1, value_type))

room_booking = nested_dict(2, str)

class Room_Booking():
    def __init__(self, date):
        self.date = date

room1 = Room(1)
guest1 = Guest("Joe")
room_booking[room1.key()][guest1.key()] = Room_Booking("some date")
print(room_booking[room1.key()][guest1.key()])

Related

Why use MongoAlchemy when you could subclass a Python Dict?

A friend recently showed me that you can create an instance that is a subclass of dict in Python, and then use that instance to save, update, etc. Seems like you have more control, and it looks easier as well.
class Marker(dict):
    def __init__(self, username, email=None):
        self.username = username
        if email:
            self.email = email

    @property
    def username(self):
        return self.get('username')

    @username.setter
    def username(self, val):
        self['username'] = val

    def save(self):
        db.collection.save(self)

Author here. The general reason you'd want to use it (or one of the many similar libraries) is for safety. When you assign a value to a MongoAlchemy Document, it checks that all of the constraints you specified are satisfied (e.g. type, lengths of strings, numeric bounds).
It also has a query DSL that can be more pleasant to use than the JSON-like built-in syntax. Here's an example from the docs:
>>> query = session.query(BloodDonor)
>>> for donor in query.filter(BloodDonor.first_name == 'Jeff', BloodDonor.age < 30):
...     print donor
Jeff Jenkins (male; Age: 28; Type: O+)
The MongoAlchemy Session object also allows you to simulate transactions:
with session:
    do_stuff()
    session.insert(doc1)
    do_more_stuff()
    session.insert(doc2)
    do_even_more_stuff()
    session.insert(doc3)
    # note that at this point nothing has been inserted
# now things are inserted
This doesn't mean that these inserts are one atomic operation, or even that all of the writes will succeed, but it does mean that if your application hits an error in one of the "do_stuff" functions, you won't have done half of the inserts. So it prevents a specific and reasonably common type of error.
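The idea of holding inserts back until the with block finishes can be illustrated with a small context manager. This is not MongoAlchemy's implementation, only a sketch of the pattern: BufferedSession is a made-up class, and a plain list stands in for the database collection.

```python
class BufferedSession:
    """Queues inserts and flushes them only if the with block exits cleanly."""

    def __init__(self, collection):
        self.collection = collection  # a plain list standing in for a collection
        self._pending = []

    def insert(self, doc):
        # Nothing is written yet; the document is only remembered.
        self._pending.append(doc)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # No error inside the block: perform all queued inserts now.
            self.collection.extend(self._pending)
        self._pending.clear()
        return False  # never swallow the exception


stored = []
with BufferedSession(stored) as session:
    session.insert({"name": "doc1"})
    session.insert({"name": "doc2"})
```

If any statement inside the block raises, the queued documents are discarded and stored is left unchanged, which is the "no half-done inserts" property described above.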

GAE Datastore ndb models accessed in 5 different ways

I run an online marketplace. I don't know the best way to access NDB models. I'm afraid it's a real mess and I really don't know which way to turn. If you don't have time for a full response, I'm happy to read an article on NDB best practices.
I have these classes, which are interlinked in different ways:
User(webapp2_extras.appengine.auth.models.User) controls seller logins
Partner(ndb.Model) contains information about sellers
menuitem(ndb.Model) contains information about items on menu
order(ndb.Model) contains buyer information & information about an order (all purchases are "guest" purchases)
Preapproval(ndb.Model) contains payment information saved from PayPal
How they're linked.
User - Partner
A 1-to-1 relationship. Both have "email address" fields. If these match, then we can retrieve the user from the partner or vice versa. For example:
user = self.user
partner = model.Partner.get_by_email(user.email_address)
Where in the Partner model we have:
@classmethod
def get_by_email(cls, partner_email):
    query = cls.query(Partner.email == partner_email)
    return query.fetch(1)[0]
Partner - menuitem
menuitems are children of Partner. Created like so:
myItem = model.menuitem(parent=model.partner_key(partner_name))
menuitems are referenced like this:
menuitems = model.menuitem.get_by_partner_name(partner.name)
where get_by_partner_name is this:
@classmethod
def get_by_partner_name(cls, partner_name):
    query = cls.query(
        ancestor=partner_key(partner_name)).order(ndb.GenericProperty("itemid"))
    return query.fetch(300)
and where partner_key() is a function just floating at the top of the model.py file:
def partner_key(partner_name=DEFAULT_PARTNER_NAME):
    return ndb.Key('Partner', partner_name)
Partner - order
Each Partner can have many orders. order has a parent that is Partner. How an order is created:
partner_name = self.request.get('partner_name')
partner_k = model.partner_key(partner_name)
myOrder = model.order(parent=partner_k)
How an order is referenced:
myOrder_k = ndb.Key('Partner', partnername, 'order', ordernumber)
myOrder = myOrder_k.get()
and sometimes like so:
order = model.order.get_by_name_id(partner.name, ordernumber)
(where in model.order we have:
@classmethod
def get_by_name_id(cls, partner_name, id):
    return ndb.Key('Partner', partner_name, 'order', int(id)).get()
)
This doesn't feel particularly efficient, particularly as I often have to look up the partner in the datastore just to pull up an order. For example:
user = self.user
partner = model.Partner.get_by_email(user.email_address)
order = model.order.get_by_name_id(partner.name, ordernumber)
Have tried desperately to get something simple like myOrder = order.get_by_id(ordernumber) to work, but it seems that having a partner parent stops that working.
Preapproval - order.
A 1-to-1 relationship. Each order can have a 'Preapproval'. Linkage: a field in the Preapproval class: order = ndb.KeyProperty(kind=order).
creating a Preapproval:
item = model.Preapproval( order=myOrder.key, ...)
accessing a Preapproval:
preapproval = model.Preapproval.query(model.Preapproval.order == order.key).get()
This seems like the easiest method to me.
TL;DR: I'm linking & accessing models in many ways, and it's not very systematic.

User - Partner
You could replace:
@classmethod
def get_by_email(cls, partner_email):
    query = cls.query(Partner.email == partner_email)
    return query.fetch(1)[0]
with:
@classmethod
def get_by_email(cls, partner_email):
    return cls.query(Partner.email == partner_email).get()
But because of transaction issues, it is better to use entity groups: User should be the parent of Partner.
In this case instead of using get_by_email you can get user without queries:
user = partner.key.parent().get()
Or do an ancestor query for getting the partner object:
partner = Partner.query(ancestor=user_key).get()
Query
Don't use fetch() if you don't need it. Use queries as iterators.
Instead of:
return query.fetch(300)
just:
return query
And then use query as:
for something in query:
    blah
Relationships: Partner-Menu Item and Partner - Order
Why are you using entity groups? Ancestors are not (necessarily) used for modeling 1-to-N relationships. Ancestors are used for transactions, by defining entity groups. They are useful in composition relationships (e.g. partner - user).
You can use a KeyProperty for the relationship (multivalued, i.e. repeated=True, or not, depending on the orientation of the relationship).
Have tried desperately to get something simple like myOrder = order.get_by_id(ordernumber) to work, but it seems that having a partner parent stops that working.
No problem if you stop using ancestors in this relationship.
TL;DR: I'm linking & accessing models in many ways, and it's not very systematic
There is no single systematic way of linking models. It depends on many factors: cardinality, the number of possible items on each side, the need for transactions, composition relationships, indexes, the complexity of future queries, denormalization for optimization, etc.

Ok, I think the first step in cleaning this up is as follows:
At the top of your .py file, import all your models, so you don't have to keep using model.ModelName. That cleans up a bit of the code: model.ModelName becomes ModelName.
First best practice in cleaning this up is to always use a capital letter as the first letter to name a class. A model name is a class. Above, you have mixed model names, like Partner, order, menuitem. It makes it hard to follow. Plus, when you use order as a model name, you may end up with conflicts. Above you redefined order as a variable twice. Use Order as the model name, and this_order as the lookup, and order_key as the key, to clear up some conflicts.
Ok, let's start there.

python database implementation

I am trying to implement a simple database program in python. I get to the point where I have added elements to the db, changed the values, etc.
class db:
    def __init__(self):
        self.database = {}

    def dbset(self, name, value):
        self.database[name] = value

    def dbunset(self, name):
        self.dbset(name, 'NULL')

    def dbnumequalto(self, value):
        mylist = [v for k, v in self.database.items() if v == value]
        return mylist

def main():
    mydb = db()
    cmd = raw_input().rstrip().split(" ")
    while cmd[0] != 'end':
        if cmd[0] == 'set':
            mydb.dbset(cmd[1], cmd[2])
        elif cmd[0] == 'unset':
            mydb.dbunset(cmd[1])
        elif cmd[0] == 'numequalto':
            print len(mydb.dbnumequalto(cmd[1]))
        elif cmd[0] == 'list':
            print mydb.database
        cmd = raw_input().rstrip().split(" ")

if __name__ == '__main__':
    main()
Now, as a next step, I want to be able to do nested transactions within this Python code. I begin a set of commands with a BEGIN command and then commit them with a COMMIT statement. A commit should commit all the transactions that have begun. However, a rollback should revert the changes back to the most recent BEGIN. I am not able to come up with a suitable solution for this.

A simple approach is to keep a "transaction" list containing all the information you need to be able to roll-back pending changes:
def dbset(self, name, value):
    self.transaction.append((name, self.database.get(name)))
    self.database[name] = value

def rollback(self):
    # undo all changes
    while self.transaction:
        name, old_value = self.transaction.pop()
        self.database[name] = old_value

def commit(self):
    # everything went fine, drop undo information
    self.transaction = []
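To cover the nested case from the question, the single undo list can become a stack of undo lists, one per open BEGIN. The following is a sketch under a few assumptions: the class name TransactionalDB and the _MISSING sentinel are made up, and writes made outside any transaction are simply applied without undo information.

```python
_MISSING = object()  # marks "the key did not exist before this change"


class TransactionalDB:
    def __init__(self):
        self.database = {}
        self._undo_logs = []  # one undo list per open BEGIN, innermost last

    def begin(self):
        self._undo_logs.append([])

    def dbset(self, name, value):
        if self._undo_logs:
            # Remember the previous value in the innermost transaction.
            old = self.database.get(name, _MISSING)
            self._undo_logs[-1].append((name, old))
        self.database[name] = value

    def rollback(self):
        """Undo the changes made since the most recent BEGIN."""
        if not self._undo_logs:
            return False  # nothing to roll back
        log = self._undo_logs.pop()
        while log:
            name, old = log.pop()  # undo in reverse order
            if old is _MISSING:
                self.database.pop(name, None)
            else:
                self.database[name] = old
        return True

    def commit(self):
        """Make every open transaction permanent by dropping all undo logs."""
        had_open = bool(self._undo_logs)
        self._undo_logs = []
        return had_open


db = TransactionalDB()
db.dbset("a", "10")
db.begin()
db.dbset("a", "20")
db.begin()
db.dbset("a", "30")
db.rollback()  # back to the state at the inner BEGIN
```

After the final rollback, "a" holds "20" again; a second rollback would restore "10", and a commit at any point would keep the current values and close all open transactions.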

If you are doing this as an academic exercise, you might want to check out the Rudimentary Database Engine recipe on the Python Cookbook. It includes quite a few classes to facilitate what you might expect from a SQL engine.
Database is used to create database instances without transaction support.
Database2 inherits from Database and provides for table transactions.
Table implements database tables along with various possible interactions.
Several other classes act as utilities to support some database actions that would normally be supported.
Like and NotLike implement the LIKE operator found in other engines.
date and datetime are special data types usable for database columns.
DatePart, MID, and FORMAT allow information selection in some cases.
In addition to the classes, there are functions for JOIN operations along with tests / demonstrations.

This is all available for free in the built-in sqlite3 module. The commits and rollbacks for SQLite are discussed in more detail than I can understand here.
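As a rough illustration of what sqlite3 offers here, nested transactions can be expressed with SQLite savepoints. The sketch below opens the connection with isolation_level=None so that the BEGIN, SAVEPOINT, and COMMIT statements are issued by hand; the table name kv and the savepoint name inner_tx are made up.

```python
import sqlite3

# isolation_level=None: the module does not open transactions on its own,
# so every BEGIN / SAVEPOINT / COMMIT below is explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE kv (name TEXT PRIMARY KEY, value TEXT)")

conn.execute("BEGIN")                                        # outer transaction
conn.execute("INSERT INTO kv VALUES ('a', '10')")
conn.execute("SAVEPOINT inner_tx")                           # acts as a nested BEGIN
conn.execute("UPDATE kv SET value = '20' WHERE name = 'a'")
conn.execute("ROLLBACK TO inner_tx")                         # undo only the inner change
conn.execute("COMMIT")                                       # keep what is left

value = conn.execute("SELECT value FROM kv WHERE name = 'a'").fetchone()[0]
```

Only the update made after the savepoint is undone, so the committed row still carries the value inserted in the outer transaction.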

How do I use TDD to create a database representation of existing objects?

I have used TDD to develop a set of classes in Python. These objects contain data fields, functions and links to each other. Everything functionally works like I want.
Eventually all of this should be stored in a database, to be used in a Django web application.
I have sketched some possible database schemas to hold the same information, but I feel this is a "sudden big leap", compared to the traditional TDD way of developing the rest of the application.
So, now I wonder, which tests should I write to force me to store these objects in a database in a step-by-step TDD way?
Making this question a bit more concrete, the classes are currently like this:
class Connector(object):
    def __init__(self, title = None):
        self.value = None
        self.valid = False
        self.title = title
    ...

class Element(object):
    def __init__(self, title = None):
        self.title = title
        self.input_connectors = []
        self.output_connectors = []
        self.number_of_runs = 0

    def run(self):
        ...
        self.number_of_runs += 1

class Average(Element):
    def __init__(self, title = None):
        # super() must name the class being defined (Average)
        super(Average, self).__init__(title = title)
        self.src = Connector("source")
        self.avg = Connector("average")
        self.input_connectors.append(self.src)
        self.output_connectors.append(self.avg)

    def run(self):
        super(Average, self).run()
        self.avg.set_value(numpy.average(self.src.value))
I realize some of the data should be in the database, while processing functions should not. I think there should be a table which represents the details of the different "types / subclasses " of Element, while also one which stores actual instances. But, as I said, I don't see how to get there using TDD.

First, ask yourself if you will be testing your code or the Django ORM. Storing and reading will be fine most of the time.
The things you will need to test are validating the data and any properties that are not model fields. I think you should end up with a good schema through writing tests on the next layer up from the database.
Also, use South or some other means of migration to reduce the cost of schema change. That will give you some peace of mind.
If you have several levels of tests (eg. integration tests), it makes sense to check that the database configuration is intact. Most of the tests do not need to hit the database. You can do this by mocking the model or some database operations (at least save()). Having said that, you can check database writes and reads with this simple test:
def test_db_access(self):
    input = Element(title = 'foo')
    input.save()
    output = Element.objects.get(title='foo')
    self.assertEqual(input, output)
Mocking save:
def save(obj):
    obj.id = 1
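One way to swap such a fake save() in for the duration of a test is unittest.mock.patch.object from the standard library. In the sketch below, Element is a plain stand-in class rather than a real Django model, and store() is a hypothetical piece of code under test; both exist only to make the example self-contained.

```python
from unittest import mock


class Element:
    """Stand-in for a Django model; the real save() would hit the database."""

    def __init__(self, title=None):
        self.title = title
        self.id = None

    def save(self):
        raise RuntimeError("database access is not allowed in unit tests")


def fake_save(self):
    # Pretend the database assigned a primary key.
    self.id = 1


def store(element):
    """Hypothetical code under test: saves the element and reports its id."""
    element.save()
    return element.id


# While the patch is active, Element.save is replaced by fake_save.
with mock.patch.object(Element, "save", fake_save):
    stored_id = store(Element(title="foo"))
```

Once the with block ends, the original save() is restored, so the patch cannot leak into other tests.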

Basic logic on Python OOP

I have studied Python and understand how the OOP concepts work. My doubt is how to implement them in an application that interacts with SQL.
Assume I have an employee table in SQL which contains Emp name, Emp address, and Emp salary. Now I need to give a salary raise to all the employees, for which I create a class with a method.
My Database logic
import MySQLdb

db = MySQLdb.connect("localhost","testuser","test123","TESTDB" )
cursor = db.cursor()
sql = "Select * from Emp"
cursor.execute(sql)
results = cursor.fetchall()
for row in results:
    name = row[0]
    address = row[1]
    salary = row[2]
db.close()
Class definition:
Class Emp():
    def __init__(self,name,salary):
        self.name = name
        self.salary = salary
    def giveraise(self):
        self.salary = self.salary*1.4
How do I create the objects with the details fetched from the Employee table? I know you don't need to create a class to perform this small operation, but I'm thinking along the lines of a practical implementation.
I need to process record by record.

It sounds like you are looking for an Object-Relational Mapper (ORM). This is a way of automatically integrating an SQL database with an object-oriented representation of the data, and sounds perfect for your use case. A list of ORMs is at http://wiki.python.org/moin/HigherLevelDatabaseProgramming. Once you have the ORM set up, you can just call the giveraise method on all the Emp objects. (Incidentally, the giveraise function you have defined is not valid Python.)
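For comparison, the mapping can also be done by hand without an ORM: build one object per fetched row, call the method, and write the result back. The sketch below uses sqlite3 in place of MySQLdb so that it runs without a server; the sample rows, the address argument, and the factor parameter are assumptions added for illustration.

```python
import sqlite3


class Emp:
    def __init__(self, name, address, salary):
        self.name = name
        self.address = address
        self.salary = salary

    def giveraise(self, factor=1.4):
        self.salary = self.salary * factor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (name TEXT, address TEXT, salary REAL)")
# Made-up sample rows.
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [("Ann", "1 Main St", 1000.0), ("Bob", "2 Side St", 2000.0)],
)

# One Emp object per row; the column order matches the constructor.
rows = conn.execute("SELECT name, address, salary FROM Emp ORDER BY name")
employees = [Emp(*row) for row in rows]

# Process record by record, then write each new salary back.
for emp in employees:
    emp.giveraise()
    conn.execute("UPDATE Emp SET salary = ? WHERE name = ?", (emp.salary, emp.name))
conn.commit()

salaries = [row[0] for row in conn.execute("SELECT salary FROM Emp ORDER BY name")]
```

Using the name as the key for the UPDATE only works because the sample names are unique; a real table would normally have an ID column for this.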

As Abe has mentioned, there are libraries that do this sort of mapping for you already (such as SQLAlchemy or Django's included ORM).
One thing to keep in mind though is that a "practical implementation" is subjective. For a small application and a developer with no experience with ORMs, coming up to speed on the general idea of an ORM can add time to the project, and subtle errors to the code. All too often, someone ends up confused about why something like this takes forever with an ORM:
for employee in database.query(Employee).all():
    employee.give_raise(0.5)
Assuming a table with 100 employees, this could make anything between a few and 200+ individual calls to the database, depending on how you've set up your ORM.
It may be completely justifiable to not use OOP, or at least not in the way you've described. This is a completely valid object:
class Workforce(object):
    def __init__(self, sql_connection):
        self.sql_connection = sql_connection

    def give_raise(self, employee_id, raise_amount):
        sql = 'UPDATE Employee SET Salary=Salary + ? WHERE ID = ?'
        # DB-API execute() takes the parameters as a single sequence.
        self.sql_connection.execute(sql, (raise_amount, employee_id))
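Taking the same idea one step further, a raise for every employee can be a single UPDATE statement instead of one statement per row. This is a minimal sketch using sqlite3; the table contents and the function name give_raise_to_all are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Salary REAL)")
# Made-up sample rows.
conn.executemany(
    "INSERT INTO Employee (ID, Salary) VALUES (?, ?)",
    [(1, 1000.0), (2, 2000.0)],
)


def give_raise_to_all(connection, raise_amount):
    """Raise every salary with one UPDATE and return the number of rows changed."""
    cursor = connection.execute(
        "UPDATE Employee SET Salary = Salary + ?", (raise_amount,)
    )
    connection.commit()
    return cursor.rowcount


updated = give_raise_to_all(conn, 50.0)
salaries = [row[0] for row in conn.execute("SELECT Salary FROM Employee ORDER BY ID")]
```

The database does the loop internally, so the number of round trips stays at one no matter how many employees there are.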
By no means am I trying to knock ORMs or dissuade you from using them. Especially if you're just experimenting for the heck of it, learning how to use them properly can make them valuable tools. Just keep in mind that they CAN be a hassle, especially if you've never used one before, and consider that there are simpler solutions if your application is small enough.
