Using __slots__ and "constants" in Python

I'm working with some LDAP data in Python (I'm not great at Python) and trying to organize a class to hold the LDAP variables. Since it's LDAP data, the end result will be many copies of the same data structure (one per user) collected in an iterable list.
I started with hard-coded attribute names for the __slots__, which seemed to be working, but as things progressed I realized those LDAP attribute names ought to be immutable constants of some sort to minimize hard-coded text and typos. I assigned variables to the __slots__ entries, but it seems this is not such a workable plan:
AttributeError: 'LDAP_USER' object has no attribute 'ATTR_GIVEN_NAME'
Now that I think about it, I'm not actually creating immutable "constants" with the ATTR_ definitions, so those values could theoretically be changed at runtime. I can see why Python might have a problem with this design.
What is a better way to reduce the usage of hard coded text in the code while maintaining a class object which can be instantiated?
ATTR_DN = 'dn'
ATTR_GIVEN_NAME = 'givenName'
ATTR_SN = 'sn'
ATTR_EMP_TYPE = 'employeeType'

class LDAP_USER(object):
    __slots__ = ATTR_GIVEN_NAME, ATTR_DN, ATTR_SN, ATTR_EMP_TYPE

user = LDAP_USER()
user.ATTR_GIVEN_NAME = "milton"
user.ATTR_SN = "waddams"
user.ATTR_EMP_TYPE = "collating"
print("user is " + user.ATTR_GIVEN_NAME)

With __slots__ defined as [ATTR_GIVEN_NAME, ATTR_DN], the attributes should be referenced using user.givenName and user.dn, since those are the string values in __slots__.
If you want to actually reference the attribute as user.ATTR_GIVEN_NAME then that should be the value in the __slots__ array. You can then add a mapping routine to convert the object attributes to LDAP fields when performing LDAP operations with the object.
Referencing an attribute not listed in __slots__ raises an AttributeError, so typos are caught at runtime.
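For example, a minimal sketch of that approach: the slot names are plain attribute names, the LDAP strings live in one constant mapping, and a hypothetical to_ldap() method (my own illustration, not from the original post) converts an instance back to LDAP attribute names:

ATTR_DN = 'dn'
ATTR_GIVEN_NAME = 'givenName'
ATTR_SN = 'sn'
ATTR_EMP_TYPE = 'employeeType'

class LDAP_USER(object):
    # Slots are the Python-visible attribute names; typos raise AttributeError.
    __slots__ = ('given_name', 'dn', 'sn', 'emp_type')

    # Single place mapping object attributes to LDAP attribute names.
    LDAP_MAP = {
        'given_name': ATTR_GIVEN_NAME,
        'dn': ATTR_DN,
        'sn': ATTR_SN,
        'emp_type': ATTR_EMP_TYPE,
    }

    def to_ldap(self):
        # Convert whichever slots are set into an LDAP-style dict.
        return {ldap_name: getattr(self, attr)
                for attr, ldap_name in self.LDAP_MAP.items()
                if hasattr(self, attr)}

user = LDAP_USER()
user.given_name = "milton"
user.sn = "waddams"
user.emp_type = "collating"
print(user.to_ldap())  # {'givenName': 'milton', 'sn': 'waddams', 'employeeType': 'collating'}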


"Updater" design pattern, as opposed to "Builder"

This is actually language agnostic, but I always prefer Python.
The builder design pattern is used to ensure a configuration is valid before an object is created, by delegating the creation process to a dedicated builder.
Some code to clarify:
class A():
    def __init__(self, m1, m2):  # obviously more complex in real life
        self._m1 = m1
        self._m2 = m2

class ABuilder():
    def __init__(self):
        self._m1 = None
        self._m2 = None

    def set_m1(self, m1):
        self._m1 = m1
        return self

    def set_m2(self, m2):
        self._m2 = m2
        return self

    def _validate(self):
        # complicated validations
        assert self._m1 < 1000
        assert self._m1 < self._m2

    def build(self):
        self._validate()
        return A(self._m1, self._m2)
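A usage sketch of the builder above (my addition), showing why each setter returns self, which enables fluent chaining:

a = ABuilder().set_m1(5).set_m2(10).build()  # validates, then constructs A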
My problem is similar, with the extra constraint that I can't re-create the object each time due to performance limitations.
Instead, I want to only update an existing object.
Bad solutions I came up with:
I could do as suggested here and just use setters like so
class A():
...
set_m1(self, m1):
self._m1 = m1
# and so on
But this is bad because using setters:
Defeats the purpose of encapsulation
Defeats the purpose of the builder (now updater), which is supposed to validate that some complex configuration is preserved after creation, or in this case after an update.
As I mentioned earlier, I can't recreate the object every time, as this is expensive and I only want to update some fields, or sub-fields, and still validate or sub-validate.
I could add update and validation methods to A and call those, but this defeats the purpose of delegating responsibility for updates, and is intractable in the number of fields:
class A():
    ...
    def update1(self, m1):
        pass  # complex_logic1

    def update2(self, m2):
        pass  # complex_logic2

    def update12(self, m1, m2):
        pass  # complex_logic12
I could just force an update of every single field of A in one method with optional parameters:
class A():
    ...
    def update(self, m1=None, m2=None):  # ...one optional parameter per field of A
        pass
Which again is not tractable, as this method will soon become a god method due to the many possible combinations.
Forcing the method to always accept changes in A and validating in the Updater also can't work, as the Updater would need to look at A's internal state to make a decision, causing a circular dependency.
How can I delegate updating fields in my object A in a way that:
Doesn't break encapsulation of A
Actually delegates the responsibility of updating to another object
Is tractable as A becomes more complicated
I feel like I am missing something trivial to extend building to updating.
I am not sure I understand all of your concerns, but I want to try and answer your post. From what you have written I assume:
Validation is complex and multiple properties of an object must be checked to decide if any change to the object is valid.
The object must always be in a valid state. Changes that make the object invalid are not permitted.
It is too expensive to copy the object, make the change, validate the object, and then reject the change if the validation fails.
First, move the validation logic out of the builder and into a separate class, such as a ModelValidator with a validateModel(model) method.
The first option is to use a command pattern.
Create an abstract class or interface named Update (I don't think Python has formal interfaces, but that's fine). The Update interface declares two methods, execute() and undo().
A concrete class has a name like UpdateAddress, UpdatePortfolio, or UpdatePaymentInfo.
Each concrete Update object also holds a reference to your model object.
The concrete classes hold the state needed for a particular kind of update. Imagine these methods exist on the UpdateAddress class:
UpdateAddress
setStreetNumber(...)
setCity(...)
setPostcode(...)
setCountry(...)
The update object needs to hold both the current and new values of a property. Like:
setStreetNumber(aString):
    self.oldStreetNumber = model.getStreetNumber()
    self.newStreetNumber = aString
When the execute method is called, the model is updated:
execute():
    model.setStreetNumber(newStreetNumber)
    model.setCity(newCity)
    # Set postcode and country
    if not ModelValidator.isValid(model):
        self.undo()
        raise ValidationError
and the undo method looks like:
undo():
    model.setStreetNumber(oldStreetNumber)
    model.setCity(oldCity)
    # Set postcode and country
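Putting that together, here is a minimal runnable Python sketch of the command idea (my own illustration; the Address model, its field names, and ValidationError are hypothetical stand-ins for your model):

class ValidationError(Exception):
    pass

class Address(object):
    # Stand-in for your model object.
    def __init__(self, street_number, city):
        self.street_number = street_number
        self.city = city

class ModelValidator(object):
    @staticmethod
    def is_valid(model):
        # Stand-in for the complex validation logic.
        return bool(model.street_number and model.city)

class UpdateAddress(object):
    # Command object: holds old and new values, can execute or undo.
    def __init__(self, model):
        self.model = model
        self.old_street_number = self.new_street_number = model.street_number
        self.old_city = self.new_city = model.city

    def set_street_number(self, value):
        self.new_street_number = value
        return self

    def set_city(self, value):
        self.new_city = value
        return self

    def execute(self):
        self.model.street_number = self.new_street_number
        self.model.city = self.new_city
        if not ModelValidator.is_valid(self.model):
            self.undo()  # roll back before reporting failure
            raise ValidationError("address update rejected")

    def undo(self):
        self.model.street_number = self.old_street_number
        self.model.city = self.old_city

addr = Address("42", "Springfield")
UpdateAddress(addr).set_city("Shelbyville").execute()
assert addr.city == "Shelbyville"
try:
    UpdateAddress(addr).set_city("").execute()  # invalid change
except ValidationError:
    pass
assert addr.city == "Shelbyville"  # rolled back, model still valid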
That is a lot of typing, but it would work. Mutating your model object is neatly encapsulated by the different kinds of updates. You can execute or undo the changes by calling those methods on the update object. You can even store a list of update objects for multi-level undo and retry.
It is, however, a lot of typing for the programmer. Consider using persistent data structures instead. Persistent data structures can be copied very quickly, in approximately constant time, because unchanged parts are shared between copies. Here is a python library of persistent data structures.
Let's assume your data was in a persistent data structure version of a dict. The library I referenced calls it a PMap.
The implementation of the update classes can be simpler. Starting with the constructor:
UpdateAddress(pmap):
    self.oldPmap = pmap
    self.newPmap = pmap
The setters are easier:
setStreetNumber(aString):
    self.newPmap = self.newPmap.set('streetNumber', aString)
Execute passes back a new instance of the model, with all the updates.
execute():
    if ModelValidator.isValid(self.newPmap):
        return self.newPmap
    else:
        raise ValidationError
The original object has not changed at all, thanks to the magic of persistent data structures.
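As a concrete sketch, assuming the library in question is pyrsistent (pip install pyrsistent) and using hypothetical field names:

from pyrsistent import pmap

class ValidationError(Exception):
    pass

def is_valid(model):
    # Stand-in for the complex validation logic.
    return model.get('streetNumber', '') != ''

class UpdateAddress(object):
    def __init__(self, model):
        self.old_pmap = model  # untouched original
        self.new_pmap = model  # evolving version; copies share structure

    def set_street_number(self, value):
        self.new_pmap = self.new_pmap.set('streetNumber', value)
        return self

    def execute(self):
        if is_valid(self.new_pmap):
            return self.new_pmap  # a brand-new model; the original is unchanged
        raise ValidationError()

address = pmap({'streetNumber': '42', 'city': 'Springfield'})
updated = UpdateAddress(address).set_street_number('1600').execute()
assert address['streetNumber'] == '42'    # original untouched
assert updated['streetNumber'] == '1600'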
The best thing is to not do any of this. Instead, use an ORM or object database. That is the "enterprise grade" solution. These libraries give you sophisticated tools like transactions and object version history.

Creating Object With A For Loop

Firstly, I do apologise as I'm not quite sure how to word this query within the Python syntax. I've just started learning it today having come from a predominantly PowerShell-based background.
I'm presently trying to obtain a list of projects within our organisation within Google Cloud. I want to display this information in two columns: project name and project number - essentially an object. I then want to be able to query the object to say: where project name is "X", give me the project number.
However, I'm rather having difficulty in creating said object. My code is as follows:
import os
from pprint import pprint
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
credentials = GoogleCredentials.get_application_default()
service = discovery.build('cloudresourcemanager', 'v1', credentials=credentials)
request = service.projects().list()
response = request.execute()
projects = response.get('projects')
The 'projects' variable then seems to be a list, rather than an object I can explore and run queries against. I've tried running things like:
pprint(projects.name)
projects.get('name')
Both of which return the error:
"AttributeError: 'list' object has no attribute 'name'"
I looked into creating a Class within a For loop as well, which nearly gave me what I wanted, but only displayed one project name and project number at a time, rather than the entire collection I can query against:
projects = []
for project in response.get('projects', []):
    class ProjectClass:
        name = project['name']
        projectNumber = project['projectNumber']
    projects.append(ProjectClass.name)
    projects.append(ProjectClass.projectNumber)
I thought if I stored each class in a list it might work, but alas, no such joy! Perhaps I need to have the For loop within the class variables?
Any help with this would be greatly appreciated!
As @Code-Apprentice mentioned in a comment, I think you are missing a critical understanding of object-oriented programming, namely the difference between a class and an object. Think of a class as a "blueprint" for creating objects: your class ProjectClass tells Python that objects of type ProjectClass will have two fields, name and projectNumber. However, ProjectClass itself is just the blueprint, not an object. You then need to create an instance of ProjectClass, which you would do like so:
project_class_1 = ProjectClass()
Great, now you have an object of type ProjectClass, and it will have fields name and projectNumber, which you can reference like so:
project_class_1.name
project_class_1.projectNumber
However, you will notice that all instances of the class that you create will have the same value for name and projectNumber; this just won't do! We need to be able to specify values when we create each instance. Enter __init__(), a special Python method colloquially referred to as the constructor. This function is called by Python automatically when we create a new instance of our class as above, and is responsible for setting up all the fields of that class. Another powerful feature of classes and objects is that you can define a collection of different functions that can be called at will.
class ProjectClass:
    def __init__(self, name, projectNumber):
        self.name = name
        self.projectNumber = projectNumber
Much better. But wait, what's that self variable? Well, just as before we were able to reference the fields of our instance via the project_class_1 variable name, we need a way to access the fields of an instance while running functions that are part of that instance, right? Enter self. self is not a builtin but the conventional name for the first parameter of every method; it holds a reference to the particular instance of ProjectClass being accessed. That way, we can set fields on the instance that will persist, but not be shared with or overwritten by other instances of ProjectClass. It's important to remember that the first argument passed to any function defined on a class will always be self (except for some edge cases you don't need to worry about now).
So restructuring your code, you would have something like this:
class ProjectClass:
    def __init__(self, name, projectNumber):
        self.name = name
        self.projectNumber = projectNumber

projects = []
for project in response.get('projects', []):
    projects.append(ProjectClass(project["name"], project["projectNumber"]))
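And to answer the original question ("where project name is X, give me the project number"), a quick lookup over that list might look like this (my addition; replace "X" with the real project name):

project_number = next((p.projectNumber for p in projects if p.name == "X"), None)
print(project_number)  # None if no project named "X" exists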
Hopefully I've explained this well and given you a complete answer on how all these pieces fit together. The hope is for you to be able to write that code on your own and not just give you the answer!

Names of instances and loading objects from a database

I have, for example, the following class structure:
class Company(object):
    Companycount = 0
    _registry = {}

    def __init__(self, name):
        Company.Companycount += 1
        self._registry[Company.Companycount] = [self]
        self.name = name

k = Company("a firm")
b = Company("another firm")
Whenever I need the objects I can access them through Company._registry, which gives me a dictionary of all instances.
Do I need meaningful names for my objects, given that the company name is stored as an attribute and I can iterate over Company._registry?
When loading the data from the database, does it matter what the name of the instance (here k and b) is? Or can I just use arbitrary strings?
Both your Company._registry and the names k and b are just references to your actual instances. Neither play any role in what you'd store in the database.
Python's object model has all objects living on a big heap, and your code interacts with the objects via such references. You can make as many references as you like, and objects automatically are deleted when there are no references left. See the excellent Facts and myths about Python names and values article by Ned Batchelder.
You need to decide, for yourself, if the Company._registry structure needs to have names or not. Iteration over a list is slow if you already have a name for a company you wanted to access, but a dictionary gives you instant access.
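For instance, a sketch of the registry keyed by company name instead of a counter (my addition; note the original code also wraps each instance in a one-element list, which is probably unintended):

class Company(object):
    _registry = {}

    def __init__(self, name):
        self.name = name
        Company._registry[name] = self  # keyed by name, no wrapping list

k = Company("a firm")
b = Company("another firm")
print(Company._registry["a firm"] is k)  # True: instant lookup by name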
If you are going to use an ORM, then you don't really need that structure anyway. Leave it to the ORM to help you find your objects, or give you a sequence of all objects to iterate over. I recommend using SQLAlchemy for this.
The name doesn't matter, but if you are going to initialize a lot of objects you will still want some reasonable way to refer to them.

Dynamically building up types in python

Suppose I am building a composite set of types:
def subordinate_type(params):
    # Dink with stuff
    a = type(myname, (), dict_of_fields)
    return a()

def toplevel(params):
    lots_of_types = dict(keys, values)
    myawesomedynamictype = type(toplevelname, (), lots_of_types)
    # Now I want to edit some of the values in
    # myawesomedynamictype's lots_of_types.
    return myawesomedynamictype()
In this particular case, I want a reference to the "typeclass" myawesomedynamictype inserted into lots_of_types.
I've tried iterating through lots_of_types and setting it, supposing that the references pointed at the same thing, but I found that myawesomedynamictype got corrupted and lost its fields.
The problem I'm trying to solve is that I get values related to the type subordinate_type, and I need to generate a toplevel instantiation based on subordinate_type.
This is an ancient question, and because it's not clear what the code is trying to do (being a code gist rather than working code), it's a little hard to answer.
But it sounds like you want a reference to the dynamically created class myawesomedynamictype on the class itself. A copy (I believe) of the dictionary lots_of_types became the __dict__ of this new class when you called type() to construct it.
So, just set a new attribute on the class whose value is the class you just constructed; is that what you were after?
def toplevel(params):
    lots_of_types = dict(keys, values)
    myawesomedynamictype = type(toplevelname, (), lots_of_types)
    # Give the new class a reference to itself.
    myawesomedynamictype.myawesomedynamictype = myawesomedynamictype
    return myawesomedynamictype()
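A self-contained demonstration of that self-reference trick (my own minimal example):

# Build a class dynamically, then attach the class itself as an attribute.
MyDynamic = type('MyDynamic', (), {'field': 42})
MyDynamic.myclass = MyDynamic

instance = MyDynamic()
assert instance.myclass is MyDynamic  # the reference is reachable from instances
assert instance.field == 42           # the original fields are intact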

SqlAlchemy optimizations for read-only object models

I have a complex network of objects being spawned from a sqlite database using sqlalchemy ORM mappings. I have quite a few deeply nested loops like:
for parent in owner.collection:
    for child in parent.collection:
        for foo in child.collection:
            do lots of calcs with foo.property
My profiling is showing me that the sqlalchemy instrumentation is taking a lot of time in this use case.
The thing is: I don't ever change the object model (mapped properties) at runtime, so once they are loaded I don't NEED the instrumentation, or indeed any sqlalchemy overhead at all. After much research, I'm thinking I might have to clone a 'pure python' set of objects from my already loaded 'instrumented objects', but that would be a pain.
Performance is really crucial here (it's a simulator), so maybe writing those layers as C extensions using sqlite api directly would be best. Any thoughts?
If you reference a single attribute of a single instance lots of times, a simple trick is to store it in a local variable.
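For example (a generic sketch using the question's foo):

# assuming `foo` is one of the mapped objects from the loops above
value = foo.property      # one instrumented lookup, hoisted out of the loop
total = 0.0
for i in range(1000000):
    total += value * 2.0  # plain local-variable access each iteration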
If you want a way to create cheap pure python clones, share the dict object with the original object:
class CheapClone(object):
    def __init__(self, original):
        # Share (not copy) the original's attribute dict.
        self.__dict__ = original.__dict__
Creating a copy like this costs about half as much as a single instrumented attribute access, and attribute lookups on the clone are as fast as on a normal object.
There might also be a way to have the mapper create instances of an uninstrumented class instead of the instrumented one. If I have some time, I might take a look at how deeply ingrained the assumption is that populated instances are of the same type as the instrumented class.
Found a quick and dirty way that seems to at least somewhat work on 0.5.8 and 0.6. I didn't test it with inheritance or other features that might interact badly. Also, this touches some non-public APIs, so beware of breakage when changing versions.
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm.attributes import ClassManager, instrumentation_registry

class ReadonlyClassManager(ClassManager):
    """Enables configuring a mapper to return instances of uninstrumented
    classes instead. To use, add a readonly_type attribute referencing the
    desired class to use instead of the instrumented one."""

    def __init__(self, class_):
        ClassManager.__init__(self, class_)
        self.readonly_version = getattr(class_, 'readonly_type', None)
        if self.readonly_version:
            # default instrumentation logic doesn't know to install finders
            # for our alternate class
            instrumentation_registry._dict_finders[self.readonly_version] = self.dict_getter()
            instrumentation_registry._state_finders[self.readonly_version] = self.state_getter()

    def new_instance(self, state=None):
        if self.readonly_version:
            instance = self.readonly_version.__new__(self.readonly_version)
            self.setup_instance(instance, state)
            return instance
        return ClassManager.new_instance(self, state)

Base = declarative_base()
Base.__sa_instrumentation_manager__ = ReadonlyClassManager
Usage example:
from sqlalchemy import Column, Integer, String

class ReadonlyFoo(object):
    pass

class Foo(Base, ReadonlyFoo):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    readonly_type = ReadonlyFoo

assert type(session.query(Foo).first()) is ReadonlyFoo
You should be able to disable lazy loading on the relationships in question and sqlalchemy will fetch them all in a single query.
Try using a single query with JOINs instead of the python loops.
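For example, with eager-loading options (a sketch assuming the question's Owner/Parent mappings and an open session; joinedload is the modern SQLAlchemy name, older releases called it eagerload):

from sqlalchemy.orm import joinedload

# Pull the whole owner/parent/child tree in one round trip
# instead of issuing a query per lazy-loaded collection.
owners = (
    session.query(Owner)
    .options(joinedload(Owner.collection).joinedload(Parent.collection))
    .all()
)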
