Make Python Class Methods searchable in Mongodb - python

I have a MongoDB collection and have created a Python class for the documents in it. The class has some properties and methods that are not stored with the document. Should I store them so that the properties become searchable, or leave them unstored and search the objects in Python?
Here is an example:
# Child
class Child:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    @property
    def parent(self):
        try:
            return Parent(**db.Parents.find_one({'child': self._id}))
        except:
            return None

# Parent
class Parent:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    @property
    def child(self):
        try:
            return Child(**db.Children.find_one({'parent': self._id}))
        except:
            return None
In this example, to search for all the children whose parent's name is "foo", I have to do this:
children = (Child(**c) for c in db.Children.find())
results = [c for c in children if c.parent and c.parent.name == 'foo']
This means I have to pull all the Children documents from MongoDB and search them in Python. Would it be smarter to write the Parent data (or a subset of it) to the Child document, so that MongoDB can do the searching? My Child class could then look like this:
import inspect

# Child
class Child:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    @property
    def parent_name(self):
        try:
            return db.Parents.find_one({'child': self._id})['name']
        except:
            return None

    def _save(self):
        # something like this to get and save all the properties
        data = {m[0]: getattr(self, m[0]) for m in inspect.getmembers(self)}
        db.Children.find_and_modify({'_id': self._id}, {'$set': data}, upsert=True)

# search
results = [Child(**c) for c in db.Children.find({'parent_name': 'foo'})]
So the search is more efficient, but keeping the Child documents up to date could be painful and dangerous: if I change the name of a Parent, I also have to rewrite its children. That feels wrong. Any better ideas?

You don’t have to load all Children.
parent_ids = db.Parents.find({'name': 'foo'}).distinct('_id')
children = db.Children.find({'parent': {'$in': parent_ids}})
(Also, why do you have both a child field on a parent and a parent field on a child?)
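The two-step lookup can be wrapped in a small helper. This is a minimal sketch, assuming `db` is a pymongo database handle with the `Parents` and `Children` collections from the question and that each child document stores its parent's `_id` in a `parent` field; the function name `find_children_by_parent_name` is made up for illustration.

```python
def find_children_by_parent_name(db, parent_name):
    """Return the Children documents whose parent has the given name.

    Assumes `db.Parents` and `db.Children` behave like pymongo collections
    and that child documents reference their parent through a `parent` field.
    """
    # Step 1: let MongoDB find the matching parents and hand back only their ids.
    parent_ids = db.Parents.find({'name': parent_name}).distinct('_id')
    # Step 2: fetch only the children that reference one of those ids.
    return list(db.Children.find({'parent': {'$in': parent_ids}}))
```

Only the parent ids travel to the client between the two queries, and the parent's name stays in a single place, so renaming a parent does not require rewriting its children.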

Related

python subclasses have general functions shared across them

I've created a base class and a subclass. I'll be creating more subclasses, but I have some general functions that will be used across all of them. Is this the proper way to set it up? I assume it would be easier to add the function to the base class and then call it from each subclass. Is that possible, and is it recommended?
"""
Base class for all main class objects
"""
class Node(object):
    def __init__(self, name, attributes, children):
        self.name = name
        self.attributes = attributes if attributes is not None else {}
        self.children = children if children is not None else []
"""
contains the settings for cameras
"""
class Camera(Node):
    def __init__(self, name="", attributes=None, children=None, enabled=True):
        super(Camera, self).__init__(name=name, attributes=attributes, children=children)
        self.enabled = enabled
        # defaults
        add_node_attributes(nodeObject=self)

# General class related functions
# ------------------------------------------------------------------------------
""" Adds attributes to the supplied nodeObject """
def add_node_attributes(nodeObject=None):
    if nodeObject:
        nodeObject.attributes.update({"test": 5})

# create test object
Camera()
You should add the general methods on the base class and call them from the subclass:
class Node(object):
    def __init__(self, name, attributes, children):
        self.name = name
        self.attributes = attributes if attributes is not None else {}
        self.children = children if children is not None else []

    def add_node_attributes(self):
        self.attributes.update({"test": 5})
This allows you to take maximum advantage of inheritance. Your subclasses will have the method add_node_attributes available to them:
c=Camera()
c.add_node_attributes()
You can also call it from within the child class:
class Camera(Node):
    def __init__(self, name="", attributes=None, children=None, enabled=True):
        super(Camera, self).__init__(name=name, attributes=attributes, children=children)
        self.enabled = enabled
        # defaults
        self.add_node_attributes()
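Once the shared method lives on the base class, a subclass can also extend it rather than just call it. The following is only a sketch: the `Light` subclass and its `intensity` default are hypothetical, the constructor arguments are given defaults for brevity, and the call to `add_node_attributes()` is moved into `Node.__init__` so that every subclass gets its defaults without repeating the call.

```python
class Node(object):
    def __init__(self, name="", attributes=None, children=None):
        self.name = name
        self.attributes = attributes if attributes is not None else {}
        self.children = children if children is not None else []
        # Every node type gets its defaults here, so subclasses need not repeat the call.
        self.add_node_attributes()

    def add_node_attributes(self):
        # Defaults shared by every node type.
        self.attributes.update({"test": 5})


class Light(Node):  # hypothetical second subclass
    def add_node_attributes(self):
        # Reuse the shared defaults, then add the subclass-specific ones.
        super(Light, self).add_node_attributes()
        self.attributes.update({"intensity": 1.0})
```

Calling an overridable method from the base constructor is a design choice: it keeps subclasses short, but the override runs before any subclass-specific `__init__` code, so it should not rely on attributes set there.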

Dynamic lazy class properties python

I have a class, Record, that is used to contain the results of reading a text file. The file contains a simple database with fields and tags. I would like each Record instance to have only the properties associated with its own database. Basically:
R1 = Record("file1")
R2 = Record("file2")
print(R1.TI) #"Record 1's title"
print(R2.TI) #AttributeError: 'Record' object has no attribute 'TI'
Unfortunately, some of the fields may require a large amount of processing to return something useful, and those values may never be needed. So I would like each value to be computed the first time it is accessed, not when the object is initialized.
Because I know only the tag names, I have tried:
class tagWrapper(object):
    def __init__(self, tag):
        self.tag = tag
        self.data = None

    def __get__(self, instance, owner):
        if self.data == None:
            try:
                #tagToFunc is a dictionary that maps tags to their processing function
                self.data = tagToFunc[self.tag](instance._rawDataDict[self.tag])
            except KeyError: #I do not know the full list of tags
                self.data = instance._rawDataDict[self.tag]
        return self.data

class Record(object):
    def __init__(self, file):
        #Reading file and making _rawDataDict
        setattr(self, tag, tagWrapper(tag))
This causes R1.TI to produce the wrapper object, not the value I want. So I suspect I am doing something wrong with the __get__ method.
Note: I am trying to make the attributes part of the individual class instances and not evaluated until needed. I can implement one or the other but have not been able to work out how to do both.
You are implementing the descriptor protocol, and a descriptor belongs to the class rather than to an instance of the class, so it has no effect when assigned to an instance attribute.
class Tag(object):
    def __init__(self, tag):
        self.tag = tag
        self.data = None  # cached on the descriptor, so shared by all Record instances

    def __get__(self, instance, owner):
        if instance is None:  # accessed on the class directly, i.e. Record.T1: just return this descriptor
            return self
        if self.data is None:
            print("Reading data")
            self.data = list(range(10))
        return self.data

class Record(object):
    T1 = Tag('T1')
I have a solution that seems to work although it is quite ugly:
class Record(object):
    def __init__(self, file):
        self._unComputedTags = set() #needs to be initialized first
        #stuff
        self._unComputedTags = set(self._fieldDict.keys())
        for tag in self._fieldDict:
            self.__dict__[tag] = None

    def __getattribute__(self, name):
        if name == '_unComputedTags':
            #This may be unnecessary if I play with things a bit
            return object.__getattribute__(self, '_unComputedTags')
        if name in self._unComputedTags:
            try:
                tagVal = tagToFunc[name](self._fieldDict[name])
            except KeyError:
                tagVal = self._fieldDict[name]
            setattr(self, name, tagVal)
            self._unComputedTags.remove(name)
        return object.__getattribute__(self, name)
I do not like overriding __getattribute__, but this seems to work.
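A lighter-weight way to get attributes that are both per-instance and lazy is `__getattr__`, which Python calls only when normal attribute lookup fails. The sketch below is an illustration under assumptions, not the asker's actual class: `raw_fields` stands in for the dictionary parsed from the file, and `tag_to_func` for the tag-to-processing-function mapping described in the question.

```python
class Record(object):
    """Sketch: per-instance attributes that are computed on first access."""

    def __init__(self, raw_fields, tag_to_func=None):
        # raw_fields: tag -> unprocessed value (hypothetical stand-in for the parsed file)
        self._raw_fields = dict(raw_fields)
        # tag_to_func: tag -> processing function; tags without one are returned raw
        self._tag_to_func = tag_to_func or {}

    def __getattr__(self, name):
        # Only reached when `name` is not yet in the instance __dict__.
        if name.startswith('_'):
            # Guard against recursion if the private attributes are missing.
            raise AttributeError(name)
        try:
            raw = self._raw_fields[name]
        except KeyError:
            raise AttributeError(
                "%r object has no attribute %r" % (type(self).__name__, name))
        func = self._tag_to_func.get(name)
        value = func(raw) if func is not None else raw
        # Cache on the instance: later lookups find it directly and skip __getattr__.
        setattr(self, name, value)
        return value
```

Because the computed value is stored in the instance `__dict__`, each tag is processed at most once per record, and a record raises `AttributeError` for tags that its own file did not contain.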

Python - How can I access variables of an above class within a class within a list?

I'm not sure if this is possible within the language or not, but imagine this:
class Parent(object):
    def __init__(self, number):
        self.variable_to_access = "I want this"
        self.object_list = []
        for i in range(number): self.object_list.append(Object_In_List(i))

class Object_In_List(object):
    def __init__(self): pass
    def my_method(self):
        # How can I access variable_to_access
        pass
I have oversimplified this, but I was thinking Object_In_List could inherit from Parent. However, Parent will contain many other items and I am concerned about memory usage.
I want to avoid constantly passing variable_to_access itself. Is it actually possible to access variable_to_access within my_method()?
Thanks
Here is a slightly more complicated example which behaves like Java inner classes:
class Parent(object):
    def __init__(self, number):
        self.variable_to_access = "I want this"
        self.object_list = []
        for i in range(number):
            # Pass in a reference to the parent class when constructing our "inner class"
            self.object_list.append(Object_In_List(self, i))

class Object_In_List(object):
    # We need a reference to our parent class
    def __init__(self, parent, i):
        self.parent = parent

    # ... So we can forward attribute lookups on to the parent
    def __getattr__(self, name):
        return getattr(self.parent, name)

    # Now we can treat members of the parent class as if they were our own members (just like Java inner classes)
    def my_method(self):
        # You probably want to do something other than print here
        print(self.variable_to_access)
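On the memory concern: storing the parent on each child stores a reference, not a copy of the parent's data. If the resulting parent-to-child-to-parent reference cycle is a worry, the back-reference can be made weak. The following is a sketch under that assumption; the class name `ObjectInList` and the `index` attribute are illustrative only.

```python
import weakref


class Parent(object):
    def __init__(self, number):
        self.variable_to_access = "I want this"
        self.object_list = [ObjectInList(self, i) for i in range(number)]


class ObjectInList(object):
    def __init__(self, parent, index):
        # A weak proxy holds only a reference: no parent data is copied,
        # and the child does not keep the parent alive.
        self._parent = weakref.proxy(parent)
        self.index = index

    def my_method(self):
        # Always reads the parent's current value.
        return self._parent.variable_to_access
```

The trade-off is that a child must not outlive its parent: using the proxy after the parent has been garbage-collected raises `ReferenceError`.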
class Parent(object):
    def __init__(self, number):
        self.variable_to_access = "I want this"
        self.object_list = []
        for i in range(number):
            self.object_list.append(Object_In_List(self.variable_to_access, i))

class Object_In_List(object):
    def __init__(self, parent_var, i):
        self.pv = parent_var

    def my_method(self):
        # How can I access variable_to_access
        # self.pv is what you want
        return self.pv
Right now it seems the only place any Object_In_List objects exist is within the Parent attribute self.object_list. If that is the case, then you are already in the Parent class when accessing an Object_In_List, so you don't even need my_method in Object_In_List. If these objects exist outside of that list, you would have to pass the Parent object or the variable anyway, as shown in the other answers.
If you want to be creative you could play with class attributes of Parent. This would not be as "variable" though, unless your class attribute was a dictionary, but then there is the memory thing.

Python shared property parent/child

Embarrassed to ask, but I am using webapp2 and I am templating out a solution to make it easier to define routes, based on this Google webapp2 route function. It all depends on being able to define TYPE_NAME at the child level. The idea is that the parent sets everything up and the child just needs to implement the _list function. The issue I ran into is that TYPE_NAME is None and I need it to be the child's value.
#main WSGI is extended to have this function
class WSGIApplication(webapp2.WSGIApplication):
    def route(self, *args, **kwargs):
        def wrapper(func):
            self.router.add(webapp2.Route(handler=func, *args, **kwargs))
            return func
        return wrapper

from main import application

class ParentHandler(RequestHandler):
    TYPE_NAME = None

    @application.route('/', name="list_%s" % TYPE_NAME)
    def list(self):
        return self._list()

class ChildHandler(ParentHandler):
    TYPE_NAME = 'child'

    def _list(self):
        return []
I have tried a couple solutions using "class properties" but they didn't pan out. Open to other ideas, I basically just need the child class to inherit the decorated properties and execute them.
Edit:
For all of those on the edge of their seats wondering how I fixed this: I was not able to get everything I needed out of the decorator, so I ended up using a metaclass. I also added a _URLS parameter to allow for adding additional "routes"; it maps custom functions to routes. I really wanted to use a decorator but couldn't get it to work.
class RequestURLMeta(type):
    def __new__(mcs, name, bases, dct):
        result = super(RequestURLMeta, mcs).__new__(mcs, name, bases, dct)
        urls = getattr(result, '_URLS', {}) or {}
        for k,v in urls.iteritems():
            template = v.pop('template')
            app.route(getattr(result, k), template, **v)
        if getattr(result, 'TYPE_NAME', None):
            app.route(result.list, result.ROOT_PATH, methods=['GET'],name="%s" % result.TYPE_NAME)
            #other ones went here..
        return result

class ParentHandler(RequestHandler):
    __metaclass__ = RequestURLMeta

class ChildHandler(ParentHandler):
    TYPE_NAME = 'child'
    _URLS = { 'custom': '/custom', 'TYPE_NAME': 'custom_test' }

    def _list(self):
        return []

    def custom(self): pass
I think to get this to work you are going to need to use a metaclass. It might look something like the following (untested):
from main import application

class RouteMeta(type):
    def __new__(mcs, name, bases, dct):
        type_name = dct.get("TYPE_NAME")
        if type_name is not None:
            @application.route('/', type_name)
            def list(self):
                return self._list()
            dct["list"] = list
        return super(RouteMeta, mcs).__new__(mcs, name, bases, dct)

class ParentHandler(RequestHandler):
    __metaclass__ = RouteMeta

class ChildHandler(ParentHandler):
    TYPE_NAME = 'child'

    def _list(self):
        return []
Instead of list() being an attribute of ParentHandler, it is created dynamically for each class that inherits from ParentHandler and defines TYPE_NAME.
If RequestHandler also uses a custom metaclass, have RouteMeta inherit from RequestHandler.__metaclass__ instead of type.
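The `__metaclass__` attribute is Python 2 syntax. On Python 3.6 or later, the same per-subclass hook can be written without a metaclass by using `__init_subclass__`. This is a sketch of that swapped-in technique, not webapp2 code: the `routes` dictionary is a hypothetical stand-in for the real router, and `ParentHandler` is a plain class here so the example runs on its own.

```python
routes = {}  # hypothetical stand-in for the router: route name -> handler function


class ParentHandler(object):
    TYPE_NAME = None

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass, after the subclass body has executed,
        # so the subclass's own TYPE_NAME is already visible here.
        super().__init_subclass__(**kwargs)
        if cls.TYPE_NAME is not None:
            routes["list_%s" % cls.TYPE_NAME] = cls.list

    def list(self):
        return self._list()


class ChildHandler(ParentHandler):
    TYPE_NAME = 'child'

    def _list(self):
        return []
```

With a real application object, the assignment into `routes` would be replaced by whatever registration call the framework provides.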
This code:
@application.route('/', name="list_%s" % TYPE_NAME)
def list(self):
    ...
is semantically identical to this one:
def list(self):
    ...
list = application.route('/', name="list_%s" % TYPE_NAME)(list)
i.e. the route method is called while the ParentHandler class body is being executed, so whatever lazy trick you try there will not work. You should try something different:
from main import application

def route_list(klass):
    klass.list = application.route('/',
        name="list_%s" % klass.TYPE_NAME)(klass.list)
    return klass

class ParentHandler(RequestHandler):
    def list(self):
        return self._list()

class ChildHandler(ParentHandler):
    TYPE_NAME = 'child'

    def _list(self):
        return []

# with class decorator syntax (Python 2.6+ and Python 3) this would be:
# @route_list
# class ChildHandler(ParentHandler):
#     ...
ChildHandler = route_list(ChildHandler)
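The evaluation-order point can be checked without webapp2. In this sketch, `route` is a hypothetical stand-in for `application.route` that simply records the name it was given; everything else mirrors the handlers above.

```python
registered = []  # stand-in for the router: collects (path, name) pairs


def route(path, name):
    """Hypothetical stand-in for application.route: records what it was given."""
    def wrapper(func):
        registered.append((path, name))
        return func
    return wrapper


class ParentHandler(object):
    TYPE_NAME = None

    # The decorator arguments are evaluated right here, while the body of
    # ParentHandler is executing, so TYPE_NAME is still None.
    @route('/', name="list_%s" % TYPE_NAME)
    def list(self):
        return self._list()


class ChildHandler(ParentHandler):
    TYPE_NAME = 'child'

    def _list(self):
        return []


def route_list(klass):
    # Registering after the subclass exists sees the subclass's TYPE_NAME.
    route('/', name="list_%s" % klass.TYPE_NAME)(klass.list)
    return klass


route_list(ChildHandler)
```

After this runs, `registered` holds one entry named `list_None` from the parent's decorator and one named `list_child` from the class decorator, which is why the registration has to happen after the subclass is defined.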

Python API design pattern question

I often find I have class instances that are descendants of other class instances, in a tree-like fashion. For example, say I'm making a CMS platform in Python. I might have a Realm, and under that a Blog, and under that a Post. Each constructor takes its parent as the first parameter so it knows what it belongs to. It might look like this:
class Realm(object):
    def __init__(self, username, password): ...

class Blog(object):
    def __init__(self, realm, name): ...

class Post(object):
    def __init__(self, blog, title, body): ...
I typically add a create method to the parent class, so the linkage is a bit more automatic. My Realm class might look like this:
class Realm(object):
    def __init__(self, username, password):
        ...

    def createBlog(self, name):
        return Blog(self, name)
That allows the user of the API to not import every single module, just the top level one. It might be like:
realm = Realm("admin", "FDS$#%")
blog = realm.createBlog("Kittens!")
post = blog.createPost("Cute kitten", "Some HTML blah blah")
The problem is those create methods are redundant and I have to pydoc the same parameters in two places.
I wonder if there's a pattern (perhaps using metaclasses) for linking one class instance to a parent class instance. Some way I could call code like this and have the blog know what its parent realm is:
realm = Realm("admin", "FDS$#%")
blog = realm.Blog("Kittens!")
You could use a common base class for the containers featuring an add() method
class Container(object):
    def __init__(self, parent=None):
        self.children = []
        self.parent = parent

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child
and make the parent parameter optional in the derived classes
class Blog(Container):
    def __init__(self, name, realm=None):
        Container.__init__(self, realm)
        self.name = name
Your code above would now read
realm = Realm("admin", "FDS$#%")
blog = realm.add(Blog("Kittens!"))
post = blog.add(Post("Cute kitten", "Some HTML blah blah"))
You wouldn't have any create...() methods any more, so no need to document anything twice.
If setting the parent involves more than just modifying the parent attribute, you could use a property or a setter method.
EDIT: As you pointed out in the comments below, the children should be tied to the parents by the end of the constructor. The above approach can be modified to support this:
class Container(object):
    def __init__(self, parent=None):
        self.children = []
        self.parent = parent

    def add(self, cls, *args):
        child = cls(self, *args)
        self.children.append(child)
        return child

class Realm(Container):
    def __init__(self, username, password):
        ...

class Blog(Container):
    def __init__(self, realm, name):
        ...

class Post(Container):
    def __init__(self, blog, title, body):
        ...

realm = Realm("admin", "FDS$#%")
blog = realm.add(Blog, "Kittens!")
post = blog.add(Post, "Cute kitten", "Some HTML blah blah")
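For reference, here is one way the elided constructor bodies might be filled in so that the factory-style `add()` runs end to end. The attribute names stored on each object are assumptions made for illustration; the source leaves those bodies as `...`.

```python
class Container(object):
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []

    def add(self, cls, *args, **kwargs):
        # Construct the child with this container as its parent, then track it.
        child = cls(self, *args, **kwargs)
        self.children.append(child)
        return child


class Realm(Container):
    def __init__(self, username, password):
        Container.__init__(self)  # a realm is the root, so it has no parent
        self.username = username
        self.password = password


class Blog(Container):
    def __init__(self, realm, name):
        Container.__init__(self, realm)
        self.name = name


class Post(Container):
    def __init__(self, blog, title, body):
        Container.__init__(self, blog)
        self.title = title
        self.body = body


realm = Realm("admin", "FDS$#%")
blog = realm.add(Blog, "Kittens!")
post = blog.add(Post, "Cute kitten", "Some HTML blah blah")
```

Each child already knows its parent when its own constructor body runs, which is the property asked for in the comments, and the constructor parameters are documented in only one place.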
What about something like this: just subclass it in the Realm constructor.
class Realm(object):
    def __init__(self, username, password):
        ...
        parent = self
        original_constructor = blog.Blog.__init__
        class ScopedBlog(blog.Blog):
            def __init__(self, *args):
                self.parent = parent
                original_constructor(self, *args)
        self.Blog = ScopedBlog
Seems to work. And it could be generalized with base classes or metaclasses.
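One possible generalization that avoids creating a new subclass per instance is a small descriptor that pre-binds the parent with `functools.partial`. This is a sketch of a different technique from the one above; the name `child_factory` is hypothetical, and the child constructors are assumed to take the parent as their first argument, as in the question.

```python
import functools


class child_factory(object):
    """Descriptor: on an instance, returns the child class with that
    instance already bound as the first constructor argument."""

    def __init__(self, child_cls):
        self.child_cls = child_cls

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself: hand back the plain child class.
            return self.child_cls
        return functools.partial(self.child_cls, instance)


class Post(object):
    def __init__(self, blog, title, body):
        self.blog = blog
        self.title = title
        self.body = body


class Blog(object):
    Post = child_factory(Post)

    def __init__(self, realm, name):
        self.realm = realm
        self.name = name


class Realm(object):
    Blog = child_factory(Blog)

    def __init__(self, username, password):
        self.username = username
        self.password = password


realm = Realm("admin", "FDS$#%")
blog = realm.Blog("Kittens!")
post = blog.Post("Cute kitten", "Some HTML blah blah")
```

This gives the `realm.Blog("Kittens!")` call style from the question. One limitation is that `realm.Blog` is a `functools.partial` object rather than a class, so it cannot be used with `isinstance` or subclassed.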
