Can I just partially override __setattr__? - python

I'm imitating the behavior of the ConfigParser module to write a highly specialized parser that exploits some well-defined structure in the configuration files for a particular application I work with. The files follow the standard INI structure:
[SectionA]
key1=value1
key2=value2
[SectionB]
key3=value3
key4=value4
For my application, the sections are largely irrelevant; there is no overlap between keys from different sections and all the users only remember the key names, never which section they're supposed to go in. As such, I'd like to override __getattr__ and __setattr__ in the MyParser class I'm creating to allow shortcuts like this:
config = MyParser('myfile.cfg')
config.key2 = 'foo'
The __setattr__ method would first try to find a section called key2 and set that to 'foo' if it exists. Assuming there's no such section, it would look inside each section for a key called key2. If the key exists, then it gets set to the new value. If it doesn't exist, the parser would finally raise an AttributeError.
I've built a test implementation of this, but the problem is that I also want a couple straight-up attributes exempt from this behavior. I want config.filename to be a simple string containing the name of the original file and config.content to be the dictionary that holds the dictionaries for each section.
Is there a clean way to set up the filename and content attributes in the constructor such that they bypass my custom getters and setters? Will Python look for attributes in the object's __dict__ before calling the custom __setattr__?

Pass filename and content to the superclass to handle them:
class MyParser(object):
    def __setattr__(self, k, v):
        if k in ['filename', 'content']:
            super(MyParser, self).__setattr__(k, v)
        else:
            # mydict.update(mynewattr) # dict handles other attrs
            pass

I think it might be cleaner to present a dictionary-like interface for the contents of the file and leave attribute access for internal purposes. However, that's just my opinion.
To answer your question, __setattr__() is called for every attribute assignment, before anything is looked up in __dict__, so you can implement it as something like this:
class MyParser(object):
    specials = ("filename", "content")

    def __setattr__(self, attr, value):
        if attr in MyParser.specials:
            self.__dict__[attr] = value
        else:
            # Implement your special behaviour here
            pass
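Putting the pieces together, a minimal sketch of the parser from the question might look like the following. It is only an illustration under a few assumptions: the constructor takes an already-parsed dictionary instead of reading a file (to keep the example self-contained), and only the key lookup across sections is implemented; the section-name check described in the question is left out for brevity.

```python
class MyParser(object):
    # Names stored as plain instance attributes, bypassing the key lookup.
    specials = ("filename", "content")

    def __init__(self, filename, content):
        # Both names are in `specials`, so they land directly in __dict__.
        self.filename = filename
        self.content = content  # assumed shape: {section name: {key: value}}

    def __setattr__(self, attr, value):
        if attr in MyParser.specials:
            self.__dict__[attr] = value
            return
        # Otherwise update the first section that already holds this key.
        for section in self.content.values():
            if attr in section:
                section[attr] = value
                return
        raise AttributeError(attr)

    def __getattr__(self, attr):
        # Only called when normal lookup fails, so `filename` and `content`
        # never reach this method once they have been set.
        for section in self.__dict__.get("content", {}).values():
            if attr in section:
                return section[attr]
        raise AttributeError(attr)


config = MyParser("myfile.cfg", {
    "SectionA": {"key1": "value1", "key2": "value2"},
    "SectionB": {"key3": "value3", "key4": "value4"},
})
config.key2 = "foo"
```

After the last line, config.content["SectionA"]["key2"] holds 'foo', while config.filename is still an ordinary string attribute.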

Related

How to use python metaclass for the following scenario?

I want to create a configuration class with cascading feature. What do I mean by this? let say we have a configuration class like this
class BaseConfig(metaclass=ConfigMeta, ...):
    def getattr():
        return 'default values provided by the metaclass'

class Config(BaseConfig):
    class Embedding(BaseConfig, size=200):
        class WordEmbedding(Embedding):
            size = 300
when I use this in code I will access the configuration as follows,
def function(Config, blah, blah):
    word_embedding_size = Config.Embedding.Word.size
    char_embedding_size = Config.Embedding.Char.size
The last line accesses a property, 'Char', which does not exist in the Embedding class. That should invoke getattr(), which should return 200 in this case. I am not familiar enough with metaclasses to make a good judgement, but I guess I need to define the __new__() of the metaclass.
Does this approach make sense, or is there a better way to do it?
EDIT:
class Config(BaseConfig):
    class Embedding(BaseConfig, size=200):
        class WordEmbedding(Embedding):
            size = 300
    class Log(BaseConfig, level=logging.DEBUG):
        class PREPROCESS(Log):
            level = logging.INFO

# When I use
log = logging.getLogger(level=Config.Log.Model.level) #level should be INFO
This is a bit confusing. I am not sure this would be the best notation for declaring configurations with default parameters - it seems verbose. But yes, given the flexibility of metaclasses and magic methods in Python, it is possible for something like this to hold all the flexibility you need.
Just for the sake of it, I'd like to say that using nested classes as namespaces, like you are doing, is probably the only useful thing for them. It is common to see people who misunderstand Python OO trying to make use of nested classes.
So, for your problem, you need the final class to have a __getattr__ method that can fetch default values for attributes. These attributes in turn are declared as keywords to nested classes, which can also have the same metaclass. Otherwise, the hierarchy of nested classes just works for fetching nested attributes using Python's dot notation.
Moreover, for each class in a nested set, one can pass in keyword parameters to be used as defaults if the next level of nested classes is not defined. In the given example, trying to access Config.Embedding.Char.size with a non-existing Char should return the default "size". Note that a __getattr__ in "Embedding" can return a fake "Char" object, but that object is the one that has to yield a size attribute. So our __getattr__ has to yield an object that itself has a proper __getattr__.
However, I will suggest a change to your requirements: instead of passing in the default values as keyword parameters, have a reserved name, like _default, inside which you can put your default attributes. That way, you can provide deeply nested default subtrees instead of just scalar values, and the implementation can possibly be simpler.
Actually, a lot simpler. By using keywords to the class as you propose, you'd need a metaclass to store those default parameters in a data structure (that would be possible in either __new__ or __init__). But by just using nested classes all the way, with a reserved name, a custom __getattr__ on the metaclass will work. It is called for class attributes that do not exist on the configuration classes themselves, and all one has to do when a requested attribute does not exist is retrieve the _default class I mentioned.
Thus, you can work with something like:
class ConfigMeta(type):
    def __getattr__(cls, attr):
        return cls._default

class Base(metaclass=ConfigMeta):
    pass

class Config(Base):
    class Embed(Base):
        class _default(Base):
            size = 200
        class Word(Base):
            size = 300

assert Config.Embed.Char.size == 200
assert Config.Embed.Word.size == 300
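One caveat with the three-line metaclass: when a class has no _default at all, the expression cls._default re-enters __getattr__ and ends in a RecursionError instead of an AttributeError. A possible guarded variant, offered only as a sketch, looks for _default in the class namespaces directly and raises AttributeError when none is found:

```python
class ConfigMeta(type):
    def __getattr__(cls, attr):
        # Search the namespaces along the MRO directly, so that a missing
        # _default does not re-enter __getattr__.
        for klass in cls.__mro__:
            if "_default" in vars(klass):
                return vars(klass)["_default"]
        raise AttributeError(attr)


class Base(metaclass=ConfigMeta):
    pass


class Config(Base):
    class Embed(Base):
        class _default(Base):
            size = 200

        class Word(Base):
            size = 300


assert Config.Embed.Char.size == 200   # falls back to Embed._default
assert Config.Embed.Word.size == 300   # defined explicitly on Word
```

With this version, Config.Embed.Word.missing raises AttributeError, because neither Word nor its bases define a _default.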
Btw - just last year I was working on a project to have configurations like this, with default values, but using a dictionary syntax - that is why I mentioned I am not sure the nested class would be a nice design. But since all the functionality can be provided by a metaclass with 3 LoC I guess this beats anything in the way.
Also, that is why I think being able to nest whole default subtrees can be useful for what you want - I've been there.
You can use a metaclass to set the attribute:
class ConfigMeta(type):
    def __new__(mt, clsn, bases, attrs):
        try:
            _ = attrs['size']
        except KeyError:
            attrs['size'] = 300
        return super().__new__(mt, clsn, bases, attrs)
Now if the class does not have the size attribute, it would be set to 300 (change this to meet your need).

Appropriately altering __setattr__ method for mapped objects

So, I have a web application where there's a certain database object with attributes I would like to cache in a redis store. Relatively simple to do manually, with something like below:
db_object.update({<attribute>: <value>})
redis.set(db_object.id, <value>)
The issue here is that it's an attribute that is changed in many places throughout the codebase. Doesn't mean this approach won't work, it just means that it makes for code that is very repetitive. I would much rather just have a wrapper for the cache that I can access directly whenever I need to. This means that any time I change the particular attribute I'm interested in I would like to update my redis store, theoretically like so:
def __setattr__(self, name, value):
    self.__dict__[name] = value
    if name == <attribute>:
        redis.set(self.id, value)
which would solve all my problems. The only issue is that, as detailed here I cannot directly modify the __dict__ in mapped objects. How can I achieve the same effect?
Found a nice way to approach this on another question. Can just call the super method instead of directly altering __dict__
Method below has the desired effect:
def __setattr__(self, name, value):
    if name == <attribute>:
        redis.set(self.id, value)
    super(Dataset, self).__setattr__(name, value)
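To see the pattern in isolation, here is a small self-contained sketch. It makes several assumptions: FakeCache is a hypothetical in-memory stand-in for the redis client, Dataset is a plain class rather than a mapped database object, and "status" is an invented name for the cached attribute.

```python
class FakeCache:
    """Hypothetical stand-in for a redis client; only set/get are modelled."""

    def __init__(self):
        self.store = {}

    def set(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)


cache = FakeCache()


class Dataset:
    cached_attribute = "status"  # invented attribute name for the example

    def __init__(self, id, status):
        self.id = id          # set first, so self.id exists for the cache key
        self.status = status  # goes through __setattr__ and is cached

    def __setattr__(self, name, value):
        if name == self.cached_attribute:
            cache.set(self.id, value)
        # Delegate the actual storage to the normal machinery.
        super().__setattr__(name, value)


d = Dataset(1, "new")
d.status = "done"   # updates both the object and the cache
d.other = 5         # stored on the object only
```

Because the storage is delegated to super().__setattr__, nothing touches __dict__ directly, which is the point of the approach above.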

Python subclassing: adding properties

I have several classes where I want to add a single property to each class (its md5 hash value) and calculate that hash value when initializing objects of that class, but otherwise maintain everything else about the class. Is there any more elegant way to do that in python than to create a subclass for all the classes where I want to change the initialization and add the property?
You can add properties and override __init__ dynamically:
def newinit(self, orig):
    orig(self)
    self._md5 = ...  # calculate md5 here

_orig_init = A.__init__
A.__init__ = lambda self: newinit(self, _orig_init)
A.md5 = property(lambda self: self._md5)
However, this can get quite confusing, even once you use more descriptive names than I did above. So I don't really recommend it.
Cleaner would probably be to simply subclass, possibly using a mixin class if you need to do this for multiple classes. You could also consider creating the subclasses dynamically using type() to cut down on the boilerplate further, but clarity of code would be my first concern.
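As a rough sketch of the mixin idea combined with type(), consider the following. The names Md5Mixin, Document and HashedDocument are made up for the example, and hashing the repr of the instance attributes is just an assumption; the question does not say what the hash should cover.

```python
import hashlib


class Md5Mixin:
    """Computes an md5 once, right after the wrapped __init__ has run."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Assumption: the hash covers the instance attributes set so far.
        payload = repr(sorted(vars(self).items())).encode("utf-8")
        self._md5 = hashlib.md5(payload).hexdigest()

    @property
    def md5(self):
        return self._md5


class Document:
    """Example of an existing class that should stay untouched."""

    def __init__(self, text):
        self.text = text


# Create the subclass dynamically instead of writing a class statement.
HashedDocument = type("HashedDocument", (Md5Mixin, Document), {})

doc = HashedDocument("hello")
print(doc.md5)
```

The mixin must come first in the bases so that its __init__ runs and then delegates to the original class through super().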

what is the dict class used for

Can someone explain what the dict class is used for? This snippet is from Dive Into Python
class FileInfo(dict):
    "store file metadata"
    def __init__(self, filename=None):
        self["name"] = filename
I understand the assignment of key=value pairs with self['name'] = filename but what does inheriting the dict class have to do with this? Please help me understand.
If you're not familiar with the concept of inheritance in object-oriented programming, have a look at least at this wiki article (though it's only an introduction, and maybe not the best one).
In Python we use this syntax to define class A as a subclass of class B:
class A(B):
    pass # empty class
In your example, since the FileInfo class inherits from the standard dict type, you can use instances of that class as dictionaries (they have all the methods that a regular dict object has). Among other things, that allows you to assign values by key like this (dict provides the method that handles this operation):
self['name'] = filename
Is that the explanation you want or you don't understand something else?
It's for creating your own customized Dictionary type.
You can override __init__, __getitem__ and __setitem__ methods for your own special purposes to extend dictionary's usage.
Read the next section in the Dive into Python text: we use such inheritance to be able to work with file information just the way we do using a normal dictionary.
# From the example on the next section
>>> f = fileinfo.FileInfo("/music/_singles/kairo.mp3")
>>> f["name"]
'/music/_singles/kairo.mp3'
The fileinfo class is designed in a way that it receives a file name in its constructor, then lets the user get file information just the way you get the values from an ordinary dictionary.
Another use of such a class is to create dictionaries that control their data. For example, you may want a dictionary that does something special when its 'sensor' key is assigned to or read from. You could define a special __setitem__ method that is sensitive to the key name:
def __setitem__(self, key, item):
    # assumes a UserDict-style class, where self.data holds the items
    self.data[key] = item
    if key == "sensor":
        print("Sensor activated!")
Or, for example, you may want to return a special value each time the user reads the 'temperature' key. For this you override the __getitem__ method:
def __getitem__(self, key):
    if key == "temperature":
        return CurrentWeatherTemperature()
    else:
        return self.data[key]
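When subclassing dict directly, as FileInfo in the question does, there is no self.data; the override delegates to the parent class instead. A minimal sketch, with SensorDict and its events list being names invented for the example:

```python
class SensorDict(dict):
    """A dict that records every value assigned to its 'sensor' key."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.events = []

    def __setitem__(self, key, item):
        # Store the item using the regular dict behaviour first.
        super().__setitem__(key, item)
        if key == "sensor":
            self.events.append(item)


d = SensorDict()
d["sensor"] = 1
d["other"] = 2
print(d.events)
```

Note that the built-in dict constructor and dict.update() do not call an overridden __setitem__, so only explicit d[key] = value assignments are recorded here.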
When a class in Python inherits from another class, any methods defined on the parent class are also available on the newly created class.
So when FileInfo inherits from dict, all of the functionality of the dict class is available to FileInfo, in addition to anything that FileInfo declares or, more importantly, overrides by redefining a method or attribute.
Since the dict object in Python stores key/value pairs, FileInfo has access to that same mechanism.

Best approach with dynamic classes using Python globals()

I'm working on a web application that will return a variable set of modules depending on user input. Each module is a Python class with a constructor that accepts a single parameter and has an '.html' property that contains the output.
Pulling the class dynamically from the global namespace works:
result = globals()[classname](param).html
And it's certainly more succinct than:
if classname == 'Foo':
    result = Foo(param).html
elif classname == 'Bar':
    ...
What is considered the best way to write this, stylistically? Are there risks or reasons not to use the global namespace?
A flaw with this approach is that it may give the user the ability to do more than you want them to. They can call any single-parameter function in that namespace just by providing the name. You can help guard against this with a few checks (e.g. issubclass(theClass, SomeBaseClass)), but it's probably better to avoid this approach. Another disadvantage is that it constrains your class placement. If you end up with dozens of such classes and decide to group them into modules, your lookup code will stop working.
You have several alternative options:
Create an explicit mapping:
class_lookup = {'Class1' : Class1, ... }
...
result = class_lookup[className](param).html
though this has the disadvantage that you have to re-list all the classes.
Nest the classes in an enclosing scope. Eg. define them within their own module, or within an outer class:
class Namespace(object):
    class Class1(object):
        ...
    class Class2(object):
        ...
...
result = getattr(Namespace, className)(param).html
You do inadvertently expose a couple of additional class attributes here though (__bases__, __getattribute__ etc) - probably not exploitable, but not perfect.
Construct a lookup dict from the subclass tree. Make all your classes inherit from a single base class. When all classes have been created, examine all subclasses of the base class and populate a dict from them. This has the advantage that you can define your classes anywhere (e.g. in separate modules), and so long as you create the registry after all are created, you will find them.
def register_subclasses(base):
    d = {}
    for cls in base.__subclasses__():
        d[cls.__name__] = cls
        d.update(register_subclasses(cls))
    return d

class_lookup = register_subclasses(MyBaseClass)
A more advanced variation on the above is to use self-registering classes - create a metaclass that automatically registers any created classes in a dict. This is probably overkill for this case - it's useful in some "user-plugins" scenarios though.
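For illustration, a self-registering setup could be sketched as follows. RegistryMeta, Module and the html strings are hypothetical names and values chosen for the example; the only idea taken from the text is that the metaclass records every class as it is created.

```python
class RegistryMeta(type):
    registry = {}  # maps class name -> class

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if bases:  # skip the root base class, which has no explicit bases
            RegistryMeta.registry[name] = cls


class Module(metaclass=RegistryMeta):
    def __init__(self, param):
        self.param = param


class Foo(Module):
    @property
    def html(self):
        return "<p>Foo: %s</p>" % self.param


class Bar(Module):
    @property
    def html(self):
        return "<p>Bar: %s</p>" % self.param


result = RegistryMeta.registry["Foo"]("x").html
```

Only subclasses of Module end up in the registry, wherever they are defined, as long as their module has been imported before the lookup happens.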
First of all, it sounds like you may be reinventing the wheel a little bit... most Python web frameworks (CherryPy/TurboGears is what I know) already include a way to dispatch requests to specific classes based on the contents of the URL, or the user input.
There is nothing wrong with the way that you do it, really, but in my experience it tends to indicate some kind of "missing abstraction" in your program. You're basically relying on the Python interpreter to store a list of the objects you might need, rather than storing it yourself.
So, as a first step, you might want to just make a dictionary of all the classes that you might want to call:
dispatch = {'Foo': Foo, 'Bar': Bar, 'Bizbaz': Bizbaz}
Initially, this won't make much of a difference. But as your web app grows, you may find several advantages: (a) you won't run into namespace clashes, (b) using globals() you may have security issues where an attacker can, in essence, access any global symbol in your program if they can find a way to inject an arbitrary classname into your program, (c) if you ever want to have classname be something other than the actual exact classname, using your own dictionary will be more flexible, (d) you can replace the dispatch dictionary with a more-flexible user-defined class that does database access or something like that if you find the need.
The security issues are particularly salient for a web app. Doing globals()[variable] where variable is input from a web form is just asking for trouble.
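A short sketch of the explicit dispatch dictionary with an unknown-name check is shown below. The classes, the render helper and the html strings are placeholders invented for the example.

```python
class Foo:
    def __init__(self, param):
        self.html = "<p>Foo %s</p>" % param


class Bar:
    def __init__(self, param):
        self.html = "<p>Bar %s</p>" % param


# Only the names listed here can ever be instantiated from user input.
dispatch = {"Foo": Foo, "Bar": Bar}


def render(classname, param):
    try:
        cls = dispatch[classname]
    except KeyError:
        raise ValueError("unknown module: %r" % classname) from None
    return cls(param).html


print(render("Foo", 1))
```

A name such as "dict" or "render" is rejected here with a ValueError, whereas globals()[classname] would happily return whatever object carries that name.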
Another way to build the map between class names and classes:
When defining classes, add an attribute to any class that you want to put in the lookup table, e.g.:
class Foo:
    lookup = True
    def __init__(self, params):
        # and so on
        pass
Once this is done, building the lookup map is:
class_lookup = {name: obj for name, obj in list(globals().items()) if hasattr(obj, "lookup")}
