How to create a dict-like class in Python 2.7?

It looks like there are multiple ways to do this, but I couldn't find the current best method:
Subclass UserDict
Subclass DictMixin
Subclass dict
Subclass MutableMapping
What is the correct way to do it? I want to abstract the actual data, which lives in a database.

Since your dict-like class isn't in fact a dictionary, I'd go with MutableMapping. Subclassing dict implies dict-like characteristics, including performance characteristics, which won't be true if you're actually hitting a database.

If you are doing your own thing (e.g. inventing your own wheel), you might as well write the class from scratch (i.e. subclass object), providing the correct special methods (e.g. __getitem__) and the other functions described in the object data model, so that it quacks like a dict. Internally, you might even own a number of dicts (has-a) to help your implementation.
This way, you don't have to shoehorn your design to fit some existing implementation, and you aren't paying for things you aren't necessarily using. This recommendation is made in part because your DB-backed class will already be considerably more complex than a standard dict once it accounts for performance, caching, consistency, optimal querying, and so on.
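A minimal sketch of that from-scratch approach, assuming the DB access can be reduced to two callables; DBBackedDict, fetch, and store are made-up names standing in for real database queries:

```python
class DBBackedDict(object):
    """Duck-typed dict; fetch/store are stand-ins for DB calls."""
    def __init__(self, fetch, store):
        self._fetch = fetch  # hypothetical callable: key -> value, e.g. a SELECT
        self._store = store  # hypothetical callable: (key, value) -> None, e.g. an UPSERT

    def __getitem__(self, key):
        return self._fetch(key)

    def __setitem__(self, key, value):
        self._store(key, value)

    def get(self, key, default=None):
        # dict-like convenience built on top of __getitem__
        try:
            return self[key]
        except KeyError:
            return default
```

For testing, a plain dict can play the role of the backing store: DBBackedDict(backing.__getitem__, backing.__setitem__).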

Related

Python NDB: what's the best way to have a Set instead of a List in property?

In NDB you have repeated properties; they behave just like a native Python list, but I want them to behave like native sets.
I need a set of keys that is without duplicates.
In Python you can remove duplicates with the_list = list(set(the_list)),
but how would you implement this so it is automatic and I don't have to think about it?
Three ways come to mind:
enforce the list (repeated property) is unique with a "setter" method that only inserts unique values;
likewise, enforce the list is unique with a _pre_put_hook() method;
use the key on each entity as your list, ndb will make sure they are unique.
Another option would be to subclass ndb.Property. Quite a few examples here:
https://cloud.google.com/appengine/docs/python/ndb/subclassprop
I believe the correct strategy would be to build a custom SetProperty that subclasses ListProperty to enforce your requirements.
Have a read up on subclassing properties: https://cloud.google.com/appengine/docs/python/ndb/subclassprop
This, I believe, is the correct way to implement this type of property, rather than _pre_put hooks; those generally run too late to perform appropriate validation and give feedback.
You could write custom setters; however, a setter can't have the same name as the property, so this will look odd.
The other alternative would be to use a validator, which is allowed to coerce the value. See https://cloud.google.com/appengine/docs/python/ndb/properties#options
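Whichever NDB hook you pick, note that list(set(the_list)) scrambles the original order. The dedupe itself can be an order-preserving helper in plain Python (no NDB dependency), which a setter or _pre_put_hook could then call:

```python
def unique_preserving_order(items):
    """Drop duplicates while keeping first-seen order.

    list(set(items)) would dedupe but lose the ordering that
    a repeated (list) property is supposed to retain.
    """
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```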

Python: Data Object or class

I enjoy all the python libraries for scraping websites and I am experimenting with BeautifulSoup and IMDB just for fun.
As I come from Java, I have some Java practices incorporated into my programming style. I am trying to get the info for a certain movie; I can either create a Movie class or just use a dictionary with keys for the attributes.
My question is, should I just use dictionaries when a class will only contain data and perhaps almost no behaviour? In other languages creating a type will help you enforce certain restrictions and because of type checks the IDE will help you program, this is not always the case in python, so what should I do?
Should I resort to creating a class only when there's both behaviour and data? Or create a Movie class even though it'll probably be just a data container?
This all depends on your model, in this particular case either one is fine but I'm wondering about what's a good practice.
It's fine to use a class just to store attributes. You may also wish to use a namedtuple instead.
The main differences between a dict and a class are the way you access attributes ([] vs .) and inheritance.
instance.__dict__ is just a dict, after all.
You can even just use a single class for all of those types of objects if you wish
class Bunch:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

movie = Bunch(title='foo', director='bar', ...)
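For comparison, here is what the namedtuple alternative mentioned above might look like; Movie and its fields are invented names for this example:

```python
from collections import namedtuple

# An immutable, lightweight record type: attribute access like a
# class, tuple behaviour (unpacking, comparison) for free.
Movie = namedtuple('Movie', ['title', 'director', 'year'])

movie = Movie(title='Metropolis', director='Fritz Lang', year=1927)
```

Unlike Bunch, a namedtuple instance is immutable, which suits a pure data container scraped once from a page.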
In your case you could use a class that inherits from dict (e.g. class MyClass(dict)) so that you can add custom behaviour to your dict-like class, or use UserDict.
It depends on what you really mean by "perhaps almost no behaviour": if dict already provides what you need, stay with it. Otherwise, consider subclassing dict and adding your specific behaviour, which has been possible since Python 2.2. Using UserDict is an older approach to the same problem.
You could also use a plain dictionary and implement the behaviour externally via some function. I use this approach for prototyping, and eventually refactor the code later to make it Object Oriented (generally more scalable).
You can see what a dictionary offers typing this at the interpreter:
>>> help({})
or referring to the docs.
I would stick to KISS (keep it simple, stupid). If you only want to store values, you are better off with a dictionary, because you can add values dynamically at runtime. (Note that in Python you can also add new fields to class instances at runtime.)
So classes are most useful when they provide both state and behaviour.

ZODB equivalent of ordered dict (odict?)

I am doing some PloneFormGen work. Currently PloneFormGen stores entered form entries internally as tuples without associated column information. If new columns (form fields) are added then the existing data becomes invalid.
This could easily be avoided by storing the data in ordered dictionaries, which retain both the entered column order and the column ids.
Does ZODB have data type equivalent of ordered dictionary? If possible even with matching API (Python dict-like item manipulation and access)?
You can use any ordered dict implementation out-of-the-box in the ZODB, but you'll have to mark the parent object (the object that refers to the ordered dict instance) as changed, either by re-assigning the attribute on the parent every time you change the dict or by setting _p_changed to True. This will, of course, result in a new persistent record for the parent together with the ordered dict instance.
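A sketch of that pattern; FormStorage is a made-up name, and the base class is object here so the snippet runs standalone, where real code would subclass persistent.Persistent (which provides the _p_changed flag itself):

```python
from collections import OrderedDict

class FormStorage(object):  # real code: class FormStorage(persistent.Persistent)
    def __init__(self):
        self.entries = OrderedDict()
        self._p_changed = False  # persistent.Persistent provides this attribute

    def add(self, key, value):
        self.entries[key] = value
        # OrderedDict is not persistence-aware, so flag the parent as
        # changed; the ZODB then writes a fresh record containing it.
        self._p_changed = True
```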
If you want the ordered dict instance itself to detect changes automatically, you'll probably have to build your own class, as I am not aware of any current implementations. That said, it is probably exceedingly easy to do, especially if you use the ZODB PersistentMapping class as a template for building an ordered version of the same. You can't use that class as a mixin, unfortunately, as it refers directly to UserDict methods instead of using super() calls (persistent.Persistent is not a new-style class).
Python 2.7 has an ordered dict class in the standard library. Presumably you are still using Python 2.6 in Plone, so you'd have to backport it. Once you've got it backported, however, a PersistentOrderedDict implementation should be a straight copy of the PersistentMapping source code, with all instances of UserDict.IterableUserDict replaced with your OrderedDict port.
"you'll probably have to build your own class as I am not aware of any current implementations."
You can find implementations of ZODB persisting ordered dicts based on PersistentDict and OOBtree here:
https://github.com/bluedynamics/node.ext.zodb/blob/master/src/node/ext/zodb/utils.py
These implementations are based on the odict package:
http://pypi.python.org/pypi/odict
Since it's not possible to persist dict-inheriting objects to the ZODB (persistent.Persistent and dict have incompatible low-level implementations), odict provides a way to hook in different base classes easily (using a _dict_impl function internally all over the place). That is the reason the odict package is still used in favour of even Python 2.7's ordered dict implementation or other third-party ordereddict implementations.
Both werkzeug and paste provide ordereddicts. You could no doubt pickle them for your purposes.
If a Python object can be pickled it can be persisted within the ZODB.
Take a look at PersistentMapping; from what I understand it should be sufficient to create a mix-in class like this:
class PersistentOrderedDict(PersistentMapping, OrderedDict):
    pass

How to wrap a python dict?

I want to implement a class that will wrap -- not subclass -- the python dict object, so that when a change is detected in a backing store I can re-create the delegated dict object. I intend to check for changes in the backing store each time the dict is accessed for a read.
Supposing I was to create an object to act like this; what methods would I need to implement?
You can subclass the ABC (abstract base class) collections.Mapping (or collections.MutableMapping if you also want to allow code using your instances to alter the simulated/wrapped dictionary, e.g. by indexed assignment, pop, etc).
If you do so, then, as the docs I pointed to imply somewhat indirectly, the methods you need to implement are
__len__
__iter__
__getitem__
(for a Mapping) -- you should also implement
__contains__
because by delegating to the dict you're wrapping, it can be done much faster than the iterating approach the ABC would otherwise have to apply.
If you need to supply a MutableMapping then you also need to implement 2 more methods:
__setitem__
__delitem__
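Putting those six methods together, a wrapper might look like the sketch below; DictWrapper is a made-up name, and _refresh is a hypothetical hook standing in for the backing-store check (the import fallback covers both Python 2 and 3):

```python
try:
    from collections.abc import MutableMapping  # Python 3
except ImportError:
    from collections import MutableMapping      # Python 2

class DictWrapper(MutableMapping):
    """Wraps (does not subclass) a real dict."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def _refresh(self):
        # hypothetical hook: re-create self._data here if the
        # backing store has changed since the last read
        pass

    def __getitem__(self, key):
        self._refresh()
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        self._refresh()
        return iter(self._data)

    def __len__(self):
        self._refresh()
        return len(self._data)

    def __contains__(self, key):
        # delegate for speed instead of the ABC's linear scan
        self._refresh()
        return key in self._data
```

The ABC then fills in get, pop, update, keys, etc. in terms of these.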
In addition to what's already been suggested, you might want to take a look at UserDict.
For an example of a dict like object, you can read through django's session implementation, specifically the SessionBase class.
I think it depends on which methods you use. Probably __getitem__, __setitem__, __iter__ and __len__ as most things can be implemented in terms of those. But you'll want to look at some use cases, particularly with __iter__. Something like this:
for key in wrapped_dictionary:
    do_something(wrapped_dictionary[key])
...is going to be slow if you hit the data source on each iteration, not to mention that it might not even work if the data source is changing out from under you. So you may want to throw some sort of exception there and implement iteritems as an alternative, loading all the key-value pairs in one batch before you loop over them.
The Python docs have item listings where you can look for methods and use-cases.

Which is more pythonic, factory as a function in a module, or as a method on the class it creates?

I have some Python code that creates a Calendar object based on parsed VEvent objects from an iCalendar file.
The calendar object just has a method that adds events as they get parsed.
Now I want to create a factory function that creates a calendar from a file object, path, or URL.
I've been using the iCalendar python module, which implements a factory function as a class method directly on the Class that it returns an instance of:
cal = icalendar.Calendar.from_string(data)
From what little I know about Java, this is a common pattern in Java code, though I seem to find more references to a factory method being on a different class than the class you actually want to instantiate instances from.
The question is: is this also considered Pythonic? Or is it more Pythonic to create a module-level function as the factory?
[Note. Be very cautious about separating "Calendar" a collection of events, and "Event" - a single event on a calendar. In your question, it seems like there could be some confusion.]
There are many variations on the Factory design pattern.
A stand-alone convenience function (e.g., calendarMaker(data))
A separate class (e.g., CalendarParser) which builds your target class (Calendar).
A class-level method (e.g. Calendar.from_string) method.
These have different purposes. All are Pythonic, the questions are "what do you mean?" and "what's likely to change?" Meaning is everything; change is important.
Convenience functions are Pythonic. Languages like Java can't have free-floating functions; you must wrap a lonely function in a class. Python allows you to have a lonely function without the overhead of a class. A function is relevant when your constructor has no state changes or alternate strategies or any memory of previous actions.
Sometimes folks will define a class and then provide a convenience function that makes an instance of the class, sets the usual parameters for state and strategy and any other configuration, and then calls the single relevant method of the class. This gives you both the statefulness of class plus the flexibility of a stand-alone function.
The class-level method pattern is used, but it has limitations. One, it's forced to rely on class-level variables. Since these can be confusing, a complex constructor as a static method runs into problems when you need to add features (like statefulness or alternative strategies.) Be sure you're never going to expand the static method.
Two, it's more-or-less irrelevant to the rest of the class methods and attributes. This kind of from_string is just one of many alternative encodings for your Calendar objects. You might have a from_xml, from_JSON, from_YAML and on and on. None of this has the least relevance to what a Calendar IS or what it DOES. These methods are all about how a Calendar is encoded for transmission.
What you'll see in the mature Python libraries is that factories are separate from the things they create. Encoding (as strings, XML, JSON, YAML) is subject to a great deal of more-or-less random change. The essential thing, however, rarely changes.
Separate the two concerns. Keep encoding and representation as far away from state and behavior as you can.
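To make the trade-off concrete, here is a sketch of both styles side by side; Calendar, its parsing rule (one event per non-blank line), and both factory names are invented for illustration:

```python
class Calendar(object):
    def __init__(self):
        self.events = []

    def add_event(self, event):
        self.events.append(event)

    @classmethod
    def from_string(cls, data):
        """Class-level factory: via cls() it also works on subclasses."""
        cal = cls()
        for line in data.splitlines():
            if line.strip():
                cal.add_event(line.strip())
        return cal

def make_calendar(data):
    """Module-level factory: keeps encoding concerns out of the class,
    as the answer above recommends."""
    cal = Calendar()
    for line in data.splitlines():
        if line.strip():
            cal.add_event(line.strip())
    return cal
```

Both produce the same Calendar; the difference is where the string-decoding knowledge lives.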
It's Pythonic not to obsess over esoteric differences in some pattern you read somewhere and now want to use everywhere, like the factory pattern.
Most of the time, where you would think of a @staticmethod as the solution, it's probably better to use a module function, except when you put multiple classes in one module and each has a different implementation of the same interface; then a @staticmethod is better.
Ultimately, whether you create your instances with a @staticmethod or a module function makes little difference.
I'd probably use the initializer ( __init__ ) of a class because one of the more accepted "patterns" in python is that the factory for a class is the class initialization.
IMHO a module-level method is a cleaner solution. It hides behind the Python module system that gives it a unique namespace prefix, something the "factory pattern" is commonly used for.
The factory pattern has its own strengths and weaknesses. However, choosing one way to create instances usually has little pragmatic effect on your code.
A staticmethod rarely has value, but a classmethod may be useful. It depends on what you want the class and the factory function to actually do.
A factory function in a module would always make an instance of the 'right' type (where 'right' in your case is always the Calendar class, but you might also make it dependent on the contents of what it is creating the instance from.)
Use a classmethod if you wish to make it dependent not on the data, but on the class you call it on. A classmethod is like a staticmethod in that you can call it on the class, without an instance, but it receives the class it was called on as its first argument. This allows you to actually create an instance of that class, which may be a subclass of the original class. An example of a classmethod is dict.fromkeys(), which creates a dict from a list of keys and a single value (defaulting to None). Because it's a classmethod, when you subclass dict you get the fromkeys method entirely for free. Here's an example of how one could write dict.fromkeys() oneself:
class dict_with_fromkeys(dict):
    @classmethod
    def fromkeys(cls, keys, value=None):
        self = cls()
        for key in keys:
            self[key] = value
        return self
