Should I use a class or dictionary? - python

I have a class that contains only fields and no methods, like this:
import wsgiref.util

class Request(object):
    def __init__(self, environ):
        self.environ = environ
        self.request_method = environ.get('REQUEST_METHOD', None)
        self.url_scheme = environ.get('wsgi.url_scheme', None)
        self.request_uri = wsgiref.util.request_uri(environ)
        self.path = environ.get('PATH_INFO', None)
        # ...
This could easily be translated to a dict. The class is more flexible for future additions and could be fast with __slots__. So would there be a benefit of using a dict instead? Would a dict be faster than a class? And faster than a class with slots?

Use a dictionary unless you need the extra mechanism of a class. You could also use a namedtuple for a hybrid approach:
>>> from collections import namedtuple
>>> Request = namedtuple("Request", "environ request_method url_scheme")
>>> Request
<class '__main__.Request'>
>>> request = Request(environ="foo", request_method="GET", url_scheme="http")
>>> request.environ
'foo'
Performance differences here will be minimal, although I would be surprised if the dictionary wasn't faster.

Why would you make this a dictionary? What's the advantage? What happens if you later want to add some code? Where would your __init__ code go?
Classes are for bundling related data (and usually code).
Dictionaries are for storing key-value relationships, where usually the keys are all of the same type, and all the values are also of one type. Occasionally they can be useful for bundling data when the key/attribute names are not all known up front, but often this is a sign that something's wrong with your design.
Keep this a class.
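A tiny contrast of the two guidelines (both snippets are mine, not from the question):
# dict: homogeneous key-value data, keys not all known up front
word_counts = {'the': 12, 'cat': 3}
word_counts['dog'] = 1

# class: a fixed, known set of heterogeneous fields (and usually code)
class Request(object):
    def __init__(self, method, path):
        self.method = method
        self.path = path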

A class in Python is a dict underneath. You do get some overhead with the class behaviour, but you won't be able to notice it without a profiler. In this case, I believe you benefit from the class because:
All your logic lives in a single function
It is easy to update and stays encapsulated
If you change anything later, you can easily keep the interface the same (see the sketch below)
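A minimal sketch of that last point, assuming a hypothetical later refactoring of the question's Request class:
class Request(object):
    def __init__(self, environ):
        self._environ = environ  # internal storage changed from precomputed attributes...

    @property
    def request_method(self):
        # ...but callers still read request.request_method exactly as before
        return self._environ.get('REQUEST_METHOD', None)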

I think the usage of each one is way too subjective for me to get in on that, so I'll just stick to numbers.
I compared the time it takes to create and to change a variable in a dict, a new-style class, and a new-style class with __slots__.
Here's the code I used to test it (it's a bit messy, but it does the job):
import timeit

class Foo(object):
    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

def create_dict():
    foo_dict = {}
    foo_dict['foo1'] = 'test'
    foo_dict['foo2'] = 'test'
    foo_dict['foo3'] = 'test'
    return foo_dict

class Bar(object):
    __slots__ = ['foo1', 'foo2', 'foo3']
    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

tmit = timeit.timeit

print 'Creating...\n'
print 'Dict: ' + str(tmit('create_dict()', 'from __main__ import create_dict'))
print 'Class: ' + str(tmit('Foo()', 'from __main__ import Foo'))
print 'Class with slots: ' + str(tmit('Bar()', 'from __main__ import Bar'))

print '\nChanging a variable...\n'
print 'Dict: ' + str(tmit('create_dict()[\'foo3\'] = "Changed"', 'from __main__ import create_dict') - tmit('create_dict()', 'from __main__ import create_dict'))
print 'Class: ' + str(tmit('Foo().foo3 = "Changed"', 'from __main__ import Foo') - tmit('Foo()', 'from __main__ import Foo'))
print 'Class with slots: ' + str(tmit('Bar().foo3 = "Changed"', 'from __main__ import Bar') - tmit('Bar()', 'from __main__ import Bar'))
And here is the output...
Creating...

Dict: 0.817466186345
Class: 1.60829183597
Class with slots: 1.28776730003

Changing a variable...

Dict: 0.0735140918748
Class: 0.111714198313
Class with slots: 0.10618612142
So, if you're just storing variables and you need speed, and it won't require you to do many calculations, I recommend using a dict (you could always just add a function that looks like a method). But if you really need classes, remember: always use __slots__.
Note:
I tested the 'Class' entries with both new-style and old-style classes. It turns out that old-style classes are faster to create but slower to modify (not by much, but it's significant if you're creating lots of instances in a tight loop; tip: you're doing it wrong).
Also, the times for creating and changing variables may differ on your computer, since mine is old and slow. Make sure you test it yourself to see the 'real' results.
Edit:
I later tested the namedtuple: I can't modify it, but creating the 10000 samples (or something like that) took 1.4 seconds, so the dictionary is indeed the fastest.
If I change the dict function to build the dict from a literal with the keys and values, and to return it directly rather than through a variable, it gives me 0.65 instead of 0.8 seconds, as sketched below.
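A hedged sketch of that literal-building variant (the function name is mine):
def create_dict_literal():
    # one dict display instead of key-by-key assignment
    return {'foo1': 'test', 'foo2': 'test', 'foo3': 'test'}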
I also tested a class derived from dict:
class Foo(dict):
    pass
Creating it is like a class with slots, and changing a variable is the slowest (0.17 seconds), so do not use these classes. Go for a dict (speed) or for the class derived from object ('syntax candy').

I agree with @adw. I would never represent an "object" (in an OO sense) with a dictionary. Dictionaries aggregate name/value pairs. Classes represent objects. I've seen code where the objects are represented with dictionaries and it's unclear what the actual shape of the thing is. What happens when certain name/values aren't there? What restricts the client from putting anything at all in, or trying to get anything at all out? The shape of the thing should always be clearly defined.
When using Python it is important to build with discipline as the language allows many ways for the author to shoot him/herself in the foot.

I would recommend a class, as a request involves all sorts of related information. Were one to use a dictionary, I'd expect the data stored to be far more similar in nature. A guideline I tend to follow myself: if I may want to loop over the entire set of key->value pairs and do something, I use a dictionary. Otherwise, the data apparently has far more structure than a basic key->value mapping, meaning a class would likely be a better alternative.
Hence, stick with the class.

If all you want to achieve is syntax candy like obj.bla = 5 instead of obj['bla'] = 5, especially if you have to repeat that a lot, you may want to use some plain container class as in martineau's suggestion. Nevertheless, the code there is quite bloated and unnecessarily slow. You can keep it simple like this:
class AttrDict(dict):
    """ Syntax candy """
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
Another reason to switch to namedtuples or a class with __slots__ could be memory usage: dicts require significantly more memory than list types, so this could be a point to think about.
Anyway, in your specific case there doesn't seem to be any motivation to switch away from your current implementation. You don't seem to maintain millions of these objects, so no list-derived types are required. And since it actually contains some functional logic within __init__, you also shouldn't go with AttrDict.

It may be possible to have your cake and eat it, too. In other words, you can create something that provides the functionality of both a class and a dictionary instance. See the ActiveState "Dictionary with attribute-style access" recipe and its comments on ways of doing that.
If you decide to use a regular class rather than a subclass, I've found the "Simple but handy 'collector of a bunch of named stuff' class" recipe (by Alex Martelli) to be very flexible and useful for the sort of thing it looks like you're doing (i.e. creating a relatively simple aggregator of information). Since it's a class, you can easily extend its functionality further by adding methods.
Lastly, it should be noted that the names of class members must be legal Python identifiers, but dictionary keys need not be, so a dictionary provides greater freedom in that regard because keys can be anything hashable (even something that's not a string).
Update
An object subclass named SimpleNamespace (which, unlike object, does have a __dict__) was added to the types module in Python 3.3, and is yet another alternative.
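A minimal sketch of its use (the field names echo the question and the values are made up):
from types import SimpleNamespace

request = SimpleNamespace(environ={}, request_method='GET')
request.path = '/'   # attributes can be added (and removed) later
print(request)       # namespace(environ={}, path='/', request_method='GET')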

If the data (i.e. the set of fields) is not going to be changed or extended in the future, I would choose a class to represent such data. Why?
It's a little more clean and readable.
It's faster in terms of using it, which is much more important than creating it, which generally happens only once.
Even faster seems to be using the class itself as a container for the fields, rather than an instance of the class.
Extending alexpinho98's example:
import timeit

class Foo(object):
    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

class FooClass:
    foo1 = 'test'
    foo2 = 'test'
    foo3 = 'test'

def create_dict():
    foo_dict = {}
    foo_dict['foo1'] = 'test'
    foo_dict['foo2'] = 'test'
    foo_dict['foo3'] = 'test'
    return foo_dict

class Bar(object):
    __slots__ = ['foo1', 'foo2', 'foo3']
    def __init__(self):
        self.foo1 = 'test'
        self.foo2 = 'test'
        self.foo3 = 'test'

tmit = timeit.timeit

test_dict = create_dict()
def testDict():
    a = test_dict['foo1']
    b = test_dict['foo2']
    c = test_dict['foo3']

test_obj = Foo()
def testObjClass():
    a = test_obj.foo1
    b = test_obj.foo2
    c = test_obj.foo3

def testClass():
    a = FooClass.foo1
    b = FooClass.foo2
    c = FooClass.foo3

print('Creating...\n')
print('Dict: ' + str(tmit('create_dict()', 'from __main__ import create_dict')))
print('Class: ' + str(tmit('Foo()', 'from __main__ import Foo')))
print('Class with slots: ' + str(tmit('Bar()', 'from __main__ import Bar')))

print('=== Testing usage 1 ===')
print('Using dict : ' + str(tmit('testDict()', 'from __main__ import testDict')))
print('Using object: ' + str(tmit('testObjClass()', 'from __main__ import testObjClass')))
print('Using class : ' + str(tmit('testClass()', 'from __main__ import testClass')))
Results are:
Creating...
Dict: 0.185864600000059
Class: 0.30627199999980803
Class with slots: 0.2572166999998444
=== Testing usage 1 ===
Using dict : 0.16507520000050135
Using object: 0.1266871000007086
Using class : 0.06327920000148879

class ClassWithSlotBase:
    __slots__ = ('a', 'b',)
    def __init__(self):
        self.a: str = "test"
        self.b: float = 0.0

def test_type_hint(_b: float) -> None:
    print(_b)

class_tmp = ClassWithSlotBase()
test_type_hint(class_tmp.a)
I recommend a class. If you use a class, you can get type hints as shown, and a type checker will flag that class_tmp.a is a str being passed where a float is expected. Classes also support autocompletion when an instance is an argument of a function.

Related

Overhead of creating classes in Python: Exact same code using class twice as slow as native DS?

I created a Stack class as an exercise in Python, implemented using list functions. For example, Stack.push() is just list.append(), Stack.pop() is list.pop() and Stack.isEmpty() is just list == [].
I was using my Stack class to implement a decimal to binary converter, and what I noticed is that even though the two functions are completely equivalent beyond the wrapping of my Stack class for push(), pop() and isEmpty(), the implementation using the Stack class is twice as slow as the implementation using Python's list.
Is that because there's always an inherent overhead to using classes in Python? And if so, where does the overhead come from technically speaking ("under the hood")? Finally, if the overhead is so significant, isn't it better not to use classes unless you absolutely have to?
def dectobin1(num):
    s = Stack()
    while num > 0:
        s.push(num % 2)
        num = num // 2
    binnum = ''
    while not s.isEmpty():
        binnum = binnum + str(s.pop())
    return binnum

def dectobin2(num):
    l = []
    while num > 0:
        l.append(num % 2)
        num = num // 2
    binnum = ''
    while not l == []:
        binnum = binnum + str(l.pop())
    return binnum
from timeit import Timer

t1 = Timer('dectobin1(255)', 'from __main__ import dectobin1')
print(t1.timeit(number=1000))
0.0211110115051
t2 = Timer('dectobin2(255)', 'from __main__ import dectobin2')
print(t2.timeit(number=1000))
0.0094211101532
First off, a warning: function calls are rarely what limits you in speed. This is often an unnecessary micro-optimisation. Only do it if it is what actually limits your performance. Do some good profiling first and look at whether there might be a better way to optimise.
Make sure you don't sacrifice legibility for this tiny performance tweak!
Classes in Python are a little bit of a hack.
The way it works is that each object has a __dict__ field (a dict) which contains all attributes the object contains. Also each object has a __class__ object which again contains a __dict__ field (again a dict) which contains all class attributes.
So for example have a look at this:
>>> class X():  # I know this is an old-style class declaration, but it causes far less clutter for this demonstration
...     def y(self):
...         pass
...
>>> x = X()
>>> x.__class__.__dict__
{'y': <function y at 0x6ffffe29938>, '__module__': '__main__', '__doc__': None}
If you define a function dynamically (so not in the class declaration but after the object creation) the function does not go to the x.__class__.__dict__ but instead to x.__dict__.
Also there are two dicts that hold all variables accessible from the current function. There is globals() and locals() which include all global and local variables.
So now let's say you have an object x of class X, with functions y and z that were declared in the class declaration, and a second function z, which was defined dynamically on the object (shadowing the class-level z). Let's say object x is defined in global space.
Also, for comparison, there are two plain functions: flocal(), which was defined in local space, and fglobal(), which was defined in global space.
Now I will show what happens if you call each of these functions:
flocal():
    locals()["flocal"]()
fglobal():
    locals()["fglobal"] -> not found
    globals()["fglobal"]()
x.y():
    locals()["x"] -> not found
    globals()["x"].__dict__["y"] -> not found, because y is in class space
    globals()["x"].__class__.__dict__["y"]()
x.z():
    locals()["x"] -> not found
    globals()["x"].__dict__["z"]() -> found in object dict, ignoring z() in class space
So as you can see, class-space methods take a lot more time to look up, and object-space methods are slow as well. The fastest option is a local function.
But you can get around that without sacrificing classes. Let's say x.y() is called quite a lot and needs to be optimised:
class X():
    def y(self):
        pass

x = X()

for i in range(100000):
    x.y()  # slow

y = x.y  # move the method lookup outside of the loop
for i in range(100000):
    y()  # faster
Similar things happen with member variables of objects: they are also slower than local variables. The effect also adds up if you call a function or use a member variable that lives in an object that is itself a member variable of a different object. For example,
a.b.c.d.e.f()
would be a fair bit slower, as each dot needs another dictionary lookup.
An official Python performance guide recommends avoiding dots in performance-critical parts of the code:
https://wiki.python.org/moin/PythonSpeed/PerformanceTips
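A small runnable sketch of that advice, with stand-in objects I made up to form a dotted chain:
class Node(object):
    pass

class Leaf(object):
    @staticmethod
    def f():
        pass

# build a chain so that a.b.c.d.e.f resolves step by step
a = Node(); a.b = Node(); a.b.c = Node(); a.b.c.d = Node(); a.b.c.d.e = Leaf

f = a.b.c.d.e.f   # pay the chain of dict lookups once
for i in range(100000):
    f()           # each call is now a single local-variable lookup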
There is an inherent overhead to using functions (and methods on an instance are just wrappers around functions that pass in self).
A function call requires the current function information (a frame) to be stored on a stack (the Python call stack), and a new frame to be created for the function being called. That all takes time and memory:
>>> from timeit import timeit
>>> def f(): pass
...
>>> timeit(f, number=10**7)
0.8021022859902587
There is also a (smaller) cost of looking up the attribute (methods are attributes too), and creating the method object (each attribute lookup for a method name causes a new method object to be created):
>>> class Foo:
...     bar = None
...     def baz(self): pass
...
>>> timeit('instance.bar', 'from __main__ import Foo; instance = Foo()', number=10**7)
0.238075322995428
>>> timeit('instance.baz', 'from __main__ import Foo; instance = Foo()', number=10**7)
0.3402297169959638
So the sum cost of attribute lookup, method object creation and call stack operations add up to the extra time requirements you observed.
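A hedged follow-up, continuing the same session: binding the method once moves the attribute lookup and method-object creation out of the timed statement, so the remaining cost should be close to the plain function call (I have not reproduced exact numbers):
>>> timeit('baz()', 'from __main__ import Foo; instance = Foo(); baz = instance.baz', number=10**7)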

Serializing namedtuples via PyYAML

I'm looking for some reasonable way to serialize namedtuples in YAML using PyYAML.
A few things I don't want to do:
Rely on a dynamic call to add a constructor/representor/resolver upon instantiation of the namedtuple. These YAML files may be stored and re-loaded later, so I cannot rely on the same runtime environment existing when they are restored.
Register the namedtuples in globals.
Rely on the namedtuples having unique names
I was thinking of something along these lines:
class namedtuple(object):
    def __new__(cls, *args, **kwargs):
        x = collections.namedtuple(*args, **kwargs)

        class New(x):
            def __getstate__(self):
                return {
                    "name": self.__class__.__name__,
                    "_fields": self._fields,
                    "values": self._asdict().values()
                }

        return New

def namedtuple_constructor(loader, node):
    import IPython; IPython.embed()
    value = loader.construct_scalar(node)

import re
pattern = re.compile(r'!!python/object/new:myapp.util\.')

yaml.add_implicit_resolver(u'!!myapp.util.namedtuple', pattern)
yaml.add_constructor(u'!!myapp.util.namedtuple', namedtuple_constructor)
Assuming this was in an application module at the path myapp/util.py
I'm not getting into the constructor, however, when I try to load:
from myapp.util import namedtuple
x = namedtuple('test', ['a', 'b'])
t = x(1,2)
dump = yaml.dump(t)
load = yaml.load(dump)
It will fail to find New in myapp.util.
I tried a variety of other approaches as well, this was just one that I thought might work best.
Disclaimer: even once I get into the proper constructor, I'm aware my spec will need further work regarding what arguments get saved and how they are passed into the resulting object, but the first step for me is to get the YAML representation into my constructor function; then the rest should be easy.
I was able to solve my problem, though in a slightly less than ideal way.
My application now uses its own namedtuple implementation; I copied the collections.namedtuple source, created a base class for all new namedtuple types to inherit from, and modified the template (excerpts below for brevity, simply highlighting what's changed from the namedtuple source).
class namedtupleBase(tuple):
    pass

_class_template = '''\
class {typename}(namedtupleBase):
    '{typename}({arg_list})'
One little change to the namedtuple function itself adds the new class into the namespace:
namespace = dict(_itemgetter=_itemgetter, __name__='namedtuple_%s' % typename,
                 OrderedDict=OrderedDict, _property=property, _tuple=tuple,
                 namedtupleBase=namedtupleBase)
Now registering a multi_representer solves the problem:
def repr_namedtuples(dumper, data):
    return dumper.represent_mapping(u"!namedtupleBase", {
        "__name__": data.__class__.__name__,
        "__dict__": collections.OrderedDict(
            [(k, v) for k, v in data._asdict().items()])
    })

def construct_namedtuples(loader, node):
    value = loader.construct_mapping(node)
    cls_ = namedtuple(value['__name__'], value['__dict__'].keys())
    return cls_(*value['__dict__'].values())

yaml.add_multi_representer(namedtupleBase, repr_namedtuples)
yaml.add_constructor("!namedtupleBase", construct_namedtuples)
Hat tip to Represent instance of different classes with the same base class in pyyaml for the inspiration behind the solution.
Would love an idea that doesn't require re-creating the namedtuple function, but this accomplished my goals.
In reply to "Would love an idea that doesn't require re-creating the namedtuple function": here you go.
TL;DR
Proof of concept using PyYAML 3.12.
import yaml

def named_tuple(self, data):
    if hasattr(data, '_asdict'):
        return self.represent_dict(data._asdict())
    return self.represent_list(data)

yaml.SafeDumper.yaml_multi_representers[tuple] = named_tuple
Note: To be clean you should use one of the add_multi_representer() methods at your disposition and a custom representer/loader, like you did.
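A hedged sketch of that cleaner registration: the same representer as above, registered through the public add_multi_representer() API instead of poking the dumper's class attribute:
import yaml

def named_tuple(dumper, data):
    # namedtuples become mappings; plain tuples keep their list form
    if hasattr(data, '_asdict'):
        return dumper.represent_dict(data._asdict())
    return dumper.represent_list(data)

yaml.add_multi_representer(tuple, named_tuple, Dumper=yaml.SafeDumper)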
This gives you:
>>> import collections
>>> Foo = collections.namedtuple('Foo', 'x y z')
>>> yaml.safe_dump({'foo': Foo(1,2,3), 'bar':(4,5,6)})
'bar: [4, 5, 6]\nfoo: {x: 1, y: 2, z: 3}\n'
>>> print yaml.safe_dump({'foo': Foo(1,2,3), 'bar':(4,5,6)})
bar: [4, 5, 6]
foo: {x: 1, y: 2, z: 3}
How does this work
As you discovered by yourself, a namedtuple does not have a special class; exploring it gives:
>>> collections.namedtuple('Bar', '').mro()
[<class '__main__.Bar'>, <type 'tuple'>, <type 'object'>]
So the instances of the Python named tuples are tuple instances with an additional _asdict() method.
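A quick check of that extra method (output from a Python 2.7 session; newer versions return a plain dict):
>>> collections.namedtuple('Foo', 'x y z')(1, 2, 3)._asdict()
OrderedDict([('x', 1), ('y', 2), ('z', 3)])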

How to create a new unknown or dynamic/expando object in Python

In Python, how can we create a new object without having a predefined class, and later dynamically add properties to it?
example:
dynamic_object = Dynamic()
dynamic_object.dynamic_property_a = "abc"
dynamic_object.dynamic_property_b = "abcdefg"
What is the best way to do it?
EDIT: because many people advised in comments that I might not need this:
The thing is that I have a function that serializes an object's properties. For that reason I don't want to create an object of the expected class, due to some constructor restrictions, but instead create a similar one, let's say like a mock, add any "custom" properties I need, then feed it back to the function.
Just define your own class to do it:
class Expando(object):
    pass

ex = Expando()
ex.foo = 17
ex.bar = "Hello"
If you take the metaclassing approach from @Martijn's answer, @Ned's answer can be rewritten to be shorter (though it's obviously less readable, it does the same thing):
obj = type('Expando', (object,), {})()
obj.foo = 71
obj.bar = 'World'
Or just, which does the same as above using dict argument:
obj = type('Expando', (object,), {'foo': 71, 'bar': 'World'})()
For Python 3, passing object to bases argument is not necessary (see type documentation).
But for simple cases instantiation doesn't have any benefit, so it is okay to do:
ns = type('Expando', (object,), {'foo': 71, 'bar': 'World'})
At the same time, I personally prefer a plain class (i.e. without instantiation) for ad-hoc test configuration cases, as the simplest and most readable option:
class ns:
    foo = 71
    bar = 'World'
Update
In Python 3.3+ there is exactly what OP asks for, types.SimpleNamespace. It's just:
A simple object subclass that provides attribute access to its namespace, as well as a meaningful repr.
Unlike object, with SimpleNamespace you can add and remove attributes. If a SimpleNamespace object is initialized with keyword arguments, those are directly added to the underlying namespace.
import types
obj = types.SimpleNamespace()
obj.a = 123
print(obj.a) # 123
print(repr(obj)) # namespace(a=123)
However, in stdlib of both Python 2 and Python 3 there's argparse.Namespace, which has the same purpose:
Simple object for storing attributes.
Implements equality by attribute names and values, and provides a simple string representation.
import argparse
obj = argparse.Namespace()
obj.a = 123
print(obj.a) # 123
print(repr(obj)) # Namespace(a=123)
Note that both can be initialised with keyword arguments:
types.SimpleNamespace(a='foo', b=123)
argparse.Namespace(a='foo', b=123)
Using an object just to hold values isn't the most Pythonic style of programming. It's common in programming languages that don't have good associative containers, but in Python you can use a dictionary:
my_dict = {} # empty dict instance
my_dict["foo"] = "bar"
my_dict["num"] = 42
You can also use a "dictionary literal" to define the dictionary's contents all at once:
my_dict = {"foo":"bar", "num":42}
Or, if your keys are all legal identifiers (and they will be, if you were planning on them being attribute names), you can use the dict constructor with keyword arguments as key-value pairs:
my_dict = dict(foo="bar", num=42) # note, no quotation marks needed around keys
Filling out a dictionary is in fact what Python is doing behind the scenes when you do use an object, such as in Ned Batchelder's answer. The attributes of his ex object get stored in a dictionary, ex.__dict__, which should end up being equal to an equivalent dict created directly.
Unless attribute syntax (e.g. ex.foo) is absolutely necessary, you may as well skip the object entirely and use a dictionary directly.
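A quick check of the __dict__ equivalence mentioned above, reusing the ex object from Ned Batchelder's answer:
>>> ex.__dict__
{'foo': 17, 'bar': 'Hello'}
>>> ex.__dict__ == dict(foo=17, bar="Hello")
True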
Use the collections.namedtuple() class factory to create a custom class for your return value:
from collections import namedtuple
return namedtuple('Expando', ('dynamic_property_a', 'dynamic_property_b'))('abc', 'abcdefg')
The returned value can be used both as a tuple and by attribute access (assuming the caller stored it as retval):
print retval[0] # prints 'abc'
print retval.dynamic_property_b # prints 'abcdefg'
One way that I found is also by creating a lambda. It can have side effects and comes with some properties that are not wanted. Just posting it for interest:
dynamic_object = lambda: None  # the body is never called; the function object merely carries attributes
dynamic_object.dynamic_property_a = "abc"
dynamic_object.dynamic_property_b = "abcdefg"
I define a dictionary first because it's easy to define. Then I use namedtuple to convert it to an object:
from collections import namedtuple

def dict_to_obj(d):
    return namedtuple("ObjectName", d.keys())(*d.values())

my_dict = {
    'name': 'The mighty object',
    'description': 'Yep! Thats me',
    'prop3': 1234
}
my_obj = dict_to_obj(my_dict)
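Access then works by attribute:
>>> my_obj.name
'The mighty object'
>>> my_obj.prop3
1234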
Ned Batchelder's answer is the best. I just wanted to record a slightly different answer here, which avoids the use of the class keyword (in case that's useful for instructive reasons, demonstration of closure, etc.)
Define a factory function and use the inner function object as the attribute carrier:
def Expando():
    def inst():
        pass
    return inst

ex = Expando()
ex.foo = 17
ex.bar = "Hello"

How to call Python functions dynamically [duplicate]

This question already has answers here:
Calling a function of a module by using its name (a string)
(18 answers)
Closed 4 months ago.
I have this code:
fields = ['name', 'email']

def clean_name():
    pass

def clean_email():
    pass
How can I call clean_name() and clean_email() dynamically?
For example:
for field in fields:
    clean_{field}()
I used the curly brackets because that's how I used to do it in PHP, but it obviously doesn't work here.
How can I do this in Python?
If you don't want to use globals or vars, and don't want to make a separate module and/or class to encapsulate the functions you want to call dynamically, you can call them as attributes of the current module:
import sys
...
getattr(sys.modules[__name__], "clean_%s" % fieldname)()
Using global is a very, very bad way of doing this. You should do it this way:
fields = {'name': clean_name, 'email': clean_email}

for key in fields:
    fields[key]()
Map your functions to values in a dictionary.
Also, using vars()[] is wrong.
It would be better to have a dictionary of such functions than to look in globals().
The usual approach is to write a class with such functions:
class Cleaner(object):
    def clean_name(self):
        pass
and then use getattr to get access to them:
cleaner = Cleaner()
for f in fields:
    getattr(cleaner, 'clean_%s' % f)()
You could even move further and do something like this:
class Cleaner(object):
    def __init__(self, fields):
        self.fields = fields

    def clean(self):
        for f in self.fields:
            getattr(self, 'clean_%s' % f)()
Then inherit it and declare your clean_<name> methods on an inherited class:
cleaner = Cleaner(['one', 'two'])
cleaner.clean()
Actually this can be extended even further to make it cleaner. The first step would probably be adding a check with hasattr() that such a method exists in your class, as sketched below.
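A hedged sketch of that first step (silently skipping unknown fields is my assumption):
class Cleaner(object):
    def __init__(self, fields):
        self.fields = fields

    def clean(self):
        for f in self.fields:
            name = 'clean_%s' % f
            if hasattr(self, name):   # skip fields without a matching method
                getattr(self, name)()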
I have come across this problem twice now, and finally came up with a safe and not ugly solution (in my humble opinion).
RECAP of previous answers:
globals is the hacky, fast & easy method, but you have to be super consistent with your function names, and it can break at runtime if variables get overwritten. Also it's un-pythonic, unsafe, unethical, yadda yadda...
Dictionaries (i.e. string-to-function maps) are safer and easy to use... but it annoys me to no end that I have to spread dictionary assignments across my file, which are easy to lose track of.
Decorators made the dictionary solution come together for me. Decorators are a pretty way to attach side effects & transformations to a function definition.
Example time
fields = ['name', 'email', 'address']

# set up our function dictionary
cleaners = {}

# this is a parametered decorator
def add_cleaner(key):
    # this is the actual decorator
    def _add_cleaner(func):
        cleaners[key] = func
        return func
    return _add_cleaner
Whenever you define a cleaner function, add this to the declaration:
@add_cleaner('email')
def email_cleaner(email):
    # do stuff here
    return result
The functions are added to the dictionary as soon as their definition is parsed and can be called like this:
cleaned_email = cleaners['email'](some_email)
Alternative proposed by PeterSchorn:
def add_cleaner(func):
    cleaners[func.__name__] = func
    return func

@add_cleaner
def email():
    pass  # clean email
This uses the function name of the cleaner method as its dictionary key.
It is more concise, though I think the method names become a little awkward.
Pick your favorite.
globals() will give you a dict of the global namespace. From this you can get the function you want:
f = globals()["clean_%s" % field]
Then call it:
f()
Here's another way:
myscript.py:
def f1():
    print 'f1'

def f2():
    print 'f2'

def f3():
    print 'f3'

test.py:
import myscript

for i in range(1, 4):
    getattr(myscript, 'f%d' % i)()
I had a requirement to call different methods of a class from one of its own methods, based on a list of method names passed as input (for running periodic tasks in FastAPI). To execute methods of Python classes, I have expanded the answer provided by @khachik. Here is how you can achieve it from inside or outside the class:
>>> class Math:
...     def add(self, x, y):
...         return x + y
...     def test_add(self):
...         print(getattr(self, "add")(2, 3))
...
>>> m = Math()
>>> m.test_add()
5
>>> getattr(m, "add")(2,3)
5
Note how you can do it from within the class using self, like this:
getattr(self, "add")(2,3)
And from outside the class using an object of the class like this:
m = Math()
getattr(m, "add")(2,3)
Here's another way: define the functions, then put them in a dict with the names as keys:
>>> z = {"email": clean_email, "name": clean_name}
>>> z['email']()
>>> z['name']()
then you loop over the names as keys.
Or how about this one? Construct a string and use eval:
>>> field = "email"
>>> f = "clean_" + field + "()"
>>> eval(f)
then just loop and construct the strings for eval.
Note that any method that requires constructing a string for evaluation is regarded as kludgy.
for field in fields:
    vars()['clean_' + field]()
In case you have a lot of functions and a different number of parameters:
class Cleaner:
    @classmethod
    def clean(cls, type, *args, **kwargs):
        getattr(cls, f"_clean_{type}")(*args, **kwargs)

    @classmethod
    def _clean_email(cls, *args, **kwargs):
        print("invoked _clean_email function")

    @classmethod
    def _clean_name(cls, *args, **kwargs):
        print("invoked _clean_name function")

for type in ["email", "name"]:
    Cleaner.clean(type)
Output:
invoked _clean_email function
invoked _clean_name function
I would use a dictionary which maps field names to cleaning functions. If some fields don't have a corresponding cleaning function, the for loop handling them can be kept simple by providing some sort of default function for those cases. Here's what I mean:
fields = ['name', 'email', 'subject']

def clean_name():
    pass

def clean_email():
    pass

# (one-time) field to cleaning-function map construction
def get_clean_func(field):
    try:
        return eval('clean_' + field)
    except NameError:
        return lambda: None  # do nothing

clean = dict((field, get_clean_func(field)) for field in fields)

# sample usage
for field in fields:
    clean[field]()
The code above constructs the function dictionary dynamically by determining whether a corresponding function named clean_<field> exists for each name in the fields list. You would likely only have to execute it once, since it remains valid as long as the field list and the available cleaning functions aren't changed.
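A hedged variant of the same map construction that avoids eval, assuming the cleaning functions live at module level:
def get_clean_func(field):
    # look the function up in the module namespace, with a no-op default
    return globals().get('clean_' + field, lambda: None)

clean = dict((field, get_clean_func(field)) for field in fields)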

How to dynamically compose and access class attributes in Python? [duplicate]

This question already has answers here:
How to access (get or set) object attribute given string corresponding to name of that attribute
(3 answers)
Closed 3 years ago.
I have a Python class that has attributes named date1, date2, date3, etc.
During runtime, I have a variable i, which is an integer.
What I want to do is to access the appropriate date attribute in run time based on the value of i.
For example,
if i == 1, I want to access myobject.date1
if i == 2, I want to access myobject.date2
And I want to do something similar for class instead of attribute.
For example, I have a bunch of classes: MyClass1, MyClass2, MyClass3, etc. And I have a variable k.
if k == 1, I want to instantiate a new instance of MyClass1
if k == 2, I want to instantiate a new instance of MyClass2
How can I do that?
EDIT
I'm hoping to avoid using a giant if-then-else statement to select the appropriate attribute/class.
Is there a way in Python to compose the class name on the fly using the value of a variable?
You can use getattr() to access a property when you don't know its name until runtime:
obj = myobject()
i = 7
date7 = getattr(obj, 'date%d' % i)  # same as obj.date7
If you keep your numbered classes in a module called foo, you can use getattr() again to access them by number.
foo.py:
class Class1: pass
class Class2: pass
[ etc ]
bar.py:
import foo

i = 3
someClass = getattr(foo, "Class%d" % i)  # same as someClass = foo.Class3
obj = someClass()                        # someClass is a pointer to foo.Class3

# short version:
obj = getattr(foo, "Class%d" % i)()
Having said all that, you really should avoid this sort of thing because you will never be able to find out where these numbered properties and classes are being used except by reading through your entire codebase. You are better off putting everything in a dictionary.
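A hedged sketch of that dictionary alternative (all names are made up):
import datetime

# instead of myobject.date1 ... dateN, key the dates by integer
dates = {1: datetime.date(2020, 1, 1), 2: datetime.date(2020, 6, 1)}
print(dates[2])

# instead of numbered classes, keep an explicit registry
class MyClassA(object): pass
class MyClassB(object): pass

classes = {1: MyClassA, 2: MyClassB}
obj = classes[1]()   # instantiates MyClassA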
For the first case, you should be able to do:
getattr(myobject, 'date%s' % i)
For the second case, you can do:
myobject = locals()['MyClass%s' % k]()
However, the fact that you need to do this in the first place can be a sign that you're approaching the problem in a very non-Pythonic way.
OK, well... it seems like this needs a bit of work. Firstly, your date* attributes should perhaps be stored as a dict, e.g. myobj.dates[1] and so on.
For the classes, it sounds like you want polymorphism. All of your MyClass* classes should have a common ancestor. The ancestor's __new__ method should figure out which of its children to instantiate.
One way for the parent to know what to make is to keep a dict of the children. There are ways that the parent class doesn't need to enumerate its children by searching for all of its subclasses but it's a bit more complex to implement. See here for more info on how you might take that approach. Read the comments especially, they expand on it.
class Parent(object):
    def __new__(cls, k):
        return object.__new__(Parent._children[k])

class MyClass1(Parent):
    def __init__(self, k):
        self.foo = 1

class MyClass2(Parent):
    def __init__(self, k):
        self.foo = 2

# the registry is filled in after the subclasses exist
Parent._children = {
    1: MyClass1,
    2: MyClass2,
}

bar = Parent(1)
print bar.foo # 1
baz = Parent(2)
print baz.foo # 2
Thirdly, you really should rethink your variable naming. Don't use numbers to enumerate your variables, instead give them meaningful names. i and k are bad to use as they are by convention reserved for loop indexes.
A sample of your existing code would be very helpful in improving it.
To get a list of all the attributes, try:
dir(<class instance>)
I agree with Daenyth, but if you're feeling sassy you can use the __dict__ attribute that comes with all classes:
>>> class nullclass(object):
...     def nullmethod():
...         pass
...
>>> nullclass.__dict__.keys()
['__dict__', '__module__', '__weakref__', 'nullmethod', '__doc__']
>>> nullclass.__dict__["nullmethod"]
<function nullmethod at 0x013366A8>
