I read a post recently where someone mentioned that there is no need for using enums in python. I'm interested in whether this is true or not.
For example, I use an enum to represent modem control signals:
class Signals:
    CTS = "CTS"
    DSR = "DSR"
    ...
Isn't it better that I use if signal == Signals.CTS: than if signal == "CTS":, or am I missing something?
Signals.CTS does seem better than "CTS". But Signals is not an enum, it's a class with specific fields. The claim, as I've heard it, is that you don't need a separate enum language construct, as you can do things like you've done in the question, or perhaps:
CTS, DSR, XXX, YYY, ZZZ = range(5)
If you have that in a signals module, it can be imported and used in a similar fashion, e.g., if signal == signals.CTS:. This is used in several modules in the standard library, including the re and os modules.
In your exact example, I guess it would be okay to use defined constants, since a misspelled constant name raises an error, whereas a typo in a string would not.
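A minimal sketch of that difference, reusing the Signals class from the question (the describe helper and its return strings are made up for illustration):

```python
class Signals:
    CTS = "CTS"
    DSR = "DSR"


def describe(signal):
    # Hypothetical helper, only here to show the comparison.
    if signal == Signals.CTS:
        return "clear to send"
    return "something else"


# A typo in a plain string is silently treated as "no match":
print(describe("CTs"))  # falls through to the default branch

# A typo in the constant name fails loudly instead:
try:
    Signals.CTs
except AttributeError as exc:
    print("caught:", exc)
```

The string typo only shows up as wrong behaviour later on, while the misspelled attribute stops at the line where the mistake was made.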
I guess there is an at least equal solution using object orientation.
BTW: if "CTS": will always be True, since only empty strings are interpreted as False.
It depends on whether you use the values of Signals.CTS and Signals.DSR as data, for example if you send these strings to an actual modem. If so, it is a good idea to have aliases defined as you did, because external interfaces tend to change or be less uniform than you would expect. Otherwise, if you never use the symbols' values, you can skip the layer of abstraction and use strings directly.
The only thing is not to mix internal symbols and external data.
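One way to keep the two apart, as a rough sketch: internal code only ever sees the symbols, and a single table translates to and from whatever the device expects. The wire strings and the encode/decode helper names below are assumptions for the example, not a real modem protocol.

```python
from enum import Enum


class Signal(Enum):
    # Internal symbols; the values are opaque to the rest of the program.
    CTS = 1
    DSR = 2


# Assumed wire format: these strings are invented for the example.
WIRE_NAMES = {Signal.CTS: "CTS", Signal.DSR: "DSR"}
FROM_WIRE = {text: signal for signal, text in WIRE_NAMES.items()}


def encode(signal):
    """Translate an internal symbol to the string sent to the device."""
    return WIRE_NAMES[signal]


def decode(text):
    """Translate a string received from the device to an internal symbol."""
    return FROM_WIRE[text]
```

If the external interface changes, only the table needs editing.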
If you want to have meaningful string constants (CTS = "CTS", etc.), you can simply do:
for constant_name in ('CTS', 'DSR'):  # All constant names go here
    globals()[constant_name] = constant_name
This defines variables CTS and DSR with the values you want. (Reference about the use of globals(): Programmatically creating variables in Python.)
Directly defining your constants at the top level of a module is done in many standard library modules (like for instance the re and os modules [re.IGNORECASE, etc.]), so this approach is quite clean.
I think there's a lot more to load on enumerations (setting arbitrary values, bitwise operations, whitespace-d descriptions).
Please read the very short post below, check the enum class offered, and judge for yourself.
Python Enum Post
Do we need Enums in Python? Do we need an html module, a database module, or a bool type?
I would classify Enums as a nice-to-have, not a must-have.
However, part of the reason Enums finally showed up (in Python 3.4) is that they are such a nice-to-have that many folks reimplemented enums by hand. With so many private and public versions of enumerations, interoperability becomes an issue, standard use becomes an issue, etc., etc.
So to answer your question: No, we don't need an Enum type. But we now have one anyway. There's even a back-ported version.
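For reference, a short sketch of what the standard library type gives you over hand-rolled constants, using the signals from the question (the class name Signal is chosen here for the example):

```python
from enum import Enum


class Signal(Enum):
    CTS = "CTS"
    DSR = "DSR"


# Members can be looked up by value, and an unknown value is rejected.
assert Signal("CTS") is Signal.CTS
try:
    Signal("XYZ")
except ValueError as exc:
    print("rejected:", exc)

# Members are iterable in definition order.
print([member.name for member in Signal])
```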
I understand the sense of using an Enum when it's converting a human-readable string into an underlying (e.g. numeric) value, with the FederalHoliday class in this answer being a good example of that.
But the use-case I'm considering is just where a function parameter is restricted to a set of possible values, which are currently passed as "magic strings". So implementing an Enum here wouldn't really improve code readability (if anything, it would make the code more cluttered). And turning a string into an Enum, only to compare on what is effectively the same string (i.e. the name of the Enum) feels like overkill.
This answer has a fantastic list of general advantages of enums, but I'm not sure how many of them apply in this case.
To clarify the case I'm meaning, there is an example of it here, with a function that prints a string with certain capitalization as specified by mode:
def print_my_string(my_string, mode):
    if mode == 'l':
        print(my_string.lower())
    elif mode == 'u':
        print(my_string.upper())
    elif mode == 'c':
        print(my_string.capitalize())
    else:
        raise ValueError("Unrecognised mode")
To see this in action, running this:
for mode in ['l', 'u', 'c']:
    print_my_string("RaNdoM CAse StRING", mode)
gives:
random case string
RANDOM CASE STRING
Random case string
So my question is:
What advantage does an Enum bring when the strings don't represent another value underneath? Is there any way that it makes the code more robust? Does it add anything?
Other things I've read:
Mainly about Enums in general, especially when the strings represent another value underneath:
How can I represent an 'Enum' in Python?
Python enum, when and where to use?
This seems to have an example similar to mine (under Enums are interoperable), but I struggled to understand the technical discussion around it, and it only shows setting the Enum, not using it:
http://xion.io/post/code/python-enums-are-ok.html
You are right that, Python not being a statically typed language, the benefits of enums are not as strong. Many APIs still use strings for this kind of thing and are just fine. However, I think there are still benefits, especially from the point of view of IDE support (autocompletion) and static analysis (code linting). Imagine you have a function that allows you to compute the norm of a vector, using different methods. Were the options 'Euclidean', 'Absolute' and 'Manhattan', 'EUCLIDEAN', 'ABSOLUTE_VALUE' and 'TAXICAB' or just 'e', 'a', and 'm'? If you have an enum and a nice enough IDE, you can probably write NormType. and press Ctrl+Space to see the options, instead of having to check the documentation once again. And if you write the wrong one, a code linter will probably let you know. Moreover, if you happen to rename one of the options, it should be easier to find all the places that need changing.
That said, I agree that it may just make the code more cluttered in some cases. In your example, the options are probably simple enough to use a string without problem. In any case, the benefits of enums become more relevant when they are used in several places, like several functions with similar parameters where you want to enforce uniformity and want to avoid silly string typos in the code. It is harder to justify the need for an enum for a single parameter of a single function.
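To make the norm example concrete, here is a small sketch. NormType is the name used above; the member names, the norm function and its signature are assumptions for illustration.

```python
from enum import Enum


class NormType(Enum):
    EUCLIDEAN = "euclidean"
    MANHATTAN = "manhattan"


def norm(vector, kind=NormType.EUCLIDEAN):
    """Compute the norm of a sequence of numbers."""
    if kind is NormType.EUCLIDEAN:
        return sum(x * x for x in vector) ** 0.5
    if kind is NormType.MANHATTAN:
        return sum(abs(x) for x in vector)
    raise ValueError("Unrecognised norm type: {!r}".format(kind))


print(norm([3, 4]))                       # 5.0
print(norm([3, -4], NormType.MANHATTAN))  # 7
```

Writing NormType.TAXICAB here raises AttributeError at the call site, and a linter can usually flag it before the code even runs.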
Well, code clarity probably depends on the person and style, but an enum makes it clearer. Consider that if you need to change the string you use as a flag, you would have to refactor all your code, while with an enum you would just need to change it in the enum class.
For me it makes sense to use the enum:
from enum import Enum

class PrintFlag(Enum):
    L = "lower"
    U = "upper"
    C = "capitalize"

def print_my_string(my_string, mode):
    action_dict = {
        PrintFlag.L: str.lower,
        PrintFlag.U: str.upper,
        PrintFlag.C: str.capitalize,
    }
    try:
        print(action_dict[mode](my_string))
    except KeyError:
        raise ValueError("Unrecognised mode")

for mode in PrintFlag:
    print_my_string("RaNdoM CAse StRING", mode)
Here you have a live example
I noticed that many libraries nowadays seem to prefer the use of strings over enum-type variables for parameters.
Where people would previously use enums, e.g. dateutil.rrule.FR for a Friday, it seems that this has shifted towards using string (e.g. 'FRI').
The same goes for numpy (or pandas for that matter), where searchsorted, for example, uses strings (e.g. side='left' or side='right') rather than a defined enum. For the avoidance of doubt, before Python 3.4 this could easily have been implemented as an enum like so:
class SIDE:
    RIGHT = 0
    LEFT = 1
And the advantages of enum-type variables are clear: you can't misspell them without raising an error, they offer proper support for IDEs, etc.
So why use strings at all, instead of sticking to enum types? Doesn't this make the programs much more prone to user errors? It's not like enums create an overhead - if anything they should be slightly more efficient. So when and why did this paradigm shift happen?
I think enums are safer especially for larger systems with multiple developers.
As soon as the need arises to change the value of such an enum, looking up and replacing a string in many places is not my idea of fun :-)
The most important criterion IMHO is the usage: for use within a module or even a package a string seems to be fine; in a public API I'd prefer enums.
[update]
As of today (2019) Python introduced dataclasses - combined with optional type annotations and static type analyzers like mypy I think this is a solved problem.
As for efficiency, attribute lookup is somewhat expensive in Python compared to most other languages, so I guess some libraries may still choose to avoid it for performance reasons.
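One possible reading of the annotation approach, as a sketch only: typing.Literal (available since Python 3.8) lets a checker such as mypy reject a misspelled string before runtime. The searchsorted below is a toy stand-in built on bisect, not numpy's implementation, and the Side alias is a name made up here.

```python
from bisect import bisect_left, bisect_right
from typing import Literal, Sequence

Side = Literal["left", "right"]


def searchsorted(a: Sequence[float], v: float, side: Side = "left") -> int:
    """Toy stand-in for numpy.searchsorted on a sorted Python sequence."""
    if side == "left":
        return bisect_left(a, v)
    if side == "right":
        return bisect_right(a, v)
    # Still validate at runtime: annotations alone are not enforced.
    raise ValueError("Invalid side {!r}".format(side))


print(searchsorted([1, 2, 2, 3], 2))                # 1
print(searchsorted([1, 2, 2, 3], 2, side="right"))  # 3
```

Callers keep passing plain strings, and a call such as searchsorted(a, v, side="rihgt") is reported by the type checker instead of failing later.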
[original answer]
IMHO it is a matter of taste. Some people like this style:
def searchsorted(a, v, side='left', sorter=None):
    ...
    assert side in ('left', 'right'), "Invalid side '{}'".format(side)
    ...

numpy.searchsorted(a, v, side='right')
Yes, if you call searchsorted with side='foo' you may get an AssertionError way later at runtime - but at least the bug will be pretty easy to spot by looking at the traceback.
While other people may prefer (for the advantages you highlighted):
numpy.searchsorted(a, v, side=numpy.CONSTANTS.SIDE.RIGHT)
I favor the first because I think seldom used constants are not worth the namespace cruft. You may disagree, and people may align with either side due to other concerns.
If you really care, nothing prevents you from defining your own "enums":
class SIDE(object):
    RIGHT = 'right'
    LEFT = 'left'

numpy.searchsorted(a, v, side=SIDE.RIGHT)
I think it is not worth it, but again it is a matter of taste.
[update]
Stefan made a fair point:
As soon as the need arises to change the value of such an enum, looking up and replacing a string in many places is not my idea of fun :-)
I can see how painful this can be in a language without named parameters - using the example you have to search for the string 'right' and get a lot of false positives. In Python you can narrow it down searching for side='right'.
Of course if you are dealing with an interface that already has a defined set of enums/constants (like an external C library) then yes, by all means mimic the existing conventions.
I understand this question has already been answered, but there is one thing that has not at all been addressed: the fact that Python Enum objects must be explicitly called for their value when using values stored by Enums.
>>> from enum import Enum
>>> class Test(Enum):
...     WORD = 'word'
...     ANOTHER = 'another'
...
>>> str(Test.WORD.value)
'word'
>>> str(Test.WORD)
'Test.WORD'
One simple solution to this problem is to offer an implementation of __str__()
>>> class Test(Enum):
...     WORD = 'word'
...     ANOTHER = 'another'
...     def __str__(self):
...         return self.value
...
>>> Test.WORD
<Test.WORD: 'word'>
>>> str(Test.WORD)
'word'
Yes, adding .value is not a huge deal, but it is an inconvenience nonetheless. Using regular strings requires zero extra effort, no extra classes, or redefinition of any default class methods. Still, there must be explicit casting to a string value in many cases, where a simple str would not have a problem.
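Another option, sketched here as an alternative rather than something from the answer above, is to mix str into the enum so that members are real strings. Note that what str() and format() return for such members has varied between Python versions, so the example only relies on comparison and string methods.

```python
from enum import Enum


class Test(str, Enum):
    WORD = "word"
    ANOTHER = "another"


# Members are real str instances, so they compare equal to plain strings
# and work with string methods without going through .value.
print(Test.WORD == "word")                  # True
print(Test.WORD.upper())                    # WORD
print("-".join([Test.WORD, Test.ANOTHER]))  # word-another
```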
I prefer strings because they make debugging easier. Compare an object like
side=1, opt_type=0, order_type=6
to
side='BUY', opt_type='PUT', order_type='FILL_OR_KILL'
I also like "enums" where the values are strings:
class Side(object):
    BUY = 'BUY'
    SELL = 'SELL'
    SHORT = 'SHORT'
Strictly speaking Python does not have enums - or at least it didn't prior to v3.4
https://docs.python.org/3/library/enum.html
I prefer to think of your example as programmer defined constants.
In argparse, one set of constants have string values. While the code uses the constant names, users more often use the strings.
e.g. argparse.ZERO_OR_MORE = '*'
argparse.OPTIONAL = '?'
numpy is one of the older 3rd party packages (at least its roots like numeric are). String values are more common than enums. In fact I can't off hand think of any enums (as you define them).
The question
Are there any valid use cases for the ol' module switcheroo, where you replace the module with a class instance? By a valid use case, I mean a case where it would be generally agreed that using this trick would be the best way of solving a problem. For example, the module:
VERSION = (1, 2, 8)
VERSION_NAME = '1.2.8'
Could be converted to this:
import sys

class ConstantsModule(object):
    def __init__(self):
        self.VERSION = (1, 2, 8)

    @property
    def VERSION_NAME(self):
        return u'{}.{}.{}'.format(*self.VERSION)

sys.modules[__name__] = ConstantsModule()
And now VERSION_NAME is a property with logic behind it.
I have googled around for this without finding anything relevant. I learned of this trick in a SO answer I read some time ago, and I know this is something referred to as "black magic" and to be avoided, but I'm curious about the valid use cases.
My specific use case
I have a small problem with one of my modules that could easily be solved if the module was a class instance. I have a "constant" called VERSION_NAME, which is a string version of VERSION, which in turn is a tuple with my application's version information. The VERSION_NAME is used throughout my project and in several other projects based on this one. Now I would like VERSION_NAME to include some logic - I would like it to be based on VERSION so that I don't have to edit it manually all the time, and I would like it to be formatted slightly differently depending on a couple of environmental circumstances. The way I see it I have two choices:
Hunt down every use-case of VERSION_NAME in my project and all its sub-projects and change it to a function call like get_version_name.
Invoke black magic like shown above.
This question is not about my use case though, this is just an example of what I figure it could be used for.
Since everything in Python is an object, there is no black magic about it; this is simply duck typing; if you create an object that walks and talks like a module, then the rest of Python is none the wiser.
However, for your specific use-case, you don't need to resort to this level of deception. Simply calculate version name at import time:
VERSION_NAME = u'{}.{}.{}'.format(*VERSION)
Nowhere is it stated that module globals can only be literal values; just use a Python expression for them instead.
After all, your VERSION_NAME variable is not going to change during the lifetime of your program, you only need to generate it once. Use a property only when you need an attribute that needs to be re-calculated every time you access it.
I'm coding a poker hand evaluator as my first programming project. I've made it through three classes, each of which accomplishes its narrowly-defined task very well:
HandRange = a string-like object (e.g. "AA"). getHands() returns a list of tuples for each specific hand within the string:
[(Ad,Ac),(Ad,Ah),(Ad,As),(Ac,Ah),(Ac,As),(Ah,As)]
Translation = a dictionary that maps the return list from getHands to values that are useful for a given evaluator (yes, this can probably be refactored into another class).
{'As':52, 'Ad':51, ...}
Evaluator = takes a list from HandRange (as translated by Translator), enumerates all possible hand matchups and provides win % for each.
My question: what should my "domain" class for using all these classes look like, given that I may want to connect to it via either a shell UI or a GUI? Right now, it looks like an assembly line process:
user_input = HandRange()
x = Translation.translateList(user_input)
y = Evaluator.getEquities(x)
This smells funny in that it feels like it's procedural when I ought to be using OO.
In a more general way: if I've spent so much time ensuring that my classes are well defined, narrowly focused, orthogonal, whatever ... how do I actually manage work flow in my program when I need to use all of them in a row?
Thanks,
Mike
Don't make a fetish of object orientation -- Python supports multiple paradigms, after all! Think of your user-defined types, AKA classes, as building blocks that gradually give you a "language" that's closer to your domain rather than to general purpose language / library primitives.
At some point you'll want to code "verbs" (actions) that use your building blocks to perform something (under command from whatever interface you'll supply -- command line, RPC, web, GUI, ...) -- and those may be module-level functions as well as methods within some encompassing class. You'll surely want a class if you need multiple instances, and most likely also if the actions involve updating "state" (instance variables of a class being much nicer than globals) or if inheritance and/or polymorphism come into play; but, there is no a priori reason to prefer classes to functions otherwise.
If you find yourself writing static methods, yearning for a singleton (or Borg) design pattern, writing a class with no state (just methods) -- these are all "code smells" that should prompt you to check whether you really need a class for that subset of your code, or rather whether you may be overcomplicating things and should use a module with functions for that part of your code. (Sometimes after due consideration you'll unearth some different reason for preferring a class, and that's all right too, but the point is, don't just pick a class over a module w/functions "by reflex", without critically thinking about it!).
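As a rough illustration of a module-level "verb", the sketch below chains three steps in one plain function. Everything in it is a toy stand-in: get_hands only expands pocket pairs, the translation table is passed in, and the evaluator is whatever callable you hand over, so none of it reflects the real classes from the question.

```python
from itertools import combinations

SUITS = "cdhs"


def get_hands(hand_range):
    """Expand a pocket pair such as "AA" into its suit combinations (toy stand-in)."""
    rank = hand_range[0]
    cards = [rank + suit for suit in SUITS]
    return list(combinations(cards, 2))


def translate(hands, table):
    """Map each card name to the value a given evaluator expects."""
    return [tuple(table[card] for card in hand) for hand in hands]


def evaluate_range(hand_range, table, evaluator):
    """The 'verb': run the whole pipeline as one module-level function."""
    return evaluator(translate(get_hands(hand_range), table))


# Made-up card values; len stands in for a real evaluator.
table = {"Ac": 49, "Ad": 50, "Ah": 51, "As": 52}
print(evaluate_range("AA", table, len))  # 6
```

A shell UI and a GUI can both call evaluate_range directly, with no extra class in between.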
You could create a Poker class that ties these all together and initialize all of that stuff in the __init__() method:
class Poker(object):
    def __init__(self, user_input=HandRange()):
        self.user_input = user_input
        self.translation = Translation.translateList(user_input)
        # Feed the translated list to the evaluator.
        self.evaluator = Evaluator.getEquities(self.translation)
        # and so on...

p = Poker()
# etc, etc...
I have programming experience with statically typed languages. Now, writing code in Python, I have difficulties with its readability. Let's say I have a class Host:
class Host(object):
    def __init__(self, name, network_interface):
        self.name = name
        self.network_interface = network_interface
I don't understand from this definition what "network_interface" should be. Is it a string, like "eth0", or is it an instance of a class NetworkInterface? The only way I can think of to solve this is documenting the code with a "docstring". Something like this:
class Host(object):
    ''' Attributes:
        @name: a string
        @network_interface: an instance of class NetworkInterface'''
Or maybe there are naming conventions for things like that?
Using dynamic languages will teach you something about static languages: all the help you got from the static language that you now miss in the dynamic language, it wasn't all that helpful.
To use your example, in a static language, you'd know that the parameter was a string, and in Python you don't. So in Python you write a docstring. And while you're writing it, you realize you had more to say about it than, "it's a string". You need to say what data is in the string, and what format it should have, and what the default is, and something about error conditions.
And then you realize you should have written all that down for your static language as well. Sure, Java would force you to know that it was a string, but there are all these other details that need to be specified, and you have to do that work manually in any language.
The docstring conventions are at PEP 257.
The example there follows this format for specifying arguments, you can add the types if they matter:
def complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """
    if imag == 0.0 and real == 0.0: return complex_zero
    ...
There was also a rejected PEP for docstrings for attributes (rather than constructor arguments).
The most pythonic solution is to document with examples. If possible, state what operations an object must support to be acceptable, rather than a specific type.
class Host(object):
    def __init__(self, name, network_interface):
        """Initialise host with given name and network_interface.

        network_interface -- must support the same operations as NetworkInterface

        >>> network_interface = NetworkInterface()
        >>> host = Host("my_host", network_interface)
        """
        ...
At this point, hook your source up to doctest to make sure your doc examples continue to work in future.
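A minimal sketch of that hook-up, assuming everything lives in one module. NetworkInterface is an empty placeholder here, since the real class is not shown.

```python
import doctest


class NetworkInterface(object):
    """Placeholder so the example is self-contained."""


class Host(object):
    def __init__(self, name, network_interface):
        """Initialise host with given name and network_interface.

        >>> host = Host("my_host", NetworkInterface())
        >>> host.name
        'my_host'
        """
        self.name = name
        self.network_interface = network_interface


if __name__ == "__main__":
    # Runs every docstring example in this module and reports failures.
    doctest.testmod()
```

Running the file directly then checks the documented examples; python -m doctest on the file does the same without the __main__ block.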
Personally, I have found it very useful to use pylint to validate my code.
If you follow pylint's suggestions, your code almost automatically becomes more readable,
you improve your Python writing skills, and you respect naming conventions. You can also define your own naming conventions and so on. It's very useful, especially for a Python beginner.
I suggest you use it.
Python, though not as overtly typed as C or Java, is still typed and will throw exceptions if you're doing things with types that simply do not play nice together.
To that end, if you're concerned about your code being used correctly, maintained correctly, etc. simply use docstrings, comments, or even more explicit variable names to indicate what the type should be.
Even better yet, include code that will allow it to handle whichever type it may be passed as long as it yields a usable result.
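For the Host example that could look roughly like this. The NetworkInterface class below, with its device attribute, is a placeholder invented for the sketch.

```python
class NetworkInterface(object):
    """Placeholder class; the real one is not shown in the question."""

    def __init__(self, device):
        self.device = device


class Host(object):
    def __init__(self, name, network_interface):
        self.name = name
        # Accept either a device name such as "eth0" or a ready-made object.
        if isinstance(network_interface, str):
            network_interface = NetworkInterface(network_interface)
        self.network_interface = network_interface


print(Host("a", "eth0").network_interface.device)                    # eth0
print(Host("b", NetworkInterface("eth1")).network_interface.device)  # eth1
```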
One benefit of static typing is that types are a form of documentation. When programming in Python, you can document more flexibly and fluently. Of course in your example you want to say that network_interface should implement NetworkInterface, but in many cases the type is obvious from the context, variable name, or by convention, and in these cases by omitting the obvious you can produce more readable code. A common approach is to describe the meaning of a parameter and let the type be implied.
For example:
def Bar(foo, count):
    """Bar the foo the given number of times."""
    ...
This describes the function tersely and precisely. What foo and bar mean will be obvious from context, and that count is a (positive) integer is implicit.
For your example, I'd just mention the type in the document string:
"""Create a named host on the given NetworkInterface."""
This is shorter, more readable, and contains more information than a listing of the types.