Recommended Python Modules for Function Argument Handling?

There are many Python modules for parsing and coordinating command line options (argparse, getopt, blargs, etc). And Python is blessed with good built-in features/idioms for handling varied function arguments (e.g., default values, *varargs, **keyword_args). But when I read various projects' code for top-level functions, I see notably less discipline and standardization of function arguments than command line arguments.
For simple functions, this isn't an issue; the built-in argument features work great and are more than sufficient. But there are a lot of functionally rich modules whose top-level functions provide lots of different arguments and options (some complementary or exclusive), different modes of operation, defaults, overrides, etc. -- that is, they have argument complexity approaching that of command line arguments. And they seem to largely handle their arguments in ad hoc ways.
Given the number of command line processing modules out there, and how refined they've become over time, I'd expect at least a few modules for simplifying the wrangling of complicated function arguments. But I've searched PyPi, stackoverflow, and Google without success. So...are there function (not command line!) argument handling modules you would recommend?
---update with example---
It's hard to give a truly simple concrete example because the use case doesn't appear until you're dealing with a sophisticated module. But here's a shot at explaining the problem in code: a formatter module with defaults that can be overridden at formatter instantiation, or when the function/method is called. Despite having only a few options, there's already an awful lot of option-handling verbiage, and the option names are repeated ad nauseam.
defaults = {'indent': 4,
            'prefix': None,
            'suffix': None,
            'name': 'aFormatter',
            'reverse': False,
            'show_name': False}

class Formatter(object):
    def __init__(self, **kwargs):
        self.name = kwargs.get('name', defaults['name'])
        self.indent = kwargs.get('indent', defaults['indent'])
        self.prefix = kwargs.get('prefix', defaults['prefix'])
        self.suffix = kwargs.get('suffix', defaults['suffix'])
        self.reverse = kwargs.get('reverse', defaults['reverse'])
        self.show_name = kwargs.get('show_name', defaults['show_name'])

    def show_lower(self, *args, **kwargs):
        indent = kwargs.get('indent', self.indent) or 0
        prefix = kwargs.get('prefix', self.prefix)
        suffix = kwargs.get('suffix', self.suffix)
        reverse = kwargs.get('reverse', self.reverse)
        show_name = kwargs.get('show_name', self.show_name)
        strings = []
        if show_name:
            strings.append(self.name + ": ")
        if indent:
            strings.append(" " * indent)
        if prefix:
            strings.append(prefix)
        for a in args:
            strings.append(a.upper() if reverse else a.lower())
        if suffix:
            strings.append(suffix)
        print ''.join(strings)

if __name__ == '__main__':
    fmt = Formatter()
    fmt.show_lower("THIS IS GOOD")
    fmt.show_lower("THIS", "IS", "GOOD")
    fmt.show_lower('this IS good', reverse=True)
    fmt.show_lower("something!", show_name=True)
    upper = Formatter(reverse=True)
    upper.show_lower("this is good!")
    upper.show_lower("and so is this!", reverse=False)

When I first read your question, I thought to myself that you're asking for a band-aid module, and that it doesn't exist because nobody wants to write a module that enables bad design to persist.
But I realized that the situation is more complex than that. The point of creating a module such as the one you describe is to create reusable, general-case code. Now, it may well be that there are some interfaces that are justifiably complex. But those interfaces are precisely the interfaces that probably can't be handled easily by general-case code. They are complex because they address a problem domain with a lot of special cases.
In other words, if an interface really can't be refactored, then it probably requires a lot of custom, special-case code that isn't predictable enough to be worth generalizing in a module. Conversely, if an interface can easily be patched up with a module of the kind you describe, then it probably can also be refactored -- in which case it should be.

I don't think command line parsing and function argument processing have much in common. The main issue with the command line is that the only available data structure is a flat list of strings, and you don't have an instrument like a function header available to define what each string means. In the header of a Python function, you can give names to each of the parameters, you can accept containers as parameters, you can define default argument values, etc. What a command line parsing library actually does is provide for the command line some of the features Python offers for function calls: giving names to parameters, assigning default values, converting to the desired types, and so on. In Python, all these features are built-in, so you don't need a library to reach that level of convenience.
Regarding your example, there are numerous ways this design can be improved using the features the language offers. You could use default argument values instead of your defaults dictionary, or you could encapsulate all the flags in a FormatterConfig class and pass a single argument instead of all those arguments again and again. But let's just assume you want exactly the interface you gave in the example code. One way to achieve it would be the following code:
class Config(dict):
    def __init__(self, config):
        dict.__init__(self, config)
        self.__dict__ = self

def get_config(kwargs, defaults):
    config = defaults.copy()
    config.update(kwargs)
    return Config(config)

class Formatter(object):
    def __init__(self, **kwargs):
        self.config = get_config(kwargs, defaults)

    def show_lower(self, *args, **kwargs):
        config = get_config(kwargs, self.config)
        strings = []
        if config.show_name:
            strings.append(config.name + ": ")
        strings.append(" " * config.indent)
        if config.prefix:
            strings.append(config.prefix)
        for a in args:
            strings.append(a.upper() if config.reverse else a.lower())
        if config.suffix:
            strings.append(config.suffix)
        print "".join(strings)
Python offers a lot of tools to do this kind of argument handling. So even if we decide not to use some of them (like default arguments), we can still avoid repeating ourselves too much.
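A couple of illustrative calls against this version (reusing the defaults dict from the question) behave just like the original:
fmt = Formatter()
fmt.show_lower("THIS", "IS", "GOOD")        # instance defaults apply
fmt.show_lower("override me", suffix="!")   # per-call override, as before
loud = Formatter(reverse=True)
loud.show_lower("this is good")             # instantiation-time override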

If your API is so complex that you think it would be easier to use some module to process the options that were passed to you, there's a good chance the actual solution is to simplify your API. The fact that some modules have very complex calling conventions is a shame, not a feature.

It's in the developer's hands, but if you're writing a library that may be useful to other projects or will be published to other users, then I think you first need to identify your problem and analyse it:
Document your functions well. It's good to minimize the number of arguments and to provide default values for arguments that users may have trouble specifying exactly.
For some complex requirements you can provide special classmethods that can be overridden for advanced programming, or by advanced users who really want to push the library; inheritance is always there.
Reading PEP 8 may also help, but the ultimate goal is to require the minimum number of arguments, restrict users to entering the required ones, and provide default values for the optional ones -- so that your library / code is easily understandable by ordinary developers too.
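As a rough sketch of that advice (all names here are hypothetical, not from any real library): few required arguments, defaults for the optional ones, and a classmethod hook that subclasses can override for advanced use.
class Exporter(object):
    """Hypothetical library class: one required argument, the rest defaulted."""
    def __init__(self, path, encoding="utf-8", overwrite=False):
        self.path = path
        self.encoding = encoding
        self.overwrite = overwrite

    @classmethod
    def for_advanced_use(cls, path, **options):
        # Subclasses can override this hook to accept or
        # transform extra options for advanced users.
        return cls(path, **options)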

You could write more generic code for the defaulting by thinking about it the other way around: iterate through the defaults and fill in each keyword only if it doesn't already exist.
defaults = {'indent': 4,
            'prefix': None,
            'suffix': None,
            'name': 'aFormatter',
            'reverse': False,
            'show_name': False}

class Formatter(object):
    def __init__(self, **kwargs):
        for d, dv in defaults.iteritems():
            kwargs[d] = kwargs.get(d, dv)
Side note: I'd recommend using keyword args with defaults in the __init__ method definition. That way the function definition really becomes the contract with other developers and users of your class (Formatter):
def __init__(self, indent=4, reverse=False .....etc..... ):
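Spelled out for the question's options (same names and defaults as the defaults dict above), that contract might look like this sketch:
class Formatter(object):
    def __init__(self, indent=4, prefix=None, suffix=None,
                 name='aFormatter', reverse=False, show_name=False):
        # The signature itself now documents every option and its default.
        self.indent = indent
        self.prefix = prefix
        self.suffix = suffix
        self.name = name
        self.reverse = reverse
        self.show_name = show_name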

Related

Methods of creating syntax highlighting in textX?

As I cannot find any guidelines about syntax highlighting, I decided to prepare a simple write-as-plain-text-and-then-highlight-everything-in-HTML-preview approach, which is enough for my scope at the moment.
By overriding many custom meta-model classes I have a to_source method, which actually reimplements the whole syntax in reverse, as reverse parsing is not yet available. It's fine, but it ignores user formatting.
To retain user formatting, we can use the only thing available: _tx_position and _tx_position_end. Descending from the main textX rule to its children via the stored custom meta-model class attributes works for most cases, but it fails with primitives.
# textX meta-model file
NonsenseProgram:
    "begin" foo=Foo "," count=INT "end"
;
Foo:
    "fancy" a=ID "separator" b=ID "finished"
;
# textX custom meta-model classes
class NonsenseProgram():
    def __init__(self, foo, count):
        self.foo = foo
        self.count = count

    def to_source(self):
        pass  # some recursive magic that uses _tx_position and _tx_position_end

class Foo():
    def __init__(self, parent, a, b):
        self.parent = parent
        self.a = a
        self.b = b

    def to_source(self):
        pass  # some recursive magic that uses _tx_position and _tx_position_end
Let's consider the given example. As we have NonsenseProgram and Foo classes that we can override, we are in control of their returned source as a whole. We can modify the generated code of NonsenseProgram, or of the NonsenseProgram.foo fragment (by overriding Foo), by accessing its _tx_* attributes. We can't do the same with NonsenseProgram.count, Foo.a, and Foo.b, as we have primitive string or int values.
Depending on how primitives are used in our grammar, we have the following options:
Wrap every primitive with a rule that contains only that primitive and nothing else.
Pros: It just works right now!
Cons: Produces massive overhead of nested values that our grammar toolchain needs to handle. It's messing with the grammar just to be pretty...
Ignore the user's formatting and use only our reverse-parsing rules.
Pros: It just works too!
Cons: You need to reimplement your syntax for nearly every grammar element, and it forces a code reformat on every highlighting attempt.
Use some external highlighting rules.
Pros: It would work...
Cons: Again, grammar reimplementation.
Use a language server.
Pros: Would be the best option in the long run.
Cons: It's only mentioned once, without any in-depth docs.
Any suggestions about any other options?
You are right; there is no information on position for primitive types. It seems that you have covered the available options at the moment.
An easy option to implement would be to add bookkeeping of the positions of all attributes directly to textX, as a special structure on each created object (e.g. a dict keyed by attribute name). It should be straightforward to implement, so you can register a feature request in the issue tracker if you wish.
There was some work in the past to provide full language services for textX-based languages. The idea is to get all the features you would expect from a decent code editor/IDE for any language specified using textX.
The work stalled for a while but resumed recently as a full rewrite. It should be officially supported by the textX team. You can follow the progress here. Although the project doesn't mention syntax highlighting at the moment, it is on our agenda.

The most 'Pythonic' way to handle overloading

Disclaimer: this is perhaps a quite subjective question with no 'right' answer but I'd appreciate any feedback on best-practices and program design. So here goes:
I am writing a library where text files are read into Text objects. Now these might be initialized with a list of file-names or directly with a list of Sentence objects. I am wondering what the best / most Pythonic way to do this might be because, if I understand correctly, Python doesn't directly support method overloading.
One example I found in Scikit-Learn's feature extraction module simply passes the type of the input as an argument while initializing the object. I assume that once this parameter is set it's just a matter of handling the different cases internally:
if input == 'filename':
    # glob and read files
    pass
elif input == 'content':
    # do something else
    pass
While this is easy to implement, it doesn't look like a very elegant solution. So I am wondering if there is a better way to handle multiple types of inputs to initialize a class that I am overlooking.
One way is to just create classmethods with different names for the different ways of instantiating the object:
class Text(object):
    def __init__(self, data):
        # handle data in whatever "basic" form you need
        pass

    @classmethod
    def fromFiles(cls, files):
        # process list of filenames into the form that `__init__` needs
        return cls(processed_data)

    @classmethod
    def fromSentences(cls, sentences):
        # process list of Sentence objects into the form that `__init__` needs
        return cls(processed_data)
This way you just create one "real" or "canonical" initialization method that accepts whatever "lowest common denominator" format you want. The specialized fromXXX methods can preprocess different types of input to convert them into the form they need to be in to pass to that canonical instantiation. The idea is that you call Text.fromFiles(...) to make a Text from filenames, or Text.fromSentences(...) to make a Text from sentence objects.
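Hypothetical usage, assuming fromFiles and fromSentences do the preprocessing sketched above (Sentence is the class from the question):
text_from_files = Text.fromFiles(["ch1.txt", "ch2.txt"])
text_from_sents = Text.fromSentences([Sentence("Hello."), Sentence("World.")])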
It can also be acceptable to do some simple type-checking if you just want to accept one of a few enumerable kinds of input. For instance, it's not uncommon for a class to accept either a filename (as a string) or a file object. In that case you'd do:
def __init__(self, file):
    if isinstance(file, basestring):
        # If a string filename was passed in, open the file before proceeding
        file = open(file)
    # Now you can handle file as a file object
This becomes unwieldy if you have many different types of input to handle, but if it's something relatively contained like this (e.g., an object or the string "name" that can be used to get that object), it can be simpler than the first method I showed.
You can use duck typing: first you treat the arguments as if they are of type X; if that raises an exception, you assume they are of type Y, and so on:
class Text(object):
    def __init__(self, *init_vals):
        try:
            fileobjs = [open(fname) for fname in init_vals]
        except TypeError:
            # Then we consider them as file objects.
            fileobjs = init_vals
        try:
            sentences = [parse_sentences(fobj) for fobj in fileobjs]
        except TypeError:
            # Then init_vals are Sentence objects.
            sentences = fileobjs
Note that the absence of type checking means that the method actually accepts any type that implements one of the interfaces you actually use (e.g. file-like objects, Sentence-like objects, etc.).
This method becomes quite heavy if you want to support a lot of different types, but I'd consider that bad code design. Accepting more than two to four types as initializers will probably confuse any programmer who uses your class, since they will always have to think "wait, did X also accept Y, or was it Z that accepted Y...".
It's probably better to design the constructor to accept only two or three different interfaces and provide the user with functions/classes that convert other commonly used types to those interfaces.

Class with too many parameters: better design strategy?

I am working with models of neurons. One class I am designing is a cell class which is a topological description of a neuron (several compartments connected together). It has many parameters but they are all relevant, for example:
number of axon segments, apical bifibrications, somatic length, somatic diameter, apical length, branching randomness, branching length and so on and so on... there are about 15 parameters in total!
I can set all these to some default value but my class looks crazy with several lines for parameters. This kind of thing must happen occasionally to other people too, is there some obvious better way to design this or am I doing the right thing?
UPDATE:
As some of you have asked I have attached my code for the class, as you can see this class has a huge number of parameters (>15) but they are all used and are necessary to define the topology of a cell. The problem essentially is that the physical object they create is very complex. I have attached an image representation of objects produced by this class. How would experienced programmers do this differently to avoid so many parameters in the definition?
class LayerV(__Cell):
    def __init__(self, somatic_dendrites=10, oblique_dendrites=10,
                 somatic_bifibs=3, apical_bifibs=10, oblique_bifibs=3,
                 L_sigma=0.0, apical_branch_prob=1.0,
                 somatic_branch_prob=1.0, oblique_branch_prob=1.0,
                 soma_L=30, soma_d=25, axon_segs=5, myelin_L=100,
                 apical_sec1_L=200, oblique_sec1_L=40, somadend_sec1_L=60,
                 ldecf=0.98):
        import random
        import math

        # make the main regions:
        axon = Axon(n_axon_seg=axon_segs)
        soma = Soma(diam=soma_d, length=soma_L)
        main_apical_dendrite = DendriticTree(bifibs=apical_bifibs,
                                             first_sec_L=apical_sec1_L,
                                             L_sigma=L_sigma,
                                             L_decrease_factor=ldecf,
                                             first_sec_d=9,
                                             branch_prob=apical_branch_prob)

        # make the somatic dendrites:
        somatic_dends = self.dendrite_list(num_dends=somatic_dendrites,
                                           bifibs=somatic_bifibs,
                                           first_sec_L=somadend_sec1_L,
                                           first_sec_d=1.5, L_sigma=L_sigma,
                                           branch_prob=somatic_branch_prob,
                                           L_decrease_factor=ldecf)

        # make the oblique dendrites:
        oblique_dends = self.dendrite_list(num_dends=oblique_dendrites,
                                           bifibs=oblique_bifibs,
                                           first_sec_L=oblique_sec1_L,
                                           first_sec_d=1.5, L_sigma=L_sigma,
                                           branch_prob=oblique_branch_prob,
                                           L_decrease_factor=ldecf)

        # connect axon to soma:
        axon_section = axon.get_connecting_section()
        self.soma_body = soma.body
        soma.connect(axon_section, region_end=1)

        # connect apical dendrite to soma:
        apical_dendrite_firstsec = main_apical_dendrite.get_connecting_section()
        soma.connect(apical_dendrite_firstsec, region_end=0)

        # connect oblique dendrites to the apical first section:
        for dendrite in oblique_dends:
            apical_location = math.exp(-5 * random.random())  # for now connecting randomly, but need to do this on some linspace
            apsec = dendrite.get_connecting_section()
            apsec.connect(apical_dendrite_firstsec, apical_location, 0)

        # connect dendrites to soma:
        for dend in somatic_dends:
            dendsec = dend.get_connecting_section()
            soma.connect(dendsec, region_end=random.random())  # for now connecting randomly, but need to do this on some linspace

        # assign public sections
        self.axon_iseg = axon.iseg
        self.axon_hill = axon.hill
        self.axon_nodes = axon.nodes
        self.axon_myelin = axon.myelin
        self.axon_sections = [axon.hill] + [axon.iseg] + axon.nodes + axon.myelin
        self.soma_sections = [soma.body]
        self.apical_dendrites = main_apical_dendrite.all_sections + self.seclist(oblique_dends)
        self.somatic_dendrites = self.seclist(somatic_dends)
        self.dendrites = self.apical_dendrites + self.somatic_dendrites
        self.all_sections = self.axon_sections + self.soma_sections + self.dendrites
UPDATE: This approach may be suited to your specific case, but it definitely has its downsides; see is kwargs an antipattern?
Try this approach:
class Neuron(object):
    def __init__(self, **kwargs):
        prop_defaults = {
            "num_axon_segments": 0,
            "apical_bifibrications": "fancy default",
            ...
        }
        for (prop, default) in prop_defaults.iteritems():
            setattr(self, prop, kwargs.get(prop, default))
You can then create a Neuron like this:
n = Neuron(apical_bifibrications="special value")
I'd say there is nothing wrong with this approach - if you need 15 parameters to model something, you need 15 parameters. And if there's no suitable default value, you have to pass in all 15 parameters when creating an object. Otherwise, you could just set the default and change it later via a setter or directly.
Another approach is to create subclasses for certain common kinds of neurons (in your example) and provide good defaults for certain values, or derive the values from other parameters.
Or you could encapsulate parts of the neuron in separate classes and reuse these parts for the actual neurons you model. I.e., you could write separate classes for modeling a synapse, an axon, the soma, etc.
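A minimal sketch of that composition idea, reusing parameter names from the question; the part classes and their groupings are assumptions:
class Soma(object):
    def __init__(self, diam=25, length=30):
        self.diam = diam
        self.length = length

class Axon(object):
    def __init__(self, n_axon_seg=5, myelin_L=100):
        self.n_axon_seg = n_axon_seg
        self.myelin_L = myelin_L

class Neuron(object):
    def __init__(self, soma=None, axon=None):
        # Each part carries its own parameters and defaults.
        self.soma = soma if soma is not None else Soma()
        self.axon = axon if axon is not None else Axon()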
You could perhaps use a Python "dict" object?
http://docs.python.org/tutorial/datastructures.html#dictionaries
Having so many parameters suggests that the class is probably doing too many things.
I suggest that you want to divide your class into several classes, each of which take some of your parameters. That way each class is simpler and won't take so many parameters.
Without knowing more about your code, I can't say exactly how you should split it up.
Looks like you could cut down the number of arguments by constructing objects such as Axon, Soma, and DendriticTree outside of the LayerV constructor, and passing those objects instead.
Some of the parameters are only used to construct e.g. the DendriticTree, while others are used in other places as well, so the problem is not as clear-cut; but I would definitely try that approach.
Could you supply some example code of what you are working on? It would help us get an idea of what you are doing and get help to you sooner.
If it's just the arguments you are passing to the class that make it long, you don't have to put them all in __init__. You can set the parameters after you create the instance, or pass a dictionary/class full of the parameters as an argument.
class MyClass(object):
    def __init__(self, **kwargs):
        self.arg1 = None
        self.arg2 = None
        self.arg3 = None
        for (key, value) in kwargs.iteritems():
            # Only accept keyword arguments that match a known attribute.
            if hasattr(self, key):
                setattr(self, key, value)

if __name__ == "__main__":
    a_class = MyClass()
    a_class.arg1 = "A string"
    a_class.arg2 = 105
    a_class.arg3 = ["List", 100, 50.4]

    b_class = MyClass(arg1="A string", arg2=105, arg3=["List", 100, 50.4])
After looking over your code and realizing I have no idea how any of those parameters relate to each other (solely because of my lack of knowledge on the subject of neuroscience), I would point you to a very good book on object-oriented design. Building Skills in Object Oriented Design by Steven F. Lott is an excellent read, and I think it would help you, and anyone else, in laying out object-oriented programs.
It is released under the Creative Commons License, so it is free for you to use; here is a link to it in PDF format: http://homepage.mac.com/s_lott/books/oodesign/build-python/latex/BuildingSkillsinOODesign.pdf
I think your problem boils down to the overall design of your classes. Sometimes, though very rarely, you need a whole lot of arguments to initialize an object, and most of the responses here have detailed other ways of initialization, but in a lot of cases you can break the class up into smaller, easier-to-handle, less cumbersome classes.
This is similar to the other solutions that iterate through a default dictionary, but it uses a more compact notation:
class MyClass(object):
    def __init__(self, **kwargs):
        self.__dict__.update(dict(
            arg1=123,
            arg2=345,
            arg3=678,
        ), **kwargs)
Can you give a more detailed use case? Maybe a prototype pattern will work:
If there are some similarities in groups of objects, a prototype pattern might help. Do you have a lot of cases where one population of neurons is just like another except different in some way? (I.e., rather than having a small number of discrete classes, you have a large number of classes that slightly differ from each other.)
Python is a class-based language, but just as you can simulate class-based programming in a prototype-based language like JavaScript, you can simulate prototypes by giving your class a clone method that creates a new object and populates its ivars from the parent. Write the clone method so that keyword parameters passed to it override the "inherited" parameters, so you can call it with something like:
new_neuron = old_neuron.clone(branching_length=n1, branching_randomness=r2)
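A minimal sketch of such a clone method, assuming the parameters are stored as instance attributes:
import copy

class Neuron(object):
    def __init__(self, **params):
        self.__dict__.update(params)

    def clone(self, **overrides):
        # Copy this neuron's parameters, then let keyword
        # arguments override the "inherited" values.
        params = copy.deepcopy(self.__dict__)
        params.update(overrides)
        return type(self)(**params)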
I have never had to deal with this situation, or this topic. Your description implies to me that you may find, as you develop the design, that there are a number of additional classes that will become relevant - compartment is the most obvious. If these do emerge as classes in their own right, it is probable that some of your parameters become parameters of these additional classes.
You could create a class for your parameters: instead of passing a bunch of parameters, you pass one object.
In my opinion, in your case the easy solution is to pass higher-order objects as parameters.
For example, in your __init__ you have a DendriticTree that uses several arguments from your main class LayerV:
main_apical_dendrite = DendriticTree(
    bifibs=apical_bifibs,
    first_sec_L=apical_sec1_L,
    L_sigma=L_sigma,
    L_decrease_factor=ldecf,
    first_sec_d=9,
    branch_prob=apical_branch_prob
)
Instead of passing these 6 arguments to your LayerV you would pass the DendriticTree object directly (thus saving 5 arguments).
You probably want to have these values accessible everywhere, so you will have to save this DendriticTree:
class LayerV(__Cell):
    def __init__(self, main_apical_dendrite, ...):
        self.main_apical_dendrite = main_apical_dendrite
If you want to have a default value too, you can have:
class LayerV(__Cell):
    def __init__(self, main_apical_dendrite=None, ...):
        self.main_apical_dendrite = main_apical_dendrite or DendriticTree()
This way you delegate what the default DendriticTree should be to the class dedicated to that matter, instead of having that logic in the higher-order class LayerV.
Finally, when you need to access the apical_bifibs you used to pass to LayerV you just access it via self.main_apical_dendrite.bifibs.
In general, even if the class you are creating is not a clear composition of several classes, your goal is to find a logical way to split your parameters -- not only to make your code cleaner, but mostly to help people understand what these parameters will be used for. In the extreme cases where you can't split them, I think it's totally OK to have a class with that many parameters. If there is no clear way to split the arguments, then you'll probably end up with something even less clear than a list of 15 arguments.
If you feel that creating a class to group parameters together is overkill, then you can simply use collections.namedtuple, which can have default values, as shown here.
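For example, a sketch with field names taken from the question; note that the defaults keyword requires Python 3.7+:
from collections import namedtuple

# Group the dendrite-related parameters into one lightweight object.
DendriteParams = namedtuple(
    "DendriteParams",
    ["bifibs", "first_sec_L", "L_sigma", "branch_prob"],
    defaults=[10, 200, 0.0, 1.0],
)

params = DendriteParams(L_sigma=0.5)   # other fields keep their defaults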
I want to reiterate what a number of people have said: there's nothing wrong with that number of parameters, especially when it comes to scientific computing/programming.
Take, for example, sklearn's KMeans++ clustering implementation, which has 11 parameters you can init with. There are numerous examples like that, and nothing wrong with them.
I would say there is nothing wrong as long as you make sure you need those params. If you really want to make it more readable, I would recommend the following style.
I wouldn't say it's a best practice or anything, but it makes it easy for others to see what is necessary for this object and what is optional.
class LayerV(__Cell):
    # author: {name, url} who made this info
    def __init__(self, no_default_params, some_necessary_params):
        self.necessary_param = some_necessary_params
        self.no_default_param = no_default_params
        self.something_else = "default"
        self.some_option = "default"

    def b_option(self, value):
        self.some_option = value
        return self

    def b_else(self, value):
        self.something_else = value
        return self
I think the benefits of this style are:
You can easily see which params are necessary from the __init__ method.
Unlike with setters, you don't need two lines to construct the object if you need to set an option value.
The disadvantage is that you create more methods in your class than before.
sample:
la = LayerV("no_default", "necessary").b_else("sample_else")
After all, if you have a lot of "necessary" and "no default" params, always ask whether this class (or method) is doing too many things.
If the answer is no, just go ahead.

Python code readability

I have programming experience with statically typed languages. Now, writing code in Python, I feel difficulties with its readability. Let's say I have a class Host:
class Host(object):
    def __init__(self, name, network_interface):
        self.name = name
        self.network_interface = network_interface
I don't understand from this definition what network_interface should be. Is it a string, like "eth0", or is it an instance of a class NetworkInterface? The only way I can think of to solve this is documenting the code with a docstring. Something like this:
class Host(object):
    '''Attributes:
    @name: a string
    @network_interface: an instance of class NetworkInterface'''
Or maybe there are naming conventions for things like that?
Using dynamic languages will teach you something about static languages: all the help you got from the static language that you now miss in the dynamic language, it wasn't all that helpful.
To use your example, in a static language, you'd know that the parameter was a string, and in Python you don't. So in Python you write a docstring. And while you're writing it, you realize you had more to say about it than, "it's a string". You need to say what data is in the string, and what format it should have, and what the default is, and something about error conditions.
And then you realize you should have written all that down for your static language as well. Sure, Java would force you know that it was a string, but there's all these other details that need to be specified, and you have to manually do that work in any language.
The docstring conventions are at PEP 257.
The example there follows this format for specifying arguments; you can add the types if they matter:
def complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """
    if imag == 0.0 and real == 0.0:
        return complex_zero
    ...
There was also a rejected PEP for docstrings for attributes (rather than constructor arguments).
The most pythonic solution is to document with examples. If possible, state what operations an object must support to be acceptable, rather than a specific type.
class Host(object):
    def __init__(self, name, network_interface):
        """Initialise host with given name and network_interface.

        network_interface -- must support the same operations as NetworkInterface

        >>> network_interface = NetworkInterface()
        >>> host = Host("my_host", network_interface)
        """
        ...
At this point, hook your source up to doctest to make sure your doc examples continue to work in future.
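One common way to hook that up is the standard doctest self-test idiom:
if __name__ == "__main__":
    import doctest
    doctest.testmod()   # runs every >>> example found in the module's docstrings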
Personally, I have found it very useful to use pylint to validate my code.
If you follow pylint's suggestions, your code almost automatically becomes more readable, you will improve your Python writing skills, and you will respect naming conventions. You can also define your own naming conventions, and so on. It's very useful, especially for a Python beginner.
I suggest you use it.
Python, though not as overtly typed as C or Java, is still typed and will throw exceptions if you're doing things with types that simply do not play nice together.
To that end, if you're concerned about your code being used correctly, maintained correctly, etc. simply use docstrings, comments, or even more explicit variable names to indicate what the type should be.
Even better yet, include code that will allow it to handle whichever type it may be passed as long as it yields a usable result.
One benefit of static typing is that types are a form of documentation. When programming in Python, you can document more flexibly and fluently. Of course in your example you want to say that network_interface should implement NetworkInterface, but in many cases the type is obvious from the context, variable name, or by convention, and in these cases by omitting the obvious you can produce more readable code. Common is to describe the meaning of a parameter and implicitly giving the type.
For example:
def Bar(foo, count):
    """Bar the foo the given number of times."""
    ...
This describes the function tersely and precisely. What foo and bar mean will be obvious from context, and that count is a (positive) integer is implicit.
For your example, I'd just mention the type in the document string:
"""Create a named host on the given NetworkInterface."""
This is shorter, more readable, and contains more information than a listing of the types.

Python 3 and static typing

I didn't really pay as much attention to Python 3's development as I would have liked, and only just noticed some interesting new syntax changes. Specifically, from this SO answer, function parameter annotations:
def digits(x:'nonnegative number') -> "yields number's digits":
    # ...
Not knowing anything about this, I thought it could maybe be used for implementing static typing in Python!
After some searching, there seemed to be a lot of discussion regarding (entirely optional) static typing in Python, such as that mentioned in PEP 3107, and "Adding Optional Static Typing to Python" (and part 2).
..but, I'm not clear how far this has progressed. Are there any implementations of static typing, using the parameter-annotation? Did any of the parameterised-type ideas make it into Python 3?
Thanks for reading my code!
Indeed, it's not hard to create a generic annotation enforcer in Python. Here's my take:
'''Very simple enforcer of type annotations.

This toy super-decorator can decorate all functions in a given module that have
annotations so that the type of input and output is enforced; an AssertionError is
raised on mismatch.

This module also has a test function func() which should fail and logging facility
log which defaults to print.

Since this is a test module, I cut corners by only checking *keyword* arguments.
'''
import sys

log = print

def func(x:'int' = 0) -> 'str':
    '''An example function that fails type checking.'''
    return x

# For simplicity, I only do keyword args.
def check_type(*args):
    param, value, assert_type = args
    log('Checking {0} = {1} of {2}.'.format(*args))
    if not isinstance(value, assert_type):
        raise AssertionError(
            'Check failed - parameter {0} = {1} not {2}.'
            .format(*args))
    return value

def decorate_func(func):
    def newf(*args, **kwargs):
        for k, v in kwargs.items():
            check_type(k, v, ann[k])
        return check_type('<return_value>', func(*args, **kwargs), ann['return'])
    ann = {k: eval(v) for k, v in func.__annotations__.items()}
    newf.__doc__ = func.__doc__
    newf.__type_checked = True
    return newf

def decorate_module(module = '__main__'):
    '''Enforces type from annotation for all functions in module.'''
    d = sys.modules[module].__dict__
    for k, f in d.items():
        if getattr(f, '__annotations__', {}) and not getattr(f, '__type_checked', False):
            log('Decorated {0!r}.'.format(f.__name__))
            d[k] = decorate_func(f)

if __name__ == '__main__':
    decorate_module()
    # This will raise AssertionError.
    func(x = 5)
Given this simplicity, it's strange at first sight that this thing is not mainstream. However, I believe there are good reasons why it's not as useful as it might seem. Generally, type checking helps because if you add an integer and a dictionary, chances are you made some obvious mistake (and if you meant something reasonable, it's still better to be explicit than implicit).
But in real life you often mix quantities of the same computer type, as seen by the compiler, but of clearly different human types. For example, the following snippet contains an obvious mistake:
height = 1.75 # Bob's height in meters.
length = len(sys.modules) # Number of modules imported by program.
area = height * length # What's that supposed to mean???
Any human should immediately see the mistake in the above line, provided they know the 'human type' of the variables height and length, even though to the computer it looks like a perfectly legal multiplication of an int and a float.
There's more that can be said about possible solutions to this problem, but enforcing 'computer types' is apparently a half-solution, so, at least in my opinion, it's worse than no solution at all. It's the same reason why Systems Hungarian is a terrible idea while Apps Hungarian is a great one. There's more in the very informative post by Joel Spolsky.
Now if somebody were to implement some kind of Pythonic third-party library that would automatically assign real-world data its human type and then take care to transform that type (like width * height -> area) and enforce those checks with function annotations, I think that would be type checking people could really use!
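As a toy sketch of that idea (all names hypothetical, no real library implied), a value could carry its 'human type' as a unit tag and combine tags under arithmetic, so that nonsense like the height * length example becomes visible:
class Quantity(object):
    """A number tagged with a 'human type' (unit) string."""
    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __add__(self, other):
        if self.unit != other.unit:
            raise TypeError("cannot add %s to %s" % (self.unit, other.unit))
        return Quantity(self.value + other.value, self.unit)

    def __mul__(self, other):
        # Multiplication combines units, so mistakes show up in the result.
        return Quantity(self.value * other.value, self.unit + "*" + other.unit)

    def __repr__(self):
        return "%s %s" % (self.value, self.unit)

height = Quantity(1.75, "meters")
length = Quantity(42, "modules")
print(height * length)   # 73.5 meters*modules -- obviously not an area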
As mentioned in that PEP, static type checking is one of the possible applications that function annotations can be used for, but they're leaving it up to third-party libraries to decide how to do it. That is, there isn't going to be an official implementation in core python.
As far as third-party implementations are concerned, there are some snippets (such as http://code.activestate.com/recipes/572161/), which seem to do the job pretty well.
EDIT:
As a note, I want to mention that checking behavior is preferable to checking type, therefore I think static typechecking is not so great an idea. My answer above is aimed at answering the question, not because I would do typechecking myself in such a way.
"Static typing" in Python can only be implemented so that the type checking is done in run-time, which means it slows down the application. Therefore you don't want that as a generality. Instead you want some of your methods to check it's inputs. This can be easily done with plain asserts, or with decorators if you (mistakenly) think you need it a lot.
There is also an alternative to static type checking, and that is to use an aspect oriented component architecture like The Zope Component Architecture. Instead of checking the type, you adapt it. So instead of:
assert isinstance(theobject, myclass)
you do this:
theobject = IMyClass(theobject)
If theobject already implements IMyClass nothing happens. If it doesn't, an adapter that wraps whatever theobject is to IMyClass will be looked up, and used instead of theobject. If no adapter is found, you get an error.
This combines the dynamism of Python with the desire to have a specific type in a specific way.
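Outside Zope, the same adapt-instead-of-check idea can be sketched in plain Python; all names here are hypothetical:
class StringAdapter(object):
    """Adapts a plain string to the interface we need."""
    def __init__(self, s):
        self._s = s

    def required_method(self):
        return self._s.upper()

def as_my_interface(obj):
    # Return obj if it already provides the interface, else adapt it.
    if hasattr(obj, "required_method"):
        return obj
    if isinstance(obj, str):
        return StringAdapter(obj)
    raise TypeError("no adapter available for %r" % type(obj).__name__)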
This is not an answer to the question directly, but I found out about a Python fork that adds static typing: mypy-lang.org. Of course one can't rely on it, as it's still a small endeavor, but it's interesting.
Sure, static typing seems a bit "unpythonic" and I don't use it all the time. But there are cases (e.g. nested classes, as in domain specific language parsing) where it can really speed up your development.
In those cases I prefer using beartype, explained in this post*. It comes with a git repo, tests, and an explanation of what it can and what it can't do ... and I like the name ;)
* Please don't pay attention to Cecil's rant about why Python doesn't come with batteries included in this case.
