Fellow pythonistas,
We adopted Python type hints in a new project.
The project itself is composed of couple of modules that don't share a common repository, but they do have interdependencies. Since it is a pure algorithmic research project, we decided to use decopuled components that program through an interface, so we have an abstract class that all those that want to substitute current module for another one, have to comply with this interface.
Say,
class ClusteringInterface(ABC):
def __init__(self, n_groups: int, data_generator: Generator[Dict, None, None]):
self.n_groups = n_groups
self.data_generator = data_generator
...
#abstractmethod
def foo(self):
pass
Now, I do want to accept anything that complies with ClusteringInterface, so it implements foo methods and complies with the init function. Can I somehow define a custom type so that mypy will indeed throw an error when something pretends to be ClusteringInterface, but is really something completely different?
I have seen Raymond Hettinger lecture about Abstract classes and I understand that python itself cannot guarantee that.
So the more general question would be: What is the recommended solution so that I actually can enforce compliance with an interface? I realize that the question is a little bit too broad, but I don't really know where to look for more information.
Related
This question already has answers here:
What's the pythonic way to use getters and setters?
(8 answers)
Closed 2 months ago.
Using get/set seems to be a common practice in Java (for various reasons), but I hardly see Python code that uses this.
Why do you use or avoid get/set methods in Python?
In python, you can just access the attribute directly because it is public:
class MyClass:
def __init__(self):
self.my_attribute = 0
my_object = MyClass()
my_object.my_attribute = 1 # etc.
If you want to do something on access or mutation of the attribute, you can use properties:
class MyClass:
def __init__(self):
self._my_attribute = 0
#property
def my_attribute(self):
# Do something if you want
return self._my_attribute
#my_attribute.setter
def my_attribute(self, value):
# Do something if you want
self._my_attribute = value
Crucially, the client code remains the same.
Cool link: Python is not Java :)
In Java, you have to use getters and setters because using public fields gives you no opportunity to go back and change your mind later to using getters and setters. So in Java, you might as well get the chore out of the way up front. In Python, this is silly, because you can start with a normal attribute and change your mind at any time, without affecting any clients of the class. So, don't write getters and setters.
Here is what Guido van Rossum says about that in Masterminds of Programming
What do you mean by "fighting the language"?
Guido: That usually means that they're
trying to continue their habits that
worked well with a different language.
[...] People will turn everything into
a class, and turn every access into an
accessor method,
where that is really not a wise thing to do in Python;
you'll have more verbose code that is
harder to debug and runs a lot slower.
You know the expression "You can write
FORTRAN in any language?" You can write Java in any language, too.
No, it's unpythonic. The generally accepted way is to use normal data attribute and replace the ones that need more complex get/set logic with properties.
The short answer to your question is no, you should use properties when needed. Ryan Tamyoko provides the long answer in his article Getters/Setters/Fuxors
The basic value to take away from all this is that you want to strive to make sure every single line of code has some value or meaning to the programmer. Programming languages are for humans, not machines. If you have code that looks like it doesn’t do anything useful, is hard to read, or seems tedious, then chances are good that Python has some language feature that will let you remove it.
Your observation is correct. This is not a normal style of Python programming. Attributes are all public, so you just access (get, set, delete) them as you would with attributes of any object that has them (not just classes or instances). It's easy to tell when Java programmers learn Python because their Python code looks like Java using Python syntax!
I definitely agree with all previous posters, especially #Maximiliano's link to Phillip's famous article and #Max's suggestion that anything more complex than the standard way of setting (and getting) class and instance attributes is to use Properties (or Descriptors to generalize even more) to customize the getting and setting of attributes! (This includes being able to add your own customized versions of private, protected, friend, or whatever policy you want if you desire something other than public.)
As an interesting demo, in Core Python Programming (chapter 13, section 13.16), I came up with an example of using descriptors to store attributes to disk instead of in memory!! Yes, it's an odd form of persistent storage, but it does show you an example of what is possible!
Here's another related post that you may find useful as well:
Python: multiple properties, one setter/getter
I had come here for that answer(unfortunately i couldn't) . But i found a work around else where . This below code could be alternative for get .
class get_var_lis:
def __init__(self):
pass
def __call__(self):
return [2,3,4]
def __iter__(self):
return iter([2,3,4])
some_other_var = get_var_lis
This is just a workaround . By using the above concept u could easily build get/set methodology in py too.
Our teacher showed one example on class explaining when we should use accessor functions.
class Woman(Human):
def getAge(self):
if self.age > 30:
return super().getAge() - 10
else:
return super().getAge()
I struggle to understand the use of interfaces in C#.
This is likely because I come from Python and it isn't used there.
I don't understand other explanations as they don't fully answer my questions about interfaces. e.g. The purpose of interfaces continued
From what I understand, Interfaces tell a class what it can do, not how to do it. This means at some point the class has to be told how to do the methods.
If this is the case, what's the point in interfaces? Why not just have the definition of the method in the class?
The only benefit I see is to keep it clear what classes can/can't do, but at the cost of not DRY code?
I know that Python doesn't need interfaces and I think this limits my understanding, but can't quite figure it out why.
It is used in python, in the form (usually) of abstract classes.
The purpose varies, in java they solve multiple inheritance, in python they act as a contract between 2 classes.
Class A does something and a part of that something involves class B. Class B can be implemented in several ways, so instead of making 10 different classes and HOPE they will be used properly, you make them inherit from an abstract class (an interface) and you ensure they MUST implement all the methods defined as abstract. (NOTE IF THEY DON'T IMPLEMENT ANY OF THE METHODS IT WILL CRASH AT BUILD TIME, WHEN YOU INSTALL THE PACKAGE, NOT WHEN YOU RUN IT WHICH IS VERY IMPORTANT IN MIDDLE/LARGE-SIZE PROJECTS).
You also know that ANY class that implements those methods will work with the class that uses it. This sounds trivial, but the business side LOVES it because it means you can outsource part of your code and it will connect and work with the rest of your code.
From what I understand, Interfaces tell a class what it can do, not how to do it. This means at some point the class has to be told how to do the methods.
They actually tell it what it MUST do, as to how they do it we don't care.
If this is the case, what's the point in interfaces? Why not just have the definition of the method in the class?
That is not the point but yes, absolutely you need the definition of the method in the class that inherits from the interface.
Lets give you a more concrete example.
Imagine you have a python framework that runs some tasks. The tasks can run locally (in the same computer where the python framework is running), they can run on a distributed system, by submitting them to some central scheduler, they can run in a docker container, on Amazon Web services .... you get the idea.
All you need is an interface (an abstract class in python) that has a run_task method, as to which one you use is up to you.
e.g:
class Runner:
__metaclass__ = abc.ABCMeta
#abstractmethod
def run_task(self, task_command):
return
class LocalRunner(Runner):
def run_task(self, task_command):
subprocess.call(task_command)
class SlurmRunner(Runner):
def run_task(self, task_command):
subprocess.call('sbatch ' + task_command)
Now thge important bit, as you may be asking Why the ^$^$% do I need all these
complications? (probably you don't if yopur project is small enough, but there is a breakpoint depending on size where you almost HAVE to start using these things).
The class that uses the runner ONLY needs to understand the interface, e.g. you have a Task class, that class can delegate the execution of the task to the TaskRunner, as to which one is implemented you DON'T care, they are in a sense polymorphic.
class Task:
def __init__(self, task_runner):
self.task_runner = task_runner
self.task_command = 'ls'
def run_this_task(self):
self.task_runner.run_task(self.task_command)
And, if you are some programmer your boss can tell you, I need a new class that executes commands on AWS, you apss it a command and it implements a task_runner method , then you DONT need to know ANYTHING about the rest of the code, you can implement this bit as a completely isolated piece (this is the outsourcing part, now you can have a 100 people designign 100 different p[ieces, they don't need to know anything about the code, just the interfaces).
class Cat:
def meow(self):
print('meow')
def feed(cat):
cat.moew() # he thanks the owner
tom = Cat('Tom')
feed(tom)
C Sharp has the static type system. The compiler needs to know which methods the class has. That's why we must set type for every variable:
def feed(cat: Cat):
cat.moew() # he thanks the owner
But what if we have to write code and don't know what exactly type the variable must have?
def feed(it):
it.thank_owner()
Moreover we have to suppose our function will be used for various classes. Don't forget we must let the compiler know type of each variable! What to do? The solution:
class Pet: # an interface
def thank_owner(self):
raise NotImplementedError()
def feed(it: Pet):
it.thank_owner()
But what to do with Cat? The solution:
class Cat(Pet): # inherits the interface Pet
def thank_owner(self):
print('meow') # or self.meow() if we want to avoid big changes and follow DRY rule at the same time
tom = Cat('Tom')
feed(tom)
By the way now we can add new pets easy. We don't have to rewrite our code.
class Dog(Pet):
def thank_owner(self):
print('woof')
beethoven = Dog('Beethoven')
feed(beethoven) # yes, I use the same function. I haven't changed it at all!
Pay attention we created this class later than feed() and Pet. It's important we didn't think about Dog when writing code before. We were not curious about this. However we didn't have problems when we needed to extend the code.
This is a question about which of these methods would be considered as the most Pythonic. I'm not looking for personal opinions, but, instead, what is idiomatic. My background is not in Python, so this will help me.
I'm working on a Python 3 project which is extensible. The idea is similar to the factory pattern, except it is based on functions.
Essentially, users will be able to create a custom function (across packages and projects) which my tool can locate and dynamically invoke. It will also be able to use currying to pass arguments down (but that code is not included here)
My goal is for this to follow good-Pythonic practice. I'm torn between two strategies. And, since Python is not my expertise, I would like to know the pros/cons of the following practices:
Use a decorator
registered = {}
def factoried(name):
def __inner_factory_function(fn):
registered[name] = fn
return fn
return __inner_factory_function
def get_function(name):
return registered[name]
Then, the following function is automatically registered...
#factoried('foo')
def build_foo():
print('hi')
This seems reasonable, but does appear slightly magical to those who are not familiar with decorators.
Force sub-classing of an abstract class and use __subclasses__()
If subclasses are used, there's no need for registration. However, I feel like this forces classes to be defined when a full class may be unnecessary. Also, the use of .__subclasses__() under the hood could seem magical to consumers as well. However, even Java can be used to search for classes with annotations.
Explicit registration
Forget all of the above and force explicit registration. No decorators. No subclasses. Just something like this:
def build_foo():
# ...
factory.register('foo', build_foo)
There is no answer to this question.
The only standard practices promoted by the Python Foundation are PEP 8.
PEP 8 has very little related to higher-level "design-pattern" questions like this, and, in particular, nothing related to your specific question.
And, even if it did, PEP 8 is explicitly only a guideline for "code comprising the standard library in the main Python distribution", and Guido has rejected suggestions to make it some kind of wide-ranging standard that should be enforced on every Python project.
Plus, it hammers home the point that it's only a guideline, not a rigid recommendation.
Of course there are subjective reasons to prefer one design over another.
Ideally, these subjective reasons will usually be driven by some community consensus on what's "idiomatic" or "pythonic". But that community consensus isn't written down anywhere as some objective source you can cite.
There may be arguments that appeal to The Zen of Python, but that itself is just Tim Peters' attempts to distill Guido's own subjective guidelines into a collection of pithy sound bites, not an objective source. (And anyone who takes a brief look at, e.g., the python-ideas list can see that both sides of almost any question can appeal to the Zen…)
I recently went to an interview and did some little programming test where they had a simple two class script that would calculate the area, side length, etc.. Of a shape it looked something like this:
class Shape(object):
def __init__(self, side1, side2, side3, side4):
self.side1 = side1
...
#abstractmethod
def calc_area(self):
pass
class Triangle(Shape):
def calc_area(self):
... etc..
One of the questions they asked me was, if we ran help(Shape(3, 5, 5, 6)) what would happen assuming all objects have been initialized? my answer was, nothing because there's no docstring or __doc__. It appears that I was marked down on this answer and I can't seem to understand why? Was I incorrect in thinking that nothing would happen when running help()?
There are a couple of things in your description of the code that could be what they were expecting you to mention. It's not completely obvious to me which of these is an error of your memory of the code, and which is the real issue, but they are both things to think about:
The Shape class you show has an #abstractmethod in it. If it had a properly defined metaclass (something derived from abc.ABCMeta), you wouldn't be able to create any instances from it. Only concrete subclasses that override each of the abstract methods could be instantiated. So they may have expected you to say "you can't call help() on Shape(...) because the latter causes a TypeError.
Calling help on an instance will give you a description of each of the class's methods, even if it doesn't have a docstring. If a docstring exists, it will be displayed first, but the documentation for the methods (and descriptors, etc.) will always follow.
If you look at the source of pydoc.Helper class (especially its help() method), which is essentially what you get when you call help(), you'll see that it will still get the basic structure of your object using the inspect module and a couple of other tricks.
The goal being, of course, to provide at least object/function signature to other developers if the original code writers didn't bother with writing the docs. Personally, I don't see the reason of help() existing in the global namespace anyway, but I understand why they've placed it both from a historical and philosophical perspective.
That being said, whoever interviewed you was, IMO, a jerk - this doesn't really have to do much with programming but with nitpicking which is, may I say, quite un-Pythonic. That is, assuming that they weren't trying to get you on the #abstractmethod descriptor, of course, which might be a valid question.
Note: I'm not talking about preventing the rebinding of a variable. I'm talking about preventing the modification of the memory that the variable refers to, and of any memory that can be reached from there by following the nested containers.
I have a large data structure, and I want to expose it to other modules, on a read-only basis. The only way to do that in Python is to deep-copy the particular pieces I'd like to expose - prohibitively expensive in my case.
I am sure this is a very common problem, and it seems like a constant reference would be the perfect solution. But I must be missing something. Perhaps constant references are hard to implement in Python. Perhaps they don't quite do what I think they do.
Any insights would be appreciated.
While the answers are helpful, I haven't seen a single reason why const would be either hard to implement or unworkable in Python. I guess "un-Pythonic" would also count as a valid reason, but is it really? Python does do scrambling of private instance variables (starting with __) to avoid accidental bugs, and const doesn't seem to be that different in spirit.
EDIT: I just offered a very modest bounty. I am looking for a bit more detail about why Python ended up without const. I suspect the reason is that it's really hard to implement to work perfectly; I would like to understand why it's so hard.
It's the same as with private methods: as consenting adults authors of code should agree on an interface without need of force. Because really really enforcing the contract is hard, and doing it the half-assed way leads to hackish code in abundance.
Use get-only descriptors, and state clearly in your documentation that these data is meant to be read only. After all, a determined coder could probably find a way to use your code in different ways you thought of anyways.
In PEP 351, Barry Warsaw proposed a protocol for "freezing" any mutable data structure, analogous to the way that frozenset makes an immutable set. Frozen data structures would be hashable and so capable being used as keys in dictionaries.
The proposal was discussed on python-dev, with Raymond Hettinger's criticism the most detailed.
It's not quite what you're after, but it's the closest I can find, and should give you some idea of the thinking of the Python developers on this subject.
There are many design questions about any language, the answer to most of which is "just because". It's pretty clear that constants like this would go against the ideology of Python.
You can make a read-only class attribute, though, using descriptors. It's not trivial, but it's not very hard. The way it works is that you can make properties (things that look like attributes but call a method on access) using the property decorator; if you make a getter but not a setter property then you will get a read-only attribute. The reason for the metaclass programming is that since __init__ receives a fully-formed instance of the class, you actually can't set the attributes to what you want at this stage! Instead, you have to set them on creation of the class, which means you need a metaclass.
Code from this recipe:
# simple read only attributes with meta-class programming
# method factory for an attribute get method
def getmethod(attrname):
def _getmethod(self):
return self.__readonly__[attrname]
return _getmethod
class metaClass(type):
def __new__(cls,classname,bases,classdict):
readonly = classdict.get('__readonly__',{})
for name,default in readonly.items():
classdict[name] = property(getmethod(name))
return type.__new__(cls,classname,bases,classdict)
class ROClass(object):
__metaclass__ = metaClass
__readonly__ = {'a':1,'b':'text'}
if __name__ == '__main__':
def test1():
t = ROClass()
print t.a
print t.b
def test2():
t = ROClass()
t.a = 2
test1()
While one programmer writing code is a consenting adult, two programmers working on the same code seldom are consenting adults. More so if they do not value the beauty of the code but them deadlines or research funds.
For such adults there is some type safety, provided by Enthought's Traits.
You could look into Constant and ReadOnly traits.
For some additional thoughts, there is a similar question posed about Java here:
Why is there no Constant feature in Java?
When asking why Python has decided against constant references, I think it's helpful to think of how they would be implemented in the language. Should Python have some sort of special declaration, const, to create variable references that can't be changed? Why not allow variables to be declared a float/int/whatever then...these would surely help prevent programming bugs as well. While we're at it, adding class and method modifiers like protected/private/public/etc. would help enforce compile-type checking against illegal uses of these classes. ...pretty soon, we've lost the beauty, simplicity, and elegance that is Python, and we're writing code in some sort of bastard child of C++/Java.
Python also currently passes everything by reference. This would be some sort of special pass-by-reference-but-flag-it-to-prevent-modification...a pretty special case (and as the Tao of Python indicates, just "un-Pythonic").
As mentioned before, without actually changing the language, this type of behaviour can be implemented via classes & descriptors. It may not prevent modification from a determined hacker, but we are consenting adults. Python didn't necessarily decide against providing this as an included module ("batteries included") - there was just never enough demand for it.