The methods used for debugging rely on some instance attributes, so in the original version they are written as private methods in each class, and the code is repeated:
class Network1(NetworkTemplate):
    def __init__(self, model_config):
        self.parameter = model_config

    def __debug_tool(self):
        # repeated code
        return self.parameter

class Network2(NetworkTemplate):
    def __init__(self, model_config):
        self.parameter = model_config

    def __debug_tool(self):
        # repeated code
        return self.parameter
I am wondering how I could reuse the repeated code.
I thought of moving __debug_tool into a separate module and importing it, but it depends on instance attributes such as self.parameter. I don't know if I should manually pass six parameters to the imported function, because it looks ugly.
Writing it in NetworkTemplate as a public method seems the right thing to do, but I am not sure whether it will impact normal performance (i.e. with the debug tools turned off, or commented out, in Network1).
Writing __debug_tool as a method of the NetworkTemplate class will provide it to the derived Network1 and Network2 classes through inheritance and will not affect performance.
When doing so, you must not use a name beginning with two underscores: such a name is deliberately mangled (to prevent it from being overridden accidentally by a derived class), so you would not be able to access it from Network1 and Network2.
So pick a name with one or zero leading underscores (_debug_tool or debug_tool).
See for example What is the meaning of single and double underscore before an object name? for more details.
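As a minimal sketch of that suggestion (the class and attribute names are taken from the question; the body of the debug method is a placeholder):

class NetworkTemplate:
    def _debug_tool(self):
        # single leading underscore: not mangled, so it is inherited
        # and callable from the subclasses
        return self.parameter

class Network1(NetworkTemplate):
    def __init__(self, model_config):
        self.parameter = model_config

class Network2(NetworkTemplate):
    def __init__(self, model_config):
        self.parameter = model_config

net = Network1("config")
print(net._debug_tool())  # -> config

A method that is merely defined but never called adds no per-call overhead, so leaving _debug_tool in the base class costs nothing in normal operation.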
Related
I would like to have a function in my class which I am going to use only inside methods of this class; I will not call it outside the implementations of these methods. In C++, I would use a method declared in the private section of the class. What is the best way to implement such a function in Python?
I am thinking of using the @staticmethod decorator for this case. Can I use a function without any decorators and without the self argument?
Python doesn't have the concept of private methods or attributes; it's all about how you implement your class. But you can use pseudo-private variables (name mangling): any name prefixed with __ (two underscores) becomes a pseudo-private variable.
From the documentation:
Since there is a valid use-case for class-private members (namely to
avoid name clashes of names with names defined by subclasses), there
is limited support for such a mechanism, called name mangling. Any
identifier of the form __spam (at least two leading underscores, at
most one trailing underscore) is textually replaced with
_classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard
to the syntactic position of the identifier, as long as it occurs
within the definition of a class.
class A:
    def __private(self):
        pass
So __private now actually becomes _A__private.
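A quick demonstration of the mangling (a hypothetical interpreter session):

>>> a = A()
>>> a.__private()
Traceback (most recent call last):
  ...
AttributeError: 'A' object has no attribute '__private'
>>> a._A__private()  # the mangled name still works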
Example of a static method:
>>> class A:
...     @staticmethod  # Not required in Python 3.x
...     def __private():
...         print('hello')
...
>>> A._A__private()
hello
Python doesn't have the concept of 'private' the way many other languages do. It is built on the consenting adult principle that says that users of your code will use it responsibly. By convention, attributes starting with a single or double leading underscore will be treated as part of the internal implementation, but they are not actually hidden from users. Double underscore will cause name mangling of the attribute name though.
Also, note that self is only special by convention, not by any feature of the language. When an instance method is called on an instance, the instance is implicitly passed as the first argument, but within the method definition that argument can technically be given any name you want; self is just the convention that keeps code easy to understand. As a result, not including self in the signature of a method has no special effect: the implicit instance argument is simply bound to whatever the first parameter is named.
This is of course different for class methods, which receive the class itself as an implicit first argument, and static methods, which receive no implicit arguments at all.
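A short demonstration of that point (the class and names below are invented for illustration):

class Greeter:
    def greet(this):           # any parameter name works; `self` is only a convention
        print("hello from", this)

    @classmethod
    def make(cls):             # class methods implicitly receive the class
        return cls()

    @staticmethod
    def note():                # static methods receive no implicit argument
        print("no implicit argument here")

g = Greeter.make()
g.greet()
Greeter.note()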
Python just doesn't do private. If you like you can follow convention and precede the name with a single underscore, but it's up to other coders to respect that in a gentlemanly† fashion
† or gentlewomanly
There is plenty of great stuff here with obfuscation using leading underscores. Personally, I benefit greatly from the language design decision to make everything public as it reduces the time it takes to understand and use new modules.
However, if you're determined to implement private attributes/methods and you're willing to be unpythonic, you could do something along the lines of:
from pprint import pprint

# CamelCase because it 'acts' like a class
def SneakyCounter():
    class SneakyCounterInternal(object):
        def __init__(self):
            self.counter = 0

        def add_two(self):
            self.increment()
            self.increment()

        def increment(self):
            self.counter += 1

        def reset(self):
            print('count prior to reset: {}'.format(self.counter))
            self.counter = 0

    sneaky_counter = SneakyCounterInternal()

    class SneakyCounterExternal(object):
        def add_two(self):
            sneaky_counter.add_two()

        def reset(self):
            sneaky_counter.reset()

    return SneakyCounterExternal()

# counter attribute is not accessible from out here
sneaky_counter = SneakyCounter()
sneaky_counter.add_two()
sneaky_counter.add_two()
sneaky_counter.reset()

# `increment` and `counter` not exposed (AFAIK)
pprint(dir(sneaky_counter))
It is hard to imagine a case where you'd want to do this, but it is possible.
You just don't do it:
The Pythonic way is to not document those methods/members with docstrings, only with "real" code comments, and the convention is to prefix them with a single or a double underscore;
Then you can use a double underscore in front of your member so that it is made local to the class (it's mostly name mangling, i.e. the real name of the member outside of the class becomes instance._ClassName__member). It's useful for avoiding conflicts when using inheritance, or for creating a "private space" between children of a class.
As far as I can tell, it is possible to "hide" variables using metaclasses, but that violates the whole philosophy of Python, so I won't go into details about that.
I want to use a metaclass to implement a factory which makes processors for data coming in from different sources. Following is the skeleton code:
class ProcessorFactory:
    def __call__(self, classname, supers, classdict):
        ...

    def __new__(self, classname, supers, classdict):
        ...

    def __int__(self):
        ...

class MQ_AddOn(object):
    # MQ-specific code
    ...

class File_AddOn(object):
    # Filesystem-specific code
    ...

class Web_AddOn(object):
    # Web-specific code
    ...

class MQ_Processor(MQ_AddOn, metaclass=ProcessorFactory()):
    # code common to all channels (MQ, Filesystem, Web)
    ...

class File_Processor(File_AddOn, metaclass=ProcessorFactory()):
    # code common to all channels (MQ, Filesystem, Web)
    ...

class Web_Processor(Web_AddOn, metaclass=ProcessorFactory()):
    # code common to all channels (MQ, Filesystem, Web)
    ...
My question is whether there is a way, similar to macro expansion in assembly, to factor out the code common to all channels (MQ, Filesystem, Web) so that it doesn't have to be copied into each of those classes.
Sorry - I think you have to expand somewhat more on your pseudocode for a meaningful answer.
The way to avoid copying code around is just to use the normal inheritance mechanisms in Python: if you have code common to all those classes, put that code in a common base class, which might be a mixin (i.e. there is no need for it to be the "only" base class), and you are set.
If the common code has to call methods for data acquisition or processing that are specific to each of the subclasses, just write fine-grained methods to perform that, and place calls to those in the common method.
Also, there is no need in this process for either a "metaclass" or "inline macro expansion" - just use plain methods and finer-grained methods.
class Base:
    def process(self):
        # preparing code
        ...
        # data gathering
        data = self.fetch_data()
        # common data pre-processing code:
        ...
        # specific data processing code:
        post_data = self.refine_data(data)
        # more common code
        ...
        # specific output:
        self.output(post_data)

    def fetch_data(self):
        pass

    def refine_data(self, data):
        pass

    def output(self, data):
        pass
Then, in the subclasses, you implement those methods addressing the specific channel peculiarities. There is no big secret there.
Even if you need a lot more things to be done, like steps that are to be called from more than one "leaf" class, you are better off having your class feature a "pipeline" data member, where the steps are registered and called in order - still, no need to (and no sense in) involving metaclasses.
In your example, the "Base" class could hold all the "common code" you mention and be used with multiple inheritance:
class MQ_Processor(MQ_AddOn, Base):
    ...
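For instance, a concrete subclass might fill in the channel-specific steps like this (the helper method and its body are invented placeholders, not actual MQ code):

class MQ_AddOn:
    # hypothetical channel-specific helper
    def read_from_queue(self):
        return "raw message"

class MQ_Processor(MQ_AddOn, Base):
    def fetch_data(self):
        return self.read_from_queue()   # channel-specific acquisition

    def refine_data(self, data):
        return data.upper()             # channel-specific processing

    def output(self, post_data):
        print("MQ result:", post_data)

MQ_Processor().process()  # -> MQ result: RAW MESSAGE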
(When you write __call__ and __new__ methods in what would be a metaclass, one might think "OK, these are metaclass-related things that possibly make sense" - but __int__ in a metaclass makes no sense at all: it would provide a way to map a class (not an instance, not some data content) to an integer. Even if you are putting these classes in a list and they need indexes to be located and cross-referenced, you should just add a custom ".index" attribute to the class, not write __int__ on the metaclass.)
I have a question about what I see as a potential bad habit when using inheritance in Python.
Suppose I have a base class:
class FourLeggedAnimal():
    def __init__(self, name):
        self.name = name
        self.number_of_legs = 4
and two daughter classes:
class Cat(FourLeggedAnimal):
    def __init__(self, name):
        super().__init__(name)

    def claw_the_furniture(self):
        for leg in range(self.number_of_legs):
            print("scratch")

class Dog(FourLeggedAnimal):
    def __init__(self, name):
        super().__init__(name)

    def run_in_sleep(self):
        for leg in range(self.number_of_legs):
            self.move_leg(leg)

    def move_leg(self, i):
        pass
For the purposes of this example, I intend to keep FourLeggedAnimal in a different file than Cat and Dog. For someone reading the code of the Cat or Dog class, the number_of_legs attribute is used but not defined in the file. My understanding is that it is best not to have variables whose definitions are opaque (which is why it's best to avoid from x import *).
I see the alternative as repeating the definition of self.number_of_legs in both daughter classes, but that defeats the purpose of inheritance.
Is there a best-practice to deal with this kind of situation?
Is there a best-practice to deal with this kind of situation?
Normally, class variables are used for this purpose.
class FourLeggedAnimal():
    number_of_legs = 4  # class variable

    def __init__(self, name):
        self.name = name

class Cat(FourLeggedAnimal):
    def __init__(self, name):
        super().__init__(name)

    def claw_the_furniture(self):
        for leg in range(self.number_of_legs):
            print("scratch")

class Dog(FourLeggedAnimal):
    def __init__(self, name):
        super().__init__(name)

    def run_in_sleep(self):
        for leg in range(self.number_of_legs):
            self.move_leg(leg)

    def move_leg(self, i):
        pass
Note that even if these classes are in different files, the attribute is part of the parent's public API and is knowable by the subclasses. Also, the class name, "FourLeggedAnimal" does a great job of communicating what the number of legs would be.
My understanding is that it is best not to have variables whose definitions are opaque (which is why it's best to avoid from x import *).
I think perhaps you are misunderstanding the source of this advice. It may even be a mix of different pieces of advice. I'll try to explain what I think might have been the underlying ideas people were trying to convey.
Firstly, it's pretty widely agreed that from x import * is best avoided in Python. This is because it makes it hard for readers to find out where a name comes from, or indeed whether it's defined at all. It also confuses some code analysis tools. It's the only way that a (non-builtin) name will normally get into a top-level namespace without appearing in the source code and being easy to search for. As far as this advice goes, it applies only to that case. You could barely write Python code at all if you couldn't use fields and methods on objects, and you generally have a clear breadcrumb trail to follow (more so if you're using type annotations).
However, you may also be thinking of the principle of encapsulation. In object-oriented programming it's considered preferable to separate the interface from the implementation of your objects. You make the interface as small, simple and clear as you can, and hide the implementation away from the code using the objects. In this way you can reason about and change the implementation in isolation, confident that doing so won't affect other code. This principle applies even between base classes and subclasses: the subclass shouldn't "know" anything about the base class that it doesn't need to. Now, modifying variables, and to a lesser extent reading modifiable variables, requires knowing an awful lot about what expectations the base class has for their values, their relationship with other state, and when it's possible/permissible for them to change. Depending on them can make it much harder to safely change the base class.
Now, Python does have more flexibility than some other languages in this respect. In Python you can seamlessly replace a variable with a property, and thus turn "reading" and "setting" a field into methods that you can implement however you want. In other languages, once a subclass starts using a field exposed by a base class, it is impossible to refactor the base class to remove the field or add extra behaviour when it is accessed, unless you also update all the subclasses. So it's a bit less of a concern here. Or rather, there's no particular reason to treat fields differently from methods.
With all this in mind, the question becomes: what interface is your base class presenting to its subclasses? Does it support them setting as well as reading this field? Can you reduce the size and complexity of the interface between the two classes without making your code more complex? An interface is simpler and easier to reason about if it is read-only, and more so if it does not involve mutable state at all. Where possible the base class should not give the subclass any unnecessary opportunities to break its invariants (i.e. its expectations about its own state). In Python these things are more often achieved through convention (e.g. fields and methods beginning with an underscore are considered not to be part of the public interface unless documented otherwise) and documentation than through language features.
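To illustrate the point about properties (a minimal sketch with invented names): a base class can later replace a plain attribute with a property without changing the interface its subclasses read.

class Animal:
    def __init__(self):
        self._legs = 4  # leading underscore: internal implementation detail

    @property
    def number_of_legs(self):
        # subclasses keep reading `self.number_of_legs`, but the base
        # class now controls how the value is produced
        return self._legs

class Cat(Animal):
    def claw_the_furniture(self):
        for _ in range(self.number_of_legs):
            print("scratch")

Cat().claw_the_furniture()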
I just can't see why we need to use @staticmethod. Let's start with an example.
class test1:
    def __init__(self, value):
        self.value = value

    @staticmethod
    def static_add_one(value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

a = test1(3)
print(a.new_val)  # >>> 4

class test2:
    def __init__(self, value):
        self.value = value

    def static_add_one(self, value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

b = test2(3)
print(b.new_val)  # >>> 4
In the example above, the method static_add_one in the two classes does not require the instance of the class (self) in its calculation.
The method static_add_one in the class test1 is decorated with @staticmethod and works properly.
But at the same time, the method static_add_one in the class test2, which has no @staticmethod decoration, also works properly, by the trick of accepting a self argument and simply not using it.
So what is the benefit of using @staticmethod? Does it improve performance? Or is it just due to the Zen of Python, which states that "explicit is better than implicit"?
The reason to use staticmethod is if you have something that could be written as a standalone function (not part of any class), but you want to keep it within the class because it's somehow semantically related to the class. (For instance, it could be a function that doesn't require any information from the class, but whose behavior is specific to the class, so that subclasses might want to override it.) In many cases, it could make just as much sense to write something as a standalone function instead of a staticmethod.
Your example isn't really the same. A key difference is that, even though you don't use self, you still need an instance to call static_add_one --- you can't call it directly on the class with test2.static_add_one(1). So there is a genuine difference in behavior there. The most serious "rival" to a staticmethod isn't a regular method that ignores self, but a standalone function.
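Concretely, with the classes from the question (assuming Python 3 semantics):

# without @staticmethod (test2), the method is reachable only via an instance:
b = test2(3)
b.static_add_one(1)         # -> 2
# test2.static_add_one(1)   # TypeError: missing 1 required positional argument

# with @staticmethod (test1), both call styles work:
test1.static_add_one(1)     # -> 2
test1(3).static_add_one(1)  # -> 2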
Today I suddenly found a benefit of using @staticmethod.
If you create a staticmethod within a class, you don't need to create an instance of the class before using it.
For example,
class File1:
    def __init__(self, path):
        out = self.parse(path)

    def parse(self, path):
        # ..parsing works..
        return x

class File2:
    def __init__(self, path):
        out = self.parse(path)

    @staticmethod
    def parse(path):
        # ..parsing works..
        return x

if __name__ == '__main__':
    path = 'abc.txt'
    File1.parse(path)  # TypeError: parse() missing 1 required positional argument: 'path'
    File2.parse(path)  # Goal!!!!!!!!!!!!!!!!!!!!
Since the method parse is strongly related to the classes File1 and File2, it is more natural to put it inside the class. However, sometimes this parse method may also be useful in other classes. If you want to use it from File1, you must create an instance of File1 before calling parse. With the staticmethod in the class File2, you may call the method directly with the syntax File2.parse.
This makes your work more convenient and natural.
I will add something other answers didn't mention. It's not only a matter of modularity, of putting something next to other logically related parts. It's also that the method could be non-static at another point of the hierarchy (i.e. in a subclass or superclass) and thus participate in polymorphism (type-based dispatching). So if you put that function outside the class, you will prevent subclasses from effectively overriding it. Now, say you realize you don't need self in function C.f of class C; you have three options (a sketch follows this list):
1. Put it outside the class. But we just decided against this.
2. Do nothing new: while unused, still keep the self parameter.
3. Declare that you are not using the self parameter, while still letting other C methods call f as self.f, which is required if you wish to keep open the possibility of further overrides of f that do depend on some instance state.
Option 2 demands less conceptual baggage (you already have to know about self and methods-as-bound-functions, because that's the more general case). But you may still prefer to be explicit about self not being used (and the interpreter could even reward you with some optimization, not having to partially apply a function to self). In that case, you pick option 3 and add @staticmethod on top of your function.
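A sketch of option 3 and the override scenario it keeps open (all names are invented for illustration):

class C:
    @staticmethod
    def f():
        # no instance state needed here (option 3)
        return "default"

    def report(self):
        # other methods still call it as self.f, so a subclass
        # override participates in normal dispatch
        print(self.f())

class D(C):
    def f(self):
        # an override that *does* use instance state
        return "custom for " + type(self).__name__

C().report()  # default
D().report()  # custom for D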
Use @staticmethod for methods that don't need to operate on a specific object, but that you still want located in the scope of the class (as opposed to module scope).
Your example in test2.static_add_one wastes its time passing an unused self parameter, but otherwise works the same as test1.static_add_one. Note that this extraneous parameter can't be optimized away.
One example I can think of is in a Django project I have, where a model class represents a database table, and an object of that class represents a record. There are some functions used by the class that are stand-alone and do not need an object to operate on, for example a function that converts a title into a "slug", which is a representation of the title that follows the character set limits imposed by URL syntax. The function that converts a title to a slug is declared as a staticmethod precisely to strongly associate it with the class that uses it.
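A sketch of that pattern (this is illustrative, not the actual project code; the hand-rolled make_slug below stands in for a real slugify helper):

import re

class Article:
    def __init__(self, title):
        self.title = title
        self.slug = self.make_slug(title)

    @staticmethod
    def make_slug(title):
        # standalone logic, kept in the class because it is semantically
        # tied to how Article titles become URL fragments
        slug = re.sub(r'[^a-z0-9]+', '-', title.lower())
        return slug.strip('-')

print(Article.make_slug("Hello, World!"))  # hello-world
print(Article("My First Post").slug)       # my-first-post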
Sometimes self can denote the instance of the class and sometimes the class itself. So why don't we use inst and klass instead of self? Wouldn't that make things easier?
How things are now
class A:
    @classmethod
    def do(self):  # self refers to the class
        ...

class B:
    def do(self):  # self refers to an instance of the class
        ...
How I think they should be
class A:
    @classmethod
    def do(klass):  # no ambiguity
        ...

class B:
    def do(inst):  # no ambiguity
        ...
So how come we don't program like this, when the Zen of Python states that explicit is better than implicit? Is there something that I am missing?
Class method support was added to Python much later, and the convention to use self for instances had already been established. Keeping that convention stable has more value than switching to a longer name like instance.
The convention for class methods is to use the name cls:
class A:
    @classmethod
    def do(cls):
        ...
In other words, the conventions are already there to distinguish between a class object and the instance; never use self for class methods.
Also see PEP 8 - Function and method arguments:
Always use self for the first argument to instance methods.
Always use cls for the first argument to class methods.
I think it would be better to use "cls":
class A:
    @classmethod
    def do(cls):  # cls refers to the class
        ...

class B:
    def do(self):  # self refers to an instance of the class
        ...
It's a requirement of PEP 8:
http://legacy.python.org/dev/peps/pep-0008/#function-and-method-arguments
I think the point is that conventionally you don't use self for methods wrapped with @classmethod. (You could write kls, cls, etc.)
There is ultimately nothing stopping you from writing inst instead of self if you so desire. So your second example would work fine and is actually the expected way to handle it (in terms of distinguishing an instance from a class). However, you should definitely use self when dealing with instances. It's a Python convention, and breaking it is strongly discouraged.
PEP8
Seeing as others have mentioned it, it's true that PEP 8 does say to use self and cls for instance and class methods, respectively. The only thing I'd add is that while there isn't any sensible reason to break this rule, changing self is significantly worse (from a semantic point of view) because of its strong use inside of 99.999% of Python code. Its use is so universal that many (if not most) beginners assume it's a keyword and are confused by the idea that one can change self to anything.
This strong relationship to code and convention is not so apparent with class methods IMO. Of course I would urge anyone to follow PEP8 as much as possible, but if you felt inclined to use kls instead of cls, I feel that you'd be committing a lesser evil than if you changed self. However, whichever name you go with should remain consistent throughout your program.