I have a specific use-case for typings which I find hard to implement.
I have the concept of "Service" classes in my Python codebase: classes with a handful of functions I want to "expose", so that only those functions are available through an API. The implementation of a Service looks like this:
    class MyService(BaseService):
        def normal_function(self):
            pass

        @exposed
        def exposed_function(self):
            pass
What happens behind the @exposed decorator is that it adds a unique property to the wrapped method, which lets whoever uses it know which functions are "exposed" and which are not.
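For illustration, a minimal sketch of what such a decorator might look like. The marker attribute name `_is_exposed` and the helper `exposed_names` are assumptions made up for this example, not part of the original code:

```python
import inspect


def exposed(func):
    """Mark a function as exposed by attaching a marker attribute."""
    func._is_exposed = True  # hypothetical marker name
    return func


class BaseService:
    @classmethod
    def exposed_names(cls):
        """Return the sorted names of all methods marked with @exposed."""
        return sorted(
            name
            for name, member in inspect.getmembers(cls, inspect.isfunction)
            if getattr(member, "_is_exposed", False)
        )


class MyService(BaseService):
    def normal_function(self):
        pass

    @exposed
    def exposed_function(self):
        pass


print(MyService.exposed_names())  # ['exposed_function']
```

This only covers the runtime side; a static type checker does not see the marker attribute.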
I wish to make a type smart enough to understand that only the "exposed" functions are available.
Any ideas?
You could use name mangling:
    class MyService:
        def __normal_function(self):
            pass

        def exposed_function(self):
            pass

    my_service = MyService()
    my_service.exposed_function()  # this works, user can use the exposed function
    my_service.__normal_function()  # AttributeError: 'MyService' object has no attribute '__normal_function'
    my_service._MyService__normal_function()  # normal_function can only be called using its "mangled" name
In this case, the name of the normal function - __normal_function - will be textually replaced with _MyService__normal_function, so that the user will not be able to call the function using its "original" name.
Note that the normal function can still be called outside of the class, since private variables and methods don't exist in Python, but name mangling is probably the closest you can get to implementing private-like behavior.
I am a Java programmer that is new to Python. I am having trouble understanding the syntax of the following code from the pymodbus repo in GitHub. Where is the function defined?
self.execute(request)
The reason I am confused is that AFAIK self refers to variables and functions of the current class, including inherited ones. No such function is defined in the ModBusClientMixIn class, nor does the class inherit from any other class. So where is it coming from?
There is an execute function defined in the ReadCoilsRequest class, but to invoke that why would you need self? Also, where is context(a variable in the execute function argument list) coming from?
Would really appreciate if someone can help me understand the syntax.
It's a mixin which is used on classes which do define an execute method, e.g.:
    class ModbusClientProtocol(protocol.Protocol, ModbusClientMixin):
        ...
A mixin adds methods to other classes and is not supposed to be used by itself.
If you wanted to type-annotate it properly, it would have to be something like:
    from abc import ABC, abstractmethod

    class Executable(ABC):
        @abstractmethod
        def execute(self):
            pass

    class ModBusClientMixin:
        def read_coils(self: Executable, address, count=1, **kwargs):
            #          ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
            # Expects self to conform to the Executable interface,
            # i.e. to be used in a class that implements execute().
            self.execute()
Since Python heavily relies on duck-typing and type annotations are a relatively recent addition, they're often omitted and replaced by verbose documentation, or it is expected that developers recognise the purpose of mixins, or that it's such an internal implementation detail that it hasn't been explicitly documented.
This is a special case. You are right, that execute has to be defined somewhere.
But in this case, execute is implemented by a child class that derives from ModBusClientMixIn.
You would get an error if you created an instance of ModBusClientMixIn directly and called one of these methods on it, because it does not implement execute.
Look at the implementations of ModbusClientProtocol or BaseModbusClient for example, they both have an execute method.
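To make the lookup concrete, here is a small self-contained sketch. `ReadCoilsMixin` and `FakeClient` are hypothetical stand-ins, not pymodbus classes; they only mimic the pattern of a mixin calling a method that the concrete class supplies:

```python
class ReadCoilsMixin:
    """Hypothetical mixin: relies on the host class to provide execute()."""

    def read_coils(self, address, count=1):
        request = ("read_coils", address, count)
        # self.execute is looked up on the concrete object at call time.
        return self.execute(request)


class FakeClient(ReadCoilsMixin):
    """Hypothetical host class that supplies the missing execute()."""

    def execute(self, request):
        return f"executed {request}"


client = FakeClient()
print(client.read_coils(1, count=2))

# Used on its own, the mixin fails only when the method is actually called.
try:
    ReadCoilsMixin().read_coils(1)
except AttributeError as exc:
    print("mixin alone fails:", exc)
```

Coming from Java, it may help to think of the mixin as an abstract class whose abstract method is simply not declared.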
I am struggling to understand when it makes sense to use an instance method versus a static method. Also, I don't know if my functions are static, since there is no @staticmethod decorator. Would I be able to access the class functions when I make a call to one of the methods?
I am working on a web scraper that sends information to a database. It's set up to run once a week. The structure of my code looks like this:
    # import libraries...

    class Get:
        def build_url(url_paramater1, url_parameter2, request_date):
            return url_with_parameters

        def web_data(request_date, url_parameter1, url_parameter2):  # no use of self
            # using parameters pull the variables to look up in the database
            for a in db_info:
                url = build_url(a, url_parameter2, request_date)
                x = requests.Session().get(url, proxies).json()
                # save data to the database
            return None

    # same type of function for pulling the web data from the database and parsing it

    if __name__ == '__main__':
        Get.web_data(request_date, url_parameter1, url_parameter2)
        Parse.web_data(get_date, parameter)  # to illustrate the second part of the scraper
That is the basic structure. The code is functional but I don’t know if I am using the methods (functions?) correctly and potentially missing out on ways to use my code in the future. I may even be writing bad code that will cause errors down the line that are impossibly hard to debug only because I didn’t follow best practices.
After reading about when class and instance methods are used, I cannot see why I would use them. If I want the url built or the data pulled from the website, I call the build_url or web_data function. I don't need an instance of the function to keep track of anything separate. I cannot imagine when I would need to keep something separate either, which I think is part of the problem.
The reason I think my question is different than the previous questions is: the conceptual examples to explain the differences don't seem to help me when I am sitting down and writing code. I have not run into real world problems that are solved with the different methods that show when I should even use an instance method, yet instance methods seem to be mandatory when looking at conceptual examples of code.
Thank you!
Classes can be used to represent objects, and also to group functions under a common namespace.
When a class represents an object, like a cat, anything that this object 'can do', logically, should be an instance method, such as meowing.
But when you have a group of static functions that are all related to each other or are usually used together to achieve a common goal, like build_url and web_data, you can make your code clearer and more organized by putting them under a static class, which provides a common namespace, like you did.
Therefore, in my opinion, the structure you chose is legitimate. It is worth considering, though, that static classes are more common in more strictly OOP languages like Java, while in Python it is more common to use modules for namespace separation.
This code doesn't need to be a class at all. It should just be a pair of functions. You can't see why you would need an instance method because you don't have a reason to instantiate the object in the first place.
The functions you have written in your code look like instance methods, but they are written incorrectly.
An instance method must have self as its first parameter, i.e.
    def build_url(self, url_paramater1, url_parameter2, request_date):
Then you call it like this:
    get_inst = Get()
    get_inst.build_url(url_paramater1, url_parameter2, request_date)
The self parameter is supplied by Python and allows you to access all properties and functions, static or not, of your Get class.
If you don't need to access other functions or properties of your class, add the @staticmethod decorator and remove the self parameter:
    @staticmethod
    def build_url(url_paramater1, url_parameter2, request_date):
Then you can call it directly on the class:
    Get.build_url(url_paramater1, url_parameter2, request_date)
or call it from an instance of the class:
    get_inst = Get()
    get_inst.build_url(url_paramater1, url_parameter2, request_date)
But what is the problem with your current code, you might ask?
Try calling it from an instance like this and you will see the problem:
    get_inst = Get()
    get_inst.build_url(url_paramater1, url_parameter2, request_date)
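A small runnable sketch of that problem, using a simplified stand-in for the question's Get class (the body of build_url is made up for the example):

```python
class Get:
    # No self and no @staticmethod, as in the question.
    def build_url(url_parameter1, url_parameter2, request_date):
        return f"{url_parameter1}/{url_parameter2}/{request_date}"


# Called on the class, it behaves like a plain function.
print(Get.build_url("a", "b", "2020-01-01"))

# Called on an instance, Python passes the instance as an extra first
# argument, so the call receives four arguments instead of three.
try:
    Get().build_url("a", "b", "2020-01-01")
except TypeError as exc:
    print("TypeError:", exc)
```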
Example where creating an instance is useful:
Let's say you want to make a chat client.
You could write code like this
    class Chat:
        @staticmethod
        def send(server_url, message):
            connection = connect(server_url)
            connection.write(message)
            connection.close()

        @staticmethod
        def read(server_url):
            connection = connect(server_url)
            message = connection.read()
            connection.close()
            return message
But a much cleaner and better way to do it:
    class Chat:
        def __init__(self, server_url):
            # Initialize connection only once when instance is created
            self.connection = connect(server_url)

        def __del__(self):
            # Close connection only once when instance is deleted
            self.connection.close()

        def send(self, message):
            self.connection.write(message)

        def read(self):
            return self.connection.read()
To use that last class you do
    # Create new instance and pass server_url as argument
    chat = Chat("http://example.com/chat")
    chat.send("Hello")
    chat.read()
    # Deleting the last reference to chat causes __del__ to be called and the connection to be closed
    del chat
From the given example, there is no need for the Get class at all, since you are using it only as an additional namespace. You do not have any 'state' that you want to preserve, in either the class or a class instance.
A better option seems to be a separate module with these functions defined in it. That way, when you import the module, you get the namespace you want.
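As a rough sketch of the module approach: the module name, `BASE_URL`, and the query parameter names below are all placeholders invented for the example, since the question does not show how the URL is really built.

```python
# scraper_urls.py (hypothetical module name)
from urllib.parse import urlencode

BASE_URL = "https://example.com/api"  # placeholder, not from the question


def build_url(url_parameter1, url_parameter2, request_date):
    """Build the request URL from its parameters; no class or self needed."""
    query = urlencode(
        {"p1": url_parameter1, "p2": url_parameter2, "date": request_date}
    )
    return f"{BASE_URL}?{query}"


print(build_url("a", "b", "2020-01-01"))
```

Callers would then write `import scraper_urls` and call `scraper_urls.build_url(...)`, which gives the same grouping the Get class provided.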
My questions concern instance variables that are initialized in methods outside the class constructor. This is for Python.
I'll first state what I understand:
1. Classes may define a constructor, and they may also define other methods.
2. Instance variables are generally defined/initialized within the constructor.
3. But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
An example of (2) and (3) -- see self.meow and self.roar in the Cat class below:
    class Cat():
        def __init__(self):
            self.meow = "Meow!"

        def meow_bigger(self):
            self.roar = "Roar!"
My questions:
1. Why is it best practice to initialize the instance variable within the constructor?
2. What general/specific mess could arise if instance variables are regularly initialized in methods other than the constructor? (E.g. Having read Mark Lutz's Tkinter guide in his Programming Python, which I thought was excellent, I noticed that the instance variable used to hold the PhotoImage objects/references were initialized in the further methods, not in the constructor. It seemed to work without issue there, but could that practice cause issues in the long run?)
3. In what scenarios would it be better to initialize instance variables in the other methods, rather than in the constructor?
To my knowledge, instance variables exist not when the class object is created, but after the class object is instantiated. Proceeding upon my code above, I demonstrate this:
    >>> c = Cat()
    >>> c.meow
    'Meow!'
    >>> c.roar
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'Cat' object has no attribute 'roar'
    >>> c.meow_bigger()
    >>> c.roar
    'Roar!'
As it were:
I cannot access the instance variable (c.roar) at first.
However, after I have called the instance method c.meow_bigger() once, I am suddenly able to access the instance variable c.roar.
Why is the above behaviour so?
Thank you for helping out with my understanding.
Why is it best practice to initialize the instance variable within the constructor?
Clarity.
Because it makes it easy to see at a glance all of the attributes of the class. If you initialize the variables in multiple methods, it becomes difficult to understand the complete data structure without reading every line of code.
Initializing within __init__ also makes documentation easier. With your example, you can't write "an instance of Cat has a roar attribute". Instead, you have to add a paragraph explaining that an instance of Cat might have a "roar" attribute, but only after calling the "meow_bigger" method.
Clarity is king. One of the smartest programmers I ever met once told me "show me your data structures, and I can tell you how your code works without seeing any of your code". While that's a tiny bit hyperbolic, there's definitely a ring of truth to it. One of the biggest hurdles to learning a code base is understanding the data that it manipulates.
What general/specific mess could arise if instance variables are regularly initialized in methods other than the constructor?
The most obvious one is that an object may not have an attribute available during all parts of the program, leading to having to add a lot of extra code to handle the case where the attribute is undefined.
In what scenarios would it be better to initialize instance variables in the other methods, rather than in the constructor?
I don't think there are any.
Note: you don't necessarily have to initialize an attribute with its final value. In your case it's acceptable to initialize roar to None. The mere fact that it has been initialized to something shows that it's a piece of data that the class maintains. It's fine if the value changes later.
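Applied to the Cat class from the question, the note above could look like this (a minimal sketch, nothing more):

```python
class Cat:
    def __init__(self):
        self.meow = "Meow!"
        self.roar = None  # declared up front; filled in by meow_bigger()

    def meow_bigger(self):
        self.roar = "Roar!"


c = Cat()
print(c.roar)  # None, rather than an AttributeError
c.meow_bigger()
print(c.roar)  # Roar!
```

Every attribute is now visible in one place, and code that reads c.roar before meow_bigger() has run gets None instead of an exception.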
Remember that the members of an instance in "pure" Python are just entries in a dictionary (the instance's __dict__). Members aren't added to that dictionary until you run the function in which they are assigned. Ideally this is the constructor, because that guarantees that your members will all exist regardless of the order in which your functions are called.
I believe your example above could be translated to:
    class Cat():
        def __init__(self):
            self.__dict__['meow'] = "Meow!"

        def meow_bigger(self):
            self.__dict__['roar'] = "Roar!"

    >>> c = Cat()        # c.__dict__ = { 'meow': "Meow!" }
    >>> c.meow_bigger()  # c.__dict__ = { 'meow': "Meow!", 'roar': "Roar!" }
Initializing instance variables within the constructor is, as you already pointed out, only a recommendation in Python.
First of all, defining all instance variables within the constructor is a good way to document a class. Everybody who sees the code knows what kind of internal state an instance has.
Secondly, order matters. If one defines an instance variable V in a function A and there is another function B that also accesses V, it is important to call A before B. Otherwise B will fail, since V was never defined. Maybe A really does have to be invoked before B, but then that should be ensured by internal state, which would itself be an instance variable.
There are many more examples. Generally it is just a good idea to define everything in the __init__ method, and set it to None if it cannot or should not be initialized at construction time.
Of course, one could use hasattr to derive some information about the state. But one could equally check whether some instance variable V is None, which conveys the same thing.
So in my opinion, it is never a good idea to define an instance variable anywhere other than in the constructor.
Your examples state some basic properties of Python. An object in Python is basically just a dictionary.
Let's use a dictionary: one can add functions and values to that dictionary and construct some kind of OOP. Using the class statement just brings everything into a clean syntax and provides extras like magic methods.
In other languages, all information about instance variables and functions is present before the object is initialized. Python does that at runtime. You can also add new methods to any object outside the class definition; see "Adding a Method to an Existing Object Instance".
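As a quick illustration of adding a method to an existing instance at runtime, here is a sketch using the standard library's types.MethodType (the purr function is invented for the example):

```python
import types


class Cat:
    def __init__(self):
        self.meow = "Meow!"


def purr(self):
    return self.meow + " (purring)"


c = Cat()
# Bind purr to this one instance; other Cat instances are unaffected.
c.purr = types.MethodType(purr, c)

print(c.purr())
print(hasattr(Cat(), "purr"))  # False
```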
3.) But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
I'd recommend providing a default state in initialization, just so it's clear what the class should expect. In statically typed languages you'd have to do this, and it's good practice in Python.
Let's convey this by replacing the variable roar with a more meaningful variable like has_roared.
In this case, your meow_bigger() method now has a reason to set has_roared. You'd initialize it to False in __init__, as the cat has not roared yet upon instantiation.
    class Cat():
        def __init__(self):
            self.meow = "Meow!"
            self.has_roared = False

        def meow_bigger(self):
            print(self.meow + "!!!")
            self.has_roared = True
Now do you see why it often makes sense to initialize attributes with default values?
All that being said, why does Python not enforce that we HAVE to define our variables in the __init__ method? Well, because it is a dynamic language, we can do things like this.
    >>> cat1 = Cat()
    >>> cat2 = Cat()
    >>> cat1.name = "steve"
    >>> cat2.name = "sarah"
    >>> print(cat1.name)
    steve
The name attribute was not defined in the __init__ method, but we're able to add it anyway. This is a more realistic use case of setting variables that aren't defaulted in __init__.
I will try to provide a case where you would do so, for:
3.) But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
I agree that it is clearer and more organized to declare instance fields in the constructor, but sometimes you inherit from a class written by someone else that has many instance fields and APIs.
If you inherit it only for certain APIs and want your own instance field for your own APIs, it is easier to declare the extra field in a method than to override the other class's constructor, which would require digging into its source code. This also supports Adam Hughes's answer: you will always have your own field defined, because you guarantee that your own API is called first.
For instance, suppose you inherit from a package's handler class for web development and want to add a new instance field called user to the handler. You would probably declare it directly in the initialize method rather than overriding the constructor; in my experience this is the more common approach.
    class BlogHandler(webapp2.RequestHandler):
        def initialize(self, *a, **kw):
            webapp2.RequestHandler.initialize(self, *a, **kw)
            uid = self.read_cookie('user_id')  # get user_id by reading the browser cookie
            self.user = User.by_id(int(uid))  # query the database for the user and return it
These are very open questions.
Python is a very "free" language in the sense that it tries to never restrict you from doing anything, even if it looks silly. This is why you can do completely useless things such as replacing a class with a boolean (Yes you can).
The behaviour that you mention follows that same logic: if you wish to add an attribute to an object (or to a function - yes you can, too) dynamically, anywhere, not necessarily in the constructor, well... you can.
But just because you can does not mean you should. The main reason for initializing attributes in the constructor is readability, which is a prerequisite for maintenance. As Bryan Oakley explains in his answer, class fields are key to understanding the code, as their names and types often reveal the intent better than the methods do.
That being said, there is now a way to separate attribute definition from constructor initialization: pyfields. I wrote this library to be able to define the "contract" of a class in terms of attributes, while not requiring initialization in the constructor. This allows you in particular to create "mix-in classes" where attributes and methods relying on these attributes are defined, but no constructor is provided.
See this other answer for an example and details.
I think that, to keep things simple and understandable, it is better to initialize the instance variables in the class constructor, so they can be used directly without first having to call a specific method.
    class Cat():
        def __init__(self, Meow, Roar):
            self.meow = Meow
            self.roar = Roar

        def meow_bigger(self):
            return self.roar

        def mix(self):
            return self.meow + self.roar

    c = Cat("Meow!", "Roar!")
    print(c.meow_bigger())
    print(c.mix())
Output
    Roar!
    Meow!Roar!
I've been reading lots of previous SO discussions of factory functions, etc. and still don't know what the best (Pythonic) approach is to this particular situation. I'll admit up front that I am imposing a somewhat artificial constraint on the problem in that I want my solution to work without modifying the module I am trying to extend: I could make modifications to it, but let's assume that it must remain as-is because I'm trying to understand best practice in this situation.
I'm working with the http://pypi.python.org/pypi/icalendar module, which handles parsing from and serializing to the Icalendar spec (hereafter ical). It parses the text into a hierarchy of dictionary-like "component" objects, where every "component" is an instance of a trivial derived class implementing the different valid ical types (VCALENDAR, VEVENT, etc.) and they are all spit out by a recursive factory from the common parent class:
    class Component(...):
        @classmethod
        def from_ical(cls, ...):
I have created a 'CalendarFile' class that extends the ical 'Calendar' class and adds a generator function of its own:
    class CalendarFile(Calendar):
        @classmethod
        def from_file(cls, ics):
which opens a file (ics) and passes it on:
            instance = cls.from_ical(f.read())
It initializes and modifies some other things in instance and then returns it. The problem is that instance ends up being a Calendar object instead of a CalendarFile object, in spite of cls being CalendarFile. Short of going into the factory function of the ical module and fiddling around in there, is there any way to essentially "recast" that object as a 'CalendarFile'?
The alternatives (again without modifying the original module) that I have considered are:
1. make the CalendarFile class a has-a Calendar class (each instance creates its own internal instance of a Calendar object), but that seems methodically stilted.
2. fiddle with the returned object to give it the methods it needs (I know there's a term for creating a customized object but it escapes me).
3. make the additional methods into functions and just have them work with instances of Calendar.
4. or perhaps the answer is that I shouldn't be trying to subclass from a module in the first place, and this type of code belongs in the module itself.
Again i'm trying to understand what the "best" approach is and also learn if i'm missing any alternatives. Thanks.
Normally, I would expect an alternative constructor defined as a classmethod to simply call the class's standard constructor, transforming the arguments that it receives into valid arguments to the standard constructor.
    >>> class Toy(object):
    ...     def __init__(self, x):
    ...         self.x = abs(x)
    ...     def __repr__(self):
    ...         return 'Toy({})'.format(self.x)
    ...     @classmethod
    ...     def from_string(cls, s):
    ...         return cls(int(s))
    ...
    >>> Toy.from_string('5')
    Toy(5)
In most cases, I would strongly recommend something like this approach; this is the gold standard for alternative constructors.
But this is a special case.
I've now looked over the source, and I think the best way to add a new class is to edit the module directly; otherwise, scrap inheritance and take option one (your "has-a" option). The different classes are all slightly differentiated versions of the same container class; they shouldn't really even be separate classes. But if you want to add a new class in the idiom of the code as it is written, you have to add a new class to the module itself. Furthermore, from_iter is deceptively named; it's not really a constructor at all. I think it should be a standalone function. It builds a whole tree of components linked together, and the code that builds the individual components is buried in a chain of calls to various factory functions that also should be standalone functions but aren't. IMO much of that code ought to live in __init__, where it would be useful to you for subclassing, but it doesn't.
Indeed, none of the subclasses of Component even add any methods. By adding methods to your subclass of Calendar, you're completely disregarding the actual idiom of the code. I don't like its idiom very much but by disregarding that idiom, you're making it even worse. If you don't want to modify the original module, then forget about inheritance here and give your object a has-a relationship to Calendar objects. Don't modify __class__; establish your own OO structure that follows standard OO practices.
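A rough sketch of what the has-a option might look like. The Calendar class below is a stand-in written for this example so that it runs on its own; it is not the real icalendar API, and from_text and the __getattr__ delegation are my own assumptions about how one might structure the wrapper:

```python
class Calendar(dict):
    """Stand-in for the library class; NOT the real icalendar API."""

    @classmethod
    def from_ical(cls, text):
        # Mimics the behaviour described in the question: always builds a
        # Calendar, regardless of which subclass the method was called on.
        cal = Calendar()
        cal["raw"] = text
        return cal


class CalendarFile:
    """Wraps a Calendar instead of inheriting from it."""

    def __init__(self, calendar, path=None):
        self.calendar = calendar
        self.path = path

    @classmethod
    def from_text(cls, text, path=None):
        return cls(Calendar.from_ical(text), path)

    def __getattr__(self, name):
        # Only called when normal lookup fails; forward to the wrapped Calendar.
        if name == "calendar":  # avoid recursion if the wrapper is half-built
            raise AttributeError(name)
        return getattr(self.calendar, name)


cf = CalendarFile.from_text("BEGIN:VCALENDAR", path="demo.ics")
print(type(cf).__name__, cf.get("raw"))
```

The wrapper is always a CalendarFile no matter what the factory returns, and its extra methods live outside the library's class hierarchy.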
I'm working on a web application that will return a variable set of modules depending on user input. Each module is a Python class with a constructor that accepts a single parameter and has an '.html' property that contains the output.
Pulling the class dynamically from the global namespace works:
    result = globals()[classname](param).html
And it's certainly more succinct than:
    if classname == 'Foo':
        result = Foo(param).html
    elif classname == 'Bar':
        ...
What is considered the best way to write this, stylistically? Are there risks or reasons not to use the global namespace?
A flaw with this approach is that it may give the user the ability to do more than you want them to. They can call any single-parameter function in that namespace just by providing the name. You can help guard against this with a few checks (e.g. issubclass(theClass, SomeBaseClass)), but it's probably better to avoid this approach. Another disadvantage is that it constrains your class placement. If you end up with dozens of such classes and decide to group them into modules, your lookup code will stop working.
You have several alternative options:
Create an explicit mapping:
    class_lookup = {'Class1': Class1, ... }
    ...
    result = class_lookup[className](param).html
though this has the disadvantage that you have to re-list all the classes.
Nest the classes in an enclosing scope. E.g. define them within their own module, or within an outer class:
    class Namespace(object):
        class Class1(object):
            ...
        class Class2(object):
            ...
    ...
    result = getattr(Namespace, className)(param).html
You do inadvertently expose a couple of additional class attributes here though (__bases__, __getattribute__ etc.): probably not exploitable, but not perfect.
Construct a lookup dict from the subclass tree. Make all your classes inherit from a single base class. When all classes have been created, walk the base class's subclasses and populate a dict from them. This has the advantage that you can define your classes anywhere (e.g. in separate modules), and so long as you create the registry after all are created, you will find them.
    def register_subclasses(base):
        d = {}
        for cls in base.__subclasses__():
            d[cls.__name__] = cls
            d.update(register_subclasses(cls))
        return d

    class_lookup = register_subclasses(MyBaseClass)
A more advanced variation on the above is to use self-registering classes: create a metaclass that automatically registers any created classes in a dict. This is probably overkill for this case, but it's useful in some "user-plugins" scenarios.
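For reference, a minimal sketch of the self-registering variation in Python 3 syntax. The names RegistryMeta and Module, and the bodies of Foo and Bar, are invented for the example:

```python
class RegistryMeta(type):
    """Metaclass that records every subclass it creates, keyed by name."""

    registry = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if bases:  # skip the root base class itself
            RegistryMeta.registry[name] = cls


class Module(metaclass=RegistryMeta):
    def __init__(self, param):
        self.param = param


class Foo(Module):
    @property
    def html(self):
        return f"<p>Foo: {self.param}</p>"


class Bar(Module):
    @property
    def html(self):
        return f"<p>Bar: {self.param}</p>"


print(sorted(RegistryMeta.registry))  # ['Bar', 'Foo']
print(RegistryMeta.registry["Foo"]("x").html)
```

Unlike the subclass-tree walk, the registry here is filled as each class statement runs, so there is no separate registration step to remember.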
First of all, it sounds like you may be reinventing the wheel a little bit... most Python web frameworks (CherryPy/TurboGears is what I know) already include a way to dispatch requests to specific classes based on the contents of the URL, or the user input.
There is nothing wrong with the way that you do it, really, but in my experience it tends to indicate some kind of "missing abstraction" in your program. You're basically relying on the Python interpreter to store a list of the objects you might need, rather than storing it yourself.
So, as a first step, you might want to just make a dictionary of all the classes that you might want to call:
    dispatch = {'Foo': Foo, 'Bar': Bar, 'Bizbaz': Bizbaz}
Initially, this won't make much of a difference. But as your web app grows, you may find several advantages: (a) you won't run into namespace clashes, (b) using globals() you may have security issues where an attacker can, in essence, access any global symbol in your program if they can find a way to inject an arbitrary classname into your program, (c) if you ever want to have classname be something other than the actual exact classname, using your own dictionary will be more flexible, (d) you can replace the dispatch dictionary with a more-flexible user-defined class that does database access or something like that if you find the need.
The security issues are particularly salient for a web app. Doing globals()[variable] where variable is input from a web form is just asking for trouble.
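A small sketch of how the dictionary approach rejects unexpected input. The Foo and Bar classes and the render helper are placeholders written for this example:

```python
class Foo:
    def __init__(self, param):
        self.html = f"<p>Foo: {param}</p>"


class Bar:
    def __init__(self, param):
        self.html = f"<p>Bar: {param}</p>"


# Only names listed here can ever be instantiated from user input.
DISPATCH = {"Foo": Foo, "Bar": Bar}


def render(classname, param):
    """Look up classname in the whitelist and return the module's HTML."""
    try:
        cls = DISPATCH[classname]
    except KeyError:
        raise ValueError(f"unknown module: {classname!r}") from None
    return cls(param).html


print(render("Foo", "x"))

try:
    render("os", "x")  # a name an attacker might try
except ValueError as exc:
    print("rejected:", exc)
```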
Another way to build the map between class names and classes:
When defining classes, add an attribute to any class that you want to put in the lookup table, e.g.:
    class Foo:
        lookup = True
        def __init__(self, params):
            ...  # and so on
Once this is done, building the lookup map is:
    class_lookup = {name: obj for name, obj in list(globals().items()) if hasattr(obj, "lookup")}