There are many questions related to the use of the Singleton pattern in python, and although this question might repeat many of the aspects already discussed, I have not found the answer to the following specific question.
Let's assume I have a class MyClass which I want to instantiate only exactly once. In python I can do this as follows in the code myclass.py:
class MyClass(object):
def foo(self):
....
instance = MyClass()
Then in any other program I can refer to the instance simply with
import myclass
myclass.instance.foo()
Under what circumstances is this approach enough? Under what circumstances is the use of a Singleton pattern useful/mandatory?
The singleton pattern is more often a matter of convenience than of requirement. Python is a little bit different than other languages in that it is fairly easy to mock out singletons in testing (just clobber the global variable!) by comparison to other languages, but it is neverthess a good idea to ask yourself when creating a singleton: am I doing this for the sake of convenience or because it is stricly necessary that there is only one instance? Is it possible that there may be more than one in the future?
If you create a class that really will be only constructed once, it may make more sense to make the state a part of the module, and to make its methods into module-level functions. If there is a possibility that the assumption of exactly one instance may change in the future, then it is often much better to pass the singleton instance around rather than referencing the singleton through a global name.
For example, you can just as easily implement a "singleton" this way:
if __name__ == '__main__':
instance = MyClass()
doSomethingWith(instance)
In the above, "instance" is singleton by virtue of the fact that it is constructed only once, but the code that handles it is provided the instance rather than referencing module.instance, which makes it easier to reuse pieces of the code if, in some future situation, you need more than one MyClass.
Assuming you want to use a module as a singleton as Michael Aaron Safyan suggests, you can make it work even if the module isn't imported by the main code by doing something like the following (in the main code or a module it does import direct or indirectly). What it does is make aninstanceclass attribute initialized it to one, and then replaces the module object insys.moduleswith the instance created:
class _MyClass(object):
def foo(self):
print 'foo()'
_MyClass.instance = _MyClass()
import sys
_ref = sys.modules[__name__] # Reference to current module so it's not deleted
sys.modules[__name__] = _MyClass.instance
I've found singletons a useful way to implement "registers" of things when it makes sense to have only one (registry) -- such as a group of classes for a class factory, a group of constants, or a bundle of configuration information. In many cases just a regular Python module will do fine because, by default, they're effectively already singletons due to fact that those already loaded get cached in the sys.modulesdictionary.
Occasionally however, class instances are preferable because they can be passed construction parameters and have properties -- something built-in module objects don't and can't be made to possess. Limitations like that can be worked-around using the trick shown above which effectively turns custom class instances into module objects.
The idea of using class instances as module objects is from Alex Martelli's ActiveState recipe named Constants in Python.
In my humble opinion, there are two sides to the singleton pattern.
you want a single context for a given service because more than one does not make sense.
you want to absolutely prevent people from creating two object of a given type because it might break your service
While the first case may have some applications (logging service), the second one is often the sign of a bad design.
You should design your API so that your users should not have to think about this problem. But if they dig through your undocumented layers to find your hidden constructor and want to use it for whatever reason, they should not have to deal with useless constructs created to prevent them to do what they need to do.
Related
Suppose I have a project in ~/app/, containing at least files myclass.py, myobject.py, and app.py.
In myclass.py I have something like
def myclass():
# class attributes and methods...
In myobject.py, I have something like
from app import myclass
attribute1 = 1
attribute2 = 2
myobject = myclass(attribute1, attribute2)
Finally, app.py looks something like
from app import myobject
# do stuff with myobject
In practice, I'm using myobject.py to gather a common instance of myclass and make it easily importable, so I don't have to define all the attributes separately. My question is on the convention of myobject.py. Is this okay or is there something that would be better to achieve the purpose mentioned. The concerns I thought of is that there are all these other variables (in this case attribute1 and attribute2) which are just... there... in the myobject module. It just feels a little weird because these aren't things that would ever be accessed individually, but the fact that it is accessible... I feel like there's some other conventional way to do this. Is this perfectly fine, or am I right to have concerns (if so, how to fix it)?
Edit: To make it more clear, here is an example: I have a Camera class which stores the properties of the lens and CCD and such (like in myclass.py). So users are able to define different cameras and use them in the application. However, I want to allow them to have some preset cameras, thus I define objects of the Camera class that are specific to certain cameras I know are common to use for this application (like in myobject.py). So when they run the application, they can just import these preset cameras (as Camera objects) (like in app.py). How should these preset objects be written, if how it's written in myobject.py is not the best way?
so you this method fails to call function inside the class in first case. i think you do it by making a class of attribute and getting variables from it.
class Attribute():
def __init(self,a1,a2):
self.a1=a1
self.a2=a2
att=Attribute(1,2)
print(att.a1)
It looks like you stumbled upon the singleton pattern. Essentially, your class should only ever have one instance at any time, most likely to store global configurations or some similar purpose. In Java, you'd implement this pattern by making the constructor private, and have a static method (eg. getInstance()) that returns a private static instance.
For Python, it's actually quite tricky to implement singletons. You can see some discussion about that subject here. To me how you're doing it is probably the simplest way to do it, and although it doesn't strictly enforce the singleton constraint, it's probably better for a small project than adding a ton of complexity like metaclasses to make 'true' singletons.
As Classes are first-class objects in Python, we can pass them to functions. For example, here is some code I've come across:
ChatRouter = sockjs.tornado.SockJSRouter(ChatConnection, '/chat')
where ChatConnection is a Class defined in the same module. I wonder what would be the common user case(s) for such practice?
In addition, in the code example above, why is the variable 'ChatRouter' capitalized?
Without knowing anything else about that code, I'd guess this:
OK, I looked at the source. Below the line is incorrect, although plausible. Basically what the code does is use ChatConnection to create a Session object which does some other stuff. ChatRouter is just a badly named regular variable, not a class name.
SockJSRouter is a class that takes another class (call it connection) and a string as parameters. It uses __new__ to create not an instance of SockJSRouter, but an instance of a special class that uses (possibly subclasses) connection. That would explain why ChatRouter is capitalized, as it would be a class name. The returned class would use connection to generalize a lot of things, as connection would be responsible for handling communicating over a network or whatever. So by using different connections, one could handle different protocols. ChatConnection is probably some layer over IRC.
So basically, the common use case (and likely the use here) is generalization, and the reason for the BactrianCase name is because it's a class (just one generated at runtime).
Passing classes around may be useful for customization and flexible code. The function may want to create several objects of the given class, so passing it a class is one way to implement this (another would be to pass some kind of factory function). For example, in the example you gave, SockJSRouter ends up passing the connection class to Session, which then uses it to construct a new connection object.
As for ChatRouter, I suppose this is just a naming convention. While Python programmers are advised to follow PEP 8, and many do, it's not strictly required and some projects settle on different naming conventions.
I'm getting a bit of a headache trying to figure out how to organise modules and classes together. Coming from C++, I'm used to classes encapsulating all the data and methods required to process that data. In python there are modules however and from code I have looked at, some people have a lot of loose functions stored in modules, whereas others almost always bind their functions to classes as methods.
For example say I have a data structure and would like to write it to disk.
One way would be to implement a save method for that object so that I could just type
MyObject.save(filename)
or something like that. Another method I have seen in equal proportion is to have something like
from myutils import readwrite
readwrite.save(MyObject,filename)
This is a small example, and I'm not sure how python specific this problem is at all, but my general question is what is the best pythonic practice in terms of functions vs methods organisation?
It seems like loose functions bother you. This is the python way. It makes sense because a module in python is really just an object on the same footing as any other object. It does have language level support for loading it from a file but other than that, it's just an object.
so if I have a module foo.py:
import pprint
def show(obj):
pprint(obj)
Then the when I import it from bar.py
import foo
class fubar(object):
#code
def method(self, obj):
#more stuff
foo.show(obj)
I am essentially accessing a method on the foo object. The data attributes of the foo module are just the globals that are defined in foo. A module is the language level implementation of a singleton without the need to prepend self to every methods argument list.
I try to write as many module level functions as possible. If some function will only work with an instance of a particular class, I will make it a method on the class. Otherwise, I try to make it work on instances of every class that is defined in the module for which it would make sense.
The rational behind the exact example that you gave is that if each class has a save method, then if you later change how you are saving data (from say filesystem to database or remote XML file) then you have to change every class. If each class implements an interface to yield that data that it wants saved, then you can write one function to save instances of every class and only change that function once. This is known as the Single Responsibility Principle: Each class should have only one reason to change.
If you have a regular old class you want to save to disk, I would just make it an instance method. If it were a serialization library that could handle different types of objects I would do the second way.
I'm a C++ programmer just starting to learn Python. I'd like to know how you keep track of instance variables in large Python classes. I'm used to having a .h file that gives me a neat list (complete with comments) of all the class' members. But since Python allows you to add new instance variables on the fly, how do you keep track of them all?
I'm picturing a scenario where I mistakenly add a new instance variable when I already had one - but it was 1000 lines away from where I was working. Are there standard practices for avoiding this?
Edit: It appears I created some confusion with the term "member variable." I really mean instance variable, and I've edited my question accordingly.
I would say, the standard practice to avoid this is to not write classes where you can be 1000 lines away from anything!
Seriously, that's way too much for just about any useful class, especially in a language that is as expressive as Python. Using more of what the Standard Library offers and abstracting away code into separate modules should help keeping your LOC count down.
The largest classes in the standard library have well below 100 lines!
First of all: class attributes, or instance attributes? Or both? =)
Usually you just add instance attributes in __init__, and class attributes in the class definition, often before method definitions... which should probably cover 90% of use cases.
If code adds attributes on the fly, it probably (hopefully :-) has good reasons for doing so... leveraging dynamic features, introspection, etc. Other than that, adding attributes this way is probably less common than you think.
pylint can statically detect attributes that aren't detected in __init__, along with many other potential bugs.
I'd also recommend writing unit tests and running your code often to detect these types of "whoopsie" programming mistakes.
Instance variables should be initialized in the class's __init__() method. (In general)
If that's not possible. You can use __dict__ to get a dictionary of all instance variables of an object during runtime. If you really need to track this in documentation add a list of instance variables you are using into the docstring of the class.
It sounds like you're talking about instance variables and not class variables. Note that in the following code a is a class variable and b is an instance variable.
class foo:
a = 0 #class variable
def __init__(self):
self.b = 0 #instance variable
Regarding the hypothetical where you create an unneeded instance variable because the other one was about one thousand lines away: The best solution is to not have classes that are one thousand lines long. If you can't avoid the length, then your class should have a well defined purpose and that will enable you to keep all of the complexities in your head at once.
A documentation generation system such as Epydoc can be used as a reference for what instance/class variables an object has, and if you're worried about accidentally creating new variables via typos you can use PyChecker to check your code for this.
This is a common concern I hear from many programmers who come from a C, C++, or other statically typed language where variables are pre-declared. In fact it was one of the biggest concerns we heard when we were persuading programmers at our organization to abandon C for high-level programs and use Python instead.
In theory, yes you can add instance variables to an object at any time. Yes it can happen from typos, etc. In practice, it rarely results in a bug. When it does, the bugs are generally not hard to find.
As long as your classes are not bloated (1000 lines is pretty huge!) and you have ample unit tests, you should rarely run in to a real problem. In case you do, it's easy to drop to a Python console at almost any time and inspect things as much as you wish.
It seems to me that the main issue here is that you're thinking in terms of C++ when you're working in python.
Having a 1000 line class is not a very wise thing anyway in python, (I know it happens alot in C++ though),
Learn to exploit the dynamism that python gives you, for instance you can combine lists and dictionaries in very creative ways and save your self hundreds of useless lines of code.
For example, if you're mapping strings to functions (for dispatching), you can exploit the fact that functions are first class objects and have a dictionary that goes like:
d = {'command1' : func1, 'command2': func2, 'command3' : func3}
#then somewhere else use this list to dispatch
#given a string `str`
func = d[str]
func() #call the function!
Something like this in C++ would take up sooo many lines of code!
The easiest is to use an IDE. PyDev is a plugin for eclipse.
I'm not a full on expert in all ways pythonic, but in general I define my class members right under the class definition in python, so if I add members, they're all relative.
My personal opinion is that class members should be declared in one section, for this specific reason.
Local scoped variables, otoh, should be defined closest to when they are used (except in C--which I believe still requires variables to be declared at the beginning of a method).
Consider using slots.
For example:
class Foo:
__slots__ = "a b c".split()
x = Foo()
x.a =1 # ok
x.b =1 # ok
x.c =1 # ok
x.bb = 1 # will raise "AttributeError: Foo instance has no attribute 'bb'"
It is generally a concern in any dynamic programming language -- any language that does not require variable declaration -- that a typo in a variable name will create a new variable instead of raise an exception or cause a compile-time error. Slots helps with instance variables, but doesn't help you with, module-scope variables, globals, local variables, etc. There's no silver bullet for this; it's part of the trade-off of not having to declare variables.
I have some Python code that creates a Calendar object based on parsed VEvent objects from and iCalendar file.
The calendar object just has a method that adds events as they get parsed.
Now I want to create a factory function that creates a calendar from a file object, path, or URL.
I've been using the iCalendar python module, which implements a factory function as a class method directly on the Class that it returns an instance of:
cal = icalendar.Calendar.from_string(data)
From what little I know about Java, this is a common pattern in Java code, though I seem to find more references to a factory method being on a different class than the class you actually want to instantiate instances from.
The question is, is this also considered Pythonic ? Or is it considered more pythonic to just create a module-level method as the factory function ?
[Note. Be very cautious about separating "Calendar" a collection of events, and "Event" - a single event on a calendar. In your question, it seems like there could be some confusion.]
There are many variations on the Factory design pattern.
A stand-alone convenience function (e.g., calendarMaker(data))
A separate class (e.g., CalendarParser) which builds your target class (Calendar).
A class-level method (e.g. Calendar.from_string) method.
These have different purposes. All are Pythonic, the questions are "what do you mean?" and "what's likely to change?" Meaning is everything; change is important.
Convenience functions are Pythonic. Languages like Java can't have free-floating functions; you must wrap a lonely function in a class. Python allows you to have a lonely function without the overhead of a class. A function is relevant when your constructor has no state changes or alternate strategies or any memory of previous actions.
Sometimes folks will define a class and then provide a convenience function that makes an instance of the class, sets the usual parameters for state and strategy and any other configuration, and then calls the single relevant method of the class. This gives you both the statefulness of class plus the flexibility of a stand-alone function.
The class-level method pattern is used, but it has limitations. One, it's forced to rely on class-level variables. Since these can be confusing, a complex constructor as a static method runs into problems when you need to add features (like statefulness or alternative strategies.) Be sure you're never going to expand the static method.
Two, it's more-or-less irrelevant to the rest of the class methods and attributes. This kind of from_string is just one of many alternative encodings for your Calendar objects. You might have a from_xml, from_JSON, from_YAML and on and on. None of this has the least relevance to what a Calendar IS or what it DOES. These methods are all about how a Calendar is encoded for transmission.
What you'll see in the mature Python libraries is that factories are separate from the things they create. Encoding (as strings, XML, JSON, YAML) is subject to a great deal of more-or-less random change. The essential thing, however, rarely changes.
Separate the two concerns. Keep encoding and representation as far away from state and behavior as you can.
It's pythonic not to think about esoteric difference in some pattern you read somewhere and now want to use everywhere, like the factory pattern.
Most of the time you would think of a #staticmethod as a solution it's probably better to use a module function, except when you stuff multiple classes in one module and each has a different implementation of the same interface, then it's better to use a #staticmethod
Ultimately weather you create your instances by a #staticmethod or by module function makes little difference.
I'd probably use the initializer ( __init__ ) of a class because one of the more accepted "patterns" in python is that the factory for a class is the class initialization.
IMHO a module-level method is a cleaner solution. It hides behind the Python module system that gives it a unique namespace prefix, something the "factory pattern" is commonly used for.
The factory pattern has its own strengths and weaknesses. However, choosing one way to create instances usually has little pragmatic effect on your code.
A staticmethod rarely has value, but a classmethod may be useful. It depends on what you want the class and the factory function to actually do.
A factory function in a module would always make an instance of the 'right' type (where 'right' in your case is the 'Calendar' class always, but you might also make it dependant on the contents of what it is creating the instance out of.)
Use a classmethod if you wish to make it dependant not on the data, but on the class you call it on. A classmethod is like a staticmethod in that you can call it on the class, without an instance, but it receives the class it was called on as first argument. This allows you to actually create an instance of that class, which may be a subclass of the original class. An example of a classmethod is dict.fromkeys(), which creates a dict from a list of keys and a single value (defaulting to None.) Because it's a classmethod, when you subclass dict you get the 'fromkeys' method entirely for free. Here's an example of how one could write dict.fromkeys() oneself:
class dict_with_fromkeys(dict):
#classmethod
def fromkeys(cls, keys, value=None):
self = cls()
for key in keys:
self[key] = value
return self