Heavy object initialization in Python

What is good practice in Python when I need to initialize an object that contains, for example, a big array that should be filled with calculated values during object creation? Is it OK in Python to do this inside the constructor, or is that a code smell, and should I use a factory instead? What is the Pythonic way here?

If you can initialize the array in a couple of lines of clear code then it's quite ok to initialize it directly in the __init__ method. Otherwise, initialize it in a separate method.
Python does have static methods and class methods. You could use a static method to initialize your array, but if the initializer method uses attributes of the instance in its calculation you may as well make it a normal method; otherwise you'll need to pass it those values as parameters.
Python doesn't have private methods, but it's conventional to indicate that a method is for private use of the class by giving it a name that starts with a single leading underscore, e.g. _init_my_array. If a user of your class wants to call that method they can, but they know that doing so may cause the class to misbehave, and that it's generally not a good idea to call such "private" methods unless you know what you're doing.
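To make that concrete, here is a minimal sketch of the "separate method" approach. The class name Grid, the size parameter and the multiplication-table contents are made up for illustration; only the _init_my_array name comes from the answer above.

```python
class Grid:
    """Hypothetical class that holds a large precomputed table."""

    def __init__(self, size):
        self.size = size
        # Delegate the heavy work to a "private" helper so __init__ stays short.
        self.table = self._init_my_array()

    def _init_my_array(self):
        # Reads self.size, so a normal (instance) method is the natural fit;
        # a static method would need size passed in as a parameter.
        return [[row * col for col in range(self.size)]
                for row in range(self.size)]


grid = Grid(4)
```

If the calculation needed nothing from the instance, the same helper could be a static method or a plain module-level function that takes size as an argument.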

Related

What is "static method" in python

I'm quite new to Python and can't understand what a static method is in Python (for example __new__()) and what it does. Can anyone explain it? Thanks a million.
Have you already read this?
https://en.wikipedia.org/wiki/Method_(computer_programming)
Especially this?
https://en.wikipedia.org/wiki/Method_(computer_programming)#Static_methods
Explanation
In OOP you define classes that you later on instantiate. A class is nothing more than a blueprint: Once you instantiate objects from a class your object will follow exactly the blueprint of your class. That means: If you define a field named "abc" in your class you will later on have a field "abc" in your object. If you define a method "foo()" in your class, you will later on have a method "foo()" to be invoked on your object.
Please note that this "on your object" is essential: You always instantiate a class and then you can invoke the method. This is the "normal" way.
A static method is different. While a normal method always requires an instance (on which you then invoke the method), a static method does not. A static method exists independently of your instances (that's why it is named "static"). It is associated with the class definition itself, so it is always available and is normally invoked on the class itself. (Python also lets you call it through an instance, but no instance is passed to it.) It is completely independent of all instances.
That's a static method.
Python's implementation is a bit ... well ... simple. In the details there are deviations from the description above. But that does not make any difference: to be in line with OOP concepts you should always use methods exactly as described above.
Example
Let's give you an example:
class FooBar:
    def someMethod(self):
        print("abc")
This is a regular (instance) method. You use it like this:
myObj = FooBar()
myObj.someMethod()
If you have ...
myObjB = FooBar()
myObjB.someMethod()
... you have an additional instance and therefore invoking someMethod() on this second instance will be the invocation of a second someMethod() method - defined at the second object. This is because you instantiate objects before use so all instances follow the blueprint FooBar defined. All instances therefore receive some kind of copy of someMethod().
(In practice Python will use optimizations internally, so there actually is only one piece of code that implements your someMethod() in memory, but forget about this for now. To a programmer it appears as if every instance of a class has a copy of the method someMethod(). And that's the level of abstraction that is relevant to us, as this is the "surface" we work on. Deep within the implementation of a programming or scripting language things might be a bit different, but this is not very relevant.)
Let's have a look at a static method:
class FooBar:
    @staticmethod
    def someStaticMethod():
        print("abc")
Such static methods can be invoked like this:
FooBar.someStaticMethod()
As you can see: no instance. You directly invoke this method in the context of the class itself. While regular methods work on the particular instance itself (they typically modify data within this instance), a static method does not. It could modify static (!) data, but typically it does not anyway.
Consider a static method a special case. It is rarely needed. What you typically want when you write code is not a static method. Only in very specific situations does it make sense to implement one.
The self parameter
Please note that a standard "instance" method must always have self as its first argument. This is specific to Python. In the real world Python will (of course!) store your method only once in memory, even if you instantiate thousands of objects of your class. Consider this an optimization. If you then invoke your method on one of your thousands of instances, the same single piece of code is always called. But so that it can tell which particular object the method should work on, your instance is passed to this internally stored piece of code as the very first argument. This is the self argument. It is a kind of implicit argument and is always needed for regular (instance) methods. (Not so for static methods: there you don't need an instance to invoke them.)
As this argument is implicit and always needed, most programming languages hide it from the programmer (and handle it internally, under the hood, in the correct way). It does not really make much sense to expose this special argument anyway.
Unfortunately Python does not follow this principle. Python does not hide this argument which is implicitly required. (Python's incorporation of OOP concepts is a bit ... simple.) Therefore you see self almost everywhere in methods. In your mind you can ignore it, but you need to write it explicitly if you define your own classes. (Which is something you should do in order to structure your programs in a good way.)
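A small sketch may help show what self really is. It reuses the FooBar class from the example above, but the method body is changed to return id(self) purely so the result can be compared; the variable names implicit, explicit and shared are made up for this illustration.

```python
class FooBar:
    def someMethod(self):
        # self is whichever instance the method was invoked on.
        return id(self)


myObj = FooBar()
myObjB = FooBar()

# The usual call: Python passes myObj as self behind the scenes.
implicit = myObj.someMethod()

# The same call written out by hand: the function is looked up on the
# class and the instance is handed over explicitly as the first argument.
explicit = FooBar.someMethod(myObj)

# Both instances share one underlying function object.
shared = myObj.someMethod.__func__ is myObjB.someMethod.__func__
```

Here implicit and explicit are the same value, and shared is True, which matches the "stored only once in memory" point made above.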
The static method __new__()
Python is quite special. While many programming languages follow a strict and immutable concept of how to create instances of particular classes, Python is a bit different here: this behavior can be changed. Instance creation is implemented in __new__(). So if you do this ...
myObj = FooBar()
... Python implicitly invokes FooBar.__new__() to create the instance, then invokes a constructor-like (instance) method named __init__() that you could (!) define in your class (as an instance method), and finally returns the fully initialized instance. This instance is then what is stored in myObj in this example here.
You could modify this behavior if you want. But this would require a very unusual use case. You will likely never have anything to do with __new__() itself in your entire work with Python. My advice: if you're new to Python, just ignore it.
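If you are curious anyway, the order of events can be observed with a short sketch. The calls list and the value parameter are made up for this illustration. Note that __new__ is treated as a static method automatically, so it needs no decorator, and it receives the class as its first argument.

```python
calls = []


class FooBar:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # Delegate the actual object creation to object.__new__.
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")
        self.value = value


myObj = FooBar(42)
```

After this runs, calls holds "__new__" followed by "__init__": the instance is created first and initialized afterwards.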

alternative for passing references around in python

I'm relatively new to Python and OOP, and I have a question about the design of my code for a hobby project.
I created a lot of variables in my main program. These variables are lists of objects (not configuration parameters and not constants). The objects in the lists are sprites.
I'm passing these variables around between objects, by calling methods and passing the variables around as arguments for a specific method. (pass-by-reference)
For example:
spritelist = [Sprite(...), Sprite(..)]
mycollisiondetector = CollisionDetector()
mycollisiondetector.check_collision(spritelist)
Then, in class CollisionDetector, spritelist is passed to "private" methods of the class. These private methods call other methods, and keep passing spritelist ... .
So, my question is just this: is there an alternative for endlessly passing variables around from one method to another ?
If you're dealing with instance variables (not configuration constants), it's considered bad practice to separate the variables into a module (a different file), since you're mixing instance state and global state.
If you have many references being passed around repeatedly, this is usually an indicator of bad class hierarchy design. You may want to consider subclassing, or defining a new class for your variables and passing a reference to it. The details will depend on your specific situation - it's hard to tell without seeing the code.
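One possible reading of that advice, sketched under assumptions: the detector keeps the sprite list as instance state, so its "private" methods read self.sprites instead of receiving the list as an argument each time. The Sprite fields (x, y, size) and the square-overlap test are invented for this sketch, since the question doesn't show them.

```python
class Sprite:
    """Hypothetical stand-in for the question's sprite objects."""

    def __init__(self, x, y, size):
        self.x, self.y, self.size = x, y, size


class CollisionDetector:
    def __init__(self, sprites):
        # Keep one reference on the instance instead of threading it
        # through every method call.
        self.sprites = sprites

    def check_collision(self):
        return [(a, b) for a, b in self._pairs() if self._overlaps(a, b)]

    def _pairs(self):
        # Every unordered pair of sprites, read from self.sprites.
        for i, a in enumerate(self.sprites):
            for b in self.sprites[i + 1:]:
                yield a, b

    @staticmethod
    def _overlaps(a, b):
        # Overlap test for axis-aligned squares centred on (x, y);
        # an assumption made only for this sketch.
        reach = (a.size + b.size) / 2
        return abs(a.x - b.x) < reach and abs(a.y - b.y) < reach


spritelist = [Sprite(0, 0, 2), Sprite(1, 1, 2), Sprite(10, 10, 2)]
mycollisiondetector = CollisionDetector(spritelist)
hits = mycollisiondetector.check_collision()
```

Because the detector holds a reference to the same list object, sprites added to spritelist later are seen by the next check_collision() call without any extra passing around.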

Python: is there a use case for changing an instance's class?

Related: Python object conversion
I recently learned that Python allows you to change an instance's class like so:
class Robe:
    pass
class Dress:
    pass
r = Robe()
r.__class__ = Dress
I'm trying to figure out whether there is a case where 'transmuting' an object like this can be useful. I've messed around with this in IDLE, and one thing I've noticed is that assigning a different class doesn't call the new class's __init__ method, though this can be done explicitly if needed.
Virtually every use case I can think of would be better served by composition, but I'm a coding newb so what do I know. ;)
There is rarely a good reason to do this for unrelated classes, like Robe and Dress in your example. Without a bit of work, it's hard to ensure that the object you get in the end is in a sane state.
However, it can be useful when inheriting from a base class, if you want to use a non-standard factory function or constructor to build the base object. Here's an example:
class Base(object):
    pass
def base_factory():
    return Base() # in real code, this would probably be something opaque
class Derived(Base):
    def __new__(cls):
        self = base_factory() # get an instance of Base
        self.__class__ = Derived # and turn it into an instance of Derived
        return self
In this example, the Derived class's __new__ method wants to construct its object using the base_factory method which returns an instance of the Base class. Often this sort of factory is in a library somewhere, and you can't know for certain how it's making the object (you can't just call Base() or super(Derived, cls).__new__(cls) yourself to get the same result).
The instance's __class__ attribute is rewritten so that the result of calling Derived.__new__ will be an instance of the Derived class, which ensures that it will have the Derived.__init__ method called on it (if such a method exists).
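The effect on __init__ can be checked with a small, self-contained sketch. It repeats the Base, base_factory and Derived names from the example above, and adds an __init__ plus a contrasting class Plain that does not rewrite __class__; Plain and the initialized attribute are made up for this illustration.

```python
class Base:
    pass


def base_factory():
    # Stand-in for an opaque library factory.
    return Base()


class Derived(Base):
    def __new__(cls):
        self = base_factory()
        self.__class__ = Derived  # now an instance of Derived
        return self

    def __init__(self):
        self.initialized = True


class Plain(Base):
    def __new__(cls):
        # No __class__ rewrite: the result stays a plain Base.
        return base_factory()

    def __init__(self):
        self.initialized = True


d = Derived()
p = Plain()
```

Python only calls __init__ when __new__ returns an instance of the class being constructed, so d ends up initialized while p is a bare Base object on which __init__ was never run.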
I remember using this technique ages ago to “upgrade” existing objects after recognizing what kind of data they hold. It was a part of an experimental XMPP client. XMPP uses many short XML messages (“stanzas”) for communication.
When the application received a stanza, it was parsed into a DOM tree. Then the application needed to recognize what kind of stanza it is (a presence stanza, message, automated query etc.). If, for example, it was recognized as a message stanza, the DOM object was “upgraded” to a subclass that provided methods like “get_author”, “get_body” etc.
I could of course just make a new class to represent a parsed message, make a new object of that class and copy the relevant data from the original XML DOM object. There were two benefits of changing the object's class in place, though. Firstly, XMPP is a very extensible standard, and it was useful to still have easy access to the original DOM object in case some other part of the code found something useful there, or while debugging. Secondly, profiling the code told me that creating a new object and explicitly copying data is much slower than just reusing the object that would be quickly destroyed anyway; the difference was enough to matter in XMPP, which uses many short messages.
I don't think any of these reasons justifies the use of this technique in production code, unless maybe you really need the (not that big) speedup in CPython. It's just a hack which I found useful to make code a bit shorter and faster in the experimental application. Note also that this technique will easily break JIT engines in non-CPython implementations, making the code much slower!
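For readers who want to see the shape of that "upgrade" trick, here is a rough sketch. It is not the original XMPP client code: the Stanza and MessageStanza classes, the dict used in place of a DOM tree, and the upgrade function are all invented for illustration. Only the get_author and get_body method names come from the description above.

```python
class Stanza:
    """Hypothetical stand-in for a parsed DOM node."""

    def __init__(self, tag, fields):
        self.tag = tag
        self.fields = fields


class MessageStanza(Stanza):
    # Adds convenience accessors only; no new instance state is required,
    # which is what keeps the in-place class swap safe here.
    def get_author(self):
        return self.fields.get("from")

    def get_body(self):
        return self.fields.get("body")


def upgrade(stanza):
    # Swap the class in place once the kind of stanza is recognized.
    if stanza.tag == "message":
        stanza.__class__ = MessageStanza
    return stanza


msg = upgrade(Stanza("message", {"from": "alice", "body": "hi"}))
other = upgrade(Stanza("presence", {}))
```

The object identity and all existing attributes are preserved; only the set of available methods changes.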

When should I use a class and when should I use a function?

When is a class more useful than a function? Is there any hard and fast rule that I should know about? Is it language-dependent? I'm intending to write a script in Python which will parse different types of JSON data, and my gut feeling is that I should use a class to do this, rather than a function.
You should use a class when your routine needs to save state. Otherwise a function will suffice.
First of all, I think this isn't language-dependent (as long as the language permits you to define both classes and functions).
As a general rule, a class wraps up a behaviour. So, if you have a certain type of service to implement (with, e.g., several different functions), a class is what you're looking for.
Moreover, classes (or, more correctly, objects) have state, and you can instantiate a class several times (giving different objects with different states).
No less important, a class can be inherited from, so you can override a specific behaviour with only small changes.
Use a class when you have state: something that should persist across calls.
Use a function in other cases.
Exception: if your class only stores a couple of values and has a single method besides __init__, you're better off using a function.
For anything non-trivial, you should probably be using a class. I tend to limit all of my "free-floating" functions to a utils.py file.
This is language-dependent.
Some languages, like Java, insist that you use a class for everything. There's simply no concept of a standalone function.
Python isn't like that. It's perfectly OK - in fact recommended - to define functions standalone, and related functions can be grouped together in modules. As others have stated, the only time you really want a class in Python is when you have state that you need to keep, i.e. when you are encapsulating data within the object.
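As a rough illustration of the "state" rule applied to the JSON-parsing case from the question: a stateless parse can be a plain function, while something that accumulates results across calls is a reasonable candidate for a class. The names parse_point and PointCollector and the {"x": ..., "y": ...} layout are made up for this sketch.

```python
import json


def parse_point(text):
    # Stateless: the result depends only on the input, so a function suffices.
    data = json.loads(text)
    return data["x"], data["y"]


class PointCollector:
    # Stateful: remembers every point fed to it across calls.
    def __init__(self):
        self.points = []

    def feed(self, text):
        self.points.append(parse_point(text))

    def centroid(self):
        n = len(self.points)
        return (sum(x for x, _ in self.points) / n,
                sum(y for _, y in self.points) / n)


collector = PointCollector()
collector.feed('{"x": 1, "y": 2}')
collector.feed('{"x": 3, "y": 4}')
center = collector.centroid()
```

Note that the class reuses the function rather than replacing it, so the two approaches combine naturally.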

Which is more pythonic, factory as a function in a module, or as a method on the class it creates?

I have some Python code that creates a Calendar object based on parsed VEvent objects from an iCalendar file.
The calendar object just has a method that adds events as they get parsed.
Now I want to create a factory function that creates a calendar from a file object, path, or URL.
I've been using the iCalendar python module, which implements a factory function as a class method directly on the Class that it returns an instance of:
cal = icalendar.Calendar.from_string(data)
From what little I know about Java, this is a common pattern in Java code, though I seem to find more references to a factory method being on a different class than the class you actually want to instantiate instances from.
The question is, is this also considered Pythonic ? Or is it considered more pythonic to just create a module-level method as the factory function ?
[Note. Be very cautious about separating "Calendar" a collection of events, and "Event" - a single event on a calendar. In your question, it seems like there could be some confusion.]
There are many variations on the Factory design pattern.
A stand-alone convenience function (e.g., calendarMaker(data))
A separate class (e.g., CalendarParser) which builds your target class (Calendar).
A class-level method (e.g. Calendar.from_string).
These have different purposes. All are Pythonic, the questions are "what do you mean?" and "what's likely to change?" Meaning is everything; change is important.
Convenience functions are Pythonic. Languages like Java can't have free-floating functions; you must wrap a lonely function in a class. Python allows you to have a lonely function without the overhead of a class. A function is relevant when your constructor has no state changes or alternate strategies or any memory of previous actions.
Sometimes folks will define a class and then provide a convenience function that makes an instance of the class, sets the usual parameters for state and strategy and any other configuration, and then calls the single relevant method of the class. This gives you both the statefulness of class plus the flexibility of a stand-alone function.
The class-level method pattern is used, but it has limitations. One, it's forced to rely on class-level variables. Since these can be confusing, a complex constructor as a static method runs into problems when you need to add features (like statefulness or alternative strategies.) Be sure you're never going to expand the static method.
Two, it's more-or-less irrelevant to the rest of the class methods and attributes. This kind of from_string is just one of many alternative encodings for your Calendar objects. You might have a from_xml, from_JSON, from_YAML and on and on. None of this has the least relevance to what a Calendar IS or what it DOES. These methods are all about how a Calendar is encoded for transmission.
What you'll see in the mature Python libraries is that factories are separate from the things they create. Encoding (as strings, XML, JSON, YAML) is subject to a great deal of more-or-less random change. The essential thing, however, rarely changes.
Separate the two concerns. Keep encoding and representation as far away from state and behavior as you can.
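A minimal sketch of that separation, under loud assumptions: the Calendar class below only knows about events, and a module-level function owns the decoding. The one-summary-per-line "format" is invented for the example and is not the iCalendar format, and calendar_from_lines is a hypothetical name.

```python
class Calendar:
    # State and behaviour only: nothing here knows about any encoding.
    def __init__(self):
        self.events = []

    def add_event(self, event):
        self.events.append(event)


def calendar_from_lines(lines):
    """Hypothetical decoder: one event summary per non-empty line."""
    cal = Calendar()
    for line in lines:
        summary = line.strip()
        if summary:
            cal.add_event(summary)
    return cal


cal = calendar_from_lines(["Standup\n", "\n", "Lunch\n"])
```

A second encoding would get its own function (or parser class) next to this one, and Calendar itself would not need to change.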
It's pythonic not to think about esoteric difference in some pattern you read somewhere and now want to use everywhere, like the factory pattern.
Most of the time when you would think of a @staticmethod as a solution, it's probably better to use a module function, except when you put multiple classes in one module and each has a different implementation of the same interface; then it's better to use a @staticmethod.
Ultimately, whether you create your instances with a @staticmethod or with a module function makes little difference.
I'd probably use the initializer ( __init__ ) of a class because one of the more accepted "patterns" in python is that the factory for a class is the class initialization.
IMHO a module-level method is a cleaner solution. It hides behind the Python module system that gives it a unique namespace prefix, something the "factory pattern" is commonly used for.
The factory pattern has its own strengths and weaknesses. However, choosing one way to create instances usually has little pragmatic effect on your code.
A staticmethod rarely has value, but a classmethod may be useful. It depends on what you want the class and the factory function to actually do.
A factory function in a module would always make an instance of the 'right' type (where 'right' in your case is always the 'Calendar' class, but you might also make it dependent on the contents of what it is creating the instance out of).
Use a classmethod if you wish to make it dependent not on the data, but on the class you call it on. A classmethod is like a staticmethod in that you can call it on the class, without an instance, but it receives the class it was called on as its first argument. This allows you to actually create an instance of that class, which may be a subclass of the original class. An example of a classmethod is dict.fromkeys(), which creates a dict from a list of keys and a single value (defaulting to None). Because it's a classmethod, when you subclass dict you get the 'fromkeys' method entirely for free. Here's an example of how one could write dict.fromkeys() oneself:
class dict_with_fromkeys(dict):
    @classmethod
    def fromkeys(cls, keys, value=None):
        self = cls()
        for key in keys:
            self[key] = value
        return self
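To see the "for free" part in action, the sketch below repeats the class and adds a subclass. The Registry subclass and the variable names are made up for this illustration.

```python
class dict_with_fromkeys(dict):
    @classmethod
    def fromkeys(cls, keys, value=None):
        # cls is whatever class the method was called on.
        self = cls()
        for key in keys:
            self[key] = value
        return self


class Registry(dict_with_fromkeys):
    """Hypothetical subclass; it defines no factory of its own."""


reg = Registry.fromkeys(["a", "b"], 0)
base = dict_with_fromkeys.fromkeys(["a"])
```

Calling the inherited classmethod on Registry produces a Registry, not a dict_with_fromkeys, because cls is bound to the class the call was made on. A staticmethod or module-level function would have had to hard-code the class instead.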
