How is __slots__ implemented in Python? - python

How is __slots__ implemented in Python?
Is this exposed in the C interface?
How do I get __slots__ behaviour when defining a Python class in C via PyTypeObject?

When creating Python classes, they by default have a __dict__ and you can set any attribute on them. The point of slots is to not create a __dict__ to save space.
In the C interface it's the other way around, an extension class has by default no __dict__, and you would instead explicitly have to add one and add getattr/setattr support to handle it (although luckily there are methods for this already, PyObject_GenericGetAttr and PyObject_GenericSetAttr, so you don't have to implement them, just use them. (Funnily there is not PyObject_GenericDelAttr, though, I'm not sure what that is about. (I should probably stop nesting parenthesis like this (or not)))).
Slots therefore aren't needed nor make sense for Extension types. By default you just let your getattr/setatttr methods access only those attributes that the class has.
As for how it's implemented, the code is in typeobject.c, and it's pretty much just a question of "If the object has a __slots__ attribute, don't create a __dict__. Quite unexciting. :-)

Related

How to use implementation inheritance?

How to use implementation inheritance in Python, that is to say public attributes x and protected attributes _x of the implementation inherited base classes becoming private attributes __x of the derived class?
In other words, in the derived class:
accessing the public attribute x or protected attribute _x should look up x or _x respectively like usual, except it should skip the implementation inherited base classes;
accessing the private attribute __x should look up __x like usual, except it should look up x and _x instead of __x for the implementation inherited base classes.
In C++, implementation inheritance is achieved by using the private access specifier in the base class declarations of a derived class, while the more common interface inheritance is achieved by using the public access specifier:
class A: public B, private C, private D, public E { /* class body */ };
For instance, implementation inheritance is needed to implement the class Adapter design pattern which relies on class inheritance (not to be confused with the object Adapter design pattern which relies on object composition) and consists in converting the interface of an Adaptee class into the interface of a Target abstract class by using an Adapter class that inherits both the interface of the Target abstract class and the implementation of the Adaptee class (cf. the Design Patterns book by Erich Gamma et al.):
Here is a Python program specifying what is intended, based on the above class diagram:
import abc
class Target(abc.ABC):
#abc.abstractmethod
def request(self):
raise NotImplementedError
class Adaptee:
def __init__(self):
self.state = "foo"
def specific_request(self):
return "bar"
class Adapter(Target, private(Adaptee)):
def request(self):
# Should access self.__state and Adaptee.specific_request(self)
return self.__state + self.__specific_request()
a = Adapter()
# Test 1: the implementation of Adaptee should be inherited
try:
assert a.request() == "foobar"
except AttributeError:
assert False
# Test 2: the interface of Adaptee should NOT be inherited
try:
a.specific_request()
except AttributeError:
pass
else:
assert False
You don't want to do this. Python is not C++, nor is C++ Python. How classes are implemented is completely different and so will lead to different design patterns. You do not need to use the class adapter pattern in Python, nor do you want to.
The only practical way to implement the adapter pattern in Python is either by using composition, or by subclassing the Adaptee without hiding that you did so.
I say practical here because there are ways to sort of make it work, but this path would take a lot of work to implement and is likely to introduce hard to track down bugs, and would make debugging and code maintenance much, much harder. Forget about 'is it possible', you need to worry about 'why would anyone ever want to do this'.
I'll try to explain why.
I'll also tell you how the impractical approaches might work. I'm not actually going to implement these, because that's way too much work for no gain, and I simply don't want to spend any time on that.
But first we have to clear several misconceptions here. There are some very fundamental gaps in your understanding of Python and how it's model differs from the C++ model: how privacy is handled, and compilation and execution philosophies, so lets start with those:
Privacy models
First of all, you can't apply C++'s privacy model to Python, because Python has no encapsulation privacy. At all. You need to let go of this idea, entirely.
Names starting with a single underscore are not actually private, not in the way C++ privacy works. Nor are they 'protected'. Using an underscore is just a convention, Python does not enforce access control. Any code can access any attribute on instances or classes, whatever naming convention was used. Instead, when you see a name that start with an underscore you can assume that the name is not part of the conventions of a public interface, that is, that these names can be changed without notice or consideration for backwards compatibility.
Quoting from the Python tutorial section on the subject:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
It's a good convention, but not even something you can rely on, consistently. E.g. the collections.namedtuple() class generator generates a class with 5 different methods and attributes that all start with an underscore but are all meant to be public, because the alternative would be to place arbitrary restrictions on what attribute names you can give the contained elements, and making it incredibly hard to add additional methods in future Python versions without breaking a lot of code.
Names starting with two underscores (and none at the end), are not private either, not in a class encapsulation sense such as the C++ model. They are class-private names, these names are re-written at compile time to produce a per-class namespace, to avoid collisions.
In other words, they are used to avoid a problem very similar to the namedtuple issue described above: to remove limits on what names a subclass can use. If you ever need to design base classes for use in a framework, where subclasses should have the freedom to name methods and attributes without limit, that's where you use __name class-private names. The Python compiler will rewrite __attribute_name to _ClassName__attribute_name when used inside a class statement as well as in any functions that are being defined inside a class statement.
Note that C++ doesn't use names to indicate privacy. Instead, privacy is a property of each identifier, within a given namespace, as processed by the compiler. The compiler enforces access control; private names are not accessible and will lead to compilation errors.
Without a privacy model, your requirement where "public attributes x and protected attributes _x of the implementation inherited base classes becoming private attributes __x of the derived class" are not attainable.
Compilation and execution models
C++
C++ compilation produces binary machine code aimed at execution directly by your CPU. If you want to extend a class from another project, you can only do so if you have access to additional information, in the form of header files, to describe what API is available. The compiler combines information in the header files with tables stored with the machine code and your source code to build more machine code; e.g. inheritance across library boundaries is handled through virtualisation tables.
Effectively, there is very little left of the objects used to construct the program with. You generally don't create references to class or method or function objects, the compiler has taken those abstract ideas as inputs but the output produced is machine code that doesn't need most of those concepts to exist any more. Variables (state, local variables in methods, etc.) are stored either on the heap or on the stack, and the machine code accesses these locations directly.
Privacy is used to direct compiler optimisations, because the compiler can, at all times, know exactly what code can change what state. Privacy also makes virtualisation tables and inheritance from 3rd-party libraries practical, as only the public interface needs to be exposed. Privacy is an efficiency measure, primarily.
Python
Python, on the other hand, runs Python code using a dedicated interpreter runtime, itself a piece of machine code compiled from C code, which has a central evaluation loop that takes Python-specific op-codes to execute your code. Python source code is compiled into bytecode roughly at the module and function levels, stored as a nested tree of objects.
These objects are fully introspectable, using a common model of attributes, sequences and mappings. You can subclass classes without having to have access to additional header files.
In this model, a class is an object with references to base classes, as well as a mapping of attributes (which includes any functions which become bound methods through access on instances). Any code to be executed when a method is called on an instance is encapsulated in code objects attached to function objects stored in the class attribute mapping. The code objects are already compiled to bytecode, and interaction with other objects in the Python object model is through runtime lookups of references, with the attribute names used for those lookups stored as constants within the compiled bytecode if the source code used fixed names.
From the point of view of executing Python code, variables (state and local variables) live in dictionaries (the Python kind, ignoring the internal implementation as hash maps) or, for local variables in functions, in an array attached to the stack frame object. The Python interpreter translates access to these to access to values stored on the heap.
This makes Python slow, but also much more flexible when executing. You can not only introspect the object tree, most of the tree is writeable letting you replace objects at will and so change how the program behaves in nearly limitless ways. And again, there are no privacy controls enforced.
Why use class adapters in C++, and not in Python
My understanding is that experienced C++ coders will use a class adapter (using subclassing) over an object adapter (using composition), because they need to pass compiler-enforced type checks (they need to pass the instances to something that requires the Target class or a subclass thereof), and they need to have fine control over object lifetimes and memory footprints. So, rather than have to worry about the lifetime or memory footprint of an encapsulated instance when using composition, subclassing gives you more complete control over the instance lifetime of your adapter.
This is especially helpful when it might not be practical or even possible to alter the implementation of how the adaptee class would control instance lifetime. At the same time, you wouldn't want to deprive the compiler from optimisation opportunities offered by private and protected attribute access. A class that exposes both the Target and Adaptee interfaces offers fewer options for optimisation.
In Python you almost never have to deal with such issues. Python's object lifetime handling is straightforward, predictable and works the same for every object anyway. If lifetime management or memory footprints were to become an issue you'd probably already be moving the implementation to an extension language like C++ or C.
Next, most Python APIs do not require a specific class or subclass. They only care about the right protocols, that is, if the right methods and attributes are implemented. As long as your Adapter has the right methods and attributes, it'll do fine. See Duck Typing; if your adapter walks like a duck, and talks like a duck, it surely must be a duck. It doesn't matter if that same duck can also bark like a dog.
The practical reasons why you don't do this in Python
Let's move to practicalities. We'll need to update your example Adaptee class to make it a bit more realistic:
class Adaptee:
def __init__(self, arg_foo=42):
self.state = "foo"
self._bar = arg_foo % 17 + 2 * arg_foo
def _ham_spam(self):
if self._bar % 2 == 0:
return f"ham: {self._bar:06d}"
return f"spam: {self._bar:06d}"
def specific_request(self):
return self._ham_spam()
This object not only has a state attribute, it also has a _bar attribute and a private method _ham_spam.
Now, from here on out I'm going to ignore the fact that your basic premise is flawed because there is no privacy model in Python, and instead re-interpret your question as a request to rename the attributes.
For the above example that would become:
state -> __state
_bar -> __bar
_ham_spam -> __ham_spam
specific_request -> __specific_request
You now have a problem, because the code in _ham_spam and specific_request has already been compiled. The implementation for these methods expects to find _bar and _ham_spam attributes on the self object passed in when called. Those names are constants in their compiled bytecode:
>>> import dis
>>> dis.dis(Adaptee._ham_spam)
8 0 LOAD_FAST 0 (self)
2 LOAD_ATTR 0 (_bar)
4 LOAD_CONST 1 (2)
6 BINARY_MODULO
# .. etc. remainder elided ..
The LOAD_ATTR opcode in the above Python bytecode disassembly excerpt will only work correctly if the local variable self has an attribute named _bar.
Note that self can be bound to an instance of Adaptee as well as of Adapter, something you'd have to take into account if you wanted to change how this code operates.
So, it is not enough to simply rename method and attribute names.
Overcoming this problem would require one of two approaches:
intercept all attribute access on both the class and instance levels to translate between the two models.
rewriting the implementations of all methods
Neither of these is a good idea. Certainly neither of them are going to be more efficient or practical, compared to creating a composition adapter.
Impractical approach #1: rewrite all attribute access
Python is dynamic, and you could intercept all attribute access on both the class and the instance levels. You need both, because you have a mix of class attributes (_ham_spam and specific_request), and instance attributes (state and _bar).
You can intercept instance-level attribute access by implementing all methods in the Customizing attribute access section (you don't need __getattr__ for this case). You'll have to be very careful, because you'll need access to various attributes of your instances while controlling access to those very attributes. You'll need to handle setting and deleting as well as getting. This lets you control most attribute access on instances of Adapter().
You would do the same at the class level by creating a metaclass for whatever class your private() adapter would return, and implementing the exact same hook methods for attribute access there. You'll have to take into account that your class can have multiple base classes, so you'd need to handle these as layered namespaces, using their MRO ordering. Attribute interactions with the Adapter class (such as Adapter._special_request to introspect the inherited method from Adaptee) will be handled at this level.
Sounds easy enough, right? Except than the Python interpreter has many optimisations to ensure it isn't completely too slow for practical work. If you start intercepting every attribute access on instances, you will kill a lot of these optimisations (such as the method call optimisations introduced in Python 3.7). Worse, Python ignores the attribute access hooks for special method lookups.
And you have now injected a translation layer, implemented in Python, invoked multiple times for every interaction with the object. This will be a performance bottleneck.
Last but not least, to do this in a generic way, where you can expect private(Adaptee) to work in most circumstances is hard. Adaptee could have other reasons to implement the same hooks. Adapter or a sibling class in the hierarchy could also be implementing the same hooks, and implement them in a way that means the private(...) version is simply bypassed.
Invasive all-out attribute interception is fragile and hard to get right.
Impractical approach #2: rewriting the bytecode
This goes down the rabbit hole quite a bit further. If attribute rewriting isn't practical, how about rewriting the code of Adaptee?
Yes, you could, in principle, do this. There are tools available to directly rewrite bytecode, such as codetransformer. Or you could use the inspect.getsource() function to read the on-disk Python source code for a given function, then use the ast module to rewrite all attribute and method access, then compile the resulting updated AST to bytecode. You'd have to do so for all methods in the Adaptee MRO, and produce a replacement class dynamically that'll achieve what you want.
This, again, is not easy. The pytest project does something like this, they rewrite test assertions to provide much more detailed failure information than otherwise possible. This simple feature requires a 1000+ line module to achieve, paired with a 1600-line test suite to ensure that it does this correctly.
And what you've then achieved is bytecode that doesn't match the original source code, so anyone having to debug this code will have to deal with the fact that the source code the debugger sees doesn't match up with what Python is executing.
You'll also lose the dynamic connection with the original base class. Direct inheritance without code rewriting lets you dynamically update the Adaptee class, rewriting the code forces a disconnect.
Other reason these approaches can't work
I've ignored a further issue that neither of the above approaches can solve. Because Python doesn't have a privacy model, there are plenty of projects out there where code interacts with class state directly.
E.g., what if your Adaptee() implementation relies on a utility function that will try to access state or _bar directly? It's part of the same library, the author of that library would be well within their rights to assume that accessing Adaptee()._bar is safe and normal. Neither attribute intercepting nor code rewriting will fix this issue.
I also ignored the fact that isinstance(a, Adaptee) will still return True, but if you have hidden it's public API by renaming, you have broken that contract. For better or worse, Adapter is a subclass of Adaptee.
TLDR
So, in summary:
Python has no privacy model. There is no point in trying to enforce one here.
The practical reasons that necessitate the class adapter pattern in C++, don't exist in Python
Neither dynamic attribute proxying nor code tranformation is going to be practical in this case and introduce more problems than are being solved here.
You should instead use composition, or just accept that your adapter is both a Target and an Adaptee and so use subclassing to implement the methods required by the new interface without hiding the adaptee interface:
class CompositionAdapter(Target):
def __init__(self, adaptee):
self._adaptee = adaptee
def request(self):
return self._adaptee.state + self._adaptee.specific_request()
class SubclassingAdapter(Target, Adaptee):
def request(self):
return self.state + self.specific_request()
Python doesn't have a way of defining private members like you've described (docs).
You could use encapsulation instead of inheritance and call the method directly, as you noted in your comment. This would be my preferred approach, and it feels the most "pythonic".
class Adapter(Target):
def request(self):
return Adaptee.specific_request(self)
In general, Python's approach to classes is much more relaxed than what is found in C++. Python supports duck-typing, so there is no requirement to subclass Adaptee, as long as the interface of Target is satisfied.
If you really want to use inheritance, you could override interfaces you don't want exposed to raise an AttributeError, and use the underscore convention to denote private members.
class Adaptee:
def specific_request(self):
return "foobar"
# make "private" copy
_specific_request = specific_request
class Adapter(Target, Adaptee):
def request(self):
# call "private" implementation
return self._specific_request()
def specific_request(self):
raise AttributeError()
This question has more suggestions if you want alternatives for faking private methods.
If you really wanted true private methods, you could probably implement a metaclass that overrides object.__getattribute__. But I wouldn't recommend it.

Strong attribute type checking when creating a python class

I am coming from a Java background and I am used to POJO to define a data model. Basically what I want to define is a data model using Python classes with strong attribute type checking and not being able to add more attribute then the one defined (basically something similar to django model). For example I am trying to do create a class:
class A:
element1: int
element2: str
but in this way I am not forcing that A object will not have attribute element3.
What is the pythonic way to achieve that, is there any framework you advice for python 3.6+?
You can do this using __slots__, you can find more information using the magic method __slots__ here: Slots docs.
Slots are a nice way to work around this space consumption problem. Instead of having a dynamic dict that allows adding attributes to objects dynamically, slots provide a static structure which prohibits additions after the creation of an instance.
Keep in mind that this comes at a cost.
It will break serialization (e.g. pickle). It will also break multiple inheritance. A class can't inherit from more than one class that either defines slots or has an instance layout defined in C code (like list, tuple or int).
I hope this information is sufficient.

Does type create the class 'Object' in python

I was reading about this excellent post on metaclasses What is a metaclass in Python?. The accepted answer shows how to create a class using type with the following signature.
type(name of the class,
tuple of the parent class (for inheritance, can be empty),
dictionary containing attributes names and values)
That makes me wonder who creates the object class. Is object class also created by type class?
In theory you are correct. object is an instance of type. But, when you check type's base classes, you'll find that type inherits from object! How can this be?
The truth is, this is hard-coded in the Python interpreter. Both object and type are given; neither is actually involved in any way with creating the other.
The main thing to remember at these rarified levels of the object/type hierarchy is that things like object and type are not created in .py files at all, they are created statically in C code, along with the rest of the foundational underpinnings of CPython. So they aren't necessarily "created" with any particular Python code.
type is a bit of a - umm, not sure how to describe it - weirdo of the language.
It can be used to do type(someobj) or used as a constructor (with 3 params) of some sorts to create new types derived from a base with instances. As you have seen with meta-classes - it's useful to make factory classes - although, IMHO unless you really want to be too clever, using class decorators makes this slightly easier in 2.6+

When to use attributes vs. when to use properties in python? [duplicate]

This question already has answers here:
What's the difference between a Python "property" and "attribute"?
(7 answers)
Closed 2 months ago.
Just a quick question, I'm having a little difficulty understanding where to use properties vs. where use to plain old attributes. The distinction to me is a bit blurry. Any resources on the subject would be superb, thank you!
Properties are more flexible than attributes, since you can define functions that describe what is supposed to happen when setting, getting or deleting them. If you don't need this additional flexibility, use attributes – they are easier to declare and faster.
In languages like Java, it is usually recommended to always write getters and setters, in order to have the option to replace these functions with more complex versions in the future. This is not necessary in Python, since the client code syntax to access attributes and properties is the same, so you can always choose to use properties later on, without breaking backwards compatibilty.
The point is that the syntax is interchangeable. Always start with attributes. If you find you need additional calculations when accessing an attribute, replace it with a property.
In addition to what Daniel Roseman said, I often use properties when I'm wrapping something i.e. when I don't store the information myself but wrapped object does. Then properties make excellent accessors.
Properties are attributes + a posteriori encapsulation.
When you turn an attribute into a property, you just define some getter and setter that you "attach" to it, that will hook the data access. Then, you don't need to rewrite the rest of your code, the way for accessing the data is the same, whatever your attribute is a property or not.
Thanks to this very clever and powerful encapsulation mechanism, in Python you can usually go with attributes (without a priori encapsulation, so without any getter nor setter), unless you need to do special things when accessing the data.
If so, then you just can define setters and getters, only if needed, and "attach" them to the attribute, turning it into a property, without any incidence on the rest of your code (whereas in Java, the first thing you usually do when creating a field, usually private, is to create it's associated getter and setter method).
Nice page about attributes, properties and descriptors here

What is Ruby's analog to Python Metaclasses?

Python has the idea of metaclasses that, if I understand correctly, allow you to modify an object of a class at the moment of construction. You are not modifying the class, but instead the object that is to be created then initialized.
Python (at least as of 3.0 I believe) also has the idea of class decorators. Again if I understand correctly, class decorators allow the modifying of the class definition at the moment it is being declared.
Now I believe there is an equivalent feature or features to the class decorator in Ruby, but I'm currently unaware of something equivalent to metaclasses. I'm sure you can easily pump any Ruby object through some functions and do what you will to it, but is there a feature in the language that sets that up like metaclasses do?
So again, Does Ruby have something similar to Python's metaclasses?
Edit I was off on the metaclasses for Python. A metaclass and a class decorator do very similar things it appears. They both modify the class when it is defined but in different manners. Hopefully a Python guru will come in and explain better on these features in Python.
But a class or the parent of a class can implement a __new__(cls[,..]) function that does customize the construction of the object before it is initialized with __init__(self[,..]).
Edit This question is mostly for discussion and learning about how the two languages compare in these features. I'm familiar with Python but not Ruby and was curious. Hopefully anyone else who has the same question about the two languages will find this post helpful and enlightening.
Ruby doesn't have metaclasses. There are some constructs in Ruby which some people sometimes wrongly call metaclasses but they aren't (which is a source of endless confusion).
However, there's a lot of ways to achieve the same results in Ruby that you would do with metaclasses. But without telling us what exactly you want to do, there's no telling what those mechanisms might be.
In short:
Ruby doesn't have metaclasses
Ruby doesn't have any one construct that corresponds to Python's metaclasses
Everything that Python can do with metaclasses can also be done in Ruby
But there is no single construct, you will use different constructs depending on what exactly you want to do
Any one of those constructs probably has other features as well that do not correspond to metaclasses (although they probably correspond to something else in Python)
While you can do anything in Ruby that you can do with metaclasses in Python, it might not necessarily be straightforward
Although often there will be a more Rubyish solution that is elegant
Last but not least: while you can do anything in Ruby that you can do with metaclasses in Python, doing it might not necessarily be The Ruby Way
So, what are metaclasses exactly? Well, they are classes of classes. So, let's take a step back: what are classes exactly?
Classes …
are factories for objects
define the behavior of objects
define on a metaphysical level what it means to be an instance of the class
For example, the Array class produces array objects, defines the behavior of arrays and defines what "array-ness" means.
Back to metaclasses.
Metaclasses …
are factories for classes
define the behavior of classes
define on a metaphysical level what it means to be a class
In Ruby, those three responsibilities are split across three different places:
the Class class creates classes and defines a little bit of the behavior
the individual class's eigenclass defines a little bit of the behavior of the class
the concept of "classness" is hardwired into the interpreter, which also implements the bulk of the behavior (for example, you cannot inherit from Class to create a new kind of class that looks up methods differently, or something like that – the method lookup algorithm is hardwired into the interpreter)
So, those three things together play the role of metaclasses, but neither one of those is a metaclass (each one only implements a small part of what a metaclass does), nor is the sum of those the metaclass (because they do much more than that).
Unfortunately, some people call eigenclasses of classes metaclasses. (Until recently, I was one of those misguided souls, until I finally saw the light.) Other people call all eigenclasses metaclasses. (Unfortunately, one of those people is the author of one the most popular tutorials on Ruby metaprogramming and the Ruby object model.) Some popular libraries add a metaclass method to Object that returns the object's eigenclass (e.g. ActiveSupport, Facets, metaid). Some people call all virtual classes (i.e. eigenclasses and include classes) metaclasses. Some people call Class the metaclass. Even within the Ruby source code itself, the word "metaclass" is used to refer to things that are not metaclasses.
Your updated question looks quite different now. If I understand you correctly, you want to hook into object allocation and initialization, which has absolutely nothing whatsoever to do with metaclasses. (But you still don't write what it is that you actually want to do, so I might still be off.)
In some object-oriented languages, objects are created by constructors. However, Ruby doesn't have constructors. Constructors are just factory methods (with stupid restrictions); there is no reason to have them in a well-designed language, if you can just use a (more powerful) factory method instead.
Object construction in Ruby works like this: object construction is split into two phases, allocation and initialization. Allocation is done by a public class method called allocate, which is defined as an instance method of class Class and is generally never overriden. (In fact, I don't think you actually can override it.) It just allocates the memory space for the object and sets up a few pointers, however, the object is not really usable at this point.
That's where the initializer comes in: it is an instance method called initialize, which sets up the object's internal state and brings it into a consistent, fully defined state which can be used by other objects.
So, in order to fully create a new object, what you need to do is this:
x = X.allocate
x.initialize
[Note: Objective-C programmers may recognize this.]
However, because it is too easy to forget to call initialize and as a general rule an object should be fully valid after construction, there is a convenience factory method called Class#new, which does all that work for you and looks something like this:
class Class
def new(*args, &block)
obj = allocate
obj.initialize(*args, &block)
return obj
end
end
[Note: actually, initialize is private, so reflection has to be used to circumvent the access restrictions like this: obj.send(:initialize, *args, &block)]
That, by the way, is the reason why to construct an object you call a public class method Foo.new but you implement a private instance method Foo#initialize, which seems to trip up a lot of newcomers.
However, none of this is in any way baked into the language. The fact that the primary factory method for any class is usually called new is just a convention (and sometimes I wish it were different, because it looks similar to constructors in Java, but is completely different). In other languages, the constructor must have a specific name. In Java, it must have the same name as the class, which means that a) there can be only one constructor and b) anonymous classes can't have constructors because they don't have names. In Python, the factory method must be called __new__, which again means there can be only one. (In both Java and Python, you can of course have different factory methods, but calling them looks different from calling the default, while in Ruby (and Smalltalk from whence this pattern originated) it looks just the same.)
In Ruby, there can be as many factory methods as you like, with any name you like, and a factory method can have many different names. (For collection classes, for example, the factory method is often aliased to [], which allows you to write List[1, 2, 3] instead of List.new(1, 2, 3) which ends looking more like an array, thus emphasizing the collection-ish nature of lists.)
In short:
the standardized factory method is Foo.new, but it can be anything
Foo.new calls allocate to allocate memory for an empty object foo
Foo.new then calls foo.initialize, i.e. the Foo#initialize instance method
all three of those are just methods like any other, which you can undefine, redefine, override, wrap, alias and whatnot
well, except allocate which needs to allocate memory inside the Ruby runtime which you can't really do from Ruby
In Python, __new__ roughly corresponds to both new and allocate in Ruby, and __init__ exactly corresponds to initialize in Ruby. The main difference is that in Ruby, new calls initialize whereas in Python, the runtime automatically calls __init__ after __new__.
For example, here is a class which only allows a maximum of 2 instances created:
class Foo
def self.new(*args, &block)
#instances ||= 0
raise 'Too many instances!' if #instances >= 2
obj = allocate
obj.send(:initialize, *args, &block)
#instances += 1
return obj
end
attr_reader :name
def initialize(name)
#name = name
end
end
one = Foo.new('#1')
two = Foo.new('#2')
puts two.name # => #2
three = Foo.new('#3') # => RuntimeError: Too many instances!

Categories