This question already has answers here:
What is the purpose of the `self` parameter? Why is it needed?
(26 answers)
Closed 6 years ago.
I've recently learned about class, method, and (self) functions. While I seem to understand the syntax side of things, I find having to type [self] in front of variables bothersome with very little benefit.
In what scenario would this be beneficial compared to simply using individual functions without class reference?
In what scenario would this be beneficial compared to simply using individual functions without class reference?
whoa… what a question! I believe you should start with a lecture on object oriented programing.
Simply said: the benefits of using a class over laying out functions in a module, is to enable many features of object oriented programing (encapsulation of data through a behaviour, class inheritance, properties…).
And in more length: the main idea behind OOP, is that you create a public programing interface, that other developers (including you) will use, and you can hide the inner workings, so once the public interface is all implemented, you don't care how things work internally, and can even change without breaking the code that use that.
So in this case, you create a class, and you implement your own algorithms on your own data, but all what people really care is how to change your data using those algorithms.
And those algorithms are being called methods and the data members.
In many languages, when you create a method (i.e. a function that is bound to an "object"), the reference to the current instance of the object is implicit (and might be optionally explicit in Java or C++, which is the this variable).
In python, the language designers chose for it to be explicit, and called by convention self. Then in rare cases, you can choose to call it this if you want (which can be useful when doing nested classes).
Finally, I talked about "classes", "instances" and "objects". An object is an instance of a class. What that means is that the class is here to lay out how things work (what are the members, what are the methods…). Then you do instanciate your object by calling the constructor. So then, you can have plenty of objects that share the same class. Which means they work similarly, but they have different data.
But here I only scratch the surface, and only want to show you that there are a lot of concepts behind the self and the classes.
For more on the topic, go read books and take programming classes:
https://wiki.python.org/moin/BeginnersGuide
https://www.python.org/about/gettingstarted/
https://en.wikibooks.org/wiki/A_Beginner%27s_Python_Tutorial/Classes
http://www.bodenseo.com/course/python_training_course.html
https://www.udemy.com/python-for-beginners/
https://www.coursera.org/learn/python
Related
Suppose I have an instance method that contains a lot of nested conditionals. What would be a good way to encapsulate that code? Put in another instance method of the same class or a function? Could you say why a certain approach is preferred?
If the function is only used by one class, and especially if the module has more classes with potentially more utility functions (used only by one class), it might clarify things a bit if you kept the functions as static methods instead to make it obvious which class they belong to. Also, automated refactorings (using the e.g. the rope library, or PyCharm or PyDev etc) then automatically move the static method along with the class to wherever the class is moved.
P.S. #staticmethods, unlike module-level functions, can be overridden in subclasses, e.g. in case of a mathematical formula that doesn't depend on the object but does depend on the type of the object.
There are two different questions here. The first one is what to do with multiple nested conditionals. There's no single right answer: it depends on your coding style, how the conditions interact, the architecture of your program and so on. Have a look at this Programmers.SE question and Jeff Atwood's blog post for some ideas; personally, I like
if not check1: return
code1
if not check2: return
code 2
...
although some people object to the multiple exit points.
The second question is what to do with individual functions if you're writing object oriented Python. The usual answer is just to put them as functions inside the module containing the class, since there's no requirement that a function be attached to a particular class. If you want, though, you can include them in the class as static methods.
This question already has answers here:
What's the difference between a Python "property" and "attribute"?
(7 answers)
Closed 2 months ago.
Just a quick question, I'm having a little difficulty understanding where to use properties vs. where use to plain old attributes. The distinction to me is a bit blurry. Any resources on the subject would be superb, thank you!
Properties are more flexible than attributes, since you can define functions that describe what is supposed to happen when setting, getting or deleting them. If you don't need this additional flexibility, use attributes – they are easier to declare and faster.
In languages like Java, it is usually recommended to always write getters and setters, in order to have the option to replace these functions with more complex versions in the future. This is not necessary in Python, since the client code syntax to access attributes and properties is the same, so you can always choose to use properties later on, without breaking backwards compatibilty.
The point is that the syntax is interchangeable. Always start with attributes. If you find you need additional calculations when accessing an attribute, replace it with a property.
In addition to what Daniel Roseman said, I often use properties when I'm wrapping something i.e. when I don't store the information myself but wrapped object does. Then properties make excellent accessors.
Properties are attributes + a posteriori encapsulation.
When you turn an attribute into a property, you just define some getter and setter that you "attach" to it, that will hook the data access. Then, you don't need to rewrite the rest of your code, the way for accessing the data is the same, whatever your attribute is a property or not.
Thanks to this very clever and powerful encapsulation mechanism, in Python you can usually go with attributes (without a priori encapsulation, so without any getter nor setter), unless you need to do special things when accessing the data.
If so, then you just can define setters and getters, only if needed, and "attach" them to the attribute, turning it into a property, without any incidence on the rest of your code (whereas in Java, the first thing you usually do when creating a field, usually private, is to create it's associated getter and setter method).
Nice page about attributes, properties and descriptors here
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Python (and Python C API): new versus init
I'm at college just now and the lecturer was using the terms constructors and initializers interchangeably. I'm pretty sure that this is wrong though.
I've tried googling the answer but not found the answer I'm looking for.
In most OO languages, they are the same step, so he's not wrong for things like java, c++, etc. In python they are done in two steps: __new__ is the constructor; __init__ is the initializer.
Here is another answer that goes into more detail about the differences between them.
In almost all usual cases, Python does not have constructors in the same sense used by other OO languages because manually managing memory is generally discouraged. Instead, what you should usually do is define an __init__ method on the class. This method is called to initialize the new instance object automatically, first thing after it is constructed. Thus, it is not really a constructor, and talking about it as a constructor might confuse some people.
Of course some people want to call it a constructor because it is used a little bit like a constructor - fundamentally you can call it whatever you want as long as everyone understands what you are actually referring to. But in general, to be explicit and make yourself understood, call it an init method or something other than a constructor. Fundamentally, different languages just come with somewhat different terminology and speaking very clearly will always require adjustment to your subject matter and audience.
In Python it is possible to manage instance creation and destruction at a finer granularity, though you won't want to unless you know what you're doing. This is done by defining __new__ and __del__ methods to hook object instantiation and del statements. Whether these qualify as constructors and destructors precisely is a little more debatable (Python docs call the del method a destructor, but tend to be vaguer on what constitutes a constructor, e.g. including many functions which return object instances). I'd still encourage you to use the specific terminology for the language at hand, and in comparative discussions to define your terms up front. As always, your choice of terms while speaking involves tradeoffs between the audience being able to easily follow you and the audience potentially being led into confusion: if you are talking about memory management probably be as specific as possible, but if you are talking loosely then just use some word your audience understands and be ready to clarify.
Your instructor is being unclear at worst, I'm not aware of any one canonical definition of these terms but they might cause confusion for people who have learned very specific definitions from other languages.
http://docs.python.org/reference/datamodel.html#basic-customization
__new__ - constructor.
__init__ - initializer.
Note: I'm not talking about preventing the rebinding of a variable. I'm talking about preventing the modification of the memory that the variable refers to, and of any memory that can be reached from there by following the nested containers.
I have a large data structure, and I want to expose it to other modules, on a read-only basis. The only way to do that in Python is to deep-copy the particular pieces I'd like to expose - prohibitively expensive in my case.
I am sure this is a very common problem, and it seems like a constant reference would be the perfect solution. But I must be missing something. Perhaps constant references are hard to implement in Python. Perhaps they don't quite do what I think they do.
Any insights would be appreciated.
While the answers are helpful, I haven't seen a single reason why const would be either hard to implement or unworkable in Python. I guess "un-Pythonic" would also count as a valid reason, but is it really? Python does do scrambling of private instance variables (starting with __) to avoid accidental bugs, and const doesn't seem to be that different in spirit.
EDIT: I just offered a very modest bounty. I am looking for a bit more detail about why Python ended up without const. I suspect the reason is that it's really hard to implement to work perfectly; I would like to understand why it's so hard.
It's the same as with private methods: as consenting adults authors of code should agree on an interface without need of force. Because really really enforcing the contract is hard, and doing it the half-assed way leads to hackish code in abundance.
Use get-only descriptors, and state clearly in your documentation that these data is meant to be read only. After all, a determined coder could probably find a way to use your code in different ways you thought of anyways.
In PEP 351, Barry Warsaw proposed a protocol for "freezing" any mutable data structure, analogous to the way that frozenset makes an immutable set. Frozen data structures would be hashable and so capable being used as keys in dictionaries.
The proposal was discussed on python-dev, with Raymond Hettinger's criticism the most detailed.
It's not quite what you're after, but it's the closest I can find, and should give you some idea of the thinking of the Python developers on this subject.
There are many design questions about any language, the answer to most of which is "just because". It's pretty clear that constants like this would go against the ideology of Python.
You can make a read-only class attribute, though, using descriptors. It's not trivial, but it's not very hard. The way it works is that you can make properties (things that look like attributes but call a method on access) using the property decorator; if you make a getter but not a setter property then you will get a read-only attribute. The reason for the metaclass programming is that since __init__ receives a fully-formed instance of the class, you actually can't set the attributes to what you want at this stage! Instead, you have to set them on creation of the class, which means you need a metaclass.
Code from this recipe:
# simple read only attributes with meta-class programming
# method factory for an attribute get method
def getmethod(attrname):
def _getmethod(self):
return self.__readonly__[attrname]
return _getmethod
class metaClass(type):
def __new__(cls,classname,bases,classdict):
readonly = classdict.get('__readonly__',{})
for name,default in readonly.items():
classdict[name] = property(getmethod(name))
return type.__new__(cls,classname,bases,classdict)
class ROClass(object):
__metaclass__ = metaClass
__readonly__ = {'a':1,'b':'text'}
if __name__ == '__main__':
def test1():
t = ROClass()
print t.a
print t.b
def test2():
t = ROClass()
t.a = 2
test1()
While one programmer writing code is a consenting adult, two programmers working on the same code seldom are consenting adults. More so if they do not value the beauty of the code but them deadlines or research funds.
For such adults there is some type safety, provided by Enthought's Traits.
You could look into Constant and ReadOnly traits.
For some additional thoughts, there is a similar question posed about Java here:
Why is there no Constant feature in Java?
When asking why Python has decided against constant references, I think it's helpful to think of how they would be implemented in the language. Should Python have some sort of special declaration, const, to create variable references that can't be changed? Why not allow variables to be declared a float/int/whatever then...these would surely help prevent programming bugs as well. While we're at it, adding class and method modifiers like protected/private/public/etc. would help enforce compile-type checking against illegal uses of these classes. ...pretty soon, we've lost the beauty, simplicity, and elegance that is Python, and we're writing code in some sort of bastard child of C++/Java.
Python also currently passes everything by reference. This would be some sort of special pass-by-reference-but-flag-it-to-prevent-modification...a pretty special case (and as the Tao of Python indicates, just "un-Pythonic").
As mentioned before, without actually changing the language, this type of behaviour can be implemented via classes & descriptors. It may not prevent modification from a determined hacker, but we are consenting adults. Python didn't necessarily decide against providing this as an included module ("batteries included") - there was just never enough demand for it.
Python has the idea of metaclasses that, if I understand correctly, allow you to modify an object of a class at the moment of construction. You are not modifying the class, but instead the object that is to be created then initialized.
Python (at least as of 3.0 I believe) also has the idea of class decorators. Again if I understand correctly, class decorators allow the modifying of the class definition at the moment it is being declared.
Now I believe there is an equivalent feature or features to the class decorator in Ruby, but I'm currently unaware of something equivalent to metaclasses. I'm sure you can easily pump any Ruby object through some functions and do what you will to it, but is there a feature in the language that sets that up like metaclasses do?
So again, Does Ruby have something similar to Python's metaclasses?
Edit I was off on the metaclasses for Python. A metaclass and a class decorator do very similar things it appears. They both modify the class when it is defined but in different manners. Hopefully a Python guru will come in and explain better on these features in Python.
But a class or the parent of a class can implement a __new__(cls[,..]) function that does customize the construction of the object before it is initialized with __init__(self[,..]).
Edit This question is mostly for discussion and learning about how the two languages compare in these features. I'm familiar with Python but not Ruby and was curious. Hopefully anyone else who has the same question about the two languages will find this post helpful and enlightening.
Ruby doesn't have metaclasses. There are some constructs in Ruby which some people sometimes wrongly call metaclasses but they aren't (which is a source of endless confusion).
However, there's a lot of ways to achieve the same results in Ruby that you would do with metaclasses. But without telling us what exactly you want to do, there's no telling what those mechanisms might be.
In short:
Ruby doesn't have metaclasses
Ruby doesn't have any one construct that corresponds to Python's metaclasses
Everything that Python can do with metaclasses can also be done in Ruby
But there is no single construct, you will use different constructs depending on what exactly you want to do
Any one of those constructs probably has other features as well that do not correspond to metaclasses (although they probably correspond to something else in Python)
While you can do anything in Ruby that you can do with metaclasses in Python, it might not necessarily be straightforward
Although often there will be a more Rubyish solution that is elegant
Last but not least: while you can do anything in Ruby that you can do with metaclasses in Python, doing it might not necessarily be The Ruby Way
So, what are metaclasses exactly? Well, they are classes of classes. So, let's take a step back: what are classes exactly?
Classes …
are factories for objects
define the behavior of objects
define on a metaphysical level what it means to be an instance of the class
For example, the Array class produces array objects, defines the behavior of arrays and defines what "array-ness" means.
Back to metaclasses.
Metaclasses …
are factories for classes
define the behavior of classes
define on a metaphysical level what it means to be a class
In Ruby, those three responsibilities are split across three different places:
the Class class creates classes and defines a little bit of the behavior
the individual class's eigenclass defines a little bit of the behavior of the class
the concept of "classness" is hardwired into the interpreter, which also implements the bulk of the behavior (for example, you cannot inherit from Class to create a new kind of class that looks up methods differently, or something like that – the method lookup algorithm is hardwired into the interpreter)
So, those three things together play the role of metaclasses, but neither one of those is a metaclass (each one only implements a small part of what a metaclass does), nor is the sum of those the metaclass (because they do much more than that).
Unfortunately, some people call eigenclasses of classes metaclasses. (Until recently, I was one of those misguided souls, until I finally saw the light.) Other people call all eigenclasses metaclasses. (Unfortunately, one of those people is the author of one the most popular tutorials on Ruby metaprogramming and the Ruby object model.) Some popular libraries add a metaclass method to Object that returns the object's eigenclass (e.g. ActiveSupport, Facets, metaid). Some people call all virtual classes (i.e. eigenclasses and include classes) metaclasses. Some people call Class the metaclass. Even within the Ruby source code itself, the word "metaclass" is used to refer to things that are not metaclasses.
Your updated question looks quite different now. If I understand you correctly, you want to hook into object allocation and initialization, which has absolutely nothing whatsoever to do with metaclasses. (But you still don't write what it is that you actually want to do, so I might still be off.)
In some object-oriented languages, objects are created by constructors. However, Ruby doesn't have constructors. Constructors are just factory methods (with stupid restrictions); there is no reason to have them in a well-designed language, if you can just use a (more powerful) factory method instead.
Object construction in Ruby works like this: object construction is split into two phases, allocation and initialization. Allocation is done by a public class method called allocate, which is defined as an instance method of class Class and is generally never overriden. (In fact, I don't think you actually can override it.) It just allocates the memory space for the object and sets up a few pointers, however, the object is not really usable at this point.
That's where the initializer comes in: it is an instance method called initialize, which sets up the object's internal state and brings it into a consistent, fully defined state which can be used by other objects.
So, in order to fully create a new object, what you need to do is this:
x = X.allocate
x.initialize
[Note: Objective-C programmers may recognize this.]
However, because it is too easy to forget to call initialize and as a general rule an object should be fully valid after construction, there is a convenience factory method called Class#new, which does all that work for you and looks something like this:
class Class
def new(*args, &block)
obj = allocate
obj.initialize(*args, &block)
return obj
end
end
[Note: actually, initialize is private, so reflection has to be used to circumvent the access restrictions like this: obj.send(:initialize, *args, &block)]
That, by the way, is the reason why to construct an object you call a public class method Foo.new but you implement a private instance method Foo#initialize, which seems to trip up a lot of newcomers.
However, none of this is in any way baked into the language. The fact that the primary factory method for any class is usually called new is just a convention (and sometimes I wish it were different, because it looks similar to constructors in Java, but is completely different). In other languages, the constructor must have a specific name. In Java, it must have the same name as the class, which means that a) there can be only one constructor and b) anonymous classes can't have constructors because they don't have names. In Python, the factory method must be called __new__, which again means there can be only one. (In both Java and Python, you can of course have different factory methods, but calling them looks different from calling the default, while in Ruby (and Smalltalk from whence this pattern originated) it looks just the same.)
In Ruby, there can be as many factory methods as you like, with any name you like, and a factory method can have many different names. (For collection classes, for example, the factory method is often aliased to [], which allows you to write List[1, 2, 3] instead of List.new(1, 2, 3) which ends looking more like an array, thus emphasizing the collection-ish nature of lists.)
In short:
the standardized factory method is Foo.new, but it can be anything
Foo.new calls allocate to allocate memory for an empty object foo
Foo.new then calls foo.initialize, i.e. the Foo#initialize instance method
all three of those are just methods like any other, which you can undefine, redefine, override, wrap, alias and whatnot
well, except allocate which needs to allocate memory inside the Ruby runtime which you can't really do from Ruby
In Python, __new__ roughly corresponds to both new and allocate in Ruby, and __init__ exactly corresponds to initialize in Ruby. The main difference is that in Ruby, new calls initialize whereas in Python, the runtime automatically calls __init__ after __new__.
For example, here is a class which only allows a maximum of 2 instances created:
class Foo
def self.new(*args, &block)
#instances ||= 0
raise 'Too many instances!' if #instances >= 2
obj = allocate
obj.send(:initialize, *args, &block)
#instances += 1
return obj
end
attr_reader :name
def initialize(name)
#name = name
end
end
one = Foo.new('#1')
two = Foo.new('#2')
puts two.name # => #2
three = Foo.new('#3') # => RuntimeError: Too many instances!