Why is everything in Python, an object? According to what I read, everything including functions is an object. It's not the same in other languages. So what prompted this shift of approach, to treat everything including, even functions, as objects.
The power of everything being an object is that you can define behavior for each object. For example a function being an object gives you an easy way to access the docs of the function for introspection.
print( function.__doc__ )
The alternative would be to provide a library of function that took
a function and returned its interesting properties.
import function_lib
print( function_lib.get_doc( function )
Making int, str etc classes means that you can extend those provide types
in interesting ways for your problem domain.
In my opinion, the 'Everything is object' is great in Python. In this language, you don't react to what are the objects you have to handle, but how they can interact. A function is just an object that you can __call__, a list is just an object that you can __iter__. But why should we divide data in non overlapping groups. An object can behave like a function when we call it, but also like an array when we access it.
This means that you don't think your "function" like, "i want an array of integers and i return the sum of it" but more "i will try to iterate over the thing that someone gave me and try to add them together, if something goes wrong, i will tell it to the caller by raiseing error and he will hate to modify his behavior".
The most interesting exemple is __add__. When you try something like Object1 + Object2, Python will ask (nicely ^^) to Object1 to try to add himself with object2 (Object1.__add__(Object2)). There is 2 scenarios here: either Oject1 knows how to add himself to Object2 and everything is fine, either he raises a NotImplemented error and Python will ask to Object2 to radd himself to Object1. Just with this mechanism, you can teach to your object to add themselves with any other object, you can manage commutativity,...
why is everything in Python, an object?
Python (unlike other languages) is a truly Object Orient language (aka OOP)
when everything is an object, it becomes easier to search, manipulate or access things. (But everything comes at the cost of speed)
what prompted this shift of approach, to treat everything including, even functions, as objects?
"Necessity is the mother of invention"
Related
I currently need to partially create a Python object and be able to update it for some time. Although, I must not be able to update it once I used the object as a dictionary key.
Of course there is the solution of marking the fields as private, which is mostly a warning for the programmer, and I will actually go for that solution.
But I stumbled on another solution and I want to know if this could be a good idea, or if it could simply go horribly wrong. Here it is:
class Foo():
def __init__(self, bar):
self._bar = bar
self._has_been_hashed = False
def __hash__(self):
self._has_been_hashed = True
return self._bar.__hash__()
def __eq__(self, other):
return self._bar == other._bar
def __copy__(self):
return Foo(self._bar)
def set_bar(self, bar):
if self.has_been_hashed:
raise FooIsNowImmutable
else:
self._bar = bar
Some testing proved it to work as desired, I can no longer use set_bar once I, say, used my object as a dictionary key.
What do you think? Is it a good idea? Will it turn against me? Is there an easier way? And is it somehow a bad practice?
Doing it that way is a bit fragile, since you never know when something might be used as a dictionary key, or when its hash might be called for some other reason. An object isn't supposed to "know" whether it's being used as a dictionary key. It will be confusing to have code that may raise an exception just because some other code somewhere else put the object in a dictionary.
Following the Python philosophy of "explicit is better than implicit", it would be safer to just give your object a method called .finalize() or .lock() or something, which would set a flag indicating the object is immutable. You could also reverse the exception-raising logic, so that __hash__ raises an exception if the object is not yet locked (rather than mutation raising an exception if the object has been hashed).
You would then call .lock() when you're ready to make the object immutable. It makes more sense to explicitly set it immutable when you're done with whatever mutating you need to do, rather than implicitly assuming that as soon as you use it in a dictionary, you're done mutating it.
You can do that, but I'm not sure I'd recommend it. Why do you need it in a dictionary?
It requires a lot more awareness of the state of the object... think a file object. Would you put one in a dictionary? It has to be opened for a lot of the functions to work, and once it's closed, you can't do them anymore. The user has to be aware in the surrounding code which state the object is in.
For files, that makes sense - after all, you don't normally hold files open across large parts of your program, or if you do, they have very defined init and close codes; something similar has to make sense for your object. Especially if you have some APIs that take the object, but expect an immutable version, and others that take the same object, but expect to change it...
I have used the lock method before, and it works well for complex, read-only objects that you want to initialize once and then make sure no one is messing with. E.G. you load a copy of a (say, English) dictionary from disk... it has to be mutable while you are populating it, but you don't want anyone to accidentally modify it, so locking it is a great idea. I would only use it if it was a one-time lock though - something you are locking and unlocking seems like a recipe for disaster.
There are two solutions IMHO if you just want to create a version you can use in hashable places. First is to explicitly create an immutable copy when you put it in a dictionary - tuple and frozenset are examples of this sort of behaviour... if you want to put a list in a dict, you can't, but you can create a tuple from it first, and that can be hashed. Create a frozen version of your object, then it's very clear by looking at the object type whether it's expected to be mutable or immutable, and so cases where it was used incorrectly are easily seen.
Second, if you really want it to be hashable, but need it to be mutable... that's actually legal, but implemented a little different. It goes back to the idea of hashing... hashing is used both for optimized lookups, and equality.
The first is to ensure you can get objects back... you put something in a dictionary, and it hashes to a value of 4 - goes in slot 4. Then you modify it. Then you go to look it up again, and now it hashes to 9 - there's nothing in slot 9, or worse, a different object, and you're broken.
Second is equality - for things like sets, I need to know if my object is already in there. I can hash, but if you know anything about hashing, you still need to check equality to check for hash collisions.
That doesn't preclude supporting __hash__ and being mutable, but it's unusual. You need to decide for your item what makes it the same, even though it's mutable. What you need to do then is give each object a unique id. Technically, you may be able to get away with id(self), but something like the uuid module is probably a better possibility. The UUID4 (or technically, the hash of the UUID4) is what determines both the hash and equality; two objects that contain the same UUID4 should be the exact same object; two objects that have the exact same data but a different UUID4 would be different object.
I never realized just how poor a programmer I was until I came across this exercise below. I am to write a Python file that allows all of the tests below to pass without error.
I believe the file I write needs to be a class, but I have absolutely no idea what should be in my class. I know what the question is asking, but not how to make classes or to respond to the calls to the class with the appropriate object(s).
Please review the exercise code below, and then see my questions at the end.
File with tests:
import unittest
from allergies import Allergies
class AllergiesTests(unittest.TestCase):
def test_ignore_non_allergen_score_parts(self):
self.assertEqual(['eggs'], Allergies(257).list)
if __name__ == '__main__':
unittest.main()
1) I don't understand the "list" method at the end of this assertion. Is it the the Built-In Python function "list()," or is it an attribute that I need to define in my "Allergies" class?
2) What type of object is "Allergies(257).list"
self.assertEqual(['eggs'], Allergies(257).list)
3) Do I approach this by defining something like the following?
def list(self):
list_of_allergens = ['eggs','pollen','cat hair', 'shellfish']
return list_of_allergens[0]
1) I don't understand the "list" method at the end of this assertion. Is it the the Built-In Python function "list()," or is it an attribute that I need to define in my "Allergies" class?
From the ., you can tell that it's an attribute that you need to define on your Allergies class—or, rather, on each of its instances.*
2) What type of object is "Allergies(257).list"
Well, what is it supposed to compare equal to? ['eggs'] is a list of strings (well, of string). So, unless you're going to create a custom type that likes to compare equal to lists, you need a list.
3) Do I approach this by defining something like the following?
def list(self):
list_of_allergens = ['eggs','pollen','cat hair', 'shellfish']
return ist_of_allergens
No. You're on the wrong track right off the bat. This will make Allergies(257).list into a method. Even if that method returns a list when it's called, the test driver isn't calling it. It has to be a list. (Also, more obviously, ['eggs','pollen','cat hair', 'shellfish'] is not going to compare equal to ['eggs'], and ist_of_allergens isn't the same thing as list_of_allergens.)
So, where is that list going to come from? Well, your class is going to need to assign something to self.list somewhere. And, since the only code from your class that's getting called is your constructor (__new__) and initializer (__init__), that "somewhere" is pretty limited. And you probably haven't learned about __new__ yet, which means you have a choice of one place, which makes it pretty simple.
* Technically, you could use a class attribute here, but that seems less likely to be what they're looking for. For that matter, Allergies doesn't even have to be a class; it could be a function that just defines a new type on the fly, constructs it, and adds list to its dict. But both PEP 8 naming standards and "don't make things more complex for no good reason" both point to wanting a class here.
From how it's used, list is an attribute of the object returned by Allergies, which may be a function that returns an object or simply the call to construct an object of type Allergies. In this last case, the whole thing can be easily implemented as:
class Allergies:
def __init__(self, n):
# probably you should do something more
# interesting with n
if n==257:
self.list=['eggs']
This looks like one of the exercises from exercism.io.
I have completed the exercise, so I know what's involved.
'list' is supposed to be a class attribute of the class Allergies, and is itself an object of type list. At least that's one straight-forward way of dealing with it. I defined it in the __init__ method of the class. In my opinion, it's confusing that they called it 'list', as this clashes with Pythons list type.
snippet from my answer:
class Allergies(object):
allergens = ["eggs", "peanuts",
"shellfish", "strawberries",
"tomatoes", "chocolate",
"pollen","cats"]
def __init__(self, score):
# score_breakdown returns a list
self.list = self.score_breakdown(score) # let the name of this function be a little clue ;)
If I were you I would go and do some Python tutorials. I would start with basics, even if it feels like you are covering ground you already travelled. It's absolutely worth knowing your basics/fundamentals as solidly as possible. For this, I could recommend Udacity or codeacademy.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Python: Get object by id
Typically when you see the string representation of an instance it looks something like <module.space.Class # 0x108181bc>. I am curious if it's possible to take this string and get a handle on the instance.
Something like
obj_instance = get_instance_form_repr("<module.space.Class # 0x1aa031b>")
I don't believe it could be possible but if it is, it would be really useful.
You can do it for the subset that is tracked by the garbage collector at least, in a nasty and unreliable way.
def lookup_object_by_repr(myrep)
import gc
for obj in gc.get_objects():
if repr(obj) == myrep:
return obj
You can do even more things if you write a simple C extension and inspect the memory address.
Of course there's no way to get an object from its repr, because any class can return anything it wants in its repr. But in the specific case where the repr has a pointer in it, you're basically asking how to get an object from a pointer. Which (in C Python, at least) is the exact same thing as asking how to get an object from its id.
There's no built-in way to do this, even though it would be pretty simple. And that's intentional, because this is almost always a bad idea.
If you think you have a practical purpose for getting an object by id, you're probably wrong. Especially since you have to deal with object lifecycle issues that are usually taken care of automatically (and behind your back, which makes it hard to take care of them manually even if you try). For example, if an object goes away, a weakref to that object nulls out, but the id is still pointing—to deallocated memory, or another object created later, or half of one object and half of another. And then of course whatever you do in C Python isn't going to work in PyPy or jython or IronPython…
But if you're just tinkering around to learn how the C Python runtime works, then this is a legitimate question, and there's a legitimate answer. Probably the best way to do it is to create a C extension module. If you don't know how to create an extension module, you really ought to learn that first, and come back to this question later. If you do, the function you want to implement is pretty simple:
static PyObject *objectFromId(PyObject *self, PyObject *args) {
PyObject *obj;
if (!PyArg_ParseTuple(args, "n", &obj)) return NULL;
Py_INCREF(obj);
return obj;
}
You could do this in Pyrex/Cython instead of C, if you wanted. Or you could just call PyObj_FromPtr directly on the _ctypes module. But if you're trying to learn how things work at this level, it makes more sense to make what's happening explicit, and to explicitly put your "dealing with C pointers" code in C.
On further thought, it you wanted to build a sort of best-guess objectFromRepr for tinkering purposes, you could. Basically:
Use a regex that matches the common angle-bracket form; if it matches, call objectFromId with the address.
Call eval.
You might want to put an intermediate step in there: for many types, the repr is in the form of a constructor call like Class(arg1, arg2), and in that case it might be better to match that form with another regex and call the constructor directly, instead of using eval. If nothing else, it's probably more instructive, and that's the point of this exercise, right?
Obviously this is an even more terrible idea in real-life code than objectFromId. It's almost always bad to use eval, and eval(repr(x)) == x is not actually guaranteed to be true, and you're actually getting a new reference to the same object in the id case but a new object with the same value in the eval case, and…
PS, After you learn how all of this works in C Python, it's probably worth doing a similar exercise in another interpreter like jython or IronPython (especially when you repr a native Java/.NET/etc. object).
Is it common in Python to keep testing for type values when working in a OOP fashion?
class Foo():
def __init__(self,barObject):
self.bar = setBarObject(barObject)
def setBarObject(barObject);
if (isInstance(barObject,Bar):
self.bar = barObject
else:
# throw exception, log, etc.
class Bar():
pass
Or I can use a more loose approach, like:
class Foo():
def __init__(self,barObject):
self.bar = barObject
class Bar():
pass
Nope, in fact it's overwhelmingly common not to test for type values, as in your second approach. The idea is that a client of your code (i.e. some other programmer who uses your class) should be able to pass any kind of object that has all the appropriate methods or properties. If it doesn't happen to be an instance of some particular class, that's fine; your code never needs to know the difference. This is called duck typing, because of the adage "If it quacks like a duck and flies like a duck, it might as well be a duck" (well, that's not the actual adage but I got the gist of it I think)
One place you'll see this a lot is in the standard library, with any functions that handle file input or output. Instead of requiring an actual file object, they'll take anything that implements the read() or readline() method (depending on the function), or write() for writing. In fact you'll often see this in the documentation, e.g. with tokenize.generate_tokens, which I just happened to be looking at earlier today:
The generate_tokens() generator requires one argument, readline, which must be a callable object which provides the same interface as the readline() method of built-in file objects (see section File Objects). Each call to the function should return one line of input as a string.
This allows you to use a StringIO object (like an in-memory file), or something wackier like a dialog box, in place of a real file.
In your own code, just access whatever properties of an object you need, and if it's the wrong kind of object, one of the properties you need won't be there and it'll throw an exception.
I think that it's good practice to check input for type. It's reasonable to assume that if you asked a user to give one data type they might give you another, so you should code to defend against this.
However, it seems like a waste of time (both writing and running the program) to check the type of input that the program generates independent of input. As in a strongly-typed language, checking type isn't important to defend against programmer error.
So basically, check input but nothing else so that code can run smoothly and users don't have to wonder why they got an exception rather than a result.
If your alternative to the type check is an else containing exception handling, then you should really consider duck typing one tier up, supporting as many objects with the methods you require from the input, and working inside a try.
You can then except (and except as specifically as possible) that.
The final result wouldn't be unlike what you have there, but a lot more versatile and Pythonic.
Everything else that needed to be said about the actual question, whether it's common/good practice or not, I think has been answered excellently by David's.
I agree with some of the above answers, in that I generally never check for type from one function to another.
However, as someone else mentioned, anything accepted from a user should be checked, and for things like this I use regular expressions. The nice thing about using regular expressions to validate user input is that not only can you verify that the data is in the correct format, but you can parse the input into a more convenient form, like a string into a dictionary.
example:
a_list = [1, 2, 3]
a_list.len() # doesn't work
len(a_list) # works
Python being (very) object oriented, I don't understand why the 'len' function isn't inherited by the object.
Plus I keep trying the wrong solution since it appears as the logical one to me
Guido's explanation is here:
First of all, I chose len(x) over x.len() for HCI reasons (def __len__() came much later). There are two intertwined reasons actually, both HCI:
(a) For some operations, prefix notation just reads better than postfix — prefix (and infix!) operations have a long tradition in mathematics which likes notations where the visuals help the mathematician thinking about a problem. Compare the easy with which we rewrite a formula like x*(a+b) into x*a + x*b to the clumsiness of doing the same thing using a raw OO notation.
(b) When I read code that says len(x) I know that it is asking for the length of something. This tells me two things: the result is an integer, and the argument is some kind of container. To the contrary, when I read x.len(), I have to already know that x is some kind of container implementing an interface or inheriting from a class that has a standard len(). Witness the confusion we occasionally have when a class that is not implementing a mapping has a get() or keys() method, or something that isn’t a file has a write() method.
Saying the same thing in another way, I see ‘len‘ as a built-in operation. I’d hate to lose that. /…/
The short answer: 1) backwards compatibility and 2) there's not enough of a difference for it to really matter. For a more detailed explanation, read on.
The idiomatic Python approach to such operations is special methods which aren't intended to be called directly. For example, to make x + y work for your own class, you write a __add__ method. To make sure that int(spam) properly converts your custom class, write a __int__ method. To make sure that len(foo) does something sensible, write a __len__ method.
This is how things have always been with Python, and I think it makes a lot of sense for some things. In particular, this seems like a sensible way to implement operator overloading. As for the rest, different languages disagree; in Ruby you'd convert something to an integer by calling spam.to_i directly instead of saying int(spam).
You're right that Python is an extremely object-oriented language and that having to call an external function on an object to get its length seems odd. On the other hand, len(silly_walks) isn't any more onerous than silly_walks.len(), and Guido has said that he actually prefers it (http://mail.python.org/pipermail/python-3000/2006-November/004643.html).
It just isn't.
You can, however, do:
>>> [1,2,3].__len__()
3
Adding a __len__() method to a class is what makes the len() magic work.
This way fits in better with the rest of the language. The convention in python is that you add __foo__ special methods to objects to make them have certain capabilities (rather than e.g. deriving from a specific base class). For example, an object is
callable if it has a __call__ method
iterable if it has an __iter__ method,
supports access with [] if it has __getitem__ and __setitem__.
...
One of these special methods is __len__ which makes it have a length accessible with len().
Maybe you're looking for __len__. If that method exists, then len(a) calls it:
>>> class Spam:
... def __len__(self): return 3
...
>>> s = Spam()
>>> len(s)
3
Well, there actually is a length method, it is just hidden:
>>> a_list = [1, 2, 3]
>>> a_list.__len__()
3
The len() built-in function appears to be simply a wrapper for a call to the hidden len() method of the object.
Not sure why they made the decision to implement things this way though.
there is some good info below on why certain things are functions and other are methods. It does indeed cause some inconsistencies in the language.
http://mail.python.org/pipermail/python-dev/2008-January/076612.html