Extending weakref proxy/Copying behaviour - python

I have a class holding a table (list of lists). This class should return a rowpointer similar to sql. For this row pointer I would like to week ref the table row (a list) with a weakref.proxy. However, I would like to add additional capabilities to a row pointer, e.g. overwrite the __getitem__ method to allow access via, say the column names.
Is there an easy way to get the same behaviour (translating access to my object to the object beeing referenced), or do I have to reimplement all the special methods?
As an easy way I could think of inheritance (but since I found no doc on weakref.ProxyType I wont even try to inherit from that, (how to init?). The other option could be to define some special method even to always redirect "special" (__xxx__) function calls to the referred object, even though this makes that seem impossible.

Iresearched some more and found out this:
http://code.activestate.com/recipes/496741-object-proxying/
http://pypi.python.org/pypi/ProxyTypes
So in short, one can forward all calls (I think the recipi on active state is better), but I have not found a way to implement:
$a = proxy([1,2,3])
$b = a
$print type(b)
>>list
I will settle for just working with an object wich pretty much behaves like the list.

Related

Safe way to create an "out of band" alternative value for a function argument in Python

I have a function like the following:
def do_something(thing):
pass
def foo(everything, which_things):
"""Do stuff to things.
Args:
everything: Dict of things indexed by thing identifiers.
which_things: Iterable of thing identifiers. Something is only
done to these things.
"""
for thing_identifier in which_things:
do_something(everything[thing_identifier])
But I want to extend it so that a caller can do_something with everything passed in everything without having to provide a list of identifiers. (As a motivation, if everything was an opaque container whose keys weren't accessible to library users but only library internals. In this case, foo can access the keys but the caller can't. Another motivation is error prevention: having a constant with obvious semantics avoids a caller mistakenly passing the wrong set of identifiers in.) So one thought is to have a constant USE_EVERYTHING that can be passed in, like so:
def foo(everything, which_things):
"""Do stuff to things.
Args:
everything: Dict of things indexed by thing identifiers.
which_things: Iterable of thing identifiers. Something is only
done to these things. Alternatively pass USE_EVERYTHING to
do something to everything.
"""
if which_things == USE_EVERYTHING:
which_things = everything.keys()
for thing_identifier in which_things:
do_something(everything[thing_identifier])
What are some advantages and limitations of this approach? How can I define a USE_EVERYTHING constant so that it is unique and specific to this function?
My first thought is to give it its own type, like so:
class UseEverythingType:
pass
USE_EVERYTHING = UseEverythingType()
This would be in a package only exporting USE_EVERYTHING, discouraging creating any other UseEverythingType objects. But I worry that I'm not considering all aspects of Python's object model -- could two instances of USE_EVERYTHING somehow compare unequal?

Approach behind having everything as an object in Python

Why is everything in Python, an object? According to what I read, everything including functions is an object. It's not the same in other languages. So what prompted this shift of approach, to treat everything including, even functions, as objects.
The power of everything being an object is that you can define behavior for each object. For example a function being an object gives you an easy way to access the docs of the function for introspection.
print( function.__doc__ )
The alternative would be to provide a library of function that took
a function and returned its interesting properties.
import function_lib
print( function_lib.get_doc( function )
Making int, str etc classes means that you can extend those provide types
in interesting ways for your problem domain.
In my opinion, the 'Everything is object' is great in Python. In this language, you don't react to what are the objects you have to handle, but how they can interact. A function is just an object that you can __call__, a list is just an object that you can __iter__. But why should we divide data in non overlapping groups. An object can behave like a function when we call it, but also like an array when we access it.
This means that you don't think your "function" like, "i want an array of integers and i return the sum of it" but more "i will try to iterate over the thing that someone gave me and try to add them together, if something goes wrong, i will tell it to the caller by raiseing error and he will hate to modify his behavior".
The most interesting exemple is __add__. When you try something like Object1 + Object2, Python will ask (nicely ^^) to Object1 to try to add himself with object2 (Object1.__add__(Object2)). There is 2 scenarios here: either Oject1 knows how to add himself to Object2 and everything is fine, either he raises a NotImplemented error and Python will ask to Object2 to radd himself to Object1. Just with this mechanism, you can teach to your object to add themselves with any other object, you can manage commutativity,...
why is everything in Python, an object?
Python (unlike other languages) is a truly Object Orient language (aka OOP)
when everything is an object, it becomes easier to search, manipulate or access things. (But everything comes at the cost of speed)
what prompted this shift of approach, to treat everything including, even functions, as objects?
"Necessity is the mother of invention"

Python pseudo-immutable object field

I currently need to partially create a Python object and be able to update it for some time. Although, I must not be able to update it once I used the object as a dictionary key.
Of course there is the solution of marking the fields as private, which is mostly a warning for the programmer, and I will actually go for that solution.
But I stumbled on another solution and I want to know if this could be a good idea, or if it could simply go horribly wrong. Here it is:
class Foo():
def __init__(self, bar):
self._bar = bar
self._has_been_hashed = False
def __hash__(self):
self._has_been_hashed = True
return self._bar.__hash__()
def __eq__(self, other):
return self._bar == other._bar
def __copy__(self):
return Foo(self._bar)
def set_bar(self, bar):
if self.has_been_hashed:
raise FooIsNowImmutable
else:
self._bar = bar
Some testing proved it to work as desired, I can no longer use set_bar once I, say, used my object as a dictionary key.
What do you think? Is it a good idea? Will it turn against me? Is there an easier way? And is it somehow a bad practice?
Doing it that way is a bit fragile, since you never know when something might be used as a dictionary key, or when its hash might be called for some other reason. An object isn't supposed to "know" whether it's being used as a dictionary key. It will be confusing to have code that may raise an exception just because some other code somewhere else put the object in a dictionary.
Following the Python philosophy of "explicit is better than implicit", it would be safer to just give your object a method called .finalize() or .lock() or something, which would set a flag indicating the object is immutable. You could also reverse the exception-raising logic, so that __hash__ raises an exception if the object is not yet locked (rather than mutation raising an exception if the object has been hashed).
You would then call .lock() when you're ready to make the object immutable. It makes more sense to explicitly set it immutable when you're done with whatever mutating you need to do, rather than implicitly assuming that as soon as you use it in a dictionary, you're done mutating it.
You can do that, but I'm not sure I'd recommend it. Why do you need it in a dictionary?
It requires a lot more awareness of the state of the object... think a file object. Would you put one in a dictionary? It has to be opened for a lot of the functions to work, and once it's closed, you can't do them anymore. The user has to be aware in the surrounding code which state the object is in.
For files, that makes sense - after all, you don't normally hold files open across large parts of your program, or if you do, they have very defined init and close codes; something similar has to make sense for your object. Especially if you have some APIs that take the object, but expect an immutable version, and others that take the same object, but expect to change it...
I have used the lock method before, and it works well for complex, read-only objects that you want to initialize once and then make sure no one is messing with. E.G. you load a copy of a (say, English) dictionary from disk... it has to be mutable while you are populating it, but you don't want anyone to accidentally modify it, so locking it is a great idea. I would only use it if it was a one-time lock though - something you are locking and unlocking seems like a recipe for disaster.
There are two solutions IMHO if you just want to create a version you can use in hashable places. First is to explicitly create an immutable copy when you put it in a dictionary - tuple and frozenset are examples of this sort of behaviour... if you want to put a list in a dict, you can't, but you can create a tuple from it first, and that can be hashed. Create a frozen version of your object, then it's very clear by looking at the object type whether it's expected to be mutable or immutable, and so cases where it was used incorrectly are easily seen.
Second, if you really want it to be hashable, but need it to be mutable... that's actually legal, but implemented a little different. It goes back to the idea of hashing... hashing is used both for optimized lookups, and equality.
The first is to ensure you can get objects back... you put something in a dictionary, and it hashes to a value of 4 - goes in slot 4. Then you modify it. Then you go to look it up again, and now it hashes to 9 - there's nothing in slot 9, or worse, a different object, and you're broken.
Second is equality - for things like sets, I need to know if my object is already in there. I can hash, but if you know anything about hashing, you still need to check equality to check for hash collisions.
That doesn't preclude supporting __hash__ and being mutable, but it's unusual. You need to decide for your item what makes it the same, even though it's mutable. What you need to do then is give each object a unique id. Technically, you may be able to get away with id(self), but something like the uuid module is probably a better possibility. The UUID4 (or technically, the hash of the UUID4) is what determines both the hash and equality; two objects that contain the same UUID4 should be the exact same object; two objects that have the exact same data but a different UUID4 would be different object.

Duck typing trouble. Duck typing test for "i-am-like-a-list"

USAGE CONTEXT ADDED AT END
I often want to operate on an abstract object like a list. e.g.
def list_ish(thing):
for i in xrange(0,len(thing)):
print thing[i]
Now this appropriate if thing is a list, but will fail if thing is a dict for example. what is the pythonic why to ask "do you behave like a list?"
NOTE:
hasattr('__getitem__') and not hasattr('keys')
this will work for all cases I can think of, but I don't like defining a duck type negatively, as I expect there could be cases that it does not catch.
really what I want is to ask.
"hey do you operate on integer indicies in the way I expect a list to do?" e.g.
thing[i], thing[4:7] = [...], etc.
NOTE: I do not want to simply execute my operations inside of a large try/except, since they are destructive. it is not cool to try and fail here....
USAGE CONTEXT
-- A "point-lists" is a list-like-thing that contains dict-like-things as its elements.
-- A "matrix" is a list-like-thing that contains list-like-things
-- I have a library of functions that operate on point-lists and also in an analogous way on matrix like things.
-- for example, From the users point of view destructive operations like the "spreadsheet-like" operations "column-slice" can operate on both matrix objects and also on point-list objects in an analogous way -- the resulting thing is like the original one, but only has the specified columns.
-- since this particular operation is destructive it would not be cool to proceed as if an object were a matrix, only to find out part way thru the operation, it was really a point-list or none-of-the-above.
-- I want my 'is_matrix' and 'is_point_list' tests to be performant, since they sometimes occur inside inner loops. So I would be satisfied with a test which only investigated element zero for example.
-- I would prefer tests that do not involve construction of temporary objects, just to determine an object's type, but maybe that is not the python way.
in general I find the whole duck typing thing to be kinda messy, and fraught with bugs and slowness, but maybe I dont yet think like a true Pythonista
happy to drink more kool-aid...
One thing you can do, that should work quickly on a normal list and fail on a normal dict, is taking a zero-length slice from the front:
try:
thing[:0]
except TypeError:
# probably not list-like
else:
# probably list-like
The slice fails on dicts because slices are not hashable.
However, str and unicode also pass this test, and you mention that you are doing destructive edits. That means you probably also want to check for __delitem__ and __setitem__:
def supports_slices_and_editing(thing):
if hasattr(thing, '__setitem__') and hasattr(thing, '__delitem__'):
try:
thing[:0]
return True
except TypeError:
pass
return False
I suggest you organize the requirements you have for your input, and the range of possible inputs you want your function to handle, more explicitly than you have so far in your question. If you really just wanted to handle lists and dicts, you'd be using isinstance, right? Maybe what your method does could only ever delete items, or only ever replace items, so you don't need to check for the other capability. Document these requirements for future reference.
When dealing with built-in types, you can use the Abstract Base Classes. In your case, you may want to test against collections.Sequence or collections.MutableSequence:
if isinstance(your_thing, collections.Sequence):
# access your_thing as a list
This is supported in all Python versions after (and including) 2.6.
If you are using your own classes to build your_thing, I'd recommend that you inherit from these abstract base classes as well (directly or indirectly). This way, you can ensure that the sequence interface is implemented correctly, and avoid all the typing mess.
And for third-party libraries, there's no simple way to check for a sequence interface, if the third-party classes didn't inherit from the built-in types or abstract classes. In this case you'll have to check for every interface that you're going to use, and only those you use. For example, your list_ish function used __len__ and __getitem__, so only check whether these two methods exist. A wrong behavior of __getitem__ (e.g. a dict) should raise an exception.
Perhaps their is no ideal pythonic answer here, so I am proposing a 'hack' solution, but don't know enough about the class structure of python to know if I am getting this right:
def is_list_like(thing):
return hasattr(thing, '__setslice__')
def is_dict_like(thing):
return hasattr(thing, 'keys')
My reduce goals here are to simply have performant tests that will:
(1) never call a dict-thing, nor a string-like-thing a list List item
(2) returns the right answer for python types
(3) will return the right answer if someone implement a "full" set of core method for a list/dict
(4) is fast (ideally does not allocate objects during the test)
EDIT: Incorporated ideas from #DanGetz

Obtaining an object's instance based on its string repr (Python) [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Python: Get object by id
Typically when you see the string representation of an instance it looks something like <module.space.Class # 0x108181bc>. I am curious if it's possible to take this string and get a handle on the instance.
Something like
obj_instance = get_instance_form_repr("<module.space.Class # 0x1aa031b>")
I don't believe it could be possible but if it is, it would be really useful.
You can do it for the subset that is tracked by the garbage collector at least, in a nasty and unreliable way.
def lookup_object_by_repr(myrep)
import gc
for obj in gc.get_objects():
if repr(obj) == myrep:
return obj
You can do even more things if you write a simple C extension and inspect the memory address.
Of course there's no way to get an object from its repr, because any class can return anything it wants in its repr. But in the specific case where the repr has a pointer in it, you're basically asking how to get an object from a pointer. Which (in C Python, at least) is the exact same thing as asking how to get an object from its id.
There's no built-in way to do this, even though it would be pretty simple. And that's intentional, because this is almost always a bad idea.
If you think you have a practical purpose for getting an object by id, you're probably wrong. Especially since you have to deal with object lifecycle issues that are usually taken care of automatically (and behind your back, which makes it hard to take care of them manually even if you try). For example, if an object goes away, a weakref to that object nulls out, but the id is still pointing—to deallocated memory, or another object created later, or half of one object and half of another. And then of course whatever you do in C Python isn't going to work in PyPy or jython or IronPython…
But if you're just tinkering around to learn how the C Python runtime works, then this is a legitimate question, and there's a legitimate answer. Probably the best way to do it is to create a C extension module. If you don't know how to create an extension module, you really ought to learn that first, and come back to this question later. If you do, the function you want to implement is pretty simple:
static PyObject *objectFromId(PyObject *self, PyObject *args) {
PyObject *obj;
if (!PyArg_ParseTuple(args, "n", &obj)) return NULL;
Py_INCREF(obj);
return obj;
}
You could do this in Pyrex/Cython instead of C, if you wanted. Or you could just call PyObj_FromPtr directly on the _ctypes module. But if you're trying to learn how things work at this level, it makes more sense to make what's happening explicit, and to explicitly put your "dealing with C pointers" code in C.
On further thought, it you wanted to build a sort of best-guess objectFromRepr for tinkering purposes, you could. Basically:
Use a regex that matches the common angle-bracket form; if it matches, call objectFromId with the address.
Call eval.
You might want to put an intermediate step in there: for many types, the repr is in the form of a constructor call like Class(arg1, arg2), and in that case it might be better to match that form with another regex and call the constructor directly, instead of using eval. If nothing else, it's probably more instructive, and that's the point of this exercise, right?
Obviously this is an even more terrible idea in real-life code than objectFromId. It's almost always bad to use eval, and eval(repr(x)) == x is not actually guaranteed to be true, and you're actually getting a new reference to the same object in the id case but a new object with the same value in the eval case, and…
PS, After you learn how all of this works in C Python, it's probably worth doing a similar exercise in another interpreter like jython or IronPython (especially when you repr a native Java/.NET/etc. object).

Categories