I have an object:
c = Character(...)
I convert it to a string by using:
p = "{0}".format(c)
print(p)
>>> <Character.Character object at 0x000002267ED6DA50>
How do I get the object back so I can run this code?
p.get_name()
You absolutely can if you are using CPython (where the id is the memory address). Different implementations may not work the same way.
>>> import ctypes
>>> class A:
... pass
...
>>> a = A()
>>> id(a)
140669136944864
>>> b = ctypes.cast(id(a), ctypes.py_object).value
>>> b
<__main__.A object at 0x7ff015f03ee0>
>>> a is b
True
So we've de-referenced the id of a back into a py_object and snagged its value.
If your main goal is to serialize and deserialize objects (i.e. turn objects into a string of bytes and back while preserving all the data; methods come along because the class is referenced by name), you can use pickle. You can use pickle.dumps to convert any picklable object into bytes and pickle.loads to convert it back into an object. docs
>>> import pickle
>>> class Student:
... def __init__(self, name, age):
... self.name = name
... self.age = age
...
>>> a = Student("name", 20)
>>> pickle.dumps(a)
b'\x80\x04\x951\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x07Student\x94\x93\x94)\x81\x94}\x94(\x8c\x04name\x94h\x05\x8c\x03age\x94K\x14ub.'
>>> s = pickle.dumps(a)
>>> b = pickle.loads(s)
>>> b
<__main__.Student object at 0x7f04a856c910>
>>> b.name == a.name and b.age == a.age
True
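One detail worth noting: pickle.dumps returns bytes, not str. If you need actual text (to embed in JSON, for example), a common approach is to wrap the pickle in base64. A small sketch:

```python
import base64
import pickle

def to_text(obj):
    # pickle to bytes, then base64-encode into an ASCII-safe str
    return base64.b64encode(pickle.dumps(obj)).decode("ascii")

def from_text(text):
    # reverse: decode base64 back to bytes, then unpickle
    return pickle.loads(base64.b64decode(text))

data = {"name": "Arthur", "age": 20}
round_tripped = from_text(to_text(data))
print(round_tripped == data)  # True
```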
It is not possible in the general case to use the ID embedded in the default __str__ implementation to retrieve an object.
Borrowing from g.d.d.c's answer, let's define a function to do this:
import ctypes

def get_object_by_id(obj_id):
    return ctypes.cast(obj_id, ctypes.py_object).value

def get_object_by_repr(obj_repr):
    return get_object_by_id(int(obj_repr[-19:-1], 16))
It works for any object that is still in scope, provided that it's using the default __repr__/__str__ implementation that includes the hex-encoded id at the end of the string:
>>> class A:
... pass
...
>>> a = A()
>>> r = str(a)
>>> r
'<__main__.A object at 0x000001C584E8BC10>'
>>> get_object_by_repr(r)
<__main__.A object at 0x000001C584E8BC10>
But what if our original A has gone out of scope?
>>> def get_a_repr():
... a = A()
... return str(a)
...
>>> r = get_a_repr()
>>> get_object_by_repr(r)
(crash)
(and I don't mean an uncaught exception, I mean Python itself crashes)
You don't necessarily need to define a in a function to do this; it also can happen if you just rebind a in the local scope (note: GC isn't necessarily guaranteed to happen as soon as you rebind the variable, so this may not behave 100% deterministically, but the below is a real example):
>>> a = A()
>>> r = str(a)
>>> get_object_by_repr(r)
<__main__.A object at 0x000001C73C73BBE0>
>>> a = A()
>>> get_object_by_repr(r)
<__main__.A object at 0x000001C73C73BBE0>
>>> a
<__main__.A object at 0x000001C73C73B9A0>
>>> get_object_by_repr(r)
(crash)
I'd expect this to also happen if you passed this string between processes, stored it in a file for later use by the same script, or any of the other things you'd normally be doing with a serialized object.
The reason this happens is that unlike C, Python garbage-collects objects that have gone out of scope and which do not have any references -- and the id value itself (which is just an int), or the string representation of the object that has the id embedded in it, is not recognized by the interpreter as a live reference! And because ctypes lets you reach right into the guts of the interpreter (usually a bad idea if you don't know exactly what you're doing), you're telling it to dereference a pointer to freed memory, and it crashes.
In other situations, you might actually get the far more insidious bug of getting a different object because that memory address has since been repurposed to hold something else (I'm not sure how likely this is).
To actually solve the problem of turning a str() representation into the original object, the object must be serialized, i.e. turned into a string that contains all the data needed to reconstruct an exact copy of the object, even if the original object no longer exists. How to do this depends entirely on the actual content of the object, but a pretty standard (language-agnostic) solution is to make the class JSON-serializable; check out How to make a class JSON serializable.
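For example, if the Character class from the question only holds simple data, a minimal JSON round-trip could look like this (the name and hp fields here are invented for illustration):

```python
import json

class Character:
    # hypothetical fields; substitute your own
    def __init__(self, name, hp):
        self.name = name
        self.hp = hp

    def to_json(self):
        # serialize the instance's attribute dict to a text string
        return json.dumps(self.__dict__)

    @classmethod
    def from_json(cls, text):
        # rebuild an equivalent (not identical) object from the string
        return cls(**json.loads(text))

c = Character("Arthur", 100)
restored = Character.from_json(c.to_json())
print(restored.name, restored.hp)  # Arthur 100
```

Unlike the id trick, the string survives the original object going out of scope, being written to a file, or crossing process boundaries.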
Related
Usually in Python when you do an assignment of a variable, you don't get a copy - you just get a second reference to the same object.
a = b'Hi'
b = a
a is b # shows True
Now when you use ctypes.create_string_buffer to get a buffer to e.g. interact with a Windows API function, you can use the .raw attribute to access the bytes. But what if you want to access those bytes after you've deleted the buffer?
c = ctypes.create_string_buffer(b'Hi')
d = c.raw
e = c.raw
d is e # shows False?
d == e # shows True as you'd expect
c.raw is c.raw # shows False!
del c
At this point are d and e still safe to use? From my experimentation it looks like the .raw attribute makes copies when you access it, but I can't find anything in the official documentation to support that.
.raw returns a separate, immutable Python bytes object each time it is called. It may be an interned version (so d is e could return True), but it is safe to use.
An easy test:
>>> x=ctypes.create_string_buffer(1)
>>> a = x.raw
>>> x[0] = 255
>>> b = x.raw
>>> a
b'\x00'
>>> b
b'\xff'
This point in the documentation comments on this for the ctypes type c_char_p, but it applies to other ctypes types like c_char_Array as well (emphasis mine):
>>> s = c_char_p()
>>> s.value = b"abc def ghi"
>>> s.value
b'abc def ghi'
>>> s.value is s.value
False
...
Why is it printing False? ctypes instances are objects containing a memory block plus some descriptors accessing the contents of the memory. Storing a Python object in the memory block does not store the object itself, instead the contents of the object is stored. Accessing the contents again constructs a new Python object each time!
From the Python documentation for the id() function:
Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
CPython implementation detail: This is the address of the object in memory.
This is clear, I understand it. If a Python class instance is passed to the id function, it returns the “identity”/address of that object. However, what happens if a Python class name is passed to the id function? What does it return? For example:
class Abc:
    value = 0
x = Abc()
id(x) # -> 65097560
id(Abc) # -> 67738424
I understand 65097560 is the memory address of object x. but what is the meaning of id(Abc) / 67738424?
Classes are objects themselves. They're instances of the metaclass type. (type is also used for determining something's type.)
>>> class Abc: pass
...
>>> type(Abc)
<class 'type'>
>>> id(Abc)
39363368
In Python 2, old-style classes are instances of classobj.
>>> class Old: pass
...
>>> type(Old)
<type 'classobj'>
>>> id(Old)
140369496387664
>>> class New(object): pass
...
>>> type(New)
<type 'type'>
>>> id(New)
94676790299104
In general, everything is an object in Python, including functions like id() itself, builtin types/classes like list, and modules like sys. Though there are a few exceptions like keywords and operators.
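You can check this directly; functions, built-in types, and modules all have an identity and a type, just like class instances do:

```python
import sys

# functions, built-in classes, and modules are all objects,
# so isinstance and id work on every one of them
for thing in (id, list, sys):
    assert isinstance(thing, object)
    print(type(thing), hex(id(thing)))
```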
I started reading about python's += syntax and stumbled onto the following post/answer:
Interactive code about +=
So what I noticed was that there seems to be a difference between frames and objects.
In the global frame, they point to the same object even though they're different variables; if the line
l2 += [item]
was instead
l2 = l2 + [item]
then 'l2' becomes a separate object when that line runs. My biggest question is when would you want a variable to point to a separate object? Also, why and when would you want to keep them pointed at the same object?
Any explanation or use cases would be greatly appreciated! Extra thanks if you can mention anything relevant to data science :)
frame and object don't mean what you think they mean.
In programming you have something called a stack. In Python, when you call a function you create something called a stack frame. This frame is (as you see in your example) basically just a table of all of the variables that are local to your function.
Note that defining a function doesn't create a new stack frame; it's calling the function that does. For instance, something like this:
def say_hello():
    name = input('What is your name?')
    print('Hello, {}'.format(name))
Your global frame is just going to hold one reference: say_hello. You can see that by checking out what's in the local namespace (in Python you pretty much have a 1:1 relationship between namespace, scope, and stack frames):
print(locals())
You'll see something that looks like this:
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <_frozen_importlib_external.SourceFileLoader object at 0x1019bb320>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, '__file__': '/private/tmp/fun.py', '__cached__': None, 'say_hello': <function say_hello at 0x101962d90>}
Note the dunder (short for double underscore) names - those are automagically provided, and for the purposes of our discussion you can ignore them. That leaves us with:
{'say_hello': <function say_hello at 0x101962d90>}
That 0x bit is the memory address where the function itself lives. So here, our global stack/frame contains just that one value. If you call your function and then check locals() again, you'll see that name isn't there. That's because when you call the function you create a new stack frame and the variable is assigned there. You can prove this by adding print(locals()) at the end of your function. Then you'll see something like this:
{'name': 'Arthur, King of the Brits'}
No dunder names here. You'll also note that this doesn't show a memory address. If you want to know where this value lives, there's a function for that.
def say_hello():
    name = input('What is your name?')
    print('hello {}'.format(name))
    print(locals())
    print(id(name))
    return name
print(id(say_hello()))
That's what the example means when it's talking about a frame.
But what about objects? Well, in Python, everything is an object. Just try it:
>>> isinstance(3, object)
True
>>> isinstance(None, object)
True
>>> isinstance('hello', object)
True
>>> isinstance(13.2, object)
True
>>> isinstance(3j, object)
True
>>> def fun():
... print('hello')
...
>>> isinstance(fun, object)
True
>>> class Cool: pass
...
>>> isinstance(Cool, object)
True
>>> isinstance(Cool(), object)
True
>>> isinstance(object, object)
True
>>> isinstance(isinstance, object)
True
>>> isinstance(True, object)
True
They're all objects. But they may be different objects. And how can you tell? With id:
>>> id(3)
4297619904
>>> id(None)
4297303920
>>> id('hello')
4325843048
>>> id('hello')
4325843048
>>> id(13.2)
4322300216
>>> id(3j)
4325518960
>>> id(13.2)
4322300216
>>> id(fun)
4322635152
>>> id(isinstance)
4298988640
>>> id(True)
4297228640
>>> id(False)
4297228608
>>> id(None)
4297303920
>>> id(Cool)
4302561896
Note that you also can compare whether or not two objects are the same object by using is.
>>> True is False
False
>>> True is True
True
>>> 'hello world' is 'hello world'
True
>>> 'hello world' is ('hello ' + 'world')
False
>>> 512 is (500+12)
False
>>> 23 is (20+3)
True
Ehhhhh...? Wait a minute, what happened there? Well, as it turns out, Python (that is, CPython) caches small integers (-5 through 256). So the object 512 is different from the object that results from adding the object 500 to the object 12, while 23 is a cached small integer that both expressions share.
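Here's a sketch of where the cache ends (a CPython implementation detail, not a language guarantee; int(...) is used to sidestep the compiler's constant folding):

```python
# CPython pre-allocates the integers -5 through 256
a = int("256")
b = int("256")
print(a is b)  # True in CPython: both names point at the cached 256

c = int("257")
d = int("257")
print(c == d)  # True: equal values...
print(c is d)  # ...but typically two distinct objects outside the cache
```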
One important thing to note is that the assignment operator = never copies an object; it just binds a name to an existing object. For example:
>>> x = 592
>>> y = 592
>>> x is y
False
>>> x == y
True
>>> x = y
>>> x is y
True
>>> x == y
True
And it doesn't matter how many other names you give an object, or even if you pass the object around to different frames, you still have the same object.
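You can demonstrate that by passing an object into a function and comparing ids:

```python
def inspect_id(x):
    # x is a new local name in this function's frame,
    # but it refers to the very same object
    return id(x)

knights = ['Lancelot', 'Galahad']
print(inspect_id(knights) == id(knights))  # True: same object in both frames
alias = knights
print(alias is knights)                    # True: another name, same object
```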
But as you're starting to gather, it's important to understand the difference between operations that change an object and operations that produce a new object. Generally speaking you have a few immutable types in Python, and operations on them will produce a new object.
As for your question, when do you want to change objects and when do you want to keep them the same is actually looking at it the wrong way. You want to use a mutable type when you want to change things, and you want to use an immutable type if you don't want things to change.
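To make the split concrete, here's a quick sketch contrasting a mutable list with an immutable tuple (the same += difference the question asks about):

```python
# lists are mutable: += mutates the object in place
l = [1, 2]
before = id(l)
l += [3]
print(id(l) == before)  # True: same object, contents changed

# tuples are immutable: += rebinds the name to a brand-new object
t = (1, 2)
before = id(t)
t += (3,)
print(id(t) == before)  # False: a new tuple was created
```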
For instance, say you've got a group, and you want to add members to the group. You might use a mutable type like a list to keep track of the group, and an immutable type like strings to represent the members. Like this:
>>> group = []
>>> id(group)
4325836488
>>> group.append('Sir Lancelot')
>>> group.append('Sir Gallahad')
>>> group.append('Sir Robin')
>>> group.append("Robin's Minstrels")
>>> group.append('King Arthur')
>>> group
['Sir Lancelot', 'Sir Gallahad', 'Sir Robin', "Robin's Minstrels", 'King Arthur']
What happens when a member of the group is eaten?
>>> del group[-2] # And there was much rejoicing
>>> id(group)
4325836488
>>> group
['Sir Lancelot', 'Sir Gallahad', 'Sir Robin', 'King Arthur']
You'll notice that you still have the same group, just the members have changed.
I have code which contains the following two lines in it:-
instanceMethod = new.instancemethod(testFunc, None, TestCase)
setattr(TestCase, testName, instanceMethod)
How could it be re-written without using the "new" module? I'm sure new-style classes provide some kind of workaround for this, but I am not sure how.
There is a discussion that suggests that in Python 3 this is not required; the same works in Python 2.6:
http://mail.python.org/pipermail/python-list/2009-April/531898.html
See:
>>> class C: pass
...
>>> c=C()
>>> def f(self): pass
...
>>> c.f = f.__get__(c, C)
>>> c.f
<bound method C.f of <__main__.C instance at 0x10042efc8>>
>>>
Reiterating the question for every one's benefit, including mine.
Is there a replacement in Python3 for new.instancemethod? That is, given an arbitrary instance (not its class) how can I add a new appropriately defined function as a method to it?
So the following should suffice:
TestCase.testFunc = testFunc.__get__(None, TestCase)
You can replace "new.instancemethod" with "types.MethodType":

from types import MethodType as instancemethod

class Foo:
    def __init__(self):
        print('I am', id(self))
    def bar(self):
        print('I have been bound to', id(self))

foo = Foo()                    # prints 'I am <instance id>'
mm = instancemethod(bar, foo)  # binds bar to the foo instance
mm()                           # prints 'I have been bound to <same instance id>'
foo.mm                         # traceback: no attribute created on foo to hold a ref to mm yet
foo.mm = mm                    # create a ref to the bound method on foo
foo.mm()                       # prints 'I have been bound to <same instance id>'
This will do the same:
>>> TestCase.testName = testFunc
Yeah, it's really that simple.
Your line
>>> instanceMethod = new.instancemethod(testFunc, None, TestCase)
Is in practice (although not in theory) a noop. :) You could just as well do
>>> instanceMethod = testFunc
In fact, in Python 3 I'm pretty sure it would be the same in theory as well, but the new module is gone so I can't test it in practice.
To confirm that it's not needed to use new.instancemethod() at all since Python v2.4, here's an example of how to replace an instance method. It's also not needed to use descriptors (even though that works).
class Ham(object):
    def spam(self):
        pass

h = Ham()

def fake_spam():
    h._spam = True

h.spam = fake_spam
h.spam()
# h._spam should be True now.
Handy for unit testing.
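For unit testing specifically, the standard library's unittest.mock can do the same patching and undo it automatically when the test is done; a sketch:

```python
from unittest.mock import patch

class Ham:
    def spam(self):
        return 'real'

h = Ham()
with patch.object(h, 'spam', return_value='fake'):
    print(h.spam())  # fake: the mock shadows the real method
print(h.spam())      # real: the patch is undone on exiting the with block
```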
What is the difference between type(obj) and obj.__class__? Is there ever a possibility of type(obj) is not obj.__class__?
I want to write a function that works generically on the supplied objects, using a default value of 1 in the same type as another parameter. Which variation, #1 or #2 below, is going to do the right thing?
def f(a, b=None):
    if b is None:
        b = type(a)(1)      # #1
        b = a.__class__(1)  # #2
This is an old question, but none of the answers seems to mention this: in the general case, it IS possible for a new-style class to have different values for type(instance) and instance.__class__:
class ClassA(object):
    def display(self):
        print("ClassA")

class ClassB(object):
    __class__ = ClassA

    def display(self):
        print("ClassB")

instance = ClassB()
print(type(instance))
print(instance.__class__)
instance.display()
Output:
<class '__main__.ClassB'>
<class '__main__.ClassA'>
ClassB
The reason is that ClassB overrides the __class__ descriptor, but the internal type field of the object is unchanged. type(instance) reads that internal type field directly, so it returns ClassB, whereas instance.__class__ goes through normal attribute lookup and finds the overriding attribute, which returns the hardcoded ClassA instead of reading the internal type field.
Old-style classes are the problem, sigh:
>>> class old: pass
...
>>> x=old()
>>> type(x)
<type 'instance'>
>>> x.__class__
<class __main__.old at 0x6a150>
>>>
Not a problem in Python 3 since all classes are new-style now;-).
In Python 2, a class is new-style only if it inherits from another new-style class (including object and the various built-in types such as dict, list, set, ...) or implicitly or explicitly sets __metaclass__ to type.
type(obj) and obj.__class__ do not behave the same for old-style classes:
>>> class a(object):
... pass
...
>>> class b(a):
... pass
...
>>> class c:
... pass
...
>>> ai=a()
>>> bi=b()
>>> ci=c()
>>> type(ai) is ai.__class__
True
>>> type(bi) is bi.__class__
True
>>> type(ci) is ci.__class__
False
There's an interesting edge case with proxy objects (that use weak references):
>>> import weakref
>>> class MyClass:
... x = 42
...
>>> obj = MyClass()
>>> obj_proxy = weakref.proxy(obj)
>>> obj_proxy.x # proxies attribute lookup to the referenced object
42
>>> type(obj_proxy) # returns type of the proxy
weakproxy
>>> obj_proxy.__class__ # returns type of the referenced object
__main__.MyClass
>>> del obj # breaks the proxy's weak reference
>>> type(obj_proxy) # still works
weakproxy
>>> obj_proxy.__class__ # fails
ReferenceError: weakly-referenced object no longer exists
FYI - Django does this.
>>> from django.core.files.storage import default_storage
>>> type(default_storage)
django.core.files.storage.DefaultStorage
>>> default_storage.__class__
django.core.files.storage.FileSystemStorage
As someone with finite cognitive capacity who's just trying to figure out what's going in order to get work done... it's frustrating.