Why is __code__ for a function (Python) mutable?

In a previous question yesterday, in the comments, I learned that in Python the __code__ attribute of a function is mutable. Hence I can write code like the following:
def foo():
    print "Hello"

def foo2():
    print "Hello 2"

foo()
foo.__code__ = foo2.__code__
foo()
Output
Hello
Hello 2
I tried googling, but either because there is no information (I highly doubt this) or because the keyword (__code__) is not easily searchable, I couldn't find a use case for this.
It doesn't seem like "because most things in Python are mutable" is a reasonable answer either, because other attributes of functions — __closure__ and __globals__ — are explicitly read-only (from Objects/funcobject.c):
static PyMemberDef func_memberlist[] = {
    {"__closure__",  T_OBJECT, OFF(func_closure),
     RESTRICTED|READONLY},
    {"__doc__",      T_OBJECT, OFF(func_doc), PY_WRITE_RESTRICTED},
    {"__globals__",  T_OBJECT, OFF(func_globals),
     RESTRICTED|READONLY},
    {"__module__",   T_OBJECT, OFF(func_module), PY_WRITE_RESTRICTED},
    {NULL}  /* Sentinel */
};
Why would __code__ be writable while other attributes are read-only?

The fact is, most things in Python are mutable. So the real question is, why are __closure__ and __globals__ not?
The answer initially appears simple. Both of these things are containers for variables which the function might need. The code object itself does not carry its closed-over and global variables around with it; it merely knows how to get them from the function. It grabs the actual values out of these two attributes when the function is called.
But the scopes themselves are mutable, so this answer is unsatisfying. We need to explain why modifying these things in particular would break stuff.
For __closure__, we can look to its structure. It is not a mapping, but a tuple of cells. It doesn't know the names of the closed-over variables. When the code object looks up a closed-over variable, it needs to know its position in the tuple; they match up one-to-one with co_freevars which is also read-only. And if the tuple is of the wrong size or not a tuple at all, this mechanism breaks down, probably violently (read: segfaults) if the underlying C code isn't expecting such a situation. Forcing the C code to check the type and size of the tuple is needless busy-work which can be eliminated by making the attribute read-only. If you try to replace __code__ with something taking a different number of free variables, you get an error, so the size is always right.
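A quick sketch of that size check in action (the exact error wording varies by CPython version):

def outer():
    x = 1
    def inner():
        return x          # closes over x: one free variable
    return inner

def plain():
    return 42             # no free variables

closed_over = outer()

# CPython refuses code objects whose free-variable count doesn't match,
# so __closure__ and __code__ can never get out of sync this way.
try:
    plain.__code__ = closed_over.__code__
except ValueError as exc:
    print(exc)            # e.g. "plain() requires a code object with 0 free vars, not 1"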
For __globals__, the explanation is less immediately obvious, but I'll speculate. The scope lookup mechanism expects to have access to the global namespace at all times. Indeed, the bytecode may be hard-coded to go straight to the global namespace, if the compiler can prove no other namespace will have a variable with a particular name. If the global namespace was suddenly None or some other non-mapping object, the C code could, once again, violently misbehave. Again, making the code perform needless type checks would be a waste of CPU cycles.
Another possibility is that (normally-declared) functions borrow a reference to the module's global namespace, and making the attribute writable would cause the reference count to get messed up. I could imagine this design, but I'm not really sure it's a great idea since functions can be constructed explicitly with objects whose lifetimes might be shorter than that of the owning module, and these would need to be special-cased.
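And indeed, trying to rebind either attribute fails immediately. A tiny check (the exact message differs across CPython versions):

def f():
    return some_global    # looked up in f.__globals__ only when f is called

try:
    f.__globals__ = {}    # rejected: the attribute is declared READONLY
except AttributeError as exc:
    print(exc)            # e.g. "readonly attribute"

try:
    f.__closure__ = ()    # likewise read-only
except AttributeError as exc:
    print(exc)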

Related

How do I iterate through a dictionary/set in SLY?

So, I'm trying to transition my code from my earlier PLY implementation to SLY. Previously, I had some code that loaded a binary file with a wide range of reserved words scraped from documentation of the scripting language I'm trying to implement. However, when I try to iterate through the scraped items in the lexer for SLY, I get an error inside LexerMetaDict's __setitem__ while iterating through the resulting set:
Exception has occurred: AttributeError
Name transition redefined
  File "C:\dev\sly\sly\lex.py", line 126, in __setitem__
    raise AttributeError(f'Name {key} redefined')
  File "C:\dev\sly\example\HeroLab\HeroLab.py", line 24, in HeroLabLexer
    for transition in transition_set:
  File "C:\dev\sly\example\HeroLab\HeroLab.py", line 6, in <module>
    class HeroLabLexer(Lexer):
The code in question:
from transitions import transition_set, reference_set

class HeroLabLexer(Lexer):

    # initial token assignments
    for transition in transition_set:
        tokens.add(transition)
I might not be as surprised if it were happening when trying to add to the tokens, since I'm still trying to figure out how to interface with the SLY method for defining things, but if I change that line to a print statement, it still fails when I iterate through the second item in "transition_set". I've tried renaming the various variables, but to little avail.
The error you get is the result of a modification Sly makes to the Lexer metaclass, which I discuss below. But for a simple answer: I assume tokens is a set, so you can easily avoid the problem with
tokens |= transition_set
If transition_set were an iterable but not a set, you could use the update method, which works with any iterable (and any number of iterable arguments):
tokens.update(transition_set)
tokens doesn't have to be a set. Sly should work with any iterable. But you might need to adjust the above expressions. If tokens is a tuple or a list, you'd use += instead of |= and, in the case of lists, extend instead of update. (There are some minor differences, as with the fact that set.update can be used to merge several sets.)
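Putting that together in the lexer itself (a hedged sketch: it assumes transition_set is a set of token-name strings, as in the question, and that the usual per-token rules are defined elsewhere in the class):

from sly import Lexer
from transitions import transition_set   # module and name taken from the question

class HeroLabLexer(Lexer):
    # Merge the scraped names in a single statement: no loop variable is
    # bound at class scope, so Sly's metaclass dictionary has nothing to reject.
    tokens = set()
    tokens |= transition_set

    # ... the per-token rules and other lexer attributes go here, as usual ...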
That doesn't answer your direct question, "How do I iterate... in SLY". I interpret that as asking:
How do I write a for loop at class scope in a class derived from sly.lexer?
and that's a harder question. The games which Sly plays with Python namespaces make it difficult to use for loops at the class scope of a lexer, because Sly replaces the attribute dictionary of the Lexer class (and its subclasses) with a special dictionary which doesn't allow redefinition of attributes with string values. Since the iteration variable in a for statement lives in the enclosing scope (in this case, the class scope), any for loop whose index variable is a string and whose body runs more than once will trigger the "Name redefined" error which you experienced.
It's also worth noting that if you use a for statement at class scope in any class, the last value of the iteration variable will become a class attribute. That's hardly ever desirable, and, really, that construction is not good practice in any class. But it doesn't usually throw an error.
At class scope, you can use comprehensions (whose iteration variables are effectively locals). Of course, in this case, there's no advantage in writing:
tokens.update(transition for transition in transition_set)
But the construct might be useful for other situations. Note, however, that other variables at class scope (such as tokens) are not visible in the body of the comprehension, which might also create difficulties.
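Here is a minimal illustration of that visibility rule in a plain Python 3 class (nothing Sly-specific; the names are made up for the example):

class Demo:
    prefix = "TOK_"
    raw = ["plus", "minus"]

    # The outermost iterable (raw) is evaluated in the class scope, so the
    # loop itself is fine, but 'prefix' inside the body is resolved in the
    # comprehension's own scope, skips the class scope, and raises
    # NameError at class creation time.
    names = [prefix + word for word in raw]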
Although it's extremely ugly, you can declare the iteration variable as a global, which makes it a module variable rather than a class variable (and therefore just trades one bad practice for another one, although you can later remove the variable from the module).
You could do the computation in a different scope (such as a global function), or you could write a (global) generator to use with tokens.update(), which is probably the most general solution.
Finally, you can make sure that the index variable is never an instance of a str.
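As a sketch of the generator workaround mentioned above (hypothetical helper name; it assumes the same transition_set as before and omits the per-token rules), the work is pushed into module scope so no string is ever bound in the class dictionary:

from sly import Lexer
from transitions import transition_set   # from the question

def _token_names(source):
    # Module-level generator: its loop variable is local to the generator,
    # so Sly's class dictionary never sees it.
    for name in source:
        yield name

class HeroLabLexer(Lexer):
    tokens = set()
    tokens.update(_token_names(transition_set))

    # ... per-token rules as usual ...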

Python: `locals()` as a default function argument

Suppose I have a module PyFoo.py that has a function bar. I want bar to print all of the local variables associated with the namespace that called it.
For example:
#! /usr/bin/env python
import PyFoo as pf
var1 = 'hi'
print locals()
pf.bar()
The two last lines would give the same output. So far I've tried defining bar as such:
def bar(x=locals):
    print x()

def bar(x=locals()):
    print x
But neither works. The first ends up being what's local to bar's namespace (which I guess is because that's when it's evaluated), and the second is as if I passed in globals (which I assume is because it's evaluated during import).
Is there a way I can have the default value of argument x of bar be all variables in the namespace which called bar?
EDIT 2018-07-29:
As has been pointed out, what was given was an XY Problem; as such, I'll give the specifics.
The module I'm putting together will allow the user to create various objects that represent different aspects of a numerical problem (e.g. various topology definitions, boundary conditions, constitutive models, etc.) and define how any given object interacts with any other object(s). The idea is for the user to import the module, define the various model entities that they need, and then call a function which will take all objects passed to it, make the adjustments needed to ensure compatibility between them, and then write out a file that represents the entire numerical problem as a text file.
The module has a function generate that accepts each of the various types of aspects of the numerical problem. The default value for all arguments is an empty list. If a non-empty list is passed, then generate will use those instances for generating the completed numerical problem. If an argument is an empty list, then I'd like it to take in all instances in the namespace that called generate (which I will then parse out the appropriate instances for the argument).
EDIT 2018-07-29:
Sorry for any lack of understanding on my part (I'm not that strong a programmer), but I think I might understand what you're saying with respect to an instance being declared or registered.
From my limited understanding, could this be done by creating some sort of registry dataset (like a list or dict) in the module, created when the module is imported, with all module classes taking this registry object in by default? During class initialization self can be appended to said dataset, and then the generate function will take the registry as a default value for one of its arguments.
There's no way you can do what you want directly.
locals just returns the local variables in whatever namespace it's called in. As you've seen, you have access to the namespace the function is defined in at the time of definition, and you have access to the namespace of the function itself from within the function, but you don't have access to any other namespaces.
You can do what you want indirectly… but it's almost certainly a bad idea. At least this smells like an XY problem, and whatever it is you're actually trying to do, there's probably a better way to do it.
But occasionally it is necessary, so in case you have one of those cases:
The main good reason to want to know the locals of your caller is for some kind of debugging or other introspection function. And the way to do introspection is almost always through the inspect library.
In this case, what you want to inspect is the interpreter call stack. The calling function will be the first frame on the call stack behind your function's own frame.
You can get the raw stack frame:
inspect.currentframe().f_back
… or you can get a FrameInfo representing it:
inspect.stack()[1]
As explained at the top of the inspect docs, a frame object's local namespace is available as:
frame.f_locals
Note that this has all the same caveats that apply to getting your own locals with locals: what you get isn't the live namespace, but a mapping that, even if it is mutable, can't be used to modify the namespace (or, worse in 2.x, one that may or may not modify the namespace, unpredictably), and that has all cell and free variables flattened into their values rather than their cell references.
Also, see the big warning in the docs about not keeping frame objects alive unnecessarily (or calling their clear method if you need to keep a snapshot but not all of the references, but I think that only exists in 3.x).
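Putting those pieces together, a minimal sketch of bar (module and function names taken from the question; Python 3 print syntax):

# PyFoo.py
import inspect

def bar():
    caller = inspect.currentframe().f_back   # frame of whoever called bar()
    try:
        print(caller.f_locals)               # snapshot of the caller's locals
    finally:
        del caller                           # don't keep the frame alive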

How C/C++ global variables are implemented in python?

While I was reading through the SWIG documentation I came across these lines:
C/C++ global variables are fully supported by SWIG. However, the underlying mechanism is somewhat different than you might expect due to the way that Python assignment works. When you type the following in Python
a = 3.4
"a" becomes a name for an object containing the value 3.4. If you later type
b = a
then "a" and "b" are both names for the object containing the value 3.4. Thus, there is only one object containing 3.4 and "a" and "b" are both names that refer to it. This is quite different than C where a variable name refers to a memory location in which a value is stored (and assignment copies data into that location). Because of this, there is no direct way to map variable assignment in C to variable assignment in Python.
To provide access to C global variables, SWIG creates a special object called `cvar' that is added to each SWIG generated module. Global variables are then accessed as attributes of this object.
My question is: what is the need for implementing it in the above way? Even if it is implemented as described, the attributes of that object are themselves just names bound to objects.
Please see the Python code snippet below:
>>> a = 10
>>> b = a
>>> a is b
True
>>> class sample:
...     pass
...
>>> obj = sample()
>>> obj.a = 10
>>> obj.b = obj.a
>>> obj.a is obj.b
True
Here, in both of the above cases, assignment happens in the same way.
It's all about the fact that SWIG has to provide an interface to a library in C/C++ which acts differently.
Let us assume that, instead of implementing a cvar object, SWIG simply exposed PyInts etc. as attributes of the generated module (which is what "normal" C extensions do).
Then, when the user assigns a value to the variable from Python code, a new PyInt object is bound to that attribute, but the original variable used by the library is unchanged, because the module object does not know that it has to modify the C global variable when doing an assignment.
This means that, while from the Python side the user will see the value change, the C library wouldn't be aware of the change, because the memory location represented by the global variable didn't change its value.
In order to allow the user to set values in a manner that is visible from the C/C++ library, SWIG had to define this cvar object, which, when performing assignments, assigns the value to the library's variable under the covers, i.e. it changes the contents of the memory location that holds the value of the global variable.
This is probably done by providing an implementation of __setattr__ and __getattr__ (or __getattribute__), so that cvar is able to override the behaviour of attribute assignment.
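A rough pure-Python sketch of that idea (hypothetical names; real SWIG does this in generated C code that reads and writes the actual C globals):

class _CVar(object):
    """Forward attribute reads/writes to getter/setter callables that
    stand in for the C library's real global variables."""

    def __init__(self, getters, setters):
        # Bypass our own __setattr__ while storing the lookup tables.
        object.__setattr__(self, '_getters', getters)
        object.__setattr__(self, '_setters', setters)

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for the wrapped globals.
        return self._getters[name]()

    def __setattr__(self, name, value):
        # Every assignment is pushed through to the underlying storage.
        self._setters[name](value)


# Toy "C global" emulated with a one-element list as backing storage.
_storage = {'Foo': [3.4]}
cvar = _CVar(getters={'Foo': lambda: _storage['Foo'][0]},
             setters={'Foo': lambda v: _storage['Foo'].__setitem__(0, v)})

cvar.Foo = 7.5
print(cvar.Foo, _storage['Foo'][0])   # 7.5 7.5 -- the backing store changed too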

memory management with objects and lists in python

I am trying to understand how exactly assignment operators, constructors, and parameters passed into functions work in Python, specifically with lists and objects. I have a class with a list as an attribute. I want to initialize it to an empty list and then populate it using the constructor. I am not quite sure how to do it.
Let's say my class is:
class A:
    List = []                       # Point 1

    def __init1__(self, begin=[]):  # Point 2
        for item in begin:
            self.List.append(item)

    def __init2__(self, begin):     # Point 3
        List = begin

    def __init3__(self, begin=[]):  # Point 4
        List = list()
        for item in begin:
            self.List.append(item)

listObj = A()
del(listObj)
b = listObj
I have the following questions. It will be awesome if someone could clarify what happens in each case --
Is declaring an empty list like in Point 1 valid? What is created? A variable pointing to NULL?
Which of Point 2 and Point 3 are valid constructors? In Point 3 I am guessing that a new copy of the list passed in (begin) is not made and instead the variable List will be pointing to the pointer "begin". Is a new copy of the list made if I use the constructor as in Point 2?
What happens when I delete the object using del? Is the list deleted as well or do I have to call del on the List before calling del on the containing object? I know Python uses GC but if I am concerned about cleaning unused memory even before GC kicks in is it worth it?
Also assigning an object of type A to another only makes the second one point to the first right? If so how do I do a deep copy? Is there a feature to overload operators? I know python is probably much simpler than this and hence the question.
EDIT:
5. I just realized that using Point 2 and Point 3 does not make a difference. The items from the list begin are only copied by reference and a new copy is not made. To do that I have to create a new list using list(). This makes sense after I see it I guess.
Thanks!
In order:
using this form is simply syntactic sugar for calling the list constructor, i.e. you are creating a new (empty) list. This will be bound to the class itself (it is a static field) and will be the same for all instances.
apart from the constructor name, which must always be __init__, both are valid forms, but they mean different things.
The first constructor can be called with a list as argument or without. If it is called without arguments, the empty list passed as default is used within (this empty list is created once during class definition, and not once per constructor call), so no items are added to the static list.
The second must be called with a list parameter, or Python will complain with an error; but used without the self. prefix like you are doing, it would just create a new local variable named List, accessible only within the constructor, and leave the static A.List variable unchanged.
Deleting will only unlink a reference to the object, without actually deleting anything. Once all references are removed, however, the garbage collector is free to clear the memory as needed.
It is usually a bad idea to try to control the garbage collector. Instead, just make sure you don't hold references to objects you no longer need and let it do its work.
Assigning a variable with an object will only create a new reference to the same object, yes. To create a deep copy use the related functions or write your own.
Operator overloading (use with care, it can make things more confusing instead of clearer if misused) can be done by overriding some special methods in the class definition.
About your edit: as I pointed out above, when writing List = list() inside the constructor without the self. (or better, since the variable is static, A.) prefix, you are just creating a new local variable, not overriding the one you defined in the class body.
For reference, the usual way to handle a list as default argument is by using a None placeholder:
class A(object):
    def __init__(self, arg=None):
        self.startvalue = list(arg) if arg is not None else list()
        # making a defensive copy of arg to keep the original intact
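To illustrate the earlier point about assignment versus copying, a small sketch using the class above and the standard copy module:

import copy

a = A([1, [2, 3]])
b = a                    # no copy at all: a and b are two names for one object
c = copy.deepcopy(a)     # an independent copy, nested list included

c.startvalue[1].append(4)
print(a.startvalue)      # [1, [2, 3]]   (unaffected by the change to c)
print(c.startvalue)      # [1, [2, 3, 4]]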
As an aside, do take a look at the python tutorial. It is very well written and easy to follow and understand.
"It will be awesome if someone could clarify what happens in each case" isn't that the purpose of the dis module ?
http://docs.python.org/2/library/dis.html

Python: how to pass a reference to a function

IMO Python is pass-by-value if the parameter is a basic type, like a number or a boolean:
def func_a(bool_value):
    bool_value = True
This will not change the outside bool_value, right?
So my question is: how can I make the change to bool_value take effect outside the function (pass by reference)?
You can use a list to enclose the inout variable:
def func(container):
    container[0] = True

container = [False]
func(container)
print container[0]
The call-by-value/call-by-reference misnomer is an old debate. Python's semantics are more accurately described by CLU's call-by-sharing. See Fredrik Lundh's write up of this for more detail:
Call By Object
Python (always), like Java (mostly), passes arguments (and, in simple assignment, binds names) by object reference. There is no concept of "pass by value", nor is there any concept of a "reference to a variable"; there are only references to values (some express this by saying that Python doesn't have "variables"... it has names, which get bound to values, and that is all that can ever happen).
Mutable objects can have mutating methods (some of which look like operators or even assignment; e.g. a.b = c actually means type(a).__setattr__(a, 'b', c), which calls a method that may well be a mutating one).
But simple assignment to a barename (and argument passing, which is exactly the same as simple assignment to a barename) never has anything at all to do with any mutating methods.
Quite independently of the types involved, simple barename assignment (and, identically, argument passing) only ever binds or rebinds the specific name on the left of the =, never affecting any other name nor any object in any way whatsoever. You're very mistaken if you believe that types have anything to do with the semantics of argument passing (or, identically, simple assignment to barenames).
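A small demonstration of that distinction (print with a single argument works the same in Python 2 and 3):

def rebind(x):
    x = [99]          # rebinds the local name only; the caller is unaffected

def mutate(x):
    x.append(99)      # mutates the shared object; the caller sees the change

a = [1]
rebind(a)
print(a)              # [1]
mutate(a)
print(a)              # [1, 99]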
Immutable types can't be changed in place, but if you pass a user-defined class instance, a list, or a dictionary, you can mutate it and keep working with a single object.
Like this:
def add1(my_list):
    my_list.append(1)

a = []
add1(a)
print a
But if you do my_list = [1], you create a new object and lose the original reference inside the function; that's why you can't just do my_bool = False and hope that, outside the function, your variable gets that False.
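Another option in the same spirit as the list trick above is to wrap the flag in a small mutable holder object (hypothetical Box class, not part of any library):

class Box(object):
    def __init__(self, value):
        self.value = value

def set_flag(box):
    box.value = True     # mutates the shared Box; visible to the caller

flag = Box(False)
set_flag(flag)
print(flag.value)        # True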
