how to use dir() as default argument within a function [duplicate] - python

This question already has answers here:
"Least Astonishment" and the Mutable Default Argument
(33 answers)
Closed 5 years ago.
I can't figure out what I'm doing wrong. It seems I'm not able to use dir() as the default argument to a function's input
def print_all_vars(arg=dir()):
    for i in arg:
        print(i)
foo = 1
If I use
print_all_vars()
gives the output
__builtins__
__cached__
__doc__
__file__
__loader__
__name__
__package__
__spec__
but
print_all_vars(dir())
outputs
__builtins__
foo
print_all_vars

So - first, what is going on -
The line containing the def statement is run only once, when the module containing the function is first imported (or when it is typed at the interactive prompt). Any expressions used as default parameter values are evaluated a single time, at that moment, and the resulting object is kept as the default value for that parameter.
So your dir() call is executed once, and the resulting list, a dir() of the namespace of the module containing your example function, is saved. (The most common related error for Python beginners is to use an empty list or dictionary ([] or {}) as a parameter default: the same list or dictionary is reused for the lifetime of the function.)
That explains why you see in the output some values that won't show up if you call it with a dir() from the interactive prompt, like __package__ or __spec__, and why names defined after the def statement, such as foo, are missing.
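A minimal sketch of that behaviour, using hypothetical function names (append_item and names_at_definition are not from the question): both defaults are computed once, when the def statement runs.

```python
def append_item(item, bucket=[]):
    # The [] above is created once, when the def statement runs,
    # so every call that omits bucket shares the same list object.
    bucket.append(item)
    return bucket


def names_at_definition(names=dir()):
    # dir() ran once, at definition time, in the defining namespace.
    return names


first = append_item(1)
second = append_item(2)
print(first is second)  # True: both calls returned the same list
print(second)           # [1, 2]

defined_later = 42
# The name was created after the def statement, so the stored default lacks it.
print("defined_later" in names_at_definition())  # False
```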
Now, your real question is:
... I am actually trying to create an analog of MATLAB's save function
that saves all existing variables to an external file that can be
recalled later –
So even if your dir() call were evaluated lazily, only when the function is called, it would still fail: it would retrieve the variable names from the scope where the function is defined, not from the caller's.
The coding pattern for doing that would be:
def print_all_vars(arg=None):
    if arg is None:
        arg = dir()
    ...
(This is also the most common way to get a fresh empty list or dictionary as a default value: create it in the function body.)
But even with this, or if you pass your own list of variable names explicitly to the function, you still have just a list of strings, not the Python objects that you want to save.
And if you want to retrieve the variables from the caller by default, you face the same problem as with this list of strings: you need access to the namespace of the calling code, first to retrieve the variable names that exist there, and later to retrieve their contents in order to save them.
Finding out who called your function is an advanced technique in Python, and I am not even sure it is guaranteed to work across all Python implementations, but it will certainly work on the most common ones (CPython, PyPy, Jython) and even on some considered smaller ones (such as Brython):
You call sys._getframe() to retrieve the frame object of the code currently being executed (that is, the code in your function itself). This frame object has an f_back attribute, which points to the frame object of the code that called your function. The frame object, in turn, holds references to the global and local variables of the code being run, as plain dictionaries, in f_globals and f_locals respectively.
You can then "save" and "restore" the contents of the caller's global variables as you want, but you can't update the local variables using pure Python code.
So, just to list the global variables of the calling code by default, you can do:
import sys
def print_all_vars(arg=None):
    if arg is None:
        # sys._getframe() is this function's frame; f_back is the caller's frame
        arg = list(sys._getframe().f_back.f_globals.keys())
    print(arg)
Properly saving and restoring those variables is another question. As you can see in the comments, you can use pickle to serialize a lot of Python objects, but the caller's namespace will also contain modules (such as math, np, etc.) which can't ordinarily be pickled. So you will have to come up with a strategy to detect and record the modules in use, and to restore them, using further introspection on all the objects retrieved.
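As a rough starting point, and only a sketch under assumptions: the helper name snapshot_caller_globals is hypothetical, it simply skips modules and anything pickle rejects instead of recording them for later re-import, and it relies on sys._getframe, which is CPython-oriented.

```python
import pickle
import sys
import types


def snapshot_caller_globals(skip_private=True):
    """Return a dict of the caller's picklable module-level variables."""
    caller_globals = sys._getframe(1).f_globals  # frame 1 is the caller
    snapshot = {}
    for name, value in list(caller_globals.items()):
        if skip_private and name.startswith("_"):
            continue  # skip __name__, __builtins__ and similar entries
        if isinstance(value, types.ModuleType):
            continue  # modules cannot be pickled; they must be re-imported
        try:
            pickle.dumps(value)
        except Exception:
            continue  # skip anything pickle cannot handle
        snapshot[name] = value
    return snapshot


foo = 1
bar = [1, 2, 3]
saved = pickle.dumps(snapshot_caller_globals())
restored = pickle.loads(saved)
print(restored["foo"], restored["bar"])  # 1 [1, 2, 3]
```

Restoring would then be a matter of something like globals().update(restored) in the calling module; local variables of a calling function cannot be restored this way.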
Anyway, experimenting with the information here should put you on the track for what you want.

Related

locals() defined inside python function do not work [duplicate]

This question already has answers here:
Dynamically set local variable [duplicate]
(7 answers)
Any way to modify locals dictionary?
(5 answers)
Closed 2 years ago.
Consider the code below:
dct = {'one':[2,3]}
Now the following works:
for key,val in dct.items():
    locals()[key] = val
print(one)
Result:
[2, 3]
But when I do the same inside a function, which is what I really want, it doesn't work. Please help.
def ff(dct):
    for key,val in dct.items():
        locals()[key] = val
    print(one)
ff(dct)
Result:
NameError: name 'one' is not defined
The result of the locals() call inside a function is not one that you can use to actually update locals using your method. The documentation makes that clear:
locals()
Update and return a dictionary representing the current local symbol table. Free variables are returned by locals() when it is called in function blocks, but not in class blocks. Note that at the module level, locals() and globals() are the same dictionary.
Note The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter.
The reason why you cannot update locals using this method is that they are heavily optimised within the CPython source code(a). When you call locals(), it actually builds a dictionary (the "update and return" bit) based on these heavily optimised structures and that is what you get.
Writing a new key to that dictionary is not reflected back to the actual locals.
I believe the globals() return value does allow updates using this method because that's already a dictionary and it just gives you a reference to it, not a copy. That's why your code works outside of a function (see the clause above stating locals() and globals() are the same thing in that context).
From memory, Python 2 allowed you to do something like exec "answer = 42" and that would affect the locals. But, as with print, this was changed from a statement to a function call in Python 3, so the execution engine really has no idea what it does under the covers, meaning it can't magically bind locals with exec("answer = 42").
I suppose someone could request this as a feature since it would make dynamic programming a little easier. Whether it would get through the gauntlet, I have no idea, since the fact that you can provide your own locals dictionary to exec() means that you already have a way to detect variables that were bound by arbitrary code. They'll just be in a separate dictionary rather than in the actual locals area.
(a) Access to locals is via known indexes into the stack-frame local variable area, computed at compile time and embedded into the actual bytecode. Being able to add/change variables dynamically would break this optimisation.
This problem was raised as an issue back in early 2009 and the outcome was that it was too difficult without losing a lot of performance.
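A short sketch of both points, with hypothetical function names: a write to the locals() dictionary inside a function does not create a usable local variable, while a dictionary passed explicitly to exec() does capture whatever the executed code binds.

```python
def assign_via_locals():
    locals()["answer"] = 42  # writes to a dictionary, not to the real locals
    try:
        return answer  # compiled as a global lookup, so this raises NameError
    except NameError:
        return None


def assign_via_exec_namespace():
    namespace = {}
    # Third argument: the locals mapping that receives names bound by the code.
    exec("answer = 42", {}, namespace)
    return namespace["answer"]


print(assign_via_locals())          # None
print(assign_via_exec_namespace())  # 42
```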

Python - accessing variable that is needed for function defined in different module

I have saved a defined function within a module, which I have then imported into a new script.
Within this function, the variable master (a pandas dataframe) is queried. However, master is not one of the arguments in the function and is a dataframe I am hoping to access regardless of the script.
When trying to use the aforementioned function in my new script, I get the following error:
NameError: name 'master' is not defined
But when I enter master into the console, it prints with no problem or error.
I think it is something to do with local and global variables, but I am new to Python and am struggling to understand how I can fix the error.
Within this function, the variable master (a pandas dataframe) is queried. However, master is not one of the arguments in the function
Then update your function to take it as an argument. You can't use a global here, since (if I understand correctly) your "master" variable is defined in your script and the function in a module that's imported by the script. In this case explicitly passing "master" to your function is the only way to make it available, since Python has no true "global" namespace ("global" in Python actually means "module level").
And that's a GoodThing actually, because it's the only sane way to structure your program. As a general rule, globals (mutable globals, that is) are evil: they make your code brittle, untestable, unmaintainable and unpredictable.
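A minimal sketch of that change. The names rows_above, column and threshold are hypothetical, and a list of dictionaries stands in for the pandas DataFrame so the example has no dependencies; with a real DataFrame the function body would use the DataFrame's own query methods instead.

```python
# In the imported module: the function receives its data as a parameter.
def rows_above(master, column, threshold):
    """Return the rows of master whose column value exceeds threshold."""
    return [row for row in master if row[column] > threshold]


# In the calling script: master is defined here and passed in explicitly.
master = [{"id": 1, "score": 10}, {"id": 2, "score": 35}]
print(rows_above(master, "score", 20))  # [{'id': 2, 'score': 35}]
```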

Check if a built-in function/class/type/module has been over-written during the script?

Question
We are getting a strange error** and we suspect it is because our script.py*** assigned a variable that already has some built-in meaning. E.g.
str = 2
Is there a way we can check if this has happened?
So far
We're thinking it would involve:
Assign a list at the beginning of the script, containing all built-in objects' names as strings:
builtin_names = get_builtin_var_names() # hypothetical function
Assign a list at the end of the script, containing all user-assigned objects' names as strings:
user_names = get_user_var_names() # hypothetical function
Find the intersection, and check if not empty:
overwritten_names = list(set(user_names) & set(builtin_names))
if overwritten_names:
    print("Whoops")
Related
Is there a way to tell if a function in JavaScript is getting over written?
Is there a common way to check in Python if an object is any function type?
**For those interested, the error is silent: the script finishes without an error code, but the value it produces differs between two implementations of the same code, call them A and B. Both versions require running two modules (separate files) that we've made (changes.py and dnds.py), but:
Version A: involves running changes.py -> pickle intermediate data (into a .p file) -> dnds.py,
Version B: involves running changes.py -> return the data (a dict) as arguments to dnds.py -> dnds.py.
And for some reason only version A gives the correct final value (benchmarked against MATLAB's dnds function).
***script.py is actually dnds.py (which imports changes.py). You can find all the code, but to test the two alternative versions described in ** you need to look specifically at dnds.py, at the line you can find with CTRL+F: "##TODO:Urgent:debug:2016-11-28:". Once you find that line, you can read the rest of that comment line for instructions on how to replicate version B and its resulting silent error**. For some reason I HAVE to pickle the data to get it to work; when I just return the dicts directly I get the wrong dN/dS values.
You can get the names (and values) of builtins via the builtins module (named __builtin__ in Python 2). You can get the names (and values) of global variables with globals() and of locals with locals(). So you could do something like:
import builtins  # this module is named __builtin__ in Python 2
name, val = None, None  # define the loop variables before calling locals()
for name, val in locals().items():  # use .iteritems() in Python 2
    if hasattr(builtins, name) and getattr(builtins, name) != val:
        print("{} was overwritten!".format(name))
and then the same for globals(). This will check whether there is any object in the local namespace that has a different value in the builtins namespace. (Setting name and val to None is needed so that the variables exist before calling locals, or else you'll get a "dictionary changed size during iteration" error because the names are added partway through the loop.) Note that module attributes such as __name__ and __doc__ also exist on the builtins module with different values, so expect them to be reported unless you filter out such names.
You could also use a tool like pylint which checks for such errors among many others.
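The same idea can be packaged as a function that inspects any namespace dictionary. This is a sketch for Python 3; the name shadowed_builtins is hypothetical, and it uses an identity check rather than != so that values with unusual equality semantics do not cause trouble.

```python
import builtins


def shadowed_builtins(namespace):
    """Return the sorted names in namespace that rebind a built-in name."""
    return sorted(
        name
        for name, value in list(namespace.items())
        if not name.startswith("__")  # skip module metadata such as __name__
        and hasattr(builtins, name)
        and getattr(builtins, name) is not value
    )


example = {"str": 2, "total": 10, "len": len}
print(shadowed_builtins(example))  # ['str']
```

At the end of a script it could be called as shadowed_builtins(globals()).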

Modules and variable scopes

I'm not an expert at python, so bear with me while I try to understand the nuances of variable scopes.
As a simple example that describes the problem I'm facing, say I have the following three files.
The first file is outside_code.py. Due to certain restrictions I cannot modify this file. It must be taken as is. It contains some code that runs an eval at some point (yes, I know that eval is the spawn of satan but that's a discussion for a later day). For example, let's say that it contains the following lines of code:
def eval_string(x):
    return eval(x)
The second file is a set of user defined functions. Let's call it functions.py. It contains some unknown number of function definitions, for example, let's say that functions.py contains one function, defined below:
def foo(x):
    print("Your number is {}!".format(x))
Now I write a third file, let's call it main.py, which contains the following code:
import outside_code
from functions import *
outside_code.eval_string("foo(4)")
I import all of the function definitions from functions.py with a *, so they should be accessible by main.py without needing to do something like functions.foo(). I also import outside_code.py so I can access its core functionality, the code that contains an eval. Finally I call the function in outside_code.py, passing a string that is related to a function defined in functions.py.
In the simplified example, I want the code to print out "Your number is 4!". However, I get an error stating that 'foo' is not defined. This obviously means that the code in outside_code.py cannot access the same foo function that exists in main.py. So somehow I need to make foo accessible to it. Could anyone tell me exactly what the scope of foo currently is, and how I could extend it to cover the space that I actually want to use it in? What is the best way to solve my problem?
You'd have to add those names to the scope of outside_code. If outside_code is a regular Python module, you can do so directly:
import outside_code
import functions
for name in getattr(functions, '__all__', (n for n in vars(functions) if not n[0] == '_')):
    setattr(outside_code, name, getattr(functions, name))
This takes all names functions exports (which you'd import with from functions import *) and adds a reference to the corresponding object to outside_code so that eval() inside outside_code.eval_string() can find them.
You could use the ast.parse() function to produce a parse tree from the expression before passing it to eval_string(), then extract all global names from the expression and add only those names to outside_code to limit the damage, so to speak, but you'd still be clobbering the other module's namespace to make this work.
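A minimal sketch of the name-extraction step (the helper name names_used is hypothetical). It collects every name the expression reads, which for a simple expression like the one in the question is exactly the set of globals that eval() will need.

```python
import ast


def names_used(expression):
    """Return the set of names that the expression reads."""
    tree = ast.parse(expression, mode="eval")
    return {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }


print(sorted(names_used("foo(4) + bar(x)")))  # ['bar', 'foo', 'x']
```

Only the names returned here that also exist in the functions module would then be copied over with setattr().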
Mind you, this is almost as evil as using eval() in the first place, but it's your only choice if you can't tell eval() in that other module to take a namespace parameter. That's because by default, eval() uses the global namespace of the module it is run in as the namespace.
If, however, your eval_string() function actually accepts more parameters, look for a namespace or globals option. If that exists, the function probably looks more like this:
def eval_string(x, namespace=None):
    # pass the globals mapping positionally; older Python versions reject eval() keyword arguments
    return eval(x, namespace)
after which you could just do:
outside_code.eval_string('foo(4)', vars(functions))
where vars(functions) gives you the namespace of the functions module.
foo has been imported into main.py; its scope is restricted to that file (and to the file where it was originally defined, of course). It does not exist within outside_code.py.
The real eval function accepts locals and globals dicts to allow you to add elements to the namespace of the evaluated code. But you can't do anything if your eval_string doesn't already pass those on.
The relevant documentation: https://docs.python.org/3.5/library/functions.html#eval
eval takes an optional dictionary mapping global names to values
eval('foo(4)', {'foo': foo})
Will do what you need. It is mapping the string 'foo' to the function object foo.
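For illustration, here is how that looks end to end, assuming a hypothetical variant of eval_string that forwards a namespace (the real outside_code.py in the question does not), and with foo returning its message instead of printing it so the result is easy to inspect.

```python
def foo(x):
    return "Your number is {}!".format(x)


def eval_string(expression, namespace=None):
    # Hypothetical variant: forwards an optional globals mapping to eval().
    return eval(expression, namespace)


# The dictionary maps the string 'foo' to the function object foo.
print(eval_string("foo(4)", {"foo": foo}))  # Your number is 4!
```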
EDIT
Rereading your question, it looks like this won't work for you. My only other thought is to try
outside_code.eval_string('eval("foo(4)", {"foo": lambda x: print("Your number is {}!".format(x))})')
But that's a very hackish solution and doesn't scale well to functions that don't fit in lambdas.

Update locals from inside a function

I would like to write a function which receives a local namespace dictionary and update it. Something like this:
def UpdateLocals(local_dict):
    d = {'a':10, 'b':20, 'c':30}
    local_dict.update(d)
When I call this function from the interactive python shell it works all right, like this:
a = 1
UpdateLocals(locals())
print(a)
# prints 10
However, when I call UpdateLocals from inside a function, it doesn't do what I expect:
def TestUpdateLocals():
    a = 1
    UpdateLocals(locals())
    print(a)
    # prints 1
TestUpdateLocals()
How can I make the second case work like the first?
UPDATE:
Aswin's explanation makes sense and is very helpful to me. However I still want a mechanism to update the local variables. Before I figure out a less ugly approach, I'm going to do the following:
def LoadDictionary():
    return {'a': 10, 'b': 20, 'c': 30}
def TestUpdateLocals():
    a = 1
    for name, value in LoadDictionary().iteritems():
        exec('%s = value' % name)
Of course the construction of the string statements can be automated, and the details can be hidden from the user.
You have asked a very good question. In fact, the ability to update local variables is very useful when saving and loading datasets for machine learning or in games. However, most developers of the Python language have not come to realize its importance. They focus heavily on conformity and optimization, which are nevertheless important too.
Imagine you are developing a game or running a deep neural network (DNN): if all local variables are serializable, saving the entire game or DNN can be written in one line as print(locals()), and loading the entire game or DNN can be written in one line as locals().update(eval(sys.stdin.read())).
Currently, globals().update(...) takes immediate effect but locals().update(...) does not work, because the Python documentation says:
The default locals act as described for function locals() below:
modifications to the default locals dictionary should not be
attempted. Pass an explicit locals dictionary if you need to see
effects of the code on locals after function exec() returns.
Python is designed this way because of optimization and because the exec statement was turned into a function:
To modify the locals of a function on the fly is not possible without
several consequences: normally, function locals are not stored in a
dictionary, but an array, whose indices are determined at compile time
from the known locales. This collides at least with new locals added
by exec. The old exec statement circumvented this, because the
compiler knew that if an exec without globals/locals args occurred in
a function, that namespace would be "unoptimized", i.e. not using the
locals array. Since exec() is now a normal function, the compiler does
not know what "exec" may be bound to, and therefore can not treat is
specially.
Since globals().update(...) works, the following piece of code will work in the root namespace (i.e., outside any function), because locals() is the same as globals() in the root namespace:
locals().update({'a':3, 'b':4})
print(a, b)
But this will not work inside a function.
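However, as hacker-level Python programmers, we can use sys._getframe(1).f_locals instead of locals(). From what I have tested so far, on Python 3, the following piece of code works when f1() is called from module level, where the caller's f_locals is the module's globals dictionary:
import sys
def f1():
    # frame 1 is the caller of f1()
    sys._getframe(1).f_locals.update({'a':3, 'b':4})
    print(a, b)
f1()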
However, sys._getframe(1).f_locals does not work in root namespace.
The locals are not updated in the second case because, in the first case, the variable is declared at module level and so has global scope, whereas a variable declared inside a function is local to that function and does not exist outside it.
Thus, the original value of a is not changed by the UpdateLocals function.
PS: This might not be related to your question, but capitalized names like this are not good practice for functions in Python. PEP 8 recommends lowercase names with underscores:
update_locals() instead of UpdateLocals()
Edit To answer the question in your comment:
There is something called the call stack. Its main job during the execution of code is to manage local variables and to make sure control returns to the correct statement after the called function finishes.
So, every time a function call is made, a new entry is created on that stack,
which contains the line number (or instruction number) to which control has to return after the return statement, and a fresh set of local variables.
While control is inside the function, local variables are taken from that stack entry. Thus, the sets of locals in the two functions are not the same. The entry is popped from the stack when control exits the function. Thus, the changes you made inside the function are discarded, unless those variables have global scope.
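Since each call gets its own set of locals, a more predictable alternative is to keep the loaded values in an object of their own and read from it explicitly. The following is only a sketch of that design choice; the names load_dictionary and use_loaded_values are hypothetical.

```python
from types import SimpleNamespace


def load_dictionary():
    return {"a": 10, "b": 20, "c": 30}


def use_loaded_values():
    a = 1
    # Keep the loaded values in their own namespace object.
    loaded = SimpleNamespace(**load_dictionary())
    a = loaded.a  # rebind the local explicitly where it is needed
    return a, loaded.b, loaded.c


print(use_loaded_values())  # (10, 20, 30)
```

This costs a few extra characters (loaded.b instead of b) but works the same way on Python 2.7-style dictionaries and on every Python 3 version, with no dependence on frame internals.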
