Python: `locals()` as a default function argument

Suppose I have a module PyFoo.py that has a function bar. I want bar to print all of the local variables associated with the namespace that called it.
For example:
#!/usr/bin/env python
import PyFoo as pf

var1 = 'hi'
print(locals())
pf.bar()
The last two lines would give the same output. So far I've tried defining bar as such:
def bar(x=locals):
    print(x())

def bar(x=locals()):
    print(x)
But neither works. The first ends up printing what's local to bar's own namespace (which I guess is because that's where locals ends up being called), and the second is as if I passed in globals (which I assume is because the default is evaluated during import).
Is there a way I can have the default value of argument x of bar be all variables in the namespace which called bar?
EDIT 2018-07-29:
As has been pointed out, what was given was an XY Problem; as such, I'll give the specifics.
The module I'm putting together will allow the user to create various objects that represent different aspects of a numerical problem (e.g. various topology definitions, boundary conditions, constitutive models, etc.) and define how any given object interacts with any other object(s). The idea is for the user to import the module, define the various model entities that they need, and then call a function which will take all objects passed to it, make needed adjustments to ensure compatibility between them, and then write out a file that represents the entire numerical problem as a text file.
The module has a function generate that accepts each of the various types of aspects of the numerical problem. The default value for all arguments is an empty list. If a non-empty list is passed, then generate will use those instances for generating the completed numerical problem. If an argument is an empty list, then I'd like it to take in all instances in the namespace that called generate (which I will then parse out the appropriate instances for the argument).
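A sketch of that signature (the argument names here are illustrative, and None stands in for the empty-list default so a single mutable list isn't shared across calls):

def generate(topologies=None, boundary_conditions=None, constitutive_models=None):
    # treat "nothing passed" as an empty list without a shared mutable default
    topologies = topologies if topologies is not None else []
    boundary_conditions = boundary_conditions if boundary_conditions is not None else []
    constitutive_models = constitutive_models if constitutive_models is not None else []
    # ...make needed adjustments and write out the problem file...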
EDIT 2018-07-29:
Sorry for any lack of understanding on my part (I'm not that strong a programmer), but I think I might understand what you're saying with respect to an instance being declared or registered.
From my limited understanding, could this be done by creating some sort of registry dataset (like a list or dict) in the module, created when the module is imported, which all module classes take in by default? During class initialization, self could be appended to that dataset, and the generate function would then take the registry as the default value for one of its arguments.
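In code, I imagine that registry looking something like this (the class and function names are hypothetical):

# inside the module, created once at import time
_registry = []

class Topology:  # hypothetical model-entity class
    def __init__(self, name, registry=_registry):
        self.name = name
        registry.append(self)  # each new instance registers itself

def generate(entities=None, registry=_registry):
    # fall back to every registered instance if nothing is passed explicitly
    if entities is None:
        entities = list(registry)
    # ...parse out the appropriate instances and write the file...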

There's no way you can do what you want directly.
locals just returns the local variables in whatever namespace it's called in. As you've seen, you have access to the namespace the function is defined in at the time of definition, and you have access to the namespace of the function itself from within the function, but you don't have access to any other namespaces.
You can do what you want indirectly… but it's almost certainly a bad idea. At least this smells like an XY problem, and whatever it is you're actually trying to do, there's probably a better way to do it.
But occasionally it is necessary, so in case you have one of those cases:
The main good reason to want to know the locals of your caller is for some kind of debugging or other introspection function. And the way to do introspection is almost always through the inspect library.
In this case, what you want to inspect is the interpreter call stack. The calling function will be the first frame on the call stack behind your function's own frame.
You can get the raw stack frame:
inspect.currentframe().f_back
… or you can get a FrameInfo representing it:
inspect.stack()[1]
As explained at the top of the inspect docs, a frame object's local namespace is available as:
frame.f_locals
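Putting the pieces together, a minimal sketch of bar (PyFoo and bar are the names from the question):

# PyFoo.py
import inspect

def bar():
    # the calling frame sits one step behind bar's own frame
    caller = inspect.currentframe().f_back
    try:
        print(caller.f_locals)
    finally:
        del caller  # don't keep the frame object alive (see the warning below)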
Note that this has all the same caveats that apply to getting your own locals with locals: what you get isn't the live namespace, but a mapping that, even if it is mutable, can't be used to modify the namespace (or, worse, in 2.x, one that may or may not modify the namespace, unpredictably), and that has all cell and free variables flattened into their values rather than their cell references.
Also, see the big warning in the docs about not keeping frame objects alive unnecessarily (or calling their clear method if you need to keep a snapshot but not all of the references, but I think that only exists in 3.x).

Related

use of attributes in python

This is kind of a high level question. I'm not sure what you'd do with code like this:
class Object(object):
    pass

obj = Object
obj.a = lambda: None
obj.d = lambda: dict
setattr(obj.d, 'dictionary', {4, 3, 5})
setattr(obj.a, 'somefield', 'somevalue')
If I'm going to access obj.a.somefield, why would I use print? It feels redundant.
I simply can't see what programming strictly by setting attributes would be good for.
I could write an entire program with all of my variables in object classes.
First, about your print question: print is used more for debugging, or for attributes that are an output from an object and give you information when you create it.
For example, there might be an object that you create by passing it data and it finds all of the basic statistics information of that data. You could have it return a dictionary via a method and access the values from there or you could simply access it via an attribute, making the data more readable.
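A quick sketch of that kind of object (the class name and statistics are hypothetical):

class Stats:  # hypothetical example class
    def __init__(self, data):
        # compute summary statistics once and expose them as attributes
        self.count = len(data)
        self.mean = sum(data) / len(data)
        self.minimum = min(data)
        self.maximum = max(data)

s = Stats([4, 3, 5])
print(s.mean)  # reads more naturally than s.summary()['mean']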
For the second part of your question, about why you would want to use attributes in general: they're more for internally passing information from function to function in an object, or for configuring an object. Python has different scopes that determine which information each function can access. All methods of an object can access that object's attributes, which allows you to avoid using external or global variables. That makes your object nice and self-contained. Global variables are generally avoided because they can get messy, so they're considered bad practice.
Taking that a step further, using setattr is a more sophisticated way of setting these attributes to make your code more readable. You could use a function to modify aspects of an object or you could "hide" the complexity inside your setattr so the user can use a higher level interface rather than getting bogged down in the specifics.
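For example, setattr is useful when the attribute names are only known at runtime; a small sketch with hypothetical names:

defaults = {'color': 'red', 'size': 10}

class Widget:  # hypothetical example class
    pass

w = Widget()
# the attribute names come from a mapping at runtime,
# so plain dotted assignment can't be used here
for name, value in defaults.items():
    setattr(w, name, value)

print(w.color, w.size)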

How do I iterate through a dictionary/set in SLY?

So, I'm trying to transition my code from my earlier PLY implementation to SLY. Previously, I had some code that loaded a binary file with a wide range of reserved words scraped from documentation of the scripting language I'm trying to implement. However, when I try to iterate through the scraped items in the lexer for SLY, I get an error from LexerMetaDict's __setitem__ while iterating through the resulting set:
Exception has occurred: AttributeError
Name transition redefined
File "C:\dev\sly\sly\lex.py", line 126, in __setitem__
raise AttributeError(f'Name {key} redefined')
File "C:\dev\sly\example\HeroLab\HeroLab.py", line 24, in HeroLabLexer
for transition in transition_set:
File "C:\dev\sly\example\HeroLab\HeroLab.py", line 6, in <module>
class HeroLabLexer(Lexer):
The code in question:
from sly import Lexer
from transitions import transition_set, reference_set

class HeroLabLexer(Lexer):
    # initial token assignments
    for transition in transition_set:
        tokens.add(transition)
I might not be as surprised if it were happening when trying to add to the tokens, since I'm still trying to figure out how to interface with the SLY method for defining things, but if I change that line to a print statement, it still fails when I iterate through the second item in "transition_set". I've tried renaming the various variables, but to little avail.
The error you get is the result of a modification Sly makes to the Lexer metaclass, which I discuss below. But for a simple answer: I assume tokens is a set, so you can easily avoid the problem with
tokens |= transition_set
If transition_set were an iterable but not a set, you could use the update method, which works with any iterable (and any number of iterable arguments):
tokens.update(transition_set)
tokens doesn't have to be a set. Sly should work with any iterable. But you might need to adjust the above expressions. If tokens is a tuple or a list, you'd use += instead of |= and, in the case of lists, extend instead of update. (There are some minor differences, as with the fact that set.update can be used to merge several sets.)
That doesn't answer your direct question, "How do I iterate... in SLY". I interpret that as asking:
How do I write a for loop at class scope in a class derived from sly.Lexer?
and that's a harder question. The games Sly plays with Python namespaces make it difficult to use for loops at the class scope of a lexer, because Sly replaces the attribute dictionary of the Lexer class (and its subclasses) with a special dictionary which doesn't allow redefinition of attributes with string values. Since the iteration variable in a for statement is in the enclosing scope (in this case, the class scope), any for loop with a string index variable whose body runs more than once will trigger the "Name redefined" error which you experienced.
It's also worth noting that if you use a for statement at class scope in any class, the last value of the iteration variable will become a class attribute. That's hardly ever desirable, and, really, that construction is not good practice in any class. But it doesn't usually throw an error.
At class scope, you can use comprehensions (whose iteration variables are effectively locals). Of course, in this case, there's no advantage in writing:
tokens.update(transition for transition in transition_set)
But the construct might be useful for other situations. Note, however, that other variables at class scope (such as tokens) are not visible in the body of the comprehension, which might also create difficulties.
Although it's extremely ugly, you can declare the iteration variable as a global, which makes it a module variable rather than a class variable (and therefore just trades one bad practice for another one, although you can later remove the variable from the module).
You could do the computation in a different scope (such as a global function), or you could write a (global) generator to use with tokens.update(), which is probably the most general solution.
Finally, you can make sure that the index variable is never an instance of a str.
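As a sketch of the global-function approach mentioned above (reusing transition_set from the question):

from sly import Lexer
from transitions import transition_set

def _build_tokens():
    # the loop runs in ordinary function scope, so its iteration variable
    # never touches the lexer class's guarded attribute dictionary
    result = set()
    for transition in transition_set:
        result.add(transition)
    return result

class HeroLabLexer(Lexer):
    tokens = _build_tokens()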

LLDB Python scripting create variable

I am using LLDB Python scripting support to add custom Variable Formatting for a complex C++ class type in XCode.
This is working well for simple situations, but I have hit a wall when I need to call a method which uses a pass-by-reference parameter that it populates with results. This would require me to create a variable to pass in, but I can't find a way to do this.
I have tried using the target's CreateValueFromData method, as below, but this doesn't seem to work.
import lldb

def MyClass(valobj, internal_dict):
    class2_type = valobj.target.FindFirstType('class2')
    process = valobj.process
    class2Data = [0]
    data = lldb.SBData.CreateDataFromUInt32Array(process.GetByteOrder(), process.GetAddressByteSize(), class2Data)
    valobj.target.CreateValueFromData("testClass2", data, class2_type)
    valobj.EvaluateExpression("getType(testClass2)")
    class2Val = valobj.frame.FindVariable("testClass2")
    if not class2Val.error.success:
        return class2Val.error.description
    return class2Val.GetValueAsUnsigned()
Is there some way to be able to achieve what I'm trying to do?
SBValue names are just labels for the SBValue; they aren't guaranteed to exist as symbols in the target. For instance, if the value you are formatting is an ivar of some other object, its name will be the ivar's name... And lldb does not inject new SBValue names into the symbol table; that would end up causing lots of name collisions. So they don't exist in the namespace the expression evaluator queries when looking up names.
If the variable you are formatting is a pointer, you can get the pointer value and cons up an expression that casts the pointer value to the appropriate type for your getType function, and pass that to your function. If the value is not a pointer, you can still use SBValue.AddressOf to get the memory location of the value. If the value exists only in lldb (AddressOf will return an invalid address) then you would have to push it to the target with SBProcess.AllocateMemory/WriteMemory, but that should only happen if you have another data formatter that makes these objects out of whole cloth for its own purposes.
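A hedged sketch of that approach (getType and class2 come from the question; the exact expression shape is an assumption):

import lldb

def class2_summary(valobj, internal_dict):
    # hypothetical formatter illustrating the pointer/AddressOf approach
    if valobj.TypeIsPointerType():
        addr = valobj.GetValueAsUnsigned()              # the pointer value itself
    else:
        addr = valobj.AddressOf().GetValueAsUnsigned()  # location of the value in the target
    if addr == 0:
        return '<no address in the target>'
    # cast the address in the expression so getType() binds to a real object
    result = valobj.frame.EvaluateExpression('getType(*(class2 *)0x%x)' % addr)
    if result.error.success:
        return result.GetValueAsUnsigned()
    return result.error.description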
It's better not to call functions in formatters if you can help it. But if you really must call a function in your data formatter, you should do that judiciously.
They can cause performance problems: if you have an array of 100 elements of this type, your formatter will require 100 function calls into the target to render the array, for every step operation. That's 200 context switches between your process and the debugger, plus a bunch of memory reads and writes.
Also, since you can't ensure that the data in your value is correct (it might represent a variable that has not been initialized yet, or already deallocated) you either need to have your function handle bad data, or at least be prepared for the expression to crash. lldb can clean up the stack and suppress the exception from crashes, but it can't undo any side-effects the expression might have had before crashing.
For instance, if the function you called took some lock before crashing that it was expecting to release on the way out, your formatter will damage the state of the program. So you have to be careful what you call...
And by default, EvaluateExpression will allow all threads to run so that expressions don't deadlock against a lock held by another thread. You probably don't want that to happen, since that means looking at the locals of one thread will "change" the state of another thread. So you really should only call functions you are sure don't take locks. And use the version of EvaluateExpression that takes an SBExpressionOptions, on which you call SetStopOthers(True) and SetTryAllThreads(False).
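A minimal sketch of setting those options (valobj and expr are assumed to come from a formatter like the one sketched above):

import lldb

options = lldb.SBExpressionOptions()
options.SetStopOthers(True)      # keep other threads suspended during the call
options.SetTryAllThreads(False)  # don't retry the expression with all threads running

# pass the options as the second argument when evaluating
result = valobj.frame.EvaluateExpression(expr, options)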

Why is __code__ for a function (Python) mutable

In a previous question yesterday, in comments, I came to know that in Python the __code__ attribute of a function is mutable. Hence I can write code like the following:
def foo():
    print("Hello")

def foo2():
    print("Hello 2")

foo()
foo.__code__ = foo2.__code__
foo()
Output
Hello
Hello 2
I tried googling, but either because there is no information (I highly doubt this), or because the keyword (__code__) is not easily searchable, I couldn't find a use case for this.
It doesn't seem like "because most things in Python are mutable" is a reasonable answer either, because other attributes of functions — __closure__ and __globals__ — are explicitly read-only (from Objects/funcobject.c):
static PyMemberDef func_memberlist[] = {
    {"__closure__", T_OBJECT, OFF(func_closure),
                    RESTRICTED|READONLY},
    {"__doc__",     T_OBJECT, OFF(func_doc), PY_WRITE_RESTRICTED},
    {"__globals__", T_OBJECT, OFF(func_globals),
                    RESTRICTED|READONLY},
    {"__module__",  T_OBJECT, OFF(func_module), PY_WRITE_RESTRICTED},
    {NULL}  /* Sentinel */
};
Why would __code__ be writable while other attributes are read-only?
The fact is, most things in Python are mutable. So the real question is, why are __closure__ and __globals__ not?
The answer initially appears simple. Both of these things are containers for variables which the function might need. The code object itself does not carry its closed-over and global variables around with it; it merely knows how to get them from the function. It grabs the actual values out of these two attributes when the function is called.
But the scopes themselves are mutable, so this answer is unsatisfying. We need to explain why modifying these things in particular would break stuff.
For __closure__, we can look to its structure. It is not a mapping, but a tuple of cells. It doesn't know the names of the closed-over variables. When the code object looks up a closed-over variable, it needs to know its position in the tuple; they match up one-to-one with co_freevars which is also read-only. And if the tuple is of the wrong size or not a tuple at all, this mechanism breaks down, probably violently (read: segfaults) if the underlying C code isn't expecting such a situation. Forcing the C code to check the type and size of the tuple is needless busy-work which can be eliminated by making the attribute read-only. If you try to replace __code__ with something taking a different number of free variables, you get an error, so the size is always right.
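You can see that size check in action with a quick sketch (the exact error message varies by Python version):

def make_closure():
    x = 1
    def inner():
        return x          # inner has one free variable: x
    return inner

def no_free_vars():
    return 42             # this code object has no free variables

f = make_closure()
f.__code__ = no_free_vars.__code__
# ValueError: inner() requires a code object with 1 free vars, not 0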
For __globals__, the explanation is less immediately obvious, but I'll speculate. The scope lookup mechanism expects to have access to the global namespace at all times. Indeed, the bytecode may be hard-coded to go straight to the global namespace, if the compiler can prove no other namespace will have a variable with a particular name. If the global namespace was suddenly None or some other non-mapping object, the C code could, once again, violently misbehave. Again, making the code perform needless type checks would be a waste of CPU cycles.
Another possibility is that (normally-declared) functions borrow a reference to the module's global namespace, and making the attribute writable would cause the reference count to get messed up. I could imagine this design, but I'm not really sure it's a great idea since functions can be constructed explicitly with objects whose lifetimes might be shorter than that of the owning module, and these would need to be special-cased.

Python functional closures without performance hit

In my python program, I have a ton of functions that are really wrappers for more complicated functions (the more complicated functions take more arguments, so the simple functions calculate the extra arguments and pass them along with the original arguments to the complex functions). I don't want the more complicated functions to be visible from the outer scope. However, my understanding is that if you define a function inside a function every time the outer function gets called it redefines the inner function, which is wasteful. How can I hide my inner functions without redefining them over and over again? There must be some way for the interpreter to parse my file and just do the definitions once but still keep them in the inner scope.
Rather than controlling access to your "inner functions" by nesting them, use either or both of:
naming conventions (a leading underscore on a name means private-by-convention, see the style guide); and
defining a list named __all__ to specify what gets imported from the package by default (see the tutorial on modules).
In use:
# define the names that get imported from this package
__all__ = ['outer_func']

def _inner_func():
    """Private-by-convention inner function."""
    ...

def outer_func():
    """Public outer function to call _inner_func."""
    return _inner_func()
This makes testing much easier, too, as you can still get direct access to _inner_func when necessary.
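For example, assuming that module is saved as mymodule.py (a hypothetical name for this snippet):

from mymodule import *

outer_func()    # exported via __all__, so this works
_inner_func()   # NameError: excluded from the *-import by __all__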
I think the convention is to prepend the function name with a single underscore; two leading underscores trigger name mangling on class attributes rather than marking module-level functions as private.
(See: http://www.diveintopython.net/object_oriented_framework/private_functions.html)
