Storing the entire workspace output? [duplicate]

I want to save all the variables in my current python environment. It seems one option is to use the 'pickle' module. However, I don't want to do this for 2 reasons:
1. I have to call pickle.dump() for each variable.
2. When I want to retrieve the variables, I must remember the order in which I saved them, and then do a pickle.load() to retrieve each variable.
I am looking for some command which would save the entire session, so that when I load this saved session, all my variables are restored. Is this possible?
Edit: I guess I don't mind calling pickle.dump() for each variable that I would like to save, but remembering the exact order in which the variables were saved seems like a big restriction. I want to avoid that.

If you use shelve, you do not have to remember the order in which the objects are pickled, since shelve gives you a dictionary-like object:
To shelve your work:
import shelve

T = 'Hiya'
val = [1, 2, 3]

filename = '/tmp/shelve.out'
my_shelf = shelve.open(filename, 'n')  # 'n' for new

for key in dir():
    try:
        my_shelf[key] = globals()[key]
    except TypeError:
        # __builtins__, my_shelf, and imported modules cannot be shelved.
        print('ERROR shelving: {0}'.format(key))
my_shelf.close()
To restore:
my_shelf = shelve.open(filename)
for key in my_shelf:
    globals()[key] = my_shelf[key]
my_shelf.close()
print(T)
# Hiya
print(val)
# [1, 2, 3]

Having sat here and failed to save the globals() as a dictionary, I discovered you can pickle a session using the dill library.
This can be done by using:
import dill  # pip install dill --user

filename = 'globalsave.pkl'
dill.dump_session(filename)

# and to load the session again:
dill.load_session(filename)

One very easy way that might satisfy your needs; for me, it worked pretty well:
Simply click the save icon in the Variable Explorer pane (on the right side of Spyder).

Here is a way of saving the Spyder workspace variables using the spyderlib functions:
#%% Load data from .spydata file
from spyderlib.utils.iofuncs import load_dictionary

data = load_dictionary(fpath)  # fpath: path to an existing .spydata file
globals().update(data[0])
#%% Save data to .spydata file
from spyderlib.utils.iofuncs import save_dictionary

def variablesfilter(d):
    from spyderlib.widgets.dicteditorutils import globalsfilter
    from spyderlib.plugins.variableexplorer import VariableExplorer
    from spyderlib.baseconfig import get_conf_path, get_supported_types

    settings = VariableExplorer.get_settings()
    data = globalsfilter(d,
                         check_all=True,
                         filters=tuple(get_supported_types()['picklable']),
                         exclude_private=settings['exclude_private'],
                         exclude_uppercase=settings['exclude_uppercase'],
                         exclude_capitalized=settings['exclude_capitalized'],
                         exclude_unsupported=settings['exclude_unsupported'],
                         excluded_names=settings['excluded_names'] + ['settings', 'In'])
    return data

def saveglobals(filename):
    data = variablesfilter(globals())
    save_dictionary(data, filename)
#%%
savepath = 'test.spydata'
saveglobals(savepath)
Let me know if it works for you.
David B-H

What you're trying to do is hibernate your process. This has been discussed before, and the conclusion is that several hard-to-solve problems arise when you try, for example restoring open file descriptors.
It is better to think about a serialization/deserialization subsystem for your program. It is not trivial in many cases, but it is a far better solution in the long run.
That said, perhaps I'm exaggerating the problem. You can try pickling your global variables dict. Use globals() to access the dictionary. Since it is indexed by variable name, you don't have to bother about the order.
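For instance, a minimal sketch of that idea; the filtering here (skipping underscore names and modules, probing for picklability) is illustrative, not exhaustive:
import pickle
import types

def save_globals(path, g):
    '''Pickle the picklable, non-module, non-private entries of g (pass globals()).'''
    state = {}
    for name, value in g.items():
        if name.startswith('_') or isinstance(value, types.ModuleType):
            continue
        try:
            pickle.dumps(value)  # probe: skip unpicklable objects (open files, ...)
        except Exception:
            continue
        state[name] = value
    with open(path, 'wb') as f:
        pickle.dump(state, f)

def load_globals(path, g):
    '''Restore the saved entries into g (pass globals()).'''
    with open(path, 'rb') as f:
        g.update(pickle.load(f))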

If you want the accepted answer abstracted into functions, you can use:
import shelve

def save_workspace(filename, names_of_spaces_to_save, dict_of_values_to_save):
    '''
    filename = location to save the workspace.
    names_of_spaces_to_save = use dir() from the calling scope to save all
        variables defined there.
        -dir() returns the list of names in the current local scope.
    dict_of_values_to_save = use globals() or locals() to save all variables.
        -globals() returns a dictionary representing the current global symbol
         table. This is always the dictionary of the current module (inside a
         function or method, this is the module where it is defined, not the
         module from which it is called).
        -locals() updates and returns a dictionary representing the current
         local symbol table. Free variables are returned by locals() when it
         is called in function blocks, but not in class blocks.

    Example of globals() and dir():
    >>> x = 3  # note the variable name and value below
    >>> globals()
    {'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'x': 3, '__doc__': None, '__package__': None}
    >>> dir()
    ['__builtins__', '__doc__', '__name__', '__package__', 'x']
    '''
    my_shelf = shelve.open(filename, 'n')  # 'n' for new
    for key in names_of_spaces_to_save:
        try:
            my_shelf[key] = dict_of_values_to_save[key]
        except TypeError:
            # __builtins__, my_shelf, and imported modules cannot be shelved.
            pass
    my_shelf.close()

def load_workspace(filename, parent_globals):
    '''
    filename = location of the saved workspace.
    parent_globals = pass globals() to load the workspace saved in filename
        into the current scope.
    '''
    my_shelf = shelve.open(filename)
    for key in my_shelf:
        parent_globals[key] = my_shelf[key]
    my_shelf.close()
An example script using this:
import my_pkg as mp

x = 3
mp.save_workspace('a', dir(), globals())
To get/load the workspace:
import my_pkg as mp

x = 1
mp.load_workspace('a', globals())
print(x)  # prints 3 for me
It worked when I ran it. I will admit I don't understand dir() and globals() 100%, so I am not sure if there might be some weird caveat, but so far it seems to work. Comments are welcome :)
After some more research: if you call save_workspace as I suggested with globals(), and save_workspace is within a function, it won't work as expected if you want to save the variables in a local scope. For that, use locals(). This happens because globals() takes the globals from the module where the function is defined, not from where it is called, would be my guess.
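A sketch of that caveat (the function and variable names are illustrative):
import my_pkg as mp

def run():
    y = 42
    # inside a function, pass dir() and locals() of *this* scope;
    # globals() here would refer to the module, not to run()'s variables
    mp.save_workspace('b', dir(), locals())

run()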

You can save it as a text file or a CSV file. People use Spyder, for example, to save variables, but it has a known issue: for specific data types it fails to import down the road.

Related

accessing and changing module level variable [duplicate]

I've run into a bit of a wall importing modules in a Python script. I'll do my best to describe the error, why I run into it, and why I'm trying this particular approach to solve my problem (which I will describe in a second):
Let's suppose I have a module in which I've defined some utility functions/classes, which refer to entities defined in the namespace into which this auxiliary module will be imported (let "a" be such an entity):
module1:
def f():
    print a
And then I have the main program, where "a" is defined, into which I want to import those utilities:
import module1
a=3
module1.f()
Executing the program will trigger the following error:
Traceback (most recent call last):
  File "Z:\Python\main.py", line 10, in <module>
    module1.f()
  File "Z:\Python\module1.py", line 3, in f
    print a
NameError: global name 'a' is not defined
Similar questions have been asked in the past (two days ago, d'uh) and several solutions have been suggested; however, I don't really think these fit my requirements. Here's my particular context:
I'm trying to make a Python program which connects to a MySQL database server and displays/modifies data with a GUI. For cleanliness' sake, I've defined the bunch of auxiliary/utility MySQL-related functions in a separate file. However, they all have a common variable, which I had originally defined inside the utilities module, and which is the cursor object from the MySQLdb module.
I later realised that the cursor object (which is used to communicate with the db server) should be defined in the main module, so that both the main module and anything that is imported into it can access that object.
End result would be something like this:
utilities_module.py:
def utility_1(args):
    # code which references a variable named "cur"
def utility_n(args):
    # etcetera
And my main module:
program.py:
import MySQLdb, Tkinter
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
And then, as soon as I try to call any of the utilities functions, it triggers the aforementioned "global name not defined" error.
A particular suggestion was to have a "from program import cur" statement in the utilities file, such as this:
utilities_module.py:
from program import cur
#rest of function definitions
program.py:
import Tkinter, MySQLdb
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
But that's a cyclic import or something like that and, bottom line, it crashes too. So my question is:
How in hell can I make the "cur" object, defined in the main module, visible to those auxiliary functions which are imported into it?
Thanks for your time and my deepest apologies if the solution has been posted elsewhere. I just can't find the answer myself and I've got no more tricks in my book.
Globals in Python are global to a module, not across all modules. (Many people are confused by this, because in, say, C, a global is the same across all implementation files unless you explicitly make it static.)
There are different ways to solve this, depending on your actual use case.
Before even going down this path, ask yourself whether this really needs to be global. Maybe you really want a class, with f as an instance method, rather than just a free function? Then you could do something like this:
import module1
thingy1 = module1.Thingy(a=3)
thingy1.f()
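(For reference, a minimal sketch of the Thingy class that snippet assumes; the name and attribute are illustrative:)
# module1.py
class Thingy(object):
    def __init__(self, a):
        self.a = a

    def f(self):
        print(self.a)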
If you really do want a global, but it's just there to be used by module1, set it in that module.
import module1
module1.a=3
module1.f()
On the other hand, if a is shared by a whole lot of modules, put it somewhere else, and have everyone import it:
import shared_stuff
import module1
shared_stuff.a = 3
module1.f()
… and, in module1.py:
import shared_stuff
def f():
print shared_stuff.a
Don't use a from import unless the variable is intended to be a constant. from shared_stuff import a would create a new a variable initialized to whatever shared_stuff.a referred to at the time of the import, and this new a variable would not be affected by assignments to shared_stuff.a.
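A short demonstration of that pitfall, assuming shared_stuff.py defines a = 0:
import shared_stuff
from shared_stuff import a  # binds a new name to the current value, 0

shared_stuff.a = 3
print(a)                # 0 -- the snapshot is unaffected by the assignment
print(shared_stuff.a)   # 3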
Or, in the rare case that you really do need it to be truly global everywhere, like a builtin, add it to the builtin module. The exact details differ between Python 2.x and 3.x. In 3.x, it works like this:
import builtins
import module1
builtins.a = 3
module1.f()
As a workaround, you could consider setting environment variables in the outer layer, like this.
main.py:
import os
os.environ['MYVAL'] = str(myintvariable)
mymodule.py:
import os
myval = None
if 'MYVAL' in os.environ:
    myval = os.environ['MYVAL']
As an extra precaution, handle the case when MYVAL is not defined inside the module.
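Note that environment variables are always strings; a minimal sketch of reading one back with a default and a type conversion (the default '0' is illustrative):
import os
myval = int(os.environ.get('MYVAL', '0'))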
This post is just an observation of Python behaviour I encountered. Maybe the advice you read above won't work for you if you did the same thing I did below.
Namely, I have a module which contains global/shared variables (as suggested above):
#sharedstuff.py
globaltimes_randomnode = []
globalist_randomnode = []
Then I had the main module, which imports the shared stuff with:
import sharedstuff as shared
and some other modules that actually populate these arrays. These are called by the main module. When exiting these other modules, I can clearly see that the arrays are populated, but when reading them back in the main module they were empty. This was rather strange to me (well, I am new to Python). However, when I changed the way I import sharedstuff.py in the main module to:
from sharedstuff import *
it worked (the arrays were populated).
Just sayin'
A function uses the globals of the module it's defined in. Instead of setting a = 3, for example, you should be setting module1.a = 3. So, if you want cur available as a global in utilities_module, set utilities_module.cur.
A better solution: don't use globals. Pass the variables you need into the functions that need it, or create a class to bundle all the data together, and pass it when initializing the instance.
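For example, a minimal sketch of the class-based approach applied to the cursor case (DBUtilities and utility_1 are illustrative names):
class DBUtilities(object):
    def __init__(self, cur):
        self.cur = cur

    def utility_1(self, query):
        self.cur.execute(query)
        return self.cur.fetchall()

# in the main program:
# utils = DBUtilities(db.cursor())
# rows = utils.utility_1("SELECT ...")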
The easiest solution to this particular problem would have been to add another function within the module that would have stored the cursor in a variable global to the module. Then all the other functions could use it as well.
module1:
cursor = None

def setCursor(cur):
    global cursor
    cursor = cur

def method(some, args):
    global cursor
    do_stuff(cursor, some, args)
main program:
import module1
cursor = get_a_cursor()
module1.setCursor(cursor)
module1.method(some, args)
Since globals are module-specific, you can add the following function to all imported modules, and then use it to:
Add singular variables (in dictionary format) as globals for those modules
Transfer your main module's globals to them

addglobals = lambda x: globals().update(x)
Then all you need to pass on current globals is:
import module
module.addglobals(globals())
Since I haven't seen it in the answers above, I thought I would add my simple workaround, which is just to add a global_dict argument to the function requiring the calling module's globals, and then pass the dict into the function when calling; e.g.:
# external_module
def imported_function(global_dict=None):
    print(global_dict["a"])

# calling_module
a = 12
from external_module import imported_function
imported_function(global_dict=globals())
# prints: 12
The OOP way of doing this would be to make your module a class instead of a set of unbound methods. Then you could use __init__ or a setter method to set the variables from the caller for use in the module methods.
Update
To test the theory, I created a module and put it on pypi. It all worked perfectly.
pip install superglobals
Short answer
This works fine in Python 2 or 3:
import inspect
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals
save as superglobals.py and employ in another module thusly:
from superglobals import *
superglobals()['var'] = value
Extended Answer
You can add some extra functions to make things more attractive.
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals

def getglobal(key, default=None):
    """
    getglobal(key[, default]) -> value

    Return the value for key if key is in the global dictionary, else default.
    """
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals.get(key, default)

def setglobal(key, value):
    _globals = superglobals()
    _globals[key] = value

def defaultglobal(key, value):
    """
    defaultglobal(key, value)

    Set the value of global variable `key` if it is not otherwise set.
    """
    _globals = superglobals()
    if key not in _globals:
        _globals[key] = value
Then use thusly:
from superglobals import *
setglobal('test', 123)
defaultglobal('test', 456)
assert(getglobal('test') == 123)
Justification
The "python purity league" answers that litter this question are perfectly correct, but in some environments (such as IDAPython) which is basically single threaded with a large globally instantiated API, it just doesn't matter as much.
It's still bad form and a bad practice to encourage, but sometimes it's just easier. Especially when the code you are writing isn't going to have a very long life.

How can I call a function that is stored as a variable from a python file that is also stored as a variable?

I can import a python script using import_module. But, how can I call a function stored as a variable from that script? I've previously used getattr to work with dictionaries stored as variables, but I don't think this same method works with functions. Here's an example that does not currently work:
from importlib import import_module

file_list = ['file1', 'file2']
func_list = ['func1', 'func2']

for file in file_list:
    test_file = import_module(file)
    for func in func_list:
        from test_file import func
file1:
def func1():
    ...
def func2():
    ...
file2:
def func1():
    ...
def func2():
    ...
I can import a python script using import_module.
When you do this, the result is a module object - just the same as an import statement provides.
from test_file import func
The reason this doesn't work is because it is looking for a test_file module - and it cares about module names as they appear in sys.path, not about your local variable names.
Fortunately, since you already have the module object, you presumably realized you could access the contents normally, as attributes, e.g. test_file.func.
I've previously used getattr to work with dictionaries stored as variables, but I don't think this same method works with functions
I'm not quite sure what you mean here. Attributes are attributes, whether they're plain data, functions, classes or anything else. test_file is a thing that has a func attribute, therefore getattr(test_file, 'func') gets that attribute.
The remaining issue is the variable-variables problem - you don't really want to be creating a name for that result dynamically. So yes, you can store that in a dict, if you want. But frankly it's easier to just use the module object. Unless perhaps for some reason you need/want to "trim" the contents and only expose a limited interface (for some other client); but you can't avoid loading the whole module. from X import Y does that anyway.
The module object that you got from the dynamic import is already working as a namespace, which you need here anyway because you're importing multiple modules that have overlapping attribute names.
tl;dr: if you want to call a function from that imported module, just do it the same way that you would have if you had imported the module (not a name from that module) normally. We can, for example, put the imported modules in a list:
modules = [import_module(f) for f in filenames]
and then call the appropriate method by looking it up within the appropriate module object:
modules[desired_module_id].desired_func()
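For completeness, a short sketch combining import_module with getattr, using the names from the question:
from importlib import import_module

file_list = ['file1', 'file2']
func_list = ['func1', 'func2']

for module_name in file_list:
    module = import_module(module_name)
    for func_name in func_list:
        func = getattr(module, func_name)  # look up the function object
        func()                             # and call it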
Basically, you would run this code in a separate file; where it says the_file_where_this_is_needed.py, insert the name of the file in which you want these import statements to appear. (You could probably also run this code from that very file.) It is sort of like hardcoding, but automated:
file_list = ['file1', 'file2']
func_list = ['func1', 'func2']

with open('the_file_where_this_is_needed.py', 'r') as file:
    data = file.read()

string = ''
for module_name in file_list:
    for func in func_list:
        string += f'from {module_name} import {func}\n'

data = string + data

with open('the_file_where_this_is_needed.py', 'w') as file:
    file.write(data)

Module namespace initialisation before execution

I'm trying to dynamically update code during runtime by reloading modules using importlib.reload. However, I need a specific module variable to be set before the module's code is executed. I could easily set it as an attribute after reloading but each module would have already executed its code (e.g., defined its default arguments).
A simple example:
# module.py
def do():
    try:
        print(a)
    except NameError:
        print('failed')
# main.py
import module
module.do() # prints failed
module.a = 'succeeded'
module.do() # prints succeeded
The desired pseudocode:
import_module_without_executing_code module
module.initialise(a = 'succeeded')
module.do()
Is there a way to control module namespace initialisation (like with classes using metaclasses)?
It's not usually a good idea to use reload other than for interactive debugging. For example, it can easily create situations where two objects of type module.A are not the same type.
What you want is execfile. Pass a globals dictionary (you don't need an explicit locals dictionary) to keep each execution isolated; anything you store in it ahead of time acts exactly like the "pre-set" variables you want. If you do want to have a "real" module interface change, you can have a wrapper module that calls (or just holds as an attribute) the most recently loaded function from your changing file.
Of course, since you're using Python 3, you'll have to use one of the replacements for execfile.
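For instance, a minimal sketch of that replacement, using the module.py from the question (exec with an explicit globals dict stands in for execfile):
# pre-seed the namespace: this acts like the "pre-set" variable
namespace = {'a': 'succeeded'}

with open('module.py') as f:
    code = compile(f.read(), 'module.py', 'exec')
exec(code, namespace)   # the module's code runs with `a` already defined

namespace['do']()       # prints: succeeded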
Strictly speaking, I don't believe there is a way to do what you're describing in Python natively. However, assuming you own the module you're trying to import, a common approach with Python modules that need some initializing input is to use an init function.
If all you need is some internal variables to be set, like a in you example above, that's easy: just declare some module-global variables and set them in your init function:
Demo: https://repl.it/MyK0
Module:
## mymodule.py
a = None

def do():
    print(a)

def init(_a):
    global a
    a = _a
Main:
## main.py
import mymodule
mymodule.init(123)
mymodule.do()
mymodule.init('foo')
mymodule.do()
Output:
123
foo
Where things can get trickier is if you need to actually redefine some functions because some dynamic internal something is dependent on the input you give. Here's one solution, borrowed from https://stackoverflow.com/a/1676860. Basically, the idea is to grab a reference to the current module by using the magic variable __name__ to index into the system module dictionary, sys.modules, and then define or overwrite the functions that need it. We can define the functions locally as inner functions, then add them to the module:
Demo: https://repl.it/MyHT/2
Module:
## mymodule.py
import sys

def init(a):
    current_module = sys.modules[__name__]

    def _do():
        try:
            print(a)
        except NameError:
            print('failed')

    current_module.do = _do
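And a sketch of driving it from main.py, mirroring the earlier demo:
## main.py
import mymodule

mymodule.init('succeeded')
mymodule.do()  # prints: succeeded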

How do I detect if a class / variable was imported in Python 3?

This is the contents of script_one.py:
x = "Hello World"
This is the contents of script_two.py:
from script_one import x
print(x)
Now, if I ran script_two.py the output would be:
>>> Hello World
What I need is a way to detect if x was imported.
This is what I imagine the source code of script_one.py would look like:
x = "Hello World"
if x.has_been_imported:
print("You've just imported \"x\"!")
Then if I ran script_two.py the output "should" be:
>>> Hello World
>>> You've just imported "x"!
What is this called, does this feature exist in Python 3 and how do you use it?
You can't. Effort expended on trying to detect this is a waste of time, I'm afraid.
Python imports consist of the following steps:
Check if the module is already loaded by looking at sys.modules.
If the module hasn't been loaded yet, load it. This creates a new module object that is added to sys.modules, containing all objects resulting from executing the top-level code.
Bind names in the importing namespace. How names are bound depends on the exact import variant chosen.
import module binds the name module to the sys.modules[module] object
import module as othername binds the name othername to the sys.modules[module] object
from module import attribute binds the name attribute to the sys.modules[module].attribute object
from module import attribute as othername binds the name othername to the sys.modules[module].attribute object
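A quick sketch of steps 1 and 2 at work, reusing script_one from the question:
import sys

import script_one             # first import: runs script_one's top-level code
assert 'script_one' in sys.modules

import script_one as again    # already in sys.modules: not re-executed
assert again is sys.modules['script_one']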
In this context it is important to realise that Python names are just references; all Python objects (including modules) live on a heap and stand or fall with the number of references to them. See this great article by Ned Batchelder on Python names if you need a primer on how this works.
Your question then can be interpreted in two ways:
You want to know whether the module has been imported. The moment code in the module is executed (like x = "Hello World"), it has been imported. All of it. Python doesn't load just x here; it's all or nothing.
You want to know if other code is using a specific name. You'd have to track what other references exist to the object. This is a mammoth task involving recursively checking the gc.get_referrers() object chain to see what other Python objects might now refer to x.
The latter goal is made even harder in any of the following scenarios:
import script_one, then use script_one.x; references like these could be too short-lived for you to detect.
from script_one import x, then del x. Unless something else still references the same string object within the imported namespace, that reference is now gone and can't be detected anymore.
import sys; sys.modules['script_one'].x is a legitimate way of referencing the same string object, but does this count as an import?
import script_one, then list(vars(script_one).values()) would create a list of all objects defined in the module, but these references are indices in a list, not named. Does this count as an import?
This used to look impossible, but since Python 3.7 introduced module-level __getattr__, it is now possible: at least we can distinguish whether a variable was imported via from module import variable or accessed via import module; module.variable.
The idea is to inspect the AST node in the caller's frame and check whether it is an Attribute access:
script_one.py
def _variables():
    # the variables have to live behind a function call
    # so that attribute access doesn't bypass __getattr__
    return {'x': 'Hello world!'}

def __getattr__(name):
    try:
        out = _variables()[name]
    except KeyError as kerr:
        raise ImportError(kerr)

    import sys
    from executing import Source

    frame = sys._getframe(1)
    node = Source.executing(frame).node
    if node is None:
        print('`x` is imported')
    else:
        print('`x` is accessed via `script_one.x`')
    return out
script_two.py
from script_one import x
print(x)
# `x` is imported
# Hello world!

import script_one
print(script_one.x)
# `x` is accessed via `script_one.x`
# Hello world!

How to determine if a variable exists in another Python file

I have two python files. From python file #1, I want to check to see if there is a certain global variable defined in python file #2.
What is the best way to do this?
You can directly test whether the file2 module (which is a module object) has an attribute with the right name:
import file2
if hasattr(file2, 'varName'):
    # varName is defined in file2…
This may be more direct and legible than the try… except… approach (depending on how you want to use it).
try:
    from file import varName
except ImportError:
    print('var not found')
Alternatively you could do this (if you already imported the file):
import file
# ...
try:
    v = file.varName
except AttributeError:
    print('var not found')
This will work only if the var is global. If you are after scoped variables, you'll need to use introspection.
With the getattr() built-in function you can also specify a default value:
import file2
myVar = getattr(file2, 'varName', False)
See the documentation
