ipython run without destroying global variables defined in the target file

I want to define some globals in some number-crunching work I am doing. I am writing the script incrementally and don't want previous results to keep being reloaded/recalculated. One approach is to split mature code out into a separate file and only %run the new code interactively. However, I just want to do it in a single file for speed of development.
I was under the assumption that a global defined in a file would persist between invocations of %run, but it does not.
So my script has the following chunk of code:
if globals().has_key('all_post_freq') != True:
    print "creating all post freq var"
    global all_post_freq
    all_post_freq = all_post_freq_("pickle/all_post_freq.pickle")
How do I retain all_post_freq between invocations of IPython's %run?
Edit: OK, I have split stuff up into files, but I know there must be a way of doing what I need to do :D

When you %run a file, it is normally started in a blank namespace, and its globals are added to the interactive namespace when it finishes. There's a -i flag which will run it directly in the interactive namespace, so it will see variables you've already defined:
%run -i myscript.py
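With -i, a guard like the one in the question does what you want, because names created on the first run are still present in the namespace the next time the file is executed. A minimal Python 3 sketch (loading the pickle directly instead of through the question's all_post_freq_ helper):
# myscript.py -- run with: %run -i myscript.py
import pickle

if 'all_post_freq' not in globals():
    print("creating all post freq var")
    with open("pickle/all_post_freq.pickle", "rb") as f:  # path taken from the question
        all_post_freq = pickle.load(f)

# new, incrementally developed code can use all_post_freq here;
# on later %run -i invocations the pickle is not reloaded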

Related

IPython %run ignore print commands

I'm using several Jupyter Notebooks to split the tasks between different modules. In my main notebook I call another module with %run another_module.ipynb, which loads all my data. However, it also plots and prints everything I have in another_module.ipynb.
I want to keep the plots in another_module.ipynb to help me visualise the data, but I don't want to reprint everything when calling %run another_module.ipynb. Is there an option to prevent printing this?
Thanks
You could:
Override the print function and make it a no-op:
_print_function = print # create a backup in case you need it later
globals()["print"] = lambda *args, **kwargs: None
Run the file with the -i flag. Without -i, the file is run in a new namespace, so your modifications to the global variables are lost; with -i, the file is run in the current namespace.
%run -i another_module.ipynb
If you're using other methods to print logs (e.g., sys.stdout.write(), logging), it would be harder to create mocks for them. In that case, I would suggest redirecting the stdout or stderr pipe to /dev/null:
import os
import sys
sys.stdout = open(os.devnull, "w")
%run -i another_module.ipynb
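If you go the stdout-redirection route, you will probably want to restore the real stream afterwards, otherwise later cells stay silent too. A minimal sketch (the _stdout_backup name is just an illustrative choice):
import os
import sys

_stdout_backup = sys.stdout           # keep a reference to the real stdout
sys.stdout = open(os.devnull, "w")    # silence printed output
%run -i another_module.ipynb
sys.stdout.close()                    # close the devnull handle
sys.stdout = _stdout_backup           # restore normal printing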
Both methods are hacks and should only be used when you know the consequences. The better thing to do here is to change the code in your notebook: either add a --verbose flag to control output, or use a logging library (e.g., logging) that supports turning logging off entirely.

In Python, how to use many exec calls in a shared context?

I'm writing a Python program to execute embedded Python code in Verilog scripts. I thought of using the eval and exec functions, but I came across a problem: I would like to have all the execs and evals run in a shared context, without changing the main program's environment.
I'm putting exec and eval inside a function to call in the parsing routine:
# some parsing code
for embedded_code_string in list_of_embedded_code_strings:
    execute_embedded_code(embedded_code_string)
# more parsing code

def execute_embedded_code(embedded_code_string):
    exec(embedded_code_string)

# other routines involving io.StringIO for the redirection of stdout, which isn't the problem.
If the first embedded code string to be run is row_len = 1, and the second one is column_len = row_len * 2, then when running the second code snippet, row_len would be undefined. It's expected: after all, exec is running in the context of the function execute_embedded_code, and after the function finishes, the variable would disappear from both locals() and globals().
It appears that you can set the global and local namespaces for exec. However, the changes made after running exec wouldn't be preserved in place. (Correction: the globals and locals arguments must be dictionaries, or they are ignored; if they are dictionaries, they are updated in place.) Running globals() and locals() after exec would capture the change, but it would also capture the objects in the parsing program, and I wouldn't want the embedded code to inadvertently mess up the parsing program.
So my question is: how would I run many exec calls in a shared context, yet isolated enough that they won't have unexpected consequences? No need to think about security, as all the embedded code to be run is trusted.
I would like to get the individual output of each embedded code string, so I don't think joining them together and running them all at once would work.
You should be able to define your own shared globals that you pass to exec which is then modified by the embedded code:
def execute_embedded_code(embedded_code_string, shared_globals):
    exec(embedded_code_string, shared_globals)

shared_globals = dict()
shared_globals['result'] = 0
sample_string = 'result += 1'
execute_embedded_code(sample_string, shared_globals)
print(shared_globals['result'])
Output
1
Note
To address a comment below, the documentation for exec states
If only globals is provided, it must be a dictionary, which will be used for both the global and the local variables.
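Applied to the question's example, consecutive snippets see each other's names because they share one dictionary, and each snippet's output can be captured separately with io.StringIO, as the question already hints. A minimal sketch (the snippet strings are illustrative):
import contextlib
import io

def execute_embedded_code(embedded_code_string, shared_globals):
    """Run one snippet in the shared namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(embedded_code_string, shared_globals)
    return buffer.getvalue()

shared_globals = {}
for snippet in ["row_len = 1",
                "column_len = row_len * 2",
                "print(column_len)"]:
    output = execute_embedded_code(snippet, shared_globals)  # per-snippet output

print(shared_globals['column_len'])  # 2 -- state persisted between exec calls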

Save breakpoints to file

When debugging my Python code, I run a script through ipdb from the command line and set a number of breakpoints. Then I make some changes in one or more modules and rerun. However, if I simply use run, modules do not get reloaded. To make sure they do, I can exit and restart Python completely, but then I need to reset all breakpoints, which is tedious if I have many and have to do it over and over again.
Is there a way to save breakpoints to a file in (i)pdb, so that after small changes that do not change line numbers, I can dump my breakpoints, restart Python + pdb, and reload my breakpoints? The equivalent of Matlab's X = dbstatus, saving/loading X, and setting dbstop(X).
You can save the breakpoints to a .pdbrc file in the working directory, or globally in your home dir. The file should contain something like this:
# breakpoint 1
break /path/to/file:lineno
# breakpoint 2
break /path/to/file:lineno
You can define breakpoints in various ways, just like in interactive mode. So just break 4 or break method will work too.
This file works for both pdb and ipdb, since the latter has everything pdb has and more.
Bonus:
You could use an alias to save breakpoints more easily.
For example:
# append breakpoint to .pdbrc in current working directory
# usage: bs lineno
alias bs with open(".pdbrc", "a") as pdbrc: pdbrc.write("break " + __file__ + ":%1\n")
Put the above in your global .pdbrc and use it like this:
> bs 15
This will append a breakpoint statement for line 15 of the current file to the local .pdbrc file.
It is not a perfect solution, but close enough for me. Tune the command to your needs.
Read more about aliases in the pdb documentation.
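If you would rather dump every breakpoint that is currently set (closer to the Matlab dbstatus workflow in the question) than record them one by one, here is a sketch of an alias that rewrites the local .pdbrc from bdb's breakpoint registry. It assumes bdb.Breakpoint.bpbynumber holds the active breakpoints (its first entry is None) and that overwriting ./.pdbrc is acceptable:
# dump all currently set breakpoints to ./.pdbrc
# usage: bd
alias bd with open(".pdbrc", "w") as pdbrc: pdbrc.writelines("break %s:%d\n" % (bp.file, bp.line) for bp in __import__("bdb").Breakpoint.bpbynumber[1:] if bp is not None)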

How to access a variable in IPython from a program executed by %run

I have a program like this:
if __name__ == "__main__":
    foo = expensiveDataProcessClass(filepath)
    y = foo.doStuff()
    y = foo.doOtherStuff()
I'm testing things out as I build it in IPython with the %run myprogram command.
After it's running, since it takes forever, I'll break it with ctrl+C and go rewrite some stuff in the file.
Even after I break it, though, IPython has foo stored.
>type(foo)
__main__.expensiveDataProcessClass
I'm never having to edit anything in foo, so it would be cool if I could update my program to first check for the existence of this foo variable and just continue to use it in IPython rather than doing the whole creation process again.
You could first check for the variable's existence, and only assign to it if it doesn't exist. Example:
if __name__ == "__main__":
    if "foo" not in globals():
        foo = expensiveDataProcessClass(filepath)
However, this won't actually work (in the sense of saving a foo assignment). If you read IPython's doc on the %run magic, it clearly states that the executed program is run in its own namespace, and only after program execution are its globals loaded into IPython's interactive namespace. Every time you use %run, the program will never have foo defined from its own perspective.
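One way around this, consistent with the %run -i approach described earlier on this page, is to run the program in the interactive namespace so the guard actually sees the previously created object. A minimal sketch (expensiveDataProcessClass and filepath are taken from the question):
# myprogram.py -- run with: %run -i myprogram.py
if __name__ == "__main__":
    if "foo" not in globals():
        foo = expensiveDataProcessClass(filepath)  # built only on the first run
    y = foo.doStuff()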

Running a python debug session from a program, not from the console

I'm writing a little python IDE, and I want to add simple debugging. I don't need all the features of winpdb.
How do I launch a python program (by file name) with a breakpoint set at a line number so that it runs until that line number and halts?
Note that I don't want to do this from the command-line, and I don't want to edit the source (by inserting set_trace, for example). And I don't want it to stop at the first line so I have to run the debugger from there. I've tried all the obvious ways with pdb and bdb, but I must be missing something.
Pretty much the only viable way to do it (as far as I know) is to run Python as a subprocess from within your IDE. This avoids "pollution" from the current Python interpreter, which makes it fairly likely that the program will run in the same way as if you had started it independently. (If you have issues with this, check the subprocess environment.) In this manner, you can run a script in "debug mode" using
p = subprocess.Popen(args=[sys.executable, '-m', 'pdb', 'scriptname.py', 'arg1'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
This will start up Python at the debugger prompt. You'll need to run some debugger commands to set breakpoints, which you can do like so:
o,e = p.communicate('break scriptname.py:lineno')
If this works, o should be the normal output of the Python interpreter after it sets a breakpoint, and e should be empty. I'd suggest you play around with this and add some checks in your code to ensure whether the breakpoints were properly set.
After that, you can start the program running with
p.communicate('continue')
At this point you'd probably want to hook the input, output, and error streams up to the console that you're embedding in your IDE. You would probably need to do this with an event loop, roughly like so:
while p.returncode is None:
    o, e = p.communicate(console.read())
    console.write(o)
    console.write(e)
You should consider that snippet to be effectively pseudocode, since depending on how exactly your console works, it'll probably take some tinkering to get it right.
If this seems excessively messy, you can probably simplify the process a bit using the features of Python's pdb and bdb modules (I'm guessing "Python debugger" and "basic debugger" respectively). The best reference on how to do this is the source code of the pdb module itself. Basically, the way the responsibilities of the modules are split is that bdb handles "under the hood" debugger functionality, like setting breakpoints, or stopping and restarting execution; pdb is a wrapper around this that handles user interaction, i.e. reading commands and displaying output.
For your IDE-integrated debugger, it would make sense to adjust the behavior of the pdb module in two ways that I can think of:
have it automatically set breakpoints during initialization, without you having to explicitly send the textual commands to do so
make it take input from and send output to your IDE's console
Just these two changes should be easy to implement by subclassing pdb.Pdb. You can create a subclass whose initializer takes a list of breakpoints as an additional argument:
class MyPDB(pdb.Pdb):
    def __init__(self, breakpoints, completekey='tab',
                 stdin=None, stdout=None, skip=None):
        pdb.Pdb.__init__(self, completekey, stdin, stdout, skip)
        self._breakpoints = breakpoints
The logical place to actually set up the breakpoints is just after the debugger reads its .pdbrc file, which occurs in the pdb.Pdb.setup method. To perform the actual setup, use the set_break method inherited from bdb.Bdb:
def setInitialBreakpoints(self):
    _breakpoints = self._breakpoints
    self._breakpoints = None  # to avoid setting breaks twice
    for bp in _breakpoints:
        self.set_break(filename=bp.filename, lineno=bp.line,
                       temporary=bp.temporary, cond=bp.conditional,
                       funcname=bp.funcname)

def setup(self, f, t):
    pdb.Pdb.setup(self, f, t)
    self.setInitialBreakpoints()
This piece of code would work for each breakpoint being passed as e.g. a named tuple. You could also experiment with just constructing bdb.Breakpoint instances directly, but I'm not sure if that would work properly, since bdb.Bdb maintains its own information about breakpoints.
Next, you'll need to create a new main method for your module which runs it the same way pdb runs. To some extent, you can copy the main method from pdb (and the if __name__ == '__main__' statement of course), but you'll need to augment it with some way to pass in the information about your additional breakpoints. What I'd suggest is writing the breakpoints to a temporary file from your IDE, and passing the name of that file as a second argument:
tmpfilename = ...
# write breakpoint info
p = subprocess.Popen(args=[sys.executable, '-m', 'mypdb', tmpfilename, ...], ...)
# delete the temporary file
Then in mypdb.main(), you would add something like this:
def main():
    # code excerpted from pdb.main()
    ...
    del sys.argv[0]
    # add this
    bpfilename = sys.argv[0]
    with open(bpfilename) as f:
        # read breakpoint info
        breakpoints = ...
    del sys.argv[0]
    # back to excerpt from pdb.main()
    sys.path[0] = os.path.dirname(mainpyfile)
    pdb = Pdb(breakpoints)  # modified
Now you can use your new debugger module just like you would use pdb, except that you don't have to explicitly send break commands before the process starts. This has the advantage that you can directly hook the standard input and output of the Python subprocess to your console, if it allows you to do that.
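The answer leaves the on-disk format of the breakpoint file open. As one illustrative possibility (not part of the answer above), the IDE could write one filename:lineno pair per line, and mypdb could parse it into the named tuples consumed by setInitialBreakpoints:
import collections

# illustrative breakpoint record matching the attributes used by setInitialBreakpoints()
Breakpoint = collections.namedtuple(
    'Breakpoint', ['filename', 'line', 'temporary', 'conditional', 'funcname'])

def read_breakpoint_file(path):
    """Parse a file with one 'filename:lineno' entry per line into Breakpoint tuples."""
    breakpoints = []
    with open(path) as f:
        for raw in f:
            raw = raw.strip()
            if not raw:
                continue
            filename, _, lineno = raw.rpartition(':')  # split at the last colon
            breakpoints.append(Breakpoint(filename=filename, line=int(lineno),
                                          temporary=False, conditional=None,
                                          funcname=None))
    return breakpoints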
