Run multiple cells in Colab at the same time - python

I have been searched the answer for a while. The answers include using Ipyparallel, but there are no clear instructions on how can I do to apply to two cells in general. Most of the examples are just compute some value with some functions distributively. They are at least not clear for me to understand how to solve the problem. I wish someone could provide some code or instructions on how can I do for running two independent cells in general on Colab. Or, if there are other methods, it is also fine as long as it works. Thank you.

It's not exactly straightforward, because Python is synchronous by default. Maybe encapsulating your cells in functions and using asyncio to execute two functions asynchronously will do the job. Or, if it includes something like heavy processing, threading or multiprocessing modules may be the way to go.

Related

Python start GUI and backend in parallel

So I am trying to run a gui to access certain files and a program to create those files in parallel. So once they are created, they wont be touched by the latter anymore.
My question is, how do I best implement this? Do I want multiprocessing? If so, how do I multiprocess just two functions?
I looked into it and the multiprocessing module seems to be what I want, but I havent quite understood how I can let it run two pools or how I run two specific functions with that. It just seems to take in a function and split it up arbitrarily.
My idea would be to just call two bat files in parallel which start their python process. But thats hardly an elegant solution.
So in short, how do I start two specific seperate functions with multiprocessing? Or is there any other more elegant solution, like os.fork or smth like that, that works as I want?

Debugging: how to view behind the scenes calls? - Python, PyQt

This seems like a somewhat simple question that I'm having a lot of trouble finding an answer for, perhaps I haven't found the words programmers use to talk about this.
I am a relatively inexperienced programmer, and have run into some difficulties with troubleshooting an application I have made. The question I have could be general, because I haven't found an answer to this for any language, but for me I'm specifically using python and PyQt4:
Is there any way to view the behind-the-scenes calls made when I execute a method/call a command? I can use debugging software (in my case winpdb) to track which lines in my code are called, but, aside from traceback info given after an error, haven't figured out how to follow the program's steps "behind the scenes". This would be helpful for cases where there hasn't been a programming error from the compiler's point of view, but the behavior is unexpected because the programmer doesn't have a complete understanding of a module's inner workings.
To put it another way: I want to know about the code that I didn't write. I want to know what is triggered after a line of my code is called. If I call a method like len(), python calls its own methods that will eventually return an integer to me. I'm hoping there's a way we can see what python does in between my lines of code.
If this has been asked before, please let me know, and accept my apologies for repeating the question, but I haven't been able to find an answer to that question, or at least the best way of asking it. Thank you very much for your help.
I'm not sure what your requirements are, but using cProfile with Gprof2Dot will generate a png of your code's call graphs and the amount of time spent in each function/method.
Just:
grab gprof2dot.py
Use the given sample code

Leave some code running while executing more

I am starting a loop in python, and after I start the loop I want the execution of code after the loop to carry on, while the loop keeps looping (in the 'background').
x=True
while x:
#do some output
sleep(1)
#ask for input or something
if input()=='something':
x=False
So in that example, #do some output will keep happening while input is asked for.
Is this possible in python? Are there any work-arounds that could achieve this?
From what I understand, what you want is to create a Thread that keeps executing some tasks in the background.
http://docs.python.org/library/threading.html
Threading is a moderately complex issue, so giving you a recipe here wouldn't amount to much. I suggest you take a look at the documentation to understand what's involved.
Update your question or create a new one if you then have troubles with something specific to the implementation.
EDIT:
I agree with the comment made by Tim Hoffman, the correct solution depends on what you're trying to achieve. From my understanding, threading should work for you, but if you give us more details, it might be easier to give a more precise answer.
Following OP's wish I'm posting the answer. :) There are multiple libraries for achieving what you are trying to do. Most known include:
threading module
multiprocessing module
There are subtle differences between them, but I guess that you shouldn't worry about it at the moment. I encourage you though to read more about mutli-processing and threading in general. Also "green" threads is an interesting alternative.

Is there a statistical profiler for python? If not, how could I go about writing one?

I would need to run a python script for some random amount of time, pause it, get a stack traceback, and unpause it. I've googled around for a way to do this, but I see no obvious solution.
There's the statprof module
pip install statprof (or easy_install statprof), then to use:
import statprof
statprof.start()
try:
my_questionable_function()
finally:
statprof.stop()
statprof.display()
There's a bit of background on the module from this blog post:
Why would this matter, though? Python already has two built-in profilers: lsprof and the long-deprecated hotshot. The trouble with lsprof is that it only tracks function calls. If you have a few hot loops within a function, lsprof is nearly worthless for figuring out which ones are actually important.
A few days ago, I found myself in exactly the situation in which lsprof fails: it was telling me that I had a hot function, but the function was unfamiliar to me, and long enough that it wasn’t immediately obvious where the problem was.
After a bit of begging on Twitter and Google+, someone pointed me at statprof. But there was a problem: although it was doing statistical sampling (yay!), it was only tracking the first line of a function when sampling (wtf!?). So I fixed that, spiffed up the documentation, and now it’s both usable and not misleading. Here’s an example of its output, locating the offending line in that hot function more accurately:
% cumulative self
time seconds seconds name
68.75 0.14 0.14 scmutil.py:546:revrange
6.25 0.01 0.01 cmdutil.py:1006:walkchangerevs
6.25 0.01 0.01 revlog.py:241:__init__
[...blah blah blah...]
0.00 0.01 0.00 util.py:237:__get__
---
Sample count: 16
Total time: 0.200000 seconds
I have uploaded statprof to the Python package index, so it’s almost trivial to install: "easy_install statprof" and you’re up and running.
Since the code is up on github, please feel welcome to contribute bug reports and improvements. Enjoy!
I can think of a couple of few ways to do this:
Rather than trying to get a stack trace while the program is running, just fire an interrupt at it, and parse the output. You could do this with a shell script or with another python script that invokes your app as a subprocess. The basic idea is explained and rather thoroughly defended in this answer to a C++-specific question.
Actually, rather than having to parse the output, you could register a postmortem routine (using sys.excepthook) that logs the stack trace. Unfortunately, Python doesn't have any way to continue from the point at which an exception occurred, so you can't resume execution after logging.
In order to actually get a stack trace from a running program, you will may have to hack the implementation. So if you really want to do that, it may be worth your time to check out pypy, a Python implementation written mostly in Python. I've no idea how convenient it would be to do this in pypy. I'm guessing that it wouldn't be particularly convenient, since it would involve introducing a hook into basically every instruction, which would I think be prohibitively inefficient. Also, I don't think there will be much advantage over the first option, unless it takes a very long time to reach the state where you want to start doing stack traces.
There exists a set of macros for the gdb debugger intended to facilitate debugging Python itself. gdb can attach to an external process (in this case the instance of python which is executing your application) and do, well, pretty much anything with it. It seems that the macro pystack will get you a backtrace of the Python stack at the current point of execution. I think it would be pretty easy to automate this procedure, since you can (at worst) just feed text into gdb using expect or whatever.
Python already contains everything you need to do what you described, no need to hack the interpreter.
You just have to use the traceback module in conjunction with the sys._current_frames() function. All you need is a way to dump the tracebacks you need at the frequency you want, for example using UNIX signals, or another thread.
To jump-start your code, you can do exactly what is done in this commit:
Copy the threads.py module from that commit, or at least the stack trace dumping function (ZPL license, very liberal):
Hook it up to a signal handler, say, SIGUSR1
Then you just need to run your code and "kill" it with SIGUSR1 as frequently as you need.
For the case where a single function of a single thread is "sampled" from time to time with the same technique, using another thread for timing, I suggest dissecting the code of Products.LongRequestLogger and its tests (developed by yours truly, while under the employ of Nexedi):
Whether or not this is proper "statistical" profiling, the answer by Mike Dunlavey referenced by intuited makes a compelling argument that this is a very powerful "performance debugging" technique, and I have personal experience that it really helps zoom in quickly on the real causes of performance issues.
To implement an external statistical profiler for Python, you're going to need some general debugging tools that let you interrogate another process, as well as some Python specific tools to get a hold of the interpreter state.
That's not an easy problem in general, but you may want to try starting with GDB 7 and the associated CPython analysis tools.
Seven years after the question was asked, there are now several good statistical profilers available for Python. In addition to vmprof, already mentioned by Dmitry Trofimov in this answer, there are also vprof and pyflame. All of them support flame graphs one way or another, giving you a nice overview of where the time was spent.
Austin is a frame stack sampler for CPython that can be used to make statistical profilers for Python that require no instrumentation and introduce minimal overhead. The simplest thing to do is to pipe the output of Austin with FlameGraph. However, you can just grab Austin's output with a custom application to make your very own profiler that is targeted at precisely your needs.
This is a screenshot of Austin TUI, a terminal application that provides a top-like view of everything that is happening inside a running Python application.
This is Web Austin, a web application that shows you a live flame graph of the collected samples. You can configure the address where to serve the application which then allows you to do remote profiling.
There is a cross-platform sampling(statistical) Python profiler written in C called vmprof-python.
Developed by the members of PyPy team, it supports PyPy as well as CPython.
It works on Linux, Mac OSX, and Windows. It is written in C, thus has a very small overhead.
It profiles Python code as well as native calls made from Python code.
Also, it has a very useful option to collect statistics about execution lines inside functions in addition to function names.
It can also profile memory usage (by tracing the heap size).
It can be called from the Python code via API or from the console.
There is a Web UI to view the profile dumps: vmprof.com, which is also open sourced.
Also, some Python IDEs (for example PyCharm) have integration with it, allowing to run the profiler and see the results in the editor.
For Python there is py-spy to dump the stacktraces. The dumps can get analyzed by speedscope
Source: Guidelines

How can I go about securely executing a subset of python?

I need to store source code for a basic function in a database and allow it to be modified through an admin interface. This code will take several numbers and strings as parameters, and return a number or None. I know that eval is evil, so I need to implement a safe way to execute a very basic subset of python, or something syntactically similar at least, from within a python based web-app.
The obvious answer is to implement a DSL (Domain Specific Language), however, I have no experience with that, nor do I have any idea where to begin, and a lot of the resources available seem to go a little over my head. I'm hoping that maybe there is something already out there which will allow me to essentially generate a secure python-callable function from a string in a database. the language really only needs to support assignment, basic math, if/else, and case insensitive string comparisons. any other features are a bonus, but I think most things can be done with just that, no need for complex data structures, classes, functions, etc.
If no such thing currently exists, I'm willing to look into the possibility of creating one, but as I said, I have no idea how to go about that, and any advice in that regard would be appreciated as well.
Restricted Python environments are hard to make really safe.
Maybe something like lua is a better fit for you
PySandbox might help. I haven't tested it, just found it linked elsewhere.
You could use Pyparsing to implement your DSL, provided the expressions involved won't be too complex (you don't give full details on that but you imply the requirements are pretty simple). See the examples page including specifically fourFn.py or simpleCalc.py.
You could implement a subset of Python by using the ast module to parse Python code into an abstract syntax tree then walk the tree checking that it only uses the subset of Python that you allow. This will only work in Python 2.x since Python 3 has removed the ast module.
However even using this method it will be hard to create something that is 100% secure, since even the most innocuous code could allow the user to write something that could blow up your application, e.g. by allocating more memory than you have available or putting the program into an infinite loop using all the CPU.

Categories