Running python process on EC2 from local ipython notebook - python

I'm playing around with some python deep learning packages (Theano/Lasagne/Keras). I've been running it on CPU on my laptop, which takes a very long time to train the models.
For a while I was also using Amazon GPU instances, with an iPython notebook server running, which obviously ran much faster for full runs, but was pretty expensive to use for prototyping.
Is there any way to set things up that would let me prototype in iPython on my local machine, and then when I have a large model to train spin up a GPU instance, do all processing/training on that, then shut down the instance.
Is a setup like this possible, or does anyone have any suggestions to combine the convenience of the local machine with temporary processing on AWS?
My thoughts so far were along the lines of
Prototype on local ipython notebook
Set up cell to run a long process from start to
finish.
Use boto to start up an ec2 instance ssh into the instance
using boto's sshclient_from_instance
ssh_client = sshclient_from_instance(instance,
key_path='<path to SSH keyfile>',
user_name='ec2-user')
Get the contents of the cell I've set up using the script using the solution here, say the script is in cell 13 Execute that script using
ssh_client.run('python -c "'+ _i13 + '"' )
Shut down instance using boto
This just seems a bit convoluted, is there a proper way to do this?

So when it comes to EC2 you don't have to shut down the instance every time. The beauty of AWS is that you stop and start your instance when you use it, and only pay for the time you have it up and running. Also you can always try your code on a smaller and cheaper instance, and if its too slow for your liking then you just scale up to a larger instance.

Related

How to start asyncio server on remote server with Python?

I have a virtual server available which runs Linux, having 8 cores. 32 GB RAM, and 1 TB in addition. It should be a development environment. (same for test and prod) This is what I could get from IT. Server can only be accessed via so-called jump servers by putty or direct tcp/ip ports (ssh is a must).
The application I am working on starts several processes via multiprocessing. In every process an asyncio event loop is started, and an asyncio socket server in some cases. Basically it is a low level data streaming and processing application (unfortunately no kafka or similar technology available yet). The live application runs forever, no or limited interaction with the user (reads/processes/writes data).
I assume, IPython is an option for this, but - and maybe I am wrong - I think it starts new kernels per client request, but I need to start new process from the main code w/o user interaction. If so, this can be an option for monitoring the application, gathering data from it, sending new user commands to the main module, but not sure how to run processes and asyncio servers remotely.
I would like to understand how these can be done on the given environment. I do not know where to start, what alternatives there are. And I do not understand ipython properly, their page is not obviuos to me yet.
Please help me out! Thank you in advance!
After lots of research and learning I came across to a possible solution in our "sandbox" environment. First, I had to split the problem into several sub-problems:
"remote" development
paralllization
scheduling and executing parallel codes
data sharing between these "engines"
controlling these "engines"
Let's see in details:
Remote development means you want write your code on your laptop, but the code must be executed on a remote server. Easy answer is Jupyter Notebook (or equivalent solution) although it has several trade-offs, also other solutions are available, but this was faster to deploy and use and had the least dependency, maintenance, etc.
parallelization: had several challenges with iPython kernel when working with multiprocessing, so every code that must run parallel will be written in separated Jupyter Notebook. In a single code I can still use eventloop to get async behaviour
executing parallel codes: there are several options I will use :
iPyParallel - "workaround" for multiprocessing
papermill - execute JNs with parameters from command line (optional)
using %%writefile magic command in Jupyter Notebook - create importables
os task scheduler like cron.
async with eventloops
No option yet: docker, multiprocessing, multithreading, cloud (aws, azure, google...)
data sharing: selected ZeroMQ, took time to learn but was simpler and easier than writing everything on pure sockets. There are alternatives but come with extra dependency, and some very useful benefit (will check them later): RabbitMQ, Redis message broker, etc. The reasons for preferring ZMQ: fast, simple, elegant, and just a library. (Knonw risk: our IT will prefer RabbitMQ, but that problem comes later :-) )
controlling the engines: now this answer is obvious: separate python code (can be tested as JN code but easy to turn into pure .py and schedule it). This one can communicate with the other modules via ZMQ sockets: healthcheck, sending new parameters, commands, etc....

Can the SSH client affect how long it takes to run some code?

I'm running a CNN algorithm on an interpreter which is running on a remote machine. The code is pretty simple and resembles this: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
But I added a TensorBoard writer to draw the training and validation loss as well as 3 images per epoch.
I used both PyCharm and MobaXterm to access the remote server (an Amazon AWS EC2 instance). But I've noticed significant performance differences whether I run the code from PyCharm or from MobaXterm with a command line.
The results averaged on 10 iterations.
Pycharm: Each iteration takes 55s.
MobaXterm: Each iteration takes 95s.
I've made sure that both are using the same conda environment and the same Python to run (Python 3.7).
Maybe I'm missing a key comprehension of MobaXterm and working with a remote interpreter?

Benchmarking a Python script on a remote machine with a distributed access

I want to benchmark execution time of some Python script that involves some heavy computations. Although I can run it in acceptable time against a few datasets on my local machine, the waiting time becomes unacceptable when I test it against thousands of datasets, which is something I would like to do, so I have to resort to using a remote computation. The problem is that the remote machine is not used just by me, but by a few people, and their code often drains computation power from CPUs that I use, making benchmark w.r.t. time barely meaningful.
At the moment, I just run Python script via executing bash script like this:
python myscript.py --dataset dataset1
I can't ask the remote server owner to grant me full access to CPUs, which is a perfect case scenario, of course. I would like to do something like this: check if current CPU is used by anything else at the moment, if it is, then freeze the process and wait until it is free again, then resume the process. Is there a way to accomplish this or are there alternatives up to this task out there?

what are the good ways to deploy and manage python script on production server?

I've written a lot of python scripts. Now I want to run it on another computer which running non-stop to crawling, analyzing data and update to an sql database.
Normally I open a command prompt and run the scripts:
python [script directory]
But with many scripts I have to open many cmd and every script call an python interpreter, so It end up with huge mess using a lot of memory.
What should I do to manage these scripts.
You haven't specified what OS your server is, but assuming that it's a Linux server you should probably research a process management tool such as Supervisord or Systemd. These are tools designed to run and monitor your program automatically, and even restart it if it crashes.
If you're using Ubuntu 16.04 then it comes with Systemd out of the box, however I personally find Supervisord easier to configure and use for simple tasks.
These programs won't necessarily help with your memory consumption issues however. Sure you can place caps on memory use for a process, but that's not really going to help you if it stops your program from working. You're probably best to re-evaluate your code and look for ways to reduce its memory footprint or use a server with more ram.
EDIT:
You've just added that the OS is Windows 10, which makes the above irrelevant. You can use the Windows Task Scheduler to automatically execute long running tasks.
you can use pythonw *.py , and it will run in background.

Error: Failed to Write, Broken Pipe when running python script on AWS EC2 instance

I'm new to cloud computing in general, and I've started a free trial with Amazon's Web Services, in hopes of using their EC2 servers to run some code for Kaggle competitions. I'm currently working on running a test Python script for doing some image processing and testing a linear classifier (I don't suspect these details are relevant to my problem, but wanted to provide some context).
Here are the steps I've gone through to run my script on an EC2 instance:
Log in to AWS, and start EC2 instance where I've installed relevant Python modules for my tasks (e.g. Anaconda distribution). As a sidenote, all my data and the script I want to run are in the same directory on this server instance.
SSH to my EC2 instance from my laptop, and cd to the directory with my script.
Run screen to run program in background.
Run script via python program.py and detach from screen session (ctrl + A, D)
Keep EC2 instance running, but exit from SSH session connecting my laptop to the server.
I've followed these steps a number of times, which result in either (a) "Broken Pipe" errors, or (b) in an error where the connection appears to "hang". In the case of (b), I've attempted to disconnect from the SSH session and reconnect to the server, however I am unable to do so due to an error stating "connection has been reset by peer".
I'm not sure if I need to configure something differently on the EC2 instance, or if I need to specify different options when connecting to the server via SSH. Any help here would be appreciated. Thanks for reading.
EDIT: I've been successful in running some example scripts using scikit-learn by setting up an iPython notebook, launching it with nohup, and running the code in a notebook cell. However, when trying to do the same with my Kaggle competition code, the same "hanging" issue happens, and the connection appears to be dropped, causing the code to stop running. The image dataset I'm running the code on in the second case is quite a bit larger than the dataset processed by the example code in the first case. Not sure if dataset size along is causing the issue, or how to solve this.

Categories