AWS deployment goes down frequently - python

So, I've got a very basic deployment on an EC2 instance that largely works, except for a couple of large issues. Right now I'm just ssh'ing into the box and running
python -m SimpleHTTPServer 80
and I have the box in a security group that allows HTTP requests in on port 80.
This seems to work, but if I leave it alone for a while (1-2 hours usually) my Elastic IP starts returning 404s. I really need this server to stay up for demos to third parties. Any ideas on how to make sure it stays up?
Additionally, it goes down when I close the terminal that's ssh'd into my box, which is extremely non-ideal, as I would like this demo to stay up even if my computer is off. That's a less urgent matter, but any advice on that would also be appreciated.

Use screen!
Here's a quick tutorial: http://www.nixtutor.com/linux/introduction-to-gnu-screen/
Essentially just ssh in, start a new screen session via screen, start the server via python -m SimpleHTTPServer 80, then detach from the session (Ctrl-A, then D). After that you can close your terminal and the server will stay up.

I was able to solve this a little hackishly by putting together a cron job to run a bash script that spun up a server, but I'm not sure if this is the best way. It seems to have solved my problems in the short term though. For reference, this is the code I used:
import SimpleHTTPServer
import SocketServer

PORT = 80

# serve the current working directory over HTTP on port 80
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("", PORT), Handler)
httpd.serve_forever()
Which I wrapped in a simple bash script:
#!/bin/bash
cd relevant/directory
sudo -u ubuntu python simple_server.py
I'm sure there was a better permission setup to use, but after that I just ran
chmod -R 777 bash_script.sh
To make sure I wouldn't run into any issues on that front.
And then placed it in a cron job to run every minute (the more the merrier, right?)
crontab -e (Just to bring up the relevant file)
Added in this line:
*/1 * * * * path/to/bash_script.sh
And it seems to be working. I closed out my ssh'd terminal and everything still runs, and nothing has gone down yet. I will update if something does, but I'm generally happy with this solution (not that I will be in two weeks once I learn more about the subject). It seems very minimal and low-level, which means I at least understand what I just did.
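For what it's worth, here's a minimal sketch of the same script with a small guard, so that the every-minute cron run exits quietly when a server is already bound to the port instead of erroring out (the file name is just illustrative, not the exact script above):
# simple_server_guarded.py -- a sketch, not the exact script from above
import socket
import SimpleHTTPServer
import SocketServer

PORT = 80
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler

try:
    httpd = SocketServer.TCPServer(("", PORT), Handler)
except socket.error:
    # a previous cron run is already serving on this port; nothing to do
    raise SystemExit(0)

httpd.serve_forever()
That keeps the cron job idempotent: at most one server instance is running at any time.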

SimpleHTTPServer just serves static pages (here on port 80) and is mainly meant for use during development.
For production usage (if you want to use EC2) I recommend you read up on Apache or nginx. Basically you want a web server that runs on Linux.
If you think your site will remain static files (HTML, CSS, JS) I recommend you host them on Amazon S3 instead. S3 is cheaper and way more reliable. Take a look at this answer for instructions: Static hosting on Amazon S3 - DNS Configuration
Enjoy!

Related

AWS Fargate Docker - How to print and see stdout/stderr from a headless ubuntu docker?

This may be a sort of 101 question, but in setting this up for the first time there are no hints about such a fundamental and common task. Basically I have a headless ubuntu running as a docker image inside AWS, which gets built via github actions CI/CD. All is running well.
Inside ubuntu I have some python scripts, let's say a custom server, cron jobs, some software running etc. How can I know, remotely, if there were any errors logged by any of these? Let's keep it simple: How can I print an error message, from a python server inside ubuntu, that I can read from outside docker? Does AWS have any kind of web interface for viewing stdout/stderr logs? Or at least an ssh console? Any examples somewhere?
Furthermore, I've set up my docker with healthchecks to confirm that my servers running inside ubuntu are online and serving. Those work: I can test them on localhost by running docker ps, which shows Status 'healthy'. How do I see this same thing when live in AWS?
Have I really missed something this big? It feels like this should be the first thing flashing on the main page of setting up a docker on AWS.
There are a few things to unpack here, things you only learn after digging through a lot of stuff you don't need, just so you can figure out how to get started.
By default Docker logs the output of the startup processes you've described in your dockerfile setup, e.g. when you do ENTRYPOINT bash -c /home/ubuntu/my_dockerfile_sh_scripts/myStartupScripts.sh. If any subprocesses spawned by those processes also log to stdout/stderr, the messages should bubble up to the host process, and therefore be shown in the docker log. If they don't bubble up, look up subprocess stdout/stderr in linux.
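To make a Python process inside the container cooperate with this, a minimal sketch (assuming nothing about your actual scripts; the logger name is just an example) is to log to stdout instead of a file, so the messages end up in docker logs and, with the awslogs driver, in CloudWatch:
# sketch: send application logs to stdout so the container runtime captures them
import logging
import sys

logging.basicConfig(
    stream=sys.stdout,          # write to stdout instead of a log file
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

log = logging.getLogger("my_server")   # hypothetical logger name
log.info("server started")             # shows up in `docker logs` / CloudWatch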
Ok, we know that, but where the heck is AWS's stats and logs page? Well, in Amazon CloudWatch™ of course. Didn't you already know about that term? Why, it says so right there when you create a docker, or on your ECS console next to your docker Clusters, or next to your running docker image Service. OH WAIT! No, no it does not! There is no mention of "CloudWatch" anywhere. Well, there is this one page that has "CloudWatch" on it, which you can get to if you know the URL, but hey, look at that, you don't actually see any sort of logs coming from your code in docker anywhere on there, so... yeah. So where do you see your actual logs and output? There is this Logs tab on your Service's page (the page of the currently running docker image): https://eu-central-1.console.aws.amazon.com/ecs/home?region=eu-central-1#/clusters/your-cluster-name/services/your-cluster-docker-image-service/logs. This generically named and undescribed tab does not show some AWS-side status log for the service; it actually shows you the docker logs I mentioned in point 1. Ok. How do I view this as a raw file, or access it remotely via script? Well, I don't know. I guess you'll find out about that basic, common task after reading a couple of manuals about setting up the AWS CLI (another thing you didn't know existed).
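For the "access this remotely via script" part, one option besides the AWS CLI is boto3. A rough sketch, assuming your task definition uses the awslogs driver and you know the log group name (the names below are placeholders, not from the question):
# sketch: read ECS container logs from CloudWatch Logs with boto3
import boto3

logs = boto3.client("logs", region_name="eu-central-1")  # adjust region

# placeholder names -- use whatever your awslogs configuration points at
response = logs.filter_log_events(
    logGroupName="my-log-group",
    filterPattern="ERROR",   # optional: only lines containing ERROR
    limit=50,
)

for event in response["events"]:
    print(event["timestamp"], event["message"])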
Like I said in point 1, docker cannot log generic operating system log messages, or show you log files generated by your server, or by other software or jobs that are running which weren't described and started by your dockerfile/config. So how do we get AWS to see those? Well, it's a bit of a pain in the ass: you have to either replace your docker image's default OS (e.g. ubuntu) logging setup with sudo yum install -y awslogs and configure that, or you can create symbolic links between specific log files and the stdout/stderr streams (the docker docs mention this). Also check Mark B's answer below. But probably the easiest thing is to write your own little scripts with short messages that print out to the main process what the status of things is, as sketched below. Usually that's all you need unless you're an enterprise.
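A sketch of what such a "little script" can look like: a helper or cron job inside the container that writes short status lines straight to the main container process's stdout (the same /proc/1/fd/1 trick used in the answers below), so they show up in the docker log. The message is hypothetical:
# sketch: report status from a helper script/cron job to the container's main stdout
import datetime

def report(message):
    # /proc/1/fd/1 is PID 1's stdout inside the container, i.e. what `docker logs` shows
    with open("/proc/1/fd/1", "w") as main_stdout:
        main_stdout.write("[status %s] %s\n" % (datetime.datetime.utcnow(), message))

if __name__ == "__main__":
    report("nightly backup finished OK")  # hypothetical message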
Is there any ssh or otherwise an AWS online command line interface page into the running docker, like you get in your localhost docker desktop? So you could maybe cd and ls browse or search for files and see if everything's fine? No. Make your own. Or better yet, avoid needing that in the first place, even though it's inconvenient for R&D.
Healthchecks. Where the heck do I see my docker healthchecks? The equivalent of the localhost method of just running the docker ps command. Well, by default there aren't any healthchecks shown anywhere on AWS. Why would you need healthchecks anyway? So what if your dockerfile has HEALTHCHECKs defined?..🙂 You have to set that up in Fargate™ (..whatever fargate even means, cause the name's not written anywhere ("UX")). You have to create what is called a new Task Definition Revision. Go to your Clusters in Amazon ECS. Go to your cluster. Then click on your Service's entry in the Task Definition column of the services table at the bottom. Click on Create New Revision (new task definition revision). On the new page, click on your container in the Container Definitions table. On the new page, scroll down to HEALTHCHECK, bingo! Now what is this? What commands do I paste in here? It's not automatically taking the HEALTHCHECK that I defined in my dockerfile, so does that mean I must write something else here?? What environment are the healthchecks even run in? Is it my docker? Is it linux? Here's the answer: you paste in this box what you already wrote in your dockerfile's HEALTHCHECK. Just use http://127.0.0.1 (localhost) as you would in your local docker desktop testing environment. Now click Update. Click Create. K, now we're still not done. Go back to your Amazon ECS / Clusters / cluster. Click on your service name in the services table. Click Update. Select the latest Revision. Check "force new deployment". Then keep clicking Next until finally you click Update Service. You can also define what triggers your image to be shut down on healthcheck failure, for example if it ran out of RAM. Now #Amazon, I hope you take this answer and staple it to your shitty ass ECS experience.
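If you'd rather see those health states from a script than click through the console, here's a rough boto3 sketch; the cluster name is a placeholder, and it assumes the task definition actually has a health check configured as described above:
# sketch: list running tasks in a cluster and print their reported health status
import boto3

ecs = boto3.client("ecs", region_name="eu-central-1")  # adjust region

task_arns = ecs.list_tasks(cluster="your-cluster-name")["taskArns"]
if task_arns:
    tasks = ecs.describe_tasks(cluster="your-cluster-name", tasks=task_arns)["tasks"]
    for task in tasks:
        # healthStatus is HEALTHY / UNHEALTHY / UNKNOWN, aggregated from container health checks
        print(task["taskArn"], task.get("lastStatus"), task.get("healthStatus"))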
I swear the relentlessly, exclusively bottom-up UX of platforms like AWS and Azure is what keeps the tutorial blogger industry alive... How would I know what AWS CloudWatch is, or that it even exists? There are no hints about these things anywhere while you set up. You'd think the first thing that flashes on your screen after you complete a docker setup would be "hey, 99.9% of people right now need to set up logging. You should use CloudWatch. And here's how you connect healthchecks to CloudWatch". But no, of course not..! 🙃
Instead, AWS's "engineer" approach here seems to be: here's a grid of holes in the wall, and here's a mess of wires next to it in a bucket. Now in order to do the common frequently done tasks you want to do, you must first read the manual for each hole, and the manual for each wire in the bucket, then find all of the holes and wires you need, and plug them in the right order (and for the right order you need to find a blog post because that always involves some level of not following the docs and definitely also magic).
I guess it's called "job security" for if you're an enterprise server engineer :)
I faced the same issue. I found the AWS wiki, but the /dev/stdout symbolic link didn't work for me; the /proc/1/fd/1 symbolic link did.
Here is the solution:
Step 1. Add these commands to your Dockerfile.
# forward logs to docker log collector
RUN ln -sf /proc/1/fd/1 /var/log/console.log \
&& ln -sf /proc/1/fd/2 /var/log/error.log
Step 2. Refer to Mark B's Step 2 below.
Step 1. Update your docker image by deleting all the log files you care about and replacing them with symbolic links to stdout or stderr. For example, to capture logs in an nginx container I might do the following in the Dockerfile:
RUN ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log
Step 2. Configure the awslogs driver in the ECS Task Definition, like so:
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "my-log-group",
"awslogs-region": "my-aws-region",
"awslogs-stream-prefix": "my-log-prefix"
}
}
And as long as you gave the ECS Execution Role permission to write to AWS Logs, log data will start appearing in CloudWatch Logs.

mod_wsgi: Reload Code via Inotify - not every N seconds

Up to now I followed this advice to reload the code:
https://code.google.com/archive/p/modwsgi/wikis/ReloadingSourceCode.wiki
This has the drawback that code changes only get detected every N seconds. I could use N=0.1, but this results in useless disk IO.
AFAIK the inotify callback of the linux kernel is available via python.
Is there a faster way to detect code changes and restart the wsgi handler?
We use daemon mode on linux.
Why code reload for mod_wsgi at all
There is interest in why I want this at all. Here is my setup:
Most people use "manage.py runserver" for development and some other wsgi deployment for production.
In my context we have automated the creation of new systems and prod and development systems are mostly identical.
One operating system (linux) can host N systems (virtual environments).
Developers can use runserver or mod_wsgi. Using runserver has the benefit that it's easy for debugging, mod_wsgi has the benefit that you don't need to start the server first.
mod_wsgi has the benefit that you know the URL: https://dev-server/system-name/myurl/
With runserver you don't know the port. Use case: you want to link from an internal wiki to a dev-system ....
A dirty hack to get code reload for mod_wsgi, which we used in the past: maximum-requests=1, but this is slow.
Preliminaries.
Developers can use runserver or mod_wsgi. Using runserver has the
benefit that it's easy for debugging, mod_wsgi has the benefit that
you don't need to start the server first.
But you do: the server needs to be set up first, and that takes a lot of effort. And the server needs to be started here as well, though you can configure it to start automatically at boot.
If you are running on port 80 or 443, which is usually the case, the server can be started only by root. If it needs to be restarted you will have to ask for the super user's help again. So ./manage.py runserver scores heavily here.
mod_wsgi has the benefit, that you know the URL:
https://dev-server/system-name/myurl/
Which is no different from the dev server. By default it starts on port 8000, so you can access it as http://dev-server:8000/system-name/myurl/. If you want to use SSL with the development server you can use a package such as django-sslserver, or you can put nginx in front of the django development server.
With runserver you don't know the port. Use case: You want to link from an internal wiki to a dev-system ....
With runserver, the port is well defined as mentioned above. And you can make it listen on a different port, for example with:
./manage.py runserver 0.0.0.0:9090
Note that if you put the development server behind apache (as a reverse proxy) or nginx, the restarting problems etc. that I mentioned above do not apply.
So in short, for development work, whatever you do with mod_wsgi can be done with the django development server (aka ./manage.py runserver).
Inotify
Here we are getting to the main topic at last. Assuming you have installed inotify-tools you could type this into your shell. You don't need to write a script.
while inotifywait -r -e modify .; do sudo kill -2 yourpid ; done
This will result in the code being reloaded when ...
... using daemon mode with a single process you can send a SIGINT
signal to the daemon process using the ‘kill’ command, or have the
application send the signal to itself when a specific URL is
triggered.
ref: http://modwsgi.readthedocs.io/en/develop/user-guides/frequently-asked-questions.html#application-reloading
alternatively
while inotifywait -r -e modify .; do touch wsgi.py ; done
when
... using daemon mode, with any number of processes, and the process
reload mechanism of mod_wsgi 2.0 has been enabled, then all you need
to do is touch the WSGI script file, thereby updating its modification
time, and the daemon processes will automatically shutdown and restart
the next time they receive a request.
In both situations we are using the -r flag to tell inotify to monitor subdirectories. That means each time you save a .css or .js file, apache will reload. But without the -r flag, changes to python code in subfolders will go undetected. To have the best of both worlds, exclude css, js, images etc. with the --exclude option.
What about when your IDE saves an auto-backup file, or vim saves a .swp file? That too will cause a code reload, so you would have to exclude those file types as well.
So in short, it's a lot of hard work to reproduce what the django development server does free of charge.
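Since the question specifically asks for the inotify callback from Python: here is a sketch using the watchdog package (which uses inotify on Linux) to touch the WSGI script file whenever a .py file changes, mirroring the inotifywait one-liners above. The paths and the .py-only filter are assumptions, not from the question:
# sketch: touch the WSGI script on .py changes so mod_wsgi (daemon mode) reloads
import os
import time

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

WSGI_SCRIPT = "/path/to/yourproject/wsgi.py"   # placeholder
WATCHED_DIR = "/path/to/yourproject"           # placeholder

class TouchWsgiOnChange(FileSystemEventHandler):
    def on_modified(self, event):
        # ignore directories and anything that is not python source
        if event.is_directory or not event.src_path.endswith(".py"):
            return
        os.utime(WSGI_SCRIPT, None)   # same effect as `touch wsgi.py`

observer = Observer()
observer.schedule(TouchWsgiOnChange(), WATCHED_DIR, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
This avoids the polling cost of checking every N seconds, since the watcher only wakes up when the kernel reports a change.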
You can use inotify-hookable to run any command you want in response to an inotify event (here's my source link: http://terokarvinen.com/2016/when-files-change-take-action-inotify-hookable).
Once a change is detected you can just reload the code in apache.
For your specific problem, it should be something like:
inotify-hookable --watch-directories sources/ --recursive --on-modify-command './code_reload.sh'
In the previous link, the command to execute was just a simple touch flask/init.wsgi
So the whole command (with ignored files added) was:
inotify-hookable --watch-directories flask/ --recursive --ignore-paths='flask/init.wsgi' --on-modify-command 'touch flask/init.wsgi'
As stated here: Flask + mod_wsgi automatic reload on source code change, if you have enabled WSGIScriptReloading, you can just touch that file. It will cause the entire code to reload (not just the config file). But, if you prefer, you can set any other script to reload the code.
After googling a bit, it seems to be a pretty standard solution for that problem and I think that you can use it for your application.

Automating jobs on remote servers with python2.7

I need to make a python script that will do these steps in order, but I'm not sure how to go about setting this up.
SSH into a server
Copy a folder from point A to point B (cp /foo/bar/folder1 /foo/folder2)
mysql -u root -pfoobar (This database is accessible from localhost only)
create a database, do some other mysql stuff in the mysql console
Replace instances of Foo with Bar in file foobar
Copy and edit a file
Restart a service
The fact that I have to ssh into a server, and THEN do all of this, is really confusing me. I looked into the Fabric library, but that seems to only do one command at a time and doesn't keep context from previous commands.
I looked into the Fabric library, but that seems to only do one command at a time and doesn't keep context from previous commands.
Look into Fabric more. It is still probably what you want.
This page has a lot of good examples.
By "context" I'm assuming you want to be able to cd into another directory and run commands from there. That's what fabric.context_managers.cd is for -- search for it on that page.
Sounds like you are doing some sort of remote deployment/configuring. There's a whole world of tools out there to professionally set this up, look into Chef and Puppet.
Alternatively if you're just looking for a quick and easy way of scripting some remote commands, maybe pexpect can do what you need.
Pexpect is a pure Python module for spawning child applications; controlling them; and responding to expected patterns in their output.
I haven't used it myself but a quick glance at its manual suggests it can work with an SSH session fine: https://pexpect.readthedocs.org/en/latest/api/pxssh.html
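A rough pxssh sketch of the same idea (hostname and credentials are placeholders):
# sketch: drive an SSH session with pexpect's pxssh module
from pexpect import pxssh

session = pxssh.pxssh()
session.login("example-server", "myuser", "mypassword")   # placeholder credentials

session.sendline("cp -r /foo/bar/folder1 /foo/folder2")
session.prompt()                    # wait for the shell prompt to come back
print(session.before)               # everything printed before the prompt

session.sendline('mysql -u root -pfoobar -e "SHOW DATABASES;"')
session.prompt()
print(session.before)

session.logout()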
I have never used Fabric.
My way of solving those kinds of issues (before I started using saltstack) was to use pexpect to run the ssh connection and all the commands that were needed.
Maybe using a series of SQL scripts to work with the database (just to make it easier) would help.
Another way, since you need to access the remote server using ssh, would be to use paramiko to connect and execute commands remotely. It's a bit more complicated when you want to see what's happening on stdout (whereas with pexpect you will see exactly what's going on).
But it all depends on what you really need.
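For completeness, a paramiko sketch of the same kind of session (host, credentials and service name are placeholders):
# sketch: run remote commands over SSH with paramiko
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("example-server", username="myuser", password="mypassword")  # placeholders

for command in [
    "cp -r /foo/bar/folder1 /foo/folder2",
    "sed -i 's/Foo/Bar/g' /foo/folder2/foobar",
    "sudo service myservice restart",      # placeholder service name
]:
    stdin, stdout, stderr = client.exec_command(command)
    print(command, "->", stdout.read(), stderr.read())

client.close()
Note that each exec_command call gets a fresh shell, so directory changes don't carry over between calls; chain commands with && or a leading cd within one call if you need that context.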

Python/Flask: Application is running after closing

I'm working on a simple Flask web application. I use Eclipse/Pydev. While working on the app, I have to restart it very often because of code changes, and that's the problem. When I run the app, I can see the frame on my localhost, which is good. But when I want to close the app and click the red square which should stop applications in Eclipse, sometimes (often) the old version of the application keeps running, so I can't test the new version. In that case the only thing that helps is to force-close every process in Windows Task Manager.
Will you give me any advice how to manage this problem? Thank you in advance.
EDIT: This may help: many times I have to run the app twice, otherwise I can't connect.
I've faced the same problem and solved it. I think it may help.
When we run a Flask-based site locally it is assigned TCP port 5000 by default, at the IP 127.0.0.1 (so 127.0.0.1:5000).
Sometimes the process holding that TCP port keeps running even after closing the program or terminating the code. So the idea is to kill that process. You can do it from the command prompt (cmd).
Two steps to follow:
1. Find the process ID (PID) that is holding the TCP connection.
Go to cmd and type:
netstat -ano
and look for the row whose local address ends in :5000; the PID is in the last column.
2. Kill the process by PID. The command for this is taskkill /f /pid [PID]. An example is shown below.
taskkill /f /pid 7332
I've had a very similar thing happen to me. I was using CherryPy rather than Flask, but my solution might still work for you. Oftentimes browsers save webpages locally so that they don't have to re-download them every time the website is visited. This is called caching, and although it's very useful for the average web user, it can be a real pain to app developers. If you're frequently generating new versions of the application, it's possible that your browser is displaying an old version of the app that it has cached instead of the most up to date version. I recommend clearing that cache every time you restart your application, or disabling the cache altogether.
This actually shouldn't happen with the latest versions of PyDev (i.e.: since PyDev 3.4.1: http://pydev.org/history_pydev.html, PyDev should kill all the subprocesses of the main process).
So, can you check which PyDev version you are using?
If you're in the latest version of PyDev, you can use Ctrl+Shift+F9 to terminate/relaunch by default.
But as you're dealing with flask, you should be able to have it reload automatically on code changes, without doing anything in PyDev, by setting use_reloader=True.
I.e.: I haven't actually tested it, but its documentation says you can set the reload flag for that: run(use_reloader=True) -- and PyDev should even be able to debug it (I'll take a better look and improve the PyDev docs in that area later on).
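For reference, a minimal sketch of what that looks like in the app's entry point (standard Flask API, nothing PyDev-specific; the route is just an example):
# sketch: let Flask's development server restart itself on code changes
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello"

if __name__ == "__main__":
    # use_reloader=True restarts the server process whenever a source file changes,
    # so you don't need to stop and relaunch it from Eclipse after every edit
    app.run(host="127.0.0.1", port=5000, use_reloader=True)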

python testing server-deployed application

I've got a small application (https://github.com/tkoomzaaskz/cherry-api) and I would like to integrate it with travis. In fact, travis is probably not important here. My question is how can I configure a build/job to execute the following sequence:
start the server that serves the application
run tests
close the server (which means close the build)
The application is written in python/CherryPy (a basic webapp framework). On my localhost I do it using two consoles: one runs the server and the other runs the tests - it's pretty easy and works fine. But when I want to execute all this in the CI environment, I run into trouble - I'm unable to regain control after the server is started, because the server process waits for requests... and waits... and waits... and the tests are never run (https://travis-ci.org/tkoomzaaskz/cherry-api/builds/10855029 - this build runs forever). Additionally, I don't know how to close the server. This is my .travis.yml:
before_script: python src/hello.py
script: nosetests
src/hello.py starts the built-in CherryPy server (it listens on localhost:8080). I know I can move it to the background by adding the &: before_script: python src/hello.py & but then I'd have to find the process ID in the CI environment and kill the process, which seems like a very dirty solution, and I guess there's something better than that.
I'd appreciate any hints on how can I configure this.
edit: I've configured this dirty run-in-the-background-and-then-kill-the-process approach in this file. The build passes now. Still, I think it's ugly...
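One somewhat cleaner variant, sketched here under the assumption that nose is the runner (the fixture module name is just an example): let the test suite own the server process by starting it with subprocess in a package-level setup and terminating it in teardown, so .travis.yml only needs script: nosetests.
# tests/__init__.py -- sketch: start/stop the CherryPy app around the test run
import subprocess
import sys
import time

server = None

def setup_package():
    """nose calls this once before any test in the package runs."""
    global server
    server = subprocess.Popen([sys.executable, "src/hello.py"])
    time.sleep(2)   # crude wait for localhost:8080 to come up; polling would be nicer

def teardown_package():
    """nose calls this once after the last test, so the build can finish."""
    server.terminate()
    server.wait()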
