The scenario: our production servers sit in a private subnet with a NAT instance in front of them to allow maintenance via SSH. Currently we SSH to the NAT instance and then SSH from there to the respective server.
What I would like to do is run deployment tasks from my machine using the NAT as a proxy without uploading the codebase to the NAT instance. Is this possible with Fabric or am I just going to end up in a world of pain?
EDIT
Just to follow up on this, as @Morgan suggested, the gateway option will indeed fix this issue.
For a bit of completeness, in my fabfile.py:
from fabric.api import env, roles, run

def setup_connections():
    """
    This should be called as the first task in all calls in order to set up the correct connections
    e.g. fab setup_connections task1 task2...
    """
    env.roledefs = {}
    env.gateway = 'ec2-user@X.X.X.X'  # where all the magic happens
    tag_mgr = EC2TagManager(...)
    for role in ['web', 'worker']:
        env.roledefs[role] = ['ubuntu@%s' % ins for ins in
                              tag_mgr.get_instances(instance_attr='private_ip_address', role=role)]
    env.key_filename = '/path/to/server.pem'

@roles('web')
def test_uname_web():
    run('uname -a')
I can now run fab setup_connections test_uname_web and get the uname of my web server.
So if you have a newer version of Fabric (1.5+) you can try using the gateway options. I've never used it myself, but it seems like what you want.
Documentation here:
env var
cli flag
execution model notes
original ticket
Also, if you run into any issues, all of us tend to idle in IRC.
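For what it's worth, the CLI flag mentioned above is just the command-line equivalent of setting env.gateway in the fabfile; assuming Fabric 1.5+, something like this should work (the task name is a placeholder):

fab --gateway=ec2-user@X.X.X.X mytask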
I have a small AWS Lambda function, shown below.
It grabs the credentials to connect to Snowflake from SSM Parameter Store and then calls snowflake.connector.connect. It's obviously meant to grab data from my Snowflake data warehouse. However, the code hangs and never finishes the snowflake.connector.connect call.
I believe my subnets and networking are set up properly:
Just to test and develop, I set my security group to allow all inbound and outbound traffic on all ports.
I have my Lambda running in a private subnet, with a route table that directs 0.0.0.0/0 to the NAT Gateway. In my code, I print(requests.get('http://216.58.192.142')) just to prove that I do indeed have internet connectivity.
I have many large dependencies that don't fit in the 200MB deploy package for Lambdas, so I have my dependencies mounted in an EFS file system at /mnt/efs path, and I add /mnt/efs/python to my PYTHONPATH in the code before I start to import those dependencies.
import sys

# dependencies too large for the deploy package live on EFS (see above)
sys.path.append("/mnt/efs/python")

import json

import boto3
import requests
import snowflake.connector

print("Getting snowflake creds")
session = boto3.session.Session()
ssm = session.client("ssm")
obj = ssm.get_parameter(Name="snowflake", WithDecryption=True)
sf_creds = json.loads(obj.get("Parameter").get("Value"))

def get_data(event, context):
    # prove outbound internet connectivity through the NAT Gateway
    print(requests.get('http://216.58.192.142'))
    print("Executing")
    print("got parameter, connecting")
    con = snowflake.connector.connect(
        user=sf_creds["USER"],
        password=sf_creds["PASSWORD"],
        account=sf_creds["ACCOUNT"],
        ocsp_response_cache_filename="/tmp/ocsp_response_cache"
    )
    print("connected")
When I run this same exact code locally on my MacBook, I am able to connect fairly quickly, within a second or two. My snowflake-connector-python version is 2.3.2.
However, no matter how long I wait, the connect method hangs when it executes in an AWS Lambda function. I'm really not sure what is going on.
I've verified that
the AWS Lambda function is connected to the internet (it receives a [200 OK] from the requests.get call).
security groups are as permissive as possible (allow all traffic on all ports both inbound and outbound)
I have not touched the NACL
Really, I'm at a loss as to why this is happening, especially given that the code works fine on my local machine - could someone try to point me in the right direction?
The Lambda can respond with a 200, but there can still be an exception in the logs. Check the CloudWatch logs for this Lambda.
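If you'd rather check from code than from the console, a rough sketch with boto3 could look like this (the function name is a placeholder and the filter pattern is just an example):

import time

import boto3

logs = boto3.client("logs")
resp = logs.filter_log_events(
    logGroupName="/aws/lambda/my-snowflake-function",  # Lambda log groups follow this naming scheme
    startTime=int((time.time() - 3600) * 1000),        # only the last hour, in milliseconds
    filterPattern="?ERROR ?Traceback",                 # surface errors and stack traces
)
for event in resp.get("events", []):
    print(event["message"])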
I would like to parallelise queries to a MongoDB database, using pymongo. I am using an HPC system, which uses Slurm as the workload manager. I have a setup which works fine on a single node, but fails when the tasks are spread across more than one node.
I know that the problem is that mongodb is bound to localhost on the node I start it on, and therefore the additional nodes can't connect to it.
I specifically would like to know how to start and then connect to the mongodb server when using multiple HPC nodes. Thanks!
Some extra details:
Before starting my python script, I start the mongodb like this:
numactl --interleave=all mongod --dbpath=database &
And I get the warning message:
** WARNING: This server is bound to localhost.
** Remote systems will be unable to connect to this server.
** Start the server with --bind_ip <address> to specify which IP
** addresses it should serve responses from, or with --bind_ip_all to
** bind to all interfaces. If this behavior is desired, start the
** server with --bind_ip 127.0.0.1 to disable this warning.
In my python script, I have a worker function which is run by each processor. It is basically structured like this:
def worker(args):
    cl = pymongo.MongoClient()
    db = cl.mydb
    collection = db['mycol']
    query = {}
    result = collection.find_one(query)
    # now do some work...
The warning message mentions --bind_ip <address>. To know the IP address of a compute node, the simplest solution is to use the hostname -i command. So in your submission script, try
numactl --interleave=all mongod --dbpath=database --bind_ip $(hostname -i) &
But then, your Python script must also know the IP address of the node on which MongoDB is running:
def worker(args):
    cl = pymongo.MongoClient(host=<IP of MongoDB Server>)
    db = cl.mydb
    collection = db['mycol']
    query = {}
    result = collection.find_one(query)
    # now do some work...
You will need to adapt the <IP of MongoDB Server> part depending on how you want to pass the information to the Python script. It can be through a command-line parameter, through the environment, through a file, etc.
Do not forget to use srun to run the Python script on all nodes of the allocation, or you will need to implement that functionality in your Python script itself.
Also consider changing the default port of MongoDB from job to job to avoid possible interference if you have several of them running.
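For example, passing the address (and a per-job port) through the environment could look roughly like this; the MONGO_HOST and MONGO_PORT variable names are my own choice, and srun propagates the exported environment to the tasks:

# in the submission script, before launching the workers:
#   export MONGO_HOST=$(hostname -i)
#   export MONGO_PORT=27017
#   numactl --interleave=all mongod --dbpath=database --bind_ip $MONGO_HOST --port $MONGO_PORT &

import os
import pymongo

def worker(args):
    cl = pymongo.MongoClient(host=os.environ["MONGO_HOST"],
                             port=int(os.environ.get("MONGO_PORT", 27017)))
    db = cl.mydb
    collection = db['mycol']
    result = collection.find_one({})
    # now do some work...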
I am using Invoke/Fabric with boto3 to create an AWS instance and hand it over to an Ansible script. In order to do that, a few things have to be prepared on the remote machine before Ansible can take over, notably installing Python, creating a user, and copying public SSH keys.
The AWS image comes with a particular user. I would like to use this user only to create my own user, copy public keys, and remove password login afterwards. While using the Fabric CLI the connection object is not created and cannot be modified within tasks.
What would be a good way to switch users (aka recreate a connection object between tasks) and run the following tasks with the user that I just created?
I might not be going about it the right way (I am migrating from Fabric 1, where switching the env values was sufficient). Here are a few strategies I am aware of; most of them remove some flexibility we have been relying on.
Create a custom AMI on which all preparations have been done already.
Create a local Connection object within a task for the user setup before falling back to the connection object provided by the Fabric CLI (see the sketch below).
Integrate AWS more deeply with Ansible (the problem is that we have users who might use Ansible after the instance is alive but don't have AWS privileges).
I guess this also amounts to a best-practice question.
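To illustrate the second strategy (a local Connection object for the user setup), the kind of thing I have in mind is roughly this (a sketch only; user names, key paths, and the host argument are placeholders):

from fabric import Connection, task

@task
def bootstrap(c, host):
    # first connection as the image's default user
    admin = Connection(host, user="ec2-user",
                       connect_kwargs={"key_filename": "/path/to/bootstrap.pem"})
    admin.sudo("useradd -m -s /bin/bash deploy")
    admin.put("deploy_key.pub", "/tmp/deploy_key.pub")
    admin.sudo("install -d -o deploy -g deploy -m 700 /home/deploy/.ssh")
    admin.sudo("install -o deploy -g deploy -m 600 /tmp/deploy_key.pub /home/deploy/.ssh/authorized_keys")
    admin.close()

    # fresh connection as the user just created; later tasks would use this one
    deploy = Connection(host, user="deploy",
                        connect_kwargs={"key_filename": "/path/to/deploy.pem"})
    deploy.run("whoami")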
The AWS image comes with a particular user. I would like to use this user
only to create my own user, copy public keys, and remove password login
afterwards. While using the Fabric CLI the connection object is not created
and cannot be modified within tasks.
I'm not sure this is accurate. I have switched users during the execution of a task just fine. You just have to make sure that all subsequent calls that need the updated env use the execute operation.
e.g.
from fabric.api import env, execute, run, task

def create_users():
    run('some command')

def some_other_stuff():
    run('whoami')

@task
def new_instance():
    # provision instance using boto3
    env.hosts = [ip_address]
    env.user = 'ec2-user'
    env.password = 'sesame'
    execute(create_users)
    env.user = 'some-other-user'
    execute(some_other_stuff)
From python, I am using knife to launch a server.
e.g.
knife ec2 server create -r "role[nginx_server]" --region ap-southeast-1 -Z ap-southeast-1a -I ami-ae1a5dfc --flavor t1.micro -G nginx -x ubuntu -S sg_development -i /home/ubuntu/.ec2/sg_development.pem -N webserver1
I will then use the Chef Server API to check when the bootstrap is complete, so I can then use boto and other tools to configure the newly created server. Pseudocode will look like this:
cmd = """knife ec2 server create -r "role[nginx_server]...."""
os.system(cmd)
boot = False
while boot==False:
chefTrigger = getStatusFromChefApi()
if chefTrigger==True:
boot=True
continue with code for further proccessing
My question is: what is the trigger in the Chef Server that will indicate when the node has been fully processed by Chef? Note, I used -N to name the server and will query its properties, but what do I look for? Is there a bool? A status?
Thanks
TL;DR: Use a report/exception handler instead.
When the node has finished running chef-client successfully, it will save the node object to the Chef Server. One of the attributes automatically generated by ohai every time Chef runs is node['ohai_time'], which is the Unix epoch timestamp when ohai was executed (at the beginning of the Chef run). A node that has not successfully saved itself to the server will not have the ohai_time at all. However, this attribute merely tracks the time when ohai ran, not necessarily when chef-client saved to the server (since that can be a few seconds to minutes depending on what your recipes are doing). Note if the chef run exits due to an unhandled exception, it won't save to the server by default.
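If you do want to poll for ohai_time from Python, something along these lines might work with the PyChef library (an untested sketch; it assumes a knife.rb/client key is available locally and that the node name matches what you passed to -N):

import time

from chef import autoconfigure, Node

api = autoconfigure()  # reads your knife.rb / client key, much like knife does
started = time.time()

while True:
    try:
        node = Node('webserver1')      # the name given via -N
        ohai_time = node['ohai_time']  # merged attribute lookup
    except Exception:                  # node not registered yet, or attribute missing
        ohai_time = None
    if ohai_time and ohai_time > started:
        break                          # chef-client ran and saved the node after we started
    time.sleep(30)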
A more reliable way to be notified when a node has completed is to use a Report/Exception handler, which can send a message to a variety of places and APIs. See the documentation for more information.
In my Django project I need to be able to check whether a host on the LAN is up using an ICMP ping. I found this SO question which answers how to ping something in Python and this SO question which links to resources explaining how to use the sudoers file.
The Setting
A Device model stores an IP address for a host on the LAN, and after adding a new Device instance to the DB (via a custom view, not the admin) I envisage checking to see if the device responds to a ping using an AJAX call to an API which exposes the capability.
The Problem
However (from the docstring of a library suggested in the first SO question): "Note that ICMP messages can only be sent from processes running as root."
I don't want to run Django as the root user, since that is bad practice. However, this part of the process (sending an ICMP ping) needs to run as root. If I want a Django view to send off a ping packet to test the liveness of a host, then Django itself has to run as root, since that is the process invoking the ping.
Solutions
These are the solutions I can think of, and my question is are there any better ways to only execute select parts of a Django project as root, other than these:
Run Django as root (please no!)
Put a "ping request" in a queue that another processes -- run as root -- can periodically check and fulfil. Maybe something like celery.
Is there not a simpler way?
I want something like a "Django run as root" library, is this possible?
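For reference, the queue idea could look something like this with Celery (a sketch; the app name and broker URL are placeholders, and the Celery worker, not Django, would be the process running as root):

import subprocess

from celery import Celery

app = Celery("pings", broker="redis://localhost:6379/0")  # placeholder broker

@app.task
def ping_host(ip):
    # only the worker process needs the elevated privileges for raw ICMP
    return subprocess.call(["ping", "-c", "1", ip]) == 0

The view would then call ping_host.delay(ip) and pick up the result later.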
Absolutely no way, do not run the Django code as root!
I would run a daemon as root (written in Python, why not) and then use IPC between the Django instance and your daemon. As long as you validate the content, handle it properly (e.g. use subprocess.call with an argument list), and only pass in data (not commands to execute), it should be fine.
Here is an example client and server, using web.py
Server: http://gist.github.com/788639
Client: http://gist.github.com/788658
You'll need to install web.py (webpy.org), but it's worth having around anyway. If you can hard-wire the IP (or hostname) into the server and remove the argument, all the better.
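If you'd rather not depend on web.py, the same idea can be sketched with just the standard library (illustrative only; the socket path and ping arguments are assumptions): a small root-run daemon listens on a Unix socket, validates that it was given an IP address, and shells out to ping with an argument list, while Django just writes to the socket and reads the answer.

# ping_daemon.py -- run as root
import ipaddress
import socketserver
import subprocess

SOCKET_PATH = "/run/ping_daemon.sock"  # Django's user needs permission to connect

class PingHandler(socketserver.StreamRequestHandler):
    def handle(self):
        raw = self.rfile.readline().strip().decode()
        try:
            ip = str(ipaddress.ip_address(raw))  # only accept validated data, never commands
        except ValueError:
            self.wfile.write(b"error\n")
            return
        rc = subprocess.call(["ping", "-c", "1", "-W", "1", ip])
        self.wfile.write(b"up\n" if rc == 0 else b"down\n")

if __name__ == "__main__":
    # remove a stale socket file before starting; omitted for brevity
    with socketserver.UnixStreamServer(SOCKET_PATH, PingHandler) as server:
        server.serve_forever()

# client side, e.g. called from the Django view
import socket

def host_is_up(ip):
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect("/run/ping_daemon.sock")
    s.sendall(ip.encode() + b"\n")
    reply = s.makefile().readline().strip()
    s.close()
    return reply == "up"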
What's your OS here? You might be able to write a little program that does what you want given a parameter, and stick that in the sudoers file, and give your django user permission to run it as root.
/etc/sudoers
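For example (purely a sketch; the helper path and sudoers entry are hypothetical), with a sudoers line like "django ALL=(root) NOPASSWD: /usr/local/bin/ping_check", the Django side reduces to:

import subprocess

def ping_as_root(ip):
    # ping_check is a small root-only helper that validates its argument and does the ICMP ping
    return subprocess.call(["sudo", "-n", "/usr/local/bin/ping_check", ip]) == 0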
I don't know what kind of system you're on, but on any box I've encountered, one does not have to be root to run the command-line ping program (it has the suid bit set, so it becomes root as necessary). So you could just invoke that. It's a bit more overhead, but probably negligible compared to network latency.
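In Python that boils down to something like this (a sketch; the count/timeout flags are arbitrary and assume Linux iputils ping):

import subprocess

def host_is_up(ip):
    # the system ping binary is setuid root, so no special privileges are needed here
    return subprocess.call(["ping", "-c", "1", "-W", "1", ip],
                           stdout=subprocess.DEVNULL) == 0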