Mock a Remote Host in Python

I am writing some functions, using paramiko, to execute commands and create files on a remote host. I would like to write some unit tests for them, but I'm not sure what the simplest way to achieve this would be. This is what I envisage as an example outline of my code:
import os
import paramiko
import pytest

def my_function(hostname, relpath='.', **kwargs):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(hostname, **kwargs)
    sftp = ssh.open_sftp()
    sftp.chdir(relpath)
    stdin, stdout, stderr = ssh.exec_command("echo hallo > test.txt")
@pytest.fixture(scope="module")
def mock_remote_host():
    # start a remote host here with a local test path
    try:
        yield hostname, testpath, {"username": "bob", "password": "1234"}
    finally:
        # delete the test path
        # close the remote host
        pass

def test_my_function(mock_remote_host):
    hostname, dirpath, kwargs = mock_remote_host
    my_function(hostname, **kwargs)
    filepath = os.path.join(dirpath, 'test.txt')
    assert os.path.exists(filepath)
I have had a look at the paramiko test modules, but they seem quite complex for my use case and I'm not sure how to go about simplifying them.

I think what you really need to mock is the paramiko.SSHClient object. You are unit testing your function my_function, so you can assume the paramiko module works correctly; the only thing you need to unit test is whether my_function calls the methods of paramiko.SSHClient in the correct way.
To mock the paramiko.SSHClient class you can use unittest.mock and decorate your test_my_function function with @mock.patch.object(paramiko.SSHClient, sshclientmock). You have to define sshclientmock as some kind of Mock or MagicMock first.
Also, in Python 2.7 there is an equivalent of unittest.mock, but I don't remember where to find it exactly.
EDIT: As @chepner mentioned in a comment, for Python 2.7 you can find the mock module on PyPI and install it with pip install mock.
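For illustration, here is a minimal sketch of that idea, using mock.patch to replace the whole class rather than patch.object; my_module is a hypothetical module holding my_function from the question, and the expected calls simply mirror the question's code:

from unittest import mock

import my_module  # hypothetical module that defines my_function

@mock.patch("paramiko.SSHClient", autospec=True)
def test_my_function_calls_ssh_correctly(mock_ssh_cls):
    # Instantiating the patched class yields a spec'd mock instance.
    mock_client = mock_ssh_cls.return_value
    # exec_command normally returns (stdin, stdout, stderr).
    mock_client.exec_command.return_value = (mock.Mock(), mock.Mock(), mock.Mock())

    my_module.my_function("somehost", username="bob", password="1234")

    # Verify my_function drove the client the way we expect.
    mock_client.connect.assert_called_once_with("somehost", username="bob", password="1234")
    mock_client.open_sftp.return_value.chdir.assert_called_once_with(".")
    mock_client.exec_command.assert_called_once_with("echo hallo > test.txt")

Note that this only verifies the calls my_function makes; it says nothing about whether a file actually appears anywhere, which is what the fixture-based approach in the question is aiming for.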

To answer my own question, I have created: https://github.com/chrisjsewell/atomic-hpc/tree/master/atomic_hpc/mockssh.
As the README discusses, it is based on https://github.com/carletes/mock-ssh-server/tree/master/mockssh, with additions (implementing more SFTP functions) based on https://github.com/rspivak/sftpserver.
The following changes have also been made:
revised the users parameter, so that either a private_path_key or a password can be used
added a dirname parameter to the Server context manager, so that it will be set as the root path for the duration of the context
patched paramiko.sftp_client.SFTPClient.chdir to fix its use with relative paths
See test_mockssh.py for example uses.
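For a rough idea of the usage pattern, here is a sketch in the style of the upstream mock-ssh-server API that this package extends; the exact names of the password and dirname options in the fork may differ, so treat test_mockssh.py as the authoritative reference:

import mockssh

# Map a user id to a private key path (illustrative path).
users = {"bob": "/path/to/bob_rsa"}

with mockssh.Server(users) as server:
    # The server hands back an already-connected paramiko SSHClient.
    client = server.client("bob")
    stdin, stdout, stderr = client.exec_command("echo hallo > test.txt")
    print(stdout.read())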

If you want to test remote connectivity, the remote filesystem structure, and remote path navigation, you have to set up a mock host server (a VM, perhaps). In other words, if you want to test your actions on the host, you have to mock the host.
If you want to test your actions with the data from the host, the easiest way seems to be to proceed as running.t said in the other answer.

I agree with HraBal, because of "Infrastructure as code": you can treat a virtual machine as a block of code.
For example:
Use Vagrant or Docker to spin up an SSH server, and modify your DNS (or hosts) configuration so that the target domain resolves to 127.0.0.1.
Put the application on the server, then use paramiko to connect to the target domain and test whatever you want.
The benefit is that you can do this for any programming language and don't need to reinvent the wheel. In addition, you and your successors will know the details of the system.
(My English is not very good.)

Related

Check whether deployed in cloud

I have a python program running in a Docker container. My authentication method depends on whether the container is deployed in GCP or not. Ideally I'd have a function like this:
def deployment_environment():
    # return 'local' if [some test] else 'cloud'
    pass
What's the most idiomatic way of checking this? My instinct is to use an env var named [APP_NAME]_DEPLOYMENT_ENVIRONMENT, which gets set either way, but making sure this is set correctly has too many moving parts. Is there a GCP package/tool which can check for me?
There are two solutions I've arrived at:
With env
Set an env var when deploying, like so:
gcloud functions deploy [function-name] --set-env-vars ENV_GCP=1
Then, in your code:
import os

def deployment_environment():
    return 'cloud' if 'ENV_GCP' in os.environ else 'local'
Pros: intent is clear, both when setting and when using the env var; idiomatic.
Cons: more involved; relies on the user setting the env var correctly.
Via Python, with Sockets
import socket

def deployment_environment():
    try:
        socket.getaddrinfo('metadata.google.internal', 80)
        return 'cloud'
    except socket.gaierror:
        return 'local'
Pros: more succinct; doesn't rely on the extra step of setting an env var.
Cons: makes improper use of try/except; depends on the socket package and on GCP's runtime contract.
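If it helps, the two checks can also be combined: trust the explicit env var when present, and fall back to the metadata DNS probe otherwise. This is just a sketch stitching together the two snippets above.

import os
import socket

def deployment_environment():
    # An explicit override wins, so deployments that set ENV_GCP stay predictable.
    if 'ENV_GCP' in os.environ:
        return 'cloud'
    try:
        # metadata.google.internal only resolves inside GCP.
        socket.getaddrinfo('metadata.google.internal', 80)
        return 'cloud'
    except socket.gaierror:
        return 'local'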

Learning Python fast: how can I protect some private connections from being exposed?

Hi, I'm new to the community and new to Python (experienced but rusty in other high-level languages), so my question is simple.
I made a simple script to connect to a private ftp server, and retrieve daily information from it.
from ftplib import FTP

#Connect to server to retrieve inventory
#Open ftp connection
def FTPconnection(file_name):
    ftp = FTP('ftp.serveriuse.com')
    ftp.login('mylogin', 'password')
    #List the files in the current directory
    print("Current File List:")
    file = ftp.dir()
    print(file)
    # # #Get the latest csv file from server
    # ftp.cwd("/pub")
    gfile = open(file_name, "wb")
    ftp.retrbinary('RETR ' + file_name, gfile.write)
    gfile.close()
    ftp.quit()

FTPconnection('test1.csv')
FTPconnection('test2.csv')
That's the whole script: it passes my credentials and then calls the function FTPconnection on two different files I'm retrieving.
My other script, which processes them, has an import statement; I call this script as a module, so all the import does is connect to the FTP server and fetch the information.
import ftpconnect as ftpc
This is in the other Python script, the one that does the processing.
It works, but I want to improve it, so I need some guidance on best practices for how to do this, because in Spyder 4.1.5 I get a 'Module ftpconnect called but unused' warning ... so I am probably missing something here. I'm developing on macOS using Anaconda and Python 3.8.5.
I'm trying to build an app to automate some tasks, but I couldn't find anything about modules that guided me to better code; the docs simply say you have to import whatever .py file name you used and that it will be treated as a module.
And my final question is: how can I protect private information (the FTP credentials) from being exposed? This is not about protecting my code, just the credentials.
There are a few options for storing passwords and other secrets that a Python program needs to use, particularly a program that needs to run in the background where it can't just ask the user to type in the password.
Problems to avoid:
Checking the password in to source control where other developers or even the public can see it.
Other users on the same server reading the password from a configuration file or source code.
Having the password in a source file where others can see it over your shoulder while you are editing it.
Option 1: SSH
This isn't always an option, but it's probably the best. Your private key is never transmitted over the network; SSH just runs mathematical calculations to prove that you have the right key.
In order to make it work, you need the following:
The database or whatever you are accessing needs to be accessible by SSH. Try searching for "SSH" plus whatever service you are accessing. For example, "ssh postgresql". If this isn't a feature on your database, move on to the next option.
Create an account to run the service that will make calls to the database, and generate an SSH key.
Either add the public key to the service you're going to call, or create a local account on that server, and install the public key there.
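To make the shape of this concrete, here is a hedged sketch using the third-party sshtunnel package (not required by anything above, just one convenient way to drive SSH port forwarding from Python); the host name, port, and key path are illustrative:

from sshtunnel import SSHTunnelForwarder  # pip install sshtunnel

# Forward a local port to the database port on the remote server,
# authenticating with the service account's private key.
with SSHTunnelForwarder(
    ('db.example.com', 22),               # illustrative SSH endpoint
    ssh_username='my_app',
    ssh_pkey='/home/my_app/.ssh/id_rsa',  # the key itself never leaves this machine
    remote_bind_address=('127.0.0.1', 5432),
) as tunnel:
    # Talk to the database through the forwarded local port; no password needed here.
    db_connect('127.0.0.1:{}'.format(tunnel.local_bind_port), 'my_app', '')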
Option 2: Environment Variables
This one is the simplest, so it might be a good place to start. It's described well in the Twelve Factor App. The basic idea is that your source code just pulls the password or other secrets from environment variables, and then you configure those environment variables on each system where you run the program. It might also be a nice touch if you use default values that will work for most developers. You have to balance that against making your software "secure by default".
Here's an example that pulls the server, user name, and password from environment variables.
import os
server = os.getenv('MY_APP_DB_SERVER', 'localhost')
user = os.getenv('MY_APP_DB_USER', 'myapp')
password = os.getenv('MY_APP_DB_PASSWORD', '')
db_connect(server, user, password)
Look up how to set environment variables in your operating system, and consider running the service under its own account. That way you don't have sensitive data in environment variables when you run programs in your own account. When you do set up those environment variables, take extra care that other users can't read them. Check file permissions, for example. Of course any users with root permission will be able to read them, but that can't be helped. If you're using systemd, look at the service unit, and be careful to use EnvironmentFile instead of Environment for any secrets. Environment values can be viewed by any user with systemctl show.
Option 3: Configuration Files
This is very similar to the environment variables, but you read the secrets from a text file. I still find the environment variables more flexible for things like deployment tools and continuous integration servers. If you decide to use a configuration file, Python supports several formats in the standard library, like JSON, INI, netrc, and XML. You can also find external packages like PyYAML and TOML. Personally, I find JSON and YAML the simplest to use, and YAML allows comments.
Three things to consider with configuration files:
Where is the file? Maybe a default location like ~/.my_app, and a command-line option to use a different location.
Make sure other users can't read the file.
Obviously, don't commit the configuration file to source code. You might want to commit a template that users can copy to their home directory.
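As a minimal sketch of this option (the path and key names are illustrative, not prescriptive):

import json
import os

CONFIG_PATH = os.path.expanduser('~/.my_app/config.json')  # illustrative default location

def load_config(path=CONFIG_PATH):
    # The file should be readable only by the account running the service (e.g. chmod 600).
    with open(path) as f:
        return json.load(f)

config = load_config()
db_connect(config['server'], config['user'], config['password'])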
Option 4: Python Module
Some projects just put their secrets right into a Python module.
# settings.py
db_server = 'dbhost1'
db_user = 'my_app'
db_password = 'correcthorsebatterystaple'
Then import that module to get the values.
# my_app.py
from settings import db_server, db_user, db_password
db_connect(db_server, db_user, db_password)
One project that uses this technique is Django. Obviously, you shouldn't commit settings.py to source control, although you might want to commit a file called settings_template.py that users can copy and modify.
I see a few problems with this technique:
Developers might accidentally commit the file to source control. Adding it to .gitignore reduces that risk.
Some of your code is not under source control. If you're disciplined and only put strings and numbers in here, that won't be a problem. If you start writing logging filter classes in here, stop!
If your project already uses this technique, it's easy to transition to environment variables. Just move all the setting values to environment variables, and change the Python module to read from those environment variables.
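Sketching that transition, the settings module keeps its public names but reads everything from the environment (the variable names and defaults are only illustrative):

# settings.py
import os

db_server = os.getenv('MY_APP_DB_SERVER', 'dbhost1')
db_user = os.getenv('MY_APP_DB_USER', 'my_app')
db_password = os.getenv('MY_APP_DB_PASSWORD', '')

The rest of the code can keep doing from settings import db_server, db_user, db_password unchanged.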

Can Python package fsspec read SSH config?

I would like to access files on a remote SSH server from Python, and found fsspec. However, there seem to be few code usage examples.
In particular, I can connect by specifying all SSH config options in the function as:
fsspec.filesystem('sftp', host='XXX.XXX.XXX.XXX', port=XXX, username='XXX', password='XXX')
However, I would like to connect simply as fsspec.filesystem('sftp', host='nickname'), as I would do with sftp nickname on the console, since I have already set all the config options in .ssh/config.
This is both for convenience and because I do not want to pass my password around in plain text.
I have read the API documentation (https://filesystem-spec.readthedocs.io/en/latest/api.html) and searched a bit, but could not find a way yet. May I ask if anyone can point me in some direction?
Many thanks!
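One possible direction (an untested sketch, not a confirmed fsspec feature): paramiko ships an SSHConfig parser, so the options for a Host alias can be resolved from ~/.ssh/config and handed to fsspec.filesystem explicitly:

import os.path

import fsspec
import paramiko

# Resolve the 'nickname' Host entry from the local SSH config.
ssh_config = paramiko.SSHConfig()
with open(os.path.expanduser('~/.ssh/config')) as f:
    ssh_config.parse(f)
opts = ssh_config.lookup('nickname')

fs = fsspec.filesystem(
    'sftp',
    host=opts['hostname'],
    port=int(opts.get('port', 22)),
    username=opts.get('user'),
    key_filename=opts.get('identityfile'),  # paramiko accepts a list of key paths
)
print(fs.ls('.'))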

Scrapyd deploy project on a server with dynamic ip

I want to deploy my Scrapy project to an IP that is not listed in the scrapy.cfg file, because the IP can change and I want to automate the deployment process. I tried giving the IP of the server directly in the deploy command, but it did not work. Any suggestions on how to do this?
First, you should consider assigning a domain to the server, so you can always get to it regardless of its dynamic IP. DynDNS comes handy at times.
Second, you probably won't do the first, because you don't have access to the server, or for some other reason. In that case, I suggest mimicking the above behavior by using your system's hosts file. As described in the Wikipedia article:
The hosts file is a computer file used by an operating system to map hostnames to IP addresses.
For example, let's say you set your url to remotemachine in your scrapy.cfg. You can write a script that edits the hosts file with the latest IP address and execute it before deploying your spider. This approach has the benefit of having a system-wide effect, so if you are deploying multiple spiders, or using the same server for some other purpose, you don't have to update multiple configuration files.
This script could look something like this:
import fileinput
import sys

def update_hosts(hostname, ip):
    if 'linux' in sys.platform:
        hosts_path = '/etc/hosts'
    else:
        hosts_path = r'c:\windows\system32\drivers\etc\hosts'
    # Rewrite the hosts file in place, replacing the line for our hostname.
    for line in fileinput.input(hosts_path, inplace=True):
        if hostname in line:
            print("{0}\t{1}".format(hostname, ip))
        else:
            print(line.strip())

if __name__ == '__main__':
    hostname = sys.argv[1]
    ip = sys.argv[2]
    update_hosts(hostname, ip)
    print("Done!")
Of course, you should do additional argument checks, etc.; this is just a quick example.
You can then run it prior deploying like this:
python updatehosts.py remotemachine <remote_ip_here>
If you want to take it a step further and add this functionality as a simple argument to scrapyd-deploy, you can go ahead and edit your scrapyd-deploy file (it's just a Python script) to add the additional parameter and update the hosts file from within. But I'm not sure this is the best thing to do, since leaving this implementation separate and more explicit would probably be a better choice.
This is not something you can solve on the scrapyd side.
According to the source code of scrapyd-deploy, it requires the url to be defined in the [deploy] section of the scrapy.cfg.
One of the possible workarounds could be having a placeholder in scrapy.cfg which you would replace with a real IP address of the target server, before starting scrapyd-deploy.
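A minimal sketch of that workaround (the template file name and the TARGET_IP placeholder are illustrative):

import sys

PLACEHOLDER = 'TARGET_IP'  # illustrative token used in the url line of scrapy.cfg.template

def render_scrapy_cfg(ip, template='scrapy.cfg.template', output='scrapy.cfg'):
    # Read the template, substitute the current server IP, and write the real config.
    with open(template) as f:
        contents = f.read()
    with open(output, 'w') as f:
        f.write(contents.replace(PLACEHOLDER, ip))

if __name__ == '__main__':
    render_scrapy_cfg(sys.argv[1])
    # Then run scrapyd-deploy as usual.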

How to deal with interactive API in python

I'm in a situation where I need to pass some text to a prompt generated by an API (which seems like pretty weird behavior for an API; this is the first time I've run into it), like below:
kvm_cli = libvirt.open("qemu+ssh://han@10.0.10.8/system")
Then a prompt shows up asking for the SSH password (password for 10.0.10.8 is:), and I have to type it there manually in order to move on and get the kvm_cli object I need.
I tried to use the pexpect module to deal with this, but it is for OS command lines rather than APIs.
It's also possible to work around this by using SSH key/certificate files, but that's not a favorable authentication approach in our scenario.
Since our wrapper around the open method is not interactive, we cannot ask the user to input the password. Do you have any thoughts on how I could address this?
I am not a libvirt user, but I believe that the problem is not in the library, but in the connection method. You seem to be connecting via ssh, so you need to authenticate yourself.
I've been reading the libvirt page on ArchWiki, and I think that you could try:
setting up the simple (TCP/IP socket) connection method, or
setting up key-based, password-less SSH login for your virtual host.
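As a sketch of the second option (assuming a key pair has already been generated and its public key installed for han on 10.0.10.8), libvirt's SSH transport accepts extra URI parameters such as keyfile, so the connection could look roughly like this:

import libvirt

# Point the SSH transport at a specific private key; once key-based login
# works for this host, no password prompt should appear.
URI = "qemu+ssh://han@10.0.10.8/system?keyfile=/home/han/.ssh/id_rsa"

kvm_cli = libvirt.open(URI)
print(kvm_cli.getHostname())
kvm_cli.close()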
