Can Python package fsspec read SSH config? - python

I would like to access files on a remote SSH server from Python, and found fsspec. However, there seem to be few usage examples.
In particular, I can connect by specifying all SSH config options in the function as:
fsspec.filesystem('sftp', host='XXX.XXX.XXX.XXX', port=XXX, username='XXX', password='XXX')
However, I would like to connect simply with fsspec.filesystem('sftp', host='nickname'), just as I would run sftp nickname on the console, where all the config options are already set in .ssh/config.
This is both for convenience and because I do not want to pass my password in plain text.
I have read the API documentation (https://filesystem-spec.readthedocs.io/en/latest/api.html) and searched a bit, but could not find a way yet. Could anyone point me in the right direction?
Many thanks!
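One possible direction, offered as a sketch rather than a confirmed fsspec feature: the 'sftp' filesystem is built on paramiko, and paramiko can parse an ssh_config file through its SSHConfig class. The two helper names below are hypothetical, and the sketch assumes that fsspec forwards extra keyword arguments (username, port, key_filename) to paramiko's SSHClient.connect and that paramiko is recent enough to provide SSHConfig.from_path.

```python
import os


def connect_kwargs_from_ssh_entry(entry):
    """Map an SSHConfig.lookup() result to keyword arguments for the 'sftp' filesystem."""
    kwargs = {"host": entry["hostname"]}
    if "user" in entry:
        kwargs["username"] = entry["user"]
    if "port" in entry:
        kwargs["port"] = int(entry["port"])  # ssh_config values are strings
    if "identityfile" in entry:
        # IdentityFile may be listed several times; expand "~" in each path
        kwargs["key_filename"] = [os.path.expanduser(p) for p in entry["identityfile"]]
    return kwargs


def sftp_from_ssh_config(nickname, config_path="~/.ssh/config"):
    """Open an fsspec SFTP filesystem for a Host alias defined in ssh_config."""
    # Imported lazily so the helper above works without these packages installed
    import fsspec
    import paramiko

    ssh_config = paramiko.SSHConfig.from_path(os.path.expanduser(config_path))
    entry = ssh_config.lookup(nickname)
    return fsspec.filesystem("sftp", **connect_kwargs_from_ssh_entry(entry))
```

With key-based login configured for the alias, sftp_from_ssh_config('nickname') would then avoid putting a password in the script at all. Options such as ProxyJump or ProxyCommand are not handled by this sketch.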

Related

Learning Python fast: how can I protect some private connections from being exposed?

Hi I'm new to the community and new to Python, experienced but rusty on other high level languages, so my question is simple.
I made a simple script to connect to a private ftp server, and retrieve daily information from it.
from ftplib import FTP
# Connect to the server and download one file from it
def FTPconnection(file_name):
    # Open ftp connection
    ftp = FTP('ftp.serveriuse.com')
    ftp.login('mylogin', 'password')
    # List the files in the current directory (ftp.dir() prints the listing itself and returns None)
    print("Current File List:")
    ftp.dir()
    # # #Get the latest csv file from server
    # ftp.cwd("/pub")
    # Download the file in binary mode; the with block closes the local file
    with open(file_name, "wb") as gfile:
        ftp.retrbinary('RETR ' + file_name, gfile.write)
    ftp.quit()
FTPconnection('test1.csv')
FTPconnection('test2.csv')
That's the whole script. It passes my credentials and then calls the function FTPconnection on the two files I'm retrieving.
My other script, which processes the files, has an import statement, because I tried to use this script as a module. All the import does is connect to the FTP server and fetch the information.
import ftpconnect as ftpc
This is in the other Python script, the one that does the processing.
It works, but I want to improve it, so I need some guidance on best practices for doing this. In Spyder 4.1.5 I get a 'Module ftpconnect called but unused' warning, so I am probably missing something here. I'm developing on MacOS using Anaconda and Python 3.8.5.
I'm trying to build an app to automate some tasks, but I couldn't find anything about modules that guided me to better code. Everything I found simply says that you import whatever .py file name you used and that file is then considered a module.
My final question is: how do you normally protect private information (FTP credentials) from being exposed? This is not about protecting my code, only the credentials.
There are a few options for storing passwords and other secrets that a Python program needs to use, particularly a program that needs to run in the background where it can't just ask the user to type in the password.
Problems to avoid:
Checking the password in to source control where other developers or even the public can see it.
Other users on the same server reading the password from a configuration file or source code.
Having the password in a source file where others can see it over your shoulder while you are editing it.
Option 1: SSH
This isn't always an option, but it's probably the best. Your private key is never transmitted over the network, SSH just runs mathematical calculations to prove that you have the right key.
In order to make it work, you need the following:
The database or whatever you are accessing needs to be accessible by SSH. Try searching for "SSH" plus whatever service you are accessing. For example, "ssh postgresql". If this isn't a feature on your database, move on to the next option.
Create an account to run the service that will make calls to the database, and generate an SSH key.
Either add the public key to the service you're going to call, or create a local account on that server, and install the public key there.
Option 2: Environment Variables
This one is the simplest, so it might be a good place to start. It's described well in the Twelve Factor App. The basic idea is that your source code just pulls the password or other secrets from environment variables, and then you configure those environment variables on each system where you run the program. It might also be a nice touch if you use default values that will work for most developers. You have to balance that against making your software "secure by default".
Here's an example that pulls the server, user name, and password from environment variables.
import os
server = os.getenv('MY_APP_DB_SERVER', 'localhost')
user = os.getenv('MY_APP_DB_USER', 'myapp')
password = os.getenv('MY_APP_DB_PASSWORD', '')
db_connect(server, user, password)
Look up how to set environment variables in your operating system, and consider running the service under its own account. That way you don't have sensitive data in environment variables when you run programs in your own account. When you do set up those environment variables, take extra care that other users can't read them. Check file permissions, for example. Of course any users with root permission will be able to read them, but that can't be helped. If you're using systemd, look at the service unit, and be careful to use EnvironmentFile instead of Environment for any secrets. Environment values can be viewed by any user with systemctl show.
Option 3: Configuration Files
This is very similar to the environment variables, but you read the secrets from a text file. I still find the environment variables more flexible for things like deployment tools and continuous integration servers. If you decide to use a configuration file, Python supports several formats in the standard library, like JSON, INI, netrc, and XML. You can also find external packages like PyYAML and TOML. Personally, I find JSON and YAML the simplest to use, and YAML allows comments.
Three things to consider with configuration files:
Where is the file? Maybe a default location like ~/.my_app, and a command-line option to use a different location.
Make sure other users can't read the file.
Obviously, don't commit the configuration file to source control. You might want to commit a template that users can copy to their home directory.
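As a rough illustration of the second point, here is a minimal sketch that loads secrets from a JSON file and refuses to use it when group or other users have any access. The function name load_secrets, the file name, and the keys are hypothetical, and the permission check assumes a POSIX system.

```python
import json
import os
import stat
import tempfile


def load_secrets(path):
    """Load secrets from a JSON file, refusing files that other users can access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} is accessible to group/others; run chmod 600")
    with open(path) as handle:
        return json.load(handle)


# Demonstration with a throwaway file; a real file might live at ~/.my_app
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_app.json")
    with open(path, "w") as handle:
        json.dump({"db_user": "myapp", "db_password": "example-only"}, handle)

    os.chmod(path, 0o600)  # owner read/write only: accepted
    secrets = load_secrets(path)

    os.chmod(path, 0o644)  # readable by everyone: rejected
    try:
        load_secrets(path)
        rejected = False
    except PermissionError:
        rejected = True
```

Failing loudly on loose permissions is a design choice borrowed from how the ssh client treats private key files; a warning instead of an exception would also be reasonable.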
Option 4: Python Module
Some projects just put their secrets right into a Python module.
# settings.py
db_server = 'dbhost1'
db_user = 'my_app'
db_password = 'correcthorsebatterystaple'
Then import that module to get the values.
# my_app.py
from settings import db_server, db_user, db_password
db_connect(db_server, db_user, db_password)
One project that uses this technique is Django. Obviously, you shouldn't commit settings.py to source control, although you might want to commit a file called settings_template.py that users can copy and modify.
I see a few problems with this technique:
Developers might accidentally commit the file to source control. Adding it to .gitignore reduces that risk.
Some of your code is not under source control. If you're disciplined and only put strings and numbers in here, that won't be a problem. If you start writing logging filter classes in here, stop!
If your project already uses this technique, it's easy to transition to environment variables. Just move all the setting values to environment variables, and change the Python module to read from those environment variables.
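A minimal sketch of that transition, assuming the variable names from the environment variable example above (MY_APP_DB_PASSWORD is an additional assumption here). The helper read_setting is hypothetical, and the sample dictionary stands in for os.environ so the snippet runs without any setup.

```python
# settings.py (sketch): same setting names as before, values now come from the environment
import os


def read_setting(name, default=None, environ=os.environ):
    """Return a setting from the environment, failing loudly if a required one is missing."""
    value = environ.get(name, default)
    if value is None:
        raise RuntimeError(f"required environment variable {name} is not set")
    return value


# Stand-in for os.environ, for demonstration only
example_env = {"MY_APP_DB_SERVER": "dbhost1", "MY_APP_DB_PASSWORD": "example-only"}

db_server = read_setting("MY_APP_DB_SERVER", "localhost", example_env)
db_user = read_setting("MY_APP_DB_USER", "my_app", example_env)
db_password = read_setting("MY_APP_DB_PASSWORD", environ=example_env)  # no default: required
```

Code that already does from settings import db_server, db_user, db_password keeps working unchanged, and the module no longer contains anything secret, so it can be committed.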

Where to store tns, username and password for a oracle DB connection in Unix to use with Python and R

I'm quite new to Unix and I'm stuck with this problem.
I have a Linux machine with R/RStudio and Python/Anaconda installed.
I have given this machine access to the hostname, port and service of my DB.
Now I have to create some sort of configuration file where I can store the username and password of the schema I want this machine to use to access the DB and query it through Python or R.
This configuration file must secure the password so that no one except the creator of the config file will know it. Other users will use R and Python to connect to the DB via some library, using the credentials in this config file.
How can I achieve this? Sorry if I have said something wrong.
If you know other methods to achieve this level of security, please explain.

How to avoid storing the username and password as part of the connection string

Situation
My team works with an NFS. I've written a Python script that connects to a database. People execute my Python script and, based on some values associated with their userid in our backend MongoDB database, the script performs certain actions for them.
My question
I have a mongodb connection string like so
database_client = MongoClient("mongodb://<myusername>:<mypassword>@serverip:port/DBName")
The problem is that anyone can simply read the py script and figure out what my username and password are.
What is not an option
Making the Python script an opaque executable is not allowed. Not to mention it doesn't protect my credentials from leaking to other members of the team who work on the same codebase. Nor is making the db world editable/readable an option. All requests to edit the db must go through an authorized account and only through the API provided by the script.
Is there a way to do this/what should I be doing instead?
Edit: I am NOT using Heroku. I'm behind a company firewall and the mongoDB server is a machine with an IP on the company network
One idea is to read the username and password from a configuration file [1] such as
[topsecret]
user = foo
password = bar
This way you can share the code with anyone and they will be able to execute it only if they have the valid configuration file.
[1] You can use the configparser module https://docs.python.org/3.4/library/configparser.html to parse configuration files such as above.
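A minimal sketch of that idea, assuming the [topsecret] section shown above. The helper names, the file name db.ini, and the port are hypothetical; the demonstration writes a throwaway file so the snippet is self-contained, whereas a real file would sit outside the shared codebase with restrictive permissions.

```python
import configparser
import os
import tempfile
from urllib.parse import quote_plus


def load_credentials(path, section="topsecret"):
    """Return (user, password) from an INI-style configuration file."""
    # interpolation=None keeps '%' characters in passwords literal
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(f"cannot read configuration file: {path}")
    return parser[section]["user"], parser[section]["password"]


def build_mongo_uri(user, password, host, db_name):
    """Build a MongoDB URI; credentials are percent-escaped for use in the URI."""
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}/{db_name}"


# Demonstration with a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "db.ini")
    with open(config_path, "w") as handle:
        handle.write("[topsecret]\nuser = foo\npassword = bar\n")
    user, password = load_credentials(config_path)
    uri = build_mongo_uri(user, password, "serverip:27017", "DBName")
```

The resulting string can be passed to MongoClient in place of the hard-coded one. Note that this only moves the secret: anyone who can read the configuration file can still read the password.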

How to deal with interactive API in python

I'm in a situation where I need to pass some text to a prompt generated by an API (which seems like pretty weird behavior for an API; this is the first time I've run into it), like below:
kvm_cli = libvirt.open("qemu+ssh://han@10.0.10.8/system")
Then a prompt shows up asking for the ssh password (password for 10.0.10.8 is:), and I have to type it in manually in order to move on and get the kvm_cli object I need.
I tried to use the pexpect module to deal with this, but it is meant for OS command lines rather than APIs.
It's also possible to work around this by using SSH key files, but that is not a favorable authentication approach in our scenario.
Since our wrapper around the 'open' method is not interactive, we cannot ask the user to input the password. Do you have any thoughts on how I could address this?
I am not a libvirt user, but I believe that the problem is not in the library, but in the connection method. You seem to be connecting via ssh, so you need to authenticate yourself.
I've been reading the libvirt page on ArchWiki, and I think that you could try:
setting up the simple (TCP/IP socket) connection method, or
setting up key-based, password-less SSH login for your virtual host.

deployment public keys

How do you guys deploy your code on your servers? I am using Fabric and Python and I would like a more automated way of pulling code from the repository through the use of public keys, but without any ops or manual intervention to set up the public keys.
Are you storing them in the code as text, or in a database, and generating the pk file on the fly? Any other opinions on this one?
This is what ssh-copy-id is for. It deploys your public key onto a machine for you. Key management isn't something I'd suggest putting into code/VCS. Each user needs to set up their keys so that the local ssh client knows to use them. We use Fabric as well, but it only uses the key that the ssh config already tells it to use.
