Python 2.4.x (cannot install any non-stock modules).
Question for you all (assuming the use of subprocess.Popen).
Say you had 20-30 machines, each with 6-10 files on them that you needed to read into a variable.
Would you prefer to scp into each machine, once for each file (120-300 scp commands total), reading each file into a variable after it's copied down and then discarding the local copy?
Or would you ssh into each machine, once for each file, reading the file straight into memory (120-300 ssh commands total)?
Or is there some other way I'm not thinking of to grab all the desired files in one shot per machine (files are named YYYYMMDD.HH.blah - range would be given 20111023.00 - 20111023.23) and read them into memory?
Depending on the size of the file, you can possibly do something like:
...
files= "file1 file2 ..."
myvar = ""
for tm in machine_list
myvar = myvar+ subprocess.check_output(["ssh", "user#" + tm, "/bin/cat " + files]);
...
file1, file2, etc. are space delimited. Assuming all the machines are unix boxes, you can /bin/cat them all in one shot from each machine. (This assumes you are simply loading the ENTIRE content into one variable.) There are variations on the above, but SSH will be simpler to diagnose.
At least that's my thought.
UPDATE
Since subprocess.check_output is not available in Python 2.4, use something like
myvar = myvar + Popen(["ssh", "user@" + tm, ...], stdout=PIPE).communicate()[0]
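Putting that together, a minimal sketch of the whole loop for Python 2.4 (host names, the user, and the file list are placeholders):

# Rough sketch for Python 2.4: no check_output, so use Popen/communicate.
from subprocess import Popen, PIPE

machine_list = ["host1", "host2"]                  # placeholder hosts
files = "20111023.00.blah 20111023.01.blah"        # placeholder file names

myvar = ""
for tm in machine_list:
    proc = Popen(["ssh", "user@" + tm, "/bin/cat " + files], stdout=PIPE)
    myvar = myvar + proc.communicate()[0]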
Hope this helps.
scp lets you:
Copy entire directories using the -r flag: scp -r g0:labgroup/ .
Specify a glob pattern: scp 'g0:labgroup/assignment*.hs' .
Specify multiple source files: scp 'g0:labgroup/assignment1*' 'g0:labgroup/assignment2*' .
Not sure what sort of globbing is supported; odds are it just uses the shell for this. I'm also not sure whether it's smart enough to merge copies from the same server into one connection.
You could run a remote command via ssh that uses tar to tar the files you want together (allowing the result to go to standard out), capture the output into a Python variable, then use Python's tarfile module to split the files up again. I'm not actually sure how tarfile works; you may have to put the read output into a file-like StringIO object before accessing it with tarfile.
This will save you a bit of time, since you'll only have to connect to each machine once, reducing time spent in ssh negotiation. You also avoid having to use local disk storage, which could save a bit of time and/or energy — useful if you are running in laptop mode, or on a device with a limited file system.
If the network connection is relatively slow, you can speed things up further by using gzip or bzip compression; decompression is supported by tarfile.
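A rough sketch of that tar-over-ssh idea for Python 2 (the host, user, and filename pattern are placeholders, and this is untested):

import subprocess
import tarfile
from cStringIO import StringIO

# Ask the remote machine to tar (and gzip) the files to stdout, and capture it.
proc = subprocess.Popen(["ssh", "user@host", "tar czf - 20111023.*.blah"],
                        stdout=subprocess.PIPE)
data = proc.communicate()[0]

# Unpack the archive entirely in memory via a file-like StringIO wrapper.
archive = tarfile.open(fileobj=StringIO(data), mode="r:gz")
contents = {}
for member in archive.getmembers():
    if member.isfile():
        contents[member.name] = archive.extractfile(member).read()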
As an extra to Inerdia's answer, yes you can get scp to transfer multiple files in one connection, by using brace patterns:
scp "host:{path/to/file1,path/to/file2}" local_destination"
And you can use the normal goodies of brace patterns if your files have common prefixes or suffixes:
scp "host:path/to/{file1,file2}.thing" local_destination"
Note that the patterns are inside quotes, so they're not expanded by the local shell before scp is called. I have a host with a noticeable connection delay, on which I created two empty files. Executing the copy as above (with the brace pattern quoted) resulted in one delay and then both files transferring quickly. When I left out the quotes, so the local shell expanded the braces into two separate host:file arguments to scp, there was a noticeable delay before the first file and another between the two files.
This suggests to me that Inerdia's suggestion of specifying multiple host:file arguments will not reuse the connection to transfer all the files, but using quoted brace patterns will.
Related
(Background: On an NTFS partition, files and/or folders can be set to "compressed", like it's a file attribute. They'll show up in blue in Windows Explorer, and will take up less disk space than they normally would. They can be accessed by any program normally, compression/decompression is handled transparently by the OS - this is not a .zip file. In Windows, setting a file to compressed can be done from a command line with the "Compact" command.)
Let's say I've created a file called "testfile.txt", put some data in it, and closed it. Now, I want to set it to be NTFS compressed. Yes, I could shell out and run Compact, but is there a way to do it directly in Python code instead?
In the end, I cheated a bit and simply shelled out to the command-line Compact utility. Here is the function I ended up writing. Errors are ignored, and it returns the output text from the Compact command, if any.
def ntfscompress(filename):
    import subprocess
    _compactcommand = 'Compact.exe /C /I /A "{}"'.format(filename)
    try:
        _result = subprocess.run(_compactcommand, timeout=86400,
                                 stdout=subprocess.PIPE,
                                 stderr=subprocess.STDOUT, text=True)
        return(_result.stdout)
    except:
        return('')
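A quick usage sketch (the path is just an example):

# Compress an existing file; an empty string means Compact failed or timed out.
result = ntfscompress(r'C:\temp\testfile.txt')
print(result)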
Brief summary:
I have two files: foo1.pyw and foo2.py
I need to send large amounts of sensitive information to foo2.py from foo1.pyw, and then back again.
Currently, I am doing this by writing to a .txt file and then opening it with foo2.py using os.system('foo2.py [text file here] [other arguments passing information]'). The problem is that the .txt file leaves a trace when it is removed. I need to send information to foo2.py and back without writing to a temp file.
The information will be formatted text, containing only ASCII characters, including letters, digits, symbols, returns, tabs, and spaces.
I can give more detail if needed.
You could use encryption, like AES with Python: http://eli.thegreenplace.net/2010/06/25/aes-encryption-of-files-in-python-with-pycrypto, or use a transport layer: https://docs.python.org/2/library/ssl.html.
If what you're worrying about is the traces left on the HD, and real time interception is not the issue, why not just shred the temp file afterwards?
Alternatively, for a lot more work, you can set up a ramdisk and hold the file in memory.
The right way to do this is probably with a subprocess and a pipe, accessible via subprocess.Popen. You can then pipe information directly between the scripts.
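A minimal sketch of that idea, assuming foo2.py is changed to read the text from stdin instead of from a file argument (the 'python' command name and the payload are placeholders):

# foo1.pyw
import subprocess

proc = subprocess.Popen(['python', 'foo2.py'],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        universal_newlines=True)
# Send the sensitive text over the pipe and read the reply;
# nothing is ever written to disk.
reply, _ = proc.communicate('the sensitive text\n')

# foo2.py
import sys

data = sys.stdin.read()                    # everything foo1 wrote to our stdin
sys.stdout.write('processed: ' + data)     # the reply goes back over stdout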
I think the simplest solution would be to just call the function within foo2.py from foo1.py:
# foo1.py
import foo2
result = foo2.do_something_with_secret("hi")
# foo2.py
def do_something_with_secret(s):
    print(s)
    return 'yeah'
Obviously, this wouldn't work if you wanted to replace foo2.py with an arbitrary executable.
This may be a little tricky if the two are in different directories, run under different versions of Python, etc.
Background
I'm working on a bash script to pull serial numbers and part numbers from all the devices in a server rack. My goal is to be able to run a single script (inventory.sh) and walk away while it generates text files containing the information I need. I'm using bash for maximum compatibility; the RHEL 6.7 systems do have Perl and Python installed, but only with minimal libraries. So far I haven't had to use anything other than bash, but I'm not against calling a Perl or Python script from my bash script.
My Problem
I need to retrieve the Serial Numbers and Part numbers from the drives in a Dot Hill Systems AssuredSAN 3824, as well as the Serial numbers from the equipment inside. The only way I have found to get all the information I need is to connect over SSH and run the following three commands dumping the output to a local file:
show controllers
show frus
show disks
Limitations:
I don't have "sshpass" installed, and would prefer not to install it.
The Controller is not capable of storing SSH keys (no option in the custom shell).
The Controller also cannot write or transfer local files.
The Rack does NOT have access to the Internet.
I looked at paramiko, but while Python is installed I do not have pip.
I also cannot use CPAN.
For what it's worth, the output comes back in XML format. (I've already written the code to parse it in bash.)
Right now I think my best option would be to keep a library for Python or Perl in the folder with my other scripts and write a script that dumps the commands' output to files I can parse with my bash script. Which language makes it easier to just ship a library as a file? I'm looking for a library that is as small and simple to use as possible. I just need a way to get the output of those commands into XML files. Right now I am just using ssh 3 times in my script and having to enter the password each time.
Have a look at SNMP. There is a reasonable chance that you can use SNMP tools to remotely extract the information you need. The manufacturer should be able to provide you with the MIBs.
I ended up contacting the manufacturer and asking my question. They said that the system isn't set up for connecting without a password, and their SNMP is very basic and won't provide the information I need. They said to connect to the system with FTP and use "get logs " to download an archive of the configuration and logs. Not exactly ideal, as it takes 4 minutes just to run that one command, but it seems to be my only option. Below is the script I wrote to retrieve the file automatically by adding the login credentials to the .netrc file. This works on RHEL 6.7:
#!/bin/bash
#Retrieve the logs and configuration from a Dot Hill Systems AssuredSAN 3824 automatically.
#Modify "LINE" and "HOST" to fit your configuration.
LINE='machine <IP> login manage password <password>'
HOST='<IP>'
AUTOLOGIN="/root/.netrc"
FILE='logfiles.zip'

#Check for and verify the autologin file
if [ -f $AUTOLOGIN ]; then
    printf "Found auto-login file, checking for proper entry... \r"
    READLINE=`cat $AUTOLOGIN | grep "$LINE"`
    #Append the line to the end of .netrc if file exists but not the line.
    if [ "$LINE" != "$READLINE" ]; then
        printf "Proper entry not found, creating it... \r"
        echo "$LINE" >> "$AUTOLOGIN"
    else
        printf "Proper entry found... \r"
    fi
#Create the Autologin file if it doesn't exist
else
    printf "Auto-Login file does not exist, creating it and setting permissions...\r"
    echo "$LINE" > "$AUTOLOGIN"
    chmod 600 "$AUTOLOGIN"
fi

#Start getting the information from the controller. (This takes a VERY long time)
printf "Retrieving Storage Controller data, this will take awhile... \r"
ftp $HOST << SCRIPT
get logs $FILE
SCRIPT
exit 0
This gave me a bunch of files in the zip, but all I needed was the "store_....logs" file. It was about 500,000 lines long; the first portion is the entire configuration in XML format, then the configuration in text format, followed by the logs from the system. I parsed the file and stripped off the logs at the end, which cut it down to 15,000 lines. From there I divided it into two files (config.xml and config.txt). I then pulled the XML output of the 3 commands that I needed and saved it to the 3 files my previously written script searches for. Now my inventory script pulls in everything it needs, albeit pretty slowly due to waiting 4 minutes for the system to generate the zip file. I hope this helps someone in the future.
Edit:
Waiting 4 minutes for the system to compile the archive was taking too long, so I ended up using paramiko and Python scripts to dump the output of the commands to files that my other code can parse. The scripts accept the IP of the Controller as a parameter. Here is the one for "show disks", for those interested. Thank you again for all the help.
#!/usr/bin/env python
#Saves output of "show disks" from the storage Controller to an XML file.
import paramiko
import sys
import re
import xmltodict

IP = sys.argv[1]
USERNAME = "manage"
PASSWORD = "password"
FILENAME = "./logfiles/disks.xml"
cmd = "show disks"

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())

try:
    client.connect(IP, username=USERNAME, password=PASSWORD)
    stdin, stdout, stderr = client.exec_command(cmd)
except Exception as e:
    sys.exit(1)

data = ""
for line in stdout:
    if re.search('#', line):
        pass
    else:
        data += line

client.close()

f = open(FILENAME, 'w+')
f.write(data)
f.close()
sys.exit(0)
I am trying to make a Python script that will open a directory, apply a perl script to every file in that directory, and capture its output in either multiple text files or just one.
I currently have:
import shlex, subprocess
arg_str = "perl tilt.pl *.pdb > final.txt"
arg = shlex.split(arg_str)
import os
framespdb = os.listdir("prac_frames")
for frames in framespdb:
    subprocess.Popen(arg, stdout=True)
I keep getting "*.pdb not found". I am very new to all of this, so any help completing this script would be appreciated.
*.pdb not found means exactly that: there isn't a *.pdb in whatever directory you're running the script from... and as I read the code, I don't see anything to imply it's inside 'prac_frames' when it runs the perl script.
You probably need os.chdir(path) before the Popen.
How do I "cd" in Python?
...using a python script to run somewhat dubious syscalls to perl may offend some people, but everyone's done it. Aside from that, I'd point out:
Always specify full paths (this becomes a problem if you later want to run your job automatically from cron, or from an environment that doesn't have your PATH).
i.e. run 'which perl' and put that full path in.
./*.pdb would be better, but not as good as fullpath/*.pdb (which you could use instead of the os.chdir option).
subprocess.Popen(arg, stdout=True)
does not expand filename wildcards. To handle your *.pdb, use shell=True.
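A hedged sketch of both fixes, reusing the names from the question (the location of tilt.pl is an assumption; adjust paths as needed):

import glob
import subprocess

# Option 1: let the shell expand the wildcard and handle the > redirect.
subprocess.Popen("perl tilt.pl prac_frames/*.pdb > final.txt",
                 shell=True).wait()

# Option 2: expand the wildcard in Python and redirect stdout yourself.
pdb_files = glob.glob("prac_frames/*.pdb")
out = open("final.txt", "w")
subprocess.Popen(["perl", "tilt.pl"] + pdb_files, stdout=out).wait()
out.close()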
I have a server/client socket pair in Python. The server receives specific commands, then prepares the response and send it to the client.
In this question, my concern is just about possible injections in the code: whether it would be possible to make the server do something weird with the 2nd parameter, i.e. whether the checks on the command contents are insufficient to avoid undesired behaviour.
EDIT:
Following the advice received, I added the parameter shell=True when calling check_output on Windows. This should not be dangerous, since the command is a plain 'dir'.
self.client, address = self.sock.accept()
...
cmd = bytes.decode(self.client.recv(4096))
ls: executes a system command but only reads the content of a directory.
if cmd == 'ls':
    if self.linux:
        output = subprocess.check_output(['ls', '-l'])
    else:
        output = subprocess.check_output('dir', shell=True)
    self.client.send(output)
cd: just calls os.chdir.
elif cmd.startswith('cd '):
    path = cmd.split(' ')[1].strip()
    if not os.path.isdir(path):
        self.client.send(b'is not path')
    else:
        os.chdir(path)
        self.client.send(os.getcwd().encode())
get: send the content of a file to the client.
elif cmd.startswith('get '):
    file = cmd.split(' ')[1].strip()
    if not os.path.isfile(file):
        self.client.send(b'ERR: is not a file')
    else:
        try:
            with open(file) as f:
                contents = f.read()
        except IOError as er:
            res = "ERR: " + er.strerror
            self.client.send(res.encode())
            continue
        ... (send the file contents)
Apart from implementation details, I cannot see any possibility of direct injection of arbitrary code, because you do not use the received parameters in the only external commands you run (ls -l and dir).
But you may still have some security problems:
You locate commands through the PATH instead of using absolute locations. If somebody could change the PATH environment variable, anything could happen. I advise you to use os.listdir('.') directly, which is portable and carries less risk (see the sketch after this list).
You seem to have no control over which files are allowed. If I remember correctly, reading CON: or other special files on older Windows versions gave weird results. And you should never give any access to sensitive files, configuration, ...
You could also check the length of the requested file names, to keep users from breaking the server with abnormally long names.
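For example, the ls branch could be rewritten with os.listdir(), which avoids launching any external command (a rough sketch; the listing format is simpler than ls -l):

import os

if cmd == 'ls':
    # Build the listing in-process; no shell and no PATH lookup involved.
    listing = '\n'.join(sorted(os.listdir('.')))
    self.client.send(listing.encode())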
Typical issues in a client-server scenario are:
Tricking the server into running a command that is determined by the client. In the most obvious form this happens if the server allows the client to run commands (yes, stupid). However, this can also happen if the client can supply only command parameters but shell=True is used. E.g. using subprocess.check_output('dir %s' % dir, shell=True) with a client-supplied dir variable would be a security issue; dir could have a value like c:\ && deltree c:\windows (a second command has been added thanks to the flexibility of the shell's command line interpreter). See the sketch after this list for the usual way to avoid this. A relatively rare variation of this attack is the client being able to influence environment variables like PATH to trick the server into running a different command than intended.
Using unexpected functionality of built-in programming language functions. For example, fopen() in PHP won't just open files but fetch URLs as well. This allows passing URLs to functionality expecting file names and playing all kinds of tricks with the server software. Fortunately, Python is a sane language - open() works on files and nothing else. Still, database commands for example can be problematic if the SQL query is generated dynamically using client-supplied information (SQL Injection).
Reading data outside the allowed area. Typical scenario is a server that is supposed to allow only reading files from a particular directory, yet by passing in ../../../etc/passwd as parameter you can read any file. Another typical scenario is a server that allows reading only files with a particular file extension (e.g. .png) but passing in something like passwords.txt\0harmless.png still allows reading files of other types.
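As a sketch of the first point, the usual defence is to keep client input out of the shell and pass arguments as a list with no shell involved (path stands for a hypothetical client-supplied value):

import subprocess

path = 'subdir'   # hypothetical client-supplied value

# Dangerous: the value is pasted into a shell command line, so input like
# 'foo && rm -rf /' would run a second command.
# output = subprocess.check_output('ls -l %s' % path, shell=True)

# Safer: no shell is involved, so '&&', ';' and similar stay literal text.
output = subprocess.check_output(['ls', '-l', '--', path])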
Out of these issues, only the last one seems present in your code. In fact, your server doesn't check at all which directories and files the client should be allowed to read; this is a potential issue, since a client might be able to read confidential files.
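A hedged sketch of one mitigation for the get branch: resolve the requested name against a fixed base directory and refuse anything that escapes it (BASE_DIR is an assumed setting, not part of the original code):

import os

BASE_DIR = os.path.realpath('/srv/files')   # assumed allowed root

def is_allowed(requested):
    # Resolve '..' components and symlinks, then make sure the result is
    # still inside BASE_DIR before serving the file.
    real = os.path.realpath(os.path.join(BASE_DIR, requested))
    return real == BASE_DIR or real.startswith(BASE_DIR + os.sep)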