Automatically open files given as command line arguments in Python

I have a lot of Perl scripts that look something like the following. The script automatically opens any file given as a command-line argument and, in this case, prints its contents. If no file is given, it reads from standard input instead.
while ( <> ) {
    print $_;
}
Is there a way to do something similar in Python without having to explicitly open each file?

The fileinput module in Python's standard library is designed exactly for this purpose; quoting a bit of code from its documentation:
import fileinput
for line in fileinput.input():
    process(line)
Use print in lieu of process and you have the exact equivalent of your Perl code.
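For instance, a close equivalent of the Perl loop (print with end='' avoids adding a second newline, since each line keeps its own):
import fileinput
for line in fileinput.input():
    print(line, end='')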

You could also look into sys.argv, which holds the script's command-line arguments; combined with open() and sys.stdin it lets you do the same thing by hand.
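For example, a minimal sketch along those lines, falling back to standard input when no filenames are given (much like fileinput does):
import sys

if len(sys.argv) > 1:
    for name in sys.argv[1:]:
        with open(name) as f:
            for line in f:
                print(line, end='')
else:
    for line in sys.stdin:
        print(line, end='')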

Related

Unexpected double quotes while appending file items to subprocess.run

I am trying to read from a file which has contents like this:
#\5\5\5
...
#\5\5\10
This file content is then fed into Python's subprocess module like this:
for lines in file.readlines():
    print(lines)
    cmd = 'ls'
    p = subprocess.run([cmd, lines])
The output turns into something like this:
CompletedProcess(args=['ls', "'#5\\5\\5'\n"], returncode=1)
I don't understand why the contents of the file are wrapped in quotes, and why another backslash is getting appended.
The real problem here isn't Python or the subprocess module. The problem is the use of subprocess to invoke shell commands and then trying to parse the results. In this case, it looks like the command is ls, and the plan appears to be to read some filesystem paths from a text file (each path on a separate line) and list the files at each location on the filesystem.
Using subprocess to invoke ls is really, really, really not the way to accomplish that in Python. This is basically an attempt to use Python like a shell script (this use of ls would still be problematic, but that's a different discussion).
If a shell script is the right tool for the job, then write a shell script. If you want to use Python, then use one of the APIs that it provides for interacting with the OS and the filesystem. There is no need to bring in external programs to achieve this.
import os

with open("list_of_paths.txt", "r") as fin:
    for line in fin.readlines():
        w = os.listdir(line.strip())
        print(w)
Note the use of .strip(): this string method removes invisible characters like spaces and newlines from the ends of the input.
The listdir method provided by the os module will return a list of the files in a directory. Other options are os.scandir, os.walk, and the pathlib module.
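For instance, a pathlib version of the same loop might look like this (a sketch, reusing list_of_paths.txt from the snippet above):
from pathlib import Path

with open("list_of_paths.txt") as fin:
    for line in fin:
        for entry in Path(line.strip()).iterdir():
            print(entry)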
But please do not use subprocess. 95% of the time, when someone thinks "should I use Python's subprocess module for this?" the answer is "NO".
It is because \ followed by a relevant character or digit means something other than the literal characters. For example, \n is not just \ and n; it means a new line. If you really want a literal \n, you add another backslash (\\n). Likewise, \5 is an escape sequence that means something else, and hence the \\ being shown, if I am not wrong.
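One quick way to see where the doubled backslashes and the quotes come from is a plain REPL session; they are part of Python's repr of the string (which is what the CompletedProcess display uses), not extra characters in the data:
>>> s = '#\\5\\5\\5\n'
>>> print(s)  # the actual characters
#\5\5\5

>>> s  # repr adds quotes and escapes the backslashes
'#\\5\\5\\5\n'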

how to "source" file into python script

I have a text file /etc/default/foo which contains one line:
FOO="/path/to/foo"
In my python script, I need to reference the variable FOO.
What is the simplest way to "source" the file /etc/default/foo into my python script, same as I would do in bash?
. /etc/default/foo
Same answer as #jil's; however, that answer is specific to a historical version of Python.
In modern Python (3.x):
exec(open('filename').read())
replaces execfile('filename') from 2.x
You could use execfile:
execfile("/etc/default/foo")
But please be aware that this will evaluate the contents of the file as-is into your program source. It is a potential security hazard unless you can fully trust the source.
It also means that the file needs to be valid python syntax (your given example file is).
Keep in mind that if you have a "text" file with this content that has .py as its file extension, you can always do:
import mytextfile
print(mytextfile.FOO)
Of course, this assumes that the text file is syntactically correct as far as Python is concerned. On a project I worked on we did something similar to this. Turned some text files into Python files. Wacky but maybe worth consideration.
Just to give a different approach, note that if your original file is setup as
export FOO=/path/to/foo
You can do source /etc/default/foo; python myprogram.py (or . /etc/default/foo; python myprogram.py) and within myprogram.py all the values that were exported in the sourced file are visible in os.environ, e.g.
import os
os.environ["FOO"]
If you know for certain that it only contains VAR="QUOTED STRING" style variables, like this:
FOO="some value"
Then you can just do this:
>>> with open('foo.sysconfig') as fd:
...     exec(fd.read())
Which gets you:
>>> FOO
'some value'
(This is effectively the same thing as the execfile() solution
suggested in the other answer.)
This method has substantial security implications; if instead of FOO="some value" your file contained:
os.system("rm -rf /")
Then you would be In Trouble.
Alternatively, you can do this:
>>> import shlex
>>> with open('foo.sysconfig') as fd:
...     settings = {var: shlex.split(value) for var, value in [line.split('=', 1) for line in fd]}
Which gets you a dictionary settings that has:
>>> settings
{'FOO': ['some value']}
That settings = {...} line is using a dictionary comprehension. You could accomplish the same thing in a few more lines with a for loop and so forth.
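For example, the expanded for-loop version (same foo.sysconfig assumption as above):
import shlex

settings = {}
with open('foo.sysconfig') as fd:
    for line in fd:
        var, value = line.split('=', 1)
        settings[var] = shlex.split(value)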
And of course if the file contains shell-style variable expansion like ${somevar:-value_if_not_set} then this isn't going to work (unless you write your very own shell style variable parser).
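For the simple $VAR and ${VAR} forms only, os.path.expandvars can help; it leaves the ${somevar:-value_if_not_set} style untouched:
import os

os.environ['SOMEVAR'] = 'hello'
print(os.path.expandvars('${SOMEVAR}/bin'))        # hello/bin
print(os.path.expandvars('${SOMEVAR:-fallback}'))  # left unchanged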
There are a couple ways to do this sort of thing.
You can indeed import the file as a module, as long as its contents correspond to Python syntax. But either the file in question is a .py file in the same directory as your script, or you have to use imp (or importlib, depending on your version).
Another solution (which I prefer) is to use a data format that any Python library can parse; JSON comes to mind as an example.
/etc/default/foo :
{"FOO":"path/to/foo"}
And in your python code :
import json
with open('/etc/default/foo') as file:
data = json.load(file)
FOO = data["FOO"]
## ...
file.close()
This way, you don't risk executing untrusted code...
You have the choice, depending on what you prefer. If your data file is auto-generated by some script, it might be easier to keep a simple syntax like FOO="path/to/foo" and use imp.
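If you do go the import route on modern Python, importlib can load from an explicit path; a minimal sketch, assuming the file has been given a .py extension (the path and module name below are hypothetical) and contains valid Python:
import importlib.util

spec = importlib.util.spec_from_file_location("foo_config", "/etc/default/foo.py")
foo_config = importlib.util.module_from_spec(spec)
spec.loader.exec_module(foo_config)
print(foo_config.FOO)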
Hope that helps!
The Solution
Here is my approach: parse the bash file myself and process only variable assignment lines such as:
FOO="/path/to/foo"
Here is the code:
import shlex

def parse_shell_var(line):
    """
    Parse such lines as:
        FOO="My variable foo"

    :return: a tuple of var name and var value, such as
        ('FOO', 'My variable foo')
    """
    return shlex.split(line, posix=True)[0].split('=', 1)

if __name__ == '__main__':
    with open('shell_vars.sh') as f:
        shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
    print(shell_vars)
How It Works
Take a look at this snippet:
shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
This line iterates through the lines of the shell script, processing only those that contain an equal sign (not a fool-proof way to detect variable assignment, but the simplest). Next, each of those lines is run through the function parse_shell_var, which uses shlex.split to correctly handle the quotes (or the lack thereof). Finally, the pieces are assembled into a dictionary. The output of this script is:
{'MOO': '/dont/have/a/cow', 'FOO': 'my variable foo', 'BAR': 'My variable bar'}
Here is the contents of shell_vars.sh:
FOO='my variable foo'
BAR="My variable bar"
MOO=/dont/have/a/cow
echo $FOO
Discussion
This approach has a couple of advantages:
It does not execute the shell (either in bash or in Python), which avoids any side effects
Consequently, it is safe to use, even if the origin of the shell script is unknown
It correctly handles values with or without quotes
This approach is not perfect, it has a few limitations:
The method of detecting variable assignment (by looking for the presence of the equal sign) is primitive and not accurate. There are ways to detect these lines more reliably, but that is a topic for another day
It does not correctly parse values that are built from other variables or commands. That means it will fail for lines such as:
FOO=$BAR
FOO=$(pwd)
Based off the answer with exec(...read()): if you use value = eval(...read()) instead, it will return the value of the expression in the file. E.g.:
a file containing 1 + 1 gives 2
a file containing "Hello World" gives 'Hello World'
a file containing float(2) + 1 gives 3.0
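A sketch of that variant, with the same security caveats as exec (the filename is hypothetical, and the file must contain a single expression rather than an assignment):
with open('expr.txt') as f:
    value = eval(f.read())  # e.g. a file containing 1 + 1 yields 2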

Syntax for input redirection in IDLE

I need to enter the contents of a text (.txt) file as input for a Python (.py) file. Assuming the name of the text file is TextFile and the name of the Python file is PythonFile, the command should be as follows:
python PythonFile.py < TextFile.txt
Yet, when I try to do this in IDLE and type in
import PythonFile < TextFile,
IDLE gives me an invalid syntax message, pointing to the < sign. I tried all sorts of variations on this theme (i.e., using or not using the file name extensions), but still got the same invalid-syntax message. How is the syntax different for input redirection in IDLE?
If it works in the command line, then why do you want to do this in IDLE? There are ways to achieve a similar result using, for example, subprocess, but a better way would be to refactor PythonFile.py so that you can call a function from it, e.g.:
>>> import PythonFile
>>> PythonFile.run_with_input('TextFile.txt')
If you post the contents of PythonFile.py, we might be able to help you do this.
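For illustration, a minimal sketch of such a refactor; run_with_input is a hypothetical name, and the real shape depends on what PythonFile.py actually does:
# PythonFile.py
import sys

def run_with_input(path=None):
    # Read from the named file, or fall back to stdin so that
    # `python PythonFile.py < TextFile.txt` keeps working.
    stream = open(path) if path else sys.stdin
    try:
        for line in stream:
            print(line.rstrip('\n'))  # placeholder for the real processing
    finally:
        if path:
            stream.close()

if __name__ == '__main__':
    run_with_input(sys.argv[1] if len(sys.argv) > 1 else None)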

Find a line of text with a pattern using Windows command line or Python

I need to run a command line tool that verifies a file and displays a bunch of information about it. I can export this information to a txt file but it includes a lot of extra data. I just need one line for the file:
"The signature is timestamped: Thu May 24 17:13:16 2012"
The time could be different, but I just need to extract this data and put it into a file. Is there a good way to do this from the command line itself or maybe Python? I plan on using Python to locate and download the file to be verified, then run the command-line tool to verify it, get the data, and send that data in an email.
This is on a Windows PC.
Thanks for your help
You don't need to use Python to do this. If you're using a Unix environment, you can use fgrep right from the command-line and redirect the output to another file.
fgrep "The signature is timestamped: " input.txt > output.txt
On Windows you can use:
find "The signature is timestamped: " < input.txt > output.txt
You mention the command line utility "displays" some information, so it may well be printing to stdout; one way, then, is to run the utility within Python and capture the output.
import subprocess
# Try with some basic commands here maybe...
file_info = subprocess.check_output(['your_command_name', 'input_file'])
for line in file_info.splitlines():
    # print line here to see what you get
    if line.startswith('The signature is timestamped: '):
        print line  # do something here
This should fit in nicely with the "use Python to download and locate" part: you can use urllib.urlretrieve to download (possibly with a temporary name), then run the command line util on the temp file to get the details, then smtplib to send the email...
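A rough sketch of that glue in Python 3 terms (urllib.request.urlretrieve instead of the old urllib.urlretrieve; the URL, tool name, and addresses below are all placeholders):
import smtplib
import subprocess
import urllib.request
from email.message import EmailMessage

# Download the file to verify, then run the (hypothetical) verification tool.
path, _ = urllib.request.urlretrieve('http://example.com/file.bin', 'download.bin')
output = subprocess.check_output(['verify_tool', path], text=True)

# Keep only the timestamp line(s).
wanted = [l for l in output.splitlines()
          if l.startswith('The signature is timestamped: ')]

# Mail the result.
msg = EmailMessage()
msg['Subject'] = 'Signature timestamp'
msg['From'] = 'me@example.com'
msg['To'] = 'you@example.com'
msg.set_content('\n'.join(wanted))
with smtplib.SMTP('localhost') as server:
    server.send_message(msg)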
In python you can do something like this:
timestamp = []
with open('./filename', 'r') as f:
    timestamp = [line for line in f.readlines() if 'The signature is timestamped: ' in line]
I haven't tested this but I think it'd work. Not sure if there's a better solution.
I'm not too sure about the exact syntax of this exported file you have, but python's readlines() function might be helpful for this.
h = open(pathname, 'r')  # opens the file for reading
for line in h.readlines():
    print line  # this will print out the contents of each line of the text file
If the text file has the same format every time, the rest is easy; if it is not, you could do something like
for line in h.readlines():
    if line.split()[3] == 'timestamped:':  # note the colon: split() keeps it attached to the word
        print line
        output_string = line
As for writing to a file, you'll want to open the file for writing, h = open(name, "w"), then use h.write(output_string) to write it to a text file.
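For example, a with-based version that closes the file automatically:
with open('output.txt', 'w') as h:
    h.write(output_string)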

Python equivalent of Perl's while (<>) {...}?

I write a lot of little scripts that process files on a line-by-line basis. In Perl, I use
while (<>) {
    do stuff;
}
This is handy because it doesn't care where the input comes from (a file or stdin).
In Python I use this
if len(sys.argv) == 2:  # there's a command line argument
    sys.stdin = file(sys.argv[1])
for line in sys.stdin.readlines():
    do stuff
which doesn't seem very elegant. Is there a Python idiom that easily handles file/stdin input?
The fileinput module in the standard library is just what you want:
import fileinput
for line in fileinput.input(): ...
import fileinput
for line in fileinput.input():
    process(line)
This iterates over the lines of all files listed in sys.argv[1:], defaulting to sys.stdin if the list is empty.
fileinput defaults to stdin when no filenames are given, which makes this slightly more concise.
If you do a lot of command-line stuff, though, this piping hack is very neat.
