Can my Python script take its arguments from a file, rather than the command line? I don't mind passing the file containing the arguments on the command line.
I am using argparse.
The reason is that I have a very complex argument list. I suppose that I could just wrap the call in a batch file, or another Python script, but I wondered if this is possible and thought that I would ask and maybe learn something.
So, instead of myScript.py --arg_1=xxx --arg_2=xxx ... --arg_n=xxx, can I do
myScript.py --file args.txt, where args.txt contains
--arg_1=xxx
--arg_2=xxx
...
--arg_n=xxx
You can tell the parser that arguments beginning with certain characters are actually names of files containing more arguments. From the documentation:
>>> with open('args.txt', 'w') as fp:
...     fp.write('-f\nbar')
>>> parser = argparse.ArgumentParser(fromfile_prefix_chars='#')
>>> parser.add_argument('-f')
>>> parser.parse_args(['-f', 'foo', '#args.txt'])
Namespace(f='bar')
The parser reads from args.txt and treats each line as a separate argument.
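Applied to the question above, a minimal sketch (assuming the conventional @ prefix; by default each line of args.txt is read as one argument, so lines like --arg_1=xxx work because each is a single token):

import argparse

parser = argparse.ArgumentParser(fromfile_prefix_chars='@')
parser.add_argument('--arg_1')
parser.add_argument('--arg_2')

# invoked as:  myScript.py @args.txt
args = parser.parse_args()
print(args)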
You can do this by taking a command-line argument as the filename and then opening it, like:

import sys

file_name = sys.argv[1]
with open(file_name) as f:
    user_arguments = f.read().split()

Here you get the list of user arguments in the list user_arguments. Perhaps you will get what you want!
I have a script that uses some arguments and some stdin data.
For checking arguments I use argparse.ArgumentParser
Is it possible to check if any stdin data is given? Something like that:
parser.add_argument('infile', nargs='?', type=argparse.FileType('r'), default=sys.stdin, required=True)
but this example gives this error:
TypeError: 'required' is an invalid argument for positionals
No. It won't read from whatever file you pass it, be it given on the command line or stdin. You will get an open file handle, with not even a single byte/char consumed.
Simply read the data yourself, for instance with data = args.infile.read() (assuming args is the result of parsing).
You can then test if it is empty with a simple if not data: ...
But usually, if you expect data in a specific format, the best is to simply try to parse it, and raise an error if you fail. Either empty data is invalid (json for instance), or it is valid but then it should be an acceptable input.
(As for the error: required only tells whether an option must be given on the command line or not, for --options and -o options. Positionals are always required unless you change their number with nargs.)
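Putting that together, a minimal sketch (the JSON check is just one example of "try to parse it"):

import argparse
import json
import sys

parser = argparse.ArgumentParser()
parser.add_argument('infile', nargs='?', type=argparse.FileType('r'),
                    default=sys.stdin)
args = parser.parse_args()

data = args.infile.read()
if not data:
    parser.error('no input data given')
try:
    payload = json.loads(data)
except ValueError:
    parser.error('input is not valid JSON')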
The error is just because of the required=True parameter; and the message tells you what is wrong. It should be:
parser.add_argument('infile', nargs='?', type=argparse.FileType('r'), default=sys.stdin)
By naming this 'infile', as opposed to '--infile', you've created a positional argument. argparse itself determines whether it is required or not. With nargs='?' it can't be required; it's by definition optional (but not an optionals argument :) ).
The FileType type lets you name a file (or '-') in the commandline. It will open it (stdin is already open) and assign it to the args.infile attribute. It does nothing more.
So after parsing, using args.infile gives you access to this open file, which you can read as needed (and optionally close if not stdin).
So this is a convenient way of letting your users specify which file should be opened for use in your code. It was intended for simple scripts that read one file, do something, and write to another.
But if all you are looking at is stdin, there isn't any point in using this type. sys.stdin is always available for reading. And there isn't any way of making the parser read stdin. It parses sys.argv which comes from the commandline.
There is a prefix-file feature (fromfile_prefix_chars, e.g. an @ prefix) that tells the parser to read command-line strings from a file. It parses the file and splices the values into the argument list. See the argparse docs.
From the docs, under The add_argument() method:
required - Whether or not the command-line option may be omitted (optionals only).
The required keyword is only used for options (e.g., -f or --foo) not for positional arguments. Just take it out.
parser.add_argument('infile', nargs='?', type=argparse.FileType('r'),
                    default=sys.stdin)
When parsed, infile will either be an open file object or the sys.stdin file object. You would need to read that file to see if there is anything in it. Reading can be risky... you may block forever. But that just means that the user didn't follow instructions.
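If you only want to know whether anything was piped in before risking a blocking read, one common check (outside argparse) is sys.stdin.isatty(); note it only tells you whether stdin is an interactive terminal, not whether a pipe actually holds data. A minimal sketch:

import sys

if sys.stdin.isatty():
    # stdin is the terminal: nothing piped or redirected in
    print('no piped data; a read would wait for keyboard input')
else:
    data = sys.stdin.read()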
I have a text file /etc/default/foo which contains one line:
FOO="/path/to/foo"
In my python script, I need to reference the variable FOO.
What is the simplest way to "source" the file /etc/default/foo into my python script, same as I would do in bash?
. /etc/default/foo
Same answer as @jil; however, that answer is specific to a historical version of Python.
In modern Python (3.x):
exec(open('filename').read())
replaces execfile('filename') from 2.x
You could use execfile:
execfile("/etc/default/foo")
But please be aware that this will evaluate the contents of the file as-is into your program source. It is a potential security hazard unless you can fully trust the source.
It also means that the file needs to be valid Python syntax (your given example file is).
Keep in mind that if you have a "text" file with this content that has a .py as the file extension, you can always do:
import mytextfile
print(mytextfile.FOO)
Of course, this assumes that the text file is syntactically correct as far as Python is concerned. On a project I worked on we did something similar to this. Turned some text files into Python files. Wacky but maybe worth consideration.
Just to give a different approach, note that if your original file is setup as
export FOO=/path/to/foo
You can do source /etc/default/foo; python myprogram.py (or . /etc/default/foo; python myprogram.py) and within myprogram.py all the values that were exported in the sourced file are visible in os.environ, e.g.
import os
os.environ["FOO"]
If you know for certain that it only contains VAR="QUOTED STRING" style variables, like this:
FOO="some value"
Then you can just do this:
>>> with open('foo.sysconfig') as fd:
...     exec(fd.read())
Which gets you:
>>> FOO
'some value'
(This is effectively the same thing as the execfile() solution
suggested in the other answer.)
This method has substantial security implications; if instead of FOO="some value" your file contained:
os.system("rm -rf /")
Then you would be In Trouble.
Alternatively, you can do this:
>>> import shlex
>>> with open('foo.sysconfig') as fd:
...     settings = {var: shlex.split(value) for var, value in [line.split('=', 1) for line in fd]}
Which gets you a dictionary settings that has:
>>> settings
{'FOO': ['some value']}
That settings = {...} line is using a dictionary comprehension. You could accomplish the same thing in a few more lines with a for loop and so forth.
And of course if the file contains shell-style variable expansion like ${somevar:-value_if_not_set} then this isn't going to work (unless you write your very own shell style variable parser).
There are a couple ways to do this sort of thing.
You can indeed import the file as a module, as long as the data it contains corresponds to Python's syntax. But then either the file in question is a .py in the same directory as your script, or you have to use imp (or importlib, depending on your version).
Another solution (that has my preference) can be to use a data format that any python library can parse (JSON comes to my mind as an example).
/etc/default/foo:
{"FOO":"path/to/foo"}
And in your Python code:
import json

with open('/etc/default/foo') as f:
    data = json.load(f)

FOO = data["FOO"]
# ...
This way, you don't risk executing untrusted code...
You have the choice, depending on what you prefer. If your data file is auto-generated by some script, it might be easier to keep a simple syntax like FOO="path/to/foo" and use imp.
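If you do go the import route with a file that is not on your path, a minimal sketch using importlib (the /etc/default/foo.py path and the foo_defaults module name are hypothetical, and the file must be valid Python):

import importlib.util

spec = importlib.util.spec_from_file_location("foo_defaults", "/etc/default/foo.py")
foo_defaults = importlib.util.module_from_spec(spec)
spec.loader.exec_module(foo_defaults)

print(foo_defaults.FOO)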
Hope that helps!
The Solution
Here is my approach: parse the bash file myself and process only variable assignment lines such as:
FOO="/path/to/foo"
Here is the code:
import shlex


def parse_shell_var(line):
    """
    Parse such lines as:
        FOO="My variable foo"

    :return: a tuple of var name and var value, such as
        ('FOO', 'My variable foo')
    """
    return shlex.split(line, posix=True)[0].split('=', 1)


if __name__ == '__main__':
    with open('shell_vars.sh') as f:
        shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
        print(shell_vars)
How It Works
Take a look at this snippet:
shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
This line iterates through the lines in the shell script, processing only those lines that have an equal sign (not a fool-proof way to detect variable assignment, but the simplest). Next, it runs those lines through the function parse_shell_var, which uses shlex.split to correctly handle the quotes (or the lack thereof). Finally, the pieces are assembled into a dictionary. The output of this script is:
{'MOO': '/dont/have/a/cow', 'FOO': 'my variable foo', 'BAR': 'My variable bar'}
Here is the contents of shell_vars.sh:
FOO='my variable foo'
BAR="My variable bar"
MOO=/dont/have/a/cow
echo $FOO
Discussion
This approach has a couple of advantages:
It does not execute the shell (in either bash or Python), which avoids any side effects
Consequently, it is safe to use, even if the origin of the shell script is unknown
It correctly handles values with or without quotes
This approach is not perfect; it has a few limitations:
The method of detecting variable assignment (by looking for the presence of the equal sign) is primitive and not accurate. There are ways to detect these lines better, but that is a topic for another day.
It does not correctly parse values which are built upon other variables or commands. That means, it will fail for lines such as:
FOO=$BAR
FOO=$(pwd)
Based on the answer with exec(...read()): using value = eval(...read()) instead will return only the value of the (single) expression in the file. E.g.
1 + 1: 2
"Hello World": "Hello World"
float(2) + 1: 3.0
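A minimal sketch of that eval() variant (the file name is hypothetical; the file must hold a single Python expression, and the same trust caveats as exec() apply):

# expr.txt contains a single expression, e.g.:  1 + 1
with open('expr.txt') as f:
    value = eval(f.read())

print(value)  # 2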
When I try calling the below code, I run into the following error: "You must specify a format when providing data via STDIN (pipe)."
subprocess.call(["in2csv", "--format", "xls", a_file, ">", output_file], shell=True)
I'm not sure why this is the case, because I am telling it what the initial format is. I've looked at the docs, which aren't clear about the distinction between --format and -f.
Update: I've changed it to use argparse to simplify passing the arguments, following this recommendation. I'm also using Popen as used here, which is apparently safer than using the shell=True flag, according to the docs.
parser = argparse.ArgumentParser()
parser.add_argument('in2csv')
parser.add_argument('--format')
parser.add_argument('xls')
parser.add_argument(a_file)
parser.add_argument(">")
parser.add_argument(output_file)
args = parser.parse_args()
print args
subprocess.Popen(args)
Errors like what you've seen are a symptom of the shell getting confused by the string passed in, for instance because of a space in a filename. It is indeed best to avoid using the shell when spawning processes from Python.
Instead of adding ">" and output_file as arguments, try redirecting the output using the stdout keyword argument, which takes a file that output will be written to.
Assuming:
a_file is a string with the name of your input file, and
output_file is a string with the name of your desired output file,
a working call might look like:
import subprocess

with open(output_file, 'wb') as of:
    subprocess.check_call(["in2csv", "--format", "xls", a_file],
                          stdout=of)
It's not necessary to use argparse here; it's meant for handling command lines coming in to your program, rather than going out from it.
How does one structure their command line processing block in such a way as to allow naming multiple files in any order AND to discover those file types by their suffixes?
In this Python program, I need to pass both a binary file and a .vhdr file to my command line. The .vhdr file will be read to memory, while the (large) binary file will be processed incrementally. I would like to build this in a way such that the user can pass the file names in any order. It seems to me that an intelligent way to deal with this would be to iterate over each item in argv, check if it has a ".vhdr" suffix, and use whichever item has this to save to my file object.
Do any libraries have this functionality, or should I write this from scratch? I was not able to find something like this in the argparse library, but I am new so I easily could have looked right at it and not understood.
Use the well-known argparse library. A simple example:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--vhdr", dest="vhdr_file")
parser.add_argument("--bin", dest="bin_file")
args = parser.parse_args()
print(args)
output:
$ python demo.py --vhdr 1 --bin 2
Namespace(bin_file='2', vhdr_file='1')
$ python demo.py --bin 1 --vhdr 2
Namespace(bin_file='1', vhdr_file='2')
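If you would rather skip the flags and let the two files come in any order, a minimal sketch applying the suffix check from the question (variable names are illustrative):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('files', nargs=2,
                    help='a .vhdr header file and a binary file, in any order')
args = parser.parse_args()

try:
    vhdr_file = next(f for f in args.files if f.endswith('.vhdr'))
except StopIteration:
    parser.error('one of the two files must have a .vhdr suffix')
bin_file = next(f for f in args.files if f != vhdr_file)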
Trying to make an argument in argparse where one can input several file names that can be read.
In this example, I'm just trying to print each of the file objects to make sure it's working correctly, but I get the error:
error: unrecognized arguments: f2.txt f3.txt
How can I get it to recognize all of them?
My command in the terminal to run the program and read multiple files:
python program.py f1.txt f2.txt f3.txt
Python script
import argparse

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('file', nargs='?', type=file)
    args = parser.parse_args()
    for f in args.file:
        print f

if __name__ == '__main__':
    main()
I used nargs='?' because I want it to accept any number of files. If I change the add_argument call to:
parser.add_argument('file', nargs=3)
then I can print them as strings, but I can't get it to work with '?'.
If your goal is to read one or more readable files, you can try this:
parser.add_argument('file', type=argparse.FileType('r'), nargs='+')
nargs='+' gathers all command line arguments into a list. There must also be one or more arguments or an error message will be generated.
type=argparse.FileType('r') tries to open each argument as a file for reading. It will generate an error message if argparse cannot open the file. You can use this for checking whether the argument is a valid and readable file.
Alternatively, if your goal is to read zero or more readable files, you can simply replace nargs='+' with nargs='*'. This will give you an empty list if no command line arguments are supplied. Perhaps you want to read stdin if you're not given any files; if so, just add default=[sys.stdin] as a parameter to add_argument (see the sketch after the links below).
And then to process the files in the list:
args = parser.parse_args()
for f in args.file:
    for line in f:
        # process each line of the file here
        pass
More about nargs: https://docs.python.org/2/library/argparse.html#nargs
More about type: https://docs.python.org/2/library/argparse.html#type
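For the zero-or-more variant with the stdin fallback mentioned above, a minimal sketch:

import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument('file', nargs='*', type=argparse.FileType('r'),
                    default=[sys.stdin])
args = parser.parse_args()

for f in args.file:
    for line in f:
        sys.stdout.write(line)  # process each line as needed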
Just had to make sure there was at least one argument:
parser.add_argument('file', nargs='*')