Get a file from CLI input - python

How do you get a file name from the command line when you run a Python script? Say your code opens a file and reads its lines, but the file changes each time you run it. How do you say:
python code.py input.txt
so the code analyzes "input.txt"? What would you have to do in the actual Python code? I know, this is a pretty vague question, but I don't really know how to explain it any better.

A great option is the fileinput module, which will grab any or all filenames from the command line, and give the specified files' contents to your script as though they were one big file.
import fileinput
for line in fileinput.input():
    process(line)
More information here.
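As a minimal sketch of the fileinput approach, the helper below counts lines across several files as if they were one stream. The name `count_lines` and its `paths` parameter are assumptions made for illustration; they are not part of the module. When `paths` is None, fileinput falls back to sys.argv[1:], and to standard input if that is empty.

```python
import fileinput


def count_lines(paths=None):
    """Count lines across all given files, treated as one stream.

    `paths` is a hypothetical parameter: when it is None, fileinput
    uses sys.argv[1:], or stdin if no arguments were given.
    """
    total = 0
    with fileinput.input(files=paths) as stream:
        for _line in stream:
            total += 1
    return total
```

In a script you would typically call `count_lines()` with no arguments and run it as `python code.py input.txt other.txt`.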

import sys
filename = sys.argv[-1]
This will get the last argument on the command line. If no arguments are passed, it will be the script name itself, as sys.argv[0] is the name of the running program.

Using argparse is quite intuitive:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--file", "-f", type=str, required=True)
args = parser.parse_args()
Now the name of the file is located in:
args.file
You just have to run the program a little differently:
python code.py -f input.txt

Command line parameters are available as a list via the sys module's argv list. The first element in the list is the name of the program (sys.argv[0]). The remaining elements are the command line parameters.
See also the getopt, optparse, and argparse modules for more complex command line parsing.

If you're using Linux or Windows PowerShell, you can pipe ("|") the output of cat into your script. Suppose input.txt and code.py are in the same directory:
cat input.txt | python code.py
This feeds the file's contents to Python on STDIN. For example, suppose you are trying to read names from input.txt.
input.txt has
john,matthew,peter,albert
code.py has
print(" is not ".join(input().rstrip().split(',')))
would give
john is not matthew is not peter is not albert
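The one-liner above reads only the first line of the piped input. A small sketch that handles every line could look like the following; `join_names` and its parameters are hypothetical names chosen for this example.

```python
import sys


def join_names(stream=None, sep=" is not "):
    """Read comma-separated names from every line of `stream` and join them."""
    if stream is None:
        stream = sys.stdin  # resolved at call time, so pipes and redirection work
    names = []
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            names.extend(part.strip() for part in line.split(","))
    return sep.join(names)
```

With `print(join_names())` at the bottom of code.py, `cat input.txt | python code.py` would print the joined names from all lines of the file.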

I also like argparse: it's clean, simple, fairly standard, gives you error handling for free, and adds a [-h] option to help the user.
Here is a version that does not need named parameters, which can be annoying for a very simple script:
#!/usr/bin/python3
import argparse
arg_parser = argparse.ArgumentParser( description = "Copy source_file as target_file." )
arg_parser.add_argument( "source_file" )
arg_parser.add_argument( "target_file" )
arguments = arg_parser.parse_args()
source = arguments.source_file
target = arguments.target_file
print( "Copying [{}] to [{}]".format(source, target) )
Example of how it handles errors and help for you:
>my_test.py
usage: my_test.py [-h] source_file target_file
my_test.py: error: the following arguments are required: source_file, target_file
>my_test.py my_source.cpp
usage: my_test.py [-h] source_file target_file
my_test.py: error: the following arguments are required: target_file
>my_test.py -h
usage: my_test.py [-h] source_file target_file
Copy source_file as target_file.
positional arguments:
source_file
target_file
optional arguments:
-h, --help show this help message and exit
>my_test.py my_source.cpp my_target.cpp
Copying [my_source.cpp] to [my_target.cpp]

In addition to what the existing answers mention, there is another alternative: the Command Line Interface Creation Kit (Click). Its latest stable version at the time I posted this answer was version 6. The official documentation has examples of how to deal with files and pass them as command line arguments.

Just use the built-in raw_input (named input in Python 3):
# Declare the input file name as a string.
inFile = ""
inFile = raw_input("Enter the input File Name: ")
Now you can open the file for reading with open(inFile, 'r') (mode 'w' would truncate the file instead of reading it).

Related

How to create a python script that handles an input string or an input file?

I would like to make a python script that can either process a string fed into the command line
python my_script.py "Hello World"
or a set of strings inside a file (e.g. input_file.txt)
python my_script.py -i input_file.txt
Is there a way to do this via argparse? So far I can handle input files, but I don't know how to add the option of just processing a string fed directly in the command line.
Make argparse accept one positional argument (filename or string) and one flag (-i). If the flag is present, treat the argument as a file. Take a look at the argparse tutorial for more info. I've modified its example to fit your needs.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("inp", help="input string or input file")
parser.add_argument("-i", "--treat-as-file", action="store_true",
                    help="reads from file if specified")
args = parser.parse_args()
if args.treat_as_file:
    print("Treating {} as input file".format(args.inp))
else:
    print("Treating {} as the input".format(args.inp))
Output:
/tmp $ python test.py abcde
Treating abcde as the input
/tmp $ python test.py -i abcde
Treating abcde as input file
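The example above only reports how the argument would be treated. One possible way to go a step further and actually collect the strings to process is sketched below; the function name `read_inputs` and the choice to return a list are assumptions made for illustration.

```python
import argparse


def read_inputs(argv=None):
    """Return the strings to process: one literal string, or a file's lines."""
    parser = argparse.ArgumentParser()
    parser.add_argument("inp", help="input string or input file")
    parser.add_argument("-i", "--treat-as-file", action="store_true",
                        help="read strings from the file named by inp")
    args = parser.parse_args(argv)  # argv=None means sys.argv[1:]
    if args.treat_as_file:
        with open(args.inp) as handle:
            return [line.rstrip("\n") for line in handle]
    return [args.inp]
```

Returning a list in both cases lets the rest of the script loop over the strings without caring where they came from.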
To process a command line string, you can use sys.argv, which is a list of all the arguments fed into the command line.
main.py:
import sys
print(sys.argv)
Running the following line in the CLI
>> python main.py foo bar "hello world"
would output:
['main.py', 'foo', 'bar', 'hello world']

Passing multiple text files as arguments to a script using a pattern

First of all, I'd like to state that this is a debugging question for an exercise, but I can't get any help from the lecturer, and as much as I've read up on arguments I can't seem to figure it out, so here I am. I have a Python script that compares .txt files passed as arguments. Currently it is called as follows:
python compare.py -s stop_list.txt NEWS/news01.txt NEWS/news02.txt
and the files are parsed into a list of names using
import sys, re, getopt, glob
opts, args = getopt.getopt(sys.argv[1:],'hs:bI:')
opts = dict(opts)
filenames = args
if '-I' in opts:
    filenames = glob.glob(opts['-I'])
print('INPUT-FILES:', ' '.join(filenames))
print(filenames)
I can pass more than two files by simply listing them together
python compare.py -s stop_list.txt NEWS/news01.txt NEWS/news02.txt NEWS/news03.txt NEWS/news04.txt
but this can quickly become impractical.
Now it is suggested that more files can be passed using a pattern
python compare.py -s stop_list.txt -I ’NEWS/news??.txt’
or, for example:
python compare.py -s stop_list.txt -I ’NEWS/news0[123].txt’
However it seems to behave a bit weirdly. First of all if I write:
python compare.py -s stop_list.txt -I NEWS/news01.txt NEWS/news02.txt
only news01.txt will be passed to the script.
Second, when I use the pattern as suggested, no input files are found at all. I can't tell whether the code that parses the input files is wrong and needs altering, or whether I'm doing something wrong.
The -h states:
USE: python <PROGNAME> (options) file1...fileN
OPTIONS:
-h : print this help message
-b : use BINARY weights (default: count weighting)
-s FILE : use stoplist file FILE
-I PATT : identify input files using pattern PATT,
(otherwise uses files listed on command line)
Thanks in advance :)
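Check the quotes. They appear to be typographic (curly) quotes, which the shell does not treat as quoting characters, so they end up as part of the pattern. Try ' or " instead.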
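A short sketch of why the quotes matter: the curly quote characters become part of the string handed to glob.glob, so nothing matches. The temporary directory and file names below are stand-ins created only for this demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a throwaway NEWS directory with two sample files.
    news_dir = os.path.join(root, "NEWS")
    os.mkdir(news_dir)
    for name in ("news01.txt", "news02.txt"):
        with open(os.path.join(news_dir, name), "w") as handle:
            handle.write("placeholder\n")

    pattern = os.path.join(news_dir, "news??.txt")
    # A plain pattern matches both files.
    plain_matches = sorted(glob.glob(pattern))
    # The same pattern wrapped in curly quotes (U+2019) matches nothing,
    # because the quote characters are treated as part of the path.
    curly_matches = glob.glob("\u2019" + pattern + "\u2019")

print(len(plain_matches), len(curly_matches))
```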

Reading from stdin with a system argv

I am attempting to cat a CSV file into stdout and then pipe the printed output as input into a python program that also takes a system argument vector with 1 argument. I ran into an issue I think directly relates to how Python's fileinput.input() function reacts with regards to occupying the stdin file descriptor.
generic_user% cat my_data.csv | python3 my_script.py myarg1
Here is a sample Python program:
import sys, fileinput
def main(argv):
    print("The program doesn't even print this")
    data_list = []
    for line in fileinput.input():
        data_list.append(line)
if __name__ == "__main__":
    main(sys.argv)
If I attempt to run this sample program with the above terminal command and no argument myarg1, the program is able to evaluate and parse the stdin for the data output from the CSV file.
If I run the program with the argument myarg1, it will end up throwing a FileNotFoundError directly related to myarg1 not existing as a file.
FileNotFoundError: [Errno 2] No such file or directory: 'myarg1'
Would someone be able to explain in detail why this behavior takes place in Python and how to handle the logic such that a Python program can first handle stdin data before argv overwrites the stdin descriptor?
You can read from the stdin directly:
import sys
def main(argv):
    print("The program doesn't even print this")
    data_list = []
    for line in iter(sys.stdin):
        data_list.append(line)
if __name__ == "__main__":
    main(sys.argv)
fileinput treats myarg1 as a file name, and no such file exists, so it cannot open it; since you are piping the data in, you have no need for fileinput.
This is by design. The designers of fileinput thought there were use cases where reading from stdin would make no sense, and just provided a way to add stdin to the list of files explicitly. According to the reference documentation:
import fileinput
for line in fileinput.input():
    process(line)
This iterates over the lines of all files listed in sys.argv[1:], defaulting to sys.stdin if the list is empty. If a filename is '-', it is also replaced by sys.stdin.
Just keep your code and use:
generic_user% cat my_data.csv | python3 my_script.py - myarg1
to read stdin before the myarg1 file, or, if you want to read it after:
... python3 my_script.py myarg1 -
fileinput implements a pattern common for Unix utilities:
If the utility is called with commandline arguments, they are files to read from.
If it is called with no arguments, read from standard input.
So fileinput works exactly as intended. It is not clear what you are using commandline arguments for, but if you don't want to stop using fileinput, you should modify sys.argv before you invoke it.
import fileinput
import sys
some_keyword = sys.argv[1]
sys.argv = sys.argv[:1]  # Retain only argument 0, the command name
for line in fileinput.input():
    ...
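An alternative to trimming sys.argv, sketched here under the assumption that the first argument is a plain keyword and all data arrives on stdin, is to pass an explicit `files` list so that fileinput never consults sys.argv. The function name `read_stdin_lines` is hypothetical.

```python
import fileinput
import sys


def read_stdin_lines(argv=None):
    """Return (keyword, lines): argv[1] is a plain keyword, data comes from stdin."""
    if argv is None:
        argv = sys.argv
    keyword = argv[1] if len(argv) > 1 else None
    # An explicit files list stops fileinput from looking at sys.argv;
    # "-" is fileinput's documented name for standard input.
    with fileinput.input(files=("-",)) as stream:
        lines = [line.rstrip("\n") for line in stream]
    return keyword, lines
```

Run as `cat my_data.csv | python3 my_script.py myarg1`, this leaves sys.argv untouched for any other code that reads it.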

Multiple files for one argument in argparse Python 2.7

Trying to make an argument in argparse where one can input several file names that can be read.
In this example, I'm just trying to print each of the file objects to make sure it's working correctly, but I get the error:
error: unrecognized arguments: f2.txt f3.txt
How can I get it to recognize all of them?
My command in the terminal to run the program and read multiple files:
python program.py f1.txt f2.txt f3.txt
Python script
import argparse
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('file', nargs='?', type=file)
    args = parser.parse_args()
    for f in args.file:
        print f
if __name__ == '__main__':
    main()
I used nargs='?' because I want it to accept any number of files. If I change add_argument to:
parser.add_argument('file', nargs=3)
then I can print them as strings but I can't get it to work with '?'
If your goal is to read one or more readable files, you can try this:
parser.add_argument('file', type=argparse.FileType('r'), nargs='+')
nargs='+' gathers all command line arguments into a list. There must also be one or more arguments or an error message will be generated.
type=argparse.FileType('r') tries to open each argument as a file for reading. It will generate an error message if argparse cannot open the file. You can use this for checking whether the argument is a valid and readable file.
Alternatively, if your goal is to read zero or more readable files, you can simply replace nargs='+' with nargs='*'. This will give you an empty list if no command line arguments are supplied. You might want to read from stdin if you're not given any files; if so, just add default=[sys.stdin] as a parameter to add_argument.
And then to process the files in the list:
args = parser.parse_args()
for f in args.file:
    for line in f:
        pass  # process each line here
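As a variant of the approach above, the sketch below swaps argparse.FileType for plain path strings and opens each file only when it is needed, falling back to stdin when no files are named. This is one possible arrangement rather than the answer's own method, and `iter_lines` is a hypothetical name.

```python
import argparse
import sys


def iter_lines(argv=None):
    """Yield lines from each file named on the command line, or from stdin."""
    parser = argparse.ArgumentParser()
    parser.add_argument("file", nargs="*", help="zero or more input files")
    args = parser.parse_args(argv)  # argv=None means sys.argv[1:]
    if not args.file:
        # No files given: read standard input instead.
        yield from sys.stdin
        return
    for path in args.file:
        # Each file is opened lazily and closed as soon as it is exhausted.
        with open(path) as handle:
            yield from handle
```

A script would then loop with `for line in iter_lines(): ...` and could be run as `python program.py f1.txt f2.txt f3.txt`.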
More about nargs: https://docs.python.org/2/library/argparse.html#nargs
More about type: https://docs.python.org/2/library/argparse.html#type
Just had to make sure there was at least one argument
parser.add_argument('file',nargs='*')

Reading from a file with sys.stdin in Pycharm

I am trying to test a simple code that reads a file line-by-line with Pycharm.
import sys
for line in sys.stdin:
    name, _ = line.strip().split("\t")
    print(name)
I have the file I want to input in the same directory: lib.txt
How can I debug my code in Pycharm with the input file?
You can work around this issue if you use the fileinput module rather than trying to read stdin directly.
With fileinput, if the script receives a filename(s) in the arguments, it will read from the arguments in order. In your case, replace your code above with:
import fileinput
for line in fileinput.input():
    name, _ = line.strip().split("\t")
    print(name)
The great thing about fileinput is that it defaults to stdin if no arguments are supplied (or if the argument '-' is supplied).
Now you can create a run configuration and supply the filename of the file you want to use as stdin as the sole argument to your script.
Read more about fileinput here
I have been trying to find a way to feed a file to stdin in PyCharm.
However, most people, including JetBrains, say that there is no way and no support for it: redirection is a feature of the command line, not of PyCharm itself.
* https://intellij-support.jetbrains.com/hc/en-us/community/posts/206588305-How-to-redirect-standard-input-output-inside-PyCharm-
This feature, reading a file as stdin, is essential for me because it makes it easy to supply input when solving programming problems from HackerRank or acmicpc.
I found a simple way: reassign sys.stdin to an open file, and input() then reads from that file!
import sys
sys.stdin = open('input.in', 'r')
sys.stdout = open('output.out', 'w')
print(input())
print(input())
input.in example:
hello world
This is not the world ever I have known
output.out example:
hello world
This is not the world ever I have known
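If the redirection should apply only to part of a script, one option is to wrap it in a context manager that restores the original stdin afterwards. This is a sketch, and `stdin_from` is a hypothetical helper name rather than a standard library function.

```python
import contextlib
import sys


@contextlib.contextmanager
def stdin_from(path):
    """Temporarily make sys.stdin read from the file at `path`."""
    original = sys.stdin
    with open(path) as handle:
        sys.stdin = handle
        try:
            yield handle
        finally:
            # Always restore the real stdin, even if the body raises.
            sys.stdin = original
```

Inside `with stdin_from('input.in'):`, calls to input() read successive lines of the file; outside the block, stdin behaves as usual.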
You need to create a custom run configuration and then add your file as an argument in the "Script Parameters" box. See PyCharm's online help for a step-by-step guide.
However, even if you do that (as you have discovered), your program won't work, since it doesn't read the command-line arguments.
You need to instead use argparse:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("filename", help="The filename to be processed")
args = parser.parse_args()
if args.filename:
    # Use args.filename; a bare "filename" is not defined here.
    with open(args.filename) as f:
        for line in f:
            name, _ = line.strip().split('\t')
            print(name)
For flexibility, you can write your python script to always read from stdin and then use command redirection to read from a file:
$ python myscript.py < file.txt
However, as far as I can tell, you cannot use redirection from PyCharm as Run Configuration does not allow it.
Alternatively, you can accept the file name as a command-line argument:
$ python myscript.py file.txt
There are several ways to deal with this. I think argparse is overkill for this situation. Alternatively, you can access command-line arguments directly with sys.argv:
import sys
filename = sys.argv[1]
with open(filename) as f:
    for line in f:
        name, _ = line.strip().split('\t')
        print(name)
For robust code, you can check that the correct number of arguments is given.
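A minimal sketch of such a check, assuming the script expects exactly one file name; `filename_from_argv` and the usage text are illustrative choices, not a fixed convention.

```python
import sys


def filename_from_argv(argv=None):
    """Return the single filename argument, or exit with a usage message."""
    if argv is None:
        argv = sys.argv
    if len(argv) != 2:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("usage: {} FILE".format(argv[0]))
    return argv[1]
```

The script can then start with `filename = filename_from_argv()` in place of indexing sys.argv directly.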
Here's my hack for google code jam today, wish me luck. Idea is to comment out monkey() before submitting:
def monkey():
    print('Warning, monkey patching')
    global input
    input = iter(open('in.txt')).next
monkey()
T = int(input())
for caseNum in range(1, T + 1):
    N, L = list(map(int, input().split()))
    nums = list(map(int, input().split()))
edit for python3:
def monkey():
    print('Warning, monkey patching')
    global input
    it = iter(open('in.txt'))
    input = lambda : next(it)
monkey()
