I have the following issue.
I would like to read input from a file (e.g. example.txt) into my Python script (e.g. script.py). Right now, I've implemented the following lines of code:
import sys
with open(sys.argv[1], 'r') as f:
    contents = f.read()
When I want to pass a file to this script, I just type the following line in CMD:
python script.py example.txt
Of course, it works properly. The file example.txt is read by script.py, which can be checked by adding a simple print(contents) line to script.py.
The problem is that I have to run the script in CMD like this:
script.py < example.txt
So, the question is, how can I achieve that? I suppose, it depends on the OS. On my Windows 10, I'm getting an error:
Traceback (most recent call last):
  File "script.py", line 2, in <module>
    with open(sys.argv[1], 'r') as f:
IndexError: list index out of range
I'm not asking for a solution (though it would be nice); I just want to know where I should look for one.
script.py < example.txt sends the file contents to stdin which can be accessed via sys.stdin. The following works:
import sys
# First try supporting commands formatted like: script.py example.txt
if len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        contents = f.read()
# Now try supporting: script.py < example.txt
# (stdin is redirected or piped, so it is not an interactive terminal)
elif not sys.stdin.isatty():
    contents = ''.join(sys.stdin)
# If both methods failed, throw a user friendly error
else:
    raise Exception('Please supply a file')
print(contents)
But in good old Python fashion, there is a built-in library that can make our life very easy. The library is fileinput, and it will automatically support both methods of reading input that you mentioned:
import fileinput
contents = fileinput.input()
print(''.join(contents))
That works regardless of whether you run script.py example.txt, script.py < example.txt, or cat example.txt | script.py. You can even run script.py example1.txt example2.txt example3.txt, and you will receive the contents of the different files combined together.
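To illustrate the multi-file case, here is a minimal sketch that passes file names to fileinput explicitly instead of relying on the command line. The helper name read_all and the throwaway file names are assumptions made for this illustration, not part of the original answer.

```python
import fileinput
import os
import tempfile


def read_all(paths=None):
    """Return the concatenated text of the given files.

    With paths=None, fileinput falls back to sys.argv[1:], and to
    standard input when that list is empty.
    """
    with fileinput.input(files=paths) as stream:
        return "".join(stream)


# Demonstration with two throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp_dir:
    first = os.path.join(tmp_dir, "example1.txt")
    second = os.path.join(tmp_dir, "example2.txt")
    with open(first, "w") as f:
        f.write("alpha\n")
    with open(second, "w") as f:
        f.write("beta\n")
    combined = read_all([first, second])

print(combined, end="")
```

In a real script you would probably just call read_all() with no arguments and let fileinput pick between the command-line files and stdin.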
Related
I have been trying to append the output of a command to a temporary file in Python and then perform some operations on it, but I am not able to append the data to the temporary file. Any help is appreciated! My sample code follows the error message below.
I get this error:
with open(temp1 , 'r') as f:
TypeError: expected str, bytes or os.PathLike object, not _TemporaryFileWrapper
import tempfile
import os
temp1 = tempfile.NamedTemporaryFile()
os.system("echo Hello world | tee temp1")
with open(temp1 , 'r') as f:
    a = f.readlines()[-1]
print(a)
import tempfile
import os
# Opening in update-text mode to avoid encoding the data written to it
temp1 = tempfile.NamedTemporaryFile("w+")
# popen opens a pipe from the command, allowing one to capture its output
output = os.popen("echo Hello world")
# Write the command output to the temporary file
temp1.write(output.read())
# Reset the stream position at the beginning of the file, if you want to read its contents
temp1.seek(0)
print(temp1.read())
Check out subprocess.Popen for more powerful subprocess communication.
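As a rough sketch of what that could look like, the following uses subprocess.Popen with a pipe and then writes the captured text to the temporary file. It assumes Python 3.7 or later (for text=True) and runs a small Python child via sys.executable instead of echo, purely so that the example does not depend on a particular shell.

```python
import subprocess
import sys
import tempfile

# Run a child process and capture its standard output through a pipe.
# The child command is an assumption for illustration; any command works.
command = [sys.executable, "-c", "print('Hello world')"]
with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
    output, _ = proc.communicate()

# Write the captured output to the temporary file, then read the last line.
with tempfile.NamedTemporaryFile("w+") as temp1:
    temp1.write(output)
    temp1.seek(0)
    last_line = temp1.readlines()[-1]

print(last_line, end="")
```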
Whatever you're trying to do isn't right. It appears that you are trying to have a system call write to a file, and then you want to read that file in your Python code. You're creating a temporary file, but then your system call is writing to a statically named file, named 'temp1' rather than to the temporary file you've opened. So it's unclear if you want/need to use a computed temporary file name or if using temp1 is OK. The easiest way to fix your code to do what I think you want is like this:
import os
os.system("echo Hello world | tee temp1")
with open('temp1' , 'r') as f:
    a = f.readlines()[-1]
print(a)
If you need to create a temporary file name in your situation, then you have to be careful if you are at all concerned about security or thread safety. What you really want to do is have the system create a temporary directory for you, and then create a statically named file in that directory. Here's your code reworked to do that:
import tempfile
import os
with tempfile.TemporaryDirectory() as tmp_dir:
    temp_path = os.path.join(tmp_dir, "temp1")
    os.system("echo Hello world /tmp > " + temp_path)
    with open(temp_path) as f:
        buf = f.read()
print(buf)
This method has the added benefit of automatically cleaning up for you.
UPDATE: I have now seen @UlisseBordingnon's answer. That's a better solution overall. Using os.system() is discouraged. I would have gone a slightly different way by using the subprocess module, but what they suggest is 100% valid, and is thread-safe and secure. I'll leave my answer here in case you or other readers need to use os.system() or otherwise have the shell process you execute write directly to a file.
As others have suggested, you should use the subprocess module instead of os.system. From subprocess, you can use its most recent interface, subprocess.run (added in Python 3.5).
The neat thing about using .run is that you can pass any file-like object to stdout and the stdout stream will automatically redirect to that file.
import tempfile
import subprocess
with tempfile.NamedTemporaryFile("w+") as f:
    subprocess.run(["echo", "hello world"], stdout=f)
    # command has finished running, let's check the file
    f.seek(0)
    print(f.read())
    # hello world
If you are using Python 3.7 or later (as most of us are), subprocess.run is better still, because its capture_output parameter means you do not need a temporary file:
import subprocess
completed_process = subprocess.run(
    ["echo", "hello world"],
    capture_output=True,
    encoding="utf-8",
)
print(completed_process.stdout)
Notes
The capture_output parameter tells run() to save the output to the .stdout and .stderr attributes
The encoding parameter will convert the output from bytes to string
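To make the two notes concrete, here is a small sketch in which the child writes to both streams, so that both .stdout and .stderr end up populated as strings. The child command is made up for the illustration and uses sys.executable rather than echo.

```python
import subprocess
import sys

# Hypothetical child program that writes one line to each stream.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

completed = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,   # collect both stdout and stderr
    encoding="utf-8",      # decode the captured bytes to str
)

stdout_text = completed.stdout
stderr_text = completed.stderr
print(stdout_text, end="")
```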
Depending on your needs, if you only print your output, a quicker way (though maybe not exactly what you are looking for) is to redirect the output to a file at the command-line level.
Example (egfile.py):
import os
os.system("echo Hello world")
At command level you can simply do:
python egfile.py > file.txt
The output of the script will be redirected to the file instead of the screen.
I ran a simple Python command from cmd, C:\Users> stat.py < swagger.yaml > output.html, which executes stat.py with swagger.yaml as its input and generates output.html, and it worked fine. Now I want to execute stat.py from another Python file, demo.py, passing swagger.yaml and output.html as sys.argv[1] and sys.argv[2] inside demo.py.
My command from cmd is C:\Users> demo.py swagger.yaml output.html, and my demo.py file is as follows:
# my demo.py file ....
import os
import sys
os.system('stat.py < sys.argv[1] > sys.argv[2]')
Error: the system cannot find the file specified.
Why am I getting this error, and how can I resolve it?
Inside a normal string, no variable interpretation is applied. So you literally asked to read from a file named sys.argv[1] (possibly sys.argv1 if the file exists, thanks to shell globbing), and write to a file named sys.argv[2].
If you want to use the values of sys.argv in your script, you need to format them into the string, e.g. with f-strings (Python 3.6 and later only):
os.system(f'stat.py < {sys.argv[1]} > {sys.argv[2]}') # Note f at beginning of literal
or, on older versions such as Python 2.7, with str.format:
os.system('stat.py < {} > {}'.format(sys.argv[1], sys.argv[2]))
Note that however you slice it, this is dangerous; os.system is launching this in a shell, and arguments that contain shell metacharacters will be interpreted as such. It can't do anything the user didn't already have permission to do, but small mistakes by the user could dramatically change the behavior of the program. If you want to do this properly/safely, use subprocess, open the files yourself, and pass them in explicitly as stdin/stdout:
import subprocess
import sys
with open(sys.argv[1], 'rb') as infile, open(sys.argv[2], 'wb') as outfile:
    subprocess.run([sys.executable, 'stat.py'], stdin=infile, stdout=outfile)
This ensures the files can be opened in the first place before launching the process, doesn't allow the shell to interpret anything, and avoids the (minor) expense of launching a shell at all. It's also going to give you more useful errors if opening the files fails.
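For anyone who wants to try this end to end, here is a self-contained sketch of the same idea. The helper run_with_redirects and the tiny upper.py script (a stand-in for stat.py) are assumptions made for this illustration; the child is launched through sys.executable so that it also works where .py files are not directly executable.

```python
import os
import subprocess
import sys
import tempfile


def run_with_redirects(script_path, in_path, out_path):
    """Run script_path with stdin/stdout bound to the given files."""
    # Both files are opened before the child starts, so a bad path
    # fails here with an ordinary Python exception.
    with open(in_path, "rb") as infile, open(out_path, "wb") as outfile:
        return subprocess.run(
            [sys.executable, script_path],
            stdin=infile,
            stdout=outfile,
            check=True,
        )


with tempfile.TemporaryDirectory() as tmp_dir:
    script = os.path.join(tmp_dir, "upper.py")  # hypothetical stand-in for stat.py
    source = os.path.join(tmp_dir, "in.txt")
    target = os.path.join(tmp_dir, "out.txt")
    with open(script, "w") as f:
        f.write("import sys\nsys.stdout.write(sys.stdin.read().upper())\n")
    with open(source, "w") as f:
        f.write("hello\n")

    run_with_redirects(script, source, target)
    with open(target) as f:
        result = f.read()

    # A missing input file is reported before any process is launched.
    try:
        run_with_redirects(script, os.path.join(tmp_dir, "missing.txt"), target)
    except FileNotFoundError as exc:
        missing_error = exc
```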
I have a Python script which ends the following way:
if __name__ == "__main__":
    if len(sys.argv) == 2:
        file = open(sys.argv[1])
        text = file.readline()
        ... #more statements
This works when I type in the following: $ python3 script.py my_file.txt
However, I want to change it so my script can accept text from standard input (or even a text file). This is what I want to be able to do:
$ ./script.py < my_file.txt
I think I need to use sys.stdin.read() (or maybe sys.stdin.readlines()). Could you tell me what I would need to change from my original script?
I'm sorry if this looks very basic, but I'm new to Python and I find it hard to see the difference.
It's exactly what you said, you don't need to open a file.
Instead of calling file.readline(), call sys.stdin.readline().
You can make it "nice", with something like:
file = sys.stdin if use_stdin else open(sys.argv[1])
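One way to fill in use_stdin is to check whether a file name was given on the command line. The sketch below does that and wraps sys.stdin in contextlib.nullcontext (Python 3.7+) so that a with-block works for both cases without closing stdin. The helper names open_input and first_line are invented for this example.

```python
import contextlib
import sys


def open_input(argv):
    """Return a context manager for the input stream.

    Uses the file named in argv[1] if present, otherwise standard input.
    """
    if len(argv) > 1:
        return open(argv[1])
    # nullcontext leaves sys.stdin open when the with-block ends.
    return contextlib.nullcontext(sys.stdin)


def first_line(argv):
    """Read the first line from whichever input open_input selects."""
    with open_input(argv) as stream:
        return stream.readline()
```

In the script from the question, text = first_line(sys.argv) would then work for both python3 script.py my_file.txt and ./script.py < my_file.txt.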
There's a cool module you can use for this! Assuming you want to do processing per line:
import fileinput
for line in fileinput.input():
    process_line(line)
I am attempting to cat a CSV file into stdout and then pipe the printed output as input into a python program that also takes a system argument vector with 1 argument. I ran into an issue I think directly relates to how Python's fileinput.input() function reacts with regards to occupying the stdin file descriptor.
generic_user% cat my_data.csv | python3 my_script.py myarg1
Here is a sample Python program:
import sys, fileinput
def main(argv):
    print("The program doesn't even print this")
    data_list = []
    for line in fileinput.input():
        data_list.append(line)
if __name__ == "__main__":
    main(sys.argv)
If I attempt to run this sample program with the above terminal command and no argument myarg1, the program is able to evaluate and parse the stdin for the data output from the CSV file.
If I run the program with the argument myarg1, it will end up throwing a FileNotFoundError directly related to myarg1 not existing as a file.
FileNotFoundError: [Errno 2] No such file or directory: 'myarg1'
Would someone be able to explain in detail why this behavior takes place in Python and how to handle the logic such that a Python program can first handle stdin data before argv overwrites the stdin descriptor?
You can read from the stdin directly:
import sys
def main(argv):
    print("The program doesn't even print this")
    data_list = []
    for line in sys.stdin:
        data_list.append(line)
if __name__ == "__main__":
    main(sys.argv)
You are trying to access a file that has not been created yet, hence fileinput cannot open it; but since you are piping the data in, you have no need for it.
This is by design. The designers of fileinput thought that there were use cases where reading from stdin would be nonsense, and just provided a way to specifically add stdin to the list of files. According to the reference documentation:
import fileinput
for line in fileinput.input():
    process(line)
This iterates over the lines of all files listed in sys.argv[1:], defaulting to sys.stdin if the list is empty. If a filename is '-', it is also replaced by sys.stdin.
Just keep your code and use: generic_user% cat my_data.csv | python3 my_script.py - myarg1
to read stdin before the myarg1 file, or, if you want to read it afterwards: ... python3 my_script.py myarg1 -
fileinput implements a pattern common for Unix utilities:
If the utility is called with commandline arguments, they are files to read from.
If it is called with no arguments, read from standard input.
So fileinput works exactly as intended. It is not clear what you are using commandline arguments for, but if you don't want to stop using fileinput, you should modify sys.argv before you invoke it.
some_keyword = sys.argv[1]
sys.argv = sys.argv[:1] # Retain only argument 0, the command name
for line in fileinput.input():
    ...
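An alternative that avoids modifying sys.argv is to hand fileinput an explicit files list. A minimal sketch, assuming the first argument is an ordinary value and any further arguments are input files; the function name read_data is made up for the illustration:

```python
import fileinput


def read_data(argv):
    """Treat argv[1] as a plain argument and argv[2:] as input files."""
    keyword = argv[1]
    # An empty files list makes fileinput fall back to standard input.
    with fileinput.input(files=argv[2:]) as stream:
        return keyword, list(stream)
```

With this, cat my_data.csv | python3 my_script.py myarg1 reads the piped data, because argv[2:] is empty and fileinput then uses stdin.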
I am using Python in NetBeans and having some trouble setting up file arguments as input/output. For instance:
import re, sys
for line in sys.stdin:
    for token in re.split(r"\s+", line.strip()):
        print(token)
Command line usage python splitprog.py < input.txt > output.txt works great. But in NetBeans the output window just waits, with nothing happening even if one gives a file name (I tested many combinations).
The Application Arguments row in project properties (where one would enter these files for a Java project) doesn’t seem to be used either, as the behaviour is the same regardless of whether there are file names/paths there or not. Is there some trick to get this to work or are file args currently unusable when it comes to Python in NetBeans?
ADD: As per the suggestion by @John Zwinck, an example solution:
import re, sys
with open(sys.argv[1]) as infile:
    with open(sys.argv[2], "w") as outfile:
        for line in infile:
            for token in re.split(r"\s+", line.strip()):
                print(token, file = outfile)
Argument files are set in NB project properties. In command prompt, the programme is now simply run by python splitprog.py input.txt output.txt.
When you do this:
python splitprog.py < input.txt > output.txt
You are redirecting input.txt to stdin of python, and stdout of python to output.txt. You aren't using command line arguments to splitprog.py at all.
NetBeans does not support this.
Instead, you should pass the filenames as arguments, like this:
python splitprog.py input.txt output.txt
Then in NetBeans you just set the command line arguments to input.txt output.txt and it will work the same as the above command line in the shell. You'll need to modify your program slightly, perhaps like this:
with open(sys.argv[1]) as infile:
    for line in infile:
        # ...
If you still want to support stdin and stdout, one convention is to use - to mean those standard streams, so you could code your program to support this:
python splitprog.py - - < input.txt > output.txt
That is, you can write your program to understand - as "use the standard stream from the shell", if you need to support the old way of doing things. Or just default to this behavior if no command line arguments are given.
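Here is one possible sketch of that convention, using the tokenizer from the question. The helper names open_or_std and split_tokens are invented for this example, and contextlib.nullcontext requires Python 3.7 or later. Both paths default to -, which gives the "no arguments means standard streams" behavior.

```python
import contextlib
import re
import sys


def open_or_std(path, mode="r"):
    """Open path, or return a standard stream when path is "-"."""
    if path == "-":
        stream = sys.stdout if "w" in mode else sys.stdin
        # nullcontext keeps the standard stream open after the with-block.
        return contextlib.nullcontext(stream)
    return open(path, mode)


def split_tokens(in_path="-", out_path="-"):
    """Write each whitespace-separated token of the input on its own line."""
    with open_or_std(in_path) as infile, open_or_std(out_path, "w") as outfile:
        for line in infile:
            for token in re.split(r"\s+", line.strip()):
                print(token, file=outfile)
```

A script could then call split_tokens(*sys.argv[1:3]), which supports python splitprog.py input.txt output.txt as well as python splitprog.py - - < input.txt > output.txt.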