I am trying to read from a file which has contents like this:
#\5\5\5
...
#\5\5\10
The file content is then fed into Python's subprocess module like this:
for lines in file.readlines():
    print(lines)
    cmd = "ls"
    p = subprocess.run([cmd, lines])
The output turns into something like this:
CompletedProcess(args=['ls', "'#5\\5\\5'\n"], returncode=1)
I don't understand why the contents of the file end up wrapped in double quotes, and why an extra backslash is added.
The real problem here isn't Python or the subprocess module. The problem is the use of subprocess to invoke shell commands and then trying to parse the results. In this case, it looks like the command is ls, and the plan appears to be to read some filesystem paths from a text file (each path on a separate line) and list the files at each location on the filesystem.
Using subprocess to invoke ls is really, really, really not the way to accomplish that in Python. This is basically an attempt to use Python like a shell script (this use of ls would still be problematic, but that's a different discussion).
If a shell script is the right tool for the job, then write a shell script. If you want to use Python, then use one of the APIs that it provides for interacting with the OS and the filesystem. There is no need to bring in external programs to achieve this.
import os
with open("list_of_paths.txt", "r") as fin:
    for line in fin.readlines():
        w = os.listdir(line.strip())
        print(w)
Note the use of .strip(): this string method removes whitespace, such as spaces and newlines, from both ends of the input.
The listdir method provided by the os module will return a list of the files in a directory. Other options are os.scandir, os.walk, and the pathlib module.
But please do not use subprocess. 95% of the time, when someone thinks "should I use Python's subprocess module for this?" the answer is "NO".
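As a minimal sketch of the pathlib option mentioned above, the same task could look like the following. The function name list_directories and the choice to return None for paths that are not directories are assumptions made for illustration, not part of the original answer.

```python
from pathlib import Path


def list_directories(list_file):
    """Return {path: sorted entry names} for each path listed in list_file."""
    listings = {}
    with open(list_file, "r") as fin:
        for line in fin:
            path = line.strip()  # drop the trailing newline and stray spaces
            if not path:
                continue  # skip blank lines
            directory = Path(path)
            if directory.is_dir():
                listings[path] = sorted(entry.name for entry in directory.iterdir())
            else:
                listings[path] = None  # missing, or not a directory
    return listings
```

Calling list_directories("list_of_paths.txt") would then give one entry per line of the file, without starting any external program.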
This happens because a backslash followed by certain characters or digits forms an escape sequence rather than literal text. For example, \n is not just a \ and an n; it means a newline. If you really want a literal \n, you add another backslash to it (\\n). Likewise, \5 means something else. Here is what I found when I ran \5:
and hence the \\ being added, if I am not wrong.
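A short sketch may help separate two things here: escape sequences in Python source code, and the way repr displays a string that was read from a file. It assumes the file line holds the literal characters #\5\5\5 followed by a newline. Text read from a file is not re-interpreted for escapes; the doubled backslashes in the CompletedProcess output are only how repr shows a single backslash. In the same way, repr switches to double quotes when the string itself contains single quotes, which appears to be the case in the question's output.

```python
# What readline() would return for a file line containing  #\5\5\5
line = "#\\5\\5\\5\n"

shown = repr(line)      # the form used when a list of arguments is displayed
print(line, end="")     # the actual text, with single backslashes
print(shown)            # the same text with each backslash shown doubled
print(len(line))        # counts '#', three backslash-plus-5 pairs, and the newline

# In source code, "\5" is an octal escape: one character with code 5.
assert "\5" == chr(5)
# An escaped backslash followed by '5' is two characters.
assert len("\\5") == 2
# repr picks double quotes when the text contains single quotes.
assert repr("'x'") == "\"'x'\""
```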
I am trying to execute the rm command from Python on Linux as follows:
remove_command = [find_executable(
"rm"), "-rf", "dist/", "python_skelton.egg-info", "build/", "other/*_generated.py"]
print('Removing build, dist, python_skelton.egg-
if subprocess.call(remove_command) != 0:
    sys.exit(-1)
The directories get removed successfully, but the regex pattern other/*_generated.py
does not remove the relevant _generated.py files.
How can I remove those files using a regex from a Python script?
The reason this doesn't work the way you intend is that your pattern is not expanded, but interpreted as the literal file name "other/*_generated.py". This happens because you are relying on so-called glob pattern expansion.
The glob pattern is typically expanded by the shell, but since you are calling the rm command without using the shell, you will not get this "automatically" done. I can see two obvious ways to handle this.
Expand the glob before calling the subprocess
This can be done, using the Python standard library glob implementation:
import glob
remove_command = [find_executable("rm"), "-rf", "dist/", "python_skelton.egg-info",
"build/"] + glob.glob("other/*_generated.py")
subprocess.call(remove_command)
Use the shell to expand the glob
To do this, you need to pass shell=True to the subprocess.call. And, as always, when using the shell, we should pass the command as a single string and not a list:
remove_command = [find_executable("rm"), "-rf", "dist/", "python_skelton.egg-info",
"build/", "other/*_generated.py"]
remove_command_string = " ".join(remove_command) # generate a string from list
subprocess.call(remove_command_string, shell=True)
Both of these approaches will work. Note, though, that if the command includes user input, you should avoid shell=True, because it is a security hole that can be used to execute arbitrary commands. In the current use case, that does not seem to apply.
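A third possibility, not part of the answer above, is to skip rm entirely and do the removal with the standard library. The sketch below is one way this might look; the function name clean and its return value are assumptions made for illustration.

```python
import glob
import os
import shutil


def clean(directories, file_pattern):
    """Remove the given directories and every file matching file_pattern."""
    removed = []
    for directory in directories:
        if os.path.isdir(directory):      # ignore directories that are already gone
            shutil.rmtree(directory)
            removed.append(directory)
    for path in glob.glob(file_pattern):  # glob expands the * wildcard
        os.remove(path)
        removed.append(path)
    return removed


# Example call using the names from the question:
# clean(["dist", "python_skelton.egg-info", "build"], "other/*_generated.py")
```

Because no shell and no external program are involved, there is no quoting or expansion step that can go wrong.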
The string that contains a file path looks like this in the console:
>>> target_file
'src//data//annual_filings//ABB Ltd//ABB_ar_2015.pdf'
I got the target_file from a call to os.walk
The goal is to build a command to run in subprocess.call
Something like:
from subprocess import call
cmd_ = r'qpdf-7.0.0/bin/qpdf --password=%s --decrypt %s %s' %('', target_file, target_file)
call([cmd_])
I tried different variations, setting shell to either True or False.
Replacing the // with /,\ etc.
The issue seems to be with the space in the folder (I can not change the folder name).
The python code needs to run on Windows
You have to define cmd_ as a list of arguments, not a list containing a single string; otherwise subprocess treats the whole string as the command name and does not even try to split the arguments:
cmd_ = ['qpdf-7.0.0/bin/qpdf','--password=%s'%'','--decrypt',target_file, target_file]
call(cmd_)
Then leave the quoting to subprocess.
As a side note, no need to double the slashes. It works, but that's unnecessary.
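To see that a path containing a space survives as a single argument, here is a small sketch that can be run without qpdf installed. It uses the Python interpreter itself as a stand-in program that prints the arguments it receives; the stand-in and the simplified path are assumptions for illustration only.

```python
import subprocess
import sys

# Path with a space in a folder name, similar to the one in the question.
target_file = "src/data/annual_filings/ABB Ltd/ABB_ar_2015.pdf"

# Stand-in for qpdf: a child Python process that prints the arguments it was given.
cmd = [sys.executable, "-c", "import sys; print(sys.argv[1:])", "--decrypt", target_file]
output = subprocess.check_output(cmd, text=True)
print(output)
```

The child process reports the path as one argument, space included, because each list element is handed over separately rather than being split on whitespace.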
I have a file name that I want to pass to a program or a bash script. For example, if it's my car's picture.jpg, I have to change it to my\ car\'s picture.jpg to pass it to os.system, as in show my\ car\'s picture.jpg. Is there a function that adds the backslashes automatically?
You should use the subprocess module to call shell scripts from Python. Then you don't have to worry about escaping things yourself.
import subprocess
subprocess.call(['script_name', "my car's picture.jpg"])
With a list of arguments, subprocess.call() passes each argument to the program as is, so there is nothing for you to escape. If you need to read the output of the shell script, use subprocess.check_output() instead.
You can simply pass the name as is and use subprocess; os.system is discouraged in favor of it.
from subprocess import check_output
c = check_output(["file","/home/padraic/Pictures/my cars' picture.png"])
print(c)
b"/home/padraic/Pictures/my cars' picture.png: PNG image data, 1366 x 768, 8-bit/color RGB, non-interlaced\n"
To call a script, use check_call; if you want to pipe, you can use Popen. There are lots of examples in the subprocess docs, including the section on replacing os.system.
I can offer several incomplete suggestions that could be helpful to you.
Use "my car's picture.jpg" -- double quotes escape single ones
Using spaces in a UNIX file system generates only headaches. You could pass the filename inside of double-quotes.
os.system('cp "my car\'s picture" myCarPicture.jpg')
If you are using filenames with backslashes on a Windows system, use a raw string
r"C:\Foo\bah\baz.jpg"
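For the case where a shell command string really is needed, the standard library has a function that does the quoting: shlex.quote. The sketch below shows it on the filename from the question. Note that it targets POSIX shells, so it is not a fit for the Windows command prompt; "show" is simply the placeholder command used in the question.

```python
import shlex

filename = "my car's picture.jpg"

# shlex.quote returns a form of the string that a POSIX shell reads as one word.
quoted = shlex.quote(filename)
command = "show " + quoted

# Round trip: splitting the command with POSIX shell rules gives the name back.
assert shlex.split(command) == ["show", filename]
print(command)
```

The resulting string could be passed to os.system or to subprocess with shell=True, although passing a list to subprocess avoids the need for quoting altogether.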
I am creating a simple file in python to reorganize some text data I grabbed from a website. I put the data in a .txt file and then want to use the "tail" command to get rid of the first 5 lines. I'm able to make this work for a simple filename shown below, but when I try to change the filename (to what I'd actually like it to be) I get an error. My code:
start = 2010
end = 2010
for i in range(start, end+1):
    year = str(i)
    # ...write data to a file called file...
    teamname=open(file).readline() # want to use this in the new filename
    teamfname=teamname.replace(" ","") #getting rid of spaces
    file2 = "gotdata2_"+year+".txt"
    os.system("tail -n +5 gotdata_"+year+".txt > "+file2)
The above code works as intended, creating file, then creating file2 that excludes the first 5 lines of file. However, when I change the name of file2 to be:
file2 = teamfname+"_"+year+".txt"
I get the error:
sh: line 1: _2010.txt: command not found
It's as if the end of my file2 statement is getting chopped off and the .txt part isn't being recognized. In this case, my code outputs a file, but its name is missing the _2010.txt at the end. I've double-checked that both year and teamfname are strings. I've also tried it with and without spaces in the teamfname string. I get the same error when I try to include an os.system mv statement that would rename the file to what I want it to be, so there must be something wrong with my understanding of how to specify the string here.
Does anyone have any ideas about what causes this? I haven't been able to find a solution, but I've found this problem difficult to search for.
Without knowing what your actual strings are, it's impossible to be sure what the problem is. However, it's almost certainly something to do with failing to properly quote and/or escape arguments for the command line.
My first guess would be that you have a newline in the middle of your filename, and the shell is truncating the command at the newline. But I wouldn't bet too heavily on that. If you actually printed out the repr of the pathname, I could tell you for sure. But why go through all this headache?
The solution to almost any problem with os.system is to not use os.system.
If you look at the docs, they even tell you this:
The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function. See the Replacing Older Functions with the subprocess Module section in the subprocess documentation for some helpful recipes.
If you use subprocess instead of os.system, you can avoid the shell entirely. You can also pass arguments as a list instead of trying to figure out how to quote and escape them properly, which completely avoids the exact problem you're having.
For example, if you do this:
file2 = "gotdata2_"+year+".txt"
with open(file2, 'wb') as f:
    subprocess.check_call(['tail', '-n', '+5', "gotdata_"+year+".txt"], stdout=f)
Then, if you change that first line to this:
file2 = teamfname+"_"+year+".txt"
It will still work even if teamfname has a space or a quote or another special character in it.
That being said, I'm not sure why you want to use tail in the first place. You can skip the first 5 lines just as easily directly in Python.
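As a rough sketch of the pure-Python route, itertools.islice can drop the leading lines while copying. The helper names drop_leading_lines and first_line are made up for illustration. One detail to check against your data: tail -n +5 starts its output at line 5, so it leaves out four lines, not five; pick count to match what you actually want. The second helper strips the trailing newline that readline() keeps, which is the kind of stray character suspected above.

```python
from itertools import islice


def drop_leading_lines(src, dst, count):
    """Copy src to dst, leaving out the first `count` lines."""
    with open(src, "r") as fin, open(dst, "w") as fout:
        fout.writelines(islice(fin, count, None))


def first_line(path):
    """Return the first line of a file without its trailing newline."""
    with open(path, "r") as f:
        return f.readline().strip()
```

With these, the new filename could be built from first_line(file).replace(" ", "") and the copy done with drop_leading_lines, with no shell command involved.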
I have a small problem with reading in my file. My code:
import csv as csv
import numpy
with open("train_data.csv","rb") as training:
    csv_file_object = csv.reader(training)
    header = csv_file_object.next()
    data = []
    for row in csv_file_object:
        data.append(row)
data = numpy.array(data)
I get the error no such file "train_data.csv", so I know the problem lies with the location. But whenever I specify the path like this: open("C:\Desktop...etc) it doesn't work either. What am I doing wrong?
If you give the full file path, your script should work. If it still does not, the likely cause is that the backslashes in your path are being read as escape characters. To fix this, use a raw string to specify the file path:
# Put an 'r' at the start of the string to make it a raw-string.
with open(r"C:\path\to\file\train_data.csv","rb") as training:
Raw strings do not process escape characters.
Also, as a technical point: if you do not give the full file path, Python looks for the file in the current working directory, which is normally the directory the script is launched from. If the file is not there, an error is raised.
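The effect of escape characters in a path can be seen in a small sketch. The path below is hypothetical and was chosen so that it happens to contain \t and \n, two sequences Python treats as a tab and a newline in ordinary string literals.

```python
# Hypothetical Windows path whose folder and file names start with t and n.
plain = "C:\temp\new_data.csv"   # \t becomes a tab, \n becomes a newline
raw = r"C:\temp\new_data.csv"    # raw string: backslashes are kept as written

assert "\t" in plain and "\n" in plain   # the path has been silently altered
assert "\t" not in raw and "\n" not in raw
assert raw == "C:\\temp\\new_data.csv"   # same text as the escaped spelling
print(len(plain), len(raw))              # the plain version is two characters shorter
```

Opening the plain version would look for a file name containing a tab and a newline, which is why the file appears not to exist.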
When you use open() on Windows, you need to deal with the backslashes properly.
Option 1.) Use a raw string, which is a string prefixed with an r.
open(r'C:\Users\Me\Desktop\train_data.csv')
Option 2.) Escape the backslashes
open('C:\\Users\\Me\\Desktop\\train_data.csv')
Option 3.) Use forward slashes
open('C:/Users/Me/Desktop/train_data.csv')
As for finding the file you are using, if you just do open('train_data.csv') it is looking in the directory you are running the python script from. So, if you are running it from C:\Users\Me\Desktop\, your train_data.csv needs to be on the desktop as well.
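If depending on the launch directory is inconvenient, one possible approach is to build the path from an explicit base folder with pathlib. This is a sketch under that assumption; the helper name resolve_data_file is made up for illustration.

```python
from pathlib import Path


def resolve_data_file(filename, base_dir=None):
    """Return an absolute path for filename inside base_dir (default: current directory)."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / filename).resolve()


# In a script, the folder containing the script itself can serve as the base:
# csv_path = resolve_data_file("train_data.csv", Path(__file__).resolve().parent)
```

Printing the resolved path before calling open() also makes it easy to see exactly where Python is looking for the file.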