Python subprocess won't play nicely with gsutil copy/move commands

In Python I'm using subprocess to call gsutil copy and move commands, but am currently unable to select multiple extensions.
The same gsutil command works at the terminal, but not in Python:
cmd_gsutil = "sudo gsutil -m mv gs://xyz-ms-media-upload/*.{mp4,jpg} gs://xyz-ms-media-upload/temp/"
p = subprocess.Popen(cmd_gsutil, shell=True, stderr=subprocess.PIPE)
output, err = p.communicate()
If, say, there are four filetypes to move but the bucket is empty, the gsutil error returned at the terminal is:
4 files/objects could not be transferred.
Whereas the error returned when run through subprocess is:
1 files/objects could not be transferred.
So clearly subprocess is mucking up the command somehow...
I could always inefficiently repeat the command for each of the filetypes, but would prefer to get to the bottom of this!

It seems /bin/sh (the default shell) doesn't support the {mp4,jpg} brace-expansion syntax.
Pass executable='/bin/bash' to run it as a bash command instead.
You could also run the command without the shell, e.g., using the glob or fnmatch modules to get the filenames and construct the gsutil command. Note: you should pass the command as a list in that case.
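For example, a minimal sketch of the bash route, reusing the command from the question:
import subprocess

# /bin/sh won't expand {mp4,jpg}; telling Popen to use bash fixes that
cmd_gsutil = "sudo gsutil -m mv gs://xyz-ms-media-upload/*.{mp4,jpg} gs://xyz-ms-media-upload/temp/"
p = subprocess.Popen(cmd_gsutil, shell=True, executable='/bin/bash',
                     stderr=subprocess.PIPE)
output, err = p.communicate()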

Related

Multiple Commands To CMD

I have to use the following command-line tool: ncftp.
After that I need to
execute the following commands:
"open ftp://..."
"get -R Folder", but I need to do this automatically. How do I achieve this using Python or the command line?
You can use the Python subprocess module for this.
from subprocess import Popen, PIPE
# if you don't want your script to print the output of the ncftp
# commands, use Popen(['ncftp'], stdin=PIPE, stdout=PIPE)
with Popen(['ncftp'], stdin=PIPE) as proc:
proc.stdin.write(b"open ...\n") # must terminate each command with \n
proc.stdin.write(b"get -R Folder\n")
# ...etc
See the documentation for subprocess for more information. It can be a little tricky to get the hang of this library, but it's very versatile.
Alternatively, you can use the non-interactive commands ncftpget (docs) and ncftpput (docs) from the NcFTP package.
I recommend reading through the documentation on these commands before proceeding.
In the comments, you said you needed to get some files, delete those files, and afterwards upload some new files. Here's how you can do that:
$ ncftpget -R -DD -u username -p password ftp://server path/to/local/directory path/to/remote/directory/Folder
$ ncftpput -R -u username -p password ftp://server path/to/remote/directory path/to/local/directory/Folder
-DD will delete all files after downloading, but it will leave the directory and any subdirectories in place.
If you need to delete the empty folder, you can run the ncftpget command again without -R (but the folder must be completely empty, i.e. no subdirectories, so rinse and repeat as necessary).
You can do this in a bash script or using subprocess.run in Python.
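For example, a rough subprocess.run equivalent of the two commands above (username, password, server and the paths are the same placeholders):
import subprocess

# download recursively, deleting the remote files afterwards (-DD)
subprocess.run(["ncftpget", "-R", "-DD", "-u", "username", "-p", "password",
                "ftp://server", "path/to/local/directory",
                "path/to/remote/directory/Folder"], check=True)
# upload the new files
subprocess.run(["ncftpput", "-R", "-u", "username", "-p", "password",
                "ftp://server", "path/to/remote/directory",
                "path/to/local/directory/Folder"], check=True)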

Python's subprocess check_call doesn't give the same result as the same command executed in the command line

I am using an Anaconda environment both for the Python code and the terminal.
When I execute the program in the shell (Windows CMD) with the environment activated, ogr2ogr returns the correct output for the given parameters. The ogr2ogr tool was installed via a conda package.
But when I execute my Python code, ogr2ogr returns erroneous output. I thought it might be due to different installations being used because of different environments (without my knowledge), but this is only a guess.
The python code goes as follows:
from pathlib import Path
from subprocess import check_call, STDOUT
...
file_path = Path(file_name)
destination = str(file_path.with_suffix(".gpkg"))
command = f"ogr2ogr -f GPKG -s_srs EPSG:25833 -t_srs EPSG:25833 {destination} GMLAS:{file_name} -oo REMOVE_UNUSED_LAYERS=YES"
check_call(command, stderr=STDOUT, shell=True)
ogr2ogr translates a file into another format. That does happen here, but when I open the resulting file I can see it's not done 100% correctly.
When I copy the value of the command string into the shell and execute it there, the command runs correctly!
How can I correct the behaviour of subprocess.check_call?
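One thing worth trying (a sketch based on the guess above, not a confirmed fix) is to drop shell=True and pass the arguments as a list, so no shell gets a chance to rewrite the command:
from pathlib import Path
from subprocess import check_call, STDOUT

file_path = Path(file_name)
destination = str(file_path.with_suffix(".gpkg"))
# same ogr2ogr invocation as above, but with no shell involved
check_call(["ogr2ogr", "-f", "GPKG", "-s_srs", "EPSG:25833",
            "-t_srs", "EPSG:25833", destination, f"GMLAS:{file_name}",
            "-oo", "REMOVE_UNUSED_LAYERS=YES"], stderr=STDOUT)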

How to add environment variables to the bash opened by subprocess module?

I need to use wget in a Python script with the subprocess.call function, but it seems the "wget" command cannot be found by the bash subprocess opened by Python.
I have added the environment variable (the path where wget is):
export PATH=/usr/local/bin:$PATH
to the ~/.bashrc file and the ~/.bash_profile file on my Mac, and I have made sure to source them.
And the python script looks like:
import subprocess as sp
cmd = 'wget'
process = sp.Popen(cmd, stdout=sp.PIPE, stdin=sp.PIPE,
                   stderr=sp.PIPE, shell=True, executable='/bin/bash')
(stdoutdata, stderrdata) = process.communicate()
print stdoutdata, stderrdata
The expected output should be something like:
wget: missing URL
Usage: wget [OPTION]... [URL]...
But the result is always
/bin/bash: wget: command not found
Interestingly, I can get the help output if I type wget directly in a bash terminal, but it never works in the Python script. How can this be?
PS:
If I change the command to
cmd = '/usr/local/bin/wget'
then it works. So I am sure I got wget installed.
You can pass an env= argument to the subprocess functions.
import os
import subprocess

myenv = os.environ.copy()  # note: copy() must be called; os.environ.copy alone is just the method
myenv['PATH'] = '/usr/local/bin:' + myenv['PATH']
subprocess.run(..., env=myenv)
However, you probably want to avoid running a shell at all, and instead augment the PATH that Python uses to find the binary to run in the subprocess call.
import subprocess as sp
import os
os.environ['PATH'] = '/usr/local/bin:' + os.environ['PATH']
cmd = 'wget'
# use run instead of Popen
# don't needlessly use a shell
# and thus put [cmd] as a list
process = sp.run([cmd], stdout=sp.PIPE, stdin=sp.PIPE,
                 stderr=sp.PIPE,
                 universal_newlines=True)
print(process.stdout, process.stderr)
Running Bash commands in Python explains the changes I made in more detail.
However, there is no good reason to use an external utility for this; Python's requests library does pretty much everything wget does, often more naturally and with more control over what exactly it does.
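For instance, a minimal sketch of a wget-style download with requests (the URL and filename are placeholders):
import requests

# fetch the resource and save it to disk, roughly what wget would do
response = requests.get("https://example.com/file.txt")
response.raise_for_status()
with open("file.txt", "wb") as f:
    f.write(response.content)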

Unable to run shell commands with * using python subprocess module

I am not able to run any command that contains a * sign using the Python subprocess module.
I am making the call this way:
subprocess.Popen(
    'cp /etc/varnida_sys/* /tmp/bucket/'.split(),
    stdout=subprocess.PIPE).communicate()[0]
For this I am getting:
cp: cannot stat ‘/etc/varnida_sys/*’: No such file or directory
Why is this error occurring? There is one file inside /etc/varnida_sys (genders).
My investigation suggests that a wildcard like * needs some special handling; I am getting errors in all commands that contain *.
PS: I am not getting errors when I run the same command through paramiko from a remote host.
* is only understood by a shell (which expands it to a list of files), so you need to pass shell=True to Popen(). Also, there's no need to split the command; you can use a string:
subprocess.Popen("cp /etc/varnida_sys/* /tmp/bucket/",
stdout=subprocess.PIPE, shell=True).communicate()[0]
As @triplee has suggested, it's better to use a convenience wrapper for this task, e.g. subprocess.call():
subprocess.call("cp /etc/varnida_sys/* /tmp/bucket/", shell=True)

pexpect.run cannot run a long command

I am using pexpect.run to execute a command. See below:
cmd = "grep -L killed /dir/dumps/*MAC-66.log"
output = pexpect.run(cmd)
When I run this, output equals:
grep: /dir/dumps/*MAC-66.log: No such file or directory
But when I run the same command in my shell, it works every time. I don't see the problem. Any help is appreciated! Does pexpect.run require the command to be split in some fancy way?
Your shell is interpreting the glob, pexpect is not. You could either use python's glob.glob() function to evaluate the glob yourself, or run it through your shell, for example:
cmd = "bash -c 'grep -L killed /dir/dumps/*MAC-66.log'"
Also, if all you're after is the output of this command, you ought to check out the subprocess module.
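A sketch of the glob.glob() route (assuming the matched filenames contain no spaces):
import glob
import pexpect

# let Python expand the wildcard, then hand grep the real filenames
files = glob.glob("/dir/dumps/*MAC-66.log")
output = pexpect.run("grep -L killed " + " ".join(files))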
