I'm building out a powershell script that does quite a bit of work extracting csv's from a zip, converting OS->lat/lon, stuffing the contents robustly into a DB, and then emailing a distribution list with stats on the whole process.
Most of this is now complete, but to make the whole thing a little more portable I'm providing paths to input/working/output folders as parameters of the powershell call from a batch file.
This is all working fantastically until I need to call python scripts to do the lat/lon work, as passing in the variable parameter paths doesn't seem to work with any permutation/combination.
The following is a simplified version of the python path within the .PS1 script which works from both command prompt and from within the .PS1 file if called directly (where -i -o are input/output path parameters).
c:\python27\python.exe D:\PythonPPC\subs.py -i D:\PPC\subs_export.csv -o D:\PPC\subs_export_lat_lon.csv
In my script I would like to replace the two path parameters -i/-o with variables something like:
c:\python27\python.exe D:\PythonPPC\subs.py -i $inputPathsubs_export.csv -o $outputPathsubs_export_lat_lon.csv
Does anyone have any idea on how to invoke this command as I've tried the &$exe method described on stack and a few other places, but this simply results in the error shown below:
CommandNotFoundException + FullyQualifiedErrorId : CommandNotFoundException
Any help would be greatly appreciated.
I'd recommend using the Join-Path cmdlet for building paths. It will save you the headache of keeping track of leading/trailing path separators and also canonicalize / to \.
$workingFolder = 'C:\some/where\'
$extractFolder = '\extract\folder'
$infileName = '/subs_export.csv'
$outfileName = '\subs_export_lat_lon.csv'
$infile = Join-Path (Join-Path $workingFolder $extractFolder) $infileName
$outfile = Join-Path (Join-Path $workingFolder $extractFolder) $outfileName
Result:
PS C:\> $infile
C:\some\where\extract\folder\subs_export.csv
PS C:\> $outfile
C:\some\where\extract\folder\subs_export_lat_lon.csv
If you still want to simply concatenate strings to a path, you can separate variables from trailing strings by putting the variable name between curly brackets. In your example:
python.exe D:\PythonPPC\subs.py -i "${inputPath}subs_export.csv" ...
or
python.exe D:\PythonPPC\subs.py -i "${rootFolder}${subFolder}subs_export.csv" ...
After attacking the problem from a different direction and a colleague pointing out my own abject stupidity, it seems the following is a fairly straightforward solution to the complicated problem above.
$inputFilePath = $pathToWorkingFolder + $pathToExtractFolder + 'subs_export.csv'
$outputFilePath = $pathToWorkingFolder + $pathToExtractFolder +'subs_export_lat_lon.csv'
c:\python27\python.exe D:\PythonPPC\subs.py -i $inputFilePath -o $outputFilePath
Just needed some peer review I guess...
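For completeness, here is a minimal sketch of how the receiving Python script could read the two paths. The real subs.py is not shown in this thread, so the function name and the attribute names below are assumptions for illustration only; the -i and -o flags are the ones from the command line above.

```python
import argparse


def parse_args(argv=None):
    """Parse the -i/-o options used in the command line above.

    Hypothetical stand-in for the argument handling in subs.py.
    """
    parser = argparse.ArgumentParser(description="Convert OS grid refs to lat/lon")
    parser.add_argument("-i", dest="input_path", required=True,
                        help="path of the CSV file to read")
    parser.add_argument("-o", dest="output_path", required=True,
                        help="path of the CSV file to write")
    # argv=None makes argparse fall back to sys.argv[1:]
    return parser.parse_args(argv)
```

Because PowerShell expands $inputFilePath and $outputFilePath before python.exe starts, the script only ever sees the finished path strings.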
I'm sorry for asking a duplicate as this and this are very similar but with those answers I don't seem to be able to make my code work.
If I have a Jupyter notebook cell with:
some_folder = "/some/path/to/files/*.gif"
!for name in {some_folder}; do echo $name; done
The output I get is just {folder}
If I do
some_folder = "/some/path/to/files/*.gif"
!for name in "/some/path/to/files/*.gif"; do echo $name; done # <-- gives no newlines between file names
# or
!for name in /some/path/to/files/*.gif; do echo $name; done # <-- places every filename on its own line
My gif files are printed to screen.
So my question is: why does it not use my Python variable in the for loop?
Because the below, without a for loop, does work:
some_folder = "/some/path/to/files/"
!echo {some_folder}
Follow up question: I actually want my variable to just be the folder and add the wildcard only in the for loop. So something like this:
some_folder = "/some/path/to/files/"
!for name in {some_folder}*.gif; do echo $name; done
For context, later I actually want to rename the files in the for loop and not just print them. The files have an extra dot (not the one from the .gif extension) which I would like to remove.
There's an alternative way to use shell bash in a Jupyter cell with cell magic, see here. It seems to allow what you are trying to do.
If you already ran in a normal cell some_folder = r"/some/path/to/files/*.gif" or some_folder = "/some/path/to/files/*.gif", then you can try in a separate cell:
%%bash -s {some_folder}
for i in {1..5}; do echo $1; done
That said, what you seem to be trying to do with some_folder = "/some/path/to/files/*.gif" isn't going to work as such. If you try to pass "/some/path/to/files/*.gif" from Python to bash, it isn't going to work like passing /some/path/to/files/*.gif directly to bash. Bash doesn't pass an unquoted /some/path/to/files/*.gif directly to a command; it expands it and then passes the result. No such expansion happens when the string is passed in from Python. And there are other peculiarities you'll come across. Tar you can pass a Python list of files directly using the bracket notation and it will handle that.
The solutions are to either do more on the Python side or more on the shell side. Python has its own glob module, see here. You can combine that with os.system(). Python also has fnmatch, which is nice because you can use Unix-like file name matching. Plus there's shutil, which allows moving/renaming, see shutil.move(). In Python, os.remove(temp_file_name) can delete files. If you aren't working on a Windows machine, there's the sh module that makes things nice. See here and here.
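As an illustration of the "do more on the Python side" route, here is a rough sketch that handles the renaming mentioned in the question (removing the extra dot while keeping the .gif extension). The function name strip_extra_dots is made up for this example, and it assumes that simply deleting every dot in the stem is what you want.

```python
import glob
import os


def strip_extra_dots(folder, pattern="*.gif"):
    """Rename files in folder so that only the extension dot remains.

    Returns the list of new paths. Note: os.rename may silently overwrite
    an existing file on POSIX systems, so check for clashes if that matters.
    """
    renamed = []
    for old_path in sorted(glob.glob(os.path.join(folder, pattern))):
        stem, ext = os.path.splitext(os.path.basename(old_path))
        new_stem = stem.replace(".", "")
        if new_stem == stem:
            continue  # no extra dot in this name
        new_path = os.path.join(folder, new_stem + ext)
        os.rename(old_path, new_path)
        renamed.append(new_path)
    return renamed
```

In a notebook cell this would be called as strip_extra_dots(some_folder), with some_folder holding just the folder and the wildcard supplied by the pattern argument.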
I have a couple of files with similar names, and I want to get the latest one.
[root#xxx tmp]# ls -t1 abclog*
abclog.1957
abclog.1830
abclog.1799
abclog.1742
I can accomplish that by executing below command.
[root#xxx tmp]# ls -t1 abclog*| head -n 1
abclog.1957
But when I try to execute the same thing in Python, I get an error:
subprocess.check_output("ls -t1 abclog* | head -n 1",shell=True)
ls: cannot access abclog*: No such file or directory
''
It seems it is not able to recognize '*' as a special character. How can I achieve the same?
As others noted, your code should work. It probably fails because the current directory isn't the one you expect, so abclog* matches nothing there. The shell then leaves the pattern unexpanded (even with shell=True) and passes it as-is to ls, resulting in a "No such file or directory" error.
You have to pass the absolute path or use the cwd= parameter when calling check_output. Another nice pythonic alternative would be to avoid subprocess, and return the most recently modified file using only python code:
import glob, os
most_recent = max(glob.glob(os.path.join("path/to/file", "abclog*")), key=os.path.getmtime)
(using max with os.path.getmtime as key and glob.glob to filter the files)
Make sure you execute this in the directory where the files exist. If you just fire up Idle to run this code, you will not be in that directory.
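Both answers come down to not relying on the current directory. A small sketch of the pure-Python approach with the directory passed in explicitly might look like this (the function name and the None return for "no match" are choices made for this example):

```python
import glob
import os


def most_recent(directory, pattern="abclog*"):
    """Return the most recently modified file matching pattern, or None."""
    matches = glob.glob(os.path.join(directory, pattern))
    # max() raises ValueError on an empty list, so handle "no files" first
    return max(matches, key=os.path.getmtime) if matches else None
```

Called as most_recent("/tmp"), it does the job of ls -t1 abclog* | head -n 1 without depending on where the interpreter was started.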
I am trying to execute a python script on all text files in a folder:
for fi in sys.argv[1:]:
And I get the following error
-bash: /usr/bin/python: Argument list too long
The way I call this Python function is the following:
python functionName.py *.txt
The folder has around 9000 files. Is there some way to run this function without having to split my data in more folders etc? Splitting the files would not be very practical because I will have to execute the function in even more files in the future... Thanks
EDIT: Based on the selected correct reply and the comments of the replier (Charles Duffy), what worked for me is the following:
printf '%s\0' *.txt | xargs -0 python ./functionName.py
because I don't have a valid shebang..
This is an OS-level problem (limit on command line length), and is conventionally solved with an OS-level (or, at least, outside-your-Python-process) solution:
find . -maxdepth 1 -type f -name '*.txt' -exec ./your-python-program '{}' +
...or...
printf '%s\0' *.txt | xargs -0 ./your-python-program
Note that this runs your-python-program once per batch of files found, where the batch size is dependent on the number of names that can fit in ARG_MAX; see the excellent answer by Marcus Müller if this is unsuitable.
No. That is a kernel limitation for the length (in bytes) of a command line.
Typically, you can determine that limit by doing
getconf ARG_MAX
which, at least for me, yields 2097152 (bytes), which means about 2MB.
I recommend using python to work through a folder yourself, i.e. giving your python program the ability to work with directories instead of individual files, or to read file names from a file.
The former can easily be done using os.walk(...), whereas the second option is (in my opinion) the more flexible one. Use the argparse module to give your python program an easy-to-use command line syntax, then add an argument of a file type (see reference documentation), and python will automatically be able to understand special filenames like -, meaning you could instead of
for fi in sys.argv[1:]:
do
for fi in opts.file_to_read_filenames_from.read().split(chr(0)):
which would even allow you to do something like
find -iname '*.txt' -type f -print0|my_python_program.py -file-to-read-filenames-from -
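A minimal sketch of that second option could look like the following. The option spelling is taken from the command above; the function name and the filtering of empty entries are additions for this example (find -print0 ends its output with a NUL, so a plain split leaves an empty string at the end).

```python
import argparse


def read_filenames(argv=None):
    """Return the NUL-separated file names read from the given file or stdin."""
    parser = argparse.ArgumentParser()
    # argparse.FileType treats "-" as standard input
    parser.add_argument("-file-to-read-filenames-from",
                        dest="file_to_read_filenames_from",
                        type=argparse.FileType("r"), required=True)
    opts = parser.parse_args(argv)
    with opts.file_to_read_filenames_from as handle:
        data = handle.read()
    # Drop the empty string left behind by a trailing NUL
    return [name for name in data.split(chr(0)) if name]
```

The main loop then becomes for fi in read_filenames():, and the number of files is no longer limited by ARG_MAX.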
Don't do it this way. Pass the mask to your python script (e.g. call it as python functionName.py "*.txt") and expand it using glob (https://docs.python.org/2/library/glob.html).
I would consider using the glob module. With this module you invoke your program like:
python functionName.py "*.txt"
Then the shell will not expand *.txt into file names. Your Python program will receive *.txt in its argument list and you can pass it into glob.glob():
for fi in glob.glob(sys.argv[1]):
...
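If you want to accept more than one quoted mask, a small helper along these lines would do it (expand_patterns is a name invented for this sketch):

```python
import glob


def expand_patterns(patterns):
    """Expand each quoted shell-style pattern inside Python, not in the shell."""
    files = []
    for pattern in patterns:
        # sorted() gives a stable order; glob.glob itself returns arbitrary order
        files.extend(sorted(glob.glob(pattern)))
    return files
```

The loop would then read for fi in expand_patterns(sys.argv[1:]):, and the program is invoked as python functionName.py "*.txt" with the quotes kept.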
I need some help with this...
I have a program installed on my computer that I want to call to calculate some things and give me an output file...
In Matlab the command "dos()" does the job, and it also gives me the cmd screen output in Matlab.
I need this to work in Python, but I am doing something wrong.
data = 'file.csv -v'
db = r' -d D:\directory\bla\something.db'
anw = r'"D:\Program Files\bla\path\to\anw.exe"' + db + ' -i ' + data
"anw" output is this one:
>>> anw
'"D:\\Program Files\\bla\\path\\to\\anw.exe" -d D:\\directory\\bla\\something.db -i file.csv -v'
## without the "" it does not work either
import subprocess as sb
p= sb.Popen('cmd','/K', anw) ## '/C' does not work either
I get the following error message from cmd.exe inside the Python shell:
Windows cannot find "\"D:\Program Files\bla\path\to\anw.exe"" Make sure you typed the name correctly, and then try again.
This line runs when I make a .bat file out of it.
It runs in Matlab via "dos(anw)", so what is wrong here?
PS: I have blanks in my command... could this be the problem? I do not know where the first "\" comes from in the cmd.exe error message.
For now I created a .bat file with all the stuff cmd.exe should do in the specific directory where the input file lies...
I just had to tell Python to change directory with
import os
os.chdir(r"D:\working\directory")
os.system(r'D:\working\directory\commands.bat')
It works well and gives me the output of cmd directly in the Python shell.
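For what it's worth, subprocess.Popen expects the program and its arguments as one list (or one string), not as separate positional arguments, which is why the Popen('cmd', '/K', anw) call above cannot work. A sketch that skips the .bat file could look like this. The helper name run_tool is made up, and the paths in the usage note are simply the ones from the question.

```python
import subprocess


def run_tool(exe, args, workdir=None):
    """Run exe with a list of arguments and return (exit code, console output).

    Passing a list means paths with blanks need no extra quoting.
    """
    proc = subprocess.Popen([exe] + list(args),
                            cwd=workdir,               # replaces os.chdir
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,  # merge errors into output
                            universal_newlines=True)   # return text, not bytes
    output, _ = proc.communicate()
    return proc.returncode, output
```

Usage would be something like run_tool(r"D:\Program Files\bla\path\to\anw.exe", ["-d", r"D:\directory\bla\something.db", "-i", "file.csv", "-v"], workdir=r"D:\working\directory").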
I spent a few hours writing a little script.
Basically what it does is create a new text file and fills it up with whatever.
I zip the text file --using zipfile-- and here's where my problem lies.
I want to run the Windows system command:
copy /b "imgFile.jpg" + "zipFile.zip" newImage.jpg
To merge the image "imgFile.jpg" and the zip "zipFile.zip".
So:
os.system("copy /b \"imgFile.jpg\" + \"zipFile.zip\" newImage.jpg")
When I run my script, it all seems to go fine.
But when it's done and I try to extract the 'newImage.jpg' file, it gives me:
The archive is either in unknown format or damaged
This ONLY happens when I run the system command within the script.
It works fine when I use the shell. It even works if I use a separate script.
I've double checked my zip file. Everything is in good shape.
Is there something I'm doing wrong? Something I'm not seeing?
Have you tried using shutil?
import shutil
shutil.copy(src, dst)
There may be a problem with the way Python is passing the arguments to the shell command. Try using subprocess.call. This method takes the arguments as a list and passes them to the command individually. Because copy is a cmd.exe built-in rather than a standalone program, run it through cmd /c:
import subprocess
subprocess.call(["cmd", "/c", "copy", "/b", "imgFile.jpg", "+", "zipFile.zip", "newImage.jpg"])
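Another option is to skip the shell entirely and append the files in Python, which is all that copy /b a + b c does. This is only a sketch, and concat_files is a name invented here. One thing worth checking first, as a guess rather than a diagnosis: if the archive is written with zipfile in the same script, it has to be closed before it is copied, because the zip's central directory is only written on close().

```python
import shutil


def concat_files(sources, destination):
    """Write the bytes of each source file, in order, to destination.

    destination must not be one of the sources.
    """
    with open(destination, "wb") as out:
        for source in sources:
            with open(source, "rb") as chunk:
                shutil.copyfileobj(chunk, out)
```

For the files in the question this would be concat_files(["imgFile.jpg", "zipFile.zip"], "newImage.jpg").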