How to use multiple variables read from file with looping subprocess/popen - python

I am using Python to read two files on my Linux OS. One contains a single entry/number, 'DATE':
20111125
the other file contains many entries, 'TIME':
042844UTC
044601UTC
...
044601UTC
I am able to read the files and assign the proper variables. I would then like to use the variables to create folder paths, move files, etc., such as:
$PATH/20111125/042844UTC
$PATH/20111125/044601UTC
$PATH/20111125/044601UTC
and so on.
Somehow this doesn't work with multiple variables passed at once:
import subprocess, sys, os, os.path

DATEFILE = open('/Astronomy/Sorted/2-Scratch/MAPninox-DATE.txt', "r")
TIMEFILE = open('/Astronomy/Sorted/2-Scratch/MAPninox-TIME.txt', "r")

for DATE in DATEFILE:
    print DATE,
    for TIME in TIMEFILE:
        os.popen('mkdir -p /Astronomy/' + DATE + '/' TIME) # this line works for DATE only
        os.popen('mkdir -p /Astronomy/20111126/' + TIME) # this line works for TIME only
        subprocess.call(['mkdir', '-p', '/Astronomy/', DATE]), #THIS LINE DOESN'T WORK
Thanks!

I would suggest using os.makedirs (which does the same thing as mkdir -p) instead of subprocess or popen:
import sys
import os

DATEFILE = open(os.path.join(r'/Astronomy', 'Sorted', '2-Scratch', 'MAPninox-DATE.txt'), "r")
TIMEFILE = open(os.path.join(r'/Astronomy', 'Sorted', '2-Scratch', 'MAPninox-TIME.txt'), "r")

for DATE in DATEFILE:
    print DATE,
    for TIME in TIMEFILE:
        os.makedirs(os.path.join(r'/Astronomy', DATE, TIME))
        astrDir = os.path.join(r'/Astronomy', '20111126', TIME)
        try:
            os.makedirs(astrDir)
        except os.error:
            print "Dir %s already exists, moving on..." % astrDir
        # etc...
Then use shutil for any cp/mv/etc operations.
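For example, a minimal sketch (the source file names here are made up):
import shutil

# copy a file into one of the newly created date/time folders
shutil.copy('/Astronomy/incoming/frame0001.fits', '/Astronomy/20111125/042844UTC/')
# or move it there instead of copying
shutil.move('/Astronomy/incoming/frame0002.fits', '/Astronomy/20111125/044601UTC/')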
From the os Docs:
os.makedirs(path[, mode])
Recursive directory creation function. Like mkdir(), but makes all
intermediate-level directories needed to contain the leaf directory.
Raises an error exception if the leaf directory already exists or
cannot be created. The default mode is 0777 (octal). On some systems,
mode is ignored. Where it is used, the current umask value is first
masked out.

I see a couple of errors in your code.
os.popen('mkdir -p /Astronomy/' + DATE + '/' TIME) # this line works for DATE only
This is a syntax error. I think you meant to have '/' + TIME, not '/' TIME. I'm not sure what you mean by "this line works for DATE only"?
subprocess.call(['mkdir', '-p', '/Astronomy/', DATE]), #THIS LINE DOESN'T WORK
What command do you expect to call? I'm guessing from the rest of your code that you're trying to execute mkdir -p /Astronomy/<<DATE>>. That isn't what you've coded though. Each item in the list you pass to subprocess.call is a separate argument, so what you've written comes out as mkdir -p /Astronomy <<DATE>>. This will attempt to create two directories, a root-level directory /Astronomy, and another one in the current working directory named whatever DATE is.
If I'm correct about what you wanted to do, the corrected line would be:
subprocess.call(['mkdir', '-p', '/Astronomy/' + DATE])
chown's answer using os.makedirs (and using os.path.join to splice paths, rather than string +) is a better general approach, in my opinion. But this is why your current code isn't working, as far as I can tell.
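To make the argument-splitting concrete, here is a toy sketch with a hard-coded date:
import subprocess

date = '20111125'
# two separate arguments: mkdir creates /Astronomy and ./20111125
subprocess.call(['mkdir', '-p', '/Astronomy/', date])
# one concatenated argument: mkdir creates /Astronomy/20111125
subprocess.call(['mkdir', '-p', '/Astronomy/' + date])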

Related

Subprocess.call() cwd cannot be set from text-loaded value

TL;DR
subprocess.call(cwd=filepath) does not work when I set the filepath variable from a text file, but does work when I set it manually using an identical path.
More Info
When I use subprocess.call I specify the cwd of the command with a string variable. When the string is manually defined, everything works how it should. However, I want to load the cwd path from a value within a text file. I have that part nailed down as well, and I am loading the correct value from the text file. When cwd=filepath and filepath is set to the string value loaded in from the text file, I get a NotADirectoryError: [WinError 267] The directory name is invalid. Keep in mind that if I set the variable manually to the exact same path, I do not get this error. I think this is some kind of formatting issue, and I've played around with it/looked around the internet for a few days, but haven't found a working solution.
Full Code
import subprocess  # to run the process.
import pathlib  # to get the path of the file.

programpath = str(pathlib.WindowsPath(__file__).parent.absolute())
blenderfilepath = 'C:/Program Files/Blender Foundation/Blender 2.81/'
settingsfile = 'settings'

# Load in the variables from settings.
def Load():
    global blenderfilepath
    # look inside settings file for settings.
    sf = open(programpath + '\\' + settingsfile, 'r')
    for line in sf:
        if 'BPL' in line:
            bfp = line.split('-', maxsplit=1)
            blenderfilepath = str(pathlib.Path(bfp[1]))
            print('Path loaded for Blender: ' + blenderfilepath)
        else:
            print('Using default config...')
            return
    sf.close()
    print('Settings loaded')

# Run next job executes the command to run the next job.
def RunNextJob():
    print('Running next job...')
    print(blenderfilepath)
    currentjob = subprocess.call('blender.exe', cwd=blenderfilepath, shell=True, stdout=subprocess.PIPE)

RunNextJob()
Additional Information and Thanks!
Initially, I was just pulling the string out of the file with no pathlib element. I have also tried using pathlib without converting it to a string. This is worth mentioning.
For additional context, the "settings" file contains one line:
BPL-C:/Program Files/Blender Foundation/Blender 2.81/
It is parsed through to extract the path. I have validated that the path is extracted correctly.
Any help is appreciated. Thanks!
For anyone else with this same issue, add .rstrip() to the string you read in. It strips off line endings and other sometimes-invisible characters.
The updated line reads:
blenderfilepath = str(pathlib.Path(bfp[1].rstrip()))
Thank you to jasonharper for the help!
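A minimal sketch of the underlying issue, assuming the trailing newline read from the settings file is what Windows rejects:
import subprocess

raw = 'C:/Program Files/Blender Foundation/Blender 2.81/\n'  # as read from the file
clean = raw.rstrip()                                         # newline stripped
# cwd=raw is what produced the NotADirectoryError in the question; cwd=clean works
subprocess.call('blender.exe', cwd=clean, shell=True)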

Execute external command and exchange variable using Python

1. Introduction
I have a bunch of files in netCDF format.
Each file contains the meteorological conditions of some location over a different period (hourly data).
I need to extract the first 12 h of data from each file, so I chose NCO (netCDF Operators) for the job.
NCO is run from the terminal. With ncks -d Time 0,11 input.nc output.nc, I get a data file called output.nc which contains the first 12 h of data from input.nc.
2. My attempt
I want to keep the whole process inside my IPython notebook, but I am stuck on two points:
How to execute a terminal command in a Python loop
How to pass a Python string into that terminal command
Here is my fake code as an example:
files = os.listdir('.')
for file in files:
    filename, extname = os.path.splitext(file)
    if extname == '.nc':
        output = filename + "_0-12_" + extname
        ## The code below was my attempt
        !ncks -d Time 0,11 file output
3. Conclusion
Basically, my goal is to make the fake code !ncks -d Time 0,11 file output actually work. That means:
executing the netCDF operator directly in a Python loop...
...using filename, which is a string in the Python environment.
Sorry for my unclear question. Any advice would be appreciated!
You can use subprocess.call to execute an external program:
import glob
import os
import subprocess

for fn in glob.iglob('*.nc'):
    filename, extname = os.path.splitext(fn)
    output_fn = filename + "_0-12_" + extname
    output = subprocess.call(['ncks', '-d', 'Time', '0,11', fn, output_fn])
    print(output)
NOTE: updated the code to use glob.iglob; you don't need to check extension manually.
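If you also want to capture what ncks writes to stdout (and get an exception on failure), subprocess.check_output is a drop-in alternative; a sketch with the same arguments:
# raises CalledProcessError on a non-zero exit status and returns stdout as bytes
out = subprocess.check_output(['ncks', '-d', 'Time', '0,11', fn, output_fn])
print(out)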
You may also check out pynco, which wraps NCO with subprocess calls, similar to @falsetru's answer. Your application may look something like:
import glob
import os
from nco import Nco

nco = Nco()
for fn in glob.iglob('*.nc'):
    filename, extname = os.path.splitext(fn)
    output_fn = filename + "_0-12_" + extname
    nco.ncks(input=fn, output=output_fn, dimension='Time 0,11')

Python how to pass in an optional filename parameter when running a script to process a file

I have a python script that processes an XML file each day (it is transferred via SFTP to a remote directory, then temporarily copied to a local directory) and stores its information in a MySQL database.
One of my parameters for the file is set to "date=today" so that the correct file is processed each day. This works fine and each day I successfully store new file information into the database.
What I need help on is passing a Linux command line argument to run a file for a specific day (in case a previous day's file needs to be rerun). I can manually edit my code to make this work but this will not be an option once the project is in production.
In addition, I need to be able to pass in a command line argument for "date=*" and have the script run every file in my remote directory. Currently, this parameter will successfully process only a single file based on alphabetic priority.
If my two questions should be asked separately, my mistake, and I'll edit this question to just cover one of them. Example of my code below:
today = datetime.datetime.now().strftime('%Y%m%d')
file_var = local_file_path + connect_to_sftp.sftp_get_file(
    local_file_path=local_file_path,
    sftp_host=sftp_host,
    sftp_username=sftp_username,
    sftp_directory=sftp_directory,
    date=today)
ET = xml.etree.ElementTree.parse(file_var).getroot()

def parse_file():
    for node in ET.findall(.......)
In another module:
def sftp_get_file(local_file_path, sftp_host, sftp_username, sftp_directory, date):
    pysftp.Connection(sftp_host, sftp_username)
    # find file in remote directory with given suffix
    remote_file = glob.glob(sftp_directory + '/' + date + '_file_suffix.xml')
    # strip directory name from full file name
    file_name_only = remote_file[0][len(sftp_directory):]
    # set local path to hold new file
    local_path = local_file_path
    # combine local path with filename that was loaded
    local_file = local_path + file_name_only
    # pull file from remote directory and send to local directory
    shutil.copyfile(remote_file[0], local_file)
    return file_name_only
So the SFTP module reads the file, transfers it to the local directory, and returns the file name to be used in the parsing module. The parsing module passes in the parameters and does the rest of the work.
What I need to be able to do, on certain occasions, is override the parameter that says "date=today" and instead say "date=20151225", for example, but I must do this through a Linux command line argument.
In addition, if I currently enter the parameter of "date=*" it only runs the script for the first file that matches that parameter. I need the script to run for ALL files that match that parameter. Any help is much appreciated. Happy to answer any questions to improve clarity.
You can use the sys module and pass the filename as a command line argument.
That would be:
import sys
today = str(sys.argv[1]) if len(sys.argv) > 1 else datetime.datetime.now().strftime('%Y%m%d')
If a value is given as the first argument, today will be the date string given on the command line; otherwise, if no argument is given, it falls back to the current date.
For the second question,
file_name_only = remote_file[0][len(sftp_directory):]
You are only accessing the first element, but glob might return several files when you use the * wildcard. You must iterate over the remote_file list and copy all of them.
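A sketch of that change, written as a hypothetical sftp_get_files variant of the function above that returns a list of copied file names (the SFTP connection handling is omitted):
import glob
import shutil

def sftp_get_files(local_file_path, sftp_directory, date):
    # copy every file matching the pattern, not just remote_file[0]
    copied = []
    for remote_path in glob.glob(sftp_directory + '/' + date + '_file_suffix.xml'):
        file_name_only = remote_path[len(sftp_directory):]
        shutil.copyfile(remote_path, local_file_path + file_name_only)
        copied.append(file_name_only)
    return copied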
You can use argparse to consume command line arguments. You will have to check whether a specific date was passed and use it instead of the current date:
if args.date_to_run:
    today = args.date_to_run
else:
    today = datetime.datetime.now().strftime('%Y%m%d')
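This assumes a parser has already been defined; a minimal sketch of that setup, using a hypothetical --date flag:
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--date', dest='date_to_run', default=None,
                    help="date to process, e.g. 20151225, or '*' for all files")
args = parser.parse_args()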
For the second part of your question you can use something like fnmatch (https://docs.python.org/2/library/fnmatch.html) to match multiple files based on a pattern.
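For instance, a toy sketch (the file names are made up):
import fnmatch

names = ['20151225_file_suffix.xml', '20151226_file_suffix.xml', 'notes.txt']
matches = fnmatch.filter(names, '*_file_suffix.xml')
print(matches)  # ['20151225_file_suffix.xml', '20151226_file_suffix.xml']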

Subprocess doesn't open the correct directory

I'm running a script which prompts the user to select a directory, saves a plot to that directory and then uses subprocess to open that location:
root = Tkinter.Tk()
dirname = tkFileDialog.askdirectory(parent=root,initialdir="/",title='Please select a directory')
fig.savefig(dirname+'/XXXXXX.png',dpi=300)
plt.close("all")
root.withdraw()
subprocess.Popen('explorer dirname')
When I run the file I select a sub-directory in D:\Documents and the figure saves correctly. However, the subprocess simply opens D:\Documents as opposed to D:\Documents\XXX.
Ben
To open a directory with the default file explorer:
import webbrowser
webbrowser.open(dirname) #NOTE: no quotes around the name
It might use os.startfile(dirname) on Windows.
If you want to call explorer.exe explicitly:
import subprocess
subprocess.check_call(['explorer', dirname]) #NOTE: no quotes
dirname is a variable. 'dirname' is a string literal that has no relation to the dirname name.
You are only passing the string 'dirname' not the variable that you have named dirname in your code. Since you (presumably) don't have a directory called dirname on your system, explorer opens the default (Documents).
You may also have a problem with / vs \ in directory names. As shown in the comments, use the os.path module to convert to the required form.
You want something like
import os
win_dir = os.path.normpath(dirname)
subprocess.Popen('explorer "%s"' %win_dir)
or
import os
win_dir = os.path.normpath(dirname)
subprocess.Popen(['explorer', win_dir])
Add , shell=True after 'explorer dirname'.
If shell is not set to True, then the command you want to run must be passed in list form (so it would be ['explorer', dirname]). You can also use shlex, which helps a lot if you don't want to set shell=True and don't want to deal with lists.
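For example, a small sketch of shlex.split building the list form for you (the path is made up):
import shlex
import subprocess

cmd = shlex.split('explorer "D:/Documents/some folder"')
print(cmd)  # ['explorer', 'D:/Documents/some folder']
subprocess.Popen(cmd)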
Edit: Ah, I misread the question. Often you need a direct path to the directory, so that may help.

In Paraview, how to change the datafile filename of the state to create a snapshot from a given datafile and state?

I am currently able to visualize a .vtp file correctly in ParaView for each time step of a simulation, and to print a screenshot of each. I want to do that in batch, keeping the same state for each one (viewpoint, filters applied, etc.). I have already saved the state into a .pvsm file, and I tried to write a Python script which, after being run by pvbatch, will (hopefully) print the screenshots. Unfortunately, it is not working. I tried to change the filename in the state by processing the state file and doing a search and replace, but it is still not working. For instance, it keeps plotting the first data input only, even if the current file is different (although GetSources() shows an always-increasing list of sources). I use ParaView 3.14.0 on Snow Leopard. I am sure this is easy, but I am overwhelmed by the large amount of information about Python and ParaView with no reference to this particular issue. Any advice is greatly welcome, and I am sorry if this has been answered previously (I looked at Google, the ParaView mailing list, and here). Below is my script, which can also be found at http://pastebin.com/4xiLNrS0 . Furthermore, you can find some example files and a state at http://goo.gl/XjPpE .
#!/bin/python
import glob, string, os, commands
from paraview.simple import *
#help(servermanager)

# vtp files are inside the local subdir DISPLAY
files = (commands.getoutput("ls DISPLAY/data-*.vtp | grep -v contacts")).split()

# process each file
for filename in files:
    fullfn = commands.getoutput("ls $PWD/" + filename)
    fn = filename.replace('DISPLAY/', '')
    #os.system("cp ../dem_git/addons/paraview_state.pvsm tmp.pvsm")
    os.system("cp ~/Desktop/state.pvsm tmp.pvsm")
    os.system("sed -i.bck 's/DATA.vtp/" + fullfn.replace('/','\/') + "/1' tmp.pvsm") # replace first instance with full path
    os.system("sed -i.bck 's/DATA.vtp/" + fullfn.replace('/','\/') + "/1' tmp.pvsm") # replace second instance with full path
    os.system("sed -i.bck 's/DATA.vtp/" + fn + "/1' tmp.pvsm") # replace third with just the filename
    servermanager.LoadState("tmp.pvsm")
    pm = servermanager.ProxyManager()
    reader = pm.GetProxy('sources', fullfn)
    reader.FileName = fullfn
    reader.FileNameChanged()
    reader.UpdatePipeline()
    view = servermanager.GetRenderView()
    SetActiveView(view)
    view.StillRender()
    WriteImage(filename.replace(".vtp", ".png"))
    os.system("rm -f tmp.pvsm")
    os.system("rm -f tmp.pvsm.bck")
    Delete(reader)
I realise this is an old question, but I had exactly the same problem recently and couldn't find any answers either. All you need to do is add Delete(view) after Delete(reader) for your script to work.
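In other words, the end of the per-file loop in the script above would become (a sketch, everything else unchanged):
    WriteImage(filename.replace(".vtp", ".png"))
    os.system("rm -f tmp.pvsm")
    os.system("rm -f tmp.pvsm.bck")
    Delete(reader)
    Delete(view)  # also release the render view so the next LoadState starts clean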
