I am attempting to write a program that loops through a bunch of files in .pdb format and converts them to .pdbqt format using a module called prepare_ligand4.py. I am fairly certain the script is correct up to the point where prepare_ligand4.py is called, but once it reaches that point all that happens is that WordPad pops up showing the source code of prepare_ligand4.py. It should be modifying the files in the indicated directory instead. Does anyone have any advice on what I should be doing? Is there a special way I need to call prepare_ligand4.py?
#convert pdb files to pdbqt
import os
import sys
#change directory to directory containing pdb files
os.chdir('C:\\Users\\Collin\\Documents\\fragments.pdb')
#path to pdb files
path = 'C:\\Users\\Collin\\Documents\\fragments.pdb'
dirs = os.listdir(path)
#finding number of pdb files in the directory
x = len(dirs)
#loop through all files in directory and convert to pdbqt
for i in range(x):
    y = dirs[i]
    os.system('C:\\Python27\\MGLTools-1.5.6\\Lib\\site-packages\\AutoDockTools\\Utilities24\\prepare_ligand4.py -l y -v')
    ligand_pdbqt = y[:-4]+".pdbqt"
    #os.rename(os.path.join ('C:\\Users\\Collin\\Documents\\fragments_under_150.pdb',y), os.path.join('C:\\Users\\Documents\\pdbqt', ligand_pdbqt)
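For what it's worth, a minimal sketch of how the call might look if the script is run explicitly through the Python interpreter (so Windows does not fall back on the .py file association) and the real filename is passed instead of the literal string 'y'; the interpreter path is an assumption:
import os
import subprocess

path = 'C:\\Users\\Collin\\Documents\\fragments.pdb'
script = 'C:\\Python27\\MGLTools-1.5.6\\Lib\\site-packages\\AutoDockTools\\Utilities24\\prepare_ligand4.py'
python_exe = 'C:\\Python27\\python.exe'  # assumed interpreter location

for y in os.listdir(path):
    if y.endswith('.pdb'):
        # build the argument list so the actual filename is passed, not the letter y
        subprocess.call([python_exe, script, '-l', os.path.join(path, y), '-v'])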
I'm still very new to Python, so I'm trying to apply it to my own situation for some experience.
One useful program is one that deletes files from a directory by file type:
import os
target = "H:\\documents\\"
for x in os.listdir(target):
    if x.endswith(".rtf"):
        os.unlink(target + x)
Taking this program, I have tried to expand it to delete .ost files in every local profile:
import os
list = []
folder = "c:\\Users"
for subfolder in os.listdir(folder):
    list.append(subfolder)
ost_folder = "c:\\users\\%s\\AppData\\Local\\Microsoft\\Outlook"
for users in list:
    ost_list = os.listdir(ost_folder % users)
    for file in ost_list:
        if file.endswith(".txt"):
            print(file)
This should print the file names, but instead it raises an error saying the directory cannot be found.
Not every folder under C:\Users will have an AppData\Local\Microsoft\Outlook subdirectory. There are typically hidden directories there (which you may not see in Windows Explorer) that don't correspond to a real user and have never run Outlook, so they don't have that folder at all, yet they are still returned by os.listdir. When you call os.listdir on such a directory, it dies with the exception you're seeing. Skip the directories that don't have it. The simplest way to do so is to have the glob module do the work for you (which avoids the need for your first loop entirely):
import glob
import os
for folder in glob.glob(r"c:\users\*\AppData\Local\Microsoft\Outlook"):
    for file in os.listdir(folder):
        if file.endswith(".txt"):
            print(os.path.join(folder, file))
You can simplify it even further by pushing all the work to glob:
for txtfile in glob.glob(r"c:\users\*\AppData\Local\Microsoft\Outlook\*.txt"):
    print(txtfile)
Or use the more modern, object-oriented pathlib alternative:
import pathlib

for txtfile in pathlib.Path(r'C:\Users').glob(r'*\AppData\Local\Microsoft\Outlook\*.txt'):
    print(txtfile)
I have a script that runs on a folder to create contour lines. Since I have roughly 2,700 DEMs that need to be processed, I need a way for the script to run on all folders within the parent folder, saving the results to an output folder. I am not sure how to script this, but I would greatly appreciate some guidance.
The following is the script I currently have, which works on a single folder.
import arcpy
from arcpy import env
from arcpy.sa import *
env.workspace = "C:/DATA/ScriptTesting/test"
inRaster = "1km17670"
contourInterval = 5
baseContour = 0
outContours = "C:/DATA/ScriptTesting/test/output/contours5.shp"
arcpy.CheckOutExtension("Spatial")
Contour(inRaster,outContours, contourInterval, baseContour)
You're probably looking for os.walk(), which recursively walks through all subdirectories of a given directory. You can either use the current working directory or compute your own parent folder and start from there; either way, it gives you the filenames for everything beneath the starting point. From there, you can write a subroutine to decide whether or not to run your script on each file; a sketch of the idea follows.
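A minimal sketch of that idea, assuming the DEMs show up as files with a recognizable extension (the root path and the matching rule are placeholders to adapt):
import os

root_dir = "C:/DATA/ScriptTesting"  # assumed parent folder holding the DEM folders

def should_process(filename):
    # placeholder rule; replace with however your DEM rasters are identified
    return filename.lower().endswith(".tif")

for dirpath, dirnames, filenames in os.walk(root_dir):
    for name in filenames:
        if should_process(name):
            raster_path = os.path.join(dirpath, name)
            print(raster_path)  # run the contour step on raster_path here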
You can get a list of all directories like this:
import arcpy
from arcpy import env
from arcpy.sa import *
import os
# pass in your root directory here
directories = os.listdir(root_dir)
Then you can iterate over these directories:
for directory in directories:
    full_path = os.path.join(root_dir, directory)
    # skip anything that is not a subfolder (os.listdir also returns plain files)
    if not os.path.isdir(full_path):
        continue
    # I assume you want the workspace attribute set to the subfolders
    env.workspace = full_path
    inRaster = "1km17670"
    contourInterval = 5
    baseContour = 0
    # here you need to adjust the output file name if there is a file for every subdir
    outContours = "C:/DATA/ScriptTesting/test/output/contours5.shp"
    arcpy.CheckOutExtension("Spatial")
    Contour(inRaster, outContours, contourInterval, baseContour)
As #a625993 mentioned, os.walk could also be useful if you have recursively nested directories. But as I read your question, you have single subdirectories that directly contain the files, with no further nesting, so listing just the directories underneath your root directory should be enough.
I am trying to write my first Python script, below. I want to search through a read-only archive on an HPC, looking inside zip files that sit in folders alongside a variety of other folder and file types. If a zip contains a .kml file, I want to print the line in it starting with the string <coordinates>.
import zipfile as z
kfile = file('*.kml') #####breaks here#####
folderpath = '/neodc/sentinel1a/data/IW/L1_GRD/h/IPF_v2/2015/01/21' # folder with multiple folders and .zips
for zipfile in folderpath: # am only interested in the .kml files within the .zips
    if kfile in zipfile:
        with read(kfile) as k:
            for line in k:
                if '<coordinates>' in line: # only want the coordinate line
                    print line # print the coordinates
            k.close()
Eventually I want to loop this through multiple folders rather than pointing at one exact folder location, i.e. loop through every subfolder in /neodc/sentinel1a/data/IW/L1_GRD/h/IPF_v2/2015/, but this is a starting point for me to try to understand how Python works.
I am sure there are many problems with this script before it will run, but the one I currently have is:
kfile = file('*.kml')
IOError: [Errno 22] invalid mode ('r') or filename: '*.kml'
Process finished with exit code 1
Any help getting this simple script working is appreciated.
When you run:
kfile = file('*.kml')
You are trying to open a single file named exactly *.kml, which is not what you want. If you want to process all *.kml files, you will need to (a) get a list of matching files and then (b) process the files in that list.
There are a number of ways to accomplish the above; the easiest is probably the glob module, which can be used something like this:
import glob
for kfilename in glob.glob('*.kml'):
    print kfilename
However, if you are trying to process a directory tree, rather than a single directory, you may instead want to investigate the os.walk function. From the docs:
Generate the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at directory top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames).
A simple example might look something like this:
import os
for root, dirs, files in os.walk('topdir/'):
    kfilenames = [fn for fn in files if fn.endswith('.kml')]
    for kfilename in kfilenames:
        print kfilename
Additional commentary
Iterating over strings
Your script has:
for zipfile in folderpath:
That will simply iterate over the characters in the string folderpath. E.g., the output of:
folderpath = '/neodc/sentinel1a/data/IW/L1_GRD/h/IPF_v2/2015/01/21'
for zipfile in folderpath:
    print zipfile
Would be:
/
n
e
o
d
c
/
s
e
n
t
i
n
e
l
1
a
/
...and so forth.
read is not a context manager
Your code has:
with read(kfile) as k:
There is no read built-in, and the .read method on files cannot be used as a context manager.
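For completeness, a minimal Python 2 sketch (matching the print statements above) of how the zip members could be opened with real context managers; the folder path comes from the question, and the .zip and .kml matching rules are assumptions:
import os
import zipfile

folderpath = '/neodc/sentinel1a/data/IW/L1_GRD/h/IPF_v2/2015/01/21'

for name in os.listdir(folderpath):
    if not name.endswith('.zip'):
        continue
    # ZipFile and the opened member are both usable as context managers
    with zipfile.ZipFile(os.path.join(folderpath, name)) as zf:
        for member in zf.namelist():
            if member.endswith('.kml'):
                with zf.open(member) as k:
                    for line in k:
                        if '<coordinates>' in line:
                            print line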
KML is XML
You're looking for "lines beginning with <coordinates>", but KML files are not line-based. An entire KML document could be a single line and it would still be valid.
You are much better off using an XML parser to parse XML.
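For illustration, a hedged sketch of what parser-based extraction could look like using the standard library's xml.etree.ElementTree; matching on the local tag name side-steps the KML namespace, which is a simplifying assumption:
import xml.etree.ElementTree as ET

def print_coordinates(kml_file):
    # accepts a filename or a file-like object, e.g. a member opened from a zip
    tree = ET.parse(kml_file)
    for elem in tree.iter():
        # match the local tag name so the KML namespace prefix does not matter
        if elem.tag.endswith('coordinates'):
            print elem.text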
I recently started using imports to better organize my code in Python. My original code in file1.py used the following:
def foo():
    files = [f for f in os.listdir('.') if os.path.isfile(f)]
    print files
    #do stuff here....
This referenced all the files in the same folder as the code, and print files showed the correct output as a list of filenames.
However, I recently changed the directory structure to something like this:
./main.py
./folder1/file1.py
./folder1/data_file1.csv
./folder1/data_file2.csv
./folder1/......
And in main.py, I use:
import imp
file1 = imp.load_source('file1', "./folder1/file1.py")
.
.
.
file1.foo()
Now, files is an empty array. What happened? I have tried absolute filepaths in addition to relative. Directly declaring an array with data_file1.csv works, but I can't get anything else to work with this import.
What's going on here?
When you call os.listdir('.'), you are listing the contents of '.' (the current working directory), but that is not necessarily the directory in which the script resides; it is the directory you were in when you ran the script (unless you call os.chdir() inside the Python script).
You should not depend on the current working directory. Instead, use __file__ to get the path of the current script, and then os.path.dirname() to get the directory in which the script resides.
Then use os.path.join() to join the names you get from os.listdir() with the script's directory, and check whether those paths are files to build your list of files.
Example -
def foo():
    filedir = os.path.dirname(__file__)
    files = [f for f in (os.path.join(filedir, fil) for fil in os.listdir(filedir)) if os.path.isfile(f)]
    print files
    #do stuff here....
I have a Python script that reads through a text csv file and creates a playlist file. However, I can only do one at a time, like:
python playlist.py foo.csv foolist.txt
However, I have a directory of files that need to be made into a playlist, with different names, and sometimes a different number of files.
So far I have looked at creating a txt file with a list of all the file names in the directory and then looping through each line of that, but I know there must be an easier way to do it.
for f in *.csv; do
    python playlist.py "$f" "${f%.csv}list.txt"
done
Will that do the trick? This will put foo.csv in foolist.txt and abc.csv in abclist.txt.
Or do you want them all in the same file?
Just use a for loop with the asterisk glob, making sure you quote things appropriately for spaces in filenames:
for file in *.csv; do
    python playlist.py "$file" >> outputfile.txt;
done
Is it a single directory, or nested?
Ex.
topfile.csv
topdir
  --dir1
      --file1.csv
      --file2.txt
  --dir2
      --file3.csv
      --file4.csv
For nested, you can use os.walk(topdir) to get all the files and dirs recursively within a directory.
You could set up your script to accept dirs or files:
python playlist.py topfile.csv topdir
import sys
import os

def main():
    files_toprocess = set()
    paths = sys.argv[1:]
    for p in paths:
        if os.path.isfile(p) and p.endswith('.csv'):
            files_toprocess.add(p)
        elif os.path.isdir(p):
            for root, dirs, files in os.walk(p):
                files_toprocess.update([os.path.join(root, f)
                                        for f in files if f.endswith('.csv')])
    # hand each collected csv to your playlist code here
    for csvfile in sorted(files_toprocess):
        print(csvfile)

if __name__ == '__main__':
    main()
If you have a directory name, you can use os.listdir:
os.listdir(dirname)
If you want to select only a certain type of file, e.g., only .csv files, you could use the glob module, as sketched below.
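A minimal sketch of that glob approach (the directory name here is a placeholder):
import glob
import os

dirname = 'topdir'  # placeholder directory name
for csvfile in glob.glob(os.path.join(dirname, '*.csv')):
    print(csvfile)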