I am trying to write a script that collects values from .xvg files. I have 20 folders, each containing the target file. The folders are numbered 1-20 (in the code you see 1.Rimo).
I have already written code that collects the data when I specify the full path. However, I need something generic so I can loop through those 20 folders, get the data, and store it in a variable.
sum1 = 0.0  # running total; must exist before the loop adds to it
rmsf = open('/home/alispahic/1.CB1_project/12.ProductionRun/1.Rimo/rmsf.xvg','r+')
for line in rmsf:
    if line.startswith(' 4755'):
        print (line)
        l = line.split()
        print (l)
        value = float(l[1])
        sum1 = float(sum1) + value
        print(len(l))
print (sum1)
You can use os.listdir():
import os

base_path = '/home/alispahic/1.CB1_project/12.ProductionRun'
file_name = 'rmsf.xvg'
for dir_name in os.listdir(base_path):
    print(dir_name)
    with open(os.path.join(base_path, dir_name, file_name)) as f:
        for line in f:
            # here goes your code
            pass
Just remember to join the dir_name with the base_path (the path of the directory you are iterating over).
Also note that os.listdir() returns files as well, not just directories. If your folder /home/alispahic/1.CB1_project/12.ProductionRun contains only directories, that won't be a problem; otherwise you need to filter out the files.
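If the base folder might also hold plain files, a minimal sketch of that filtering step could look like the one below. The function name subdirs_with_file and the demo folder "2.Other" are made up for illustration; only os.listdir(), os.path.isdir() and os.path.isfile() come from the standard library.

```python
import os
import tempfile


def subdirs_with_file(base_path, file_name):
    """Return paths to file_name inside each immediate subdirectory of base_path."""
    found = []
    for entry in sorted(os.listdir(base_path)):
        dir_path = os.path.join(base_path, entry)
        if not os.path.isdir(dir_path):
            continue  # skip plain files sitting next to the folders
        candidate = os.path.join(dir_path, file_name)
        if os.path.isfile(candidate):
            found.append(candidate)
    return found


# Small demo on a throwaway directory tree (folder names are hypothetical).
with tempfile.TemporaryDirectory() as base:
    for name in ("1.Rimo", "2.Other"):
        os.mkdir(os.path.join(base, name))
        with open(os.path.join(base, name, "rmsf.xvg"), "w") as f:
            f.write(" 4755 0.1234\n")
    with open(os.path.join(base, "notes.txt"), "w") as f:
        f.write("not a folder\n")
    demo_paths = subdirs_with_file(base, "rmsf.xvg")
    print(demo_paths)
```

The stray notes.txt is skipped, so only the two rmsf.xvg paths come back.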
I have solved the problem by adding glob.
import glob

for name in glob.glob('/home/alispahic/1.CB1_project/12.ProductionRun/*/rmsf.xvg'):
    for line in open(name):
        if line.startswith(' 4755'):
            ...  # same processing as in the loop from the question
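For the "store it as a variable" part, here is a rough sketch that keeps one total per folder in a dict. It assumes the value of interest is the second whitespace-separated column, as in the question's code; collect_values and the demo folder "2.Other" are hypothetical names.

```python
import glob
import os
import tempfile


def collect_values(pattern, prefix):
    """Map each folder name to the sum of column 2 over lines starting with prefix."""
    values = {}
    for name in sorted(glob.glob(pattern)):
        folder = os.path.basename(os.path.dirname(name))
        with open(name) as f:
            for line in f:
                if line.startswith(prefix):
                    values[folder] = values.get(folder, 0.0) + float(line.split()[1])
    return values


# Demo with two made-up folders, each holding a tiny rmsf.xvg file.
with tempfile.TemporaryDirectory() as base:
    for folder, value in (("1.Rimo", "0.25"), ("2.Other", "0.5")):
        os.mkdir(os.path.join(base, folder))
        with open(os.path.join(base, folder, "rmsf.xvg"), "w") as f:
            f.write("# header\n 4754 0.10\n 4755 " + value + "\n")
    demo_values = collect_values(os.path.join(base, "*", "rmsf.xvg"), " 4755")
    print(demo_values)
```

Note that startswith(' 4755') would also match a line beginning with ' 47550', so a stricter comparison on the first column may be safer for longer files.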
For a reason that I cannot resolve, the "subdirs" variable in my first for loop becomes an "unused variable" when the loop is written inside a function, which results in an incomplete search of my directory. When the loop is not part of a function, "subdirs" is recognized as a variable and my code successfully searches the entire directory and performs the desired tasks. I'm relatively new to Python; please let me know how I can fix my function so that "subdirs" is recognized as a variable. Thanks a lot!
My For Loop With "Subdirs" Within a Function
import os

def function(rootdir, keyPhrases):
    path = rootdir # Enter the root directory you want to search from
    key_phrases = [keyPhrases] # Enter here the key phrases in the lines you hope to find
    # This for loop allows all sub directories and files to be searched
    for (path, subdirs, files) in os.walk(path):
        files = [f for f in os.listdir(path) if f.endswith('.txt') or f.endswith('.log')] # Specify here the format of files you hope to search from (ex: ".txt" or ".log")
        files.sort() # file is sorted list
        files = [os.path.join(path, name) for name in files] # Joins the path and the name, so the files can be opened and scanned by the open() function
        # The following for loop searches all files with the selected format
        for filename in files:
            # Opens the individual files and to read their lines
            with open(filename) as f:
                f = f.readlines()
            # The following loop scans for the key phrases entered by the user in every line of the files searched, and stores the lines that match into the "important" array
            for line in f:
                for phrase in key_phrases:
                    if phrase in line:
                        print(line)
                        break
    print("The end of the directory has been reached, if no lines are printed then that means the key phrase does not exist in the root directory you entered.")
My For Loop With "Subdirs" By Itself
import os

path = r"D:\(Chosen Root Directory)"
key_phrases = ["example_keyPhrase1"] # Enter here the key phrases in the lines you hope to find
# This for loop allows all sub directories and files to be searched
for (path, subdirs, files) in os.walk(path):
    files = [f for f in os.listdir(path) if f.endswith('.txt') or f.endswith('.log')] # Specify here the format of files you hope to search from (ex: ".txt" or ".log")
    files.sort() # file is sorted list
    files = [os.path.join(path, name) for name in files] # Joins the path and the name, so the files can be opened and scanned by the open() function
    # The following for loop searches all files with the selected format
    for filename in files:
        # Opens the individual files and to read their lines
        with open(filename) as f:
            f = f.readlines()
        # The following loop scans for the key phrases entered by the user in every line of the files searched, and stores the lines that match into the "important" array
        for line in f:
            for phrase in key_phrases:
                if phrase in line:
                    print(line)
                    break
print("The end of the directory has been reached, if no lines are printed then that means the key phrase does not exist in the root directory you entered.")
You never reference subdirs in your first code, so that's why you're getting the warning. However, that is not the reason your code doesn't work as you intended.
This line is the issue: files = [f for f in os.listdir(path) if f.endswith('.txt') or f.endswith('.log')]
os.listdir(path) lists the entries (both files and directories) directly inside that path [1]; it does not descend into the subdirectories. os.walk() already hands you that information on each iteration: 'subdirs' holds the directory names and 'files' holds the file names of the current directory. You don't need 'subdirs' here. The variable 'files' already has what you want, so you can change the line to this:
files = [f for f in files if f.endswith('.txt') or f.endswith('.log')]
[1] https://docs.python.org/3/library/os.html#os.listdir
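Putting that suggestion together, one possible shape for the function is sketched below. This is not the asker's exact code: find_lines, its extensions parameter, and the demo file names are made up, and it returns the matches instead of printing them so the result is easy to check. The unused directory list is named _subdirs, which is a common way to tell linters the variable is intentionally ignored.

```python
import os
import tempfile


def find_lines(rootdir, key_phrases, extensions=(".txt", ".log")):
    """Return (filename, line) pairs for lines containing any key phrase."""
    matches = []
    for path, _subdirs, files in os.walk(rootdir):
        for name in sorted(files):
            if not name.endswith(extensions):  # endswith accepts a tuple
                continue
            filename = os.path.join(path, name)
            with open(filename) as f:
                for line in f:
                    if any(phrase in line for phrase in key_phrases):
                        matches.append((filename, line.rstrip("\n")))
    return matches


# Demo tree: one match at the top level, one in a subfolder, one ignored .dat file.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    with open(os.path.join(root, "a.txt"), "w") as f:
        f.write("hello key\n")
    with open(os.path.join(root, "sub", "b.log"), "w") as f:
        f.write("nothing\nkey here\n")
    with open(os.path.join(root, "sub", "c.dat"), "w") as f:
        f.write("key\n")
    demo_matches = find_lines(root, ["key"])
    for filename, line in demo_matches:
        print(filename, line)
```

Passing the key phrases in as a list (rather than wrapping a single argument in a list inside the function) also avoids accidentally searching for a list inside a string.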
I created a script to see all the files in a folder and print the full path of each file.
The script is working and prints the output in the Command Prompt (Windows)
import os

root = 'C:\Users\marco\Desktop\Folder'
for path, subdirs, files in os.walk(root):
    for name in files:
        print os.path.join(path, name)
I now want to save the output to a txt file, so I edited the code to assign os.path.join(path, name) to a variable named values. But when I write the output, the script gives me an error:
import os

root = 'C:\Users\marco\Desktop\Folder'
for path, subdirs, files in os.walk(root):
    for name in files:
        values = os.path.join(path, name)
file = open('sample.txt', 'w')
file.write(values)
file.close()
Error below
file.write(values)
NameError: name 'values' is not defined
Try this:
file = open('sample.txt', 'w')
for path, subdirs, files in os.walk(root):
    for name in files:
        values = os.path.join(path, name)
        file.write(values+'\n')
file.close()
Note that file is a built-in name in Python 2 (which is shadowed here), so I suggest that you replace it with fileDesc or similar.
The problem is that values is only assigned inside the inner for loop. Python for loops do not create a new scope, but if the loop body never runs (for example, because os.walk() finds no files under root), the name is never created and file.write(values) raises NameError. So assign an initial value to the variable before you start iterating, such as values=None or, better yet, values=''. Even if the above code ran, you wouldn't get the output file you wanted: values is overwritten on every iteration, so after the loops it holds only the path of the last file encountered, and that alone would be written to sample.txt.
Another bad practice you seem to be following is using \ instead of \\ inside strings. This might come back to bite you later (if it hasn't already). A \ followed by certain characters starts an escape sequence, and \\ is the escape sequence for a literal backslash.
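To make the backslash point concrete, here is a small illustration. The path C:\temp\new is made up; it was chosen because \t and \n are real escape sequences. In Python 3 the original 'C:\Users\...' literal does not even compile, because \U starts a Unicode escape.

```python
escaped = 'C:\\temp\\new'   # each \\ is one literal backslash
raw = r'C:\temp\new'        # raw string: backslashes are kept as written
broken = 'C:\temp\new'      # \t becomes a tab and \n a newline

print(escaped == raw)       # the two safe spellings give the same string
print(len(raw), len(broken))
print(repr(broken))
```

A raw string (the r prefix) is often the least error-prone way to write Windows paths; forward slashes also work with most Python file functions on Windows.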
So here's a redesigned working sample code:
import os

root = 'C:\\Users\\marco\\Desktop\\Folder'
values=''
for path, subdirs, files in os.walk(root):
    for name in files:
        values = values + os.path.join(path, name) + '\n'
samplef = open('sample.txt', 'w')
samplef.write(values)
samplef.close()
In case you aren't familiar, '\n' is the escape sequence for a newline. Reading your output file would be quite tedious if all the paths had been written on the same line.
PS: I wrote the code with strings as that's what I would prefer in this scenario, but you may try lists or whatever else suits you. Just be sure to define the variable beforehand so that it exists even if the loop body never runs.
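For comparison, a list-based variant might look like the sketch below. The function name write_file_list and the demo file names are invented; the with statement closes the output file automatically.

```python
import os
import tempfile


def write_file_list(root, out_path):
    """Write the full path of every file under root to out_path, one per line."""
    paths = []  # defined before the loops, so it exists even if no files are found
    for path, _subdirs, files in os.walk(root):
        for name in files:
            paths.append(os.path.join(path, name))
    with open(out_path, "w") as out:
        for p in paths:
            out.write(p + "\n")
    return paths


# Demo: two files under a temporary "Folder", output written next to it.
with tempfile.TemporaryDirectory() as base:
    folder = os.path.join(base, "Folder")
    os.makedirs(os.path.join(folder, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(folder, rel), "w") as f:
            f.write("x\n")
    out_file = os.path.join(base, "sample.txt")
    demo_listed = write_file_list(folder, out_file)
    with open(out_file) as f:
        demo_text = f.read()
    print(demo_text)
```

Collecting paths in a list avoids repeated string concatenation and makes it easy to sort or filter them before writing.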
import os

root = 'C:\Users\marco\Desktop\Folder'
file = open('sample.txt', 'w')
for path, subdirs, files in os.walk(root):
    for name in files:
        values = os.path.join(path, name)
        file.write(values)
file.close()
values is only assigned inside the inner loop, so it may not be defined at the point where the original code writes to the file. Move the write inside the loop, as above, and it will work.
I have a list of strings (stored in a .txt file, one string per line). I want to write a script that takes the first line and searches all folder names in a directory for it, then takes the second line and searches all folder names again, and so on. How do I do this? I hope I made myself clear. Thanks!
This example reads paths from text file and prints them out. Replace the print with your search logic.
import os

textfile = open('C:\\folder\\test.txt', 'r')
for line in textfile:
    rootdir = line.strip()
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            print(os.path.join(subdir, file))
Assuming that by searching all folders you mean printing them out to the standard output you can do this:
from os import listdir
from os.path import isdir, join

with open('directories.txt', 'r') as f:
    i = 1
    for line in f.readlines():
        directories = []
        tmp = line.strip('\n')
        for d in listdir(tmp):
            if isdir(join(tmp, d)):
                directories.append(d)
        print('directory {}: {}'.format(i, directories))
        i += 1
It will output something like this:
directory 1: ['subfolder_1', 'subfolder_0']
directory 2: ['subfolder_0']
directory 3: []
Note that I recommend using with in order to open files since it will automatically properly close them even if exceptions occur.
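If instead each line of the text file is a search term to look for in folder names (which is how I read the question), a sketch along these lines may be closer. It assumes a simple substring match; find_folders and all names in the demo are made up.

```python
import os
import tempfile


def find_folders(search_root, terms_file):
    """Map each search term (one per line in terms_file) to matching folder paths."""
    with open(terms_file) as f:
        terms = [line.strip() for line in f if line.strip()]
    results = {term: [] for term in terms}
    for path, dirs, _files in os.walk(search_root):
        for d in dirs:
            for term in terms:
                if term in d:  # substring match on the folder name only
                    results[term].append(os.path.join(path, d))
    return results


# Demo: search a small tree for "alpha" and "gamma".
with tempfile.TemporaryDirectory() as base:
    search_root = os.path.join(base, "search")
    os.makedirs(os.path.join(search_root, "alpha_1"))
    os.makedirs(os.path.join(search_root, "beta", "alpha_2"))
    terms_path = os.path.join(base, "terms.txt")
    with open(terms_path, "w") as f:
        f.write("alpha\ngamma\n")
    demo_results = find_folders(search_root, terms_path)
    print(demo_results)
```

Walking the tree once and testing every term against each folder name avoids re-scanning the directory for every line of the text file.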
I want to go through all folders inside a directory:
directory\
    folderA\
        a.cpp
    folderB\
        b.cpp
    folderC\
        c.cpp
    folderD\
        d.cpp
The names of the folders are all known.
Specifically, I am trying to count the number of lines of code in each of the a.cpp, b.cpp, c.cpp and d.cpp source files. So: go inside folderA, read a.cpp, count its lines, go back to directory, go inside folderB, read b.cpp, count its lines, and so on.
This is what I have up until now,
dir = directory_path
for folder_name in folder_list():
    dir = os.path.join(dir, folder_name)
    with open(dir) as file:
        source= file.read()
    c = source.count_lines()
but I am new to Python and have no idea if my approach is appropriate and how to proceed. Any example code shown will be appreciated!
Also, does with open handle the file opening/closing as it should for all those reads, or is more handling required?
I would do it like this:
import glob
import os

path = 'C:/Users/me/Desktop/'  # give the path where all the folders are located
list_of_folders = ['test1', 'test2']  # give the program a list with all the folders you need
names = {}  # initialize a dict

for each_folder in list_of_folders:  # go through each folder
    full_path = os.path.join(path, each_folder)  # join the path
    os.chdir(full_path)  # change directory to the desired path

    for each_file in glob.glob('*.cpp'):  # every .cpp file in the current folder
        with open(each_file) as f:  # opens a file - no need to close it
            names[each_file] = sum(1 for line in f if line.strip())  # count non-blank lines

    print(names)
Output:
{'file1.cpp': 2, 'file3.cpp': 2, 'file2.cpp': 2}
{'file1.cpp': 2, 'file3.cpp': 2, 'file2.cpp': 2}
Regarding the with question, you don't need to close the file or make any other checks. You should be safe as it is now.
You may, however, check if the full_path exists as somebody (you) could mistakenly delete a folder from your PC (a folder from list_of_folders)
You can do this with os.path.isdir, which returns True if the path is an existing directory:
os.path.isdir(full_path)
PS: I used Python 3.
Use os.walk() to traverse all subdirectories and files under a given path, open each file, and apply your logic. You can walk it with a 'for' loop, which simplifies your code greatly.
https://docs.python.org/2/library/os.html#os.walk
As manglano said, use os.walk().
You can generate a list of folders:
[src for src,_,_ in os.walk(sourcedir)]
You can generate a list of file paths:
[os.path.join(src, file) for src,dir,files in os.walk(sourcedir) for file in files]
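Since the folder names are known in advance, here is a sketch that counts lines without changing the working directory. It keys the result by full path so that identically named files in different folders don't overwrite each other. count_lines and the demo contents are assumptions for illustration.

```python
import os
import tempfile


def count_lines(directory, folder_names, extension=".cpp"):
    """Return {file_path: line_count} for matching files in the given folders."""
    counts = {}
    for folder in folder_names:
        folder_path = os.path.join(directory, folder)
        if not os.path.isdir(folder_path):
            continue  # a listed folder may be missing; skip it
        for name in sorted(os.listdir(folder_path)):
            if name.endswith(extension):
                file_path = os.path.join(folder_path, name)
                with open(file_path) as f:
                    counts[file_path] = sum(1 for _ in f)
    return counts


# Demo: folderA/a.cpp has 3 lines, folderB/b.cpp has 1, folderC does not exist.
with tempfile.TemporaryDirectory() as directory:
    for folder, name, text in (("folderA", "a.cpp", "int main() {\nreturn 0;\n}\n"),
                               ("folderB", "b.cpp", "// empty\n")):
        os.mkdir(os.path.join(directory, folder))
        with open(os.path.join(directory, folder, name), "w") as f:
            f.write(text)
    demo_counts = count_lines(directory, ["folderA", "folderB", "folderC"])
    print(demo_counts)
```

This counts every line, blank ones included; add a line.strip() filter if only non-blank lines should count.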
I have 50 instances of two files, located in 50 separate folders within a directory. I am trying to read and extract information from the two files in each folder and append the info from both files to lists at the same time, while in the folder that contains them both (so they will be associated by sharing the same list index). I'm using os.walk and opening each file as soon as it is recognized (or trying to). When I run it, it seems like the files in question are never opened, and definitely nothing is appended to my lists. Could someone tell me if what I have here is completely ridiculous? It seems logical to me, but it's not working.
import os
import sys
#import itertools

def get_theList():
    #specify directory where jobs are located
    #can also set 'os.curdir' to rootDir to read from current
    rootDir = '/home/my.user.name/O1/injections/test'
    # No issues here; this is correct
    B_sig = []
    B_gl = []
    SNR_net = []
    a = 0
    for root, dirs, files in os.walk(rootDir):
        for folder in dirs:
            for file in folder:
                if file == 'evidence_stacked.dat':
                    print 'open'
                    a+=1
                    ev_file = open(file,"r")
                    ev_lin = ev_file.split()
                    B_gl.append(ev_lin[1])
                    B_sig.append(ev_lin[2])
                    print ev_lin[1]
                    ev_file.close()
                if file == 'snr.txt':
                    net_file = open(file,"r")
                    net_lines=net_file.readlines()
                    SNR_net.append(net_lines[2])
                    net_file.close()
    print 'len a'
    print a
    # This says 0 on output
    print 'B_sig'
    print B_sig
    print len(B_sig)
    print 'B_net'
    print B_gl
    print len(B_gl)
    print 'SNR_net'
    print SNR_net
    print len(SNR_net)

if __name__ == "__main__":
    get_theList()
From help(os.walk):
filenames is a list of the names of the non-directory files in dirpath.
You're checking to see if a list is equal to a string.
files == 'evidence_stacked.dat'
What you really want to do is one of the following:
for file in files:
    if file == 'evidence_stacked.dat':
        ...
Or...
if 'evidence_stacked.dat' in files:
    ...
Both will work, but the latter is a bit more efficient.
In response to your edit:
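Instead of...
for file in folder:
    ...
use...
for file in os.listdir(os.path.join(root, folder)):
    ...
Here root is the first value that os.walk() yields on each iteration, so the joined path is the folder's actual location. (Iterating over folder itself only loops over the characters of its name.) Also, where you use file after that, replace it with
os.path.join(root, folder, file)
or store that in a new variable (like, say, file2) and use that in place of file.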
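Combining both suggestions, a possible Python 3 sketch that keeps the two files paired by list index is shown below. It assumes, as the question's code implies, that evidence_stacked.dat holds whitespace-separated fields and that snr.txt has at least three lines; collect and the demo values are hypothetical.

```python
import os
import tempfile


def collect(root_dir):
    """Gather values from folders that contain both result files."""
    b_sig, b_gl, snr_net = [], [], []
    for root, _dirs, files in os.walk(root_dir):
        # Only use folders holding both files, so the list indices stay aligned.
        if "evidence_stacked.dat" in files and "snr.txt" in files:
            with open(os.path.join(root, "evidence_stacked.dat")) as f:
                fields = f.read().split()
            with open(os.path.join(root, "snr.txt")) as f:
                lines = f.readlines()
            b_gl.append(fields[1])
            b_sig.append(fields[2])
            snr_net.append(lines[2].strip())
    return b_sig, b_gl, snr_net


# Demo: job1 has both files, job2 only has snr.txt and is skipped.
with tempfile.TemporaryDirectory() as root_dir:
    os.mkdir(os.path.join(root_dir, "job1"))
    os.mkdir(os.path.join(root_dir, "job2"))
    with open(os.path.join(root_dir, "job1", "evidence_stacked.dat"), "w") as f:
        f.write("10.0 20.0 30.0\n")
    with open(os.path.join(root_dir, "job1", "snr.txt"), "w") as f:
        f.write("a\nb\n7.5\n")
    with open(os.path.join(root_dir, "job2", "snr.txt"), "w") as f:
        f.write("a\nb\n9.9\n")
    demo_sig, demo_gl, demo_snr = collect(root_dir)
    print(demo_sig, demo_gl, demo_snr)
```

Because os.walk() already yields the file names of every directory it visits, there is no need to loop over dirs at all; checking membership in files on each iteration is enough.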