For a reason that I cannot work out, the "subdirs" variable in my first for loop becomes an "unused variable" when it is written inside a function, which results in an incomplete search of my directory. When the same loop is not part of a function, "subdirs" is recognized as a variable and my code successfully searches the entire directory and performs the desired tasks. I'm relatively new to Python; please let me know how I can fix my function so "subdirs" will be recognized as a variable. Thanks a lot!
My For Loop With "Subdirs" Within a Function
import os
def function(rootdir, keyPhrases):
    path = rootdir # Enter the root directory you want to search from
    key_phrases = [keyPhrases] # Enter here the key phrases in the lines you hope to find
    # This for loop allows all sub directories and files to be searched
    for (path, subdirs, files) in os.walk(path):
        files = [f for f in os.listdir(path) if f.endswith('.txt') or f.endswith('.log')] # Specify here the format of files you hope to search from (ex: ".txt" or ".log")
        files.sort() # file is sorted list
        files = [os.path.join(path, name) for name in files] # Joins the path and the name, so the files can be opened and scanned by the open() function
        # The following for loop searches all files with the selected format
        for filename in files:
            # Opens the individual files and to read their lines
            with open(filename) as f:
                f = f.readlines()
            # The following loop scans for the key phrases entered by the user in every line of the files searched, and stores the lines that match into the "important" array
            for line in f:
                for phrase in key_phrases:
                    if phrase in line:
                        print(line)
                        break
    print("The end of the directory has been reached, if no lines are printed then that means the key phrase does not exist in the root directory you entered.")
My For Loop With "Subdirs" By Itself
import os
path = r"D:\(Chosen Root Directory)"
key_phrases = ["example_keyPhrase1"] # Enter here the key phrases in the lines you hope to find
# This for loop allows all sub directories and files to be searched
for (path, subdirs, files) in os.walk(path):
    files = [f for f in os.listdir(path) if f.endswith('.txt') or f.endswith('.log')] # Specify here the format of files you hope to search from (ex: ".txt" or ".log")
    files.sort() # file is sorted list
    files = [os.path.join(path, name) for name in files] # Joins the path and the name, so the files can be opened and scanned by the open() function
    # The following for loop searches all files with the selected format
    for filename in files:
        # Opens the individual files and to read their lines
        with open(filename) as f:
            f = f.readlines()
        # The following loop scans for the key phrases entered by the user in every line of the files searched, and stores the lines that match into the "important" array
        for line in f:
            for phrase in key_phrases:
                if phrase in line:
                    print(line)
                    break
print("The end of the directory has been reached, if no lines are printed then that means the key phrase does not exist in the root directory you entered.")
You never reference subdirs in your first code, so that's why you're getting the warning. However, that is not the reason your code doesn't work as you intended.
This line is the issue: files = [f for f in os.listdir(path) if f.endswith('.txt') or f.endswith('.log')]
os.listdir(path) returns the names of all entries (files and directories) directly inside that path [1]; it does not descend into the subdirectories. You don't need it here, because os.walk already gives you that information: 'subdirs' holds the directory names and 'files' holds the file names for the directory currently being visited. The variable 'files' already has what you want, so you can change the line to this:
files = [f for f in files if f.endswith('.txt') or f.endswith('.log')]
[1] https://docs.python.org/3/library/os.html#os.listdir
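A minimal sketch of the function with that change applied, assuming Python 3. The name find_lines and the extensions parameter are hypothetical, and returning the matches as a list instead of printing them is my own choice so the result can be inspected. Naming the unused variable _subdirs is a common convention that usually silences the "unused variable" warning.

```python
import os


def find_lines(rootdir, key_phrases, extensions=(".txt", ".log")):
    """Return (file path, line) pairs for lines containing any key phrase."""
    matches = []
    # os.walk visits rootdir and every directory below it
    for path, _subdirs, files in os.walk(rootdir):
        # "files" already lists the files in "path", so os.listdir is not needed
        for name in sorted(files):
            if not name.endswith(extensions):
                continue
            filename = os.path.join(path, name)
            # errors="replace" is an assumption, so undecodable bytes do not abort the search
            with open(filename, errors="replace") as f:
                for line in f:
                    if any(phrase in line for phrase in key_phrases):
                        matches.append((filename, line.rstrip("\n")))
    return matches
```

Note that key_phrases is expected to be a list of strings here, so a single phrase would be passed as ["example_keyPhrase1"].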
Related
I am trying to write a script that collects values from .xvg files. I have 20 folders that each contain the targeted file. The folders are numbered from 1 to 20 (in the code you see 1.Rimo).
I have already written code that collects the data when I specify the full path. However, I need something generic so I can loop through those 20 folders, get the data, and store it in a variable.
sum1 = 0.0  # assumed: the running total is initialized somewhere before the loop
rmsf = open('/home/alispahic/1.CB1_project/12.ProductionRun/1.Rimo/rmsf.xvg','r+')
for line in rmsf:
    if line.startswith(' 4755'):
        print (line)
        l = line.split()
        print (l)
        value = float(l[1])
        sum1 = float(sum1) + value
        print(len(l))
print (sum1)
You can use os.listdir():
import os
base_path = '/home/alispahic/1.CB1_project/12.ProductionRun'
file_name = 'rmsf.xvg'
for dir_name in os.listdir(base_path):
    print(dir_name)
    with open(os.path.join(base_path, dir_name, file_name)) as f:
        for line in f:
            # here goes your code
            pass
Just remember to join the dir_name with the base_path (the path of the directory you are iterating over).
Also note that this returns files as well, not just directories. If your folder /home/alispahic/1.CB1_project/12.ProductionRun contains only directories, then that won't be a problem; otherwise you would need to filter out the files.
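One way to do that filtering is sketched below, using os.path.isdir and os.path.isfile. The function name find_in_subdirs is hypothetical, and sorting the directory names is an extra I added so the order is predictable.

```python
import os


def find_in_subdirs(base_path, file_name):
    """Return paths to file_name inside each immediate subdirectory of base_path."""
    found = []
    for dir_name in sorted(os.listdir(base_path)):
        dir_path = os.path.join(base_path, dir_name)
        # os.listdir returns files and directories alike, so skip plain files
        if not os.path.isdir(dir_path):
            continue
        candidate = os.path.join(dir_path, file_name)
        # skip directories that do not contain the target file
        if os.path.isfile(candidate):
            found.append(candidate)
    return found
```

Each returned path can then be opened and processed line by line as in the loop above.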
I have solved the problem by adding glob.
import glob
for name in glob.glob('/home/alispahic/1.CB1_project/12.ProductionRun/*/rmsf.xvg'):
    for line in open(name):
        if line.startswith(' 4755'):
            pass  # the per-line processing from the question goes here
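A slightly fuller sketch of the glob approach, assuming the value of interest is the second whitespace-separated column, as in the question's float(l[1]). The function name sum_matching_values and the choice to keep one total per file are my own assumptions.

```python
import glob


def sum_matching_values(pattern, prefix):
    """For each file matching pattern, sum column 2 of the lines starting with prefix."""
    totals = {}
    for name in sorted(glob.glob(pattern)):
        total = 0.0
        # "with" closes each file as soon as it has been read
        with open(name) as f:
            for line in f:
                if line.startswith(prefix):
                    total += float(line.split()[1])
        totals[name] = total
    return totals
```

With the paths from the question, the call would be sum_matching_values('/home/alispahic/1.CB1_project/12.ProductionRun/*/rmsf.xvg', ' 4755').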
I created a script to see all the files in a folder and print the full path of each file.
The script is working and prints the output in the Command Prompt (Windows)
import os
root = 'C:\Users\marco\Desktop\Folder'
for path, subdirs, files in os.walk(root):
    for name in files:
        print os.path.join(path, name)
I want now to save the output in a txt file so I have edited the code assigning the os.path.join(path,name) to a variable values but, when I print the output, the script gives me an error
import os
root = 'C:\Users\marco\Desktop\Folder'
for path, subdirs, files in os.walk(root):
    for name in files:
        values = os.path.join(path, name)
file = open('sample.txt', 'w')
file.write(values)
file.close()
Error below
file.write(values)
NameError: name 'values' is not defined
Try this:
file = open('sample.txt', 'w')
for path, subdirs, files in os.walk(root):
    for name in files:
        values = os.path.join(path, name)
        file.write(values+'\n')
file.close()
Note that file is a builtin symbol in Python (which is overridden here), so I suggest that you replace it with fileDesc or similar.
The problem is that the variable values is only assigned inside the inner for loop. A for loop in Python does not create a new scope, so the NameError means the loop body never ran (for example, because os.walk found no files under root) and values was never assigned. So assign an empty value to the variable before you start the iteration, like values=None or better yet values=''. Even assuming the above code worked, you wouldn't get the output file you wanted. The variable values is overwritten on every iteration, so after the loops finish only the location of the last file encountered is stored in values, and only that is written to sample.txt.
Another bad practice you seem to be following is using \ instead of \\ inside strings. This might come back to bite you later (if it hasn't already). A \ followed by certain characters denotes an escape sequence, and \\ is the escape sequence for a literal backslash.
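A small illustration of the escaping point, using the path from the question. In Python 3 the string 'C:\Users\...' is rejected with a SyntaxError because \U starts a Unicode escape, so one of the spellings below is needed there. The variable names are hypothetical, and ntpath is used only so the comparison behaves the same on any platform.

```python
import ntpath  # the Windows flavour of os.path, importable on any platform

escaped = 'C:\\Users\\marco\\Desktop\\Folder'  # every backslash doubled
raw = r'C:\Users\marco\Desktop\Folder'         # raw string: backslashes are kept literally
forward = 'C:/Users/marco/Desktop/Folder'      # forward slashes

# Both spellings produce exactly the same string.
print(escaped == raw)
# normpath converts the forward slashes to the backslash form.
print(ntpath.normpath(forward) == raw)
# '\n' is one character (a newline); r'\n' is two (a backslash and the letter n).
print(len('\n'), len(r'\n'))
```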
So here's a redesigned working sample code:
import os
root = 'C:\\Users\\marco\\Desktop\\Folder'
values=''
for path, subdirs, files in os.walk(root):
    for name in files:
        values = values + os.path.join(path, name) + '\n'
samplef = open('sample.txt', 'w')
samplef.write(values)
samplef.close()
In case you aren't familiar, '\n' denotes the escape sequence for a newline. Reading your output file would be quite tedious if all the file paths were written on the same line.
PS: I wrote the code with strings as that's what I would prefer in this scenario, but you may try lists or what-not. Just be sure that you define the variable beforehand so that it exists when you use it after the loop.
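For comparison, here is a sketch of the list-based variant mentioned in the PS, assuming Python 3. The function name write_file_list is hypothetical, and opening the output file in a with block is my own choice so it is closed even if an error occurs.

```python
import os


def write_file_list(root, out_path):
    """Write the full path of every file under root to out_path, one per line."""
    paths = []  # defined before the loops, so it exists even if no files are found
    for path, _subdirs, files in os.walk(root):
        for name in files:
            paths.append(os.path.join(path, name))
    with open(out_path, 'w') as out:
        out.writelines(p + '\n' for p in paths)
    return paths
```

If out_path lies inside root, the output file is not listed on the first run, because it is only created after the walk has finished.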
import os
root = 'C:\Users\marco\Desktop\Folder'
file = open('sample.txt', 'w')
for path, subdirs, files in os.walk(root):
    for name in files:
        values = os.path.join(path, name)
        file.write(values)
file.close()
values is only assigned inside the inner loop, so it is not guaranteed to exist at the point where the original code writes to the file. Move the write inside the loop, as above, and it will work.
I have a list of strings (stored in a .txt file, one string per line) and I want to write a script that takes the first line and searches all folder names in a directory, then takes the second line and searches all folder names, and so on. How do I do this? I hope I made myself clear. Thanks!
This example reads paths from text file and prints them out. Replace the print with your search logic.
import os
textfile = open('C:\\folder\\test.txt', 'r')
for line in textfile:
    rootdir = line.strip()
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            print(os.path.join(subdir, file))
Assuming that by searching all folders you mean printing them out to the standard output you can do this:
from os import listdir
from os.path import isdir, join
with open('directories.txt', 'r') as f:
    i = 1
    for line in f.readlines():
        directories = []
        tmp = line.strip('\n')
        for d in listdir(tmp):
            if isdir(join(tmp, d)):
                directories.append(d)
        print('directory {}: {}'.format(i, directories))
        i += 1
It will output something like this:
directory 1: ['subfolder_1', 'subfolder_0']
directory 2: ['subfolder_0']
directory 3: []
Note that I recommend using with in order to open files since it will automatically properly close them even if exceptions occur.
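If "search" instead means finding the folders whose names contain each string from the list, a sketch along the following lines may be closer to what was asked. This reading of the question is an assumption, as are the function name find_folders and the use of a substring match.

```python
import os


def find_folders(list_path, search_root):
    """Map each non-empty line of list_path to the folders under search_root whose name contains it."""
    with open(list_path) as f:
        terms = [line.strip() for line in f if line.strip()]
    results = {term: [] for term in terms}
    # walk the tree once and test every folder name against every term
    for path, subdirs, _files in os.walk(search_root):
        for d in subdirs:
            for term in terms:
                if term in d:
                    results[term].append(os.path.join(path, d))
    return results
```

Replacing "term in d" with "term == d" would require an exact folder-name match.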
I have 50 instances of two files that are in 50 separate folders within a directory. I am trying to read and extract information from the two files within each folder and append the info from both files to lists while in the folder that contains them (so they will be associated by being appended at the same list index). I'm using os.walk and opening each file as soon as it is recognized (or trying to). When I run it, it seems like the files in question are never opened, and definitely nothing is appended to my lists. Could someone tell me if what I have here is completely ridiculous? It seems logical to me, but it's not working.
import os
import sys
#import itertools
def get_theList():
    #specify directory where jobs are located
    #can also set 'os.curdir' to rootDir to read from current
    rootDir = '/home/my.user.name/O1/injections/test'
    # No issues here; this is correct
    B_sig = []
    B_gl = []
    SNR_net = []
    a = 0
    for root, dirs, files in os.walk(rootDir):
        for folder in dirs:
            for file in folder:
                if file == 'evidence_stacked.dat':
                    print 'open'
                    a+=1
                    ev_file = open(file,"r")
                    ev_lin = ev_file.split()
                    B_gl.append(ev_lin[1])
                    B_sig.append(ev_lin[2])
                    print ev_lin[1]
                    ev_file.close()
                if file == 'snr.txt':
                    net_file = open(file,"r")
                    net_lines=net_file.readlines()
                    SNR_net.append(net_lines[2])
                    net_file.close()
    print 'len a'
    print a
    # This says 0 on output
    print 'B_sig'
    print B_sig
    print len(B_sig)
    print 'B_net'
    print B_gl
    print len(B_gl)
    print 'SNR_net'
    print SNR_net
    print len(SNR_net)
if __name__ == "__main__":
    get_theList()
From help(os.walk):
filenames is a list of the names of the non-directory files in dirpath.
You're checking to see if a list is equal to a string.
files == 'evidence_stacked.dat'
What you really want to do is one of the following:
for file in files:
    if file == 'evidence_stacked.dat':
        ...
Or...
if 'evidence_stacked.dat' in files:
    ...
Both will work, but the latter is a bit more efficient.
In response to your edit:
Instead of...
for file in folder:
    ...
use...
for file in os.listdir(os.path.join(root, folder)):
    ...
Also, where you use file after that, replace it with the full path (root is the directory os.walk is currently visiting):
os.path.join(root, folder, file)
or store that in a new variable (like, say, file2) and use that in place of file.
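Putting those pieces together, a sketch that relies only on the files list from os.walk might look like this, assuming Python 3. The function name collect_job_files and the record layout are hypothetical. Because the question does not show the file formats, the sketch returns the raw text of each file instead of guessing which fields to extract.

```python
import os


def collect_job_files(root_dir, wanted=('evidence_stacked.dat', 'snr.txt')):
    """Return one record per folder that contains every wanted file."""
    records = []
    for root, _dirs, files in os.walk(root_dir):
        # "files" lists the file names in "root", so a membership test is enough
        if all(name in files for name in wanted):
            record = {'folder': root}
            for name in wanted:
                # join with root, otherwise open() looks in the current working directory
                with open(os.path.join(root, name)) as f:
                    record[name] = f.read()
            records.append(record)
    return records
```

Since both files of a folder end up in the same record, their contents stay associated without relying on matching list indices.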
I have been trying to write some python code in order to get each line from a .txt file and search for a file with that name in a folder and its subfolders. After this I want to copy that file in a preset destination folder.
The thing is, when I test this code I can read all the file names in the .txt and I can display all files in a directory and its subdirectories. The problem arises when I have to compare the file name I read from the .txt (line by line, as I said) with all the file names within the directory folder and then copy the file to the destination.
Any ideas what I am doing wrong?
import os, shutil
def main():
    dst = '/Users/jorjis/Desktop/new'
    f = open('/Users/jorjis/Desktop/articles.txt', 'rb')
    lines = [line[:-1] for line in f]
    for files in os.walk("/Users/jorjis/Desktop/folder/"):
        for line in lines:
            if line == files:
                shutil.copy('/dir/file.ext', '/new/dir')
You are comparing the file names from the text file with a tuple with three elements: the root path of the currently visited folder, a list of all subdirectory names in that path, and a list of all file names in that path. Comparing a string with a tuple will never be true. You have to compare each file name with the set of file names to copy. The data type set comes in handy here.
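To see those three-element tuples for yourself, a small helper like the following can be run on any directory. The name describe_walk is hypothetical, and the sorting is only there to make the output order predictable.

```python
import os


def describe_walk(top):
    """Return what os.walk yields for top as a list of (root, dirs, files) tuples."""
    return [(root, sorted(dirs), sorted(files)) for root, dirs, files in os.walk(top)]
```

Each entry is a tuple, so an expression such as entry == 'some_name.txt' is always False; the file names are in the third element.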
Opening a file together with the with statement ensures that it is closed when the control flow leaves the with block.
The code might look like this:
import os
import shutil
def main():
    destination = '/Users/jorjis/Desktop/new'
    with open('/Users/jorjis/Desktop/articles.txt', 'r') as lines:
        filenames_to_copy = set(line.rstrip() for line in lines)
    for root, _, filenames in os.walk('/Users/jorjis/Desktop/folder/'):
        for filename in filenames:
            if filename in filenames_to_copy:
                shutil.copy(os.path.join(root, filename), destination)
If I had to guess, I would say that the files in the .txt contain the entire path. You'd need to add a little more to os.walk to match up completely.
for root, _, files in os.walk("/Users/jorjis/Desktop/folder/"):
    for f in files:
        new_path = os.path.join(root, f)
        if new_path in lines:
            shutil.copy(new_path, '/some_new_dir')
Then again, I'm not sure what the .txt file looks like so it might be that your original way works. If that's the case, take a closer look at the lines = ... line.