I'm trying to get a homemade path navigation function working - basically I need to go through one folder, and explore every folder within it, running a function within each folder.
I reach a problem when I try to change directories within a for loop. I've got this "findDirectories" function:
def findDirectories(list):
    for files in os.listdir("."):
        print (files)
        list.append(files)
        os.chdir("y")
That last line causes the problems. If I remove it, the function just compiles a list with all the folders in that folder. Unfortunately, this means I have to run this each time I go down a folder, I can't just run the whole thing once. I've specified the folder "y" as that's a real folder, but the program crashes upon opening even with that. Doing os.chdir("y") outside of the for loop has no issues at all.
I'm new to Python, but not to programming in general. How can I get this to work, or is there a better way? The end result I need is to run a function on every "*Response.xml" file that exists within this folder, no matter how deeply nested it is.
You didn't post the traceback of the actual error, but it most likely fails because you have specified y as a relative path.
It may be able to change to y in the first iteration of the loop, but in the second iteration it will try to change to a subdirectory of y that is also called y, which you probably do not have.
You want to be doing something like this:
import os
for dirName, subDirs, fileNames in os.walk(rootPath):
    # it's not clear which files you want; I assume anything that ends with Response.xml
    for f in fileNames:
        if f.endswith("Response.xml"):
            # this is the path you will want to use
            filePath = os.path.join(dirName, f)
            # now do something with it!
            doSomethingWithFilePath(filePath)
That's untested, but you get the idea ...
As Dan said, os.walk would be better. See the example there.
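For comparison, here is a minimal sketch of the same search written with pathlib instead of os.walk. The function name find_response_files and its root argument are made up for illustration; Path.rglob does the recursive descent, so no os.chdir call is needed at all.

```python
from pathlib import Path


def find_response_files(root):
    """Return every file under root whose name ends with 'Response.xml'."""
    # rglob visits root and all nested subdirectories for us
    return sorted(p for p in Path(root).rglob("*Response.xml") if p.is_file())
```

Each returned Path can then be handed to your own function, for example by looping over find_response_files(".") and passing str(p) along.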
Related
I am trying to make a Python script that automatically moves files from my internal drive to any USB drive that is plugged in. However, this destination path is unpredictable because I am not using the same USB drives every time. With Raspbian Buster full version, the best I can do so far is automount into /media/pi/xxxxx, where that xxxxx part is unpredictable. I am trying to make my script account for that. I can get the drive mounting points with
drives = os.listdir("/media/pi/")
but I am worried some will be invalid because of not being unmounted before they're yanked out (I need to run this w/o a monitor or keyboard or VNC or any user input other than replacing USB drives). So I'd think I'd need to do a series of try catch statements perhaps in an if elif elif elif chain, to make sure that the destination is truly valid, but I don't know how to do that w/o hardcoding the names in. The only way I know how to iterate thru a set of names I don't know is
for candidate_drive in drives:
but I don't know how to make it go onto the next candidate drive only if the current one is throwing an exception.
System: Raspberry Pi 4, Raspbian buster full, Python 3.7.
Side note: I am also trying this on Buster lite w/ usbmount, which does have predictable mounting names, but I can't get exfat and ntfs to mount and that is question for another post.
Update: I was thinking about this more and maybe I need a try, except, else statement where the except is pass and the else is break? I am trying it out.
Update2: I rethought my problem and maybe instead of looking for the exception to determine when to try the next possible drive, perhaps I could instead look for a successful transfer and break the loop if so.
import os
import shutil
files_to_send = os.listdir("/home/outgoing_files/")
source_path = "/home/outgoing_files/"
possible_USB_drives = os.listdir("/media/")
for a_possible_drive in possible_USB_drives:
    try:
        destpath = os.path.join("/media/", a_possible_drive)
        for a_file in files_to_send:
            shutil.copy(source_path + a_file, destpath)
    except:
        pass # transfer to possible drive didn't work
    else:
        break # Stops this entire program bc the transfer worked!
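The same "try each candidate until one works" idea can be sketched as a function. This is only an illustration: copy_to_first_usable and its is_usable parameter are hypothetical names, and using os.path.ismount as the validity check assumes that a stale folder left behind by a yanked drive is no longer a mount point.

```python
import os
import shutil


def copy_to_first_usable(source_dir, candidates, is_usable=os.path.ismount):
    """Copy every file in source_dir to the first candidate that accepts them.

    Returns the destination that worked, or None if every candidate failed.
    """
    files = [f for f in os.listdir(source_dir)
             if os.path.isfile(os.path.join(source_dir, f))]
    for dest in candidates:
        # Skip stale directories left behind by a drive that was pulled out
        if not is_usable(dest):
            continue
        try:
            for name in files:
                shutil.copy(os.path.join(source_dir, name), dest)
        except OSError:
            continue  # this drive failed part-way; try the next one
        return dest  # every file copied without an exception
    return None
```

The candidates list could be built from the automount folder, for instance [os.path.join("/media/pi", d) for d in os.listdir("/media/pi")]. Catching OSError rather than using a bare except keeps unrelated errors, such as a typo in a variable name, visible.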
If you have an unknown number of directories nested inside a directory, you cannot use nested for loops. You need a recursive function. To visit every directory, iterate over the list of directories with a function that calls itself on each subdirectory, over and over, until it finds the file.
Let's see an example. You have a path structure like this:
path
    dir0
        dir2
            file3
    dir1
        file2
    file0
    file1
You have no way to know how many for loops are required to iterate over all elements in this structure. You can run one iteration (a for loop) over all elements of a single directory, and then do the same for the elements inside those elements. In this structure, dirN (where N is a number) is a directory and fileN is a file.
You can use the os.listdir() function to get the contents of a directory:
print(os.listdir("path/"))
returns:
['dir0', 'dir1', 'file0.txt', 'file1.txt']
By using this function on all the directories you can get all the elements of the structure. You only need to iterate over the directories. Specifically directories, because if you pass it a file:
print(os.listdir("path/file0.txt"))
you get an error:
NotADirectoryError: [WinError 267]
But remember that Python has list comprehensions, which can filter the listing down to directories only.
String work
If you have a main path, you need to refer to a directory inside it with the full string: "path/dirN". You cannot refer directly to a directory that is not in the working directory of the .py script:
print(os.listdir("dir0/"))
raises an error:
FileNotFoundError: [WinError 3]
So you always need to prefix the name with the path of the current directory. In this way you can get access to all the elements of the structure.
Filtering
I said that you could use a list comprehension to get just the directories of the structure, recursively. Let's take a look at a function:
def searchfile(paths: list,search: str):
    for path in paths:
        print("Actual path: {}".format(path))
        contents = os.listdir(path)
        print("Contents: {}".format(contents))
        dirs = ["{}{}/".format(path,x) for x in contents if os.path.isdir("{}/{}/".format(path,x)) == True]
        print("Directories: {} \n".format(dirs))
        searchfile(dirs,search)
In this function we get the contents of the current path with os.listdir() and then filter them with a list comprehension. Then we make the recursive call with the directories of the current path: searchfile(dirs,search). Each path passed in must end with a slash, since the directory name is appended to it directly.
This algorithm can be applied to any file structure, because the paths argument is a list. So you can iterate over directories containing directories, and over those directories with more directories inside of them.
If you want to find a specific file you can use the second argument, search. You can add a conditional inside the loop and print the path of the file once it is found:
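if search in contents:
    print("File found! \n")
    # join with the current path; abspath(search) alone would resolve against the working directory
    print("{}".format(os.path.abspath(os.path.join(path, search))))
    sys.exit()  # needs "import sys" at the top of the script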
I hope this helps.
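The recursive idea described above can also be written so that the function returns the path instead of printing it and exiting. This is a minimal sketch, not the answer's own code: find_file is a hypothetical name, and it uses os.path.join rather than string formatting to build each path.

```python
import os


def find_file(root, name):
    """Return the full path of the first file called name under root, or None."""
    subdirs = []
    for entry in os.listdir(root):
        full = os.path.join(root, entry)
        if os.path.isdir(full):
            subdirs.append(full)      # remember it for the recursive step
        elif entry == name:
            return full               # found in the current directory
    for sub in subdirs:
        found = find_file(sub, name)  # same function, one level deeper
        if found is not None:
            return found
    return None
```

Returning None when nothing matches lets the caller decide what to do, which is harder when the search ends with sys.exit().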
I'm trying to append multiple CSV files from one directory into a single file within another. When I run this code it appears to run successfully but it has no effect. The combined.csv file remains empty after each run. There are also no errors in the console. I attempted this in multiple IDEs (VS Code, PyCharm, and Spyder).
import os
import glob
import pandas
def concatenate(indir="/directoryA/directoryB/csvFile_directoryC",
                outfile="/directoryA/directoryB/combine.csv"):
    os.chdir(indir)
    filelist=glob.glob("*.csv")
    dfList=[]
    colnames=["c1","c2","c3","c4"]
    for filename in filelist:
        print(filename)
        df=pandas.read_csv(filename,header=None)
        dfList.append(df)
    concatDf=pandas.concat(dfList,axis=0)
    concatDf.columns=colnames
    concatDf.to_csv(outfile,index=None)
Well, it's not going to print anything if you didn't call it ;)
I think you just forgot to call your function. That's why the script runs without errors: the function is defined but never called, so nothing is printed or written.
If the function is in fact being called, glob may not be finding any *.csv files in that directory. If filelist turns up empty, the for loop never runs and dfList stays an empty list. Note that pandas.concat raises a ValueError ("No objects to concatenate") for an empty list, so in that case you would see an error rather than a silently empty file.
If the problem is not that the function is never called, try printing the result of os.listdir() to see what is in there for glob to match against, and check that your filelist isn't an empty list.
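Both checks suggested above can be built into the function. The sketch below is an assumption-laden variant, not the asker's code: it swaps pandas for the standard-library csv module so that it has no third-party dependency, builds full paths instead of calling os.chdir, and raises an explicit error when no input files are found.

```python
import csv
import glob
import os


def concatenate(indir, outfile):
    """Append the rows of every CSV file in indir to outfile; return the inputs used."""
    # Full paths mean the working directory is left untouched
    filelist = sorted(glob.glob(os.path.join(indir, "*.csv")))
    if not filelist:
        raise FileNotFoundError("no *.csv files found in {}".format(indir))
    with open(outfile, "w", newline="") as out:
        writer = csv.writer(out)
        for filename in filelist:
            with open(filename, newline="") as src:
                writer.writerows(csv.reader(src))
    return filelist
```

Defining the function is not enough. The script still needs a line such as concatenate(indir, outfile) at the bottom, with your own two paths, for anything to happen.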
Ok, this is weird and maybe awkward.
I made a script so I could change the end of subtitles files to keep consistency.
Basically it replaces A.X.str to A.Y.str. It worked flawlessly at a single folder.
I then decided to make a recursive version of it so I could run it on any folder I had, regardless of whether the episodes were together, separated by season, or each in an individual path.
I really don't know how or why, but it sent all the files it reached to the root folder I was using until it halted raising a FileExistsError.
The code bit I'm using is:
def rewrite(folder, old, new):
    for f in next(os.walk(folder))[2]:
        os.rename(os.path.join(folder, f),
                  os.path.join(path, f.replace(old, new)))
    for f in next(os.walk(folder))[1]:
        x = os.path.join(folder, f)
        rewrite(x, old, new)
Where 'old' is "A.X.str", 'new' is "A.Y.str" and folder is the full path of the root folder "C:\Series\Serie Name".
Why doesn't this work recursively? The first bit of code (the first for loop) works fine on its own in a single folder.
Is the problem with the "next" I use to get the names of files and folders?
The code you are showing us is using a path variable in the rename destination -- that should be the folder variable instead.
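As a sketch of that fix, the function can also let os.walk do the descent itself instead of calling next() and recursing by hand. This is one possible rewrite, not the asker's final code; the "if old in f" guard is an added assumption so that files without the old ending are left alone.

```python
import os


def rewrite(folder, old, new):
    """Rename files under folder, replacing old with new in each file name."""
    # os.walk already visits every nested directory, so no explicit recursion is needed
    for dirpath, dirnames, filenames in os.walk(folder):
        for f in filenames:
            if old in f:
                # the destination uses dirpath, so the file stays in its own folder
                os.rename(os.path.join(dirpath, f),
                          os.path.join(dirpath, f.replace(old, new)))
```

Because both the source and the destination are joined with the same dirpath, no file can end up in the root folder.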
For some reason this Python script no longer works. The script seems to change the folder permission to read-only after it has been run. It runs once and deletes all the files in the folder, but when it runs again it gets a Windows error 5, Access denied, apparently because the script changed the permissions on the folder to read-only. I can't see why it does this or how to avoid it.
The thing is, I didn't write this script and know nothing about Python. How would you change it to avoid this issue? Please could you give an example using the code in the script, since I wouldn't know where to place it. Thanks for the help!
import os
import shutil
for root, dirs, files in os.walk(eg.globals.tvzip):
    for f in files:
        os.remove(os.path.join(root, f))
    for d in dirs:
        shutil.rmtree(os.path.join(root, d))
for root, dirs, files in os.walk(eg.globals.tvproc):
    for f in files:
        os.remove(os.path.join(root, f))
    for d in dirs:
        shutil.rmtree(os.path.join(root, d))
I don't care whether you wrote this code or not, it makes no sense, and trying to make it work without fixing it is a silly idea.
First, if you want to remove an entire directory tree, don't try to walk the tree and remove each subtree before walking it. Just remove the whole tree:
shutil.rmtree(eg.globals.tvzip)
shutil.rmtree(eg.globals.tvproc)
If you want to remove all of the contents of the tree, but not the root itself, don't use os.walk, just os.listdir:
for p in os.listdir(eg.globals.tvzip):
    shutil.rmtree(os.path.join(eg.globals.tvzip, p))
for p in os.listdir(eg.globals.tvproc):
    shutil.rmtree(os.path.join(eg.globals.tvproc, p))
That will remove any errors caused by your code stepping on its own toes, trying to keep a directory open for its walk and trying to delete it at the same time.
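One caveat with the os.listdir loop: shutil.rmtree only accepts directories, so it raises an error on a plain file sitting directly in the top-level folder. A minimal sketch that handles both cases follows; clear_directory is a hypothetical helper name, not part of the original script.

```python
import os
import shutil


def clear_directory(top):
    """Delete everything inside top but keep top itself."""
    for name in os.listdir(top):
        full = os.path.join(top, name)
        if os.path.isdir(full) and not os.path.islink(full):
            shutil.rmtree(full)   # real subdirectory: remove the whole subtree
        else:
            os.remove(full)       # plain file or symlink
```

In the script from the question this would be called once per folder, e.g. clear_directory(eg.globals.tvzip) and clear_directory(eg.globals.tvproc).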
If you still get errors, it could be because some of the files are read-only, but it could just as easily be because some other program has them open. The only way you will be able to debug that is to know which files, so you can examine them.
The exceptions that you get should include the full pathname to the file that failed in their output—in fact, you showed one in one of your other questions:
WindowsError: [Error 5] Access is denied: 'C:\\zDump\\TVzip\\Elem.avi'
So, how do you know what the problem is?
You can open C:\zDump\TVzip in Explorer and look at Elem.avi and see if it's read-only. Or you can use the DOS prompt, if you know how to do that.
To determine whether it's being kept open by another program, you need a third-party tool. The GUI tool Process Explorer and the command-line tool Handle, both from Sysinternals and published by Microsoft, are probably the simplest.
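The read-only check can also be done from Python rather than Explorer. The sketch below relies on the fact that Python reports the Windows read-only attribute through the write bit of st_mode; is_read_only and make_writable are hypothetical helper names. It cannot tell you whether another program has the file open.

```python
import os
import stat


def is_read_only(path):
    """Return True if the owner-write bit is cleared on path."""
    return not (os.stat(path).st_mode & stat.S_IWRITE)


def make_writable(path):
    """Set the owner-write bit so the file can be deleted on Windows."""
    os.chmod(path, os.stat(path).st_mode | stat.S_IWRITE)
```

Calling is_read_only on the path from the error message tells you which of the two explanations to pursue.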
If you want to delete a whole tree of files, shutil.rmtree will do it for you - you don't need to walk the list of files deleting them.
If you're trying to not delete the top level directory, you should add a check for that. According to the docs, you will be given the top-level directory:
os.walk(top, topdown=True, onerror=None, followlinks=False)
Generate the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at directory top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames).
Could something other than your script be setting these folders to read-only? Perhaps you're deleting them and then getting Access Denied because they no longer exist, or something else is re-creating them that way?
If I am to read a number of files in Python 3.2, say 30-40, and I want to keep the file references in a list
(all the files are in a common folder),
is there any way I can open all the files to their respective file handles in the list, without having to individually open every file via the file.open() function?
This is simple, just use a list comprehension based on your list of file paths. Or if you only need to access them one at a time, use a generator expression to avoid keeping all forty files open at once.
list_of_filenames = ['/foo/bar', '/baz', '/tmp/foo']
open_files = [open(f) for f in list_of_filenames]
If you want handles on all the files in a certain directory, use the os.listdir function:
import os
open_files = [open(f) for f in os.listdir(some_path)]
I've assumed a simple, flat directory here, but note that os.listdir returns the names of all entries in the given directory, whether they are "real" files or directories. So if you have directories within the directory you're opening, you'll want to filter the results using os.path.isfile:
import os
open_files = [open(f) for f in os.listdir(some_path) if os.path.isfile(f)]
Also, os.listdir only returns the bare filename, rather than the whole path, so if the current working directory is not some_path, you'll want to make absolute paths using os.path.join.
import os
open_files = [open(os.path.join(some_path, f)) for f in os.listdir(some_path)
              if os.path.isfile(os.path.join(some_path, f))]
With a generator expression:
import os
all_files = (open(os.path.join(some_path, f)) for f in os.listdir(some_path)) # note () instead of []
for f in all_files:
    pass # do something with the open file here.
In all cases, make sure you close the files when you're done with them. If you can upgrade to Python 3.3 or higher, I recommend you use a contextlib.ExitStack for one more level of convenience.
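A minimal sketch of the ExitStack approach, assuming Python 3.3 or later. The function name read_all is made up for illustration, and some_path stands for the common folder.

```python
import os
from contextlib import ExitStack


def read_all(some_path):
    """Return {filename: contents} for every regular file directly in some_path."""
    paths = [os.path.join(some_path, f) for f in sorted(os.listdir(some_path))]
    paths = [p for p in paths if os.path.isfile(p)]
    with ExitStack() as stack:
        # enter_context registers each file so all of them are closed on exit
        open_files = [stack.enter_context(open(p)) for p in paths]
        return {os.path.basename(f.name): f.read() for f in open_files}
```

All the files stay open together inside the with block and are closed when it ends, even if one of the reads raises an exception.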
The os library (and listdir in particular) should provide you with the basic tools you need:
import os
print("\n".join(os.listdir())) # returns all of the files (& directories) in the current directory
Obviously you'll want to call open with them, but this gives you the files in an iterable form (which I think is the crux of the issue you're facing). At this point you can just do a for loop and open them all (or some of them).
Quick caveat: Jon Clements pointed out in the comments of Henry Keiter's answer that you should watch out for directories, which will show up in os.listdir along with files.
Additionally, this is a good time to write in some filtering statements to make sure you only try to open the right kinds of files. You might be thinking you'll only ever have .txt files in a directory now, but someday your operating system (or users) will have a clever idea to put something else in there, and that could throw a wrench in your code.
Fortunately, a quick filter can do that, and you can do it a couple of ways (I'm just going to show a regex filter):
import os,re
scripts=re.compile(r".*\.py$")  # raw string so \. is not treated as a string escape
files=[open(x,'r') for x in os.listdir() if os.path.isfile(x) and scripts.match(x)]
files=map(lambda x:x.read(),files)
print("\n".join(files))
Note that I'm not checking things like whether I have permission to access the file, so if I have the ability to see the file in the directory but not permission to read it then I'll hit an exception.
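If you would rather skip unreadable files than crash, the permission case can be handled with a try/except around each open. This is a rough sketch with made-up names (read_scripts, directory, suffix), using str.endswith in place of the regex filter.

```python
import os


def read_scripts(directory=".", suffix=".py"):
    """Return {filename: text} for readable files in directory ending with suffix."""
    contents = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if not (name.endswith(suffix) and os.path.isfile(full)):
            continue  # wrong extension, or a directory
        try:
            with open(full, "r") as handle:
                contents[name] = handle.read()
        except (OSError, UnicodeDecodeError):
            # e.g. PermissionError: visible in the listing but not readable
            continue
    return contents
```

The with statement also closes each file as soon as it has been read, so no handles are left open.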