Unable to resolve path with os.walk - python

I have a bit of code that searches for files in a network share that match a certain keyword. When a match is found, I would like to copy the found files to a different location on the network. The error I'm getting is as follows:
Traceback (most recent call last):
  File "C:/Users/user.name/PycharmProjects/SearchDirectory/Sub-Search.py", line 15, in <module>
    shutil.copy(path+name, dest)
  File "C:\Python27\lib\shutil.py", line 119, in copy
    copyfile(src, dst)
  File "C:\Python27\lib\shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: '//server/otheruser$/Document (user).docx'
I believe it's because I'm trying to copy the found file without specifying its direct path, since some of the files are found in subfolders. If so, how can I store the direct path to a file when it matches the keyword? Here is the code I have so far:
import os
import shutil
dest = '//dbserver/user.name$/Reports/User'
path = '//dbserver/User$/'
keyword = 'report'
print 'Starting'
for root, dirs, files in os.walk(path):
    for name in files:
        if keyword in name.lower():
            shutil.copy(path+name, dest)
            print name
print 'Done'
PS. The user folder being accessed is hidden, hence the $.

Looking at the docs for os.walk, your error is most likely that you are not including the full path. To avoid having to worry about things like trailing slashes and OS-specific path separators, you should also consider using os.path.join.
Replace path+name with os.path.join(root, name). The root element is the path of the subdirectory under path actually containing name, which you are currently omitting from your full path.
You should also replace dest with os.path.join(dest, os.path.relpath(root, path)) if you wish to preserve the directory structure in the destination. os.path.relpath subtracts the path prefix of path from root, allowing you to create the same relative path under dest. If the correct subfolders do not exist, you may want to call os.mkdir or better yet os.makedirs on them as you go:
for root, dirs, files in os.walk(path):
    out = os.path.join(dest, os.path.relpath(root, path))
    #os.makedirs(out) # You may end up with empty folders if you put this line here
    for name in files:
        if keyword in name.lower():
            # Create the folder only when a file matches, so no empty folders appear.
            # os.makedirs raises an error if the folder already exists, so check first.
            if not os.path.isdir(out):
                os.makedirs(out)
            shutil.copy(os.path.join(root, name), out)
Finally, look into shutil.copytree, which does something very similar to what you want. The only disadvantage is that it does not offer the fine level of control for things like filtering that os.walk does (which you are using).
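For reference, the os.walk approach described above can be sketched as a self-contained function. This is only a sketch under stated assumptions: it targets Python 3 (the snippets in this thread use Python 2 print statements), and the name copy_matching and its parameters are hypothetical, chosen for illustration.

```python
import os
import shutil


def copy_matching(src_root, dest_root, keyword):
    """Copy files whose names contain keyword, mirroring the subfolder layout.

    Hypothetical helper: the name and signature are illustrative only.
    """
    copied = []
    for root, dirs, files in os.walk(src_root):
        # Folder under dest_root that mirrors the folder currently being walked.
        rel = os.path.relpath(root, src_root)
        out = os.path.normpath(os.path.join(dest_root, rel))
        for name in files:
            if keyword in name.lower():
                # exist_ok=True (Python 3.2+) makes repeated calls harmless,
                # and the folder is only created once a file actually matches.
                os.makedirs(out, exist_ok=True)
                copied.append(shutil.copy(os.path.join(root, name), out))
    return copied
```

Calling copy_matching(path, dest, keyword) with the values from the question would be the equivalent of the loop above. The destination should not sit inside the source tree, otherwise the walk may visit the copies as well.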

Related

Directory walk and remove files/directories

I copied a (presumably large) number of files on to an existing directory, and I need to reverse the action. The targeted directory contains a number of other files, that I need to keep there, which makes it impossible to simply remove all files from the directory. I was able to do it with Python. Here's the script:
import os, sys, shutil
source = "/tmp/test/source"
target = "/tmp/test/target"
for root, dirs, files in os.walk(source): # for files and directories in source
    for dir in dirs:
        if dir.startswith("."):
            print(f"Removing Hidden Directory: {dir}")
        else:
            print(f"Removing Directory: {dir}")
        try:
            shutil.rmtree(f"{target}/{dir}") # remove directories and sub-directories
        except FileNotFoundError:
            pass
    for file in files:
        if file.startswith("."): # if filename starts with a dot, it's a hidden file
            print(f"Removing Hidden File: {file}")
        else:
            print(f"Removing File: {file}")
        try:
            os.remove(f"{target}/{file}") # remove files
        except FileNotFoundError:
            pass
print("Done")
The script above looks in the original (source) directory and lists its files. Then it looks in the directory you copied the files to (target) and removes only the listed files, that is, the ones that also exist in the source directory.
How can I do the same thing in Go? I tried filepath.WalkDir(), but as stated in the docs:
WalkDir walks the file tree rooted at root, calling fn for each file
or directory in the tree, including root.
If WalkDir() includes the root, then os.Remove() or os.RemoveAll() will delete the whole thing.
Answered by Cerise Limon. Use os.ReadDir to read the source directory entries. For each entry, call os.RemoveAll on the corresponding target file.

File not found error in os.rename

I am trying to write a program that sorts a large number of files into folders according to the groups indicated in their file names. I wrote the following code, but when I run it, it gives me a file not found error, even though the file is in the given path. I'd appreciate any help in figuring out what is wrong.
import os
old_dir = '/Users/User/Desktop/MyFolder'
for f in os.listdir(old_dir):
    file_name, file_ext = os.path.splitext(f)
    file_name.split('-')
    split_file_name = file_name.split('-')
    new_dir = os.path.join(old_dir,
                           '-'.join(split_file_name[:3]),
                           split_file_name[5],
                           f)
    os.rename(os.path.join(old_dir, f), new_dir)
Here's the error:
Traceback (most recent call last):
  File "/Users/User/Documents/Sort Files into Folders/Sort Files into Folders.py", line 19, in <module>
    os.rename(os.path.join(old_dir, f), new_dir)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/User/Desktop/MyFolder/AHA35-3_30x1_12-31-7d-g1a1-ArmPro.jpg' -> '/Users/User/Desktop/MyFolder/AHA35-3_30x1_12-31/ArmPro/AHA35-3_30x1_12-31-7d-g1a1-ArmPro.jpg'
os.rename does not automatically create new directories (recursively), if the new name happens to be a filename in a directory that does not exist.
To create the directories first, you can (in Python 3) use:
os.makedirs(dirname, exist_ok=True)
where dirname can contain subdirectories (existing or not).
Alternatively, use os.renames, which can handle new and intermediate directories. From the documentation:
Recursive directory or file renaming function. Works like rename(), except creation of any intermediate directories needed to make the new pathname good is attempted first.
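As a rough sketch of the first suggestion (Python 3), the missing folders can be created from the destination's parent path just before the rename. The helper name move_into is hypothetical; only os.makedirs, os.path.dirname and os.rename are real library calls here.

```python
import os


def move_into(src_path, dest_path):
    """Move src_path to dest_path, creating missing parent folders first.

    Hypothetical helper name, used only for illustration.
    """
    parent = os.path.dirname(dest_path)
    if parent:
        # exist_ok=True avoids an error when the folder is already there.
        os.makedirs(parent, exist_ok=True)
    os.rename(src_path, dest_path)
```

In the question's loop this would replace the final os.rename call: move_into(os.path.join(old_dir, f), new_dir). os.renames(old, new) does the folder creation in a single call, but it also tries to prune source directories left empty after the move, which may or may not be wanted.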
os.rename needs the full path for both the old and the new name, so the call should look like:
os.rename(path+old_name, path+new_name)

Loop through folders in Python and for files containing strings

I am very new to python.
I need to iterate through the subdirectories of a given directory and return all files containing a certain string.
for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith((".sql")):
            if 'gen_dts' in open(name).read():
                print name
This was the closest I got.
The syntax error I get is
Traceback (most recent call last):
  File "<pyshell#77>", line 4, in <module>
    if 'gen_dts' in open(name).read():
IOError: [Errno 2] No such file or directory: 'dq_offer_desc_bad_pkey_vw.sql'
The 'dq_offer_desc_bad_pkey_vw.sql' file does not contain 'gen_dts' in it.
I appreciate the help in advance.
You're getting that error because you're trying to open name, which is just the file's name, not its full path. What you need to do is open(os.path.join(root, name), 'r') (I added the mode since it's good practice).
for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith('.sql'):
            filepath = os.path.join(root, name)
            if 'gen_dts' in open(filepath, 'r').read():
                print filepath
os.walk() returns a generator that gives you tuples like (root, dirs, files), where root is the current directory, and dirs and files are the names of the directories and files, respectively, that are in the root directory. Note that they are the names, not the paths; or to be precise, they're the path of that directory/file relative to the current root directory, which is another way of saying the same thing. Another way to think of it is that the directories and files in dirs and files will never have slashes in them.
One final point; the root directory paths always begin with the path that you pass to os.walk(), whether it was relative to your current working directory or not. So, for os.walk('three'), the root in the first tuple will be 'three' (for os.walk('three/'), it'll be 'three/'). For os.walk('../two/three'), it'll be '../two/three'. For os.walk('/one/two/three/'), it'll be '/one/two/three/'; the second one might be '/one/two/three/four'.
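To see how the root values relate to the argument, a tiny Python 3 sketch can help. The helper name walk_roots is made up for illustration; it just collects the first element of each tuple that os.walk yields.

```python
import os


def walk_roots(top):
    """Return the root (dirpath) value of every tuple os.walk yields for top."""
    return [root for root, dirs, files in os.walk(top)]
```

With a folder three that contains a single subfolder four, walk_roots('three') returns ['three', 'three/four'] on a POSIX system when run from the parent directory. Passing an absolute path gives absolute roots instead, because each root is built by joining names onto whatever was passed in.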
The files are just the file names. You need to add the path to them before opening them. Use os.path.join.

iterating through folders and from each use one specific file in a method python

What I want to do is iterate through folders in a directory and in each folder find a file 'fileX' which I want to give to a method which itself needs the file name as a parameter to open it and get a specific value from it. So 'method' will extract some value from 'fileX' (the file name is the same in every folder).
My code looks something like this but I always get told that the file I want doesn't exist which is not the case:
import os
import xy
rootdir = r'path'
for root, dirs, files in os.walk(rootdir):
    for file in files:
        gain = xy.method(fileX)
        print gain
Also my folders I am iterating through are named like 'folderX0', 'folderX1',..., 'folderX99', meaning they all have the same name with increasing ending numbers. It would be nice if I could tell the program to ignore every other folder which might be in 'path'.
Thanks for the help!
os.walk returns file and directory names relative to the root directory that it gives. You can combine them with os.path.join:
for root, dirs, files in os.walk(rootdir):
    for file in files:
        gain = xy.method(os.path.join(root, file))
        print gain
See the documentation for os.walk for details:
To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).
To trim it to ignore any folders but those named folderX, you could do something like the following. When doing os.walk top down (the default), you can delete items from the dirs list to prevent os.walk from looking in those directories.
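import re
for root, dirs, files in os.walk(rootdir):
    # Prune in place (slice assignment) so os.walk skips non-matching folders.
    # Calling dirs.remove() while looping over dirs would skip some entries.
    dirs[:] = [d for d in dirs if re.match(r'folderX[0-9]+$', d)]
    for file in files:
        gain = xy.method(os.path.join(root, file))
        print gain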
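The pruning idea can be tried in isolation with a small Python 3 sketch. The function name files_in_matching_folders is hypothetical, and the pattern is the one from the answer. It uses slice assignment rather than removing items inside a loop, since removing from a list while iterating over it can skip entries.

```python
import os
import re

# Pattern taken from the answer above: folderX followed by one or more digits.
FOLDER_PATTERN = re.compile(r'folderX[0-9]+$')


def files_in_matching_folders(rootdir):
    """Return full paths of files under rootdir, descending only into matching folders."""
    found = []
    for root, dirs, files in os.walk(rootdir):
        # Slice assignment edits the very list os.walk consults for its next steps,
        # so folders dropped here are never visited (top-down walks only).
        dirs[:] = [d for d in dirs if FOLDER_PATTERN.match(d)]
        for name in files:
            found.append(os.path.join(root, name))
    return found
```

Note that the filter applies at every depth, and files sitting directly in rootdir are still returned. If only the folderX directories themselves matter, an extra check on root would be needed.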

Python script errors out

I have this script, which I have no doubt is flawed:
import fnmatch, os, sys
def findit (rootdir, find, pattern):
    for folder, dirs, files in os.walk(rootdir):
        print (folder)
    for filename in fnmatch.filter(files,pattern):
        with open(filename) as f:
            s = f.read()
            f.close()
        if find in s :
            print(filename)
findit(sys.argv[1], sys.argv[2], sys.argv[3])
When I run it I get Errno 2, no such file or directory, but the file exists. For instance, if I run findit.py c:\python "folder" *.py it works just fine, listing all the *.py files that contain the word "folder". But if I run findit.py c:\php\projects1 "include" *.php I get [Errno2] no such file or directory: 'About.php' (for example). But About.php exists. I don't understand what it's doing, or what I'm doing wrong.
If you look at any of the examples for os.walk, you'll see that they all do os.path.join(root, name). You need to do that too.
Why? Quoting from the docs:
filenames is a list of the names of the non-directory files in dirpath. Note that the names in the lists contain no path components. To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).
If you just use the filename as a path, it's going to look for a file of the same name in the current working directory. If there's no such file, you'll get a FileNotFoundError. If there is such a file, you'll open and read the wrong file. Only if you happen to be looking inside the current working directory will it work.
There's also another major problem in your code: os.walk walks a directory tree recursively, finding all files in the given top directory, or any subdirectory of top, or any subdirectory of… and so on, yielding once for each directory. But you're not doing anything useful with that (except printing out the folders). Instead, you wait until it finishes, and then use the files from whichever directory it happened to reach last.
If you just want to get a flat listing of the files directly in a directory, use os.listdir, not os.walk. (Or maybe use glob.glob instead of explicitly listing everything then filtering with fnmatch.)
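A minimal sketch of that flat, non-recursive variant in Python 3 might look like this. The function name find_in_flat_dir is hypothetical; glob.escape (Python 3.4+) is used so that special characters in the directory name are not treated as wildcards.

```python
import glob
import os


def find_in_flat_dir(directory, find, pattern):
    """Return paths of files directly in directory that match pattern and contain find."""
    matches = []
    # glob.glob returns paths that already include the directory part,
    # so there is no bare file name to resolve against the working directory.
    for pathname in glob.glob(os.path.join(glob.escape(directory), pattern)):
        if not os.path.isfile(pathname):
            continue  # skip directories whose names happen to match
        with open(pathname) as f:
            if find in f.read():
                matches.append(pathname)
    return matches
```

Subfolders are not searched here, which is the point of the flat version.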
On the other hand, if you want to walk the tree, you have to move your second for loop inside the first one.
You've also got a minor problem: You call f.close() inside a with open(…) as f:, which leads to f being closed twice. This is guaranteed to be completely harmless (at least in 2.5+, including 3.x), but it's still a bad idea.
Putting it together, here's a working version of your code:
def findit (rootdir, find, pattern):
    for folder, dirs, files in os.walk(rootdir):
        print (folder)
        for filename in fnmatch.filter(files,pattern):
            pathname = os.path.join(folder, filename)
            with open(pathname) as f:
                s = f.read()
            if find in s:
                print(pathname)
You are using a bare filename, which is resolved relative to your current directory. But your current directory does not contain the file, and you don't want to search there anyway. Use os.path.join(folder, filename) to build a path that includes the folder the file was found in.
