I would like to rename images based on part of the name of the folder the images are in and iterate through the images. I am using os.walk and I was able to rename all the images in the folders but could not figure out how to use the letters to the left of the first hyphen in the folder name as part of the image name.
Folder name: ABCDEF - THIS IS - MY FOLDER - NAME
Current image names in folder:
dsc_001.jpg
dsc_234.jpg
dsc_123.jpg
Want to change to show like this:
ABCDEF_1.jpg
ABCDEF_2.jpg
ABCDEF_3.jpg
What I have is this, but I am not sure why I am unable to split the filename by the hyphen:
import os
from os.path import join
path = r'C:\folderPath'
i = 1
for root, dirs, files in os.walk(path):
    for image in files:
        prefix = files.split(' - ')[0]
        os.rename(os.path.join(path, image), os.path.join(path, prefix + '_'
                  + str(i)+'.jpg'))
        i = i+1
Okay, I've re-read your question and I think I know what's wrong.
1.) os.walk() is recursive, i.e. if you call os.walk('C:\\'), it will loop through every folder and find every file under the C drive. I'm not sure whether your C:\folderPath has any sub-folders in it. If it does, and any of those folders or files don't follow the same naming convention as C:\folderPath, your code is going to have a bad time.
2.) When you iterate through files, you are split()ing the wrong object. Your question states you want to split the folder name, but your code calls split() on files, which is a list of all the files in the directory currently being visited. A list has no split() method, so this raises an AttributeError. Depending on whether your ABCDEF folder is C:\folderPath itself or a sub-folder within it, you'll need to code differently.
3.) You imported join from os.path but still call the full name os.path.join(), which is redundant. Either import only os and call os.path.join(), or keep your current imports and call join().
Having said all of that, here are my edits:
Answer 1:
If your ABCDEF is the assigned folder
import os
from os.path import join
path = r'C:\ABCDEF - THIS - IS - MY - FOLDER - NAME'
for root, dirs, files in os.walk(path):
    folder = root.split("\\")[-1] # This gets you the current folder's name
    for i, image in enumerate(files):
        new_image = "{0}_{1}.jpg".format(folder.split(' - ')[0], i + 1)
        os.rename(join(root, image), join(root, new_image)) # use root (the folder being visited), not path
    break # if you have sub folders that follow the SAME structure, then remove this break. Otherwise, keep it here so your code stops after all the files are updated in your parent folder.
Answer 2:
Assuming your ABCDEF's are all sub folders under the assigned directory, and all of them follow the same naming convention.
import os
from os.path import join
path = r'C:\parentFolder' # The folder that has all the sub folders that are named ABCDEF...
for walk_index, (root, dirs, files) in enumerate(os.walk(path)):
    if walk_index == 0: continue # skip the parentFolder as it doesn't follow the same naming convention
    folder = root.split("\\")[-1] # This gets you the current folder's name
    for i, image in enumerate(files):
        new_image = "{0}_{1}.jpg".format(folder.split(' - ')[0], i + 1)
        os.rename(join(root, image), join(root, new_image)) # the files live in root, not in path
Note:
If your scenario doesn't fall under either of these, please make it clear what your folder structure is (a sample including all sub folders and sub files). Remember, consistency is key in determining how your code should work. If it's inconsistent, your best bet is to use Answer 1 on each target folder separately.
Changes:
1.) You can get an incremental index without doing an i += 1. enumerate() is a great tool for iterables because it also gives you the iteration number.
2.) Your split() should operate on the folder name instead of files (a list). In your case, image is the actual file name, and files is the list of files in the directory currently being visited.
3.) Use of the str.format() function to make your new file name format easier to read.
4.) You'll note the use of split("\\") instead of split(r"\"), and that's because a raw string literal cannot end with a single backslash: r"\" is a syntax error.
This should now work. I ended up doing a lot more research than expected such as how to handle the os.walk() properly in both scenarios. For future reference, a little google search goes a long way. I hope this finally answers your question. Remember, doing your own research and clarity in demonstrating your problem will get you more efficient answers.
Bonus: if you have Python 3.6+, you can even use f-strings for your new file name, which ends up looking really cool:
new_image = f"{folder.split(' - ')[0]}_{i + 1}.jpg"
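For comparison, here is a minimal sketch of the same renaming with pathlib, which avoids splitting on backslashes by hand. The function name rename_images and the extensions parameter are my own choices for illustration, and the demo runs on a throwaway temporary directory rather than your real folder. It assumes no file called ABCDEF_1.jpg (and so on) already exists in the folder.

```python
import tempfile
from pathlib import Path

def rename_images(folder, extensions=(".jpg",)):
    """Rename images in `folder` to <PREFIX>_<n><ext>, where PREFIX is the
    part of the folder name before the first ' - '. Returns the new names."""
    folder = Path(folder)
    prefix = folder.name.split(" - ")[0]
    # Build the sorted list first so we never rename while still listing.
    images = sorted(p for p in folder.iterdir()
                    if p.is_file() and p.suffix.lower() in extensions)
    new_names = []
    for n, image in enumerate(images, start=1):
        target = image.with_name(f"{prefix}_{n}{image.suffix.lower()}")
        image.rename(target)
        new_names.append(target.name)
    return new_names

# Demonstration on a throwaway directory (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "ABCDEF - THIS IS - MY FOLDER - NAME"
    demo.mkdir()
    for name in ("dsc_001.jpg", "dsc_234.jpg", "dsc_123.jpg"):
        (demo / name).touch()
    renamed = rename_images(demo)
```

Note that the files are numbered in sorted name order here, so dsc_123.jpg becomes ABCDEF_2.jpg. If you need a different order (for example by modification time), change the sort key.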
Related
What I want to achieve is to get the first item in a folder that is a jpg or png image without having to scan the whole folder.
import os
path = os.getcwd()
# List of folders in the path
folders = next(os.walk(path))[1]
folder = folders[0]  # assumption: inspect the first sub-folder
# Get the first element
folders_walk = os.walk(os.path.join(path, folder))
first = next(folders_walk)[2][0]
With this code I get the first element of the folder, but this may or may not be an image. Any advice?
Not sure what you mean by "without having to scan the entire folder". You could use glob(), but that would still scan the entire directory to match the pattern.
Anyway, see a solution below. It is easy to modify if you don't want a recursive search (as below) or want a different criterion to determine whether a file is an image.
import os
search_root_directory = os.getcwd()
# Recursively construct list of files under root directory.
all_files_recursive = sum([[os.path.join(root, f) for f in files] for root, dirs, files in os.walk(search_root_directory)], [])
# Define function to tell if a given file is an image
# Example: search for .jpg and .png extensions (case-insensitive).
def is_an_image(fpath):
    return os.path.splitext(fpath)[-1].lower() in ('.jpg', '.png')
# Take the first matching result. Note: throws StopIteration if not found
first_image_file = next(filter(is_an_image, all_files_recursive))
Note that the above will be much more efficient (in the recursive case) if sum(), which builds the entire list of files up front, is replaced by a generator that yields paths lazily, so that the walk stops at the first match (but the code is less clear that way).
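A minimal sketch of that lazy variant, in case it helps. The names iter_files, first_image and IMAGE_EXTENSIONS are mine, and deciding by extension is an assumption about what counts as an image. The demo uses a temporary directory with made-up file names.

```python
import os
import tempfile

IMAGE_EXTENSIONS = ('.jpg', '.png')  # assumption: the extension decides what is an image

def iter_files(root_directory):
    """Yield file paths one at a time instead of building the full list."""
    for root, dirs, files in os.walk(root_directory):
        for name in files:
            yield os.path.join(root, name)

def first_image(root_directory):
    """Return the first image path found, or None if there is none."""
    matches = (p for p in iter_files(root_directory)
               if os.path.splitext(p)[1].lower() in IMAGE_EXTENSIONS)
    return next(matches, None)  # the default avoids StopIteration

# Demonstration on a throwaway directory (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    for name in ('notes.txt', 'photo.PNG'):
        open(os.path.join(tmp, name), 'w').close()
    found_name = os.path.basename(first_image(tmp))
    empty_result = first_image(os.path.join(tmp, 'missing'))
```

os.walk still lists each directory it visits in full, so this only saves descending into the directories that come after the first match. Within a single folder, the whole listing is still read.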
I'm really new to python and looking to organize hundreds of files and want to use regex to move them to the correct folders.
Example: I would like to move 4 files into different folders.
File A has "USA" in the name
File B has "Europe" in the name
File C has both "USA" and "Europe" in the name
File D has "World" in the name
Here is what I am thinking but I don't think this is correct
shutil.move('Z:\local 1\[.*USA.*]', 'Z:\local 1\USA')
shutil.move('Z:\local 1\[.*\(Europe\).*]', 'Z:\local 1\Europe')
shutil.move('Z:\local 1\[.*World.*]', 'Z:\local 1\World')
You can list all the files in a directory and move each one to a new folder if its name matches a given regular expression, as follows:
import os
import re
import shutil
source_dir = 'path/to/some/directory'
for filename in os.listdir(source_dir):
    # os.listdir returns bare file names, so match against the name only, not a full path
    if re.search(r'USA', filename):
        shutil.move(os.path.join(source_dir, filename), r'Z:\local 1\USA')
    elif re.search(r'Europe', filename):
        shutil.move(os.path.join(source_dir, filename), r'Z:\local 1\Europe')
    # and so forth
However, os.listdir shows only the direct subfolders and files, but it does not iterate deeper. If you want to analyze all the files recursively in a given folder use the os.walk method.
According to the definition of shutil.move, it needs two things:
src, which is a path of a source file
dst, which is a path to the destination folder.
It says that src and dst should be paths, not regular expressions.
What you do have is os.listdir(), which lists the files in a directory.
So what you need to do is to list files, then try to match file names against regular expressions. If you get a match, then you know where the file should go.
That said, you still need to decide what to do with option C that matches both 'USA' and 'Europe'.
For added style points you can put pairs of (regex, destination_path) into a list, tuple or dict; in this case you can add any number of rules without changing or duplicating the logic.
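Here is a rough sketch of that rule-list idea. The function name sort_files, the "first matching rule wins" policy for files like option C, and the file names in the demo are all my own assumptions, so adjust them to your case. The demo works in a temporary directory instead of Z:\local 1.

```python
import os
import re
import shutil
import tempfile

def sort_files(source_dir, rules):
    """Move each file in source_dir to the destination of the first rule
    whose pattern is found in the file name. Returns {file name: destination}."""
    moved = {}
    for filename in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, filename)
        if not os.path.isfile(src):
            continue  # leave sub-folders alone
        for pattern, destination in rules:
            if re.search(pattern, filename):
                os.makedirs(destination, exist_ok=True)
                shutil.move(src, os.path.join(destination, filename))
                moved[filename] = destination
                break  # first matching rule wins (assumed policy for option C)
    return moved

# Demonstration on a throwaway directory (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    inbox = os.path.join(tmp, 'inbox')
    os.makedirs(inbox)
    for name in ('Game A (USA).zip', 'Game B (Europe).zip',
                 'Game C (USA, Europe).zip', 'Game D (World).zip'):
        open(os.path.join(inbox, name), 'w').close()
    rules = [
        (r'USA', os.path.join(tmp, 'USA')),
        (r'Europe', os.path.join(tmp, 'Europe')),
        (r'World', os.path.join(tmp, 'World')),
    ]
    result = sort_files(inbox, rules)
    destinations = {name: os.path.basename(dest) for name, dest in result.items()}
```

Because the rules are checked in order, putting a more specific pattern (say one that matches both "USA" and "Europe") at the top of the list is one way to handle option C differently.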
Hello, I want to move or copy many folders from one list of folders to another list of folders. I use the glob and shutil libraries for this.
First I create a folder list:
import glob
#paths from source folder
sourcepath='C:/my/store/path/*'
paths = glob.glob(sourcepath)
my_file='10'
selected_path = filter(lambda x: my_file in x, paths)
#paths from destination folder
destpath='C:/my/store/path/*'
paths2 = glob.glob(destpath)
my_file1='20'
selected_path1 = filter(lambda x: my_file1 in x, paths2)
Now I have two lists of paths (selected_path, selected_path1).
Now I want to move or copy each folder from the first list (selected_path) to the second list (selected_path1).
Finally I tried this code to move the folders, but without success:
import shutil
for I,j in zip(selected_path,selected_path1)
    shutil.move(i, j)
But that doesn't work. Any idea how to get my code to work?
First, your use of lambda with filter isn't needed: glob can perform this filtering itself, since pattern matching is exactly what glob does. The extra filter pass just adds function calls for no benefit.
Look at this example, similar to yours:
import glob
# Find all .py files
sourcepath= 'C:/my/store/path/*.py'
paths = glob.glob(sourcepath)
# Find files that end with 'codes'
destpath= 'C:/my/store/path/*codes'
paths2 = glob.glob(destpath)
Second, the second glob call may or may not return a list of directories to move your directories/files to. This makes your code dependent on what C:/my/store/path contains. That is, you must guarantee that C:/my/store/path contains only directories and never files, so that glob returns only directories to be used in shutil.move. If a user later adds files to C:/my/store/path whose names happen to end with 'codes' and have no extension (unlike, e.g., codes.txt or codes.py), you'll find those files in the list glob returns in paths2. Of course, guaranteeing that a directory contains only subdirectories is fragile and not a good idea. You can test for directories with os.path.isdir.
Notice also that you're using lambda with filter to keep only the strings that contain 10 in your first call to filter, something you can achieve with glob itself:
glob.glob('C:/my/store/path/*10*')
Now any file or subdirectory of C:/my/store/path that contains 10 in its name will be collected in the returned list of the glob function.
Third, zip truncates to the shortest iterable in its argument list. In other words, if you would like to move each path in paths to the corresponding path in paths2, you need len(paths) == len(paths2) so that each file or directory in paths has a directory to be moved to in paths2.
Fourth, you missed the colon at the end of the for statement, and in the call to shutil.move you used i instead of I. Python is a case-sensitive language, so uppercase I and lowercase i are different names:
import shutil
for I,j in zip(selected_path,selected_path1) # missing :
    shutil.move(i, j) # i not I
Corrected code:
import shutil
for i, j in zip(selected_path, selected_path1):
    shutil.move(i, j)
Assuming paths2 contains only subdirectories of C:/my/store/path, this is a better way to write your code, though definitely not the best:
import glob
import shutil
#paths from source folder
sourcepath='C:/my/store/path/*10*'
paths = glob.glob(sourcepath)
#paths from destination folder
destpath='C:/my/store/path/*20*'
paths2 = glob.glob(destpath)
for i,j in zip(paths,paths2):
    shutil.move(i, j)
*Still some of the previous issues that I mentioned above apply to this code.
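To address the two main remaining issues (files sneaking into the destination list, and zip silently truncating), something along these lines might work. This is only a sketch: move_pairs is a name I made up, pairing the sorted lists one-to-one is my assumption about what you want, and the demo runs in a temporary directory with invented folder names.

```python
import glob
import os
import shutil
import tempfile

def move_pairs(source_pattern, dest_pattern):
    """Move the n-th source match into the n-th destination directory."""
    sources = sorted(glob.glob(source_pattern))
    # Keep only real directories as destinations.
    destinations = sorted(p for p in glob.glob(dest_pattern) if os.path.isdir(p))
    if len(sources) != len(destinations):
        # zip would silently drop the extras, so fail loudly instead
        raise ValueError('%d sources but %d destination directories'
                         % (len(sources), len(destinations)))
    # shutil.move returns the final path of the moved item
    return [shutil.move(src, dst) for src, dst in zip(sources, destinations)]

# Demonstration on a throwaway directory (hypothetical names)
with tempfile.TemporaryDirectory() as tmp:
    for folder in ('a10', 'b10', 'x20', 'y20'):
        os.makedirs(os.path.join(tmp, folder))
    open(os.path.join(tmp, 'notes20.txt'), 'w').close()  # a file, filtered out by isdir
    moved = move_pairs(os.path.join(tmp, '*10*'), os.path.join(tmp, '*20*'))
    moved_relative = [os.path.relpath(p, tmp) for p in moved]
```

Since each destination is an existing directory, shutil.move places the source inside it, so a10 ends up as x20/a10.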
And now that you finished the long marathon of reading this answer, what would you like to do to improve your code? I'll be glad to help if you still find something ambiguous.
Good luck :)
I've got two tasks:
I've set up my digital library in the format of a Dewey Decimal Classification, so I've got a 3-deep hierarchy of 10 + 100 + 1000 folders, with directories sometimes going a little deeper. This library structure contains my "books" that I would like to list in a catalog (perhaps a searchable text document). It would be preferable, though not absolutely necessary, if I could view the parent directory name in a separate column next to each "book".
The problem is that some of the "books" in my library are folders that stand alone as items. I planned ahead when I devised this system and made it so that each item in my library would contain a tag in []s that would contain the author name, for instance, and so the idea is that I would try to perform a recursive listing of all of this, but end each recursion when it encounters anything with a [ in the name, directory or file.
How might I go about this? I know a bit of Python (which is originally what I used to create the library structure), and since this is on an external hard drive, I can do this in either Windows or Linux. My rough idea was to perform some sort of a recursive listing that would check the name of each directory or file for a [, and if it did, stop and add it (along with the name of the parent directory) to a list. I don't have any idea where to start.
The answer is based on os.walk(), where on each iteration you get:
dirName: The next directory it found.
subdirList: A list of sub-directories in the current directory.
fileList: A list of files in the current directory.
Pruning cannot be done by rebinding subdirList to a new list (for example with a plain list comprehension), because os.walk only respects changes made to subdirList in place. Instead, we enumerate a copy of the list in reverse order and delete from the original, so that the remaining indices stay valid after each deletion.
I haven't tried it so don't trust this 100%.
# Import the os module, for the os.walk function
import os
# Set the directory you want to start from
rootDir = '.'
for dirName, subdirList, fileList in os.walk(rootDir):
    print('Found directory: %s' % dirName)
    for fname in fileList:
        print('\t%s' % fname)
    for i, elem in reversed(list(enumerate(subdirList[:]))):
        if "[" in elem:
            del subdirList[i]
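Building on the same idea, a sketch that actually collects the catalog entries (item name plus parent directory name) could look like the following. The function name catalog and the demo folder names are invented for illustration, and I'm assuming that any name containing "[" is a "book".

```python
import os
import tempfile

def catalog(root_dir):
    """Return sorted (parent directory name, item name) pairs for every file or
    directory whose name contains '['. Tagged directories are not descended into."""
    items = []
    for dir_path, subdirs, files in os.walk(root_dir):
        parent = os.path.basename(dir_path)
        tagged = [name for name in subdirs + files if '[' in name]
        items.extend((parent, name) for name in tagged)
        # Slice assignment edits the list in place, which is what os.walk needs
        subdirs[:] = [name for name in subdirs if '[' not in name]
    return sorted(items)

# Demonstration on a throwaway directory (hypothetical library layout)
with tempfile.TemporaryDirectory() as tmp:
    book_dir = os.path.join(tmp, '500', '510', 'Calculus [Spivak]')
    os.makedirs(os.path.join(book_dir, 'scans'))
    os.makedirs(os.path.join(tmp, '800'))
    open(os.path.join(tmp, '800', 'Poems [Frost].txt'), 'w').close()
    open(os.path.join(book_dir, 'notes [x].txt'), 'w').close()  # inside a book, so not listed
    books = catalog(tmp)
```

Each pair can then be written out as one tab-separated line to get a searchable text file with the parent directory in its own column.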
I do atomistic modelling, and use Python to analyze simulation results. To simplify work with a whole bunch of Python scripts used for different tasks, I decided to write a simple GUI to run scripts from it.
I have a (rather complex) directory structure beginning from some root (say ~/calc), and I want to populate a wx.TreeCtrl control with the directories containing calculation results, preserving their structure. A folder contains results if it contains a file with the .EXT extension. What I try to do is walk through the dirs from root and in each dir check whether it contains an .EXT file. When such a dir is reached, add it and its ancestors to the tree:
def buildTree(self, rootdir):
    root = rootdir
    r = len(rootdir.split('/'))
    ids = {root : self.CalcTree.AddRoot(root)}
    for (dirpath, dirnames, filenames) in os.walk(root):
        for dirname in dirnames:
            fullpath = os.path.join(dirpath, dirname)
            if sum([s.find('.EXT') for s in filenames]) > -1 * len(filenames):
                ancdirs = fullpath.split('/')[r:]
                ad = rootdir
                for ancdir in ancdirs:
                    d = os.path.join(ad, ancdir)
                    ids[d] = self.CalcTree.AppendItem(ids[ad], ancdir)
                    ad = d
But this code ends up with many second-level nodes with the same name, and that's definitely not what I want. So I somehow need to see if the node is already added to the tree, and in positive case add new node to the existing one, but I do not understand how this could be done. Could you please give me a hint?
Besides, the code contains 2 dirty hacks I'd like to get rid of:
I get the list of ancestor dirs by splitting the full path at / positions, and this is Linux-specific;
I find whether an .EXT file is in the directory by trying to find the extension in the strings from the filenames list, taking into account that s.find returns -1 if the substring is not found.
Is there a way to make these chunks of code more readable?
First of all, the hacks:
To get the path separator for whatever OS you're using, you can use os.sep.
Use str.endswith() and the fact that in Python the empty list [] evaluates to False:
if [ file for file in filenames if file.endswith('.EXT') ]:
In terms of getting them all nicely nested, you're best off doing it recursively, so the code would look something like the following. Please note this is an untested sketch just to give you an idea of how to do it, so don't expect it to work as it is!
def buildTree(self, rootdir):
    rootId = self.CalcTree.AddRoot(rootdir)
    self.buildTreeRecursion(rootdir, rootId)
def buildTreeRecursion(self, dirpath, parentId):
    # Iterate over the entries in dirpath
    for name in sorted(os.listdir(dirpath)):
        itemId = self.CalcTree.AppendItem(parentId, name)
        fullpath = os.path.join(dirpath, name)
        if os.path.isdir(fullpath):
            # Recurse so that children are attached to their own parent node
            self.buildTreeRecursion(fullpath, itemId)
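If you only want the directories that contain results (plus their ancestors), one option is to build the structure first, without wx, and then feed it to the tree control. Below is a sketch of that first step. The function name result_tree and the nested-dict representation are my own choices, and the demo directory layout is invented.

```python
import os
import tempfile

def result_tree(dirpath, ext='.EXT'):
    """Return a nested dict of the sub-directories that have a file ending in
    `ext` at or below them, or None if this branch holds no results at all."""
    children = {}
    has_result = False
    for entry in sorted(os.listdir(dirpath)):
        full = os.path.join(dirpath, entry)
        if os.path.isdir(full):
            subtree = result_tree(full, ext)
            if subtree is not None:  # keep only branches that lead to results
                children[entry] = subtree
        elif entry.endswith(ext):
            has_result = True
    return children if (has_result or children) else None

# Demonstration on a throwaway directory (hypothetical layout)
with tempfile.TemporaryDirectory() as tmp:
    calc = os.path.join(tmp, 'calc')
    os.makedirs(os.path.join(calc, 'a', 'run1'))
    os.makedirs(os.path.join(calc, 'a', 'empty'))
    os.makedirs(os.path.join(calc, 'b'))
    open(os.path.join(calc, 'a', 'run1', 'out.EXT'), 'w').close()
    open(os.path.join(calc, 'b', 'notes.txt'), 'w').close()
    tree = result_tree(calc)
```

Since every directory appears exactly once as a key under its parent, walking this dict recursively and calling AppendItem for each key should not produce duplicate nodes.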
Hope this helps!