Renaming multiple images with .rename and .endswith - python

I've been trying to get this to work, but I feel like I'm missing something. I have a large collection of images in a folder, and I need to rename just part of each filename. For example, I'm trying to rename "RJ_200", "RJ_600", and "RJ_601" all to the same "RJ_500", while keeping the rest of the filename intact.
Image01.Food.RJ_200.jpg
Image02.Food.RJ_200.jpg
Image03.Basket.RJ_600.jpg
Image04.Basket.RJ_600.jpg
Image05.Cup.RJ_601.jpg
Image06.Cup.RJ_602.jpg
This is what I have so far, but it keeps just giving me the "else" instead of actually renaming any of them:
import os
import fnmatch
import sys
user_profile = os.environ['USERPROFILE']
dir = user_profile + "\Desktop" + "\Working"
print (os.listdir(dir))
for images in dir:
    if images.endswith("RJ_***.jpg"):
        os.rename("RJ_***.jpg", "RJ_500.jpg")
    else:
        print ("Arg!")

The Python string method endswith does not do pattern-matching with *, so you're looking for filenames which explicitly include the asterisk character and not finding any.
Try using regular expressions to match your filenames and then building your target filename explicitly:
import os
import re
patt = r'RJ_\d\d\d'
user_profile = os.environ['USERPROFILE']
path = os.path.join(user_profile, "Desktop", "Working")
image_files = os.listdir(path)
for filename in image_files:
    flds = filename.split('.')
    try:
        frag = flds[2]
    except IndexError:
        continue
    if re.match(patt, frag):
        from_name = os.path.join(path, filename)
        to_name = '.'.join([flds[0], flds[1], 'RJ_500', 'jpg'])
        os.rename(from_name, os.path.join(path, to_name))
Note that you need to do your matching with the file's basename and join on the rest of the path later.
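To see the difference on a single name without touching any files, here is a minimal sketch. The sample name comes from the question; the fnmatch pattern and the re.sub call are illustrative alternatives, not the code from the answer above.

```python
import fnmatch
import re

name = "Image01.Food.RJ_200.jpg"

# endswith compares literally, so the asterisks are never treated as wildcards
literal_hit = name.endswith("RJ_***.jpg")

# fnmatch understands shell-style wildcards: ? matches exactly one character
wildcard_hit = fnmatch.fnmatchcase(name, "*.RJ_???.jpg")

# re.sub swaps the three-digit code and leaves the rest of the name alone
new_name = re.sub(r"RJ_\d{3}(?=\.jpg$)", "RJ_500", name)

print(literal_hit, wildcard_hit, new_name)
```

The lookahead `(?=\.jpg$)` only lets the substitution fire when the code sits directly before the final `.jpg`.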

You don't need to use .endswith. You can split the image file name up using .split and check the results. Since there are several suffix strings involved, I've put them all into a set for fast membership testing.
import os
suffixes = {"RJ_200", "RJ_600", "RJ_601"}
new_suffix = "RJ_500"
user_profile = os.environ["USERPROFILE"]
dir = os.path.join(user_profile, "Desktop", "Working")
for image_name in os.listdir(dir):
    pieces = image_name.split(".")
    # Skip names that don't have the four expected dot-separated parts
    if len(pieces) == 4 and pieces[2] in suffixes:
        from_path = os.path.join(dir, image_name)
        new_name = ".".join([pieces[0], pieces[1], new_suffix, pieces[3]])
        to_path = os.path.join(dir, new_name)
        print("renaming {} to {}".format(from_path, to_path))
        os.rename(from_path, to_path)
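The split-based check above assumes names with exactly four dot-separated parts. If some names might contain extra dots, one possible variation is to split from the right instead. This is only a sketch: swap_code is a hypothetical helper name, and it only computes the new name rather than renaming anything.

```python
def swap_code(name, codes, new_code):
    """Return name with its second-to-last dot-separated part replaced, or None."""
    parts = name.rsplit(".", 2)  # [everything before, code, extension]
    if len(parts) == 3 and parts[1] in codes:
        return ".".join([parts[0], new_code, parts[2]])
    return None


codes = {"RJ_200", "RJ_600", "RJ_601"}
print(swap_code("Image01.Food.RJ_200.jpg", codes, "RJ_500"))
print(swap_code("Image.07.Cup.Extra.RJ_601.jpg", codes, "RJ_500"))
print(swap_code("Image06.Cup.RJ_602.jpg", codes, "RJ_500"))
```

Names whose code is not in the set (such as RJ_602) come back as None, so the caller can skip them.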

Related

importing filename and using regex to change the filename and save back

I have files with names like "lsud-ifgufib-1234568789.png". I want to rename each file to just the digits that follow the last "-", and then save it back in the same folder.
Basically, I want the final filename to be the digits that follow the last "-".
path = 'C:/Users/abc/downloads'
for filename in os.listdir(path):
    r = re.compile("(\d+)")
    newlist = filter(r.match, filename)
    print(newlist)
How do I proceed further?
Assumptions:
You want to rename files if the file has a hyphen before the number.
The file may or may not have an extension.
If the file has an extension, preserve it.
Then would you please try the following:
import re, os
path = 'C:/Users/abc/downloads'
for filename in os.listdir(path):
    m = re.search(r'.*-(\d+.*)', filename)
    if m:
        os.rename(os.path.join(path, filename), os.path.join(path, m.group(1)))
You could try a regex search followed by a path join:
import re
import os
path = 'C:/Users/abc/downloads'
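for filename in os.listdir(path):
    # Digits followed only by non-digits up to the end of the name
    m = re.search(r"\d+(?=\D+?$)", filename)
    if m:
        # os.listdir returns bare names, so the source needs the folder too
        os.rename(os.path.join(path, filename), os.path.join(path, m.group()))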
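Another option is to build the new name from the digits and the original extension:
import re
import pathlib
fileName = "lsud-ifgufib-1234568789.png"
_extn = pathlib.Path(fileName).suffix  # ".png"
_digObj = re.compile(r'\d+')
# Joins every run of digits in the name, including any in the extension
digFileName = ''.join(_digObj.findall(fileName))
replFileName = digFileName + _extn  # "1234568789.png"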
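For comparison, the same idea can be sketched without a regex, using str.rpartition to take whatever follows the last hyphen. The helper name digits_after_last_hyphen is made up for this illustration, and it only computes the new name; the os.rename call would stay as in the answers above.

```python
import os


def digits_after_last_hyphen(filename):
    """Return '<digits><ext>' if digits follow the last '-', else None."""
    stem, ext = os.path.splitext(filename)
    _head, sep, tail = stem.rpartition("-")
    if sep and tail.isdigit():
        return tail + ext
    return None


print(digits_after_last_hyphen("lsud-ifgufib-1234568789.png"))
print(digits_after_last_hyphen("no_hyphen_here.png"))
```

Returning None for names that do not fit the pattern lets the calling loop skip them instead of raising.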

I want to move a file based on part of the name to a folder with that name

I have a directory with a large number of files that I want to move into folders based on part of the file name. My list of files looks like this:
001-020-012B-B.nc
001-022-151-A.nc
001-023-022-PY-T1.nc.nc
001-096-016B-A.nc
I want to move the files I have into separate folders based on the first part of the file name (001-096-016B, 001-023-022, 001-022-151). The first part of the file name always has the same number of digits and always consists of 3 parts separated by a hyphen '-'.
The folder names are named like this \oe-xxxx\xxxx\xxxx\001-Disc-PED\020-Rotor-parts-1200.
So for example, this file should be placed in the above folder, based on the folder name (the numbers):
001-020-012B-B.nc
File path divided into column to show where the above file has to be moved to:
(001)-Disc-PED\(020)-Rotor-parts-1200.
Therefore:
(001)-Disc-PED\(020)-Rotor-parts-1200 (001)-(020)-012B-B.nc
My thinking is that I want to loop through the folders and look for matches.
This is what I have tried from looking online, but it does not work:
import os
import glob
import itertools
import re
#Source file
sourcefile = r'C:\Users\cah\Desktop\000Turning'
destinationPath = r'C:\Users\cah\Desktop\08-CAM'
#Seperation
dirs = glob.glob('*-*')
#Every file with file extension .nc
files = glob.glob('*.nc')
for root, dirs, files in os.walk(sourcefile):
    for file in files:
        if file.endswith(".nc"):
            first3Char = str(file[0:3])
            last3Char = str(file[4:7])
            for root in os.walk(destinationPath):
                first33CharsOfRoot = str(root[0:33])
                cleanRoot1 = str(root).replace("[", "")
                cleanRoot2 = str(cleanRoot1).replace("]", "")
                cleanRoot3 = str(cleanRoot2).replace(")", "")
                cleanRoot4 = str(cleanRoot3).replace("'", "")
                cleanRoot5 = str(cleanRoot4).replace(",", "")
                firstCharOfRoot = re.findall(r'(.{3})\s*$', str(cleanRoot5))
                print(firstCharOfRoot==first3Char)
                if(firstCharOfRoot == first3Char):
                    print("Hello")
                    for root in os.walk(destinationPath):
                        print(os.path.basename(root))
                        # if(os.path)
I realized that I should not look at the last 3 characters of the path: it is the leading numbers (001, etc.) that I need to look for at the beginning of each folder name to find the first folder I need to go to.
EDIT:
import os
import glob
import itertools
import re
#Source file
sourcefile = r'C:\Users\cah\Desktop\000Turning'
destinationPath = r'C:\Users\cah\Desktop\08-CAM'
#Seperation
dirs = glob.glob('*-*')
#Every file with file extension .nc
files = glob.glob('*.nc')
for root, dirs, files in os.walk(sourcefile):
    for file in files:
        if file.endswith(".nc"):
            first3Char = str(file[0:3])
            last3Char = str(file[4:7])
            for root in os.walk(destinationPath):
                cleanRoot1 = str(root).replace("[", "")
                cleanRoot2 = str(cleanRoot1).replace("]", "")
                cleanRoot3 = str(cleanRoot2).replace(")", "")
                cleanRoot4 = str(cleanRoot3).replace("'", "")
                cleanRoot5 = str(cleanRoot4).replace(",", "")
                firstCharOfRoot = re.findall(r'^(?:[^\\]+\\\\){5}(\d+).*$', str(cleanRoot5))
                secondCharOfRoot = re.findall(r'^(?:[^\\]+\\\\){6}(\d+).*$', str(cleanRoot5))
                firstCharOfRootCleaned = ''.join(firstCharOfRoot)
                secondCharOfRoot = ''.join(secondCharOfRoot)
                cleanRoot6 = str(cleanRoot5).replace("(", "")
                if(firstCharOfRootCleaned == str(first3Char) & secondCharOfRoot == str(last3Char)):
                    print("BINGOf")
                    # for root1 in os.walk(cleanRoot6):
Solution
There is an improved solution in the next section, but let's walk through the straightforward solution first.
First, get the complete list of subfolders.
all_folders_splitted = [os.path.split(f)
                        for f in glob.iglob(os.path.join(destinationPath, "**"), recursive=True)
                        if os.path.isdir(f)]
Then, use a function on each of your files to find its matching folder, or to build a new folder path if no match exists. This function, find_folder(), is included in the rest of the script:
import os
import glob
import shutil
sourcefile = r'C:\Users\cah\Desktop\000Turning'
destinationPath = r'C:\Users\cah\Desktop\08-CAM'
all_folders_splitted = [os.path.split(f)
                        for f in glob.iglob(os.path.join(destinationPath, "**"), recursive=True)
                        if os.path.isdir(f)]
# It will create and return a new directory if no directory matches
def find_folder(part1, part2):
    matching_folders1 = [folder for folder in all_folders_splitted
                         if os.path.split(folder[0])[-1].startswith(part1)]
    matching_folder2 = None
    for matching_folder2 in matching_folders1:
        if matching_folder2[-1].startswith(part2):
            return os.path.join(*matching_folder2)
    # Whole new folder tree
    if matching_folder2 is None:
        dest = os.path.join(destinationPath, part1, part2)
        # exist_ok: the same new folder may be requested for several files
        os.makedirs(dest, exist_ok=True)
        return dest
    # Inside the already existing folder part "1"
    dest = os.path.join(matching_folder2[0], part2)
    os.makedirs(dest, exist_ok=True)
    return dest
# All the files you want to move
files_gen = glob.iglob(os.path.join(sourcefile, "**", "*-*-*.nc"), recursive=True)
for file in files_gen:
    # Split the first two "-"
    basename = os.path.basename(file)
    splitted = basename.split("-", 2)
    # Format the destination folder.
    # Creates it if necessary
    destination_folder = find_folder(splitted[0], splitted[1])
    # Copying the file
    shutil.copy2(file, os.path.join(destination_folder, basename))
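Note that the question asks to move the files, while the script above copies them with shutil.copy2. If a real move is wanted, shutil.move can take its place. The sketch below shows the difference in throwaway temporary folders; the folder and file names are borrowed from the question purely for illustration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_dir = os.path.join(tmp, "000Turning")
    dest_dir = os.path.join(tmp, "001-Disc-PED", "020-Rotor-parts-1200")
    os.makedirs(source_dir)
    os.makedirs(dest_dir)

    # Create a small stand-in file to move
    src_file = os.path.join(source_dir, "001-020-012B-B.nc")
    with open(src_file, "w") as fh:
        fh.write("demo")

    # shutil.move returns the destination path
    moved_to = shutil.move(src_file, os.path.join(dest_dir, "001-020-012B-B.nc"))
    still_in_source = os.path.exists(src_file)
    now_in_dest = os.path.exists(moved_to)

print(still_in_source, now_in_dest)
```

Copying first and deleting the originals only after checking the result is the more cautious route, which may be why the answer copies.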
Improved solution
If you have a large number of files, it is wasteful to "split and match" every folder on each iteration.
We can store the folder found for a given pattern in a dictionary. The dictionary is updated when a new pattern is seen; otherwise the previously found folder is returned.
import os
import glob
import shutil
sourcefile = r'C:\Users\cah\Desktop\000Turning'
destinationPath = r'C:\Users\cah\Desktop\08-CAM'
# Global dictionary to store folder paths, relative to a pattern
pattern_match = dict()
all_folders_splitted = [os.path.split(f)
                        for f in glob.iglob(os.path.join(destinationPath, "**"), recursive=True)
                        if os.path.isdir(f)]
def find_folder(part1, part2):
    current_key = tuple([part1, part2])
    if current_key in pattern_match:
        # Already found previously.
        # We just return the folder path, stored as the value.
        return pattern_match[current_key]
    matching_folders1 = [folder for folder in all_folders_splitted
                         if os.path.split(folder[0])[-1].startswith(part1)]
    matching_folder2 = None
    for matching_folder2 in matching_folders1:
        if matching_folder2[-1].startswith(part2):
            dest = os.path.join(*matching_folder2)
            # Update the dictionary
            pattern_match[current_key] = dest
            return dest
    if matching_folder2 is None:
        dest = os.path.join(destinationPath, part1, part2)
    else:
        dest = os.path.join(matching_folder2[0], part2)
    # Update the dictionary
    pattern_match[current_key] = dest
    os.makedirs(dest, exist_ok=True)
    return dest
# All the files you want to move
files_gen = glob.iglob(os.path.join(sourcefile, "**", "*-*-*.nc"), recursive=True)
for file in files_gen:
    # Split the first two "-"
    basename = os.path.basename(file)
    splitted = basename.split("-", 2)
    # Format the destination folder.
    # Creates it if necessary
    destination_folder = find_folder(splitted[0], splitted[1])
    # Copying the file
    shutil.copy2(file, os.path.join(destination_folder, basename))
This updated solution makes it more efficient (especially when many files should share the same folder) and you could also make use of the dictionary later, if you save it.
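A related idea is to build the whole lookup dictionary up front in a single walk of the destination tree, rather than filling it lazily. The sketch below is an assumption-laden variation, not the answer's method: build_folder_index and key_for are hypothetical names, and folders are matched on the exact text before the first "-" instead of with startswith.

```python
import os
import tempfile


def build_folder_index(destination_root):
    """Map (parent prefix, folder prefix) to the folder path, in one walk."""
    index = {}
    for root, dirs, _files in os.walk(destination_root):
        parent_prefix = os.path.basename(root).split("-", 1)[0]
        for name in dirs:
            key = (parent_prefix, name.split("-", 1)[0])
            # Keep the first folder found for a key
            index.setdefault(key, os.path.join(root, name))
    return index


def key_for(basename):
    """First two '-'-separated parts of a file name, e.g. ('001', '020')."""
    return tuple(basename.split("-", 2)[:2])


# Demonstration in a temporary tree that mimics the folder names in the question
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "08-CAM", "001-Disc-PED", "020-Rotor-parts-1200"))
    index = build_folder_index(os.path.join(tmp, "08-CAM"))
    found = index.get(key_for("001-020-012B-B.nc"))
    found_name = os.path.basename(found)
    missing = index.get(key_for("001-096-016B-A.nc"))

print(found_name, missing)
```

A missing key (None here) is the point where the caller would create a new folder, as find_folder() does.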

Separate the part of a filename after the underscore _ (os.path)

In my path /volume1/xx/ there are several files named like A_test1.pdf, B_test2.pdf, ... I want to extract the test1 part, without the path and without .pdf.
I'm a newbie, so I first tried with the full name,
but I only got "*.pdf" back as text.
What is wrong with the path or the placeholder *?
splitname = os.path.basename('/volume1/xx/*.pdf')
Edit
I get 2019-01-18_RG-Telekom[] from the original ReT_march; I want 2019-01-18_RG-Telekom_march (the text after the underscore). xx is a folder.
Here is the whole code:
#!/usr/bin/env python3
import datetime
import glob
import os
import os.path
SOURCE_PATH = '/volume1/xx'
TARGET_PATH = os.path.join(SOURCE_PATH, 'DMS')
def main():
    today = datetime.date.today()
    splitnames = [os.path.basename(fpath) for fpath in glob.glob("./xx/*.pdf")]
    for prefix, name_part in [
        ('ReA', 'RG-Amazon'),
        ('GsA', 'GS-Amazon'),
        ('ReT', 'RG-Telekom'),
        ('NoE', 'Notiz-EDV'),
    ]:
        filenames = glob.iglob(os.path.join(SOURCE_PATH, prefix + '*.pdf'))
        for old_filename in filenames:
            new_filename = os.path.join(TARGET_PATH, '{}_{}_{}.pdf'.format(today, name_part, splitnames))
            os.rename(old_filename, new_filename)
if __name__ == '__main__':
    main()
Use glob. os.path doesn't know how to process masks, but glob.glob does:
splitnames = [os.path.basename(fpath) for fpath in glob.glob("./**/*.pdf")]
splitnames
Out:
['A_test1.pdf', 'B_test2.pdf']
Output of the glob:
glob.glob("./**/*.pdf")
Out:
['./some_folder/A_test1.pdf', './another_folder/B_test2.pdf']
Apply os.path.basename to this list to extract the basenames, as shown above.
Edit
If xx in the path volume1/xx/ is just a folder name, not a mask, you should use the following expression:
splitnames = [os.path.basename(fpath) for fpath in glob.glob("./xx/*.pdf")]
because ./**/ is an expression that masks a folder name, and it's unnecessary in that case.
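The basenames still contain the prefix and the .pdf extension. To get just the text after the underscore, which is what the question asks for, something along these lines may work. It is a sketch that assumes the first underscore is the separator; part_after_underscore is a made-up helper name.

```python
import os


def part_after_underscore(path):
    """Return the text after the first '_' in the file name, without extension."""
    stem = os.path.splitext(os.path.basename(path))[0]
    _prefix, sep, rest = stem.partition("_")
    return rest if sep else None


print(part_after_underscore("/volume1/xx/A_test1.pdf"))
print(part_after_underscore("/volume1/xx/ReT_march.pdf"))
```

In the question's loop this would be called on each old_filename, so every file gets its own suffix instead of the whole splitnames list.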

Renaming of files for a given pattern

I need help.
There is a folder "C:\TEMP"; the files in this folder are named in the format "IN_ + 7123456789.amr".
I need to rename the files according to a given pattern:
"IN_ NAME _ DATE-CREATE _ Phone number.amr"
Correspondingly, if a file is called "OUT_ + 7123456789.amr", the resulting format is "OUT_ NAME_DATE-CREATE_Phone number.amr".
The question is how to check the file name before os.rename and choose the template depending on the file name.
import os
path = "C:/TEMP"
for i, filename in enumerate(os.listdir(path)):
    os.chdir(path)
    os.rename(filename, 'name'+str(i) +'.txt')
    i = i+1
Sorry, but the examples in your question are not consistent with each other, and I still don't understand what your C:\temp contains...
Well, assuming it would look like:
>>> os.listdir(path)
['IN_ + 7123456789.amr', 'OUT_ + 7123456789.amr']
The example:
import datetime
import re
import os
path = "C:/TEMP"
os.chdir(path)
for filename in os.listdir(path):
    match = re.match(r'(IN|OUT)_ \+ (\d+)\.amr', filename)
    if match:
        file_date = datetime.datetime.fromtimestamp(os.stat(filename).st_mtime)
        destination = '%s_%s_%s_Phone number.amr' % (
            match.group(1),  # either IN or OUT
            match.group(2),
            file_date.strftime('%Y%m%d%H%M%S'),  # adjust the format at your convenience
        )
        os.rename(filename, destination)
Will produce:
IN_7123456789_20150721094227_Phone number.amr
OUT_7123456789_20150721094227_Phone number.amr
Other files won't match the re.match pattern and will be ignored.
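To check the name-building logic before renaming anything, it can be pulled into a small function that takes the timestamp as an argument. This is only a sketch: build_name and PATTERN are illustrative names, and the date is passed in as a datetime instead of being read from os.stat.

```python
import datetime
import re

# Same shape as the pattern above: direction, " + ", digits, ".amr"
PATTERN = re.compile(r"(IN|OUT)_ \+ (\d+)\.amr")


def build_name(filename, modified):
    """Return the new name for a matching file, or None if it should be skipped."""
    match = PATTERN.fullmatch(filename)
    if match is None:
        return None
    direction, number = match.groups()
    stamp = modified.strftime("%Y%m%d%H%M%S")
    return "{}_{}_{}_Phone number.amr".format(direction, number, stamp)


when = datetime.datetime(2015, 7, 21, 9, 42, 27)
print(build_name("IN_ + 7123456789.amr", when))
print(build_name("notes.txt", when))
```

Because the function returns None for non-matching names, the rename loop can test the result before calling os.rename.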

Python file searching trouble

I’m new to programming and could do with a little help. I’m trying to make a program that will search for files in a specified directory by extension (multiple extensions) and then only return specific results which have my list of keywords in the filename.
I have the following:
import os
from fnmatch import fnmatch
root = r'c:\users'
pattern = "*.css"
for path, subdirs, files in os.walk(root):
    for name in files:
        if fnmatch(name, pattern):
            print(os.path.join(name))
This will bring back all files with a single extension, in this case .css files, but I need it to do more such as image and text file extensions. I would also like it to only return files that have specific keywords in the file name. Can anyone point me in the right direction??
Thanks
Perhaps you could use glob:
from glob import glob
for filename in glob('*.css'):
    print(filename)
If you have multiple extensions you can concatenate the lists returned by glob():
exts = ['css', 'txt']
matches = []
for ext in exts:
    matches += glob('*.' + ext)
for filename in matches:
    print(filename)
Ok, so if you are searching for filetype and keywords, here would be some easy code to start with:
import os
import re
root = r'c:\users'
pattern = re.compile(r"((keyword1)|(keyword2))\.((txt)|(jpg))")
for path, subdirs, files in os.walk(root):
    for name in files:
        if re.match(pattern, name):
            print(os.path.join(path, name))
This code matches file names that start with keyword1 or keyword2, immediately followed by the extension .txt or .jpg.
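One detail worth knowing here is that re.match only looks for the pattern at the start of the string, while re.search looks anywhere. A quick check with the same pattern (the file names are invented for the example):

```python
import re

pattern = re.compile(r"((keyword1)|(keyword2))\.((txt)|(jpg))")

# match anchors at the beginning of the name
at_start = bool(pattern.match("keyword1.txt"))
in_middle_match = bool(pattern.match("my_keyword1.txt"))

# search finds the pattern anywhere in the name
in_middle_search = bool(pattern.search("my_keyword1.txt"))

print(at_start, in_middle_match, in_middle_search)
```

So a file such as my_keyword1.txt is skipped by the loop above unless the pattern allows leading characters or re.search is used.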
If you want more user-friendly code, where extensions or keywords are easy to add, you could use lists like so:
import os
import re
# configuration information
root = r'C:\Users'
keywords = ['one', 'two']
extensions = ['jpg', 'txt']
use_wildcard = True  # Enables you to catch the keyword anywhere in the filename
# end of configuration
keyword_pattern = ''
first = True
for k in keywords:
    if use_wildcard:
        k = '.*' + k + '.*'
    if first:
        keyword_pattern += '(' + k + ')'
        first = False
    else:
        keyword_pattern += '|(' + k + ')'
extension_pattern = ''
first = True
for ext in extensions:
    if first:
        extension_pattern += '(' + ext + ')'
        first = False
    else:
        extension_pattern += '|(' + ext + ')'
pattern_regex = r"({0})\.({1})".format(keyword_pattern, extension_pattern)
print("Searching for: " + pattern_regex)
pattern = re.compile(pattern_regex)
for path, subdirs, files in os.walk(root):
    for name in files:
        if re.match(pattern, name):
            print(os.path.join(path, name))
I'm using regular expressions because they are very powerful and are useful in a lot of cases. They may seem complex at first sight but once you understand them, it's difficult not to use them :)
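For readers who would rather avoid regular expressions for this task, the same filter can be sketched with plain string methods. This is an alternative illustration, not part of the answer above; wanted and find_files are hypothetical names, and keywords and extensions are assumed to be given in lowercase.

```python
import os


def wanted(name, keywords, extensions):
    """True if the extension is listed and any keyword occurs in the stem."""
    stem, ext = os.path.splitext(name.lower())
    return ext.lstrip(".") in extensions and any(k in stem for k in keywords)


def find_files(root, keywords, extensions):
    """Yield full paths under root whose file names pass wanted()."""
    for path, _subdirs, files in os.walk(root):
        for name in files:
            if wanted(name, keywords, extensions):
                yield os.path.join(path, name)


print(wanted("holiday_one.JPG", ["one", "two"], ["jpg", "txt"]))
print(wanted("one.css", ["one", "two"], ["jpg", "txt"]))
```

Calling `list(find_files(root, keywords, extensions))` with the configuration values from the script above would collect the matching paths.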
