Renaming multiple files in python - python

I wanted to know the easiest way to rename multiple files using the re module in Python, if that is possible at all.
In my directory there are 25 files, all with names in the format 'a unique name followed by the same 20 characters.mkv'.
What I want is to delete those 20 characters from each name.
How can I do this in Python?

To get the new name:
>>> import re
>>> re.sub(r'.{20}(\.mkv)$', r'\1', 'unique12345678901234567890.mkv')
'unique.mkv'
Or without regex:
>>> 'unique12345678901234567890.mkv'[:-24] + '.mkv'
'unique.mkv'
To rename the file use os.rename(old, new): http://docs.python.org/library/os.html#os.rename
To get a list of the files to rename use glob.glob('*.mkv'): http://docs.python.org/library/glob.html#glob.glob
Putting that all together we get:
import glob
import os

for filename in glob.glob('*.mkv'):
    if len(filename) > 24:
        os.rename(filename, filename[:-24] + '.mkv')

Since you are cutting out a specific number of characters from an easily identified point in the string, the re module is somewhat overkill. You can prepare the new filename as:
new_name = old_name.rsplit('.', 1)[0][:-20] + '.mkv'
To find the files, look up os.listdir (or, if you want to look into directories recursively, os.walk), and to rename them, see os.rename.
The re module would be useful if there are other .mkv's in the directory that you don't want to rename, so that you need to do more careful checking to identify the "target" filenames.
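As a minimal sketch of that more careful checking, assuming the 20 trailing characters are known in advance: the SUFFIX value and the new_name helper below are hypothetical, made up for illustration, and are not from the question.

```python
import re

# Hypothetical 20-character tail shared by the files to rename.
SUFFIX = "12345678901234567890"

# Match "<unique name><SUFFIX>.mkv" and capture the unique part.
PATTERN = re.compile(r"(.+)" + re.escape(SUFFIX) + r"\.mkv")


def new_name(old_name):
    """Return the shortened name, or None if old_name is not a target."""
    match = PATTERN.fullmatch(old_name)
    if match is None:
        return None
    return match.group(1) + ".mkv"


print(new_name("unique12345678901234567890.mkv"))  # unique.mkv
print(new_name("other_video.mkv"))                 # None
```

Files for which new_name returns None can simply be skipped, so other .mkv files in the directory are left alone.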

Use glob to find the filenames, slice the strings, and use os.rename() to rename them.

Something like:
>>> import os
>>> doIt = False
>>> for filename in (x for x in os.listdir('.') if x.endswith('.mkv')):
...     newname = filename[:-24] + filename[-4:]
...     if doIt:
...         os.rename(filename, newname)
...         print("Renaming {0} to {1}".format(filename, newname))
...     else:
...         print("Would rename {0} to {1}".format(filename, newname))
...
When manipulating files, always do a dry run first. Change doIt to True to actually rename the files.
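The same dry-run idea can be wrapped in a function that returns the planned renames, which makes it easy to inspect them before touching the disk. This is only a sketch: the strip_tail name and its parameters are assumptions chosen for illustration.

```python
import os


def strip_tail(directory, tail_len=20, ext=".mkv", dry_run=True):
    """Return (old, new) name pairs; rename on disk only when dry_run is False."""
    planned = []
    for name in sorted(os.listdir(directory)):
        stem, suffix = os.path.splitext(name)
        # Skip other file types and names too short to carry the tail.
        if suffix != ext or len(stem) <= tail_len:
            continue
        new = stem[:-tail_len] + suffix
        planned.append((name, new))
        if not dry_run:
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new))
    return planned
```

Calling strip_tail('.') lists what would change; strip_tail('.', dry_run=False) performs the renames.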

You need to use glob and os.rename. The rest is for you to figure out!
And yes, this is entirely possible and easy to do in Python.

Related

how to read a file name with variable in python?

I have a folder with many .csv files in it with the following format:
FGS07_NAV_26246_20210422_86oylt.xls
FGS07_NAV_26246_ is always the same, 20210422 is the date and the most important parameter to go and pick the file, _86oylt also changes but not important at all.
I need to read one csv file with the same date as the operation date.
Let's say y is the date part. I tried this code, but it doesn't give me the right name:
file2 = r'C:/Users/name/Finance/LOF_PnL/FGS07_NAV_26246_' + y + '*.xls'
df2 = pd.read_excel(file2)
How should I fix this?
If you want just the specific file, you could try this:
xls_file = [file for file in os.listdir(r"C:/Users/name/Finance/LOF_PnL") if file.endswith("xls") and y in file][0]
You can use the glob module, which expands the * wildcard into the matching file names:
import glob
file2 = glob.glob(file2)[0]
import os
all_files = os.listdir(r'C:/Users/name/Finance/LOF_PnL')
filtered_files = list(filter(lambda x : 'FGS07_NAV_26246_' + y in x, all_files))
Now filtered_files is a list with the names of all files that have 'FGS07_NAV_26246_' + y in their names. You can prepend the folder to these names if you want the full path. You can also use a regex for a fancier pattern lookup than the in operator provides.
Maybe you can try to use join() or os.path.join() which are more standard.
"".join([str1, str2])
os.path.join(path_to_file, filename)
I hope this could be helpful. Maybe check the type of the file again also.
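Putting the glob and os.path.join suggestions together, a minimal sketch might look like this. The find_file_for_date helper is hypothetical, and returning None when nothing matches is an assumption about the desired behaviour.

```python
import glob
import os


def find_file_for_date(folder, date_str, prefix="FGS07_NAV_26246_"):
    """Return the path of the first .xls file matching prefix + date, or None."""
    pattern = os.path.join(folder, prefix + date_str + "*.xls")
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None
```

With the question's names this would be used as file2 = find_file_for_date(r'C:/Users/name/Finance/LOF_PnL', y), followed by pd.read_excel(file2) once file2 is not None.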

Python string alphabet removal?

So in my program, I am reading in files and processing them.
My output should say just the file name and then display some data
When I am looping through files and printing output by their name and data,
it displays, for example, myfile.txt. I don't want the .txt part, just myfile.
How can I remove the .txt from the end of this string?
One reliable way is os.path.splitext, as in this example:
import os
filename = 'myfile.txt'
print(filename)
print(os.path.splitext(filename))
print(os.path.splitext(filename)[0])
More info about this very useful standard-library module:
https://docs.python.org/3.8/library/os.path.html
The answers given are totally right, but if you have other possible extensions, or don't want to import anything, try this:
name = file_name.rsplit(".", 1)[0]
You can use pathlib.Path which has a stem attribute that returns the filename without the suffix.
>>> from pathlib import Path
>>> Path('myfile.txt').stem
'myfile'
If you only have .txt files, you can do this:
file_name = "myfile.txt"
file_name.replace('.txt', '')
This uses the built-in str.replace method. Note that it removes every occurrence of '.txt' in the string, not just the one at the end.
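To see how the approaches differ on a name that contains the extension more than once, here is a small comparison. The strip_extension wrapper and the sample file name are made up for illustration.

```python
import os


def strip_extension(filename):
    """Drop only the final extension, whatever it is."""
    return os.path.splitext(filename)[0]


print(strip_extension("myfile.txt"))            # myfile
print(strip_extension("notes.txt.bak.txt"))     # notes.txt.bak
print("notes.txt.bak.txt".replace(".txt", ""))  # notes.bak
```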

How to replace a sentence in line of string address?

I need help replacing part of a string in a list of file paths.
The file address looks like this :
/SfSNet/Images_mask/10_face.png
and I need to change it into something like this
/SfSNet/Images_mask/10_mask.png
I know it is possible to count the index, since the leading part of the string is the same, but that will be annoying if I want to run the code on another PC. I read something about regex, but it isn't clear to me. Any help with this, or any other solution, would be appreciated. Thank you.
Assuming all file names have the structure shown above, you could use str.replace:
s = '/SfSNet/Images_mask/10_face.png'
s.replace('_face.png', '_mask.png')
# '/SfSNet/Images_mask/10_mask.png'
If a simple str.replace lacks generality [1], consider doing operations like this with os.path.
>>> import os.path
>>>
>>> s = '/SfSNet/Images_mask/10_face.png'
>>> folder, file = os.path.split(s) # ('/SfSNet/Images_mask', '10_face.png')
>>> name, ext = os.path.splitext(file) # ('10_face', '.png')
>>> new_name = '{}_{}{}'.format(name.rsplit('_', 1)[0], 'mask', ext)
>>> os.path.join(folder, new_name)
'/SfSNet/Images_mask/10_mask.png'
[1] For example, if you want to preserve the extension without hardcoding it, or if the substring you want to replace might appear in the directory name itself.
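The same split-and-rejoin steps can also be sketched with pathlib. The swap_tag helper is hypothetical, and PurePosixPath is used here on the assumption that the paths are always '/'-separated as in the question.

```python
from pathlib import PurePosixPath


def swap_tag(path, new_tag):
    """Replace the part after the last '_' in the file name, keeping the extension."""
    p = PurePosixPath(path)
    base = p.stem.rsplit("_", 1)[0]  # '10_face' -> '10'
    return str(p.with_name(base + "_" + new_tag + p.suffix))


print(swap_tag("/SfSNet/Images_mask/10_face.png", "mask"))
# /SfSNet/Images_mask/10_mask.png
```

Only the file name is rewritten, so a directory such as Images_mask is left untouched even though it contains an underscore.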

Extracting penultimate folder name from path

Does anyone know a clever way to extract the penultimate folder name from a given path?
eg folderA/folderB/folderC/folderD
-> I want to know what the name of folderC is, I don't know the names of the other folders and there may be a variable number of directories before folderC but it's always the 2nd to last folder.
Everything I come up with seems too cumbersome (e.g. getting the name of folderD using basename and normpath, removing this from the path string, and then getting folderC).
Cheers, -m
There isn't a good way to skip directly to portions within a path in a single call, but what you want can be easily done like so:
>>> os.path.basename(os.path.dirname('test/splitting/folders'))
'splitting'
Alternatively, if you know you'll always be on a filesystem with '/'-delimited paths, you can just use plain old split() to get there directly:
>>> 'test/splitting/folders'.split('/')[-2]
'splitting'
This is a bit more fragile, though. The dirname+basename combination works the same way whether or not there is a file at the end of the path, whereas with the split version you have to alter the index.
Yep, there sure is:
>>> import os.path
>>> os.path.basename(os.path.dirname("folderA/folderB/folderC/folderD"))
'folderC'
That is, we find the 'parent directory' of the named path, and then extract the filename of the resulting path from that.
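For comparison, here is a minimal pathlib sketch of the same lookup. The penultimate_folder name is made up for illustration, and PurePosixPath assumes '/'-separated paths.

```python
from pathlib import PurePosixPath


def penultimate_folder(path):
    """Return the name of the second-to-last component of a '/'-separated path."""
    return PurePosixPath(path).parent.name


print(penultimate_folder("folderA/folderB/folderC/folderD"))   # folderC
print(penultimate_folder("folderA/folderB/folderC/folderD/"))  # folderC
```

pathlib ignores a trailing slash, whereas os.path.dirname on a path ending in '/' only strips that slash, so the dirname+basename combination would return 'folderD' for the second input.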

Python automated file names

I want to automate the file name used when saving a spreadsheet using xlwt. Say there is a subdirectory named Data in the folder where the Python program is running. I want the program to count the number of files in that folder (call it n). The file name must then end in n+1. If there are 0 files in the folder, the file name must be Trial_1.xls. This file must be saved in that subdirectory.
I know the following:
import xlwt, os, os.path
n = len([name for name in os.listdir('.') if os.path.isfile(name)])
counts the number of files in the same folder.
a = n + 1
filename = "Trial_" + str(a) + ".xls"
book.save(filename)
this will save the properly named file into the same folder.
My question is: how do I extend this to a subdirectory? Thanks.
In os.listdir('.'), the . refers to the current working directory, which is normally the directory the script is run from. Change the . to point to the subdirectory you are interested in.
You should give it the full path from the root of your file system; otherwise it will be relative to the directory the script is run from. That might not be what you want, especially if you need to refer to the subdirectory from another program.
You also need to include the full path, subdirectory included, in the filename variable.
To make life easier, just store the full path in a variable and refer to it when needed.
TARGET_DIR = '/home/me/projects/data/'
n = sum(1 for f in os.listdir(TARGET_DIR) if os.path.isfile(os.path.join(TARGET_DIR, f)))
new_name = "{}Trial_{}.xls".format(TARGET_DIR,n+1)
You actually want glob:
from glob import glob
DIR = 'some/where/'
existing_files = glob(DIR + '*.xls')
filename = DIR + 'stuff--%d--stuff.xls' % (len(existing_files) + 1)
Since you said Burhan Khalid's answer "Works perfectly!" you should accept it.
I just wanted to point out a different way to compute the number. The way you are doing it works, but if we imagine you were counting grains of sand or something similar, building the whole list would use far too much memory. Here is a more direct way to get the count:
n = sum(1 for name in os.listdir('.') if os.path.isfile(name))
For every qualifying name, we get a 1, and all these 1's get fed into sum() and you get your count.
Note that this code uses a "generator expression" instead of a list comprehension. Instead of building a list, taking its length, and then discarding the list, the above code just makes an iterator that sum() iterates to compute the count.
It's a bit sleazy, but there is a shortcut we can use: sum() will accept boolean values, and will treat True as a 1, and False as a 0. We can sum these.
# sum will treat Boolean True as a 1, False as a 0
n = sum(os.path.isfile(name) for name in os.listdir('.'))
This is sufficiently tricky that I probably would not use this without putting a comment. But I believe this is the fastest, most efficient way to count things in Python.
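Combining the counting idiom with the subdirectory advice, a minimal sketch could look like the following. The next_trial_path name is hypothetical, and creating the directory when it is missing is an added assumption rather than something the question asked for.

```python
import os


def next_trial_path(target_dir):
    """Return target_dir/Trial_<n+1>.xls, where n is the number of files already there."""
    # Assumption: create the subdirectory if it does not exist yet.
    os.makedirs(target_dir, exist_ok=True)
    n = sum(
        1 for name in os.listdir(target_dir)
        if os.path.isfile(os.path.join(target_dir, name))
    )
    return os.path.join(target_dir, "Trial_{}.xls".format(n + 1))
```

With the question's setup this would be called as book.save(next_trial_path("Data")). Because the number comes from a plain count, deleting an earlier file can make the next name collide with an existing one.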
