Renaming the beginning of a file name in Python - python

I am attempting to create an executable that renames all files in its folder.
I am lost on how to reference (and add to) the beginning of a file name.
For example:
file_text.xlsx
What I need it to look like:
10-30-2021_file_text.xlsx
I would like to prepend my own string(s) to the beginning of the file name.
NOTE: 'file_text' are randomly generated file names, and all unique. So I would need to keep the unique names and just add to the beginning of each unique file name.
Here is what I have so far to rename the other files. I figured I could reference a space, but this did not work, as there are no spaces.
import os
directory = os.getcwd()
txt_replaced = ' '
txt_needed = '2021-10-31'
for f in os.listdir(directory):
    os.rename(os.path.join(directory, f),
              os.path.join(directory, f.replace(txt_needed, txt_replaced)))
I also am curious if there is a way to reference specific positions within the file name.
For example:
text_text1.csv
If there is a way to uppercase only 'text1'?
Thank you!

replace() doesn't work because there's no space character at the beginning of the filename. So there's nothing to replace.
You also didn't add the _ character after the date.
Just use ordinary string concatenation or formatting:
for f in os.listdir(directory):
    os.rename(os.path.join(directory, f),
              os.path.join(directory, f"{txt_needed}_{f}"))
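The question also asks about referencing a specific position within the file name, which the answer above does not cover. Here is a minimal sketch, assuming names shaped like text_text1.csv (parts separated by an underscore, followed by an extension). The helper names add_prefix and upper_last_part are made up for illustration:

```python
import os


def add_prefix(filename, prefix, sep="_"):
    """Return filename with prefix and sep put in front of it."""
    return f"{prefix}{sep}{filename}"


def upper_last_part(filename, sep="_"):
    """Uppercase the text after the last sep, leaving the extension alone."""
    stem, ext = os.path.splitext(filename)     # 'text_text1', '.csv'
    head, found, tail = stem.rpartition(sep)   # 'text', '_', 'text1'
    if not found:                              # no separator: nothing to change
        return filename
    return f"{head}{found}{tail.upper()}{ext}"
```

Both helpers only build strings; pass the result to os.rename, as in the loop above, to actually rename a file.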

Related

Python Delete Files in Directory from list in Text file

I've searched through many answers on deleting multiple files based on certain parameters (e.g. all txt files). Unfortunately, I haven't seen anything where one has a longish list of files saved to a .txt (or .csv) file and wants to use that list to delete files from the working directory.
I have my current working directory set to where the .txt file is (text file with list of files for deletion, one on each row) as well as the ~4000 .xlsx files. Of the xlsx files, there are ~3000 I want to delete (listed in the .txt file).
This is what I have done so far:
import os
path = "c:\\Users\\SFMe\\Desktop\\DeleteFolder"
os.chdir(path)
list = open('DeleteFiles.txt')
for f in list:
    os.remove(f)
This gives me the error:
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'Test1.xlsx\n'
I feel like I'm missing something simple. Any help would be greatly appreciated!
Thanks
Strip the trailing '\n' from each line read from the text file;
Make an absolute path by joining path with the file name;
Do not overwrite Python types (i.e., in your case, list);
Close the text file or use with open('DeleteFiles.txt') as flist.
EDIT: Actually, upon looking at your code, due to os.chdir(path), the second point may not be necessary.
import os
path = "c:\\Users\\SFMe\\Desktop\\DeleteFolder"
os.chdir(path)
flist = open('DeleteFiles.txt')
for f in flist:
    fname = f.rstrip()  # or depending on situation: f.rstrip('\n')
    # or, if you get rid of os.chdir(path) above,
    # fname = os.path.join(path, f.rstrip())
    if os.path.isfile(fname):  # this makes the code more robust
        os.remove(fname)
# also, don't forget to close the text file:
flist.close()
As Henry Yik pointed out in the comments, you need to pass the full path to os.remove. Also, open just returns the file object; you need to read the lines from it. And don't forget to close the file. A solution would be:
import os
path = "c:\\Users\\SFMe\\Desktop\\DeleteFolder"
os.chdir(path)
# the "r" argument opens the file for reading only
list_file = open('DeleteFiles.txt', "r")
# the variable is named _list rather than list so that it
# does not shadow the built-in type list
_list = list_file.read().splitlines()
list_file.close()
for f in _list:
    os.remove(os.path.join(path, f))
A further improvement would be to use a with block, which "automagically" closes the file for us, and a list comprehension instead of a loop:
with open('DeleteFiles.txt', "r") as list_file:
    _list = list_file.read().splitlines()
[os.remove(os.path.join(path, f)) for f in _list]
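Putting the points from both answers together (strip the newline, join with the folder, avoid the name list, close the file), here is one possible sketch. The function name delete_listed and its return value are my own choices, not something from the answers:

```python
import os


def delete_listed(list_path, directory):
    """Delete every file named in list_path (one name per line) from directory.

    Returns (removed, missing) so the caller can see what happened.
    """
    removed, missing = [], []
    with open(list_path, "r") as list_file:   # closed automatically
        for line in list_file:
            name = line.strip()               # drop '\n' and stray spaces
            if not name:                      # skip blank lines
                continue
            target = os.path.join(directory, name)
            if os.path.isfile(target):
                os.remove(target)
                removed.append(name)
            else:
                missing.append(name)
    return removed, missing
```

With ~3000 names in the list, getting back the names that were not found is a cheap way to spot typos in the list.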

Removing file extension from filename with file handle as input

I have the following code: f = open('01-01-2017.csv')
From the f variable, I need to remove the ".csv" and assign the remaining "01-01-2017" to a variable called "date". What is the best way to accomplish this?
Just retrieve the name of the file using f.name, apply os.path.splitext, and keep the left part:
import os
date = os.path.splitext(os.path.basename(f.name))[0]
(I've used os.path.basename in case the file has an absolute path)
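For what it's worth, the same thing can be written with pathlib instead of os.path. This is just an alternative sketch; the function name name_without_extension is invented for the example:

```python
from pathlib import Path


def name_without_extension(f):
    """Return the name of an open file without its directory or extension."""
    # Path.stem drops the directory part and the last suffix only,
    # so 'archive.tar.gz' would give 'archive.tar'.
    return Path(f.name).stem
```

You would call it as date = name_without_extension(f) once the file is open.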

Changing name of file until it is unique

I have a script that downloads files (pdfs, docs, etc) from a predetermined list of web pages. I want to edit my script to alter the names of files with a trailing _x if the file name already exists, since it's possible files from different pages will share the same filename but contain different contents, and urlretrieve() appears to automatically overwrite existing files.
So far, I have:
import os
urlfile = 'https://www.foo.com/foo/foo/foo.pdf'
filename = urlfile.split('/')[-1]  # filename is now 'foo.pdf'
if os.path.exists(filename):
    base, ext = os.path.splitext(filename)
    filename = base + '_' + str(1) + ext
That works fine for one occurrence, but it looks like after one foo_1.pdf it will start saving as foo_1_1.pdf, and so on. I would like to save the files as foo_1.pdf, foo_2.pdf, and so on.
Can anybody point me in the right direction on how I can ensure that file names are stored in the correct fashion as the script runs?
Thanks.
So what you want is something like this:
import os
curName = "foo_0.pdf"
while os.path.exists(curName):
    num = int(curName.split('.')[0].split('_')[1])
    curName = "foo_{}.pdf".format(num + 1)
Here's the general scheme:
Assume you start from the first file name (foo_0.pdf)
Check if that name is taken
If it is, increment the number in the name by 1
Continue looping until you find a name that isn't taken
One alternative: Generate a list of file numbers that are in use, and update it as needed. If it's sorted you can say name = "foo_{}.pdf".format(flist[-1]+1). This has the advantage that you don't have to run through all the files every time (as the above solution does). However, you need to keep the list of numbers in memory. Additionally, this will not fill any gaps in the numbers
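A rough sketch of that alternative, assuming names of the form foo_<number>.pdf. Here the list of used numbers is rebuilt from a directory listing on each call rather than kept in memory, and next_free_name with its parameters is a name I made up:

```python
import os
import re


def next_free_name(directory, stem="foo", ext=".pdf", sep="_"):
    """Return '<stem><sep><n><ext>' with n one past the highest number in use."""
    pattern = re.compile(
        r"^{}{}(\d+){}$".format(re.escape(stem), re.escape(sep), re.escape(ext))
    )
    # Collect the numbers already used by matching file names.
    used = sorted(
        int(m.group(1))
        for m in map(pattern.match, os.listdir(directory))
        if m
    )
    n = used[-1] + 1 if used else 0   # gaps in the numbering are not filled
    return "{}{}{}{}".format(stem, sep, n, ext)
```

To avoid listing the directory every time, you could keep the sorted list around and append to it after each download.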
Why not just use the tempfile module?
import tempfile
fileobj = tempfile.NamedTemporaryFile(suffix='.pdf', prefix='', delete=False)
Now your filename will be available in fileobj.name and you can manipulate to your heart's content. As an added benefit, this is cross-platform.
Since you're dealing with multiple pages, this seems more like a "global archive" than a per-page archive. For a per-page archive, I would go with the answer from @wnnmaw.
For a global archive, I would take a different approach...
Create a directory for each filename
Store the file in the directory as "1" + extension
Write the current "number" to the directory as "_files.txt"
Write additional files as 2, 3, 4, etc. and increment the value in _files.txt
The benefits of this:
The directory is the original filename. If you keep turning "Example-1.pdf" into "Example-2.pdf" you run into a possibility where you download a real "Example-2.pdf", and can't associate it to the original filename.
You can grab the number of like-named files either by reading _files.txt or counting the number of files in the directory.
Personally, I'd also suggest storing the files in a tiered bucketing system, so that you don't have too many files/directories in any one directory (hundreds of files makes it annoying as a user, thousands of files can affect OS performance). A bucketing system might turn a filename into a hexdigest, then drop the file into "/%s/%s/%s" % (hex[0:3], hex[3:6], filename). The hexdigest is used to give you a more even distribution of characters.
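The bucketing idea might look roughly like this. The answer does not say which hash to use, so SHA-256 is my assumption, and bucketed_path is a made-up name:

```python
import hashlib
import os


def bucketed_path(root, filename):
    """Return root/<hex[0:3]>/<hex[3:6]>/<filename>.

    The hexdigest is taken from the file name (SHA-256 is an assumption;
    any hash with evenly spread output would do).
    """
    hex_ = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return os.path.join(root, hex_[0:3], hex_[3:6], filename)
```

The function only computes the path; the bucket directories still have to be created (for example with os.makedirs) before the file is written.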
import os
def uniquify(path, sep=''):
    path = os.path.normpath(path)
    num = 0
    newpath = path
    dirname, basename = os.path.split(path)
    filename, ext = os.path.splitext(basename)
    while os.path.exists(newpath):
        newpath = os.path.join(dirname, '{f}{s}{n:d}{e}'
                               .format(f=filename, s=sep, n=num, e=ext))
        num += 1
    return newpath
filename = uniquify('foo.pdf', sep='_')
Possible problems with this include:
If you call uniquify many many thousands of times with the same
path, each subsequent call may get a bit slower since the
while-loop starts checking from num=0 each time.
uniquify is vulnerable to race conditions whereby a file may not
exist at the time os.path.exists is called, but may exist at the
time you use the value returned by uniquify. Use
tempfile.NamedTemporaryFile to avoid this problem. You won't get
incremental numbering, but you will get files with unique names,
guaranteed not to already exist. You could use the prefix parameter to
specify the original name of the file. For example,
import tempfile
import os
def uniquify(path, sep='_', mode='w'):
    path = os.path.normpath(path)
    if os.path.exists(path):
        dirname, basename = os.path.split(path)
        filename, ext = os.path.splitext(basename)
        return tempfile.NamedTemporaryFile(prefix=filename+sep, suffix=ext, delete=False,
                                           dir=dirname, mode=mode)
    else:
        return open(path, mode)
Which could be used like this:
In [141]: f = uniquify('/tmp/foo.pdf')
In [142]: f.name
Out[142]: '/tmp/foo_34cvy1.pdf'
Note that to prevent a race-condition, the opened filehandle -- not merely the name of the file -- is returned.
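If you want to keep the incremental numbering and still avoid the race, one other option (not what the answer above uses) is the exclusive-creation mode 'x' of open(), which raises FileExistsError instead of overwriting. A sketch under that assumption, with open_unique as an invented name:

```python
import os


def open_unique(path, sep="_", binary=False):
    """Open a new file at path, or at path with _1, _2, ... if it is taken.

    Mode 'x' makes open() fail with FileExistsError rather than overwrite,
    so checking for the file and creating it happen in a single step.
    """
    mode = "xb" if binary else "x"
    root, ext = os.path.splitext(path)
    candidate, num = path, 0
    while True:
        try:
            return open(candidate, mode)
        except FileExistsError:
            num += 1
            candidate = "{}{}{}{}".format(root, sep, num, ext)
```

As with the tempfile version, the open file handle is returned, and the name that was actually used is available as its .name attribute.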

File or directory doesn't exist

I'm trying to use the current date as the file's name, but it seems either this can't be done or I'm doing something wrong. I used a variable as a name for a file before, but this doesn't seem to work.
This is what I tried:
import time
d = time.strftime("%d/%m/%Y")
with open(d +".txt", "a+") as f:
    f.write("")
This is just to see if it creates the file. As you can see, I tried with a+ because I read that it creates the file if it doesn't exist, and I still get the same error.
The problem is with how you're using the date:
d = time.strftime("%d/%m/%Y")
You can't have a / in a filename, because / is the directory separator, so everything before it is treated as a directory. You haven't made that directory yet. Try using hyphens instead:
d = time.strftime("%d-%m-%Y")
You almost certainly don't want to make directories in the structure day/month/year, so I assume that's not what you were intending.
You are including directory separators (/) in your filename, and those directories are not created for you when you try to open a file. There is either no 26/ directory or no 26/02/ directory in your current working path.
You'll either have to create those directories by other means, or if you didn't mean for the day and month to be directories, change your slashes to a different separator character:
d = time.strftime("%d-%m-%Y")
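Both options can be sketched in one helper. This assumes you want a .txt file named after the date; dated_path and its parameters are hypothetical names, and the nested variant uses os.makedirs to create the day and month directories first:

```python
import os
import time


def dated_path(base_dir, nested=False, when=None):
    """Build the path of a '.txt' file named after a date.

    nested=False -> base_dir/DD-MM-YYYY.txt
    nested=True  -> base_dir/DD/MM/YYYY.txt (parent directories are created)
    """
    when = time.localtime() if when is None else when
    if nested:
        path = os.path.join(base_dir,
                            time.strftime("%d", when),
                            time.strftime("%m", when),
                            time.strftime("%Y", when) + ".txt")
        # open() will not create missing directories, so do it here.
        os.makedirs(os.path.dirname(path), exist_ok=True)
    else:
        path = os.path.join(base_dir, time.strftime("%d-%m-%Y", when) + ".txt")
    return path
```

After that, with open(dated_path("."), "a+") as f: should create the file without the error from the question.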

I want to rename a file that has a random part (in its filename) to a specific one

I have some builds generating files, and they all have a random part in them (checksum number).
How do I rename them to a unique name and execute them in Python?
If you are sure there are never any underscores in the piece of the name you would like to keep, you can split it. name = name.split('_')[0]. This won't of course preserve the file extension, but if all the output files are exes, you can just name += '.exe'
Edit:
import os
# keep only the .exe files (removing items from a list while
# looping over it would skip entries)
file_list = [each for each in os.listdir('.') if each[-4:] == '.exe']
for each in file_list:
    name = each.split('_')[0]
    name += '.exe'
    os.rename(each, name)
This is not very robust, and requires you to execute it in the dir the exes are in. If it's something you're going to keep around long term or expect others to use, you should investigate regex and make it path agnostic. I didn't test this; it's just a hack, so use at your own risk, but it should be pretty benign. It will try to rename ALL the .exe files in the directory.
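A regex-based, path-agnostic version could look something like the sketch below. The question does not show what the file names look like, so the pattern <name>_<random part>.exe (with no underscore in <name>) is an assumption carried over from the split('_') approach, and strip_random_part is a made-up name:

```python
import os
import re

# Assumed shape of the build output: '<name>_<anything>.exe',
# where <name> itself contains no underscore.
BUILD_NAME = re.compile(r"^(?P<name>[^_]+)_.+\.exe$", re.IGNORECASE)


def strip_random_part(directory):
    """Rename '<name>_<random>.exe' to '<name>.exe' inside directory.

    Returns a list of (old, new) names. A file is skipped when the
    target name already exists, so nothing gets overwritten.
    """
    renamed = []
    for old in sorted(os.listdir(directory)):
        match = BUILD_NAME.match(old)
        if not match:
            continue
        new = match.group("name") + ".exe"
        src = os.path.join(directory, old)
        dst = os.path.join(directory, new)
        if os.path.exists(dst):
            continue
        os.rename(src, dst)
        renamed.append((old, new))
    return renamed
```

Because the directory is passed in, it no longer matters where the script is run from. The returned pairs give you the new names if you want to execute the files afterwards.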
