Renaming files in the order they are titled

I'm using a Python script, which others helped me write, to rename all .jpg or .png files in a directory to a name of my choosing, numbered in order.
So if I have 20 .png files in a directory, I want to rename them in order from 1-20.
The script I have DOES this and I'm happy with it. However, it was just pointed out that the files I have been renaming with this script have been out of order.
As an example, when I rename 1.png to testImage1.png, I'm really renaming testImage10.png as testImage1.png. I tested this by creating 5 text files with the same content, except that I gave files 1-3 different content so I could keep track of which was which after renaming. Sure enough, everything is mixed up.
import os
import sys
source = sys.argv[1]
files = os.listdir(source)
name = sys.argv[2]
def rename():
    i = 1
    for file in files:
        os.rename(os.path.join(source, file), os.path.join(source, name+str(i)+'.png'))
        i += 1
rename()
I took the time to try and use my (limited) python knowledge to create a series of if/elif statements to sift through and rename the files with the correct name in order.
def roundTwo():
    print('Beginning of the end')
    i = 1
    for root, dirs, files in os.walk(source):
        for file in files:
            print('Test')
            if source == 'newFile0.txt' or 'newFile0.png':
                os.rename(os.path.join(source, file), os.path.join(source, name+str(i)+'.txt'))
                print('Test1')
                i += 1
            elif source == 'newFile1.txt' or 'newFile1.png':
                os.rename(os.path.join(source, file), os.path.join(source, name+str(i)+'.txt'))
                print('Test2')
                i += 1
roundTwo()
I did a fair amount of searching, including on re and fnmatch, but nothing comes close to what I'm looking to do. Perhaps I'm using the wrong search terms? Any insight helps!

If your problem is with 1 and 10, then you can use natural sorting. Sort your variable files with the third-party natsort package as follows (natsorted returns a new list, so assign the result):
from natsort import natsorted, ns
files = natsorted(files, alg=ns.IGNORECASE)
Example:
>>> x = ['a/b/c21.txt', 'a/b/c1.txt', 'a/b/c10.txt', 'a/b/c11.txt', 'a/b/c2.txt']
>>> sorted(x)
['a/b/c1.txt', 'a/b/c10.txt', 'a/b/c11.txt', 'a/b/c2.txt', 'a/b/c21.txt']
>>> natsorted(x, alg=ns.IGNORECASE)
['a/b/c1.txt', 'a/b/c2.txt', 'a/b/c10.txt', 'a/b/c11.txt', 'a/b/c21.txt']
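If installing natsort is not an option, the same idea can be approximated with the standard library. This is only a minimal sketch, not a replacement for natsort: the helper name natural_key is made up for illustration, and it assumes plain non-negative integers embedded in the names (no signs, decimals, or version strings).

```python
import re

def natural_key(text):
    """Build a sort key in which runs of digits compare by numeric value."""
    # re.split with a capturing group keeps the digit runs, so the parts
    # alternate: text, digits, text, digits, ...
    parts = re.split(r'(\d+)', text)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]

names = ['c21.txt', 'c1.txt', 'c10.txt', 'c11.txt', 'c2.txt']
ordered = sorted(names, key=natural_key)
print(ordered)  # ['c1.txt', 'c2.txt', 'c10.txt', 'c11.txt', 'c21.txt']
```

Passing the key to sorted() leaves the original list untouched, so the result has to be assigned before it is used in the rename loop.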

If all the files have some sort of basename, you can modify your first function to extract the number assigned to the image:
baseName = 'testImage'
def rename():
    for file in files:
        number = file[len(baseName):file.find('.png')]
        os.rename(os.path.join(source, file), os.path.join(source, name+number+'.png'))
Hope it helps
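Putting the two suggestions together, here is a rough sketch of a renaming function that sorts naturally before numbering. The function names rename_in_order and natural_key and the two-phase temporary naming are my own assumptions, not something from the question; the temporary names are there so that a new name can never land on a file that has not been renamed yet.

```python
import os
import re
import uuid

def natural_key(text):
    """Sort key in which runs of digits compare by numeric value."""
    parts = re.split(r'(\d+)', text)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

def rename_in_order(source, name, extension='.png'):
    """Rename matching files to name1, name2, ... following natural order."""
    files = sorted(
        (f for f in os.listdir(source) if f.lower().endswith(extension)),
        key=natural_key,
    )
    # Phase 1: move every file to a unique temporary name first.
    token = uuid.uuid4().hex
    staged = []
    for index, old in enumerate(files, start=1):
        temp = '{}_{}{}'.format(token, index, extension)
        os.rename(os.path.join(source, old), os.path.join(source, temp))
        staged.append((temp, name + str(index) + extension))
    # Phase 2: give each staged file its final numbered name.
    for temp, final in staged:
        os.rename(os.path.join(source, temp), os.path.join(source, final))
    return [final for _, final in staged]
```

Calling rename_in_order(sys.argv[1], sys.argv[2]) would mirror the original script. Try it on a copy of the directory first.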

Related

Python to rename files in a directory/folder to csv

I have written a small script to hopefully iterate through my directory/folder and replace act with csv. Essentially, I have 11 years worth of files that have a .act extension and I just want to replace it with .csv
import os
files = os.listdir("S:\\folder\\folder1\\folder2\\folder3")
path = "S:\\folder\\folder1\\folder2\\folder3\\"
#print(files)
for x in files:
    new_name = x.replace("act","csv")
    os.rename(path+x,path+new_name)
    print(new_name)
When I executed this, it worked for the first five files and then failed on the sixth with the following error:
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'S:\\folder\\folder1\\folder2\\folder3\\file_2011_06.act' -> 'S:\\folder\\folder1\\folder2\\folder3\\file_2011_06.csv'
When I searched for "S:\folder\folder1\folder2\folder3\file_2011_06.act" in file explorer, the file opens. Are there any tips on what additional steps I can take to debug this issue?
Admittedly, this is my first programming script. I'm trying to do small/minor things to start learning. So, I likely missed something... Thank you!
In your solution, you use the string method replace to replace "act" with "csv". This can lead to problems if "act" appears anywhere else in the string: for example, S:\\facts\\file_2011_01.act would become S:\\fcsvs\\file_2011_01.csv, and rename will throw a FileNotFoundError because rename cannot create folders.
When dealing with file names (e.g., concatenating path fragments, extracting file extensions, ...), I recommend using os.path or pathlib instead of direct string manipulation.
I would like to propose another solution using os.walk. In contrast to os.listdir, it recursively traverses all sub-directories in a single loop.
import os
def act_to_csv(directory):
    for root, folders, files in os.walk(directory):
        for file in files:
            filename, extension = os.path.splitext(file)
            if extension == '.act':
                original_filepath = os.path.join(root, file)
                new_filepath = os.path.join(root, filename + '.csv')
                print(f"{original_filepath} --> {new_filepath}")
                os.rename(original_filepath, new_filepath)
Also, I'd recommend backing up your files before manipulating them with scripts. It would be annoying to lose data or see it become a mess because of a bug in a script.
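For the pathlib route mentioned above, a minimal sketch could look like the following. The function name act_to_csv_flat and the dry_run flag are assumptions added for illustration; unlike the os.walk version, it only looks at files directly inside the given folder.

```python
from pathlib import Path

def act_to_csv_flat(directory, dry_run=True):
    """Plan (and optionally perform) renaming *.act to *.csv in one folder."""
    planned = []
    # sorted() materialises the listing before any file is renamed
    for path in sorted(Path(directory).glob('*.act')):
        target = path.with_suffix('.csv')  # changes only the final extension
        if target.exists():
            continue  # never overwrite an existing file
        planned.append((path, target))
        if not dry_run:
            path.rename(target)
    return planned
```

Because with_suffix only touches the extension, a name such as facts_2011.act becomes facts_2011.csv and the "act" inside "facts" is left alone. Running once with dry_run=True and inspecting the returned pairs is a cheap check before changing anything.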
import os
folder = "S:\\folder\\folder1\\folder2\\folder3\\"
count = 1
for file_name in os.listdir(folder):
    source = folder + file_name
    destination = folder + str(count) + ".csv"
    os.rename(source, destination)
    count += 1
print('All Files Renamed')
print('New Names are')
res = os.listdir(folder)
print(res)

counting files that DO NOT have a file extension

I currently have this:
names = [os.path.basename(x) for x in glob.glob(UserInput[0]+'/*.txt')]
for i in names:
    print("file found - "+i)
Works perfectly for counting filenames ending with .txt and obtaining the basename.
However, I have a folder with a ton of files of type "file". I'd like to find all of the files that do not have an extension associated with them ... I'm pretty stumped as to how I'd change the /*.txt part to accommodate this. Any suggestions?
pathlib is king.
* is a pattern for all files, 1 level deep.
**/* is a pattern for all files in all subfolders.
import pathlib
for file_ in pathlib.Path(<your path here>).glob("**/*"): # or glob("*")
    if not file_.suffix:
        print(file_)
Just loop over all the files, and discard the ones which have an extension.
import os
for x in os.scandir(UserInput[0]):
    # scandir yields DirEntry objects, so test the entry's name
    if '.' in x.name:
        continue
    print("file found -", x.name)
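Since the question is about counting, here is a small sketch that collects the matching names so they can be counted afterwards. The function name files_without_extension is hypothetical, and it assumes that only regular files directly inside the folder matter (subfolders are skipped).

```python
from pathlib import Path

def files_without_extension(folder):
    """Return the sorted names of regular files in folder with no suffix."""
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        if entry.is_file() and not entry.suffix
    )
```

len(files_without_extension(UserInput[0])) then gives the count. Note that pathlib reports an empty suffix for dotfiles such as .gitignore, so those are included as well; filter on entry.name.startswith('.') if that is not wanted.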

Trying to print name of all csv files within a given folder

I am trying to write a program in python that loops through data from various csv files within a folder. Right now I just want to see that the program can identify the files in a folder but I am unable to have my code print the file names in my folder. This is what I have so far, and I'm not sure what my problem is. Could it be the periods in the folder names in the file path?
import glob
path = "Users/Sarah/Documents/College/Lab/SEM EDS/1.28.20 CZTS hexane/*.csv"
for fname in glob.glob(path):
    print fname
No error messages are popping up but nothing will print. Does anyone know what I'm doing wrong?
Are you on a Linux-based system? If you're not, switch the / for \\.
Is the directory you're giving the full path, from the root folder? You might need to specify a FULL path (drive included).
If that still fails, it may sound silly, but check that there actually are files in there, as your code otherwise seems fine.
This code below worked for me, and listed csv files appropriately (see the C:\\ part, could be what you're missing).
import glob
path = "C:\\Users\\xhattam\\Downloads\\TEST_FOLDER\\*.csv"
for fname in glob.glob(path):
    print(fname)
The following code gets a list of files in a folder and if they have csv in them it will print the file name.
import os
path = r"C:\temp"
filesfolders = os.listdir(path)
for file in filesfolders:
    if ".csv" in file:
        print(file)
Note the indentation in my code. You need to be careful not to mix tabs and spaces, as these are not the same to Python.
Alternatively, you could use os:
import os
files_list = os.listdir(path)
out_list = []
for item in files_list:
    if item[-4:] == ".csv":
        out_list.append(item)
print(out_list)
Are you sure you are using the correct path?
Try moving the Python script into the folder where the CSV files are, and then change it to this:
import glob
path = "./*.csv"
for fname in glob.glob(path):
    print(fname)
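One reason no error shows up is that glob.glob simply returns an empty list when the pattern matches nothing, including when the folder itself does not exist. A small sketch that makes this visible, assuming a helper name list_csv_files that is not from the question:

```python
import glob
import os

def list_csv_files(folder):
    """Return sorted CSV paths in folder, failing loudly if folder is missing."""
    folder = os.path.abspath(os.path.expanduser(folder))
    if not os.path.isdir(folder):
        raise FileNotFoundError('Not a directory: ' + folder)
    # glob.escape protects characters such as [ ] * ? in the folder name
    pattern = os.path.join(glob.escape(folder), '*.csv')
    return sorted(glob.glob(pattern))
```

With the path from the question, a relative "Users/Sarah/..." is resolved against the current working directory, so the raised error would show the full path that was actually searched. The periods in the folder name are not special to glob.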

How to find and copy almost identical filenames from one folder to another using python?

I have a folder with a large number of files (mask_folder). The filenames in this folder are built as follows:
asdgaw-1454_mask.tif
lkafmns-8972_mask.tif
sdnfksdfk-1880_mask.tif
etc.
In another folder (test_folder), I have a smaller number of files with filenames written almost the same, but without the addition of _mask. Like:
asdgaw-1454.tif
lkafmns-8972.tif
etc.
What I need is a code to find the files in mask_folder that have an identical start of the filenames as compared to the files in test_folder and then these files should be copied from the mask_folder to the test_folder.
In that way the test_folder contains paired files as follows:
asdgaw-1454_mask.tif
asdgaw-1454.tif
lkafmns-8972_mask.tif
lkafmns-8972.tif
etc.
This is what I tried, it runs without any errors but nothing happens:
import shutil
import os
mask_folder = "//Mask/"
test_folder = "//Test/"
n = 8
list_of_files_mask = []
list_of_files_test = []
for file in os.listdir(mask_folder):
    if not file.startswith('.'):
        list_of_files_mask.append(file)
        start_mask = file[0:n]
        print(start_mask)
for file in os.listdir(test_folder):
    if not file.startswith('.'):
        list_of_files_test.append(file)
        start_test = file[0:n]
        print(start_test)
for file in start_test:
    if start_mask == start_test:
        shutil.copy2(file, test_folder)
I have been searching for a while but have not found a solution to the problem described above, so any help is really appreciated.
First, you want to get only the files, not the folders as well, so you should probably use os.walk() instead of listdir() to make the solution more robust. Read more about it in this question.
Then, I suggest loading the filenames of the test folder into memory (since they are the smaller part) and then NOT load all the other files into memory as well but instead copy them right away.
import os
import shutil
test_dir_path = ''
mask_dir_path = ''
# load file names from test folder into a list
test_file_list = []
for _, _, file_names in os.walk(test_dir_path):
    # 'file_names' is a list of strings
    test_file_list.extend(file_names)
    # exit after this directory, do not check child directories
    break
# check mask folder for matches
for _, _, file_names in os.walk(mask_dir_path):
    for name_1 in file_names:
        # we just remove a part of the filename to get exact matches
        name_2 = name_1.replace('_mask', '')
        # we check if 'name_2' is in the file name list of the test folder
        if name_2 in test_file_list:
            print('we copy {} because {} was found'.format(name_1, name_2))
            shutil.copy2(
                os.path.join(mask_dir_path, name_1),
                test_dir_path)
    # exit after this directory, do not check child directories
    break
Does this solve your problem?
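A variation on the same idea, sketched with pathlib and a set for the lookups. The function name copy_matching_masks and its suffix parameter are assumptions for illustration; it only strips _mask from the end of the stem, so a name that merely contains _mask elsewhere is not altered.

```python
import shutil
from pathlib import Path

def copy_matching_masks(mask_dir, test_dir, suffix='_mask'):
    """Copy <stem>_mask.<ext> from mask_dir when <stem>.<ext> is in test_dir."""
    mask_dir, test_dir = Path(mask_dir), Path(test_dir)
    # A set gives fast membership checks for the test file names
    test_names = {p.name for p in test_dir.iterdir() if p.is_file()}
    copied = []
    for mask in sorted(mask_dir.iterdir()):
        if not mask.is_file() or not mask.stem.endswith(suffix):
            continue
        # 'asdgaw-1454_mask.tif' -> 'asdgaw-1454.tif'
        partner = mask.stem[:-len(suffix)] + mask.suffix
        if partner in test_names:
            shutil.copy2(mask, test_dir / mask.name)
            copied.append(mask.name)
    return copied
```

The returned list shows which mask files were copied, which makes it easier to see why nothing happens when the folder paths are wrong.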

Action to specific files while in os.walk loop

I have 50 instances of two files that are in 50 separate folders within a directory. I am trying to read and extract information from the two files within each folder and append the info from both files to lists at the same time, while in the folder that contains them both (so they will be associated by sharing the same list index). I'm using os.walk and opening each file as soon as it is recognized (or trying to). When I run it, it seems like the files in question are never opened, and definitely nothing is appended to my lists. Could someone tell me if what I have here is completely ridiculous? It seems logical to me, but it's not working.
import os
import sys
#import itertools
def get_theList():
    #specify directory where jobs are located
    #can also set 'os.curdir' to rootDir to read from current
    rootDir = '/home/my.user.name/O1/injections/test'
    # No issues here; this is correct
    B_sig = []
    B_gl = []
    SNR_net = []
    a = 0
    for root, dirs, files in os.walk(rootDir):
        for folder in dirs:
            for file in folder:
                if file == 'evidence_stacked.dat':
                    print 'open'
                    a+=1
                    ev_file = open(file,"r")
                    ev_lin = ev_file.split()
                    B_gl.append(ev_lin[1])
                    B_sig.append(ev_lin[2])
                    print ev_lin[1]
                    ev_file.close()
                if file == 'snr.txt':
                    net_file = open(file,"r")
                    net_lines=net_file.readlines()
                    SNR_net.append(net_lines[2])
                    net_file.close()
    print 'len a'
    print a
    # This says 0 on output
    print 'B_sig'
    print B_sig
    print len(B_sig)
    print 'B_net'
    print B_gl
    print len(B_gl)
    print 'SNR_net'
    print SNR_net
    print len(SNR_net)
if __name__ == "__main__":
    get_theList()
From help(os.walk):
filenames is a list of the names of the non-directory files in dirpath.
You're checking to see if a list is equal to a string.
files == 'evidence_stacked.dat'
What you really want to do is one of the following:
for file in files:
    if file == 'evidence_stacked.dat':
        ...
Or...
if 'evidence_stacked.dat' in files:
    ...
Both will work, but the latter is a bit more efficient.
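In response to your edit:
Iterating over folder loops over the characters of the folder name, not over the files inside it. Instead of...
for file in folder:
    ...
use...
for file in os.listdir(os.path.join(root, folder)):
    ...
Here root is the directory os.walk is currently visiting. Also, where you use file after that, replace it with
os.path.join(root, folder, file)
or store that in a new variable (like, say, file2) and use that in place of file.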
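Putting these points together, a rough sketch of the whole loop might look like this. It assumes, based only on the indices used in the question, that evidence_stacked.dat holds at least three whitespace-separated values and that snr.txt has at least three lines; the function name collect_results is made up, and the third line of snr.txt is stripped of its newline, which the original code did not do.

```python
import os

def collect_results(root_dir):
    """Gather values from every folder that holds both result files."""
    B_gl, B_sig, SNR_net = [], [], []
    for root, dirs, files in os.walk(root_dir):
        dirs.sort()  # visit subfolders in name order for a stable result
        # 'files' lists the file names in 'root'; require both files so the
        # three lists stay aligned by index
        if 'evidence_stacked.dat' not in files or 'snr.txt' not in files:
            continue
        with open(os.path.join(root, 'evidence_stacked.dat')) as ev_file:
            ev_fields = ev_file.read().split()
        with open(os.path.join(root, 'snr.txt')) as net_file:
            net_lines = net_file.readlines()
        B_gl.append(ev_fields[1])
        B_sig.append(ev_fields[2])
        SNR_net.append(net_lines[2].strip())
    return B_gl, B_sig, SNR_net
```

Because os.walk already hands over the file names for each folder it visits, there is no need for a separate loop over dirs; joining root with the file name gives a path that open() can find regardless of the current working directory.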
