I want to use openpyxl.load_workbook(r"mypath"), but the only difference is that mypath is a variable path that changes on every iteration of a loop over different folders.
PathsList = []
for folderName, subFolders, fileNames in os.walk
fileNamesList.append(os.path.basename(fileName))
PathsList.append(os.path.abspath(fileName))
for i in range(len(fileNamesList)):
    j = 1
    while j < len(fileNamesList):
        if(first3isdigit(fileNamesList[i])) == (first3isdigit(fileNamesList[j])):
            if(in_fileName_DOORS in str(fileNamesList[i]) and in_fileName_TAF in str(fileNamesList[j])):
                mypath = PathsList[i]
                File = openpyxl.load_workbook(r'mypath ')
                wsFile = File.active
mypath is not read as a variable. Is there any solution?
Edit 1: I also thought about
File = openpyxl.load_workbook(exec(r'%s' % (mypath))
but I couldn't, since exec can't be used inside the brackets.
This code
File = openpyxl.load_workbook(r'mypath ')
Tries to pass the raw string 'mypath ' as an argument to the load_workbook method.
If you want to pass the contents of the mypath variable to the method, remove the quotation marks and the r prefix:
File = openpyxl.load_workbook(mypath)
This is basic Python syntax. You can read more about it in the documentation.
Please let me know if this is what you needed.
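To make the difference concrete, here is a minimal sketch that uses only the standard library. os.path.exists stands in for openpyxl.load_workbook, and the file name report.xlsx is made up for illustration; the point is only that a quoted name is text, while an unquoted name is the variable.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file standing in for one of the workbooks found by the loop.
    mypath = os.path.join(tmp, "report.xlsx")
    open(mypath, "w").close()

    # The quoted literal is just the text "mypath " (with a trailing space),
    # so it has nothing to do with the variable of the same name.
    literal_found = os.path.exists(r'mypath ')

    # Without quotes, the path stored in the variable is passed.
    variable_found = os.path.exists(mypath)

print(literal_found, variable_found)
```

The same applies to load_workbook: it receives whatever string the expression evaluates to, so load_workbook(mypath) gets the current path on each loop iteration.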
Edit:
If the slashes are a concern you can do the following:
File = openpyxl.load_workbook(mypath.replace('\\','/'))
Hey, I've looked around but can't seem to find an answer. I am looking to identify and print the number of files in a list and their names, but keep running into an error. I am new to Python, so I am quite sure I got something wrong, and I apologize if this is a stupid question. Below is the code I have so far:
import os
folderpath = "C:\Users\Michaelf\Desktop\GEOG M173\LabData"
filelist = os.listdir(folderpath)
print filelist
Counter_Shapefiles = 0
Names_of_Shapefiles = 0
for the_file_name in filelist:
    File_Extension = the_file_name[-4:]
    if "file_Extension == .shp":
        Counter_Shapefiles= Counter_Shapefiles + 1
        Names_of_Shapefiles.append
To use append you need a list, not an int, so
Names_of_Shapefiles = 0
should be
Names_of_Shapefiles = []
Second, the syntax for append is Names_of_Shapefiles.append(the_file_name)
Names_of_Shapefiles is an int. Change it to a list and pass what you want appended to the append call.
Also, when asking questions, include the errors you get for future reference.
import os
folderpath = "C:\Users\Michaelf\Desktop\GEOG M173\LabData"
filelist = os.listdir(folderpath)
print filelist
Counter_Shapefiles = 0
Names_of_Shapefiles = []
for the_file_name in filelist:
    File_Extension = the_file_name[-4:]
    if File_Extension == ".shp":
        Counter_Shapefiles = Counter_Shapefiles+1
        Names_of_Shapefiles.append(the_file_name)
Have a look at the changes that I've made to your code.
For if statements you don't want your condition to be in quotation marks, as that turns it into a string. If you want to make it clear that it's your statement then you can use brackets, but it's not necessary
In that same if statement you type file_Extension without a capital f, which isn't the same as File_Extension, so your if statement doesn't know what it's looking for.
For your ".shp" string, that does need to be in quotation marks, to make it clear that it's a string.
When defining your Names_of_Shapefiles list, you need to initialise it with square brackets ([]). Assigning 0 makes it an integer, and an integer has no append method.
The .append is a function, and takes input; how else would your program know what to append to the Names_of_Shapefiles array? This is why you put what you want appending inside the brackets at the end.
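Putting these points together, here is a minimal Python 3 sketch of the same idea. It is not the code from the question: the helper name find_shapefiles and the sample file names are made up for illustration, and os.path.splitext replaces the [-4:] slice so that the extension check does not depend on its length.

```python
import os
import tempfile

def find_shapefiles(folderpath):
    """Return the sorted names in folderpath whose extension is .shp."""
    names = []  # a list, so .append() is available
    for name in os.listdir(folderpath):
        extension = os.path.splitext(name)[1]
        if extension.lower() == ".shp":
            names.append(name)
    return sorted(names)

# Demonstration in a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("roads.shp", "roads.dbf", "rivers.shp", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    shapefiles = find_shapefiles(tmp)

print(len(shapefiles), shapefiles)
```

The count is simply len(shapefiles), so a separate counter variable is not needed.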
I have a list phplist containing the following strings (example below), there are many more, this is a snippet of the entire list
/home/comradec/public_html/moodle/config.php
/home/comradec/public_html/moodle/cache/classes/config.php
/home/comradec/public_html/moodle/theme/sky_high/config.php
/home/comradec/public_html/moodle/theme/brick/config.php
/home/comradec/public_html/moodle/theme/serenity/config.php
/home/comradec/public_html/moodle/theme/binarius/config.php
/home/comradec/public_html/moodle/theme/anomaly/config.php
/home/comradec/public_html/moodle/theme/standard/config.php
What I am trying to do is only keep the subdir/config.php file and exclude all other config.php files (eg cache/classes/config.php).
Full code is
for folder, subs, files in os.walk(path):
    for filename in files:
        if filename.endswith('.php'):
            phplist.append(abspath(join(folder, filename)))
for i in phplist:
    if i.endswith("/config.php"):
        cmsconfig.append(i)
    if i.endswith("/mdeploy.php"):
        cmslist.append(cms1[18])
So the intended outcome is that only the /config.php file path is added to the list cmsconfig, but what is happening is that I get all the config.php files, as in the example at the top.
I have been using code like not i.endswith("/theme/brick/config.php"), but I want a way to exclude the whole theme directory from the list.
The reason I am placing the output into a list is I use that output in another area of the code.
Change your if-condition to if i.endswith("moodle/config.php").
If you want to choose which folder to do this with:
path_ending = '%s/config.php' % folder_name
Now change the if-condition to if i.endswith(path_ending)
This will keep only the paths that end with config.php directly inside the folder that you passed.
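As a quick check of this approach, the sketch below applies the endswith test to three of the paths from the question. The leading slash in path_ending is my addition (an assumption, not part of the answer above); it prevents a folder such as "mymoodle" from matching as well.

```python
# Three sample paths taken from the question.
phplist = [
    "/home/comradec/public_html/moodle/config.php",
    "/home/comradec/public_html/moodle/cache/classes/config.php",
    "/home/comradec/public_html/moodle/theme/brick/config.php",
]

folder_name = "moodle"
# Leading slash added so that only a folder named exactly "moodle" matches.
path_ending = "/%s/config.php" % folder_name

cmsconfig = [p for p in phplist if p.endswith(path_ending)]
print(cmsconfig)
```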
I think this is what you want. You may want to change the variable names, as they are not PEP 8 style.
First I sort all entries so that the shortest comes first, then I remember which directories have already been checked.
url1 = '/home/comradec/public_html/moodle/theme/binarius/config.php'
url2 = '/home/comradec/public_html/moodle/config.php'
url3 = '/home/comradec/public_html/othername/theme/binarius/config.php'
url4 = '/home/comradec/public_html/othername/config.php'
urls = []
urls.append(url1)
urls.append(url2)
urls.append(url3)
urls.append(url4)
moodleUrls = []
checkedDirs = []
# sort by length so that a directory's own config.php comes before those in its subdirectories
for i in sorted(urls, key=len):
    if str(i).endswith('config.php'):
        alreadyChecked = False
        for checkedDir in checkedDirs:
            if str(i).startswith(checkedDir):
                alreadyChecked = True
                break
        if not alreadyChecked:
            moodleUrls.append(i)
            checkedDirs.append(str(i).replace('/config.php',''))
print(checkedDirs)
print(moodleUrls)
Output:
['/home/comradec/public_html/moodle', '/home/comradec/public_html/othername']
['/home/comradec/public_html/moodle/config.php', '/home/comradec/public_html/othername/config.php']
This is how I resolved my question. It provides the output I am looking for.
import os
from os.path import abspath, join

path = "/home/comradec"
phplist = []
cmsconfig = []
config = "config.php"
# collect every .php file below path
for folder, subs, files in os.walk(path):
    for filename in files:
        if filename.endswith('.php'):
            phplist.append(abspath(join(folder, filename)))
for i in phplist:
    if i.endswith("/mdeploy.php"):
        # swap the trailing "mdeploy.php" (11 characters) for "config.php"
        newurl = i
        newurl = newurl[:-11]
        newurl = newurl + config
        # keep the config.php that sits next to this mdeploy.php
        for j in phplist:
            if j.endswith("/config.php"):
                confirmurl = j
                if confirmurl == newurl:
                    cmsconfig.append(newurl)
print('\n'.join(cmsconfig))
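The same idea (keep the config.php that sits beside an mdeploy.php) can be sketched with a set lookup instead of a second pass over the list. This is an alternative sketch, not the code above: the function name configs_next_to_mdeploy is made up, and the mdeploy.php path in the sample list is an assumed example, since the question does not show one.

```python
import posixpath

def configs_next_to_mdeploy(phplist):
    """Return each config.php path that shares a directory with an mdeploy.php."""
    known = set(phplist)  # membership tests on a set avoid re-scanning the list
    found = []
    for p in phplist:
        if posixpath.basename(p) == "mdeploy.php":
            candidate = posixpath.join(posixpath.dirname(p), "config.php")
            if candidate in known:
                found.append(candidate)
    return found

sample = [
    "/home/comradec/public_html/moodle/config.php",
    "/home/comradec/public_html/moodle/mdeploy.php",  # assumed path, for illustration
    "/home/comradec/public_html/moodle/theme/brick/config.php",
]
cmsconfig = configs_next_to_mdeploy(sample)
print('\n'.join(cmsconfig))
```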
Let's say I have the following files in a directory:
snackbox_1a.dat
zebrabar_3z.dat
cornrows_00.dat
meatpack_z2.dat
I have SEVERAL of these directories, in which all of the files are of the same format, ie:
snackbox_xx.dat
zebrabar_xx.dat
cornrows_xx.dat
meatpack_xx.dat
So what I KNOW about these files is the first bit (snackbox, zebrabar, cornrows, meatpack). What I don't know is the bit for the file extension (the 'xx'). This changes both within the directory across the files, and across the directories (so another directory might have different xx values, like 12, yy, 2m, 0t, whatever).
Is there a way for me to rename all of these files, or truncate them all (since the xx.dat will always be the same length), for ease of use when attempting to call them? For instance, I'd like to rename them so that I can, in another script, use a simple index to step through and find the file I want (instead of having to go into each directory and pull the file out manually).
In other words, I'd like to change the file names to:
snackbox.dat
zebrabar.dat
cornrows.dat
meatpack.dat
Thanks!
You can use shutil.move to move files. To calculate the new filename, you can use Python's string split method:
original_name = "snackbox_12.dat"
truncated_name = original_name.split("_")[0] + ".dat"
Try re.sub:
import re
filename = 'snackbox_xx.dat'
filename_new = re.sub(r'_[A-Za-z0-9]{2}', '', filename)
You should get 'snackbox.dat' for filename_new
This assumes the two characters after the "_" are either a number or lowercase/uppercase letter, but you could choose to expand the classes included in the regular expression.
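If the known part of a name may itself contain an underscore followed by two alphanumeric characters, one possible refinement is to anchor the pattern to the end of the name. The sketch below is an assumption on my part rather than part of the answer: it adds a lookahead so that only the two characters directly before ".dat" are removed, and the helper name strip_suffix is made up.

```python
import re

# Match "_xx" only when it is immediately followed by ".dat" at the end of the name.
SUFFIX = re.compile(r'_[A-Za-z0-9]{2}(?=\.dat$)')

def strip_suffix(filename):
    """Return filename with the two-character suffix before .dat removed."""
    return SUFFIX.sub('', filename)

names = ['snackbox_1a.dat', 'zebrabar_3z.dat', 'cornrows_00.dat', 'meatpack_z2.dat']
renamed = [strip_suffix(n) for n in names]
print(renamed)
```

With this pattern a name such as my_ab_file_xx.dat keeps its inner "_ab" and loses only the trailing "_xx".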
EDIT: including moving and recursive search:
import shutil, re, os, fnmatch
directory = 'your_path'
for root, dirnames, filenames in os.walk(directory):
    for filename in fnmatch.filter(filenames, '*.dat'):
        filename_new = re.sub(r'_[A-Za-z0-9]{2}', '', filename)
        shutil.move(os.path.join(root, filename), os.path.join(root, filename_new))
This solution renames all files in the current directory that match the pattern in the function call.
What the function does
snackbox_5R.txt >>> snackbox.txt
snackbox_6y.txt >>> snackbox_0.txt
snackbox_a2.txt >>> snackbox_1.txt
snackbox_Tm.txt >>> snackbox_2.txt
Let's look at the function's inputs and some examples.
list_of_files_names This is a list of strings, where each string is the filename without the _?? part.
Examples:
['snackbox.txt', 'zebrabar.txt', 'cornrows.txt', 'meatpack.txt', 'calc.txt']
['text.dat']
upper_bound=1000 This is an integer. When the ideal filename is already taken, e.g. snackbox.dat already exists, it will create snackbox_0.dat, going all the way up to snackbox_999.dat if need be. You shouldn't have to change the default.
The Code
import re
import os
import os.path
def find_and_rename(dir, list_of_files_names, upper_bound=1000):
    """
    :param list_of_files_names: List. A list of strings: filename (without the _??) + extension, e.g. snackbox.txt
    Renames snackbox_R5.dat to snackbox.dat, etc.
    """
    # split each item in list_of_files_names into two parts, filename and extension: "snackbox.dat" -> ("snackbox", "dat")
    list_of_files_names = [(prefix.split('.')[0], prefix.split('.')[1]) for prefix in list_of_files_names]
    # store the content of the dir in a list
    list_of_files_in_dir = os.listdir(dir)
    for file_in_dir in list_of_files_in_dir:  # all files and folders in the given dir
        file_in_dir_full_path = os.path.join(dir, file_in_dir)  # the full path is needed for .isfile() and os.rename()
        print()  # DEBUG
        print('Is "{}" a file?: '.format(file_in_dir), end='')  # DEBUG
        print(os.path.isfile(file_in_dir_full_path))  # DEBUG
        if os.path.isfile(file_in_dir_full_path):  # filters out the folders, only files are needed
            # file_name is a tuple containing the filename prefix and the extension
            for file_name in list_of_files_names:  # check if the file matches one of our renaming prefixes
                # match both the file name (e.g. "snackbox") and the extension (e.g. "txt")
                # It finds "snackbox_5R.txt" by matching "snackbox" at the front and "txt" at the rear
                if re.match(r'{}_\w+\.{}'.format(file_name[0], file_name[1]), file_in_dir):
                    print('\nOriginal File: ' + file_in_dir)  # printing this is not necessary
                    print('.'.join(file_name))
                    ideal_new_file_name = '.'.join(file_name)  # name might already be taken
                    if os.path.isfile(os.path.join(dir, ideal_new_file_name)):  # file already exists
                        # go up a name, e.g. "snackbox.dat" --> "snackbox_0.dat" --> "snackbox_1.dat"
                        for index in range(upper_bound):
                            # check if this new name already exists as well
                            next_best_name = file_name[0] + '_' + str(index) + '.' + file_name[1]
                            # file does not already exist; otherwise keep increasing the index
                            if not os.path.isfile(os.path.join(dir, next_best_name)):
                                print('Renaming with next best name')
                                os.rename(file_in_dir_full_path, os.path.join(dir, next_best_name))
                                break
                    # file with ideal name does not already exist, rename with the ideal name (no _##)
                    else:
                        print('Renaming with ideal name')
                        os.rename(file_in_dir_full_path, os.path.join(dir, ideal_new_file_name))

def find_and_rename_include_sub_dirs(master_dir, list_of_files_names, upper_bound=1000):
    for path, subdirs, files in os.walk(master_dir):
        print(path)  # DEBUG
        find_and_rename(path, list_of_files_names, upper_bound)

find_and_rename_include_sub_dirs('C:/Users/Oxen/Documents/test_folder', ['snackbox.txt', 'zebrabar.txt', 'cornrows.txt', 'meatpack.txt', 'calc.txt'])
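The name-picking rule used above (try the ideal name first, then _0, _1, and so on up to upper_bound) can be looked at in isolation. The following is a minimal sketch of that rule only, working on a set of names instead of the file system; the function name next_free_name and the sample names are made up for illustration.

```python
def next_free_name(stem, ext, taken, upper_bound=1000):
    """Return the first unused name among stem.ext, stem_0.ext, stem_1.ext, ...

    Returns None if every candidate below upper_bound is already taken.
    """
    ideal = "{}.{}".format(stem, ext)
    if ideal not in taken:
        return ideal
    for index in range(upper_bound):
        candidate = "{}_{}.{}".format(stem, index, ext)
        if candidate not in taken:
            return candidate
    return None

taken = {"snackbox.txt", "snackbox_0.txt"}
first = next_free_name("zebrabar", "txt", taken)
second = next_free_name("snackbox", "txt", taken)
print(first, second)
```

Separating the rule from the renaming makes it easy to check the behaviour without touching any files.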
I am trying to rename files so that they contain an ID followed by a -(int). The files generally come to me in this way but sometimes they come as 1234567-1(crop to bottom).jpg.
I have been trying to use the following code but my regular expression doesn't seem to be having any effect. The reason for the walk is because we have to handles large directory trees with many images.
def fix_length():
    for root, dirs, files in os.walk(path):
        for fn in files:
            path2 = os.path.join(root, fn)
            filename_zero, extension = os.path.splitext(fn)
            re.sub("[^0-9][-]", "", filename_zero)
            os.rename(path2, filename_zero + extension)
fix_length()
I have inserted print statements for filename_zero before and after the re.sub line and I am getting the same result (i.e. 1234567-1(crop to bottom) not what I wanted)
This raises an exception as the rename is trying to create a file that already exists.
I thought perhaps adding the [-] in the regex was the issue but removing it and running again I would then expect 12345671.jpg but this doesn't work either. My regex is failing me or I have failed the regex.
Any insight would be greatly appreciated.
As a follow up, I have taken all the wonderful help and settled on a solution to my specific problem.
import os
import re
import shutil

path = r'C:\Archive'
errors = r'C:\Test\errors'
num_files = []
def best_sol():
    num_files = []
    for root, dirs, files in os.walk(path):
        for fn in files:
            filename_zero, extension = os.path.splitext(fn)
            path2 = os.path.join(root, fn)
            ID = re.match(r'^\d{1,10}', fn).group()  # leading digits of the filename
            if len(ID) <= 7:
                if ID not in num_files:
                    # new ID: start counting again from 1
                    num_files = []
                    num_files.append(ID)
                    suffix = str(len(num_files))
                    os.rename(path2, os.path.join(root, ID + '-' + suffix + extension))
                else:
                    num_files.append(ID)
                    suffix = str(len(num_files))
                    os.rename(path2, os.path.join(root, ID + '-' + suffix + extension))
            else:
                # ID too long: move the file to the errors folder for investigation
                shutil.copy(path2, errors)
                os.remove(path2)
This code creates an ID from (up to) the first 10 numeric characters in the filename. I then use a list that stores the instances of this ID and use the length of the list to append a suffix. The first file will have a -1, the second a -2, etc.
I am only interested (or they should only be) in ID's with a length of 7 but allow to read up to 10 to allow for human error in labelling. All files with ID longer than 7 are moved to a folder where we can investigate.
Thanks for pointing me in the right direction.
re.sub() returns the altered string, but you ignore the return value.
You want to re-assign the result to filename_zero:
filename_zero = re.sub(r"[^\d-]", "", filename_zero)
I've corrected your regular expression as well; this removes anything that is not a digit or a dash from the base filename:
>>> re.sub(r'[^\d-]', '', '1234567-1(crop to bottom)')
'1234567-1'
Remember, strings are immutable, you cannot alter them in-place.
If all you want is the leading digits, plus optional dash-digit suffix, select the characters to be kept, rather than removing what you don't want:
filename_zero = re.match(r'^\d+(?:-\d)?', filename_zero).group()
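The points above can be seen side by side in a short sketch. It uses the example name from the question; the variable names unchanged, cleaned and kept are only for illustration.

```python
import re

filename_zero = '1234567-1(crop to bottom)'

# Calling re.sub without using its return value leaves the string untouched.
re.sub(r'[^\d-]', '', filename_zero)
unchanged = filename_zero

# Re-assigning the result keeps only digits and dashes.
cleaned = re.sub(r'[^\d-]', '', filename_zero)

# Alternatively, select the part to keep: leading digits plus an optional "-digit".
match = re.match(r'^\d+(?:-\d)?', filename_zero)
kept = match.group() if match else filename_zero  # fall back if there are no leading digits

print(unchanged, cleaned, kept)
```

The fallback on the last assignment is an assumption about what should happen to names without leading digits; re.match returns None for those, so calling .group() directly would raise an AttributeError.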
new_filename = re.sub(r'^([0-9]+)-([0-9]+).*$', r'\g<1>-\g<2>', filename_zero)
Try using this regular expression instead; I don't use Python often, so treat it as a suggestion. It captures the ID and the number after the dash, and the trailing .* consumes everything else so that it is dropped from the result. You also appear to have forgotten to assign the value returned by the re.sub call to the filename_zero variable.
I have a folder with over 100,000 files, all numbered with the same stub, but without leading zeros, and the numbers aren't always contiguous (usually they are, but there are gaps) e.g:
file-21.png,
file-22.png,
file-640.png,
file-641.png,
file-642.png,
file-645.png,
file-2130.png,
file-2131.png,
file-3012.png,
etc.
I would like to batch process this to create padded, contiguous files. e.g:
file-000000.png,
file-000001.png,
file-000002.png,
file-000003.png,
When I parse the folder with for filename in os.listdir('.'): the files don't come up in the order I'd like them to. Understandably they come up
file-1,
file-1x,
file-1xx,
file-1xxx,
etc. then
file-2,
file-2x,
file-2xx,
etc. How can I get it to go through in the order of the numeric value? I am a complete python noob, but looking at the docs i'm guessing I could use map to create a new list filtering out only the numerical part, and then sort that list, then iterate that? With over 100K files this could be heavy. Any tips welcome!
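import os
import re
thenum = re.compile(r'^file-(\d+)\.png$')
def bynumber(fn):
    mo = thenum.match(fn)
    if mo: return int(mo.group(1))
# keep only matching names, so bynumber never returns None (None cannot be sorted against ints in Python 3)
allnames = [fn for fn in os.listdir('.') if thenum.match(fn)]
allnames.sort(key=bynumber)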
Now you have the files in the order you want them and can loop
for i, fn in enumerate(allnames):
    ...
using the progressive number i (which will be 0, 1, 2, ...) padded as you wish in the destination-name.
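A complete version of this approach might look like the sketch below. It is an illustration under stated assumptions, not the answerer's code: the function name renumber and the width parameter are made up, the padding uses str.zfill, and it assumes that no two files share the same number and that none of the target names is already in use by another file.

```python
import os
import re
import tempfile

thenum = re.compile(r'^file-(\d+)\.png$')

def renumber(folder, width=6):
    """Rename file-N.png files in folder to contiguous, zero-padded names."""
    numbered = []
    for fn in os.listdir(folder):
        mo = thenum.match(fn)
        if mo:
            numbered.append((int(mo.group(1)), fn))
    numbered.sort()  # numeric order, because the int comes first in each tuple
    for i, (_, fn) in enumerate(numbered):
        newname = 'file-' + str(i).zfill(width) + '.png'
        os.rename(os.path.join(folder, fn), os.path.join(folder, newname))

# Demonstration in a throwaway folder using a few of the numbers from the question.
with tempfile.TemporaryDirectory() as tmp:
    for n in (21, 22, 640, 3012):
        open(os.path.join(tmp, 'file-%d.png' % n), 'w').close()
    renumber(tmp)
    result = sorted(os.listdir(tmp))

print(result)
```

Sorting 100,000 short tuples is cheap compared with the renames themselves, so the size of the folder should not be a concern for the sorting step.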
There are three steps. The first is getting all the filenames. The second is converting the filenames. The third is renaming them.
If all the files are in the same folder, then glob should work.
import glob
filenames = glob.glob("/path/to/folder/*.txt")
Next, you want to change the name of the file. You can print with padding to do this.
>>> filename = "file-338.txt"
>>> import os
>>> fnpart = os.path.splitext(filename)[0]
>>> fnpart
'file-338'
>>> _, num = fnpart.split("-")
>>> num.rjust(5, "0")
'00338'
>>> newname = "file-%s.txt" % num.rjust(5, "0")
>>> newname
'file-00338.txt'
Now, you need to rename them all. os.rename does just that.
os.rename(filename, newname)
To put it together:
for filename in glob.glob("/path/to/folder/*.txt"): # loop through each file
    newname = make_new_filename(filename) # create a function that does step 2, above
    os.rename(filename, newname)
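The make_new_filename function is left to the reader above. One possible sketch follows; the width parameter is my own addition. Because glob.glob returns paths that include the folder, the sketch keeps the directory part so that the file is renamed in place rather than moved to the current working directory.

```python
import os

def make_new_filename(filename, width=5):
    """Return filename with the number after the last '-' left-padded with zeros."""
    folder, base = os.path.split(filename)  # keep the directory part
    stem, ext = os.path.splitext(base)      # "file-338", ".txt"
    prefix, num = stem.rsplit("-", 1)       # "file", "338"
    return os.path.join(folder, "%s-%s%s" % (prefix, num.rjust(width, "0"), ext))

example = make_new_filename(os.path.join("folder", "file-338.txt"))
print(example)
```

Names without a "-" would raise a ValueError at the rsplit line, so a real script may want to skip those first.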
Thank you all for your suggestions, I will try them all to learn the different approaches. The solution I went for is based on using a natural sort on my filelist, and then iterating that to rename. This was one of the suggested answers but for some reason it has disappeared now so I cannot mark it as accepted!
import os
files = os.listdir('.')
natsort(files)
index = 0
for filename in files:
    os.rename(filename, str(index).zfill(7)+'.png')
    index += 1
where natsort is defined in http://code.activestate.com/recipes/285264-natural-string-sorting/
Why don't you do it in a two-step process? Parse all the files and rename them with padded numbers, then run another script that takes those files, which now sort correctly, and renames them so they're contiguous.
1) Take the number in the filename.
2) Left-pad it with zeros
3) Save name.
def renamer():
    for iname in os.listdir('.'):
        first, second = iname.replace(" ", "").split("-")
        number, ext = second.split('.')
        first, number, ext = first.strip(), number.strip(), ext.strip()
        number = '0'*(6-len(number)) + number # pad the number to be 6 digits long
        oname = first + "-" + number + '.' + ext
        os.rename(iname, oname)
    print("Done")
Hope this helps
The simplest method is given below. You can also modify this script to search recursively.
Use the os module.
Get the filenames.
Rename them with os.rename.
import os
class Renamer:
    def __init__(self, pattern, extension):
        self.ext = extension
        self.pat = pattern

    def rename(self):
        p, e = (self.pat, self.ext)
        number = 0
        for x in os.listdir(os.getcwd()):
            if str(x).endswith(f".{e}"):
                os.rename(x, f'{p}_{number}.{e}')
                number += 1

if __name__ == "__main__":
    pattern = "myfile"
    extension = "txt"
    r = Renamer(pattern=pattern, extension=extension)
    r.rename()