I'm learning Python and also English, and I have a problem that might be easy, but I can't solve it. I have a folder of .txt files, and with a regular expression I was able to extract a sequence of numbers from each one. I rename each file with the sequence I extracted from it:
import os
import random
import re

path_txt = (r'''C:\Users\user\Desktop\Doc_Classifier\TXT''')
for TXT in name_files3:
    with open(path_txt + '\\' + TXT, "r") as content:
        search = re.search(r'(([0-9]{4})(/)(([1][9][0-9][0-9])|([2][0-9][0-9][0-9])))', content.read())
    # Rename only after the file has been closed
    if search is not None:
        name3 = search.group(0)
        name3 = name3.replace("/", "")
        os.rename(os.path.join(path_txt, TXT),
                  os.path.join("Processos3", name3 + "_" + str(random.randint(100, 999)) + ".txt"))
I need to check whether the target file already exists and, if it does, rename with an increment instead. Currently, to differentiate the files, I add a random number to the name (random.randint(100, 999)).
PS: Currently the script finds "7526/2016" in the .txt with the regular expression, removes the "/", and renames the file to "75262016" plus a random number (example: 75262016_111). Instead of a random number, I would like to check whether the file already exists and rename it using an increment (example: 75262016_copy1, 75262016_copy2).
Replace:
os.rename(
    os.path.join(path_txt, TXT),
    os.path.join("Processos3", name3 + "_" + str(random.randint(100, 999)) + ".txt")
)
With:
fp = os.path.join("Processos3", name3 + "_%d.txt")
postfix = 0
while os.path.exists(fp % postfix):
    postfix += 1
os.rename(
    os.path.join(path_txt, TXT),
    fp % postfix
)
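If you want the "_copy1", "_copy2" naming from the question rather than "_0", "_1", the same idea can be wrapped in a small helper. This is only a sketch: the function name next_free_path and the demo file names are made up for illustration, and it runs in a temporary directory so it does not touch real files.

```python
import itertools
import os
import tempfile


def next_free_path(directory, stem, ext=".txt"):
    """Return the first path of the form stem_copyN.ext that does not exist yet."""
    for n in itertools.count(1):
        candidate = os.path.join(directory, "{}_copy{}{}".format(stem, n, ext))
        if not os.path.exists(candidate):
            return candidate


# Demonstration in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    first = next_free_path(tmp, "75262016")
    open(first, "w").close()          # simulate an earlier rename
    second = next_free_path(tmp, "75262016")
    print(os.path.basename(first))    # 75262016_copy1.txt
    print(os.path.basename(second))   # 75262016_copy2.txt
```

In the original loop you would then call os.rename(os.path.join(path_txt, TXT), next_free_path("Processos3", name3)). Note that checking and then renaming is not atomic, so this is only safe if nothing else writes to the folder at the same time.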
The code below lists the files in the current working directory and looks for a base filename and its increments. As soon as it finds an unused name, it opens a file with that name and writes to it. So if you already have the files "foo.txt", "foo1.txt", and "foo2.txt", the code will make a new file named "foo3.txt".
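import os
filenames = os.listdir()
our_filename = "foo"
extension = ".txt"
cur = 0
cur_filename = our_filename + extension  # start with the bare name, "foo.txt"
stuff = "some text"  # placeholder for whatever you want to write
while True:
    if cur_filename in filenames:
        cur += 1
        cur_filename = our_filename + str(cur) + extension
    else:
        # found a filename that doesn't exist
        with open(cur_filename, 'w') as f:
            f.write(stuff)
        break  # stop once the file has been written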
I have a path like this one:
path = "./corpus_test/corpus_ix_test_FMC.xlsx"
I want to retrieve the name of the file without ".xlsx" and without the other parts of the path.
I know I could slice with a fixed index, but in some cases the file is different and the path is not the same, for example:
path2 = "./corpus_ix/corpus_polarity_test_FMC.xlsx"
I am looking for a regular expression or a method that retrieves only the name in both cases. For example, if I read a whole directory with lots of files, a fixed index won't help me.
Is there a way in Python to tell it to start slicing at the last "/", so that I only retrieve the index of that "/" and add 1 to get the start position?
Here is what I tried, but I'm still working on it:
path = "./corpus_test/corpus_ix_test_FMC.xlsx"
list_of_index = []
for f in path:
    if f == "/":
        ind = path.index(f)
        list_of_index.append(ind)
ind_to_start_count = max(list_of_index) + 1
print("the index of the last '/' is:", max(list_of_index))
name_of_file = path[ind_to_start_count:-5]
But the print gives me 1 for each "/". Is there a way to get the index of each character within the string?
In both cases the index of "/" comes out as 1.
I also wanted to split the string into characters, but I get an error with:
path ="./corpus_test/corpus_ix_test_FMC.xlsx"
path_string = path.split("")
print(path_string)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-b8bdc29c19b1> in <module>
1 path ="./corpus_test/corpus_ix_test_FMC.xlsx"
2
----> 3 path_string = path.split("")
4 print(path_string)
ValueError: empty separator
import os
fullpath = r"./corpus_test/corpus_ix_test_FMC.xlsx"
full_filename = os.path.basename(fullpath)
filename, ext = os.path.splitext(full_filename)
This gives you the base filename without the extension in filename ("corpus_ix_test_FMC") and the extension in ext (".xlsx").
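As for why the attempt in the question always reports 1: str.index returns the position of the first match, so every "/" maps back to index 1. A rough sketch of the slicing approach the question asks for, using str.rfind to locate the last "/", is shown next to pathlib's stem for comparison. The helper name name_by_slicing is made up for this example.

```python
from pathlib import PurePosixPath

paths = [
    "./corpus_test/corpus_ix_test_FMC.xlsx",
    "./corpus_ix/corpus_polarity_test_FMC.xlsx",
]


def name_by_slicing(path):
    """Slice from just after the last '/' up to the last '.'."""
    start = path.rfind("/") + 1      # rfind returns -1 if there is no '/', so start becomes 0
    stop = path.rfind(".")
    if stop < start:                 # no dot in the file name itself
        stop = len(path)
    return path[start:stop]


for p in paths:
    print(name_by_slicing(p), PurePosixPath(p).stem)

# To split a string into characters, use list(); str.split("") raises ValueError.
chars = list(paths[0])
print(chars[:3])
```

For real code the os.path or pathlib version is usually the safer choice, since it also copes with paths that have no directory part or no extension.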
This is what I've been using:
def get_suffix(word, delimeter):
    """ Returns the part of word after the last instance of 'delimeter' """
    while delimeter in word:
        word = word.partition(delimeter)[2]
    return word

def get_terminal_path(path):
    """
    Returns the last step of a path
    Edge cases:
    -Delimeters: / or \\ or mixed
    -Ends with delimeter or not
    """
    # Convert "\\" to "/"
    while "\\" in path:
        part = path.partition("\\")
        path = part[0] + "/" + part[2]
    # Check if ends with delimeter
    if path[-1] == "/":
        path = path[0:-1]
    # Get terminal path
    out = get_suffix(path, "/")
    return out.partition(".")[0]
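The same steps (unify the delimiters, drop a trailing one, take the last component, strip the extension) can also be written with built-in string methods. This is a sketch under one deliberate difference: it uses os.path.splitext, which removes only the final extension, whereas partition(".") above cuts at the first dot. The name terminal_stem and the sample paths are invented for the example.

```python
import os


def terminal_stem(path):
    """Return the last path component without its final extension."""
    normalized = path.replace("\\", "/").rstrip("/")   # unify delimiters, drop trailing ones
    last = normalized.rsplit("/", 1)[-1]               # text after the last '/'
    return os.path.splitext(last)[0]                   # strip only the final extension


print(terminal_stem("./corpus_test/corpus_ix_test_FMC.xlsx"))  # corpus_ix_test_FMC
print(terminal_stem("C:\\data/reports\\summary.v2.csv"))       # summary.v2
print(terminal_stem("folder/sub/"))                            # sub
```

Unlike indexing path[-1], this version also accepts an empty string and returns an empty string for it.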
My code works and increments the filename, but only for the first two files; after that it appends new strings to the existing second file. Please help me upgrade the code so the increment goes further.
text = 'some text'
file_path = '/path/to/file'
filename = 'textfile'
i = 1
txtfile = self.file_path + filename + str(i) + '.txt'
if not os.path.exists(txtfile):
    text_file = open(txtfile, "a")
    text_file.write(self.text)
    text_file.close()
elif os.path.exists(txtfile) and i >= 1:
    i += 1
    text_file1 = open(self.file_path + filename + str(i) + '.txt', "a")
    text_file1.write(self.text)
    text_file1.close()
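If your example is part of a loop, you're resetting i to 1 in every iteration. Move the i = 1 outside of that part. Note also that the if/elif checks only one candidate name, so i never gets past 2; to go further you need a loop that keeps incrementing until the name is unused.
It will also start at 1 again when you restart your program, which is sometimes not what you want.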
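One way to avoid tracking i across runs is to let the filesystem decide: try each candidate name in turn and stop at the first one that can be created. The sketch below assumes Python 3, where open(..., "x") fails with FileExistsError if the file is already there. The function name write_to_next_file is hypothetical, and the demo uses a temporary directory.

```python
import os
import tempfile


def write_to_next_file(directory, filename, text, start=1):
    """Write text to the first filename<N>.txt that does not exist yet; return its path."""
    i = start
    while True:
        candidate = os.path.join(directory, "{}{}.txt".format(filename, i))
        try:
            # Mode "x" creates the file and raises FileExistsError if it is already there.
            with open(candidate, "x") as handle:
                handle.write(text)
            return candidate
        except FileExistsError:
            i += 1


with tempfile.TemporaryDirectory() as tmp:
    created = [os.path.basename(write_to_next_file(tmp, "textfile", "some text"))
               for _ in range(3)]
    print(created)  # ['textfile1.txt', 'textfile2.txt', 'textfile3.txt']
```

Because the existence check and the creation happen in a single call, this also works after a restart and does not append to an existing file.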
I am writing a Python script to consolidate images from different folders into a single folder. Multiple image files may have the same name. How can I handle this in Python? I need to rename those as "image_name_0001", "image_name_0002", and so on.
You can maintain a dict with a count of the names seen so far and then use os.rename() to rename the file to the new name.
For example:
dic = {}
list_of_files = ["a","a","b","c","b","d","a"]
for f in list_of_files:
    if f in dic:
        dic[f] += 1
        new_name = "{0}_{1:03d}".format(f, dic[f])
        print(new_name)
    else:
        dic[f] = 0
        print(f)
Output:
a
a_001
b
c
b_001
d
a_002
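Applied to real image paths, the counter has to ignore the source folder and keep the extension at the end. A possible sketch, using the four-digit suffix from the question, is below. plan_names and the sample paths are invented for illustration, and the function only plans the new names without moving anything.

```python
import os
from collections import Counter


def plan_names(paths):
    """Map each source path to a target name, numbering repeated base names."""
    seen = Counter()
    plan = []
    for src in paths:
        stem, ext = os.path.splitext(os.path.basename(src))
        seen[stem + ext] += 1
        count = seen[stem + ext]
        # First occurrence keeps its name; later ones get _0001, _0002, ...
        target = stem + ext if count == 1 else "{}_{:04d}{}".format(stem, count - 1, ext)
        plan.append((src, target))
    return plan


sources = ["a/cat.jpg", "b/cat.jpg", "b/dog.png", "c/cat.jpg"]  # hypothetical paths
for src, target in plan_names(sources):
    print(src, "->", target)
```

This assumes no source file is already literally named something like cat_0001.jpg; if that can happen, check the output folder with os.path.exists before moving.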
If you have the root filename, i.e. name = 'image_name', the extension, extension = '.jpg', and the path to the output folder, path, you can do the following for each file (src stands for the file's current location):
import os
import shutil

target = os.path.join(path, name + extension)
num = 0
# Keep trying image_name_001, image_name_002, ... until the name is free
while os.path.exists(target):
    num += 1
    modifier = '_{:03d}'.format(num)
    target = os.path.join(path, name + modifier + extension)
shutil.move(src, target)
Wrap this in whatever loop you use to walk the source folders, and you should get the gist.
I am trying to:
1. Loop through a bunch of files.
2. Make some changes.
3. Copy the old file to a subdirectory. Here's the kicker: I don't want to overwrite the file in the new directory if it already exists (e.g. if "Filename.mxd" already exists, then copy and rename to "Filename_1.mxd"; if "Filename_1.mxd" exists, then copy the file as "Filename_2.mxd", and so on).
4. Save the file (but do a save, not a save as, so that it overwrites the existing file).
It goes something like this:
for filename in glob.glob(os.path.join(folderPath, "*.mxd")):
    fullpath = os.path.join(folderPath, filename)
    mxd = arcpy.mapping.MapDocument(filename)
    if os.path.isfile(fullpath):
        basename, filename2 = os.path.split(fullpath)
        # Make some changes to my file here
        # Copy the in memory file to a new location. If the file name already exists, then rename the file with the next instance of i (e.g. filename + "_" + i)
        for i in range(50):
            if i > 0:
                print "Test1"
                if arcpy.Exists(draftloc + "\\" + filename2) or arcpy.Exists(draftloc + "\\" + shortname + "_" + str(i) + extension):
                    print "Test2"
                    pass
                else:
                    print "Test3"
                    arcpy.Copy_management(filename2, draftloc + "\\" + shortname + "_" + str(i) + extension)
    mxd.save()
So, two things. First, I decided to just set the range well beyond the number of files I ever expect to occur (50). I'm sure there's a better way of doing this, by just incrementing to the next number without setting a range.
Second, as you may see, the script saves a copy for every value in the range. I just want to save it once, under the next value of i that is not already taken.
Hope this makes sense,
Mike
Use a while loop instead of a for loop. Use the while loop to find the appropriate i, and then save afterwards.
The code/pseudocode would look like:
result_name = original name
i = 0
while arcpy.Exists(result_name):
    i+=1
    result_name = draftloc + "\\" + shortname + "_" + str(i) + extension
save as result_name
This should fix both issues.
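For readers without arcpy, the same while loop can be tried with the standard library. In this sketch os.path.exists stands in for arcpy.Exists and shutil.copy2 for the copy step. The function name copy_without_overwrite and the "drafts" folder are assumptions made for the demo, which runs in a temporary directory.

```python
import os
import shutil
import tempfile


def copy_without_overwrite(src, dest_dir):
    """Copy src into dest_dir, appending _1, _2, ... if the name is taken."""
    shortname, extension = os.path.splitext(os.path.basename(src))
    result_name = os.path.join(dest_dir, shortname + extension)
    i = 0
    while os.path.exists(result_name):   # stand-in for arcpy.Exists
        i += 1
        result_name = os.path.join(dest_dir, "{}_{}{}".format(shortname, i, extension))
    shutil.copy2(src, result_name)       # exactly one copy, made after the loop
    return result_name


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "Filename.mxd")
    open(src, "w").close()                # empty placeholder file
    drafts = os.path.join(tmp, "drafts")
    os.mkdir(drafts)
    names = [os.path.basename(copy_without_overwrite(src, drafts)) for _ in range(3)]
    print(names)  # ['Filename.mxd', 'Filename_1.mxd', 'Filename_2.mxd']
```

The loop only searches for a free name; the copy happens once, after it ends, which is what removes both the fixed range of 50 and the repeated saves.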
Thanks to Maty's suggestion above, I've come up with my answer. For those who are interested, my code is:
result_name = filename2
print result_name
i = 0
target = draftloc + "\\" + result_name
# While the target name already exists, move on to the next increment
while arcpy.Exists(target):
    i += 1
    target = draftloc + "\\" + shortname + "_" + str(i) + extension
# Save a single copy under the first name that is free
mxd.saveACopy(target)