PyPDF2 merger function and startswith - python

First-time coder here. I'm trying to create a program to help automate some of my office work using Python.
What I'm trying to do is merge a PDF file from Folder 1 with another PDF file from Folder 2 that has the same name. I would also like to use a Tkinter GUI.
This is what I have so far:
from tkinter import *
from PyPDF2 import PdfFileMerger
root = Tk()
# Creating a Label Widget
MainLabel = Label(root, text="PDF Rawat Jalan")
# Shoving it onto the screen
MainLabel.pack()
#Prompt Kode
KodeLabel = Label(root, text="Masukan Kode")
KodeLabel.pack()
#Input Kode
kode = Entry(root, bg="gray",)
kode.pack()
#function of Merge Button
def mergerclick():
    kode1 = kode.get()
    pdflocation_1 = "C:\\Users\\User\\Desktop\\PDF\\Folder 1\\1_"+kode1+".pdf"
    pdflocation_2 = "C:\\Users\\User\\Desktop\\PDF\\Folder 2\\2_"+kode1+".pdf"
    Output = "C:\\Users\\User\\Desktop\\PDF\\output\\"+kode1+".pdf"
    merger = PdfFileMerger()
    merger.append(pdflocation_1)
    merger.append(pdflocation_2)
    merger.write(open(Output, 'wb'))
    confirmation = kode1 +" merged"
    testlabel = Label(root, text=confirmation)
    testlabel.pack()
#Merge Button
mergerButton = Button(root, text= "Merge", command=mergerclick)
mergerButton.pack()
root.mainloop()
Now there is a third file I'm supposed to append, but it has a date in its file name. For example: file 1 (010.pdf); file 2 (010.pdf); file 3 (010_2020_10_05).
There are about 9,000 files per folder.
How am I supposed to do this?

I think what you need is a way to just find files prefixed with a particular string. Based on the date suffix I'm guessing the file names may not be unique so I'm writing this to find all matches. Something like this will do that:
import pathlib
def find_prefix_matches(prefix, directory_name):
    # Compare against the file name only, not the full path
    dir_path = pathlib.Path(directory_name)
    return [str(f_name) for f_name in dir_path.iterdir()
            if f_name.name.startswith(prefix)]
If you are just learning to write code, this example is relatively simple. However, it is not efficient if you need to match 9,000 files, because it re-reads the directory on every call. To make it run faster, load the file list once instead of once per request.
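import pathlib
def find_prefix_matches(prefix, file_list):
    return [f for f in file_list if f.startswith(prefix)]
# directory_name and your_list_of_files_to_append are placeholders for your own values
dir_path = pathlib.Path(directory_name)
file_list = [f_name.name for f_name in dir_path.iterdir()]  # read the directory once
for file_name_prefix in your_list_of_files_to_append:
    file_matches = find_prefix_matches(file_name_prefix, file_list)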
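With around 9,000 files per folder, even one list scan per code adds up if many codes are merged in a single run. A minimal sketch of an alternative, assuming the code is everything before the first underscore in the file name (as in 010_2020_10_05.pdf); `build_prefix_index` and `folder` are hypothetical names, not part of the answer above:

```python
import pathlib
from collections import defaultdict


def build_prefix_index(folder):
    """Map each code (the text before the first "_") to the PDF files that start with it."""
    index = defaultdict(list)
    for path in pathlib.Path(folder).iterdir():
        if path.is_file() and path.suffix.lower() == ".pdf":
            # Assumption: names look like CODE_YYYY_MM_DD.pdf
            code = path.stem.split("_", 1)[0]
            index[code].append(path)
    for paths in index.values():
        paths.sort()  # zero-padded dates sort chronologically as plain text
    return index
```

Looking up `index.get(kode1, [])` is then a single dictionary access, and each returned path could be handed to `merger.append(str(path))` in the same way as the first two files.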

Related

Using SHUTIL to copy a file into a directory with a space in its name

I'm trying to build a program that will read in a list of files, add a prefix and a suffix to the filename, and then copy the file to a new folder with the new file name. So, for example, a file named "Reports.pdf" would become "PBC_Reports_V1.pdf"
The problem I have is that when either the source or the destination directory has a space in its name, the shutil module can't find the directory. I'm not sure what I need to do to have shutil recognize the directory, and could use some help.
The code is as follows:
## This program is intended to allow a batch of files to have their name changed to standard IA formats
import os, shutil, datetime
from tkinter import *
strFolderName = "\Test"
#strNewPath = basePath+strFolderName
root = Tk()
root.title("Bulk File Renaming")
#root.iconbitmap(r"C:\Coding\FlagIcon.bmp")
############################################################################################################################################
## This section sets up the program's GUI. The buttons have to come after the modules, but this has to come at the beginning of the code
## because, of course, putting all the GUI stuff together would make too much sense. Oh well. GUI until the next line of hashtags.
strF = os.getcwd() #Finds the current file path
strFileIn = StringVar() #This is where we'll store the input path
strFileIn.set(strF) #Puts the current directory in the box
strFileOut = StringVar() #This variable is the output path
strFileOut.set(strF+r"\ConvertedFiles")
str1 = StringVar() #This variable holds the radiobutton output, which represents the file prefix
str1.set("\PBC_") #For some reason, if you try to set a variable in the same line you declare it, it stops working.
strPrefixes = [("PBC (Prepared by Client)", "\PBC_"), #A list (tuple?) of the different prefixes and their meaning.
("WP (Working Paper)", "\WP_"), #To be called futher down when we create the radiobuttons
("COM (Communications)", "\COM_"),
("MIN (Minutes)", "\MIN_"),
("PM (Project Management)", "\PM_"),
("DOC (Anything not created by IA or the OPI)", "\DOC_")]
inRow = 5
inCol = 0
for strPref, val in strPrefixes:
    Radiobutton(root, text=strPref, variable=str1, value=val).grid(row =inRow, column=inCol, sticky=(W))
    inCol=inCol+1
    if inCol>2:
        inCol=0
        inRow=6
#First, labels
myLabelSource = Label(root, text="Source Folder").grid(row=0, column=0, sticky=(E))
myLabelDestn = Label(root, text="Destination Folder").grid(row=1, column=0, sticky=(E))
myLabelPrefix = Label(root, text="Select Desired Prefix").grid(row=3, column=0, sticky=(W))
#Next, input boxes
#Source files' location
enSource = Entry(root, textvariable=strFileIn).grid(row=0, column = 1, columnspan=3, sticky=(W,E))
#Destination files' location
enDestn = Entry(root, textvariable=strFileOut).grid(row=1, column = 1, columnspan =3, sticky=(W,E))
#################################################################################################################################################
#This module shuts down the program
def progExit():
    root.quit() #Closes all the program's stuff
    root.destroy() #Actually shuts things down.
#This module gets the date the file was last modified
def getDate(strA):
    # Depending on how the file was created/modified, the creation date may be placed in the date modified field and vice versa.
    # Here, we take both the modified and created date, see which one is earlier, format it in a YYYY-MM-DD format, and
    # return it to the user.
    strC = os.path.getctime(strA) #Gets the created date, as time since the epoch
    strB = os.path.getmtime(strA) #Gets the modified date, as time since the epoch
    if strB<strC:
        strB = datetime.datetime.fromtimestamp(strB) #Converts the date into a readable string
        strB = str(strB)
        strB=strB[0:10] #Leaves just the YYYY-MM-DD fields
        return strB
    else:
        strC = datetime.datetime.fromtimestamp(strC)
        strC=str(strC)
        strC=strC[0:10]
        return strC
#This module determines the location of the '.' at the end of the filename, before the file type indicator.
def fileTypeLength(strName):
    inI = len(strName)-1 #Since the first letter in the string is in the 0 position, we subtract 1 from the length to make the loop work
    while inI>-1:
        if strName[inI] == chr(46): #The '.' char has an ascii value of 46. It's easier for python to understand what we're looking for this way.
            return inI
        else:
            inI=inI-1 #We go from the end towards the front so that if there are any '.' in the file name it won't screw up the program
    if inI<0:
        outputLabel=Label(root, text="Error finding file suffix for file "+strName).grid(row=8, column = 1)
        root.quit()
#This module is where the program actually copies the file, and pastes together everything else.
def copyAllFiles():
    inA=0
    strPrefix=str1.get()
    strNewPath=os.path.abspath(strFileOut.get())
    strEntry=os.path.abspath(strFileIn.get())
    listOfFiles = os.listdir(strFileIn.get())
    if os.path.isdir(strNewPath) == False:
        os.mkdir(strNewPath)
    for entry in listOfFiles:
        if os.path.isfile(os.path.join(strEntry, entry)): # Verify that file in question is a file, not a folder
            entryEd=entry # entryEd holds the file name
            entry=(os.path.join(strEntry, entry)) # entry holds the file location
            strDate = getDate(entry) # Gets the file's creation date
            inL = len(entryEd)
            inS = fileTypeLength(entryEd) #Finds the position of the '.' at the end of the file name
            strFT = entryEd[inS:inL] #Here we remove the file type identifier, so that it can be moved to the end of the copied filename
            entryEd=entryEd[0:inS]
            strOutputFile = strNewPath+strPrefix+entryEd+"_"+str(strDate)+"_V1"+str(strFT)
            shutil.copy2(entry, strOutputFile) #This copies the file, and all its attributes
            inA = inA+1
    outputLabel=Label(root, text=str(inA)+" files successfully processed").grid(row=8, column = 1)
##### These are the buttons that the GUI uses to launch the program or shut it down.
buttonExit=Button(root, text="Click here to exit", padx=50, command=progExit).grid(row=7, column=2)
buttonProcess=Button(root,text="Rename Files", padx=50, command=copyAllFiles).grid(row=7, column=1)
#The last thing we have to do is name the GUI that we started building on line 9. Tkinter is a funny module.
root.mainloop()
I can verify that the variables have two slashes for all the folder dividers, so `strOutputFile = 'C:\\Example Folder\\Example Output\\PBC_Reports_V1.pdf'`.
Anybody have a suggestion?
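For comparison, `shutil.copy2` itself accepts paths containing spaces without any quoting or escaping. A minimal sketch that builds the destination with `os.path.join` instead of string concatenation and creates the output folder up front; `copy_with_prefix` and its parameters are hypothetical names, and the date part of the naming scheme above is left out for brevity:

```python
import os
import shutil


def copy_with_prefix(src_dir, dst_dir, prefix, suffix="_V1"):
    """Copy every file in src_dir to dst_dir as <prefix><name><suffix><ext>."""
    os.makedirs(dst_dir, exist_ok=True)  # also creates missing parent folders
    copied = []
    for entry in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, entry)
        if not os.path.isfile(src):
            continue  # skip sub-folders
        stem, ext = os.path.splitext(entry)  # splits off the final extension
        dst = os.path.join(dst_dir, prefix + stem + suffix + ext)
        shutil.copy2(src, dst)  # copies the contents and the file metadata
        copied.append(dst)
    return copied
```

In this sketch the prefix is passed without a leading backslash ("PBC_" rather than "\PBC_"), since `os.path.join` supplies the separator.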

Using .split with tkinter

Quite a beginner here. I have a command line script that works fine for what I do and I'm looking to move it into a GUI.
os.chdir(ImageDirST)
for f in sorted(os.listdir(ImageDirST)):
    f_name,f_ext = (os.path.splitext(f))
    f_sku = (f_name.split(' ')[0])
    f_num = (f_name[-2:])
    n_name = ('{}_{}{}'.format(f_sku,f_num,f_ext))
    print(f, "-->", n_name)
I would like this to display in the same fashion within a message window in tkinter.
With some help from here, I managed to print the filenames in the directory when a button is pushed with:
filenames = sorted(os.listdir(ImageDirBT))
text = "\n".join(filenames)
print_filename_test.set(text)
I have tried to use my split code to setup a list of what the new filenames would look like, prior to setting the variable, with the following, where print_filenames() is the function triggered by the press of a button.
def print_filenames():
    filenames = sorted(os.listdir(ImageDirBT))
    for filenames in sorted(os.listdir(ImageDirBT)):
        f_name,f_ext = (os.path.splitext(filenames))
        f_sku = (f_name.split('_')[0])
        f_num = (f_name[-2:])
        n_name = ('{}_{}{}'.format(f_sku,f_num,f_ext))
    newlist = "\n".join(n_name)
    print_filename_test.set(newlist)
I don't get any errors with this code for print_filenames(), however what is displayed in the message panel is the last filename in the list, vertically, one character wide:
eg:
F
I
L
E
_
1
1
.
e
x
t
I would like to display the output as:
oldfilename_01.ext --> newfilename_csvdata_01.ext
oldfilename_02.ext --> newfilename_csvdata_02.ext
oldfilename_03.ext --> newfilename_csvdata_03.ext
oldfilename_04.ext --> newfilename_csvdata_04.ext
The command line program I have written uses numbers to chose menu options for what needs to be done, confirming before any renaming is done, hence printing the file name comparisons. My struggle is manipulating the strings in the list to be able to do the same thing.
Using messagebox:
import os
import tkinter as tk
from tkinter import messagebox
ImageDirST = r"your_path"
os.chdir(ImageDirST)
root = tk.Tk()
names = []
for f in sorted(os.listdir(ImageDirST)):
    f_name,f_ext = (os.path.splitext(f))
    f_sku = (f_name.split(' ')[0])
    f_num = (f_name[-2:])
    n_name = ('{}_{}{}'.format(f_sku,f_num,f_ext))
    names.append(f"{f} --> {n_name}\n")
messagebox.showinfo(title="Something", message="".join(names))
root.mainloop()
Or using Text widget with scrollbar:
import os
import tkinter as tk
from tkinter.scrolledtext import ScrolledText
ImageDirST = r"your_path"
os.chdir(ImageDirST)
root = tk.Tk()
txt = ScrolledText(root, font="Arial 8")
txt.pack()
for f in sorted(os.listdir(ImageDirST)):
    f_name,f_ext = (os.path.splitext(f))
    f_sku = (f_name.split(' ')[0])
    f_num = (f_name[-2:])
    n_name = ('{}_{}{}'.format(f_sku,f_num,f_ext))
    txt.insert("end",f"{f} --> {n_name}\n")
root.mainloop()
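The one-character-per-line output in the question happens because "\n".join(n_name) is applied to a single string, so join iterates over its characters. A minimal sketch of the list-building step on its own, separated from the GUI so it can be checked without a window; `preview_renames` is a hypothetical helper name, the sample file names are made up, and the split on a space follows the code above:

```python
import os


def preview_renames(filenames):
    """Return one 'old --> new' line per file, using the naming rule from the code above."""
    lines = []
    for f in sorted(filenames):
        f_name, f_ext = os.path.splitext(f)
        f_sku = f_name.split(' ')[0]   # text before the first space
        f_num = f_name[-2:]            # last two characters of the name
        lines.append('{} --> {}_{}{}'.format(f, f_sku, f_num, f_ext))
    return lines


# Joining a list puts one entry per line; joining a str puts one character per line.
text = "\n".join(preview_renames(["SKU1 front 01.jpg", "SKU1 back 02.jpg"]))
```

The resulting `text` could then be passed to `print_filename_test.set(text)` as in the question, or inserted into the ScrolledText widget.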

configparser in Tkinter - make new ini file with id?

I have made a simple GUI app with tkinter and configparser to store the values in my entry/text fields.
But I need help with something. I want the program to create a new ini file every time the user saves the input with the button, and to give each ini file an ID counting up from 1.
So the user fills in all the entries and hits the "save all information" button. The GUI must then generate a new ini file (1).
def saveConfig():
    filename = "config.ini"
    file = open(filename, 'w')
    Config = configparser.ConfigParser()
    Config.add_section('ORDERDATA')
    Config.set("ORDERDATA", "REKVIRENT", e1.get())
    Config.set("ORDERDATA", "MODTAGER", e2.get())
    Config.set("ORDERDATA", "PATIENTFORNAVN", e3.get())
    Config.set("ORDERDATA", "PATIENTEFTERNAVN", e4.get())
    Config.set("ORDERDATA", "CPR", e7.get())
    Config.set("ORDERDATA", "DOKUMENTATIONSDATO", e5.get())
    Config.set("ORDERDATA", "ØNSKET UNDERSØGELSE", e6.get())
    Config.set("ORDERDATA", "ANAMNESE", t1.get('1.0', END))
    Config.set("ORDERDATA", "INDIKATION", t2.get('1.0', END))
    Config.write(file)
    file.close()
If you want your program to save all your configuration files with ascending numbers, you could do the following:
# Python 2.7
import os
import ConfigParser as cp
import Tkinter as tk
def saveConfig():
    config = cp.ConfigParser()
    config.add_section("ORDERDATA")
    config.set("ORDERDATA", "REKVIRENT", e1.get())
    # Set all your settings here
    # Using os.listdir(), you can get the files in a folder in a list
    list_files = os.listdir(os.getcwd())
    # You can then convert the names of the files into integers for all
    # .ini files
    list_numbers = [int(x[:-4]) for x in list_files if x.endswith(".ini")]
    # If the length of this new list is 0, max will throw a ValueError
    if len(list_numbers) != 0:
        # Calculate the new file number by adding one to the highest found number
        new_file_num = max(list_numbers) + 1
    # To prevent the ValueError, set the number to 1 if no files are present
    else:
        new_file_num = 1
    # Derive the name of the file here
    new_file_name = str(new_file_num) + ".ini"
    # Open the file and write to it
    with open(new_file_name, "w") as file_obj:
        config.write(file_obj)
root = tk.Tk()
e1 = tk.Entry(root)
button = tk.Button(root, text="Click me!", command=saveConfig)
e1.pack()
button.pack()
root.mainloop()
For Python 3, you would only have to change the imports. Tested and working using Python 2.7 on Ubuntu.
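A minimal Python 3 sketch of the same numbering idea, kept separate from the GUI. It differs from the code above in one respect: file names that are not purely numeric (such as an existing config.ini) are skipped, since int() would raise a ValueError on them. `next_ini_path`, `save_order` and the `folder` parameter are hypothetical names:

```python
import configparser
import os


def next_ini_path(folder):
    """Return the path of the next free '<number>.ini' file in folder."""
    numbers = []
    for name in os.listdir(folder):
        stem, ext = os.path.splitext(name)
        if ext == ".ini" and stem.isdecimal():  # ignore names like 'config.ini'
            numbers.append(int(stem))
    # default=0 covers the case where no numbered file exists yet
    return os.path.join(folder, "{}.ini".format(max(numbers, default=0) + 1))


def save_order(folder, values):
    """Write values (a dict of option name -> text) to a new numbered ini file."""
    config = configparser.ConfigParser()
    config["ORDERDATA"] = values
    path = next_ini_path(folder)
    with open(path, "w", encoding="utf-8") as file_obj:
        config.write(file_obj)
    return path
```

In the GUI, saveConfig could collect the entry values into a dict and call `save_order(os.getcwd(), values)`.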

Python pass variable values in functions

So this is my full code. All I want is to append Excel files from a specific folder into one Excel file, sheet by sheet. It's a GUI with three buttons: browse, append, and quit. How do I get the path value from the browsed folder (filename)? Thanks.
from tkinter import *
from tkinter.filedialog import askdirectory
import tkinter as tk
import glob
import pandas as pd
import xlrd
root = Tk()
def browsefunc():
    filename = askdirectory()
    pathlabel.config(text=filename)
    return filename
def new_window():
    all_data = pd.DataFrame()
    all_data1 = pd.DataFrame()
    path = browsefunc()+"/*.xlsx"
    for f in glob.glob(path):
        df = pd.read_excel(f,sheetname='Scoring',header=0)
        df1 = pd.read_excel(f,sheetname='Sheet1',header=0)
        all_data = all_data.append(df,ignore_index=False)
        all_data1 = all_data1.append(df1,ignore_index=True)
    writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
    all_data.to_excel(writer, sheet_name='Scoring')
    all_data1.to_excel(writer, sheet_name='Sheet1')
    writer.save()
browsebutton = Button(root, text="Browse", command=browsefunc).pack()
Button(root, text='Append', command=new_window).pack()
Button(root, text='quit', command=root.destroy).pack()
pathlabel = Label(root)
pathlabel.pack()
mainloop()
It is not entirely clear what you are asking, so can you edit the question to be more specific?
I think you want the local variable filename (inside the function browsefunc) to be accessible outside that function. Use return. This tutorial explains it nicely.
At the end of browsefunc you add
return filename
and when you call browsefunc you run
path = browsefunc()
That assigns to the variable path whatever you return from browsefunc. It can be an integer, float, string, list, etc.
So, final code is:
def browsefunc():
    filename = askdirectory()
    pathlabel.config(text=filename)
    return filename
def new_window():
    path = browsefunc()
I would recommend using more explicit variable and function names.
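One thing to be aware of: calling browsefunc() inside new_window() opens the folder dialog again each time Append is pressed. A minimal sketch of an alternative that remembers the folder chosen with the Browse button; `selected`, `remember_folder` and `list_workbooks` are hypothetical names, and the chooser is passed in as a parameter so the logic can run without a window:

```python
import glob
import os

selected = {"folder": ""}  # shared state: the folder picked with the Browse button


def remember_folder(ask_directory):
    """Call the chooser once, store its result, and return it."""
    folder = ask_directory()
    if folder:  # an empty result means the dialog was cancelled
        selected["folder"] = folder
    return folder


def list_workbooks():
    """Return the .xlsx files in the remembered folder (empty list if none chosen)."""
    if not selected["folder"]:
        return []
    return sorted(glob.glob(os.path.join(selected["folder"], "*.xlsx")))
```

In the GUI, the Browse button could use `command=lambda: remember_folder(askdirectory)`, and new_window could loop over `list_workbooks()` instead of calling browsefunc().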

calling listbox from different function in python with tkinter

I'm trying to add and remove items from a listbox but I'm getting the following error:
files = self.fileList()
TypeError: 'list' object is not callable
How can I access this list if I can't call it? I tried to use it as a global variable but maybe I was using it incorrectly. I want to be able to take items from that listbox and when a button is pressed, add them to another listbox.
class Actions:
    def openfile(self): #select a directory to view files
        directory = tkFileDialog.askdirectory(initialdir='.')
        self.directoryContents(directory)
    def filename(self):
        Label (text='Please select a directory').pack(side=TOP,padx=10,pady=10)
    files = []
    fileListSorted = []
    fileList = []
    #display the contents of the directory
    def directoryContents(self, directory): #displays two listBoxes containing items
        scrollbar = Scrollbar() #left scrollbar - display contents in directory
        scrollbar.pack(side = LEFT, fill = Y)
        scrollbarSorted = Scrollbar() #right scrollbar - display sorted files
        scrollbarSorted.pack(side = RIGHT, fill = Y)
        #files displayed in the left listBox
        global fileList
        fileList = Listbox(yscrollcommand = scrollbar.set)
        for filename in os.listdir(directory):
            fileList.insert(END, filename)
        fileList.pack(side =LEFT, fill = BOTH)
        scrollbar.config(command = fileList.yview)
        global fileListSorted #this is for the filelist in the right window. contains the values the user has selected
        fileListSorted = Listbox(yscrollcommand = scrollbarSorted.set) #second listbox (button will send selected files to this window)
        fileListSorted.pack(side=RIGHT, fill = BOTH)
        scrollbarSorted.config(command = fileListSorted.yview)
        selection = fileList.curselection() #select the file
        b = Button(text="->", command=lambda:self.moveFile(fileList.curselection()))#send the file to moveFile to be added to fileListSorted
        b.pack(pady=5, padx =20)
    def moveFile(self,File):
        files = self.fileList()
        insertValue = int(File[0]) #convert the item to integer
        insertName = self.fileList[insertValue] #get the name of the file to be inserted
        fileListSorted.insert(END,str(insertName)) #insert the value to the fileList array
I changed files to the following to see whether it was being set properly, and it returned an empty array:
files = self.fileList
print files
#prints []
You never initialise self.fileList (nor fileListSorted).
When you write in directoryContents
global fileList
fileList = Listbox(yscrollcommand = scrollbar.set)
...
you are working on a global variable called fileList, not on self.fileList. You could either use self.fileList everywhere, or add global fileList to all your functions and use fileList.
However, I am skeptical of your use of classes; you should either try to understand object-oriented concepts and their implementation in Python, or ignore these concepts for the moment.
Edit
I have tried to run your code, and you should also change the line
insertName = self.fileList[insertValue]
to
insertName = self.fileList.get(insertValue)
fileList is a widget, and every Tkinter widget uses dictionary notation for its properties (such as self.fileList['background']).
Note that get takes either a number or a string containing a number, so the conversion on the line above is unnecessary. Also note that you can get the whole list with get(0, END).
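The empty list printed in the question follows from how the names are bound: global fileList inside a method rebinds a module-level name, while self.fileList still resolves to the class attribute []. A minimal sketch of that effect without any widgets; the class and variable names here are hypothetical stand-ins:

```python
class Demo:
    file_list = []  # class attribute, reachable as self.file_list

    def fill_global(self):
        global file_list
        file_list = ["a.txt"]  # binds the module-level name, not the attribute

    def fill_attribute(self):
        self.file_list = ["a.txt"]  # binds an attribute on this instance


d = Demo()
d.fill_global()
after_global = d.file_list      # still the empty class attribute
d.fill_attribute()
after_attribute = d.file_list   # now the list stored on the instance
```

Calling `d.file_list()` at either point would raise the same "'list' object is not callable" error as in the question, because the attribute is a list rather than a method.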
