Getting excel file from anywhere in computer with openpyxl/Python - python

I'm working on a project to automate some Excel processes, but so far I have only found a way to select Excel files if they're in the same folder as the Python file. How do I select an Excel file that is not in the same folder as the script? I know I can type the whole path, but I would like to use only the file name to select it ("sample.xlsx").

You can use the Python built-in tkinter filedialog.
A simple sample would be:
import tkinter as tk
from tkinter import filedialog
root = tk.Tk()
root.withdraw()
path2urExcel = filedialog.askopenfilename(title="Select the desired excel sheet ..")
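path2urExcel then holds the full path of the selected file, so you can pass it to openpyxl.load_workbook() and proceed with your script.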
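If you would rather keep using only the file name, as the question asks, one option is to search a folder tree for it. The following is a minimal sketch, not part of the original answer: find_by_name is a hypothetical helper, and the search root (for example your home folder) is an assumption you would choose yourself.

```python
from pathlib import Path


def find_by_name(filename, search_root):
    """Return every file under search_root whose name matches filename."""
    root = Path(search_root)
    # rglob walks the folder tree recursively; filename is treated as a
    # glob pattern, so a plain name like "sample.xlsx" matches itself
    return sorted(p for p in root.rglob(filename) if p.is_file())
```

For example, find_by_name("sample.xlsx", Path.home()) returns a list of matching paths. The list can be empty or contain several entries, so check it before passing one of them to openpyxl.load_workbook(). Searching a large tree can be slow.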

Related

Python Script saves files one directory above user input directory

I am writing a Python script that uses tkinter to take a .xlsx file as user input, groups its data by location, and then exports an individual CSV file for each unique location value, containing only the columns I tell it to keep. The issue is that when the user picks the directory to store the files in, the script saves the files one directory above it.
For example, if the user selects \Desktop\XYZ\Test as the directory for the files to be saved in, the script saves the exported files one directory above it, i.e. in \Desktop\XYZ, and adds the name of the subdirectory Test to the exported file name. The code I'm using is attached below.
This is probably a simple issue, but being a newbie I'm at my wits' end trying to resolve it, so any help is appreciated.
Code:
import pandas as pd
import csv
import locale
import os
import sys
import unicodedata
import tkinter as tk
from tkinter import simpledialog
from tkinter.filedialog import askopenfilename
from tkinter import *
from tkinter import ttk
ROOT = tk.Tk()
ROOT.withdraw()
data_df = pd.read_excel(askopenfilename())
grouped_df = data_df.groupby('LOCATION')
folderpath = filedialog.askdirectory()
for data in grouped_df.LOCATION:
    grouped_df.get_group(data[0]).to_csv(folderpath+data[0]+".csv",encoding='utf-8', mode='w+')
    filename =data[0]
    f=pd.read_csv(folderpath+filename+".csv", sep=',')
    #print f
    keep_col = ['ID','NAME','DATA1','DATA4']
    new_f = f[keep_col]
    new_f.to_csv(folderpath+data[0]+".csv", index=False)
Sample data
P.S. There will be data in the DATA3 and DATA4 columns, but I just didn't enter it here.
How the Script is giving the output:
Thanks in Advance!
It seems that the return value of filedialog.askdirectory() ends with the folder the user selected, without a trailing slash, i.e.:
\Desktop\XYZ\Test
Your full path, created by folderpath+data[0]+".csv" with an example value for data[0] of "potato", will be
\Desktop\XYZ\Testpotato.csv
You need to at least append the \ manually:
for data in grouped_df.LOCATION:
    grouped_df.get_group(data[0]).to_csv(folderpath+"\\"+data[0]+".csv",encoding='utf-8', mode='w+')
    filename =data[0]
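As an alternative to appending the separator by hand, the path can be built with os.path.join, which adds the separator only when it is missing. This is a small sketch rather than the answerer's code; csv_path is a hypothetical helper name.

```python
import os


def csv_path(folderpath, name):
    """Join the chosen folder and a group name into '<folder>/<name>.csv'."""
    # os.path.join inserts a separator only if folderpath does not
    # already end with one, so both forms of input work
    return os.path.join(folderpath, name + ".csv")
```

In the loop above, folderpath+"\\"+data[0]+".csv" would then become csv_path(folderpath, data[0]).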

How to select multiple files or an entire folder(display names of all the files it contains) in python using tkinter?

I have written code to display the contents of a file using the tkinter askopenfile() method. Now I need to either select an entire folder (directory) and print the names of the files it contains, or select multiple files.
I'm new to tkinter and having a difficult time understanding this. Is there any method to do this?
Thanks in advance.
I assume you are using python 2, so here you go:
from Tkinter import *
import Tkinter, Tkconstants, tkFileDialog
root = Tk()
root.filename = tkFileDialog.askopenfilename(initialdir = "/")
print(root.filename)
Hope this helps!
FYI: I would suggest updating to Python 3. Python 2 was sunset on January 1st, 2020.
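The snippet above selects a single file. For Python 3, a rough sketch of the two things the question asks for could use filedialog.askopenfilenames (multiple files) and filedialog.askdirectory (a folder). The function names list_files and choose_and_print are made up for this illustration.

```python
import os


def list_files(folder):
    """Return the sorted names of the regular files directly inside folder."""
    return sorted(
        entry for entry in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, entry))
    )


def choose_and_print():
    """Open the dialogs and print the selections (needs a display)."""
    # Imported here so list_files() stays usable without a GUI
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window

    # askopenfilenames lets the user pick several files at once
    for path in filedialog.askopenfilenames(title="Select one or more files"):
        print(path)

    # askdirectory returns the chosen folder, or an empty value if cancelled
    folder = filedialog.askdirectory(title="Select a folder")
    if folder:
        for name in list_files(folder):
            print(name)
    root.destroy()
```

Calling choose_and_print() shows both dialogs in turn; list_files() only lists the top level of the folder and skips subfolders.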

Python openpyxl How to bypass permissions to write to an open file

I am trying to write a script that live-edits an open Excel file. The script uses openpyxl to read from a cell and then write the edited data back to that cell, but when I run it I get this error: PermissionError: [Errno 13] Permission denied: 'GameExcel.xlsx'. Is there a way around this using another module, or is there a secret I am missing?
Edit: here's the code. This is just me learning before I integrate it into the full code.
import openpyxl
from openpyxl import load_workbook
from openpyxl import workbook
from openpyxl.utils import get_column_letter
import os
import tkinter as tk
from tkinter import messagebox as tkMsgBox
import time
os.chdir(r"D:\Scripts\Python\Testing Scripts\My Excel Game")
wb = load_workbook("GameExcel.xlsx")
names = wb.sheetnames
sheet = wb['GameEnviroment']
#userInput = (input("what would you like it to say?"))
#print(userInput)
C3Val = sheet['C4'].value
sheet.cell(row=3, column=4).value = (C3Val + ' 4')
wb.save('GameExcel.xlsx')
print(C3Val + ' 3')
#sheet['A1']=userInput
This is due to an operating system limitation (i.e. Windows). It has nothing to do with openpyxl, Python or even Excel. POSIX-based operating systems do not have such a limitation.
The answer to this question ("How to bypass permissions to write to an open file") is simply "You can't".
The option that I went with is xlwings, though it only works while Excel is open.
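If you stay with openpyxl, the lock cannot be bypassed, but the script can at least avoid crashing and losing its edits. The sketch below is one possible approach, not something from the answers above: save_with_fallback is a hypothetical helper that tries the normal save and, on PermissionError, writes to a sibling file with an assumed "_copy" suffix.

```python
from pathlib import Path


def save_with_fallback(save, target):
    """Call save(target); if the file is locked, save to a '_copy' sibling.

    Returns the path that was actually written.
    """
    target = Path(target)
    try:
        save(str(target))
        return target
    except PermissionError:
        # The original is probably open elsewhere; keep the edits in a new file
        fallback = target.with_name(f"{target.stem}_copy{target.suffix}")
        save(str(fallback))
        return fallback
```

With the question's code this would be called as save_with_fallback(wb.save, "GameExcel.xlsx"). The copy does not update the workbook that is open in Excel; it only preserves the changes.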

I need help programming functions for my buttons (importing, exporting)

I'm trying to create a really basic software where the user can:
1) press a button to import a .csv file, and the program will read the file and print it
2) press another button to sort the data a specific way
3) Press a third and final button to export that data as a new .csv file
Basically, I need help on steps 1 and 3, I have no clue how to do it.
MY CODE:
from tkinter import *
import tkinter
import tkinter.messagebox
top = Toplevel()
def UploadAction():
    #Allows user to import a .csv of their choice into the program
    #Python should read the file into the system and display the contents
    pass
def SortingCSV():
    #Allows user to switch the contents of the file to the desired settings
    pass
def Export():
    #Exports the manipulated data into a new .csv file and downloads it
    pass
B=Button(top, text ="Upload", command = UploadAction).grid(row=2,column=1)
B=Button(top, text ="Convert File", command = SortingCSV).grid(row=6, column=1)
B=Button(top, text ="Download File", command = Export).grid(row=7,column=1)
top.mainloop()
Many Python modules exist that make this kind of problem easier.
There is one for working with CSV files: https://docs.python.org/2/library/csv.html
Good luck!
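To make steps 1 and 3 a little more concrete, here is a minimal Python 3 sketch using the csv module together with tkinter's file dialogs. The function names (read_rows, write_rows, upload_action, export) are illustrative and not taken from the question, and UTF-8 is an assumed file encoding.

```python
import csv


def read_rows(path):
    """Read a CSV file into a list of rows (each row is a list of strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


def write_rows(path, rows):
    """Write rows to path as CSV."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def upload_action():
    """Ask for a .csv file, print its rows and return them (None if cancelled)."""
    from tkinter import filedialog  # imported here so the helpers work without a GUI
    path = filedialog.askopenfilename(filetypes=[("CSV files", "*.csv")])
    if not path:
        return None
    rows = read_rows(path)
    for row in rows:
        print(row)
    return rows


def export(rows):
    """Ask where to save and write rows there."""
    from tkinter import filedialog
    path = filedialog.asksaveasfilename(
        defaultextension=".csv", filetypes=[("CSV files", "*.csv")]
    )
    if path:
        write_rows(path, rows)
```

Because a Button command is called without arguments, the rows returned by upload_action() would need to be kept in a shared variable or an object attribute before export() can use them; that wiring is left out here.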

Identify external workbook links using openpyxl

I am trying to identify all cells that contain external workbook references, using openpyxl in Python 3.4. But I am failing. My first try consisted of:
def find_external_value(cell):
    # identifies an external link in a given cell
    has_external_reference = False
    if '.xls' in cell.value:
        has_external_reference = True
    return has_external_reference
However, when I print the cell values that have external values to the console, it yields this:
=[1]Sheet1!$B$4
=[2]Sheet1!$B$4
So, openpyxl obviously does not parse formulas containing external values in the way I imagined and since square brackets are used for table formulas, there is no sense in trying to pick up on external links in this manner.
I dug a little deeper and found the detect_external_links function in the openpyxl.workbook.names.external module (reference). I have no idea if one can actually call this function to do what I want.
From the console results it seems as if openpyxl understands that there are references, and seems to contain them in a list of sorts. But can one access this list? Or detect if such a list exists?
Whichever way - all I need is to figure out if a cell contains a link to an external workbook.
I have found a solution to this.
Use the openpyxl library to load the xlsx file:
import openpyxl
wb=openpyxl.load_workbook("Myworkbook.xlsx")
# len(wb._external_links) gives the count of linked workbooks
items=wb._external_links
for index, item in enumerate(items):
    Mystr =wb._external_links[index].file_link.Target
    Mystr=Mystr.replace("file:///","")
    print(Mystr.replace("%20"," "))
----------------------------
Out[01]: ##Indicates that the workbook has 4 external workbook links##
/Users/myohannan/AppData/Local/Temp/49/orion/Extension Workpapers_Learning Extension Calc W_83180610.xlsx
/Users/lmmeyer/AppData/Local/Temp/orion/Complete Set of Workpapers_PPS Workpapers 123112_111698213.xlsx
\\SF-DATA-2\IBData\TEMP\ie5\Temporary Internet Files\OLK8A\LBO Models\PIGLET Current.xls
/WINNT/Temporary Internet Files/OLK3/WINDOWS/Temporary Internet Files/OLK8304/DEZ.XLS
I decided to veer outside of openpyxl in order to achieve my goal - even though openpyxl has numerous functions that refer to external links I was unable to find a simple way to achieve my goal.
Instead I decided to use ZipFile to open the workbook in memory, then search for the externalLink1.xml file. If it exists, then the workbook contains external links:
import tkinter as tk
from tkinter import filedialog
from zipfile import ZipFile
import xml.etree.ElementTree
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()
with ZipFile(file_path) as myzip:
    try:
        # ZipFile.open raises KeyError when the archive has no such member
        my_file = myzip.open('xl/externalLinks/externalLink1.xml')
        e = xml.etree.ElementTree.parse(my_file).getroot()
        print('Has external references')
    except KeyError:
        print('No external references')
Once I have the XML file, I can proceed to identify the cell address, value and other information by running through the XML tree using ElementTree.
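The check above looks only at externalLink1.xml. A small variation, sketched here under the assumption that the link parts are named xl/externalLinks/externalLinkN.xml as in the answer, lists every such part in the archive. external_link_parts is a hypothetical helper name.

```python
import re
from zipfile import ZipFile

# Assumed naming pattern of the external link parts inside an .xlsx archive
_LINK_PART = re.compile(r"^xl/externalLinks/externalLink\d+\.xml$")


def external_link_parts(xlsx_path):
    """Return the names of all externalLinkN.xml parts in the workbook archive."""
    with ZipFile(xlsx_path) as archive:
        return sorted(name for name in archive.namelist() if _LINK_PART.match(name))
```

An empty list means no external link parts were found; otherwise each returned name can be opened with ZipFile.open() and parsed with ElementTree as described above.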
