Need some help opening Excel files in Python

I'm having a strange issue. I am new to Python, and I get the warnings below when I run the following code. I have searched Google but still cannot get my code to run. Any advice, please?
import openpyxl
import os
os.chdir('/Users/omer/Documents/Python_Code/Udemy/Excel_Word_Pdf/')
workbook = openpyxl.load_workbook('example.xlsx')
sheet = workbook.get_sheet_by_name('Sheet1')
workbook.get_sheet_names()
cell = sheet['A1']
And the error I am getting is:
lesson42.py:13: DeprecationWarning: Call to deprecated function get_sheet_by_name (Use wb[sheetname]).
sheet = workbook.get_sheet_by_name('Sheet1')
lesson42.py:15: DeprecationWarning: Call to deprecated function get_sheet_names (Use wb.sheetnames).
workbook.get_sheet_names()

I just tested the following. This should work.
import openpyxl
import os
workbook = openpyxl.load_workbook('test.xlsx')
sheet = workbook['Sheet1']
print(workbook.sheetnames)
cell = sheet['A1'].value
print(cell)

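A DeprecationWarning means that you're calling a function that is scheduled for removal; it still works for now, but you should switch to the replacement. The warning text names the replacement: use workbook[sheetname] instead of get_sheet_by_name, and workbook.sheetnames instead of get_sheet_names. Alternatively, try using pandas.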
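To illustrate that a DeprecationWarning is advice rather than a failure, here is a minimal sketch using only the standard library. The functions old_api and new_api are hypothetical stand-ins, not openpyxl functions; they only mimic the pattern of a deprecated call that warns and then delegates to its replacement.

```python
import warnings


def new_api():
    """Hypothetical stand-in for the replacement call."""
    return "Sheet1"


def old_api():
    """Hypothetical stand-in for a deprecated call: it warns, then still works."""
    warnings.warn(
        "Call to deprecated function old_api (Use new_api).",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_api()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = old_api()

# The deprecated call still returned a value; the warning is only a notice.
print(result)
print(caught[0].category.__name__, "-", caught[0].message)
```

In other words, the script in the question most likely ran to completion; it simply printed nothing apart from the warnings.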

Try viewing sheetnames first: workbook.sheetnames
This will give you a list of the available sheets; you can then access one with workbook['Some Sheet'].

Related

Is there a way to protect workbooks using openpyxl or xlsxwriter?

I'm trying to automate Excel reports, and I'd prefer users didn't try to rename or reorder the worksheets. While I've had no problems protecting individual cells using xlsxwriter, I've failed to see an option to protect the workbook itself. I'm looking to openpyxl, but the tutorial does not seem to have any effect.
Edit: I'm now using this block of code, but it neither produces an error nor protects my workbooks.
from openpyxl import load_workbook
from openpyxl.workbook.protection import WorkbookProtection
workbook = load_workbook(filepath, read_only=False, keep_vba=True)
workbook.security = WorkbookProtection(workbookPassword = 'secret-password', lockStructure = True)
workbook.save(filepath)
By the way, I am dealing with .xlsm files. If there are any solutions or points that I've missed, please let me know.
From this code:
from openpyxl.workbook.protection import WorkbookProtection
myWorkbook.security = WorkbookProtection(workBookPassword = 'super-secret-password', lockStructure = True)
myWorkbook.save(filepath)
Change:
WorkbookProtection(workBookPassword = 'super-secret-password', lockStructure = True)
to:
WorkbookProtection(workbookPassword = 'super-secret-password', lockStructure = True)
workBookPassword should be workbookPassword (with a lowercase "b").
Tested on Python32 3.8 and OpenPyXL version 3.0.2
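The corrected call can be tried without touching a real file. The following is a minimal sketch, assuming a recent openpyxl 3.x; the in-memory workbook and the BytesIO buffer are stand-ins for the real .xlsm file and path from the question.

```python
from io import BytesIO

from openpyxl import Workbook
from openpyxl.workbook.protection import WorkbookProtection

wb = Workbook()  # in-memory workbook used only for this sketch

# Note the spelling: workbookPassword, with a lowercase "b".
wb.security = WorkbookProtection(
    workbookPassword="secret-password",
    lockStructure=True,
)

buffer = BytesIO()  # stands in for the real file path
wb.save(buffer)

# Confirm that the structure lock is set on the workbook object.
print(wb.security.lockStructure)
```

With lockStructure set, Excel should refuse to rename, reorder, add, or delete sheets until the workbook is unprotected.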
XlsxWriter can protect a worksheet (not the whole workbook) with the command worksheet.protect() (have a look at the documentation: https://xlsxwriter.readthedocs.io/worksheet.html).
However, take this into consideration:
Worksheet level passwords in Excel offer very weak protection. They do
not encrypt your data and are very easy to deactivate. Full workbook
encryption is not supported by XlsxWriter since it requires a
completely different file format and would take several man months to
implement.
Try using xlwings
import xlwings as xw
wb = xw.Book(r'<path_to_.xlsx file>')
wb.save(password='<your_password>', path=r'<path_to_save_.xlsx file>')

Openpyxl Reading Non-empty Cells as None [duplicate]

I have a simple excel file:
A1 = 200
A2 = 300
A3 = =SUM(A1:A2)
This file works in Excel and shows the proper value for SUM, but with the openpyxl module for Python I cannot get the value in data_only=True mode.
Python code from shell:
wb = openpyxl.load_workbook('writeFormula.xlsx', data_only = True)
sheet = wb.active
sheet['A3']
<Cell Sheet.A3> # python response
print(sheet['A3'].value)
None # python response
while:
wb2 = openpyxl.load_workbook('writeFormula.xlsx')
sheet2 = wb2.active
sheet2['A3'].value
'=SUM(A1:A2)' # python response
Any suggestions as to what I am doing wrong?
It depends upon the provenance of the file. data_only=True depends upon the value of the formula being cached by an application like Excel. If, however, the file was created by openpyxl or a similar library, then it's probable that the formula was never evaluated and, thus, no cached value is available and openpyxl will report None as the value.
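The behaviour described above can be reproduced without Excel. This is a minimal sketch that writes the question's three cells with openpyxl into an in-memory buffer (a stand-in for writeFormula.xlsx) and reads them back in both modes.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build the workbook from the question with openpyxl only (no Excel involved).
wb = Workbook()
ws = wb.active
ws["A1"] = 200
ws["A2"] = 300
ws["A3"] = "=SUM(A1:A2)"

buffer = BytesIO()  # stands in for writeFormula.xlsx
wb.save(buffer)

# data_only=True asks for the cached result, which was never computed.
buffer.seek(0)
cached = load_workbook(buffer, data_only=True).active["A3"].value

# The default mode returns the formula text itself.
buffer.seek(0)
formula = load_workbook(buffer).active["A3"].value

print(cached)
print(formula)
```

Because no spreadsheet application ever evaluated the formula, there is no cached value to read. Opening and saving the file in Excel (or another application that recalculates) stores one.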
I have replicated the issue with Openpyxl and Python.
I am currently using openpyxl version 2.6.3 and Python 3.7.4. Also I am assuming that you are trying to complete an exercise from ATBSWP by Al Sweigart.
I tried and tested Charlie Clark's answer, considering that Excel may indeed cache values. I opened the spreadsheet in Excel, copied and pasted the formula into the same exact cell, and finally saved the workbook. Upon reopening the workbook in Python with Openpyxl with the data_only=True option, and reading the value of this cell, I saw the proper value, 500, instead of the wrong value, the None type.
I hope this helps.
I had the same issue. This may not be the most elegant solution, but this is what worked for me:
import xlwings
from openpyxl import load_workbook
excel_app = xlwings.App(visible=False)
excel_book = excel_app.books.open('writeFormula.xlsx')
excel_book.save()
excel_book.close()
excel_app.quit()
workbook = load_workbook(filename='writeFormula.xlsx', data_only=True)
I have a suggestion for this problem: convert the xlsx file to csv :).
You will still have the original xlsx file. The conversion is done by LibreOffice (that is the subprocess call() line). You can also use pandas for a more Pythonic approach.
from subprocess import call
from openpyxl import load_workbook
from csv import reader
filename = "test"
wb = load_workbook(filename + ".xlsx")
spread_range = wb['Sheet1']
# whatever formula is in cell A1 (printed here as formula text, not as its result)
print(spread_range.cell(row=1, column=1).value)
wb.close()
# this step can be done with subprocess or os.system()
# libreoffice --headless --convert-to csv $filename --outdir $outdir
call(["libreoffice", "--headless", "--convert-to", "csv", filename + ".xlsx"])
with open(filename + ".csv", newline='') as f:
    csv_reader = reader(f)  # separate name so the imported reader is not shadowed
    data = list(csv_reader)
# the csv holds the evaluated value of A1
print(data[0][0])
or
# importing pandas as pd
import pandas as pd
# read an excel file and convert
# into a dataframe object
df = pd.DataFrame(pd.read_excel("Test.xlsx"))
# show the dataframe
df
I hope this helps somebody :-)
Yes, @Beno is right. If you want to refresh the file without editing it by hand, you can make a little "robot" that edits your Excel file for you.
WARNING: This is a roundabout way to edit the Excel file. The timing depends on your machine, so make sure you set time.sleep properly before the rest of the code continues.
For instance, I use time.sleep, subprocess.Popen, and pywinauto.keyboard.send_keys to add some text to a cell of your choice and then save the file. After that, data_only=True works perfectly.
For more info about pywinauto.keyboard, see the pywinauto.keyboard documentation.
# import these stuff
import subprocess
from pywinauto.keyboard import send_keys
import time
import pygetwindow as gw
import pywinauto
excel_path = r"C:\Program Files\Microsoft Office\root\Office16\EXCEL.EXE"
excel_file_path = r"D:\test.xlsx"
def focus_to_window(window_title=None):  # function to focus a window. https://stackoverflow.com/a/65623513/8903813
    window = gw.getWindowsWithTitle(window_title)[0]
    if not window.isActive:
        pywinauto.application.Application().connect(handle=window._hWnd).top_window().set_focus()
subprocess.Popen([excel_path, excel_file_path])
time.sleep(1.5) # wait for Excel to open. Depends on your machine, set it properly
focus_to_window("Excel") # focus the opened file
send_keys('%{F3}') # Excel's name box | ALT+F3
send_keys('AA1{ENTER}') # whatever cell you want to insert something into | Type 'AA1' then press Enter
send_keys('Stackoverflow.com') # put whatever you want | Type 'Stackoverflow.com'
send_keys('^s') # save | CTRL+S
send_keys('%{F4}') # exit | ALT+F4
print("Done")
Sorry for my bad english.
As others have already mentioned, openpyxl only reads the cached formula value in data_only mode. I have used PyWin32 to open and save each XLSX file before it is processed by openpyxl, so that the formula results can be read. This works well for me, as I don't process large files. This solution will work only if you have MS Excel installed on your PC.
import os
import win32com.client
from openpyxl import load_workbook
# Open and save the XLSX file so that the result of each stored formula is evaluated and cached for openpyxl to read.
excel_file = os.path.join(path, file)  # path and file are defined elsewhere
excel = win32com.client.gencache.EnsureDispatch('Excel.Application')
excel.DisplayAlerts = False # disabling prompts to overwrite existing file
excel.Workbooks.Open(excel_file)
excel.ActiveWorkbook.SaveAs(excel_file, FileFormat=51, ConflictResolution=2)
excel.DisplayAlerts = True # enabling prompts
excel.ActiveWorkbook.Close()
wb = load_workbook(excel_file, data_only=True)  # data_only=True returns the cached results instead of the formulas
# read your formula values with openpyxl and do other stuff here
I ran into the same issue. After reading through this thread I managed to fix it by simply opening the excel file, making a change then saving the file again. What a weird issue.

Openpyxl: Worksheet has no object delete rows error

I am fairly new to Python and am working my way through the openpyxl package, but I am unable to get the delete_rows function to work. The error says that "delete_rows" isn't part of the worksheet object.
My code is below:
from openpyxl import Workbook
from openpyxl import load_workbook
data = 'DATA PATH'
wb = load_workbook(filename=data)
sheet1 = wb['Sheet1']
sheet1.delete_rows(idx=3, amount = 5)
wb.save(filename="TEST.xlsx")
Any advice would be great, and if any more information is required, just let me know.
Thanks!
(Code has been updated)
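No answer to this question appears in the thread. One possible cause, offered as an assumption rather than a diagnosis: delete_rows was added to openpyxl in the 2.5 series, so an older installation would raise an AttributeError on that call. The sketch below prints the installed version and runs the same call on an in-memory worksheet, which stands in for the file from the question.

```python
import openpyxl
from openpyxl import Workbook

# An old installation is one possible reason for the missing method.
print(openpyxl.__version__)

wb = Workbook()
ws = wb.active
for n in range(1, 11):
    ws.append([n])  # rows 1..10 hold the values 1..10 in column A

# Same call as in the question: remove 5 rows starting at row 3 (rows 3-7).
ws.delete_rows(idx=3, amount=5)

remaining = [ws.cell(row=r, column=1).value for r in range(1, 6)]
print(remaining)
```

If the version printed is older than 2.5, upgrading openpyxl (for example with pip install --upgrade openpyxl) would be the first thing to try.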

Openpyxl not removing sheets from created workbook correctly

So I ran into an issue with remove_sheet() in openpyxl that I can't find an answer to. When I run the following code:
import openpyxl
wb = openpyxl.Workbook()
ws = wb.create_sheet("Sheet2")
wb.get_sheet_names()
['Sheet','Sheet2']
wb.remove_sheet('Sheet')
I get the following error:
ValueError: list.remove(x): x not in list
It doesn't work, even if I try wb.remove_sheet(0) or wb.remove_sheet(1), I get the same error. Is there something I am missing?
If you use get_sheet_by_name you will get the following:
DeprecationWarning: Call to deprecated function get_sheet_by_name (Use
wb[sheetname]).
So the solution would be:
xlsx = Workbook()
xlsx.create_sheet('other name')
xlsx.remove(xlsx['Sheet'])
xlsx.save('some.xlsx')
remove_sheet() takes a sheet object, not the name of the sheet!
So for your code you could try
wb.remove(wb.get_sheet_by_name(sheet))
In the same vein, remove_sheet is also not given an index, because it operates on the actual sheet object.
Here's a good source of examples (though it isn't the same problem you're facing, it just happens to show how to properly call the remove_sheet method)!
Since the question was posted and answered, the Openpyxl library changed.
You should not use wb.remove(wb.get_sheet_by_name(sheet)) as indicated by @cosinepenguin, since get_sheet_by_name is now deprecated (you will get warnings when you use it). Use wb.remove(wb[sheet]) instead.
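The difference between passing a name and passing a worksheet object can be seen in a short sketch, assuming a recent openpyxl in which Workbook.remove is available:

```python
from openpyxl import Workbook

wb = Workbook()            # starts with one default sheet titled "Sheet"
wb.create_sheet("Sheet2")

try:
    wb.remove("Sheet")     # a string is not a worksheet object
except ValueError as exc:
    print("ValueError:", exc)

wb.remove(wb["Sheet"])     # look the sheet up by name, then remove the object
print(wb.sheetnames)
```

The first call fails for the same reason as in the question: the workbook searches its internal list of worksheet objects, and neither a string nor an index is in that list.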
In python 3.7
import openpyxl
wb = openpyxl.Workbook()
ws = wb.create_sheet("Sheet2")
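n = wb.sheetnames
# sheetnames replaces the deprecated get_sheet_names()
wb.remove(wb["Sheet"])
# or, equivalently, select the same sheet by its position in the list of names:
# wb.remove(wb[n[0]])
Here index 0 is the default sheet "Sheet" and index 1 is "Sheet2".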
you can visit this link for more info

"Worksheet range names does not exist" KeyError in openpyxl

Let me preface this by saying I have tried looking for, and cannot seem to find a similar situation so please don't be too upset if this seems familiar to you. I am using Python 2.7 and openpyxl version 2.2.5 (I need to use 2.7, and used an older module for other reasons.)
I am new to Python and read/write code in general, so I'm testing this on the command line before I implement it:
I created a file, foo.xlsx in the Python27 file directory with some values that I manually entered via Excel.
I then used this simple code on the Python command line to test my code
from openpyxl import load_workbook
wb = load_workbook('foo.xlsx')
sheet_ranges = wb['range names']
It then resulted in the following error:
File "C:\Python27\lib\openpyxl\workbook.workbook.py", line 233 in getitem
raise KeyError("Worksheet {0} does not exist.".format(key))
KeyError: 'Worksheet sheet range names does not exist'
So I thought it had something to do with not importing the entire openpyxl module. I proceeded to do that and run the whole process but it resulted in the same error.
Can someone please let me know what I am doing wrong/how to solve this?
Additional information:
I had successfully written to an empty file before, and then read the values. This gave me the right values for everything EXCEPT what I had written in manually via Excel- the cells that had manual input returned None or Nonetype. The issue seems to be with cells with manual input.
I did hit save on the file before accessing it don't worry
This was in the same directory so I know that it wasn't a matter of location.
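The following command only makes sense if the workbook contains a worksheet that is literally titled "range names"; otherwise it raises the KeyError you saw: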
sheet_ranges = wb['range names']
Normally you open a workbook and then access one of its worksheets. The following gives some examples of how this can be done:
import openpyxl
wb = openpyxl.load_workbook(filename='input.xlsx')
# To display all of the available worksheet names
sheets = wb.sheetnames
print(sheets)
# To work with the first sheet (by name)
ws = wb[sheets[0]]
print(ws['A1'].value)
# To work with the active sheet
ws = wb.active
print(ws['A1'].value)
# To work with the active sheet (alternative method, deprecated in newer openpyxl versions)
ws = wb.get_active_sheet()
print(ws['A1'].value)
If you want to display any named range in the workbook, you can do the following:
print(wb.get_named_ranges())
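The KeyError from the question can be avoided by checking the sheet titles before indexing. This is a minimal sketch, assuming an openpyxl version that provides wb.sheetnames; the helper get_sheet and the sheet title "Data" are hypothetical, and the in-memory workbook stands in for load_workbook('foo.xlsx').

```python
from openpyxl import Workbook

wb = Workbook()            # stands in for load_workbook('foo.xlsx')
wb.active.title = "Data"   # hypothetical sheet title


def get_sheet(workbook, title):
    """Return the worksheet called `title`, or None if no such sheet exists."""
    if title in workbook.sheetnames:
        return workbook[title]
    return None


print(wb.sheetnames)                 # the titles that can actually be used as keys
print(get_sheet(wb, "range names"))  # no sheet has this title
print(get_sheet(wb, "Data").title)
```

Printing wb.sheetnames first shows which keys are valid, which is usually enough to spot a title that was copied from a tutorial rather than from the actual file.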
I'm not exactly sure what you need to do, but to read Excel spreadsheets into Python, I usually use xlrd, which I found easier to get used to (note that xlrd 2.0 and later reads only .xls files). See the example:
import xlrd
workbook = xlrd.open_workbook(in_fname)
worksheet = workbook.sheet_by_index(0)
To write to Excel spreadsheets, I use xlsxwriter:
import xlsxwriter
workbook = xlsxwriter.Workbook(out_fname)
worksheet = workbook.add_worksheet('spreadsheet_name')
Hope this helps.
