I'm working with an .xlsx file that has a tab with a worksheet table where a lot of conditional formatting is used. From time to time I need to append new rows to this table.
My plan is to use Python's openpyxl (or another package) to append to this table.
So far I could identify this table as:
from openpyxl import load_workbook
wb = load_workbook(myfile)
ws = wb['mytab']
tab = ws._tables[0]
Can I use something like .append() method or change data of this table to add more rows to it?
My goal is to keep the formatting.
I've already tried the approach from "Manipulate existing excel table using openpyxl", and it doesn't work for me.
I'm using openpyxl 2.6.1
Regards,
Pavel
from openpyxl import load_workbook
filename= r'C:\Users\PC/test.xlsx'
wb = load_workbook(filename)
ws = wb['Hoja1']
ws["A1"] = "AAA"
ws["A2"] = "BBB"
wb.save(filename)
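from openpyxl import load_workbook
from openpyxl.utils import get_column_letter
wb = load_workbook(myfile)
ws = wb['mytab']
tab = ws.tables["Table1"]
# max_column is an integer, so convert it to a column letter for the range
tab.ref = f"A1:{get_column_letter(ws.max_column)}{ws.max_row}"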
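The idea above (append the rows, then stretch the table's range over them) can be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: it assumes the table starts at A1 and is the only data on the sheet, and the helper name extend_table and the sample data are made up for illustration. Conditional formatting ranges are stored separately from the table range, so they may need their own update.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter
from openpyxl.worksheet.table import Table


def extend_table(ws, table, new_rows):
    """Append rows below the existing data and stretch the table range over them.

    Hypothetical helper: assumes the table starts at A1 and nothing else
    lives on the sheet.
    """
    for row in new_rows:
        ws.append(row)
    last_col = get_column_letter(ws.max_column)
    table.ref = f"A1:{last_col}{ws.max_row}"
    return table.ref


# Build a small in-memory workbook with a two-row table to try it out
wb = Workbook()
ws = wb.active
ws.append(["Name", "Score"])
ws.append(["Randy", 13])
table = Table(displayName="Table1", ref="A1:B2")
ws.add_table(table)

new_ref = extend_table(ws, table, [["Shaw", 15], ["Ann", 9]])
print(new_ref)
```

With a real file you would get ws from load_workbook() and look the table up on the sheet instead of creating it, then call wb.save().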
Related
I have an excel file - one sheet is used for writing data with python, other sheet contains pivot table. I want to keep pivot table exactly the same as source file.
The problem is that after saving new workbook with openpyxl I open excel file and refresh pivot table, it loses 'Field settings..' -> 'Repeat items label' checkbox and I need to manually turn it on each time. That is not very efficient, I would rather solve this with python.
Sample file has it checked, but checkbox seems to disappear after saving new file with openpyxl.
from openpyxl import load_workbook
from pathlib import Path
from datetime import date
import os
sample_file_path = Path('sample_excel.xlsx') # source excel
result_folder_path = Path('results')
wb = load_workbook(sample_file_path)
ws = wb["t_mm"] # worksheet with pivot table I want to preserve as is
# some manipulations to other worksheet
xlsx_filename = "test_my_file_%s.xlsx" % date.today().strftime('%d%m%Y')
completename = os.path.join(result_folder_path, xlsx_filename)
wb.save(completename)
I read the documentation https://openpyxl.readthedocs.io/en/stable/api/openpyxl.pivot.table.html, but couldn't figure out how to keep that checkbox. I am not an Excel or pivot table expert. I think the parameter I need is "showMultipleLabel=True", but from the docs I understand that it's "True" by default, so my checkbox should remain intact. Maybe it's another parameter?
I want to add new records every week to this existing file without creating a new one.
For example, next I want to add records for the date 6/13/2016:
Randy->(13,23,13)
Shaw->(13,15,13)
and many more such entries over the next two months. How do I do that? I am a beginner, so I'm having trouble putting it into syntax.
I could only get this far:
import xlrd
#Opening the excel file
file_location= "C:/Users/agodgh1a/Desktop/Apurva/EPSON.xlsx"
workbook= xlrd.open_workbook(file_location)
sheet=workbook.sheet_by_index(0)
Thank you!
The lib you're using looks like it only reads, not edits. Here's an example in openpyxl:
from openpyxl import Workbook, load_workbook
# create the file
wb = Workbook()
ws = wb.active
ws.append([1, 2, 3])
wb.save("sample.xlsx")
# re-open and append
wb = load_workbook("sample.xlsx")
ws = wb.active
ws.append([4, 5, 6])
wb.save("sample.xlsx")
Run that and you'll have a file sample.xlsx with both rows.
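Applied to the weekly records from the question, the same re-open-and-append pattern might look like the sketch below. The column layout (date, name, then three values), the helper name append_records, and the throwaway file are assumptions for illustration, since the real layout of the file is not shown.

```python
import datetime
import os
import tempfile

from openpyxl import Workbook, load_workbook


def append_records(path, day, records):
    """Append one row per person: date, name, then that person's values."""
    wb = load_workbook(path)
    ws = wb.active
    for name, values in records.items():
        ws.append([day, name, *values])
    wb.save(path)


# Demo on a throwaway file with an assumed header row
path = os.path.join(tempfile.mkdtemp(), "weekly.xlsx")
wb = Workbook()
wb.active.append(["Date", "Name", "V1", "V2", "V3"])
wb.save(path)

append_records(
    path,
    datetime.date(2016, 6, 13),
    {"Randy": (13, 23, 13), "Shaw": (13, 15, 13)},
)

# Read the file back to see what was stored
rows = list(load_workbook(path).active.iter_rows(values_only=True))
print(len(rows))
```

Calling append_records() again the following week adds the new rows below the existing ones in the same file.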
xlrd is for reading operations only. Since you want to perform a write operation, use the xlwt Python module.
Refer to the xlwt docs for details.
I have to write some data into an existing xls file. (I should say that I'm working on Unix and can't use Windows.)
I prefer to work with Python and have tried some libraries like xlwt, openpyxl, and xlutils.
It's not working because there is a filter in my xls file. After rewriting the file, the filter disappears, but I still need it.
Could someone tell me what options I have?
Help, please!
Example:
from xlutils.copy import copy
from xlrd import open_workbook
from xlwt import easyxf
start_row = 0
rb = open_workbook('file.xls')
r_sheet = rb.sheet_by_index(1)
wb = copy(rb)
w_sheet = wb.get_sheet(1)
for row_index in range(start_row, r_sheet.nrows):
    row = r_sheet.row_values(row_index)
    call_index = 0
    for c_el in row:
        value = r_sheet.cell(row_index, call_index).value
        w_sheet.write(row_index, call_index, value)
        call_index += 1
wb.save('file.out.xls')
I also tried:
import xlrd
from openpyxl import Workbook
import unicodedata
rb = xlrd.open_workbook('file.xls')
sheet = rb.sheet_by_index(0)
wb = Workbook()
ws1 = wb.create_sheet("Results", 0)
for rownum in range(sheet.nrows):
    row = sheet.row_values(rownum)
    arr = []
    for c_el in row:
        arr.append(c_el)
    ws1.append(arr)
ws2 = wb.create_sheet("Common", 1)
sheet = rb.sheet_by_index(1)
for rownum in range(sheet.nrows):
    row = sheet.row_values(rownum)
    arr = []
    for c_el in row:
        arr.append(c_el)
    ws2.append(arr)
# auto_filter.ref takes a single range string, not a list of ranges
ws2.auto_filter.ref = "A1:B15"
#ws['A1']=42
#ws.append([1,2,3])
# openpyxl writes the .xlsx format, so use that extension
wb.save('sample.xlsx')
The problem still exists. OK, I'll try to find a machine running Windows, but I have to admit something else:
There are some rows like this:
(image not included)
I've understood what I was doing wrong, but I still need help.
First of all, I have one sheet that contains some values.
The second sheet contains a summary table!
If I try to copy this worksheet, it goes wrong.
So, the question is: how could I make a summary table from the first sheet?
Suppose your existing excel file has two columns (date and number).
This is how you will append additional rows using openpyxl.
import openpyxl
import datetime
wb = openpyxl.load_workbook('existing_data_file.xlsx')
sheet = wb['Sheet1']
# first empty row below the existing data
a = sheet.max_row + 1
# openpyxl rows and columns are 1-based
sheet.cell(row=a, column=1).value = datetime.date.today()
sheet.cell(row=a, column=2).value = 30378
wb.save('existing_data_file.xlsx')
If you are on Windows, I would suggest you take a look at using the win32com.client approach. This allows you to interact with your spreadsheet using Excel itself. This will ensure that any existing filters, images, tables, macros etc should be preserved.
The following example opens an XLS file adds one entry and saves the whole workbook as a different XLS formatted file:
import win32com.client as win32
import os
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(r'input.xls')
ws = wb.Worksheets(1)
# Write a value at A1
ws.Range("A1").Value = "Hello World"
excel.DisplayAlerts = False # Allow file overwrite
wb.SaveAs(r'sample.xls', FileFormat=56)
excel.Application.Quit()
Note, make sure you add full paths to your input and output files.
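If Windows is not available, one possible alternative is to set the filter range yourself with openpyxl before saving. This is only a sketch under assumptions: openpyxl works with .xlsx files, not .xls, so the file would have to be converted first, and the sample data below is invented. It shows how the filter range is assigned; it does not restore previously selected filter criteria.

```python
from openpyxl import Workbook

# Stand-in for a sheet loaded with load_workbook(); the data is made up
wb = Workbook()
ws = wb.active
ws.append(["Name", "Amount"])
ws.append(["Randy", 13])
ws.append(["Shaw", 15])

# ws.dimensions is the used range as a single string, here "A1:B3"
ws.auto_filter.ref = ws.dimensions
filter_range = ws.auto_filter.ref
print(filter_range)
```

After wb.save(), the header row of that range should show the filter drop-downs when the file is opened in Excel.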
I have to write a huge Excel file, and the optimized writer in openpyxl is what I need.
The question is:
Is it possible to set the style and format of cells when using the optimized writer? Style is not so important (I would only like to highlight column headers), but I need the correct number format for some columns containing currency values.
I saw that the ws.cell() method is not available when using the optimized writer, so how do I do it?
Thank you in advance for your help!
As I can't comment, I'll post an update to Dean's answer here:
openpyxl's api (version 2.4.7) has changed slightly so that it should now read:
from openpyxl import Workbook
wb = Workbook( write_only = True )
ws = wb.create_sheet()
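# in 2.4.x the helper lives in writer.write_only; newer releases expose it as openpyxl.cell.WriteOnlyCell
from openpyxl.writer.write_only import WriteOnlyCell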
from openpyxl.styles import Font
cell = WriteOnlyCell(ws, value="highlight")
cell.font = Font(name='Courier', size=36)
cols=[]
cols.append(cell)
cols.append("some other value")
ws.append(cols)
wb.save("test.xlsx")
Hope it helps
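For the two things the question asks about (highlighted headers and a number format for currency columns), a sketch along the same lines could look like this. It assumes a recent openpyxl where WriteOnlyCell is importable from openpyxl.cell. The helper names header_cell and money_cell, the sheet name, and the "#,##0.00" format are illustrative choices, and the workbook is saved to an in-memory buffer only so the result can be read back.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.cell import WriteOnlyCell
from openpyxl.styles import Font


def header_cell(ws, text):
    """Bold cell for a column header (hypothetical helper)."""
    cell = WriteOnlyCell(ws, value=text)
    cell.font = Font(bold=True)
    return cell


def money_cell(ws, amount):
    """Numeric cell with a thousands separator and two decimals."""
    cell = WriteOnlyCell(ws, value=amount)
    cell.number_format = "#,##0.00"
    return cell


wb = Workbook(write_only=True)
ws = wb.create_sheet("Report")
ws.append([header_cell(ws, "Item"), header_cell(ws, "Price")])
ws.append(["Widget", money_cell(ws, 1234.5)])

# Save to memory and read back to confirm the styles were written
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)
check = load_workbook(buffer)["Report"]
print(check["B2"].number_format)
```

Plain values and styled cells can be mixed freely in the list passed to append(), so only the cells that need formatting have to be wrapped.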
You could also look at the XlsxWriter module which allows writing huge files in optimised mode with formatting.
from xlsxwriter.workbook import Workbook
workbook = Workbook('file.xlsx', {'constant_memory': True})
worksheet = workbook.add_worksheet()
...
Quote from docs:
Those worksheet only have an append() method, it’s not possible to
access independent cells directly (through cell() or range()). They
are write-only.
When you pass optimized_write=True to the Workbook constructor (write_only=True in current openpyxl releases), openpyxl uses a stripped-down worksheet class (DumpWorksheet in older releases) instead of Worksheet. That class is very limited in terms of styling and formatting.
But look at the append method: it maps the Python type of the data you pass to the matching Excel type. So you will see correct cell formats in the result file after running this:
import datetime
from openpyxl import Workbook
# older releases spelled this Workbook(optimized_write=True)
wb = Workbook(write_only=True)
ws = wb.create_sheet()
for irow in range(5):
    ws.append([True, datetime.datetime.now(), 'test', 1, 1.25, '=D1+E1'])
wb.save('output.xlsx')
Speaking about changing the column headers style - just no way to do it using optimized writer.
Hope that helps.
You can use the WriteOnlyCell to do this.
from openpyxl import Workbook
wb = Workbook(optimized_write = True)
ws = wb.create_sheet()
from openpyxl.writer.dump_worksheet import WriteOnlyCell
from openpyxl.styles import Style, Font, PatternFill
cell = WriteOnlyCell(ws, value="highlight")
cell.style = Style(font=Font(name='Courier', size=36), fill=PatternFill(fill_type='solid',start_color='8557e5'))
cols=[]
cols.append(cell)
cols.append("some other value")
ws.append(cols)
wb.save("test.xlsx")
I hope this helps. You can use anything that the style will allow before appending it to the row for the worksheet.
I have to read an .xlsx file every 10 minutes in Python.
What is the most efficient way to do this?
I've tried using xlrd, but it doesn't read .xlsx. According to the documentation it does, but I can't get it to work: I get "Unsupported format, or corrupt file" exceptions.
What is the best way to read xlsx?
I need to read comments in cells too.
xlrd hasn't yet released a version that reads xlsx. Until then, there is openpyxl, a package built by Eric Gazoni that reads xlsx files and does limited writing of them.
Use openpyxl. Some basic examples:
import openpyxl
# Open Workbook
wb = openpyxl.load_workbook(filename='example.xlsx', data_only=True)
# Get All Sheets
a_sheet_names = wb.sheetnames
print(a_sheet_names)
# Get Sheet Object by names
o_sheet = wb["Sheet1"]
print(o_sheet)
# Get Cell Values
o_cell = o_sheet['A1']
print(o_cell.value)
o_cell = o_sheet.cell(row=2, column=1)
print(o_cell.value)
o_cell = o_sheet['H1']
print(o_cell.value)
# Sheet Maximum filled Rows and columns
print(o_sheet.max_row)
print(o_sheet.max_column)
There are multiple ways to read XLSX-formatted files using Python. Two are illustrated below. Both require openpyxl, and if you want to parse straight into pandas you also need pandas, e.g. pip install pandas openpyxl
Option 1: pandas direct
Primary use case: load just the data for further processing.
Using the read_excel() function in pandas would be your best choice. Note that pandas should fall back to openpyxl automatically, but in the event of format issues it's best to specify the engine directly.
import pandas as pd
df_pd = pd.read_excel("path/file_name.xlsx", engine="openpyxl")
Option 2 - openpyxl direct
Primary use case: getting or editing specific Excel document elements such as comments (requested by OP), formatting properties or formulas.
Load the workbook with load_workbook(), then read the comment attribute of each cell:
from openpyxl import load_workbook
wb = load_workbook(filename = "path/file_name.xlsx")
ws = wb.active
ws["A1"].comment # <- loop through row & columns to extract all comments
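The loop hinted at in the last line could be written roughly as below. It is a sketch on an in-memory workbook so it runs without a file. The helper name collect_comments and the sample cells are made up, and with a real file ws would come from load_workbook() as above.

```python
from openpyxl import Workbook
from openpyxl.comments import Comment


def collect_comments(ws):
    """Return {cell coordinate: comment text} for every commented cell."""
    found = {}
    for row in ws.iter_rows():
        for cell in row:
            if cell.comment is not None:
                found[cell.coordinate] = cell.comment.text
    return found


# Small in-memory sheet with one commented cell
wb = Workbook()
ws = wb.active
ws["A1"] = "x"
ws["A1"].comment = Comment("check this", "editor")
ws["B2"] = 1

comments = collect_comments(ws)
print(comments)
```

The comment's author is available the same way through cell.comment.author.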