I have a dataframe like as shown below
Date,cust,region,Abr,Number,
12/01/2010,Company_Name,Somecity,Chi,36,
12/02/2010,Company_Name,Someothercity,Nyc,156,
df = pd.read_clipboard(sep=',')
I would like to write this dataframe to a specific sheet (called temp_data) in the file output.xlsx
Therfore I tried the below
import pandas
from openpyxl import load_workbook
book = load_workbook('output.xlsx')
writer = pandas.ExcelWriter('output.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
I also tried the below
path = 'output.xlsx'
with pd.ExcelWriter(path) as writer:
writer.book = openpyxl.load_workbook(path)
final_df.to_excel(writer, sheet_name='temp_data',startrow=10)
writer.save()
But am not sure whether I am overcomplicating it. I get an error like as shown below. But I verifiedd in task manager, no excel file/task is running
BadZipFile: File is not a zip file
Moreover, I also lose my formatting of the output.xlsx file when I manage to write the file based on below suggestions. I already have a neatly formatted font,color file etc and just need to put the data inside.
Is there anyway to write the pandas dataframe to a specific sheet in an existing excel file? WITHOUT LOSING FORMATTING OF THE DESTIATION FILE
You need to just use to_excel from pandas dataframe.
Try below snippet:
df1.to_excel("output.xlsx",sheet_name='Sheet_name')
If there is existing data please try below snippet:
writer = pd.ExcelWriter('output.xlsx', engine='openpyxl')
# try to open an existing workbook
writer.book = load_workbook('output.xlsx')
df.to_excel(writer,index=False,header=False,startrow=len(reader)+1)
writer.save()
writer.close()
Are you restricted to using pandas or openpyxl?
Because if you're comfortable using other libraries, the easiest way is probably using win32com to puppet excel as if you were a user manually copying and pasting the information over.
import pandas as pd
import io
import win32com.client as win32
import os
csv_text = """Date,cust,region,Abr,Number
12/01/2010,Company_Name,Somecity,Chi,36
12/02/2010,Company_Name,Someothercity,Nyc,156"""
df = pd.read_csv(io.StringIO(csv_text),sep = ',')
temp_path = r"C:\Users\[User]\Desktop\temp.xlsx" #temporary location where to write this dataframe
df.to_excel(temp_path,index = False) #temporarily write this file to excel, change the output path as needed
excel = win32.Dispatch("Excel.Application")
excel.Visible = True #Switch these attributes to False if you'd prefer Excel to be invisible while excecuting this script
excel.ScreenUpdating = True
temp_wb = excel.Workbooks.Open(temp_path)
temp_ws = temp_wb.Sheets("Sheet1")
output_path = r"C:\Users\[User]\Desktop\output.xlsx" #Path to your output excel file
output_wb = excel.Workbooks.Open(output_path)
output_ws = output_wb.Sheets("Output_sheet")
temp_ws.Range('A1').CurrentRegion.Copy(Destination = output_ws.Range('A1')) # Feel free to modify the Cell where you'd like the data to be copied to
input('Check that output looks like you expected\n') # Added pause here to make sure script doesn't overwrite your file before you've looked at the output
temp_wb.Close()
output_wb.Close(True) #Close output workbook and save changes
excel.Quit() #Close excel
os.remove(temp_path) #Delete temporary excel file
Let me know if this achieves what you were after.
I spent all day on this (and a co-worker of mine spent even longer). Thankfully, it seems to work for my purposes - pasting a dataframe into an Excel sheet without changing any of the Excel source formatting. It requires the pywin32 package, which "drives" Excel as if it a user, using VBA.
import pandas as pd
from win32com import client
# Grab your source data any way you please - I'm defining it manually here:
df = pd.DataFrame([
['LOOK','','','','','','','',''],
['','MA!','','','','','','',''],
['','','I pasted','','','','','',''],
['','','','into','','','','',''],
['','','','','Excel','','','',''],
['','','','','','without','','',''],
['','','','','','','breaking','',''],
['','','','','','','','all the',''],
['','','','','','','','','FORMATTING!']
])
# Copy the df to clipboard, so we can later paste it as text.
df.to_clipboard(index=False, header=False)
excel_app = client.gencache.EnsureDispatch("Excel.Application") # Initialize instance
wb = excel_app.Workbooks.Open("Template.xlsx") # Load your (formatted) template workbook
ws = wb.Worksheets(1) # First worksheet becomes active - you could also refer to a sheet by name
ws.Range("A3").Select() # Only select a single cell using Excel nomenclature, otherwise this breaks
ws.PasteSpecial(Format='Unicode Text') # Paste as text
wb.SaveAs("Updated Template.xlsx") # Save our work
excel_app.Quit() # End the Excel instance
In general, when using the win32com approach, it's helpful to record yourself (with a macro) doing what you want to accomplish in Excel, then reading the generated macro code. Often this will give you excellent clues as to what commands you could invoke.
The solution to your problem exists here: How to save a new sheet in an existing excel file, using Pandas?
To add a new sheet from a df:
import pandas as pd
from openpyxl import load_workbook
import os
import numpy as np
os.chdir(r'C:\workdir')
path = 'output.xlsx'
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine = 'openpyxl')
writer.book = book
### replace with your df ###
x = np.random.randn(100, 2)
df = pd.DataFrame(x)
df.to_excel(writer, sheet_name = 'x')
writer.save()
writer.close()
You can try xltpl.
Create a template file based on your output.xlsx file.
Render a file with your data.
from xltpl.writerx import BookWriterx
writer = BookWriterx('template.xlsx')
d = {'rows': df.values}
d['tpl_name'] = 'tpl_sheet'
d['sheet_name'] = 'temp_data'
writer.render_sheet(d)
d['tpl_name'] = 'other_sheet'
d['sheet_name'] = 'other'
writer.render_sheet(d)
writer.save('out.xls')
See examples.
I have a problem when I'm trying to save and than read excel file in python. So this is my function:
import openpyxl
import xlrd
from xlutils.copy import copy
import pandas as pd
def write_excel():
wb = openpyxl.load_workbook('8de69ccb60047ce5.xlsx')
sheet = wb.active
sheet['D18'] = 3
wb.save('8de69ccb60047ce5.xls')
df1 = pd.read_excel('8de69ccb60047ce5.xls', sheet_name='Лист1', header=None, skiprows=1, usecols="H,I")
print(df1)
workbook = xlrd.open_workbook('8de69ccb60047ce5.xls')
worksheet = workbook.sheet_by_index(0)
print(worksheet.cell(17, 8).value)
print(worksheet.cell(18, 8).value)
I'm changing cell D18, saving file and than trying to read other cells that has formulas but I get nothing (also cell without formulas read correctly).
But if I open file manually and save it in Excel that lines of code read those cells correctly.
The problem is this line wb.save('8de69ccb60047ce5.xls'). It saves changes in file but it doesn't saves file correctly (I don't know how to discribe it). How can I read cell with formula after changing the file in python?
Save a file as sample_book.xlsx with save function.
wb.save(filename = 'sample_book.xlsx')
For more info check out this link: https://www.soudegesu.com/en/post/python/create-excel-with-openpyxl/#save-file
I have a requirement to read an xlsm file and update some of the sheets in the file. I want to use pandas for this purpose.
I tried answers presented in the following post. I couldn't see the VBA macros when I add the VBA project back.
https://stackoverflow.com/posts/28170939/revisions
Here are the steps I tried,
Extracted the VBA_project.bin out of the original.xlsm file and then
writer = pd.ExcelWriter('original.xlsx', engine='xlsxwriter')
workbook = writer.book
workbook.filename = 'test.xlsm'
workbook.add_vba_project('vbaProject.bin')
writer.save()
With this I don't see the VBA macros attached to "test.xlsm". The result is the same even if I write it to the "original.xlsm" file.
How do I preserve the VBA macros or add them back to the original xlsm file?
Also, is there a way I can open the "xlsm" file itself rather than the "xlsx" counterpart using pd.ExcelWriter?
You can do this easily with pandas
import pandas as pd
import xlrd
# YOU MUST PUT sheet_name=None TO READ ALL CSV FILES IN YOUR XLSM FILE
df = pd.read_excel('YourFile.xlsm', sheet_name=None)
# prints all sheets
print(df)
Ah, I see. I still can't tell what you are doing, but here are a few general samples of code to get Python to communicate with Excel.
Read contents of a worksheet in Excel:
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
df = pd.read_excel('C:\\your_path\\test.xls', sheetname='Sheet1')
************************************************************************************
Use Python to run Macros in Excel:
import os
import win32com.client
#Launch Excel and Open Wrkbook
xl=win32com.client.Dispatch("Excel.Application")
xl.Workbooks.Open(Filename="C:\your_path\excelsheet.xlsm") #opens workbook in readonly mode.
#Run Macro
xl.Application.Run("excelsheet.xlsm!modulename.macroname")
#Save Document and Quit.
xl.Application.Save()
xl.Application.Quit()
#Cleanup the com reference.
del xl
Write, from Python, to Excel:
import xlsxwriter
# Create an new Excel file and add a worksheet.
workbook = xlsxwriter.Workbook('C:/your_path/ranges_and_offsets.xlsx')
worksheet = workbook.add_worksheet()
# Widen the first column to make the text clearer.
worksheet.set_column('A:A', 20)
# Add a bold format to use to highlight cells.
bold = workbook.add_format({'bold': True})
# Write some simple text.
worksheet.write('A1', 'Hello')
# Text with formatting.
worksheet.write('A2', 'World', bold)
# Write some numbers, with row/column notation.
worksheet.write(2, 0, 123)
worksheet.write(3, 0, 123.456)
workbook.close()
from openpyxl import Workbook
wb = Workbook()
# grab the active worksheet
ws = wb.active
# Data can be assigned directly to cells
ws['A1'] = 42
# Rows can also be appended
ws.append([1, 2, 3])
# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()
# Save the file
wb.save("C:\\your_path\\sample.xlsx")
I have to write some data into existing xls file.(i should say that im working on unix and couldnt use windows)
I prefer work with python and have tried some libraries like xlwt, openpyxl, xlutils.
Its not working, cause there is some filter in my xls file. After rewriting this file filter is dissapearing. But i still need this filter.
Could some one tell me about options that i have.
help, please!
Example:
from xlutils.copy import copy
from xlrd import open_workbook
from xlwt import easyxf
start_row=0
rb=open_workbook('file.xls')
r_sheet=rb.sheet_by_index(1)
wb=copy(rb)
w_sheet=wb.get_sheet(1)
for row_index in range(start_row, r_sheet.nrows):
row=r_sheet.row_values(row_index)
call_index=0
for c_el in row:
value=r_sheet.cell(row_index, call_index).value
w_sheet.write(row_index, call_index, value)
call_index+=1
wb.save('file.out.xls');
I also tried:
import xlrd
from openpyxl import Workbook
import unicodedata
rb=xlrd.open_workbook('file.xls')
sheet=rb.sheet_by_index(0)
wb=Workbook()
ws1=wb.create_sheet("Results", 0)
for rownum in range(sheet.nrows):
row=sheet.row_values(rownum)
arr=[]
for c_el in row:
arr.append(c_el)
ws1.append(arr)
ws2=wb.create_sheet("Common", 1)
sheet=rb.sheet_by_index(1)
for rownum in range(sheet.nrows):
row=sheet.row_values(rownum)
arr=[]
for c_el in row:
arr.append(c_el)
ws2.append(arr)
ws2.auto_filter.ref=["A1:A15", "B1:B15"]
#ws['A1']=42
#ws.append([1,2,3])
wb.save('sample.xls')
The problem is still exist. Ok, ill try to find machine running on windows, but i have to admit something else:
There is some rows like this:
enter image description here
Ive understood what i was doing wrong, but i still need help.
First of all, i have one sheet that contains some values
Second sheet contains summary table!!!
If i try to copy this worksheet it did wrong.
So, the question is : how could i make summary table from first sheet?
Suppose your existing excel file has two columns (date and number).
This is how you will append additional rows using openpyxl.
import openpyxl
import datetime
wb = openpyxl.load_workbook('existing_data_file.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')
a = sheet.get_highest_row()
sheet.cell(row=a,column=0).value=datetime.date.today()
sheet.cell(row=a,column=1).value=30378
wb.save('existing_data_file.xlsx')
If you are on Windows, I would suggest you take a look at using the win32com.client approach. This allows you to interact with your spreadsheet using Excel itself. This will ensure that any existing filters, images, tables, macros etc should be preserved.
The following example opens an XLS file adds one entry and saves the whole workbook as a different XLS formatted file:
import win32com.client as win32
import os
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(r'input.xls')
ws = wb.Worksheets(1)
# Write a value at A1
ws.Range("A1").Value = "Hello World"
excel.DisplayAlerts = False # Allow file overwrite
wb.SaveAs(r'sample.xls', FileFormat=56)
excel.Application.Quit()
Note, make sure you add full paths to your input and output files.
I'm reading a existing excel file by using openpyxl package and trying to save that file it, and it got saved but after opening that excel file no data is present. I used the following code and my requirement is to open the file in use_iterators = True mode only
from openpyxl import load_workbook
wb = load_workbook(filename = 'large_file.xlsx', use_iterators = True)
ws = wb.get_sheet_by_name(name = 'big_data')
for row in ws.iter_rows():
for cell in row:
print cell.internal_value
wb.save("large_file.xlsx")
can u guys show how to save the file and close the file after saving with out losing the data
Try loading with use_iterators = False, as use_iterators = True loads the data information differently, such that it may not contain all the information you wish to save.
Openpyxl writes and entirely new excel file based on the information it has read in, so it's not like you make a small change and just update the file. (This also means if certain features aren't supported in openpyxl (such as VB macros), these won't exist in the file you've saved.)