How can i write on an XLS file using python? - python

I want to open a .xls file on python modify some rows and rewrite the new content on the same file because i use a macro to synchronise the data with SharePoint.
What libraries can i use?

You can do it using pandas with :
import pandas as pd
df = pd.read_excel('path/file.xls')
# modifications here ....
df.to_excel('path/file.xls')

Have you tried openpyxl?
It's very easy to modify xlsx file using this library.
from openpyxl import Workbook
wb = Workbook()
# grab the active worksheet
ws = wb.active
# Data can be assigned/updated directly to cells
ws['A1'] = 42
# Rows can also be appended
ws.append([1, 2, 3])
# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()
# Save the file
wb.save("sample.xlsx")
For more details you can refer to the openpyxl documentation.

Related

Pandas dataframe to specific sheet in a excel file without losing formatting

I have a dataframe like as shown below
Date,cust,region,Abr,Number,
12/01/2010,Company_Name,Somecity,Chi,36,
12/02/2010,Company_Name,Someothercity,Nyc,156,
df = pd.read_clipboard(sep=',')
I would like to write this dataframe to a specific sheet (called temp_data) in the file output.xlsx
Therfore I tried the below
import pandas
from openpyxl import load_workbook
book = load_workbook('output.xlsx')
writer = pandas.ExcelWriter('output.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
I also tried the below
path = 'output.xlsx'
with pd.ExcelWriter(path) as writer:
writer.book = openpyxl.load_workbook(path)
final_df.to_excel(writer, sheet_name='temp_data',startrow=10)
writer.save()
But am not sure whether I am overcomplicating it. I get an error like as shown below. But I verifiedd in task manager, no excel file/task is running
BadZipFile: File is not a zip file
Moreover, I also lose my formatting of the output.xlsx file when I manage to write the file based on below suggestions. I already have a neatly formatted font,color file etc and just need to put the data inside.
Is there anyway to write the pandas dataframe to a specific sheet in an existing excel file? WITHOUT LOSING FORMATTING OF THE DESTIATION FILE
You need to just use to_excel from pandas dataframe.
Try below snippet:
df1.to_excel("output.xlsx",sheet_name='Sheet_name')
If there is existing data please try below snippet:
writer = pd.ExcelWriter('output.xlsx', engine='openpyxl')
# try to open an existing workbook
writer.book = load_workbook('output.xlsx')
df.to_excel(writer,index=False,header=False,startrow=len(reader)+1)
writer.save()
writer.close()
Are you restricted to using pandas or openpyxl?
Because if you're comfortable using other libraries, the easiest way is probably using win32com to puppet excel as if you were a user manually copying and pasting the information over.
import pandas as pd
import io
import win32com.client as win32
import os
csv_text = """Date,cust,region,Abr,Number
12/01/2010,Company_Name,Somecity,Chi,36
12/02/2010,Company_Name,Someothercity,Nyc,156"""
df = pd.read_csv(io.StringIO(csv_text),sep = ',')
temp_path = r"C:\Users\[User]\Desktop\temp.xlsx" #temporary location where to write this dataframe
df.to_excel(temp_path,index = False) #temporarily write this file to excel, change the output path as needed
excel = win32.Dispatch("Excel.Application")
excel.Visible = True #Switch these attributes to False if you'd prefer Excel to be invisible while excecuting this script
excel.ScreenUpdating = True
temp_wb = excel.Workbooks.Open(temp_path)
temp_ws = temp_wb.Sheets("Sheet1")
output_path = r"C:\Users\[User]\Desktop\output.xlsx" #Path to your output excel file
output_wb = excel.Workbooks.Open(output_path)
output_ws = output_wb.Sheets("Output_sheet")
temp_ws.Range('A1').CurrentRegion.Copy(Destination = output_ws.Range('A1')) # Feel free to modify the Cell where you'd like the data to be copied to
input('Check that output looks like you expected\n') # Added pause here to make sure script doesn't overwrite your file before you've looked at the output
temp_wb.Close()
output_wb.Close(True) #Close output workbook and save changes
excel.Quit() #Close excel
os.remove(temp_path) #Delete temporary excel file
Let me know if this achieves what you were after.
I spent all day on this (and a co-worker of mine spent even longer). Thankfully, it seems to work for my purposes - pasting a dataframe into an Excel sheet without changing any of the Excel source formatting. It requires the pywin32 package, which "drives" Excel as if it a user, using VBA.
import pandas as pd
from win32com import client
# Grab your source data any way you please - I'm defining it manually here:
df = pd.DataFrame([
['LOOK','','','','','','','',''],
['','MA!','','','','','','',''],
['','','I pasted','','','','','',''],
['','','','into','','','','',''],
['','','','','Excel','','','',''],
['','','','','','without','','',''],
['','','','','','','breaking','',''],
['','','','','','','','all the',''],
['','','','','','','','','FORMATTING!']
])
# Copy the df to clipboard, so we can later paste it as text.
df.to_clipboard(index=False, header=False)
excel_app = client.gencache.EnsureDispatch("Excel.Application") # Initialize instance
wb = excel_app.Workbooks.Open("Template.xlsx") # Load your (formatted) template workbook
ws = wb.Worksheets(1) # First worksheet becomes active - you could also refer to a sheet by name
ws.Range("A3").Select() # Only select a single cell using Excel nomenclature, otherwise this breaks
ws.PasteSpecial(Format='Unicode Text') # Paste as text
wb.SaveAs("Updated Template.xlsx") # Save our work
excel_app.Quit() # End the Excel instance
In general, when using the win32com approach, it's helpful to record yourself (with a macro) doing what you want to accomplish in Excel, then reading the generated macro code. Often this will give you excellent clues as to what commands you could invoke.
The solution to your problem exists here: How to save a new sheet in an existing excel file, using Pandas?
To add a new sheet from a df:
import pandas as pd
from openpyxl import load_workbook
import os
import numpy as np
os.chdir(r'C:\workdir')
path = 'output.xlsx'
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine = 'openpyxl')
writer.book = book
### replace with your df ###
x = np.random.randn(100, 2)
df = pd.DataFrame(x)
df.to_excel(writer, sheet_name = 'x')
writer.save()
writer.close()
You can try xltpl.
Create a template file based on your output.xlsx file.
Render a file with your data.
from xltpl.writerx import BookWriterx
writer = BookWriterx('template.xlsx')
d = {'rows': df.values}
d['tpl_name'] = 'tpl_sheet'
d['sheet_name'] = 'temp_data'
writer.render_sheet(d)
d['tpl_name'] = 'other_sheet'
d['sheet_name'] = 'other'
writer.render_sheet(d)
writer.save('out.xls')
See examples.

Reading and updating sheets in an XLSM file using pandas while preserving the VBA code

I have a requirement to read an xlsm file and update some of the sheets in the file. I want to use pandas for this purpose.
I tried answers presented in the following post. I couldn't see the VBA macros when I add the VBA project back.
https://stackoverflow.com/posts/28170939/revisions
Here are the steps I tried,
Extracted the VBA_project.bin out of the original.xlsm file and then
writer = pd.ExcelWriter('original.xlsx', engine='xlsxwriter')
workbook = writer.book
workbook.filename = 'test.xlsm'
workbook.add_vba_project('vbaProject.bin')
writer.save()
With this I don't see the VBA macros attached to "test.xlsm". The result is the same even if I write it to the "original.xlsm" file.
How do I preserve the VBA macros or add them back to the original xlsm file?
Also, is there a way I can open the "xlsm" file itself rather than the "xlsx" counterpart using pd.ExcelWriter?
You can do this easily with pandas
import pandas as pd
import xlrd
# YOU MUST PUT sheet_name=None TO READ ALL CSV FILES IN YOUR XLSM FILE
df = pd.read_excel('YourFile.xlsm', sheet_name=None)
# prints all sheets
print(df)
Ah, I see. I still can't tell what you are doing, but here are a few general samples of code to get Python to communicate with Excel.
Read contents of a worksheet in Excel:
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
df = pd.read_excel('C:\\your_path\\test.xls', sheetname='Sheet1')
************************************************************************************
Use Python to run Macros in Excel:
import os
import win32com.client
#Launch Excel and Open Wrkbook
xl=win32com.client.Dispatch("Excel.Application")
xl.Workbooks.Open(Filename="C:\your_path\excelsheet.xlsm") #opens workbook in readonly mode.
#Run Macro
xl.Application.Run("excelsheet.xlsm!modulename.macroname")
#Save Document and Quit.
xl.Application.Save()
xl.Application.Quit()
#Cleanup the com reference.
del xl
Write, from Python, to Excel:
import xlsxwriter
# Create an new Excel file and add a worksheet.
workbook = xlsxwriter.Workbook('C:/your_path/ranges_and_offsets.xlsx')
worksheet = workbook.add_worksheet()
# Widen the first column to make the text clearer.
worksheet.set_column('A:A', 20)
# Add a bold format to use to highlight cells.
bold = workbook.add_format({'bold': True})
# Write some simple text.
worksheet.write('A1', 'Hello')
# Text with formatting.
worksheet.write('A2', 'World', bold)
# Write some numbers, with row/column notation.
worksheet.write(2, 0, 123)
worksheet.write(3, 0, 123.456)
workbook.close()
from openpyxl import Workbook
wb = Workbook()
# grab the active worksheet
ws = wb.active
# Data can be assigned directly to cells
ws['A1'] = 42
# Rows can also be appended
ws.append([1, 2, 3])
# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()
# Save the file
wb.save("C:\\your_path\\sample.xlsx")

How to open .xlsx using python 2.7.3?

I would like to open .xlsx file using python for further manual process.
I tried
wb = load_workbook(filename = 'empty_book.xlsx')
by importing openpyxl module. But it does not open the file rather it just loads the file. Is there any other way to open excel file in python?
Thanks in advance
You could use pandas (also you need to install module xlrd)
import pandas as pd
excel_data = pd.read_excel('empty_book.xlsx')
Import openpyxl
>>> import openpyxl
Load the workbook that you are trying to read
>>> wb = openpyxl.load_workbook("Empty.xlsx")
Give the name of the sheet
>>> ws = wb['sheet_name']
For looping through values in the excel
for row in ws.rows:
for cell in row:
print cell.value
To edit the values
for row in ws.rows:
for cell in row:
cell.value = new_value

Add new rows to existing column of excel in python

I want to add new records every week to this existing file without creating a new one.
For example, Next I want to add record on date 6/13/2016
Randy->(13,23,13)
Shaw->(13,15,13)
and many such entries next two months. How do I do that? I am beginner so having trouble to put it in syntax.
I could do only this much
import xlrd
#Opening the excel file
file_location= "C:/Users/agodgh1a/Desktop/Apurva/EPSON.xlsx"
workbook= xlrd.open_workbook(file_location)
sheet=workbook.sheet_by_index(0)
Thank you!
The lib you're using looks like it only reads, not edits. Here's an example in openpyxl:
from openpyxl import Workbook, load_workbook
# create the file
wb = Workbook()
ws = wb.active
ws.append([1, 2, 3])
wb.save("sample.xlsx")
# re-open and append
wb = load_workbook("sample.xlsx")
ws = wb.active
ws.append([4, 5, 6])
wb.save("sample.xlsx")
Run that and you'll have a file sample.xlsx with both rows.
xlrd
is for reading operations only. Since you want perform a write operation use
xlwt
python module.
Refer to xlwt docs for the same

Writing into existing excel file

I have a .xlsx file in which multiple worksheets are there (with some content). I want to write some data into specific sheets say sheet1 and sheet5. Right now I am doing it using xlrd, xlwt, and xlutils copy() function. But is there any way to do it by opening the file in append mode and adding the data and save it (Like as we do it for the text/CSV files)?
Here is my code:
rb = open_workbook("C:\text.xlsx",formatting_info='True')
wb = copy(rb)
Sheet1 = wb.get_sheet(8)
Sheet2 = wb.get_sheet(7)
Sheet1.write(0,8,'Obtained_Value')
Sheet2.write(0,8,'Obtained_Value')
value1 = [1,2,3,4]
value2 = [5,6,7,8]
for i in range(len(value1)):
Sheet1.write(i+1,8,value1[i])
for j in range(len(value2)):
Sheet2.write(j+1,8,value2[j])
wb.save("C:\text.xlsx")
You can do it using the openpyxl module or using the xlwings module
Using openpyxl
from openpyxl import workbook #pip install openpyxl
from openpyxl import load_workbook
wb = load_workbook("C:\text.xlsx")
sheets = wb.sheetnames
Sheet1 = wb[sheets[8]]
Sheet2 = wb[sheets[7]]
#Then update as you want it
Sheet1 .cell(row = 2, column = 4).value = 5 #This will change the cell(2,4) to 4
wb.save("HERE PUT THE NEW EXCEL PATH")
the text.xlsx file will be used as a template, all the values from text.xlsx file together with the updated values will be saved in the new file
Using xlwings
import xlwings
wb = xlwings.Book("C:\text.xlsx")
Sheet1 = wb.sheets[8]
Sheet2 = wb.sheets[7]
#Then update as you want it
Sheet1.range(2, 4).value = 4 #This will change the cell(2,4) to 4
wb.save()
wb.close()
Here the file will be updated in the text.xlsx file but if you want to have a copy of the file you can use the code below
shutil.copy("C:\text.xlsx", "C:\newFile.xlsx") #copies text.xslx file to newFile.xslx
and use
wb = xlwings.Book("C:\newFile.xlsx") instead of wb = xlwings.Book("C:\text.xlsx")
As a user of both modules I prefer the second one over the first one.
For manipulating existing excel files you should use openpyxl. Other common libraries like the ones you are using dont support manipulating existing excel files. A workaround is to
save your output file as a different name - text_temp.xlsx
delete your original file - text.xlsx
rename your output file - text_temp.xlsx to text.xlsx

Categories