I am new to Python and am trying to write into a merged cell in Excel. I can see the data that is already stored in this cell/row, so I know it's there. However, when I try to overwrite it, nothing happens.
I have tried messing with the index and header as well but nothing seems to work.
import pandas as pd
from openpyxl import load_workbook
# Read the Excel file into a pandas DataFrame
df = pd.read_excel('file here', sheet_name='Sheet1')
print(df.iloc[8, 2])
# Make the changes to the DataFrame
df.iloc[8, 2] = "Bob Smith"
# Load the workbook
book = load_workbook('file here')
writer = pd.ExcelWriter('file here', engine='openpyxl')
writer.book = book
# Write the DataFrame to the first sheet
df.to_excel(writer, index=False)
# Save the changes to the Excel file
writer.save()
import pandas as pd
from openpyxl import load_workbook
file = "C:/Users/OneDrive/Bureau/draftExcel.xlsx"
df = pd.read_excel(file, sheet_name='sheet1')
df.iat[5, 0] = 'cell is updated'
print(df)  # check in the terminal first that the cell content is updated
book = load_workbook(file)
writer = pd.ExcelWriter(file, engine='openpyxl')
df.to_excel(writer, sheet_name='sheet1', index=False)
writer.close()
I tried to make an example from what you explained because you didn't show your code, so I hope it was helpful.
Instead of .iloc I used .iat, which gets or sets a single value by integer row and column position, so you can update one specific cell in your DataFrame by index rather than by label.
Remember that the Excel file you are working on must be closed while you are editing its data with Python; if it is open, you will get an error.
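For the merged-cell part of the question, one option is to skip the DataFrame round trip and write with openpyxl directly. In openpyxl the value of a merged range lives in its top-left cell, so that is the cell to assign to. The sketch below is only an illustration: the file name demo_merged.xlsx and the range C10:E10 are assumptions (C10 is where df.iloc[8, 2] would land if the header sits in row 1), and the workbook is created inline so the example runs on its own.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Build a small workbook with a merged range (stand-in for the real file).
path = os.path.join(tempfile.mkdtemp(), "demo_merged.xlsx")  # hypothetical file name
wb = Workbook()
ws = wb.active
ws["C10"] = "Old Name"
ws.merge_cells("C10:E10")  # the merged block keeps its value in C10
wb.save(path)

# Reopen the file and overwrite the merged cell through its top-left anchor.
wb = load_workbook(path)
ws = wb.active
for merged_range in ws.merged_cells.ranges:
    if "D10" in merged_range:  # any coordinate inside the merged block
        anchor = ws.cell(row=merged_range.min_row, column=merged_range.min_col)
        anchor.value = "Bob Smith"
wb.save(path)

new_value = load_workbook(path).active["C10"].value
print(new_value)
```

Because only the cell value is assigned and the workbook is saved back by openpyxl, the merge itself stays in place.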
I have a DataFrame like the one shown below
Date,cust,region,Abr,Number,
12/01/2010,Company_Name,Somecity,Chi,36,
12/02/2010,Company_Name,Someothercity,Nyc,156,
df = pd.read_clipboard(sep=',')
I would like to write this DataFrame to a specific sheet (called temp_data) in the file output.xlsx.
Therefore I tried the below:
import pandas
from openpyxl import load_workbook
book = load_workbook('output.xlsx')
writer = pandas.ExcelWriter('output.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
I also tried the below
path = 'output.xlsx'
with pd.ExcelWriter(path) as writer:
    writer.book = openpyxl.load_workbook(path)
    final_df.to_excel(writer, sheet_name='temp_data', startrow=10)
    writer.save()
But I am not sure whether I am overcomplicating it. I get the error shown below, even though I verified in Task Manager that no Excel file/task is running.
BadZipFile: File is not a zip file
Moreover, I also lose the formatting of the output.xlsx file when I manage to write the file based on the suggestions below. I already have a neatly formatted file (font, color, etc.) and just need to put the data inside.
Is there any way to write the pandas DataFrame to a specific sheet in an existing Excel file? WITHOUT LOSING FORMATTING OF THE DESTINATION FILE
You just need to use the to_excel method of the pandas DataFrame.
Try the snippet below:
df1.to_excel("output.xlsx",sheet_name='Sheet_name')
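If the file already contains data, try the snippet below:
import pandas as pd
# read the existing rows to find where the new data should start
reader = pd.read_excel('output.xlsx')
# mode='a' opens the existing workbook; if_sheet_exists='overlay' (pandas 1.4+) writes into the existing sheet
with pd.ExcelWriter('output.xlsx', engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:
    df.to_excel(writer, index=False, header=False, startrow=len(reader) + 1)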
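If the pandas writer is dropping the existing formatting, another option is to open the workbook with openpyxl and assign the cell values one by one, which should leave styles already present in the sheet untouched. This is a rough sketch, not a definitive implementation: the template is created inline as a stand-in for the formatted output.xlsx, the bold font on A11 is just an example of existing formatting, and start_row is an assumption matching startrow=10 from the question.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font
from openpyxl.utils.dataframe import dataframe_to_rows

# Stand-in for the formatted output.xlsx from the question.
path = os.path.join(tempfile.mkdtemp(), "output.xlsx")
template = Workbook()
sheet = template.active
sheet.title = "temp_data"
sheet["A11"].font = Font(bold=True)  # pre-existing formatting we want to keep
template.save(path)

df = pd.DataFrame({
    "Date": ["12/01/2010", "12/02/2010"],
    "cust": ["Company_Name", "Company_Name"],
    "Number": [36, 156],
})

# Write values cell by cell; only the value is assigned, styles are not touched.
wb = load_workbook(path)
ws = wb["temp_data"]
start_row = 11  # 1-based Excel row, i.e. startrow=10 in pandas terms
rows = dataframe_to_rows(df, index=False, header=True)
for r, row in enumerate(rows, start=start_row):
    for c, value in enumerate(row, start=1):
        ws.cell(row=r, column=c, value=value)
wb.save(path)

check = load_workbook(path)["temp_data"]
print(check["A11"].value, check["A11"].font.bold)
```

The trade-off is that this bypasses to_excel entirely, so options such as index or float_format have to be handled on the DataFrame before writing.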
Are you restricted to using pandas or openpyxl?
Because if you're comfortable using other libraries, the easiest way is probably to use win32com to drive Excel as if you were a user manually copying and pasting the information over.
import pandas as pd
import io
import win32com.client as win32
import os
csv_text = """Date,cust,region,Abr,Number
12/01/2010,Company_Name,Somecity,Chi,36
12/02/2010,Company_Name,Someothercity,Nyc,156"""
df = pd.read_csv(io.StringIO(csv_text),sep = ',')
temp_path = r"C:\Users\[User]\Desktop\temp.xlsx" #temporary location where to write this dataframe
df.to_excel(temp_path,index = False) #temporarily write this file to excel, change the output path as needed
excel = win32.Dispatch("Excel.Application")
excel.Visible = True #Switch these attributes to False if you'd prefer Excel to be invisible while executing this script
excel.ScreenUpdating = True
temp_wb = excel.Workbooks.Open(temp_path)
temp_ws = temp_wb.Sheets("Sheet1")
output_path = r"C:\Users\[User]\Desktop\output.xlsx" #Path to your output excel file
output_wb = excel.Workbooks.Open(output_path)
output_ws = output_wb.Sheets("Output_sheet")
temp_ws.Range('A1').CurrentRegion.Copy(Destination = output_ws.Range('A1')) # Feel free to modify the Cell where you'd like the data to be copied to
input('Check that output looks like you expected\n') # Added pause here to make sure script doesn't overwrite your file before you've looked at the output
temp_wb.Close()
output_wb.Close(True) #Close output workbook and save changes
excel.Quit() #Close excel
os.remove(temp_path) #Delete temporary excel file
Let me know if this achieves what you were after.
I spent all day on this (and a co-worker of mine spent even longer). Thankfully, it seems to work for my purposes: pasting a DataFrame into an Excel sheet without changing any of the Excel source formatting. It requires the pywin32 package, which "drives" Excel as if it were a user, through the same object model that VBA uses.
import pandas as pd
from win32com import client
# Grab your source data any way you please - I'm defining it manually here:
df = pd.DataFrame([
['LOOK','','','','','','','',''],
['','MA!','','','','','','',''],
['','','I pasted','','','','','',''],
['','','','into','','','','',''],
['','','','','Excel','','','',''],
['','','','','','without','','',''],
['','','','','','','breaking','',''],
['','','','','','','','all the',''],
['','','','','','','','','FORMATTING!']
])
# Copy the df to clipboard, so we can later paste it as text.
df.to_clipboard(index=False, header=False)
excel_app = client.gencache.EnsureDispatch("Excel.Application") # Initialize instance
wb = excel_app.Workbooks.Open("Template.xlsx") # Load your (formatted) template workbook
ws = wb.Worksheets(1) # First worksheet becomes active - you could also refer to a sheet by name
ws.Range("A3").Select() # Only select a single cell using Excel nomenclature, otherwise this breaks
ws.PasteSpecial(Format='Unicode Text') # Paste as text
wb.SaveAs("Updated Template.xlsx") # Save our work
excel_app.Quit() # End the Excel instance
In general, when using the win32com approach, it's helpful to record yourself (with a macro) doing what you want to accomplish in Excel, then read the generated macro code. Often this will give you excellent clues as to which commands you could invoke.
The solution to your problem exists here: How to save a new sheet in an existing excel file, using Pandas?
To add a new sheet from a df:
import pandas as pd
import os
import numpy as np
os.chdir(r'C:\workdir')
path = 'output.xlsx'
### replace with your df ###
x = np.random.randn(100, 2)
df = pd.DataFrame(x)
# mode='a' appends to the existing workbook instead of overwriting it;
# the with block saves and closes the file on exit
with pd.ExcelWriter(path, engine='openpyxl', mode='a') as writer:
    df.to_excel(writer, sheet_name='x')
You can try xltpl.
Create a template file based on your output.xlsx file.
Render a file with your data.
from xltpl.writerx import BookWriterx
writer = BookWriterx('template.xlsx')
d = {'rows': df.values}
d['tpl_name'] = 'tpl_sheet'
d['sheet_name'] = 'temp_data'
writer.render_sheet(d)
d['tpl_name'] = 'other_sheet'
d['sheet_name'] = 'other'
writer.render_sheet(d)
writer.save('out.xls')
See examples.
I need to copy data from different Excel files into a new one. I would like to just tell the program to take all the files into a specific folder and copy two columns from each of them into a new Excel file. I tried a for loop but it overwrites data coming from different files and I get a new Excel file with just one sheet with data copied from the last file read by the program. Could you help me, please?
Here is my code:
import os.path
import pandas as pd
folder=r'C:\\Users\\PycharmProjects\\excelfile\\'
for fn in os.listdir(folder):
    fx = pd.read_excel(os.path.join(folder, fn), usecols='H,E')
    with pd.ExcelWriter('Output.xlsx') as writer:
        ws = os.path.splitext(fn)[0]
        fx.to_excel(writer, sheet_name=ws)
You should open the output file in append mode like so:
with pd.ExcelWriter("Output.xlsx", engine='openpyxl', mode='a') as writer:
    ws = os.path.splitext(fn)[0]
    fx.to_excel(writer, sheet_name=ws)
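One caveat: as far as I know, append mode expects Output.xlsx to exist already, so the first pass of the loop can fail on a fresh run. An alternative is to open the writer once and loop inside it, so every file becomes its own sheet in a single save. The sketch below is self-contained and therefore builds two hypothetical input files (first.xlsx, second.xlsx) in a scratch folder; with the real files you would point folder at your directory and add usecols back to read_excel.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Create two small input workbooks in a scratch folder (stand-ins for the real files).
folder = tempfile.mkdtemp()
for name in ("first.xlsx", "second.xlsx"):  # hypothetical file names
    sample = pd.DataFrame({"E": [1, 2], "H": [3, 4]})
    sample.to_excel(os.path.join(folder, name), index=False)

# Keep the output outside the input folder so listdir does not pick it up.
output = os.path.join(tempfile.mkdtemp(), "Output.xlsx")

# Open the writer once, then add one sheet per input file inside the loop.
with pd.ExcelWriter(output, engine="openpyxl") as writer:
    for fn in sorted(os.listdir(folder)):
        if not fn.endswith(".xlsx"):
            continue
        fx = pd.read_excel(os.path.join(folder, fn))
        sheet = os.path.splitext(fn)[0][:31]  # Excel limits sheet names to 31 characters
        fx.to_excel(writer, sheet_name=sheet, index=False)

sheet_names = load_workbook(output).sheetnames
print(sheet_names)
```

Writing the output to a different folder than the inputs also avoids reading Output.xlsx back in as if it were one of the source files.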
I created an Excel file and wrote something in it. I am trying to read that file into a pandas DataFrame, but I am getting an error:
XLRDError: Unsupported format, or corrupt file: Expected BOF record
Code -
import pandas as pd
a = open("D:\\Joseph\\abcsaa.xlsx","a")
a.write("Hello all")
p = pd.read_excel("D:\\Joseph\\abcsaa.xlsx")
p
Thanks for the answers. I need to store tick data in an Excel file and then read it into a DataFrame.
What is the use of the open function in Python for an Excel file if I have to use other modules for this?
An Excel file cannot be created with Python's built-in open function. You have to use a package such as openpyxl to read and write Excel files.
Some basic operations using openpyxl:
import openpyxl
# Open Workbook
wb = openpyxl.load_workbook(filename='example.xlsx', data_only=True)
# Get all sheet names (sheetnames replaces the deprecated get_sheet_names())
a_sheet_names = wb.sheetnames
print(a_sheet_names)
# Get a sheet object by name (indexing replaces the deprecated get_sheet_by_name())
o_sheet = wb["Sheet1"]
print(o_sheet)
# Get Cell Values
o_cell = o_sheet['A1']
print(o_cell.value)
o_cell = o_sheet.cell(row=2, column=1)
print(o_cell.value)
o_cell = o_sheet['H1']
print(o_cell.value)
# Sheet Maximum filled Rows and columns
print(o_sheet.max_row)
print(o_sheet.max_column)
Install this if you don't already have it.
pip install XlsxWriter
Code:
import xlsxwriter
workbook = xlsxwriter.Workbook("D:\\Joseph\\abcsaa.xlsx")
worksheet = workbook.add_worksheet()
worksheet.write('A1', 'Hello world')
workbook.close()
XlsxWriter can do a lot and has great documentation.
If the file already exists, open it the first time with
a = pd.read_excel('path/aabcsaa.xlsx')
Else, create a pandas dataframe with
a = pd.DataFrame(data)
and then save it using
a.to_excel('path/aabcsaa.xlsx')
You opened your file in append mode ("a"). If you want to read it with read_excel by passing the filename, you need to close the file first:
a.close()
And the content of the file needs to be in a valid Excel format.
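To illustrate why plain text written with open() is not a readable workbook: an .xlsx file is a ZIP archive of XML parts, so text saved under that extension is rejected by Excel readers. The small check below uses hypothetical file names in a scratch folder and compares such a text file with a workbook saved by openpyxl.

```python
import os
import tempfile
import zipfile

from openpyxl import Workbook

folder = tempfile.mkdtemp()

# Plain text saved with an .xlsx extension: not a real workbook.
fake_path = os.path.join(folder, "fake.xlsx")
with open(fake_path, "w") as handle:
    handle.write("Hello all")

# A workbook written by openpyxl: a proper ZIP container.
real_path = os.path.join(folder, "real.xlsx")
wb = Workbook()
wb.active["A1"] = "Hello all"
wb.save(real_path)

fake_is_zip = zipfile.is_zipfile(fake_path)
real_is_zip = zipfile.is_zipfile(real_path)
print(fake_is_zip, real_is_zip)  # False True
```

The same check is a quick way to investigate a "File is not a zip file" error on a file that is supposed to be a workbook.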
I created a DataFrame and used df.to_excel('test.xlsx', index=False). I can see the Excel file my Python code generated in a local directory, but the problem is that I can't open it with Excel.
I also tried adding the parameter engine='xlsxwriter':
df.to_excel('test.xlsx', index=False, engine='xlsxwriter')
but that didn't work either.
import pandas as pd
import numpy as np
df = pd.read_csv('123.tsv', sep='\t')
df['M'] = df['M'].astype(str)
m = df.M.str.split(',', expand=True).values.ravel()
df = df.dropna()
df = df[~df.M.str.contains("#")]
df = df.drop_duplicates()
df.to_excel('123.xlsx', index=False, engine='xlsxwriter')
Expected outcome: I just want to open 123.xlsx in Excel.
Actual result:
Excel cannot open the file '123.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file. (Mac Excel 2016)
I'm responding some time later, but it may be helpful to someone.
You may try using an ExcelWriter, taking care to call .close(), which actually saves the file. As the documentation says: "The writer should be used as a context manager. Otherwise, call close() to save and close any opened file handles."
import pandas as pd
writer = pd.ExcelWriter('your_filename.xlsx')
df.to_excel(writer, sheet_name='your_sheet_name')
writer.close()  # saves the file and releases the file handle
This is a late reply, but I'm having this problem only when saving to OneDrive (e.g. saving to c:\users\me\onedrive\folder\whatever.xlsx). If I save it to a non-OneDrive location (e.g. c:\work\whatever.xlsx), the file opens fine. The following is the code I'm using:
with pd.ExcelWriter("c:\\work\\whatever.xlsx") as writer:
    sheet1.to_excel(writer, sheet_name="sheet1")
    sheet2.to_excel(writer, sheet_name="sheet2")
writer.close()
I'm not 100% sure the final writer.close() is needed.
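On whether the final writer.close() is needed: when ExcelWriter is used in a with block, the file is saved when the block exits, so an explicit close() should not be necessary. A quick check, using hypothetical frames and a scratch path rather than the OneDrive location:

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "whatever.xlsx")
sheet1 = pd.DataFrame({"a": [1, 2]})
sheet2 = pd.DataFrame({"b": [3, 4]})

# No explicit save() or close(): leaving the with block writes the file.
with pd.ExcelWriter(path) as writer:
    sheet1.to_excel(writer, sheet_name="sheet1", index=False)
    sheet2.to_excel(writer, sheet_name="sheet2", index=False)

# sheet_name=None reads every sheet into a dict keyed by sheet name.
sheets = pd.read_excel(path, sheet_name=None)
print(list(sheets))
```

If both sheets come back as expected, the file was fully written by the time the with block ended.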
It should work as expected. I tried your program with the following sample input (tab separated):
M L
foo 123
bar 456
baz 789
And I was able to open the output 123.xlsx file.
Can you try this simple input and see if it works?
It turned out to be an Excel issue that appeared after I updated it. Please close this question, Stack Overflow team. Thank you.